Trutz Fries

Research YouTube channels with the help of Python and the YouTube API

11/02/2023 • Reading time: ca. 5 min • by Trutz Fries

If you want to run ads on YouTube, it matters which channels or videos your ads appear on, because these placements should match your target group.

This article will teach you how to find suitable channels on YouTube using specific keywords and a Python script.

TL;DR: The link to the notebook can be found at the end of the article.

  1. Necessary libraries
  2. Relevant functions
  3. How to start the research
  4. Listing the data

Necessary libraries

First, we have to install the libraries needed to access the YouTube API, to load the API key from a .env file, and to save the results as an Excel file later on.

!pip install pandas openpyxl
!pip install google-api-python-client google-auth-oauthlib
!pip install python-slugify python-dotenv

We also need to import these libraries:

import os
import google_auth_oauthlib.flow
import googleapiclient.discovery
import googleapiclient.errors
import json
import time
import pandas as pd
from datetime import datetime
from slugify import slugify
from dotenv import load_dotenv

To access the interface, you need a Google API key. An API key is a unique identifier that a program sends along with its requests, so that the API server can recognize the caller and grant access to the requested data.

You can access the YouTube API right here →, but first you need to generate your own API key.

This can be done here:

https://developers.google.com/youtube/registering_an_application?hl=en


Once you have the key, save it in a .env file under the name YOUTUBE_API_KEY, or embed it directly in the script.

# Load variables from .env file
load_dotenv('.env')

# Use variables
api_key = os.getenv('YOUTUBE_API_KEY')

# Alternative: Set it directly
# https://console.cloud.google.com/apis/credentials
# api_key = "xxx"
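If the .env file is missing or the variable name is misspelled, os.getenv silently returns None and the problem only shows up at the first API call. A minimal sketch of an early check, assuming the variable is called YOUTUBE_API_KEY as above (the helper name require_api_key is hypothetical and not part of the original notebook):

```python
import os


def require_api_key(var_name="YOUTUBE_API_KEY"):
    """Return the API key from the environment or fail with a clear message."""
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set. "
            "Add it to your .env file or set api_key directly."
        )
    return key


# Usage: call this once after load_dotenv('.env')
# api_key = require_api_key()
```

This way a missing key is reported immediately instead of in the middle of a search run.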

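Note: The YouTube API enforces a daily quota. By default, a project gets 10,000 quota units per day, and a single search request costs 100 units, so only about 100 search requests are possible per day. Larger research projects therefore have to be planned carefully.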
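To see how far the daily quota goes, you can estimate the cost of a run before starting it. The sketch below assumes the default costs documented by Google (100 units per search request, 1 unit per channel details request, 10,000 units per day); check the quota page of your own project, since these values may differ. The function name estimate_quota_units is hypothetical.

```python
SEARCH_COST = 100      # assumed quota units per search request
CHANNEL_COST = 1       # assumed quota units per channel details request
DAILY_QUOTA = 10_000   # assumed default daily quota per project


def estimate_quota_units(num_queries, pages_per_search, num_channels):
    """Estimate quota units for one research run.

    Each query triggers two searches (channels and videos), each search
    requests up to pages_per_search pages, and every unique channel found
    costs one channel details request.
    """
    search_calls = num_queries * 2 * pages_per_search
    return search_calls * SEARCH_COST + num_channels * CHANNEL_COST


units = estimate_quota_units(num_queries=2, pages_per_search=2, num_channels=150)
print(f"Estimated cost: {units} of {DAILY_QUOTA} units")
# prints: Estimated cost: 950 of 10000 units
```

The searches dominate the cost, so the number of keywords and pages is what you should limit first.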

Relevant functions

Now that everything the script needs is installed, we can move on to the interesting part.

With the help of the following function, we can search YouTube and obtain either relevant channels or videos as a result:

# Searches YouTube for a query and returns channels or videos
def search_youtube(query, type="channel", page_limit=None):
    youtube = googleapiclient.discovery.build("youtube", "v3", developerKey=api_key)

    # Fall back to the global max_pages (set below) if no limit is passed
    if page_limit is None:
        page_limit = max_pages

    next_page_token = None
    all_results = []
    page = 0

    while page < page_limit:
        page += 1
        print(f"Query: {query} | Type: {type} | Page: {page}")
        request = youtube.search().list(
            part="snippet",
            type=type,  # "channel" or "video"
            q=query,  # Set below
            regionCode=regionCode,  # Set below
            maxResults=50,
            order="date",
            pageToken=next_page_token
        )
        response = request.execute()
        all_results += response['items']

        time.sleep(2)  # Avoid rate limits

        # Stop when there are no further result pages
        next_page_token = response.get('nextPageToken')
        if next_page_token is None:
            break

    return all_results

Here's an example:

response = search_youtube("amazon vendor", "channel", 1)
print(json.dumps(response[:2], indent=2)) # First two elements

And this is the response from the YouTube API (extract):

[
  {
    "kind": "youtube#searchResult",
    "etag": "1gd_KoispJGxmVgkmnYZ6X4L0X8",
    "id": {
      "kind": "youtube#channel",
      "channelId": "UCB757sBODw3eNPUl8jbdRZQ"
    },
    "snippet": {
      "publishedAt": "2023-11-04T11:20:18Z",
      "channelId": "UCB757sBODw3eNPUl8jbdRZQ",
      "title": "Amazon Amazing Deals",
      "description": "Welcome to Amazon Amazing Dealsamazon online products, amazon sell products online, how to sell digital products online on ...",
      "thumbnails": {
        "default": {
          "url": "https://yt3.ggpht.com/naoFnpSQ7xsrImq_N1sit9okp0GZur3MGk_lPxL4wGbVg8kzunMRY3L3Aw-zDThccgPyPEO7Iw=s88-c-k-c0xffffffff-no-rj-mo"
        },
        "medium": {
          "url": "https://yt3.ggpht.com/naoFnpSQ7xsrImq_N1sit9okp0GZur3MGk_lPxL4wGbVg8kzunMRY3L3Aw-zDThccgPyPEO7Iw=s240-c-k-c0xffffffff-no-rj-mo"
        },
        "high": {
          "url": "https://yt3.ggpht.com/naoFnpSQ7xsrImq_N1sit9okp0GZur3MGk_lPxL4wGbVg8kzunMRY3L3Aw-zDThccgPyPEO7Iw=s800-c-k-c0xffffffff-no-rj-mo"
        }
      },
      "channelTitle": "Amazon Amazing Deals",
      "liveBroadcastContent": "none",
      "publishTime": "2023-11-04T11:20:18Z"
    }
  }
  ...
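Each search result nests the fields we need under id and snippet. As a small illustration of how to pull them out, the following sketch flattens result items into simple dictionaries. The sample item is a shortened copy of the response above, and the helper name summarize_results is hypothetical.

```python
def summarize_results(items):
    """Reduce search result items to channel ID, title and publish date."""
    summary = []
    for item in items:
        snippet = item.get("snippet", {})
        summary.append({
            "channelId": snippet.get("channelId", ""),
            "title": snippet.get("title", ""),
            "publishedAt": snippet.get("publishedAt", ""),
        })
    return summary


# Shortened copy of the first search result shown above
sample_items = [{
    "kind": "youtube#searchResult",
    "id": {"kind": "youtube#channel", "channelId": "UCB757sBODw3eNPUl8jbdRZQ"},
    "snippet": {
        "publishedAt": "2023-11-04T11:20:18Z",
        "channelId": "UCB757sBODw3eNPUl8jbdRZQ",
        "title": "Amazon Amazing Deals",
    },
}]

for row in summarize_results(sample_items):
    print(row["channelId"], "|", row["title"])
```

Using .get with a default keeps the helper from failing when a field is absent in a result.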

In addition, a second function lets us retrieve important information about a channel.

# Returns channel details for a channelId
def get_channel_details(channel_id):
    youtube = googleapiclient.discovery.build("youtube", "v3", developerKey=api_key)
    request = youtube.channels().list(
        part=["snippet", "statistics"],
        id = channel_id
    )
    response = request.execute()
    return response.get('items', [])  # 'items' can be missing if no channel matches

The function needs a channel ID, e.g. UCxkIzPnPzWLz4IeuxIROflg.

Example:

response = get_channel_details('UCxkIzPnPzWLz4IeuxIROflg')
print(json.dumps(response, indent=2))

Result:

[
  {
    "kind": "youtube#channel",
    "etag": "tF64tb7FLNTj7rU_3khtRJfqyn4",
    "id": "UCxkIzPnPzWLz4IeuxIROflg",
    "snippet": {
      "title": "Travis Marziani",
      "description": "This channel will teach you how to leverage Amazon FBA, Shopify, Adwords, Facebook ads and other secret internet marketing strategies to create a business you love, create freedom and enjoy the journey.  \n\nI have done well over 7 figures of sales on and off Amazon, and want to share what I wish I would have known when I first started, including how to create a passion product.  \n\nI was  depressed and stuck in the corporate world , when I quit that to start  an online dance clothing business with my mom.  After years of growing the revenue but struggling to make enough profit to support my self I had to move back in with my parents.  It was at this point I realized I want to create a business I was passionate about, and I took all the lessons learned and created my first passion product which went on to make me six figures passive income, and more importantly I loved my business, my life and I started enjoying the journey.\n\nFollow Me on Instagram: @travismarziani for daily tips",
      "customUrl": "@travismarziani",
      "publishedAt": "2015-07-09T18:38:41Z",
      "thumbnails": {
        "default": {
          "url": "https://yt3.ggpht.com/ytc/APkrFKbRNTzyFG7dJrtniw0I_PpU8r6qAiFh5c7BNgKHTQ=s88-c-k-c0x00ffffff-no-rj",
          "width": 88,
          "height": 88
        },
        "medium": {
          "url": "https://yt3.ggpht.com/ytc/APkrFKbRNTzyFG7dJrtniw0I_PpU8r6qAiFh5c7BNgKHTQ=s240-c-k-c0x00ffffff-no-rj",
          "width": 240,
          "height": 240
        },
        "high": {
          "url": "https://yt3.ggpht.com/ytc/APkrFKbRNTzyFG7dJrtniw0I_PpU8r6qAiFh5c7BNgKHTQ=s800-c-k-c0x00ffffff-no-rj",
          "width": 800,
          "height": 800
        }
      },
      "localized": {
        "title": "Travis Marziani",
        "description": "This channel will teach you how to leverage Amazon FBA, Shopify, Adwords, Facebook ads and other secret internet marketing strategies to create a business you love, create freedom and enjoy the journey.  \n\nI have done well over 7 figures of sales on and off Amazon, and want to share what I wish I would have known when I first started, including how to create a passion product.  \n\nI was  depressed and stuck in the corporate world , when I quit that to start  an online dance clothing business with my mom.  After years of growing the revenue but struggling to make enough profit to support my self I had to move back in with my parents.  It was at this point I realized I want to create a business I was passionate about, and I took all the lessons learned and created my first passion product which went on to make me six figures passive income, and more importantly I loved my business, my life and I started enjoying the journey.\n\nFollow Me on Instagram: @travismarziani for daily tips"
      },
      "country": "US"
    },
    "statistics": {
      "viewCount": "21222438",
      "subscriberCount": "330000",
      "hiddenSubscriberCount": false,
      "videoCount": "552"
    }
  }
]
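The function above is called with a single channel ID, but according to the API documentation the id parameter also accepts several comma-separated IDs (up to 50 per request, as far as I know), which saves requests. A minimal sketch of the batching step, with the hypothetical helper chunk_ids; the API call itself is only indicated in a comment because it needs a valid key:

```python
def chunk_ids(channel_ids, size=50):
    """Split a list of channel IDs into batches of at most `size` items."""
    return [channel_ids[i:i + size] for i in range(0, len(channel_ids), size)]


# Demonstration with placeholder IDs (not real channels)
placeholder_ids = [f"UC-placeholder-{n}" for n in range(120)]
batches = chunk_ids(placeholder_ids)
print([len(batch) for batch in batches])  # prints [50, 50, 20]

# With a valid key, each batch could then be fetched in one request:
# for batch in batches:
#     details = get_channel_details(",".join(batch))
```

For 120 channels this would mean 3 requests instead of 120.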

How to start the research

With these two functions we can now begin the research, but before that, we need to define three parameters:

  • The regionCode, which allows us to target a specific region for channels
  • Our keywords (queries), that determine what we are searching for
  • The number of pages (max_pages) we request per search. The YouTube API provides a maximum of 50 results per page.

regionCode = "US"

queries = ["amazon marketing", "amazon fba"] # Add queries here!

max_pages = 2 # Maximum number of result pages per search (up to 50 results per page)

Now we can finally start the main part of the script:

# Create empty array
data = []

channel_ids = []

for query in queries:
    channels = search_youtube(query, "channel")
    videos   = search_youtube(query, "video")

    channel_ids_channels = [item['snippet']['channelId'] for item in channels]
    channel_ids_videos = [item['snippet']['channelId'] for item in videos]

    channel_ids_temp = list(set(channel_ids_channels + channel_ids_videos))

    print(f"Query: {query}: Found {len(channel_ids_channels)} channels and {len(channel_ids_videos)} videos")

    channel_ids.append(channel_ids_temp)

# Flatten list of lists
channel_ids_flat = [item for sublist in channel_ids for item in sublist]

# Remove duplicates
channel_ids_flat = list(set(channel_ids_flat))

print(f"Found {len(channel_ids_flat)} unique channel_ids")

# Fetch the details for each unique channel
for channel_id in channel_ids_flat:
    channel_details = get_channel_details(channel_id)

In addition to the channel ID, we also receive information such as the number of subscribers or the total number of views the channel has generated. This lets us filter out channels that attract too little attention, so that only relevant channels with a sufficiently large audience remain in the end.

    # Still inside the loop over channel_ids_flat from above
    for detail in channel_details:
        data.append({
            'id': detail['id'],
            'description': detail['snippet'].get('description', ''),
            'customUrl': detail['snippet'].get('customUrl', ''),
            'link': 'https://www.youtube.com/channel/' + detail['id'],
            'country': detail['snippet'].get('country', ''),
            'viewCount': detail['statistics'].get('viewCount', ''),
            'subscriberCount': detail['statistics'].get('subscriberCount', ''),
            'videoCount': detail['statistics'].get('videoCount', '')
        })

# Convert the list of dicts to a DataFrame
df = pd.DataFrame(data)

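# Make columns integers (missing values, e.g. hidden subscriber counts, become 0)
count_columns = ["viewCount", "subscriberCount", "videoCount"]
df[count_columns] = df[count_columns].apply(pd.to_numeric, errors="coerce").fillna(0).astype(int)

print(f"Before removing duplicates: {len(df)}")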

# Remove duplicates
df = df.drop_duplicates(subset='id', keep="first")

print(f"Before filter: {len(df)}")

# Filter only channels which have more than 50 subscribers
df = df[df['subscriberCount'] > 50]

# Filter only channels which have more than 5 videos
df = df[df['videoCount'] > 5]

# Sort df by subscribers
df = df.sort_values('subscriberCount', ascending=False)

print(f"Final result: {len(df)}")

# Get the current date and time and format it
current_datetime = datetime.now().strftime("%Y-%m-%d-%H-%M")
# Build a filename-safe slug from all queries
query_slug = slugify(" ".join(queries))
filename = f"{current_datetime}-{regionCode}-{query_slug}-channel_details.xlsx"

# Save the DataFrame to an Excel file with the prefixed filename
df.to_excel(filename, index=False)

df

The script performs the following steps for each keyword:

  • Find relevant channels
  • Find relevant videos
  • Identify the channel-IDs of the channels and videos and remove duplicates
  • Gather data from each channel-ID
  • Filter out channels that do not meet the required number of subscribers
  • Filter out channels that do not have the required number of videos
  • Save everything as an Excel-file
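
The code shown above filters by subscriber and video counts only. If you also want to keep only channels whose description mentions one of your keywords, a simple case-insensitive check could be sketched as follows. It works on a list of dictionaries in the same shape as the data list, before the DataFrame is built; the sample rows and the helper name matches_keywords are made up for illustration.

```python
def matches_keywords(description, keywords):
    """Return True if any keyword occurs in the description (case-insensitive)."""
    text = description.lower()
    return any(keyword.lower() in text for keyword in keywords)


# Made-up sample rows in the same shape as the `data` list above
sample_data = [
    {"id": "channel-a", "description": "Tips for Amazon FBA sellers"},
    {"id": "channel-b", "description": "Daily cooking recipes"},
    {"id": "channel-c", "description": ""},
]
keywords = ["amazon fba", "amazon marketing"]

filtered = [row for row in sample_data if matches_keywords(row["description"], keywords)]
print([row["id"] for row in filtered])  # prints ['channel-a']
```

A plain substring check like this is strict: a channel that writes "FBA on Amazon" would not match "amazon fba", so you may want to add shorter keywords as well.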

Listing the data

The script saves the potentially long list of YouTube channels as an Excel file. It is still advisable to review the list and remove channels that do not meet your requirements, as some of them may not reach your desired audience.

The entire Google Colab notebook can be found here →.

To use the notebook, you must first insert your own API key.

Good Luck!
