Amazon Keywords Research: Brand Analytics Python Tutorial

Understanding Amazon Brand Analytics Search Term Report
How to Find Missing Amazon Keywords with Python
Required Data for Amazon Keyword Analysis
Step-by-Step Amazon Keyword Discovery Process
Key Takeaways for Amazon Keyword Optimization

Maximizing your Amazon keyword strategy is crucial for product visibility. This guide shows you how to use Amazon Brand Analytics and Python to identify missing keywords that your competitors rank for. By analyzing search term data programmatically, you can uncover valuable keyword opportunities and optimize your listings for better search performance.

For more Amazon optimization strategies, check out our guides on Amazon SEO basics and backend keyword optimization.

Understanding Amazon Brand Analytics Search Term Report

The Brand Analytics search term report is published weekly by Amazon. It contains up to 1 million of the most frequently used search terms on Amazon for the US or other marketplaces. For each search term, it also lists the three ASINs most frequently clicked by customers after searching for it. You will find further information about the report in our article about the Brand Analytics Search Term Report.

Amazon Brand Analytics Searchterm Report

How to Find Missing Amazon Keywords with Python

The idea of identifying missing keywords for your listing is quite simple and can be broken down into five steps. We determine:

1. for which search terms your product is clicked
2. which other ASINs are clicked for the search terms from step 1
3. for which search terms the ASINs from step 2 are clicked
4. which search terms are contained in step 3 but not in step 1
5. which search terms from step 4 are not included in the listing

Once coded, the script can be executed for all products. But more about this later.
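
As a rough sketch of how the five steps fit together, the following plain-Python example runs them on a tiny, made-up click table. The function name missing_terms, the sample rows, and the listing text are hypothetical and only illustrate the set logic; the real report is processed with pandas in the steps further down.

```python
# Hypothetical toy data: (search term, clicked ASIN) pairs, not real report rows
clicks = [
    ("glass bowls with lids", "OWN"),
    ("glass bowls with lids", "COMP1"),
    ("glass mixing bowls", "COMP1"),
    ("measuring cup", "COMP2"),
]


def missing_terms(clicks, asin, listing_text):
    """Run the five steps on a list of (search term, ASIN) pairs."""
    # 1. Search terms for which our product is clicked
    owned = {term for term, a in clicks if a == asin}
    # 2. Other ASINs clicked for those search terms
    competitors = {a for term, a in clicks if term in owned and a != asin}
    # 3. Search terms for which those ASINs are clicked
    competitor_terms = {term for term, a in clicks if a in competitors}
    # 4. Search terms from step 3 that are not in step 1
    new_terms = competitor_terms - owned
    # 5. Words from step 4 that do not appear in the listing text
    words = {word for term in new_terms for word in term.split()}
    return sorted(w for w in words if w not in listing_text.lower())


result = missing_terms(clicks, "OWN", "Glass bowls with lids, set of 3")
print(result)
```

In this toy case only the word "mixing" is reported, because "glass" and "bowls" already appear in the listing text and "measuring cup" belongs to an ASIN that shares no search term with our product.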

Required Data for Amazon Keyword Analysis

To determine the keyword ideas, you need two sets of data:

A current Brand Analytics search term report. The report is available to brand owners on Amazon and can be downloaded from Brand Analytics.
A CSV file with your own listing data.

The CSV file with the listing data should contain the following columns:

Marketplace (e.g. "US")
ASIN
Title
Bullet 1
Bullet 2
Bullet 3
Bullet 4
Bullet 5

Optionally, you can add the backend keywords (also called "hidden keywords") to the script if you want to check them as well.
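
A minimal sketch of loading such a listing file and checking that the expected columns are present could look like this. The helper name load_listings and the inline sample data are assumptions for illustration; the column names are taken from the list above.

```python
import io

import pandas as pd

# Column names taken from the list above
REQUIRED_COLUMNS = ["Marketplace", "ASIN", "Title"] + [f"Bullet {i}" for i in range(1, 6)]


def load_listings(csv_source):
    """Read the listing CSV and fail early if a required column is missing."""
    df = pd.read_csv(csv_source, dtype=str)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Listing file is missing columns: {missing}")
    # Empty cells become empty strings so later text checks do not hit NaN
    return df.fillna("")


# Hypothetical sample file; in practice pass the path to your own CSV
sample_csv = io.StringIO(
    "Marketplace,ASIN,Title,Bullet 1,Bullet 2,Bullet 3,Bullet 4,Bullet 5\n"
    "US,B000000000,Glass Bowl Set,Durable glass,With lids,,,\n"
)
listings = load_listings(sample_csv)
print(listings.loc[0, "Title"])
```

Failing early on a missing column is a design choice here: it avoids a less obvious KeyError later, when the title and bullet points are read for a single ASIN.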

Step-by-Step Amazon Keyword Discovery Process

Let's go through the steps using an example. Here we take a glass storage container (ASIN B0000BYCGF) as the product.

Example Storage Container

Now we go through the steps explained above. Before that, we have to load our data.

Step 1: Load Amazon Brand Analytics Data

First of all, we load the Brand Analytics search term data:

import pandas as pd

fileNameBA = "./Amazon-Searchterms-US.csv"
thousandSeparator = ","  # US
columns = ["Search Term", "Search Frequency Rank", "#1 Clicked ASIN", "#2 Clicked ASIN", "#3 Clicked ASIN"]  # US

# Load data; skiprows=1 skips the first line of the file, so the header is read from the second line.
# on_bad_lines="error" replaces error_bad_lines=True, which was removed in pandas 2.0.
dfBA = pd.read_csv(fileNameBA, thousands=thousandSeparator, usecols=columns, engine="python", on_bad_lines="error", encoding="utf-8", skiprows=1, sep=",")

# Rename columns
dfBA.columns = ['searchterm', 'rank', '1', '2', '3']

# Melt dfBA from wide to long: one row per search term and click position
dfBA_Long = dfBA.melt(id_vars=["searchterm", "rank"], var_name="position", value_name="ASIN")

# Make position an int
dfBA_Long = dfBA_Long.astype({"position": int})

# Drop N/A
dfBA_Long = dfBA_Long.dropna()

# Set the search term as index and sort by it
dfBA_Long_WithIndex = dfBA_Long.set_index('searchterm')
dfBA_Long_WithIndex = dfBA_Long_WithIndex.sort_index()

We have unpivoted the data so that it is now in the following format (excerpt):

searchterm	rank	position	ASIN
shaw hart	237,620	3	B08T9TTC67
nautical rug	344,663	1	B01DVDBSWG
belts for boys	167,907	1	B08P8K1HRF
hori switch controller	368,320	1	B08KT7ML1R
rear window prime video	352,193	2	B00D5UK8DQ
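
To make the wide-to-long step easier to follow, here is a small sketch on two made-up report rows. The ASIN values and the variable names wide and long_df are placeholders, not data from the real report.

```python
import pandas as pd

# Two made-up report rows in the wide layout (one column per click position)
wide = pd.DataFrame(
    {
        "searchterm": ["glass bowl", "measuring cup"],
        "rank": [10, 20],
        "1": ["ASIN_A", "ASIN_C"],
        "2": ["ASIN_B", None],
    }
)

# One row per (search term, position) pair; rows without an ASIN are dropped
long_df = wide.melt(id_vars=["searchterm", "rank"], var_name="position", value_name="ASIN")
long_df = long_df.astype({"position": int}).dropna()
print(long_df)
```

The long layout makes it possible to filter on a single ASIN column instead of checking three separate columns in every later step.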

Load listing data

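We still need the listing information. We assume it has already been loaded into a dataframe called df_Products. Note that the code below accesses the columns 'ASIN (child)', 'Product Title', and 'Bullet Point 1' to 'Bullet Point 5', so rename the columns of your CSV file accordingly.

Listing data for our product

We now have everything we need. Time for our five-step approach!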

Step 2: Find Current Amazon Keywords for Your Product

First, we look up in the Brand Analytics search term report the keywords for which our product is already clicked:

# Get "owned keywords", i.e. keywords the ASIN in question is clicked on already
ASIN = 'B0000BYCGF'
ownedKeywords = []
foundKeywords = dfBA_Long[dfBA_Long['ASIN'] == ASIN].searchterm.unique()
ownedKeywords.append(foundKeywords)

The matching rows in the report:

searchterm	rank	position	ASIN
glass bowls with lids	48,139	1	B0000BYCGF
pyrex bowls with lids	150,202	1	B0000BYCGF
glass bowl with lid	394,231	1	B0000BYCGF
glass bowls with lids food storage	480,649	1	B0000BYCGF
pyrex bowls	53,075	2	B0000BYCGF
pyrex glass bowls with lids	406,410	2	B0000BYCGF
pyrex containers	423,993	2	B0000BYCGF
pyrex storage	433,550	2	B0000BYCGF
pyrex storage containers with lids	7,893	3	B0000BYCGF
pyrex bowl	330,919	3	B0000BYCGF
glass pyrex containers with lids	339,344	3	B0000BYCGF
pyrex glass	456,289	3	B0000BYCGF

Or shown in an array format:

['glass bowls with lids' 'pyrex bowls with lids' 'glass bowl with lid'
 'glass bowls with lids food storage' 'pyrex bowls'
 'pyrex glass bowls with lids' 'pyrex containers' 'pyrex storage'
 'pyrex storage containers with lids' 'pyrex bowl'
 'glass pyrex containers with lids' 'pyrex glass']

Step 3: Identify Competitor ASINs for Your Keywords

Let's look at the other ASINs that are clicked for these keywords:

# Get other ASINs from competitors for ownedKeywords
otherASINs = []
for searchterm in ownedKeywords[0]:
    # print(searchterm)
    foundASINs = dfBA_Long[dfBA_Long['searchterm'] == searchterm].ASIN.unique().flatten()
    otherASINs.append(foundASINs)

# Flatten array of arrays
flat_list_ASINs = [item for sublist in otherASINs for item in sublist]

# Make array unique
flat_list_ASINs = set(flat_list_ASINs)

# Remove own ASIN (discard does not raise an error if it is missing)
flat_list_ASINs.discard(ASIN)

We receive these ASINs:

{'B00LGLHUA0',
 'B00M2J7PCI',
 'B0157G34AY',
 'B0161EG5IE',
 'B07L51SFVS',
 'B07VKSNSTB',
 'B07WT6K984',
 'B082SN4QH6',
 'B08FCBVY8G',
 'B08HR5815V',
 'B08VD783DS'}
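
The loop above can also be written as a single filter. The following sketch assumes a long-format dataframe shaped like dfBA_Long; the sample rows and the names df_long, own_asin, and owned_terms are made up for illustration.

```python
import pandas as pd

# Made-up long-format rows in the same shape as dfBA_Long
df_long = pd.DataFrame(
    {
        "searchterm": ["glass bowl", "glass bowl", "pyrex bowl", "dog toy"],
        "ASIN": ["OWN", "COMP1", "COMP2", "OTHER"],
    }
)
own_asin = "OWN"
owned_terms = ["glass bowl", "pyrex bowl"]

# All ASINs clicked for any owned search term, minus our own ASIN
mask = df_long["searchterm"].isin(owned_terms)
competitor_asins = set(df_long.loc[mask, "ASIN"]) - {own_asin}
print(sorted(competitor_asins))
```

With a report of up to a million search terms, one isin filter is likely to be faster than filtering the dataframe once per keyword, although the result is the same set of ASINs.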

Step 4: Extract Competitor Keywords from Amazon Data

Now we check for which keywords the competitor ASINs found in the previous step are clicked:

# Get keywords the other ASINs are clicked on
keywordsFromOtherASINs = dfBA_Long[dfBA_Long['ASIN'].isin(flat_list_ASINs)].searchterm.unique()

We receive the result (excerpt for readability):

['pyrex' 'pyrex storage containers with lids'
 'glass storage containers with lids' 'mixing bowl'
 'pyrex glass storage containers with lids' 'glass measuring cup'
 'pyrex measuring cup' 'glass bowls' 'glass mixing bowls' 'glass bowl'
 'pyrex bowls' 'glass tupperware sets with lids' 'measuring cup set'
 'liquid measuring cups' 'food storage containers glass'
 'glass containers with lids' 'pyrex mixing bowls' 'measuring cups glass'
 ...
 'glass food containers with lids' 'kitchen necessities' 'cooking bowls'
 'kitchen essentials for new home' 'storage containers for food'
 'pyrex 2 cup' 'measuring bowls' 'glass food container'
 'glass storage container' 'clear bowl']

Step 5: Discover New Keyword Opportunities

We now determine which of the competitor keywords (step 3 of our approach) are not yet among our own keywords (step 1 of our approach).

import numpy as np

# Get keywords which other ASINs are clicked on but not the own ASIN yet
A = np.concatenate(ownedKeywords)  # ownedKeywords is a list of arrays
B = np.array(keywordsFromOtherASINs)
missingKeywords = np.setdiff1d(B, A)

We receive the result:

['baking bowls' 'big bowl' 'clear bowl' 'cooking bowls'
 'food containers glass' 'food storage containers glass'
 'food storage glass' 'glass airtight food storage containers'
 'glass bowl' 'glass bowl set' 'glass bowls' 'glass containers'
 'glass containers for food storage with lids'
 'glass containers with lids' 'glass food container'
 'glass food containers' 'glass food containers with lids'
 'glass food storage' 'glass food storage containers'
 'glass food storage containers with lids'
 'glass food storage containers with lids airtight'
 'glass kitchen storage containers' 'glass measuring cup'
 'glass measuring cups' 'glass measuring cups pyrex'
 'glass measuring cups set' 'glass mixing bowl' 'glass mixing bowl set'
 'glass mixing bowls' 'glass mixing bowls with lids'
 'glass mixing bowls with lids set' 'glass serving bowl' 'glass storage'
 'glass storage container' 'glass storage containers'
 'glass storage containers with lids' 'glass tupperware set'
 'glass tupperware sets with lids' 'kitchen essentials for new home'
 'kitchen necessities' 'large glass bowl' 'large measuring cup'
 'liquid measuring cup' 'liquid measuring cup glass'
 'liquid measuring cups' 'measure cup' 'measure cups' 'measurement cup'
 'measuring' 'measuring bowls' 'measuring cup' 'measuring cup glass'
 'measuring cup set' 'measuring cups' 'measuring cups glass'
 'measuring glass' 'measuring tools & scales' 'mesururing cup'
 'mixing bowl' 'mixing bowls glass' 'pyrex' 'pyrex 2 cup'
 'pyrex 2 cup measuring cup glass' 'pyrex glass bowls'
 'pyrex glass measuring cup' 'pyrex glass storage containers'
 'pyrex glass storage containers with lids' 'pyrex measuring cup'
 'pyrex measuring cup set' 'pyrex measuring cups' 'pyrex mixing bowls'
 'pyrex mixing bowls with lids' 'pyrex set' 'storage containers for food'
 'tupperware glass' 'tupperware sets glass']
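
As a small illustration of what np.setdiff1d does here, the following sketch uses made-up keyword arrays (the variable names are placeholders). The function returns the sorted, unique values of the first array that do not occur in the second one.

```python
import numpy as np

# Made-up keyword arrays for illustration
owned = np.array(["pyrex bowls", "glass bowl with lid"])
from_competitors = np.array(["glass bowl", "pyrex bowls", "glass bowl", "mixing bowl"])

# Keywords competitors are clicked for, minus the ones we already own
missing = np.setdiff1d(from_competitors, owned)
print(missing.tolist())
```

This is why the result above is alphabetically sorted and free of duplicates, even though the competitor keyword list was not.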

Afterwards we split the search terms into single words and keep all words with at least four characters:

# Split the search terms into single words and remove duplicates using a set
missingKeywords_flattened = set(' '.join(missingKeywords).split(' '))

# Only keep words which have a minimum length of 4
missingKeywords_flattened_reduced = [word for word in missingKeywords_flattened if len(word) >= 4]
missingKeywords_flattened_reduced.sort()

We receive the following result:

['airtight', 'baking', 'bowl', 'bowls', 'clear', 'container', 'containers', 'cooking', 'cups', 'essentials', 'food', 'glass', 'home', 'kitchen', 'large', 'lids', 'liquid', 'measure', 'measurement', 'measuring', 'mesururing', 'mixing', 'necessities', 'pyrex', 'scales', 'serving', 'sets', 'storage', 'tools', 'tupperware', 'with']

Step 6: Identify Missing Keywords in Your Amazon Listing

Now we check which of these words are missing from our listing.

To do so, we convert everything to lower case and join the bullet points into one long string.

# Get the row for the ASIN in question
currentProduct = df_Products[df_Products['ASIN (child)'] == ASIN]

# Get the product title in lower case
productTitle = currentProduct['Product Title'].values[0].lower()

# Get a string of all 5 bullet points
allBullets = []
for i in range(1, 6):
    bullet = currentProduct['Bullet Point ' + str(i)].values[0]
    if pd.notna(bullet):  # skip empty bullet cells, which pandas reads as NaN
        allBullets.append(str(bullet))

allBulletsCombined = ' '.join(allBullets)

Afterwards we can check whether a keyword is included:

# Check if a term from missingKeywords_flattened_reduced is not in product title or bullet
termsNotFoundInListing = []
for term in missingKeywords_flattened_reduced:
    if (term.lower() not in productTitle) and (term.lower() not in allBulletsCombined.lower()):
        termsNotFoundInListing.append(term)

print("Missing keywords: " + str(termsNotFoundInListing))

As a result, we get the following:

Missing keywords: ['airtight', 'baking', 'bowl', 'bowls', 'clear', 'cooking', 'cups', 'essentials', 'home', 'kitchen', 'large', 'liquid', 'measure', 'measurement', 'measuring', 'mesururing', 'mixing', 'necessities', 'scales', 'serving', 'sets', 'tools', 'tupperware']

Where appropriate, these keywords should be added to the listing. With a bit of luck, the product will soon be clicked for them as well. The keywords should not be used blindly, though: check that each one actually fits the product. The list can also contain misspellings from real searches, such as 'mesururing' in this example.
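
One caveat of the check above is that it uses substring matching, so a short word counts as present whenever it occurs inside a longer one. If that is not wanted, a whole-word comparison is one possible alternative. The following is only a sketch; the helper name contains_word and the sample text are hypothetical.

```python
import re


def contains_word(text, word):
    """Return True if `word` occurs in `text` as a whole word (case-insensitive)."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


listing_text = "Glass bowls with lids for your cupboard"
candidates = ["bowl", "bowls", "cup", "lids"]
not_found = [w for w in candidates if not contains_word(listing_text, w)]
print(not_found)
```

Here "bowl" and "cup" are reported as missing although they occur inside "bowls" and "cupboard". Whether singular and plural forms should count as the same keyword is a judgment call that depends on the listing.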

Key Takeaways for Amazon Keyword Optimization

If you have the data, even a large product catalog can be checked for missing keywords within seconds. The script below was developed for this purpose: it determines for each product which keywords still need to be added and saves the results in an Excel file.
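
To give an idea of how such a catalog run could be structured, here is a simplified sketch. It is not the script mentioned above: the function name missing_terms_for_catalog, the sample catalog, and the terms_by_asin dictionary are assumptions that stand in for the real outputs of the previous steps.

```python
import pandas as pd


def missing_terms_for_catalog(listings, terms_by_asin):
    """Build one result row per product with the terms missing from its listing."""
    rows = []
    for _, product in listings.iterrows():
        # Title and all bullet columns joined into one lowercase string
        text = " ".join(str(v) for v in product.drop("ASIN")).lower()
        candidates = terms_by_asin.get(product["ASIN"], [])
        missing = [t for t in candidates if t.lower() not in text]
        rows.append({"ASIN": product["ASIN"], "Missing keywords": ", ".join(missing)})
    return pd.DataFrame(rows)


# Made-up catalog and candidate terms for illustration
listings = pd.DataFrame(
    {
        "ASIN": ["A1", "A2"],
        "Title": ["Glass bowl set", "Measuring cup"],
        "Bullet 1": ["With airtight lids", "Easy to read"],
    }
)
terms_by_asin = {"A1": ["airtight", "mixing"], "A2": ["liquid", "measuring"]}

report = missing_terms_for_catalog(listings, terms_by_asin)
print(report)
# report.to_excel("missing-keywords.xlsx", index=False)  # needs an Excel writer such as openpyxl
```

The Excel export is left commented out because it depends on an additional package being installed; the file name is a placeholder.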