
How to Identify Missing Amazon Listing Keywords Using Brand Analytics and Python

11/03/2021 • Reading time: ca. 6 min • by Trutz Fries

For search engine optimization on Amazon (Amazon SEO), it is essential that the listing contains the relevant keywords.

In this article, we'll show you how to identify missing keywords for your listings using the Brand Analytics Search Term Report and some Python code.

  1. What is included in the Brand Analytics search term report?
  2. How to find missing keywords?
  3. What data is required?
  4. Identify keyword ideas
    1. Load Brand Analytics data
    2. Load listing data
    3. For which keywords does our product rank? (Step 1)
    4. Which ASINs rank for the keywords from step 1? (Step 2)
    5. For which keywords do these ASINs rank? (Step 3)
    6. Which keywords are new? (Step 4)
    7. Which keywords are missing in the listing? (Step 5)
  5. Conclusion

What is included in the Brand Analytics search term report?

The Brand Analytics search term report is published weekly by Amazon. It contains up to one million of the most frequently used search terms on Amazon for the US or other marketplaces. For each search term, it also lists the three ASINs that customers clicked most frequently after searching for it. You will find further information about the report in our article about the Brand Analytics search term report.

Amazon Brand Analytics Searchterm Report

How to find missing keywords?

The idea of identifying missing keywords for your listing is quite simple and can be broken down into five steps:

We determine ...

  1. ... for which search terms your product is clicked
  2. ... which other ASINs are clicked for these search terms determined in 1.
  3. ... for which search terms the ASINs from 2. are clicked on
  4. ... which search terms are contained in 3. but not in 1.
  5. ... which search terms from 4. are not included in the listing

Once coded, the script can be executed for all products. But more about this later.
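
The five steps above can be sketched as one small function. This is only a minimal illustration under a few assumptions: the function name `keyword_ideas`, the toy data and the placeholder ASINs ("OWN", "COMP1", "OTHER") are invented, and the report is assumed to be available in a long format with one row per search term and clicked ASIN. Unlike the walkthrough further down, this sketch checks whole search phrases against the listing text in step 5.

```python
import pandas as pd

def keyword_ideas(df_long, asin, listing_text):
    """Run the five steps for one ASIN.

    df_long is assumed to have the columns 'searchterm' and 'ASIN'
    (one row per search term and clicked ASIN).
    """
    # Step 1: search terms for which our product is clicked
    owned = set(df_long.loc[df_long["ASIN"] == asin, "searchterm"])
    # Step 2: other ASINs clicked for these search terms
    others = set(df_long.loc[df_long["searchterm"].isin(owned), "ASIN"]) - {asin}
    # Step 3: search terms for which those ASINs are clicked
    competitor_terms = set(df_long.loc[df_long["ASIN"].isin(others), "searchterm"])
    # Step 4: search terms contained in step 3 but not in step 1
    new_terms = competitor_terms - owned
    # Step 5: search terms from step 4 that do not appear in the listing text
    text = listing_text.lower()
    return sorted(term for term in new_terms if term.lower() not in text)

# Toy data, invented for illustration only
toy = pd.DataFrame({
    "searchterm": ["glass bowl", "glass bowl", "mixing bowl", "glass storage", "dog toy"],
    "ASIN": ["OWN", "COMP1", "COMP1", "COMP1", "OTHER"],
})
ideas = keyword_ideas(toy, "OWN", "Glass storage set with lids")
print(ideas)
```

Using Python sets keeps each step to one line and removes duplicates automatically.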

What data is required?

To determine the keyword ideas, you need two sets of data:

  1. a current Brand Analytics search term report. The report is available to brand owners on Amazon and can be downloaded from Brand Analytics.
  2. a CSV file with your own listing data.

The CSV file with the listing data should contain the following columns:

  • Marketplace (e.g. "US")
  • ASIN
  • Title
  • Bullet 1
  • Bullet 2
  • Bullet 3
  • Bullet 4
  • Bullet 5

Optionally, you can also add the backend keywords (also called "hidden keywords") to the script if you want to check them as well.
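
Before running the analysis, it can help to check that the listing file really contains these columns. The following is a minimal sketch, assuming the column names from the list above; the helper `load_listings`, the in-memory sample file and the placeholder ASIN are hypothetical and only serve as illustration.

```python
import io
import pandas as pd

# Column names as listed above (assumed to match the CSV header exactly)
REQUIRED_COLUMNS = ["Marketplace", "ASIN", "Title",
                    "Bullet 1", "Bullet 2", "Bullet 3", "Bullet 4", "Bullet 5"]

def load_listings(source):
    """Read listing data and fail early if a required column is missing."""
    # dtype=str and keep_default_na=False keep empty bullets as empty strings
    df = pd.read_csv(source, dtype=str, keep_default_na=False)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError("Listing file lacks columns: " + ", ".join(missing))
    return df

# In-memory stand-in for a real file; the row is invented
sample_csv = io.StringIO(
    "Marketplace,ASIN,Title,Bullet 1,Bullet 2,Bullet 3,Bullet 4,Bullet 5\n"
    "US,B000000000,Glass bowl set,Oven safe,,,,\n"
)
listings = load_listings(sample_csv)
print(listings.shape)
```

Reading empty bullets as empty strings avoids errors later when the bullet points are joined into one text.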

Identify keyword ideas

Let's go through the steps using an example. Here we use a glass storage container as the example product.

Example Storage Container

Now we go through the steps explained above. Before that, we have to load our data.

Load Brand Analytics data

First of all, we load the Brand Analytics search term data:

import pandas as pd

fileNameBA = "./Amazon-Searchterms-US.csv"
thousandSeparator = "," # US
columns = ["Search Term","Search Frequency Rank","#1 Clicked ASIN","#2 Clicked ASIN","#3 Clicked ASIN"] # US

# Load data (skiprows=1 skips the first line of the file; malformed rows raise an error by default)
dfBA = pd.read_csv(fileNameBA, thousands=thousandSeparator, usecols=columns, engine="python", encoding='utf-8', skiprows=1, sep=",")

# Rename columns
dfBA.columns = ['searchterm', 'rank', '1', '2', '3']

# Melt dfBA from wide to long (one row per search term and clicked ASIN)
dfBA_Long = dfBA.melt(id_vars=["searchterm", "rank"], var_name="position", value_name="ASIN")

# Make position an int
dfBA_Long = dfBA_Long.astype({"position": int})

# Drop N/A
dfBA_Long = dfBA_Long.dropna()

# Use the search term as index and sort by it
dfBA_Long_WithIndex = dfBA_Long.set_index('searchterm')
dfBA_Long_WithIndex = dfBA_Long_WithIndex.sort_index()

We have unpivoted the data so that it is now in the following format (excerpt):

searchterm                rank      position  ASIN
shaw hart                 237,620   3         B08T9TTC67
nautical rug              344,663   1         B01DVDBSWG
belts for boys            167,907   1         B08P8K1HRF
hori switch controller    368,320   1         B08KT7ML1R
rear window prime video   352,193   2         B00D5UK8DQ
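
To see what the wide-to-long step does, here is a small sketch on invented data. The search terms, ranks and ASIN placeholders are made up; only the column names follow the renamed columns used above.

```python
import pandas as pd

# Wide format: one row per search term, one column per click position
wide = pd.DataFrame({
    "searchterm": ["glass bowl", "mixing bowl"],
    "rank": [10, 20],
    "1": ["A1", "B1"],
    "2": ["A2", None],   # no second ASIN reported for this search term
    "3": ["A3", "B3"],
})

# Long format: one row per search term and clicked ASIN
long = wide.melt(id_vars=["searchterm", "rank"], var_name="position", value_name="ASIN")
long = long.astype({"position": int}).dropna().reset_index(drop=True)
print(long)
```

Two search terms with three positions each give six rows; the row without an ASIN is dropped, so five rows remain.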

Load listing data

Next, we need the listing information. We assume it has already been loaded into a dataframe called df_Products:

Listing data for our product

Now we have everything we need. Time for our five-step approach!

For which keywords does our product rank? (Step 1)

First, we determine in the Brand Analytics search term report for which keywords our product is found:

# Get "owned keywords", i.e. keywords the ASIN in question is clicked on already
ASIN = 'B0000BYCGF'
ownedKeywords = []
foundKeywords = dfBA_Long[dfBA_Long['ASIN'] == ASIN].searchterm.unique()
ownedKeywords.append(foundKeywords)

The result:

searchterm                           rank      position  ASIN
glass bowls with lids                48,139    1         B0000BYCGF
pyrex bowls with lids                150,202   1         B0000BYCGF
glass bowl with lid                  394,231   1         B0000BYCGF
glass bowls with lids food storage   480,649   1         B0000BYCGF
pyrex bowls                          53,075    2         B0000BYCGF
pyrex glass bowls with lids          406,410   2         B0000BYCGF
pyrex containers                     423,993   2         B0000BYCGF
pyrex storage                        433,550   2         B0000BYCGF
pyrex storage containers with lids   7,893     3         B0000BYCGF
pyrex bowl                           330,919   3         B0000BYCGF
glass pyrex containers with lids     339,344   3         B0000BYCGF
pyrex glass                          456,289   3         B0000BYCGF

Or shown in an array format:

['glass bowls with lids' 'pyrex bowls with lids' 'glass bowl with lid'
 'glass bowls with lids food storage' 'pyrex bowls'
 'pyrex glass bowls with lids' 'pyrex containers' 'pyrex storage'
 'pyrex storage containers with lids' 'pyrex bowl'
 'glass pyrex containers with lids' 'pyrex glass']

Which ASINs rank for the keywords from step 1? (Step 2)

Next, we look at the other ASINs that are clicked for these keywords:

# Get other ASINs from competitors for ownedKeywords
otherASINs = []
for searchterm in ownedKeywords[0]:
    # print(searchterm)
    foundASINs = dfBA_Long[dfBA_Long['searchterm'] == searchterm].ASIN.unique().flatten()
    otherASINs.append(foundASINs)

# Flatten array of arrays
flat_list_ASINs = [item for sublist in otherASINs for item in sublist]

# Make array unique
flat_list_ASINs = set(flat_list_ASINs)

# Remove own ASIN
flat_list_ASINs.remove(ASIN)

We receive these ASINs:

{'B00LGLHUA0',
 'B00M2J7PCI',
 'B0157G34AY',
 'B0161EG5IE',
 'B07L51SFVS',
 'B07VKSNSTB',
 'B07WT6K984',
 'B082SN4QH6',
 'B08FCBVY8G',
 'B08HR5815V',
 'B08VD783DS'}
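
The loop in step 2 can also be written without an explicit loop. The following is a sketch of an alternative, not the notebook's code: the helper name `competitor_asins` and the toy data are hypothetical, and the dataframe is assumed to have the columns 'searchterm' and 'ASIN'.

```python
import pandas as pd

def competitor_asins(df_long, owned_keywords, own_asin):
    """Return all ASINs clicked for the owned keywords, excluding our own."""
    mask = df_long["searchterm"].isin(owned_keywords)
    asins = set(df_long.loc[mask, "ASIN"])
    asins.discard(own_asin)  # discard() does not fail if the ASIN is absent
    return asins

# Toy data, invented for illustration only
toy = pd.DataFrame({
    "searchterm": ["glass bowl", "glass bowl", "pyrex bowl", "dog toy"],
    "ASIN": ["OWN", "COMP1", "COMP2", "OTHER"],
})
found = competitor_asins(toy, ["glass bowl", "pyrex bowl"], "OWN")
print(found)
```

`isin` filters all owned keywords in one pass, which tends to matter once the report has several hundred thousand rows.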

For which keywords do these ASINs rank? (Step 3)

Now we check for which keywords the ASINs above from step 2 are clicked:

# Get keywords the other ASINs are clicked on
keywordsFromOtherASINs = dfBA_Long[dfBA_Long['ASIN'].isin(flat_list_ASINs)].searchterm.unique()

We receive the result (excerpt for readability):

['pyrex' 'pyrex storage containers with lids'
 'glass storage containers with lids' 'mixing bowl'
 'pyrex glass storage containers with lids' 'glass measuring cup'
 'pyrex measuring cup' 'glass bowls' 'glass mixing bowls' 'glass bowl'
 'pyrex bowls' 'glass tupperware sets with lids' 'measuring cup set'
 'liquid measuring cups' 'food storage containers glass'
 'glass containers with lids' 'pyrex mixing bowls' 'measuring cups glass'
 ...
 'glass food containers with lids' 'kitchen necessities' 'cooking bowls'
 'kitchen essentials for new home' 'storage containers for food'
 'pyrex 2 cup' 'measuring bowls' 'glass food container'
 'glass storage container' 'clear bowl']

Which keywords are new? (Step 4)

We now need to determine which keywords are included in step three but are not found in step one.

import numpy as np

# Get keywords which other ASINs are clicked on but not the own ASIN yet
A = np.array(ownedKeywords)
B = np.array(keywordsFromOtherASINs)
# setdiff1d returns the sorted, unique values of B that are not in A
missingKeywords = np.setdiff1d(B, A)

We receive the result:

['baking bowls' 'big bowl' 'clear bowl' 'cooking bowls'
 'food containers glass' 'food storage containers glass'
 'food storage glass' 'glass airtight food storage containers'
 'glass bowl' 'glass bowl set' 'glass bowls' 'glass containers'
 'glass containers for food storage with lids'
 'glass containers with lids' 'glass food container'
 'glass food containers' 'glass food containers with lids'
 'glass food storage' 'glass food storage containers'
 'glass food storage containers with lids'
 'glass food storage containers with lids airtight'
 'glass kitchen storage containers' 'glass measuring cup'
 'glass measuring cups' 'glass measuring cups pyrex'
 'glass measuring cups set' 'glass mixing bowl' 'glass mixing bowl set'
 'glass mixing bowls' 'glass mixing bowls with lids'
 'glass mixing bowls with lids set' 'glass serving bowl' 'glass storage'
 'glass storage container' 'glass storage containers'
 'glass storage containers with lids' 'glass tupperware set'
 'glass tupperware sets with lids' 'kitchen essentials for new home'
 'kitchen necessities' 'large glass bowl' 'large measuring cup'
 'liquid measuring cup' 'liquid measuring cup glass'
 'liquid measuring cups' 'measure cup' 'measure cups' 'measurement cup'
 'measuring' 'measuring bowls' 'measuring cup' 'measuring cup glass'
 'measuring cup set' 'measuring cups' 'measuring cups glass'
 'measuring glass' 'measuring tools & scales' 'mesururing cup'
 'mixing bowl' 'mixing bowls glass' 'pyrex' 'pyrex 2 cup'
 'pyrex 2 cup measuring cup glass' 'pyrex glass bowls'
 'pyrex glass measuring cup' 'pyrex glass storage containers'
 'pyrex glass storage containers with lids' 'pyrex measuring cup'
 'pyrex measuring cup set' 'pyrex measuring cups' 'pyrex mixing bowls'
 'pyrex mixing bowls with lids' 'pyrex set' 'storage containers for food'
 'tupperware glass' 'tupperware sets glass']
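
The set difference used in step 4 can be checked on a small, invented example. The keyword lists here are made up; the second variant with plain Python sets is only shown as an equivalent alternative.

```python
import numpy as np

owned = np.array(["pyrex bowls", "glass bowl with lid"])
candidates = np.array(["pyrex", "pyrex bowls", "glass bowls", "pyrex"])

# Values of candidates that are not in owned, sorted and without duplicates
new_keywords = np.setdiff1d(candidates, owned)

# Equivalent result using plain Python sets
new_keywords_sets = sorted(set(candidates.tolist()) - set(owned.tolist()))
print(new_keywords.tolist())
```

Both variants drop the duplicate "pyrex" and the already owned "pyrex bowls".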

Afterwards we split the search terms into single words and keep all words with at least four characters:

# Flatten array of arrays and remove duplicates using a set
missingKeywords_flattened = set(' '.join(missingKeywords).split(' '))

# Only keep keywords which have a minimum length of 4 characters
missingKeywords_flattened_reduced = [word for word in missingKeywords_flattened if len(word) >= 4]
missingKeywords_flattened_reduced.sort()

We receive the following result:

['airtight', 'baking', 'bowl', 'bowls', 'clear', 'container', 'containers', 'cooking', 'cups', 'essentials', 'food', 'glass', 'home', 'kitchen', 'large', 'lids', 'liquid', 'measure', 'measurement', 'measuring', 'mesururing', 'mixing', 'necessities', 'pyrex', 'scales', 'serving', 'sets', 'storage', 'tools', 'tupperware', 'with']
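
The splitting and filtering can be wrapped in a small helper so that the minimum length is easy to change. This is a sketch only; the function name `split_into_terms` and the example phrases are hypothetical.

```python
def split_into_terms(phrases, min_length=4):
    """Split search phrases into unique single words of at least min_length characters."""
    words = set()
    for phrase in phrases:
        words.update(phrase.lower().split())
    return sorted(word for word in words if len(word) >= min_length)

terms = split_into_terms(["glass measuring cup", "measuring cup set", "big bowl"])
print(terms)
```

Note that short but meaningful words such as "cup" or "set" are dropped with a minimum length of 4, which is also why they do not appear in the result above.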

Which keywords are missing in the listing? (Step 5)

Now we check which keywords from step 4 are missing in our listing.

To do so, we convert everything to lower case and combine the bullet points into one long string.

# Get the product title for ASIN in question
productTitle = df_Products[df_Products['ASIN (child)'] == ASIN]['Product Title'].values[0].lower()

# Get a string of all 5 bullet points
allBullets = []
currentProduct = df_Products[df_Products["ASIN (child)"] == ASIN]
for i in range(1, 6):
    allBullets.append(currentProduct['Bullet Point ' + str(i)].values[0])

allBulletsCombined = ' '.join(allBullets)

Afterwards we can check whether a keyword is included:

# Check if a term from missingKeywords_flattened_reduced is not in product title or bullet
termsNotFoundInListing = []
for term in missingKeywords_flattened_reduced:
    if (term.lower() not in productTitle) and (term.lower() not in allBulletsCombined.lower()):
        termsNotFoundInListing.append(term)

print("Missing keywords: " + str(termsNotFoundInListing))

As a result, we get the following:

Missing keywords: ['airtight', 'baking', 'bowl', 'bowls', 'clear', 'cooking', 'cups', 'essentials', 'home', 'kitchen', 'large', 'liquid', 'measure', 'measurement', 'measuring', 'mesururing', 'mixing', 'necessities', 'scales', 'serving', 'sets', 'tools', 'tupperware']
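
The check above is a substring test, so a term like "bowl" counts as present as soon as "bowls" appears in the listing. If you prefer to treat singular and plural as different keywords, a whole-word comparison is one option. The following sketch uses a regular expression with word boundaries; the function name `missing_terms` and the sample listing text are hypothetical.

```python
import re

def missing_terms(terms, listing_text, whole_words=True):
    """Return the terms that do not occur in the listing text."""
    text = listing_text.lower()
    missing = []
    for term in terms:
        if whole_words:
            # \b matches a word boundary, so "bowl" does not match inside "bowls"
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            found = re.search(pattern, text) is not None
        else:
            found = term.lower() in text
        if not found:
            missing.append(term)
    return missing

listing = "Glass bowls with lids for food storage"
strict = missing_terms(["bowl", "bowls", "airtight"], listing)
loose = missing_terms(["bowl", "bowls", "airtight"], listing, whole_words=False)
print(strict, loose)
```

Which variant is more useful depends on how you want to handle word forms, so both are shown side by side.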

If appropriate, these keywords should be added to the listing; with a bit of luck, the product will soon be clicked for these search terms as well. Of course, the keywords should not be used blindly: check that each one actually fits the product.

Conclusion

If you have the data, even a large product catalog can be checked for missing keywords within seconds. The notebook linked below was developed for this purpose: it determines which keywords still need to be added for each product and saves the results in an Excel file.
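
How the results per product could be collected into one table is sketched below. This is not the code from the notebook: the helper `suggestions_to_frame`, the placeholder ASINs and the file name in the comment are assumptions for illustration.

```python
import pandas as pd

def suggestions_to_frame(suggestions):
    """Turn a dict of ASIN -> list of missing keywords into one table."""
    rows = [
        {"ASIN": asin, "Missing keywords": ", ".join(sorted(terms))}
        for asin, terms in suggestions.items()
    ]
    return pd.DataFrame(rows, columns=["ASIN", "Missing keywords"])

# Invented example results for two products
report = suggestions_to_frame({"ASIN-A": ["mixing", "airtight"], "ASIN-B": []})
print(report)

# Writing the table to Excel needs an Excel writer such as openpyxl:
# report.to_excel("keyword-suggestions.xlsx", index=False)
```

One row per ASIN keeps the file easy to filter and to hand over to whoever maintains the listings.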

Keyword Suggestions in Excel

Here → you can download the entire notebook.
