How I Use the Google Trends Dataset on BigQuery to Stay Tuned With the World

How I Use the Google Trends Dataset on BigQuery to Stay Tuned With the World
Photo by Peggy Anke / Unsplash

If you are interested in blogging, you have encountered Google Trends at one time or another. It’s a fantastic place to find trending topics to write about, and many professional bloggers consider it an essential website in their toolbox.

But did you know that you can access some of the raw data for free?

In Google BigQuery, you can find the two following datasets (both are included in BigQuery’s 1TB/mo of free tier processing):

Each has two tables, one storing the 25 most searched terms and another storing the 25 fastest-rising terms.

But the data is far from self-explanatory, and I banged my head against the wall for days before I understood how to use these two datasets.

Let me break it down for you.


About the data

For the international data, we have the following columns:

  • country_name
  • region_name
  • term
  • rank
  • score
  • week
  • refresh_date
  • percent_gained (fastest-rising only)

The United States data has dma_name instead of region and country. That's the geographic division used to measure local television viewing.

Now, the most confusing columns in the dataset are refresh_date and week because they don't mean what I thought they did.

Every day, the dataset gets updated with the 25 most searched terms (and fastest-rising) for each country. Each term receives a rank between 1 and 25.

refresh_date is the day when the data was added. It expires after 30 days, giving way to more recent information.

When I ran the following query, I was surprised when every one of those rows had the same term but different values in scoreweek, and region_name.

SELECT term, week, rank, score, region_name
FROM `bigquery-public-data.google_trends.international_top_terms`
WHERE refresh_date = "2024-01-14"
AND country_name = "Sweden"
AND rank = 1

After a while, I realized that scoreweek, and region_name show how the term's relative popularity has changed over time and between regions.

In other words, these columns are only relevant when you look at the history of a term, but you can’t use them in relation to other terms.


A potential application

Unfortunately, only looking at the top 25 terms is not enough to create real value for businesses or individuals.

The best use case I could come up with was to run the following query in order to continuously get the highest ranking terms right now for each country.

SELECT
    country_name,
    term,
    ARRAY_AGG(STRUCT(rank,week) ORDER BY week DESC, refresh_date DESC LIMIT 1) x
  FROM
      `bigquery-public-data.google_trends.international_top_terms`
  WHERE
      refresh_date = 
          (SELECT
              MAX(refresh_date)
          FROM
          `bigquery-public-data.google_trends.international_top_terms`)
          AND
      week = 
          (SELECT
              MAX(week)
          FROM
          `bigquery-public-data.google_trends.international_top_terms`
          )
  GROUP BY    
      country_name, term
  ORDER BY
      (SELECT country_name FROM UNNEST(x)), (SELECT rank FROM UNNEST(x))

You can look at specific countries to understand what’s important in their part of the world. It’s a good way to improve your general knowledge.

I created the following page on my website to showcase the latest top results for all countries in the dataset: https://oscarleo.com/google-trends


Conclusion

Today, you learned how to use the Google Trends datasets in BigQuery.

It’s an exciting dataset that can improve your understanding of the world, but the open data doesn’t have enough information to provide real value to businesses.

Thank you for reading!