How I Use the Google Trends Dataset on BigQuery to Stay Tuned With the World
If you are interested in blogging, you have encountered Google Trends at one time or another. It’s a fantastic place to find trending topics to write about, and many professional bloggers consider it an essential website in their toolbox.
But did you know that you can access some of the raw data for free?
In Google BigQuery, you can find the two following datasets (both are included in BigQuery’s 1TB/mo of free tier processing):
Each has two tables, one storing the 25 most searched terms and another storing the 25 fastest-rising terms.
But the data is far from self-explanatory, and I banged my head against the wall for days before I understood how to use these two datasets.
Let me break it down for you.
About the data
For the international data, we have the following columns:
country_name
region_name
term
rank
score
week
refresh_date
percent_gained
(fastest-rising only)
The United States data has dma_name
instead of region and country. That's the geographic division used to measure local television viewing.
Now, the most confusing columns in the dataset are refresh_date
and week
because they don't mean what I thought they did.
Every day, the dataset gets updated with the 25 most searched terms (and fastest-rising) for each country. Each term receives a rank
between 1 and 25.
refresh_date
is the day when the data was added. It expires after 30 days, giving way to more recent information.
When I ran the following query, I was surprised when every one of those rows had the same term
but different values in score
, week
, and region_name
.
SELECT term, week, rank, score, region_name
FROM `bigquery-public-data.google_trends.international_top_terms`
WHERE refresh_date = "2024-01-14"
AND country_name = "Sweden"
AND rank = 1
After a while, I realized that score
, week
, and region_name
show how the term's relative popularity has changed over time and between regions.
In other words, these columns are only relevant when you look at the history of a term, but you can’t use them in relation to other terms.
A potential application
Unfortunately, only looking at the top 25 terms is not enough to create real value for businesses or individuals.
The best use case I could come up with was to run the following query in order to continuously get the highest ranking terms right now for each country.
SELECT
country_name,
term,
ARRAY_AGG(STRUCT(rank,week) ORDER BY week DESC, refresh_date DESC LIMIT 1) x
FROM
`bigquery-public-data.google_trends.international_top_terms`
WHERE
refresh_date =
(SELECT
MAX(refresh_date)
FROM
`bigquery-public-data.google_trends.international_top_terms`)
AND
week =
(SELECT
MAX(week)
FROM
`bigquery-public-data.google_trends.international_top_terms`
)
GROUP BY
country_name, term
ORDER BY
(SELECT country_name FROM UNNEST(x)), (SELECT rank FROM UNNEST(x))
You can look at specific countries to understand what’s important in their part of the world. It’s a good way to improve your general knowledge.
I created the following page on my website to showcase the latest top results for all countries in the dataset: https://oscarleo.com/google-trends
Conclusion
Today, you learned how to use the Google Trends datasets in BigQuery.
It’s an exciting dataset that can improve your understanding of the world, but the open data doesn’t have enough information to provide real value to businesses.
Thank you for reading!