Use SQL Queries in BigQuery to extract data for use in Google Sheets

In this post I’ll describe how you can use SQL queries in BigQuery to extract relevant data and transfer it to Google Sheets.

This could have value if you wanted to share the data with others, or wanted to connect this sheets data to Google Data Studio for whatever reason.

Taking data from BigQuery and putting it into google sheets using SQL

I’ll run through some examples of SQL queries that might be useful.

Using BigQuery Public Datasets

So the Dataset that I’ll use as an example is a table from the usa_names public dataset.

BigQuery has some very interesting public datasets which you can take a look at.

BigQuery public datasets

There are many different varieties that you can choose to access. The one we will be using today is the same one I used in a previous blog post. It is the USA Names dataset from the U.S. Social Security Administration.

USA names public data set in google big query

Let’s take a look at this very interesting dataset in BigQuery.

Looking at the Schema of the dataset.

Before we start using queries, let’s take a moment to examine the Schema of the dataset. We see that we have five fields. Three are strings (state, gender, name) and two are integers (year and number).

Schema details for a BG dataset

We also get a description of each field which helps us to know what each one means.

Looking at the entire dataset

If we want to see the full number of entries in the dataset, we can use the following SQL query;

SELECT * FROM `bigquery-public-data.usa_names.usa_1910_current`;

This selects ALL the data in the dataset, and we see that it has returned a whopping 6 million rows.

Query editor in Google Big Query showing the number of rows per page. SQL Queries in BigQuery.

So I’ll give you some examples of custom SQL queries you can use, and how you can use SQL to choose the data that is relevant to you.

Examples of custom SQL Queries you can run

I’ll run through some SQL Queries in BigQuery that you can adapt for your own datasets.

List of US States

Here are some custom SQL queries you can run to generate unique and smaller datasets.

SELECT DISTINCT state FROM `bigquery-public-data.usa_names.usa_1910_current` LIMIT 1000
query results showing number of US states. SQL Queries in BigQuery.

Let’s look at adding some more information to this query.

List of US States with number of Names, and in descending order

Let’s say we want a list of states, the number of names in each state, and for the list to be in the order of highest to lowest number of names.

To do this we use this query;

SELECT DISTINCT state AS State,
sum (number ) AS number_of_names
FROM `bigquery-public-data.usa_names.usa_1910_current` 
GROUP BY state
ORDER BY number_of_names desc;

We use a few new SQL functions for this query.

  • sum(number) AS number_of_names – gives us the total number of names.
  • GROUP BY state – gives us the total number of names by state.
  • ORDER BY number_of_names desc – sorts the list from most names to least names.
SQL Queries in BigQuery query results image.

List of Names with number of Names, and in descending order

We can use a formula similar to the one before. We just need to replace DISTINCT state with DISTINCT name, and to GROUP BY name, instead of state.

SELECT DISTINCT name AS Name,
sum (number ) AS number_of_names
FROM `bigquery-public-data.usa_names.usa_1910_current` 
GROUP BY name
ORDER BY number_of_names desc;

We then have our list of names and the number of times they appear.

query results for SELECT DISTINCT name AS Name,
sum (number ) AS number_of_names
FROM `bigquery-public-data.usa_names.usa_1910_current` 
GROUP BY name
ORDER BY number_of_names desc;

Top 10 Girls Names in 1957

Let’s say we want to see the ten most popular names for girls in a given year. Let’s say 1957.

So we know we have the Gender field which is described as “Sex (M=male or F=female)”. We also have the Year field which is a “4-digit year of birth”. We can add these parameters to our query by using the SQL WHERE Clause.

SELECT DISTINCT name AS Name,
sum (number ) AS number_of_names
FROM `bigquery-public-data.usa_names.usa_1910_current` 
WHERE year=1957 
AND gender="F"
GROUP BY name
ORDER BY number_of_names desc;

We can see that Mary, Susan and Linda were the top three names for girls in 1957.

query results for SELECT DISTINCT name AS Name,
sum (number ) AS number_of_names
FROM `bigquery-public-data.usa_names.usa_1910_current` 
WHERE year=1957 
AND gender="F"
GROUP BY name
ORDER BY number_of_names desc;

Sending custom BigQuery Data to Google Sheets

So let’s say we want to do some data analysis on some of our BigQuery data in Google Sheets.

Let’s also say that we are only interested in the top 100 names from Kentucky from 1987 until 1994.

SELECT DISTINCT name AS Name,
sum (number ) AS number_of_names
FROM `bigquery-public-data.usa_names.usa_1910_current` 
WHERE year>1987 
AND year<1994
AND state="KY"
GROUP BY name
ORDER BY number_of_names desc
LIMIT 100;

And we have our results.

Top 100 baby names in the state of Kentucky. SQL Queries in BigQuery.

Click on SAVE RESULTS and then choose the option to save as a Google Sheet.

Saving SQL Queries in BigQuery as a Google Sheet

You’ll then see a notification in the bottom left corner of your screen confirming that the data has been saved to a Google Sheet.

Open the new sheet

You can click on ‘Open’ to view the Google Sheet.

Open the results page

We have our BigQuery data in our Sheet, and can analysis and manipulate it in any way we want. For example, we could create a vaguely psychedelic pie chart as shown below.

A pie chart in Google Sheets

I hope you enjoyed reading this post about SQL Queries in BigQuery and learned something new!

My Other Blog Posts

I hope you enjoyed this article. If you did, you might enjoy some of my previous blog posts!

I’m Michael


I’m a South African digital analyst living in Toronto, Canada. I blog about digital marketing, web analytics and data visualization.

Get in touch with me!
LinkedIn
Twitter
Email