Using the Extract Data connector in Google Data Studio
In this post I’ll look at the Extract Data connector in Google Data Studio.
After posting a poll on Twitter, it seems not too many people actually use the extract data function.
I’ll show you the steps to follow to extract the data, as well as how it can impact the speed of your reports.
Using the Extract Data connector
Create a new Data Studio report and choose a new data source. We will choose Extract Data as our data source.
You’ll then be presented with ‘Available Data Sources’ that are available for to extract data from, as shown below.
Once you’ve selected your data source, you will need to choose what Dimensions and Metrics you want. I’ve used a Google Analytics source to demonstrate how to extract data. You can see all the different dimensions and metrics in list of ‘Available Fields’ on the right.
We’ll keep things simple and only choose Date and Default Channel Grouping as our dimensions. For metrics, I’ll just pick Sessions and Users. For the date range I’ll choose the last 30 days.
You can see on the right side how many different dimensions and metrics there are in a normal Google Analytics data source. Note that we are leaving all of them behind and only picking the fields we will be using in our report.
We know have our two dimensions and two metrics that we will use in our dashboard. We can click ‘Save and Extract’ to continue.
You’ll notice that there is an option for it to Auto Update. I’ve set it to repeat Daily, so that the data will stay fresh and updated.
We can make any final changes to the fields in our data source before adding to the report.
Since I only chose to extract data from the last 30 days, if I adjust the date settings it shows no data from more than 30 days prior.
So this is good for a monthly report, but bad if we want to be able to play around and explore the data.
Testing out the Extracted Data vs regular Data
So we want to see if using Extracted Data helped speed up our dashboard. Reports being ‘slow’ can be a common complaint with others view Data Studio reports.
I’ve created created two scorecards, a time series chart and a table (with filtering) showing off the previous month’s data. The filters on the table affected both sets of data.
Below you can see that when I click ‘Organic Search’ on the Extracted Data table, the Extracted data time series and scorecards are filtered quicker than the normal Google Analytics data.
If I click ‘Direct’ on the regular data table, the Extracted data time series, scorecards and table also filter sooner than the regular data.
This gives an idea of what using extracted data can do. Sometimes speed can be an issue with reports, particularly if you are sharing them with colleagues who don’t normally use Data Studio.
For example, if you are sharing a report with your manager or CEO you want the data to be quickly accessible. If they have to wait for charts and tables to load, they could get frustrated. Therefore using Extracted Data could be a way to ensure your report loads correctly and is fast and seamless to sue.
Below is the actual Data Studio report. You can play around and see if you notice differences in speed between the Extracted data and regular data.
Difference in number of Users
You’ll notice the number of Users are different between the Extracted Data and the regular Google Analytics data.
For example, the number of Users in the extracted data is 6820.
The number of number of Users in the regular data is 5804. Almost one thousand fewer Users.
Why is this the case?
According to Bounteous;
“Contrary to popular belief, the Users metric in standard Google Analytics views doesn’t really represent individual users and people. Rather, this number is based on a cookie that is set by the user’s browser. That means if you access the website from a different browser or device, you might be counted as multiple users.”
Therefore the difference in Users numbers is likely due to the time frame specified by the extracted data and the full time frame allowed in the regular Google Analytics data.
I hope you enjoyed this post, and know a bit more about using the Extract Data connector in Google Data Studio.
My Other Blog Posts
If you liked this article you might like my other blog posts.
Hi! Thank you for your blog post. I tried it out today and for some reason it didn’t work and showed “no data” (I kept an eye on dates!) Anyway, the speed was why I wanted to extract data. I’ll try again next week 😉
Hi again. I played around with data extracting, and I got… garbage. I searched the topic again and I found this article: https://www.tribalism.com.au/2020/06/17/google-analytics-data-studio-data-extract-discrepancies.html
That’s exactly what happened in my case: data was not consistent and some of metrics don’t work at all (bounce rate, to name one of them). Apparently data extraction is good only for some cases, and only for some metrics. Well… Disappointing.
Hey Sorry to hear that! Hope you figure it out!
Hi, I’ve my own connector that serves data from csv files with a ClientId parameter, so when a client, for example, ClientId 10 sees a report, it loads automatically a csv with only ClientId 10 data (every ClientId has his own csv, and the connector resolves what csv has to load using a parameter).
How does the Extract data connector works in this case? I mean, when I update my data, does this connector gets data from what csv file? Every client? Or how do I have to configure the connector so when a ClientId sees a report, he visualices his own data and not another ClientId.
Thanks
Hey, this is great. Could you help me with an issue I found with data sources created using the Extract Data? The Extracted data source cannot work with different date ranges as the user desires by using the date range controller in any dashboard. Charts using the extracted data source don’t change the data when selecting different date ranges; the metrics remain in the total value of the date range set when the data source was created. I may be missing something. Do you have any example where the extract data change according to different date ranges, weekly, quarterly, or annually? Also, checking your demo on this page, I found the same issue. Thanks
Did you confirm that the chart was now connected to the new extracted data source, and that the date field in the new extracted data source was placed in the “Date Range Dimension” field?
When comparing the users data across original ga4 connector and extract data source there is a discrepancy why?
Here when we are extracting the data then it was calculating the sum of those users .
Is there any way to get the extracted users data same as the Original GA4 connector data
Sorry I’m not 100% sure of the reason you’d need to consult the GA4 documentation.