Universal Analytics Data Extraction - What are your options?

Mark Lee & Charlie Billingham 

With only a few months to go until Google presses the big red button on Universal Analytics and removes all data forever, marketers need to act quickly to preserve any historic data they may still need.

As a brief reminder of the timelines, standard Universal Analytics properties stopped collecting hits on 1st July 2023. For 360 customers, this deadline was extended to 1st July 2024. However, the shut-down of individual features is already well underway, and we recommend that you complete your migration to GA4 by March. This is because a number of key features are being deprecated in early March (a full list of which you can see here), each of which will make preserving your historic data more difficult.

 

So what are the options for migrating data out of Universal Analytics?

BigQuery (360 Only)

The key benefit of using BigQuery is that it provides 360 customers with the richest unsampled dataset. This includes the hit-level data that is only accessible via this method, enabling marketers and analysts to conduct granular analysis using SQL.
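For a flavour of what that hit-level access looks like in practice, below is a minimal sketch using the google-cloud-bigquery Python client. The project and dataset names are placeholders for your own 360 export, which writes one ga_sessions_YYYYMMDD table per day.

```python
# Minimal sketch: hit-level pageview counts from a UA 360 BigQuery export.
# Assumes google-cloud-bigquery is installed and authenticated; the project
# and dataset names below are placeholders for your own export.
from google.cloud import bigquery

client = bigquery.Client(project="your-gcp-project")  # placeholder project

query = """
    SELECT
      date,
      hits.page.pagePath AS page_path,
      COUNT(*) AS pageviews
    FROM
      `your-gcp-project.your_ua_dataset.ga_sessions_*`,
      UNNEST(hits) AS hits
    WHERE
      _TABLE_SUFFIX BETWEEN '20230101' AND '20230630'  -- one table per day
      AND hits.type = 'PAGE'
    GROUP BY date, page_path
    ORDER BY pageviews DESC
    LIMIT 100
"""

for row in client.query(query).result():
    print(row.date, row.page_path, row.pageviews)
```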

The drawback to this approach, however, is that the linkage will only backfill the smaller of 13 months or 10 billion hits of historic data. This means that if you have been using Universal Analytics for more than a year and have never linked UA to BigQuery, any data older than 13 months is still going to be lost from 1st July. The other drawback is that this feature will not remain available right up to 1st July: both the real-time and daily exports to BigQuery stop in early March, meaning any data you wish to capture between March and July will also be lost.

 

Analytics Reporting API 

Where BigQuery is limited to a maximum of 13 months of data, the remaining approaches let you extract data over any timeframe you specify. The downside is that none of them is as simple as setting up BigQuery with a few clicks.

Utilising the API could result in data being sampled as it is extracted, reducing the accuracy of the metrics. Further, you cannot access hit-level information and are still bound by restrictions on which dimensions can be combined within a single query. This means you will likely be unable to build one API call that extracts all of the data meaningful to your business, leading to data silos that may be impossible to join together later.

The main advantage of the API is that for extracting aggregate reporting (or even some types of granular reporting) it offers the greatest flexibility. That flexibility covers reducing sampling (by shortening the timeframe or trimming the number of dimensions and metrics in each query) as well as where the data ends up once extracted: JSON, CSV, a cloud provider or any other destination. With the API you can choose exactly where and how your data is stored.
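As an illustration, the sketch below pulls a daily sessions report via the Reporting API v4 using the google-api-python-client library. The service-account key file and view ID are placeholders you would swap for your own, and the service account needs read access to the UA view.

```python
# Hedged sketch: extracting a daily report with the Analytics Reporting API v4.
from google.oauth2 import service_account
from googleapiclient.discovery import build

credentials = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder key file
    scopes=["https://www.googleapis.com/auth/analytics.readonly"],
)
analytics = build("analyticsreporting", "v4", credentials=credentials)

response = analytics.reports().batchGet(
    body={
        "reportRequests": [{
            "viewId": "123456789",  # placeholder UA view ID
            "dateRanges": [{"startDate": "2022-07-01", "endDate": "2023-06-30"}],
            "metrics": [{"expression": "ga:sessions"}, {"expression": "ga:users"}],
            "dimensions": [{"name": "ga:date"}, {"name": "ga:sourceMedium"}],
            "samplingLevel": "LARGE",  # request the most precise sampling tier
            "pageSize": 10000,
        }]
    }
).execute()

report = response["reports"][0]
# samplesReadCounts only appears in the response when it is sampled, so its
# absence is a quick check that the extract is unsampled.
print("Sampled:", "samplesReadCounts" in report["data"])
for row in report["data"].get("rows", []):
    print(row["dimensions"], row["metrics"][0]["values"])
```

If a response does come back sampled, the usual workaround is to split the date range into smaller chunks and repeat the request for each.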

The Merkle Engineering Team is currently developing an automated Python solution that extracts data from UA properties for a specified period of time and writes it back to a nominated BigQuery dataset as a table, so clients can store their data over a longer timeframe. Testing of this bespoke solution is underway, and it will likely be made more widely available to our broader client base over the coming weeks, so please reach out to your assigned Merkle contact if this is something you might be interested in leveraging.
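To give a sense of the general extract-and-load pattern such a solution follows (an illustrative sketch only, not Merkle's actual implementation), the snippet below loads rows parsed from a Reporting API response into a BigQuery table; every project, dataset and table name is a placeholder.

```python
# Illustrative sketch of the load step, assuming pandas and
# google-cloud-bigquery (with pyarrow) are installed. The `rows` list stands
# in for output parsed from a Reporting API response like the one above.
import pandas as pd
from google.cloud import bigquery

rows = [
    {"date": "20230101", "source_medium": "google / organic", "sessions": 1200},
    {"date": "20230102", "source_medium": "google / organic", "sessions": 1350},
]  # placeholder rows

df = pd.DataFrame(rows)

client = bigquery.Client(project="your-gcp-project")  # placeholder project
table_id = "your-gcp-project.ua_archive.sessions_by_source"  # placeholder table

# WRITE_APPEND lets repeated extraction runs accumulate into a single table.
job = client.load_table_from_dataframe(
    df,
    table_id,
    job_config=bigquery.LoadJobConfig(write_disposition="WRITE_APPEND"),
)
job.result()  # wait for the load job to finish
```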

 

Sheets Extension

A quick and simple method, the Sheets extension offers users an easy way to extract data from UA. Marketers and analysts can set up multiple reports and extract them faster than with any of the other methods mentioned so far; if all you need is top-line aggregate data, this could be the best approach for you.

The downside is that it is limited to just 5,000 rows per report. This means that unless you run reports across multiple tabs, extracting any kind of granular data over a long timeframe is simply not feasible. Additionally, this option is the most likely to result in sampling, which could reduce the accuracy of any insights gleaned.

 

Report Download

Possibly the quickest and easiest method of those available, downloading reports directly from Universal Analytics (whether Custom Reports, Unsampled Reports or Standard Reports) gives users a simple way to export the data they use for reporting. An advantage of this method is that even when using Standard or Custom reports, you can see clearly whether there is any sampling in the data before you download it.

The negative, on the other hand, is that if you frequently run into sampling and are not a 360 customer, the only way to reduce it is to shorten the report's timeframe or cut back the already limited number of dimensions (a maximum of five per report). This can mean manually downloading a large number of reports, which quickly becomes time-consuming.

 

Conclusion

If you are a 360 customer, we recommend linking BigQuery while you still can, to capture the richest year-on-year dataset Universal Analytics can offer you. In addition, the unsampled reports feature can help fill in the gaps at a top level for any data that BigQuery is unable to export.

For standard customers, our recommendation depends on your situation. If all you need is high-level summaries, either the Sheets extension or downloaded custom reports may prove suitable. If you require a more complete dataset, however, we recommend using the API to extract the data you need.

If you're wondering what the best approach to your historic Universal Analytics data is and how you should archive it, or have any analytics questions in general, please reach out to your dedicated Merkle support team, or contact us here, and we'll be happy to walk through the options that suit you best in more detail.