How to save historical Google Analytics data using Big Query

Apr 06, 2023

6 Minutes Read

By Shubhangi Chauhan

GA4

Universal Analytics was originally built on web tracking and third-party cookies. In today's digital world, however, users discover and purchase products from mobile apps as well as websites, and they expect stronger privacy protections and greater control over their data. Google therefore introduced GA4 as a new measurement model for the post-cookie era.

According to Google's latest announcement, Universal Analytics will stop processing new data on July 1, 2023. After that date, your Universal Analytics properties will no longer collect data.

If you haven't already, it's time to start using GA4. Setting up your property early lets you accumulate more historical and user data in GA4.

BigQuery, the tool this guide uses to preserve that data, is built on Google's infrastructure and designed to handle large amounts of data, processing big queries quickly even on massive datasets. It offers a pay-as-you-go pricing model, which means you only pay for the queries you run and the amount of data you store.

Why is it important to save historical data?

Saving historical data is important because it allows you to track changes over time and make more informed decisions based on past performance.

Here are some specific reasons why saving historical data is important:

  1. Trend analysis: By analyzing historical data, you can identify trends and patterns that can help you make better predictions and plan for the future.
  2. Performance tracking: By tracking historical data, you can measure your progress over time and identify areas where you may need to make improvements.
  3. Long-term planning: Historical data can provide insights that are useful for long-term planning and strategy development.
  4. Compliance and auditing: In many industries, it is a legal requirement to retain historical data for a certain period of time for compliance and auditing purposes.
  5. Business intelligence: Historical data can be used to gain business intelligence and competitive insights, such as customer behaviour and market trends.

Overall, saving historical data is essential for understanding how your business has performed in the past, identifying areas for improvement, and making informed decisions about the future.

Ways to Export the Universal Analytics Data

Data from Universal Analytics can be saved in several ways and later compared with GA4 in reports.

  1. Through Excel Sheets: The cheapest and simplest method for saving Universal Analytics data is to export it to Google Sheets or Excel Spreadsheets. There are some limitations to be aware of, though.
    Pros:

Familiarity - We work in marketing; spreadsheets are second nature.

Simple exports - The majority of UA reports can be saved as Google Sheets or XLSX files.

Cost - Using your preferred spreadsheet programme will probably cost you nothing extra.

Cons:

Scale - Larger, more popular sites may need multiple sheets, depending on how detailed their data requirements are.

Versatility - Individual files can be emailed or shared, but working with spreadsheets across bigger teams has drawbacks.

Manual reload - Unless you automate a download or export, you have to manually export your UA reports from the web UI each time you need to refresh the data.

  2. Export to a dedicated database: The largest businesses may be able to store UA data in their existing data repositories or warehouses. Many businesses with these specialised solutions already have the expertise needed to combine marketing analytics data with data from CRM and ERP systems.
    Pros:
    Combine your Universal Analytics data with data from other sources to make smarter marketing decisions.

Cons:

Dedicated data storage solutions require additional spending.

  3. Transfer data from Universal Analytics to the cloud: If you are more technically savvy and comfortable with APIs, a cloud-based storage solution may be your best option. Database integrations let you extract, transform, and load (ETL) sizeable data sets for later analysis.

For most Google customers, BigQuery is the preferred storage option, and it has clear benefits when integrating with other Google products such as Google Ads, Data Studio, and Sheets. Of course, you can use Amazon or Microsoft cloud products as well.

Pros:

Once you've established the proper API connections, it's simple to ingest big data sets.

The Google Analytics Reporting API gives you the freedom to select the precise data sets you need for your reports.

The cost of cloud storage is reasonable.
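As a concrete sketch of this API-based approach, the snippet below builds a Reporting API v4 request for daily sessions from a Universal Analytics view. The view ID and date range are placeholders, and the actual API call (shown in comments) requires a service-account credential with read access to the property.

```python
# Sketch: pulling Universal Analytics data via the Reporting API v4.
# The view ID "123456" and the dates below are hypothetical placeholders.

def build_ua_report_request(view_id, start_date, end_date):
    """Build a Reporting API v4 request body for daily sessions."""
    return {
        "reportRequests": [{
            "viewId": view_id,
            "dateRanges": [{"startDate": start_date, "endDate": end_date}],
            "metrics": [{"expression": "ga:sessions"}],
            "dimensions": [{"name": "ga:date"}],
            "pageSize": 10000,
        }]
    }

body = build_ua_report_request("123456", "2022-01-01", "2022-12-31")

# With a service-account credential in place, the call would look like:
# from googleapiclient.discovery import build
# analytics = build("analyticsreporting", "v4", credentials=creds)
# response = analytics.reports().batchGet(body=body).execute()
```

From there, the rows in the response can be written out as CSV or loaded straight into BigQuery.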

Why Big Query over other alternatives?

BigQuery is a cloud-based data warehouse provided by Google Cloud Platform that allows users to store and analyse large volumes of data quickly and easily. It is a highly scalable and cost-effective solution for storing and processing data, and it can handle both structured and unstructured data.

There are several reasons why you might choose to use BigQuery to save historical data over other alternatives:

  1. Scalability: BigQuery is highly scalable and can handle large volumes of data with ease, making it an ideal solution for storing historical data.
  2. Querying Speed: BigQuery uses a distributed architecture that allows it to process queries quickly, even on large datasets.
  3. Cost-effectiveness: BigQuery offers a pay-as-you-go pricing model, which means you only pay for the data you store and the queries you run. This can make it a more cost-effective solution compared to other data warehousing options.
  4. Integration with Google Analytics: BigQuery can integrate directly with Google Analytics, allowing you to easily export and analyse historical data from your GA account.
  5. SQL support: BigQuery supports SQL, which means that you can use familiar SQL syntax to analyse your data, making it accessible to a wide range of users.

Overall, BigQuery can be a powerful tool for storing and analysing historical data, especially when combined with other Google Cloud Platform services like Google Analytics. Its scalability, speed, and cost-effectiveness make it a popular choice for businesses of all sizes.

How to plan the data transfer in BigQuery using SQL queries

When planning data transfer in BigQuery, there are several factors to consider, such as the volume and frequency of the data transfers, the data format, the data source and destination, and the performance requirements.


Here are some steps you can follow to plan data transfer in BigQuery:

  1. Choose the appropriate data transfer method: BigQuery supports several methods for data transfer, including streaming inserts, batch loads, and data transfer services such as Cloud Storage transfer service and third-party data connectors. Each method has its own advantages and limitations, so choose the one that best fits your use case.
  2. Determine the data format: BigQuery supports several data formats, including CSV, JSON, Avro, and Parquet. Choose the format that is most appropriate for your data source and the downstream applications that will consume the data.
  3. Set up the data source: Configure the data source (for example, the Analytics view ID) to export data in the chosen format, and schedule the transfers to run at appropriate intervals. Make sure the data source has the necessary permissions to export data to BigQuery.
  4. Configure the destination: Create a BigQuery dataset and table to store the transferred data. Choose appropriate partitioning and clustering options to optimise query performance.
  5. Define data transformation and processing requirements: Depending on the use case, you may need to perform data transformations and processing on the transferred data before storing it in BigQuery. Determine the tools and services needed for this step and incorporate them into your data transfer plan.
  6. Monitor and optimize performance: Monitor the data transfer process and optimize performance as needed. Use BigQuery's monitoring and troubleshooting tools to identify and resolve issues.
  7. Set up security and access control: Define the appropriate security and access control measures to protect the transferred data and ensure compliance with regulatory requirements.

By following these steps, you can plan a data transfer process that meets your performance, security, and compliance requirements, while also optimising cost and efficiency.
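The destination-and-load steps above can be sketched with the BigQuery Python client. The project, dataset, and table names here are hypothetical, following the date-sharded naming style of the GA export; the load call itself (shown in comments) needs the google-cloud-bigquery package and valid credentials.

```python
# Sketch: naming a dated destination table and batch-loading a CSV into it.
# All project/dataset/table names below are made-up examples.
from datetime import date

def shard_table_id(project, dataset, base, day):
    """Date-sharded table ID in the GA export style, e.g. ua_sessions_20230401."""
    return f"{project}.{dataset}.{base}_{day.strftime('%Y%m%d')}"

table_id = shard_table_id("my_project", "my_dataset", "ua_sessions", date(2023, 4, 1))

# With credentials configured, a CSV batch load might look like:
# from google.cloud import bigquery
# client = bigquery.Client()
# job_config = bigquery.LoadJobConfig(
#     source_format=bigquery.SourceFormat.CSV,
#     skip_leading_rows=1,
#     autodetect=True,
# )
# with open("ua_export.csv", "rb") as f:
#     client.load_table_from_file(f, table_id, job_config=job_config).result()
```

Date-sharded (or date-partitioned) tables keep each day's export separate, which simplifies both incremental reloads and time-bounded queries later.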

How to formulate queries in BigQuery

To formulate queries in BigQuery, you can use SQL-like syntax to select, filter, aggregate, and manipulate data in your datasets. Here are some basic steps to follow when formulating queries in BigQuery:

  1. Open the BigQuery console: Open the BigQuery console in your web browser and sign in to your account.
  2. Choose the appropriate dataset: Select the dataset that contains the data you want to query from the navigation pane on the left side of the console.
  3. Create a new query: Click on the "Compose Query" button to open a new query window.
  4. Write the query: In the query editor window, write your SQL-like query using the appropriate syntax. You can use the SELECT statement to choose the columns you want to display, the FROM statement to specify the dataset and table you want to query, the WHERE statement to filter the rows based on specific conditions, and the GROUP BY and ORDER BY statements to group and sort the data, respectively.
  5. Test the query: Click on the "Run" button to test the query and see the results. You can also use the "Preview" button to preview a small sample of the data before running the full query.
  6. Refine the query: Refine the query as needed by modifying the SQL syntax, adjusting the filters and conditions, or adding additional clauses such as JOIN, UNION, or subqueries.
  7. Save and share the query: When you are satisfied with the query, you can save it for future use or share it with others in your organisation by clicking on the "Save" button and choosing an appropriate name and description.

By following these steps, you can formulate and execute queries in BigQuery to extract insights from your data and answer important business questions.

Below is an example of a simple query in BigQuery that calculates the total revenue for each product category in a hypothetical e-commerce dataset:

SELECT
  product_category,
  SUM(price * quantity) AS total_revenue
FROM
  `my_project.my_dataset.my_table`
GROUP BY
  product_category
ORDER BY
  total_revenue DESC

In this example, the query selects the "product_category" and calculates the total revenue for each category by multiplying the "price" and "quantity" fields together and summing the result. It then groups the results by "product_category" and orders them in descending order based on the total revenue. The query assumes that the e-commerce data is stored in a table named "my_table" in a dataset named "my_dataset" in the "my_project" project.

This query could be used by a business to understand which product categories are generating the most revenue, helping them to make decisions around inventory, marketing, and product development.
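To make the aggregation concrete, here is the same total-revenue roll-up computed in plain Python over a few made-up sample rows:

```python
# Pure-Python illustration of what the query computes: total revenue per
# product category, sorted in descending order. The rows are sample data.
from collections import defaultdict

rows = [
    {"product_category": "Shoes", "price": 50.0, "quantity": 2},
    {"product_category": "Hats", "price": 20.0, "quantity": 1},
    {"product_category": "Shoes", "price": 80.0, "quantity": 1},
]

revenue = defaultdict(float)
for r in rows:
    revenue[r["product_category"]] += r["price"] * r["quantity"]

ranked = sorted(revenue.items(), key=lambda kv: kv[1], reverse=True)
# Shoes total 50*2 + 80*1 = 180.0, so they rank above Hats at 20.0
```

The GROUP BY clause plays the role of the dictionary keyed by category, and ORDER BY ... DESC corresponds to the final descending sort.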

Conclusion:

BigQuery offers a scalable, cost-effective way to store and process both structured and unstructured data. With its SQL support, its integrations with other Google Cloud Platform services, and its machine learning capabilities, it is a powerful and versatile solution for businesses of all sizes looking to collect and analyse large volumes of data.

By leveraging the benefits of BigQuery, businesses can gain valuable insights into their operations, identify areas for improvement, and make data-driven decisions that can help drive growth and success.