Aakash Yadav

Oct 22, 2024 • 2 min read

Write better log queries, faster with pipe syntax in BigQuery and Cloud Logging

Log data is a gold mine for understanding how your application works. It can help you optimize performance, strengthen security, and improve user experience. But the sheer volume and complexity of logs generated by modern applications can be overwhelming. How can you dig out the meaningful stuff from all that data?

BigQuery recently announced a series of innovations designed to revolutionize the way you manage, analyze, and derive value from your logs.

BigQuery pipe syntax: SQL for log data

BigQuery pipe syntax is a new approach to SQL that is specifically designed for the semi-structured nature of log data. It introduces an intuitive, top-down syntax that mirrors how you naturally approach data transformations. By visually separating different stages of a query with the pipe symbol (|>), it becomes remarkably easy to understand the logical flow of data transformation. Each step is clear, concise, and self-contained, making your queries more approachable for both you and your team.

BigQuery's pipe syntax isn't just about cleaner SQL—it's about unlocking a more intuitive and efficient way to work with your data. Instead of wrestling with code, you can experience faster insights, improved collaboration, and more time spent extracting value from your data.

This streamlined approach is especially powerful when it comes to log analysis, which is rarely a straight line from question to answer. Analyzing logs often means sifting through mountains of data to find specific events or patterns.

You explore, you discover, and you refine your approach as you go. Pipe syntax embraces this iterative approach. You can smoothly chain together filters (WHERE), aggregations (COUNT), and sorting (ORDER BY) to extract those golden insights. You can also add or remove steps in your data processing as you uncover new insights, easily adjusting your analysis on the fly.
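As a sketch of that chaining, a query that filters for server errors, aggregates counts per endpoint, and sorts the result might look like the following (the `request_logs` table and its `status` and `endpoint` columns are hypothetical, shown only to illustrate the flow):

```sql
-- Hypothetical table and columns, for illustration only
FROM request_logs
|> WHERE status >= 500                                  -- filter: server errors only
|> AGGREGATE COUNT(*) AS error_count GROUP BY endpoint  -- aggregate per endpoint
|> ORDER BY error_count DESC                            -- sort: worst offenders first
```

Each `|>` stage can be added, removed, or reordered independently as your investigation evolves, without restructuring the rest of the query.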

Imagine you want to count the total number of users who were affected by the same errors more than 100 times in the month of January. As shown below, the pipe syntax’s linear structure clearly shows the data flowing through each transformation: starting from the table, filtering by the dates, counting by user id and error type, filtering for errors >100, and finally counting the number of users affected by the same errors.

-- Pipe Syntax 
FROM log_table 
|> WHERE datetime BETWEEN DATETIME '2024-01-01' AND DATETIME '2024-01-31'
|> AGGREGATE COUNT(log_id) AS error_count GROUP BY user_id, error_type
|> WHERE error_count>100
|> AGGREGATE COUNT(user_id) AS user_count GROUP BY error_type

The same example in the standard syntax typically requires a subquery and a non-linear structure.

-- Standard Syntax 
SELECT error_type, COUNT(user_id) AS user_count
FROM (
   SELECT user_id, error_type, 
   COUNT(log_id) AS error_count 
   FROM log_table 
   WHERE datetime BETWEEN DATETIME '2024-01-01' AND DATETIME '2024-01-31'
   GROUP BY user_id, error_type
)
WHERE error_count > 100
GROUP BY error_type;
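A slightly more compact standard-SQL variant (a sketch, assuming the same `log_table` schema) pushes the count filter into the subquery with HAVING, though it still reads inside-out compared to the pipe version:

```sql
SELECT error_type, COUNT(user_id) AS user_count
FROM (
  SELECT user_id, error_type
  FROM log_table
  WHERE datetime BETWEEN DATETIME '2024-01-01' AND DATETIME '2024-01-31'
  GROUP BY user_id, error_type
  HAVING COUNT(log_id) > 100  -- filter groups directly, no intermediate alias
)
GROUP BY error_type;
```

Either way, the reader has to start from the innermost subquery and work outward, whereas the pipe version reads top to bottom in the order the data actually flows.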
