Code360 powered by Coding Ninjas X Naukri.com
Last Updated: Mar 27, 2024
Difficulty: Easy

Logging Query Language


Introduction

When you create a log bucket and enable Log Analytics on it, Cloud Logging makes the log data available in the Log Analytics interface. With Log Analytics enabled, you don't have to route a separate copy of the data to BigQuery and keep it there. Log Analytics lets you analyze log data in the same way that BigQuery does, and you can still query and analyze the data with the Logging features you're used to.

If you want to combine your log data with other data in BigQuery, you can configure a view of the Log Analytics-enabled bucket directly in BigQuery. A query that works in Log Analytics works in BigQuery as well.

Let us learn how the data is queried using the logging query language.

Logging query language

You can use the Logging query language in the Logs Explorer of the Google Cloud console, in the Logging API, and in the command-line interface. The language is used both to query log data and to write the filters that define sinks and log-based metrics.

A query is a Boolean expression that specifies a subset of all the log entries in the Google Cloud resource you have chosen, such as a Cloud project or folder.

You can build queries on indexed LogEntry fields using the logical operators AND and OR. The following examples use the resource.type field to show what the Logging query language syntax looks like:

  • Simple restriction: resource.type = "gae_app"
  • Conjunctive restriction: resource.type = "gae_app" AND severity = ERROR
  • Disjunctive restriction: resource.type = "gae_app" OR resource.type = "gce_instance"
  • Alternatively: resource.type = ("gae_app" OR "gce_instance")
  • Complex conjunctive/disjunctive expression: resource.type = "gae_app" AND (severity = ERROR OR "error")
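The restrictions above are plain strings, so they can be assembled in code before you paste them into the Logs Explorer or pass them to the API or CLI. The following Python sketch shows one possible way to do that. The helper names restriction, any_of, and all_of are hypothetical and are not part of any Google library, and the sketch assumes the values contain no double quotation marks.

```python
def restriction(field, value, op="="):
    """Build a single comparison such as: resource.type = "gae_app"."""
    return f'{field} {op} "{value}"'


def any_of(field, values, op="="):
    """Build the parenthesized OR form: field = ("a" OR "b")."""
    quoted = " OR ".join(f'"{v}"' for v in values)
    return f"{field} {op} ({quoted})"


def all_of(*parts):
    """Join complete comparisons with an explicit AND."""
    return " AND ".join(parts)


simple = restriction("resource.type", "gae_app")
query = all_of(
    any_of("resource.type", ["gae_app", "gce_instance"]),
    "severity = ERROR",
)
print(simple)
print(query)
```

Keeping the OR alternatives inside one parenthesized value avoids mixing AND and OR at the same nesting level.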

Syntax notation

The following sections offer an introduction to the Logging query language syntax and go into depth about how queries are built and how matching is accomplished. Some of the examples include explanatory text in the form of comments.

Take note of the following:

  • A query cannot be longer than 20,000 characters.
  • The Logging query language is case-insensitive, except for regular expressions and the Boolean operators AND, OR, and NOT, which must be capitalized.

Syntax summary

The syntax of the Logging query language may be conceived of in terms of queries and comparisons.

A query is a string that contains the following expression:

expression = ["NOT"] comparison { ("AND" | "OR") ["NOT"] comparison }

A comparison can be expressed as a single value or as a Boolean expression:

"The cat in the hat"

resource.type = "gae_app"

The first line is an example of a single-value comparison. Comparisons of this kind are global restrictions. Each field of the log entry is implicitly compared to the value using the has operator. In this case, the comparison succeeds if any field in a LogEntry, or in its payload, contains the phrase "The cat in the hat".

The second line is an example of a Boolean expression comparison of the form [FIELD_NAME] [OP] [VALUE]. The elements of the comparison are as follows:

  • [FIELD_NAME]: a field in a log entry. Example, resource.type.
  • [OP]: comparison operator. Example, =.
  • [VALUE]: a number, string, function, or parenthesized expression. For example, "gae_app". For JSON null values, use NULL_VALUE.

Boolean operators

AND and OR are short-circuit operators. NOT has the highest precedence, followed by OR and then AND, in that order. Note that this differs from most programming languages, where AND binds more tightly than OR. For example, the following two expressions are equivalent:

a OR NOT b AND NOT c OR d

(a OR (NOT b)) AND ((NOT c) OR d)
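To check the equivalence above, here is a toy Python evaluator that applies the stated precedence: NOT first, then OR, then AND. It is only a sketch for reasoning about precedence, not Cloud Logging's parser. The function name evaluate and the single-letter variables are illustrative assumptions.

```python
from itertools import product


def evaluate(expr, env):
    """Evaluate a toy Boolean expression with precedence NOT > OR > AND."""
    tokens = expr.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_and():  # lowest precedence
        value = parse_or()
        while peek() == "AND":
            take()
            rhs = parse_or()  # parse first so the tokens are always consumed
            value = value and rhs
        return value

    def parse_or():
        value = parse_not()
        while peek() == "OR":
            take()
            rhs = parse_not()
            value = value or rhs
        return value

    def parse_not():  # highest precedence
        if peek() == "NOT":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            value = parse_and()
            take()  # consume the closing ")"
            return value
        return env[take()]

    return parse_and()


flat = "a OR NOT b AND NOT c OR d"
nested = "(a OR (NOT b)) AND ((NOT c) OR d)"
same = all(
    evaluate(flat, dict(a=a, b=b, c=c, d=d))
    == evaluate(nested, dict(a=a, b=b, c=c, d=d))
    for a, b, c, d in product([False, True], repeat=4)
)
print(same)
```

With a, b, and c true and d false, both forms evaluate to false, whereas a language in which AND binds more tightly than OR would evaluate the flat form to true.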

You can omit the AND operator between comparisons. You can also write the - (minus) operator in place of NOT. For instance, the following two queries are identical:

a=b AND c=d AND NOT e=f

a=b c=d -e=f

This article always writes AND and NOT explicitly.

You can use the AND, OR, and NOT operators in all filters except those used by log views. Log views support only the AND and NOT operations.

To combine AND and OR rules in the same expression, nest the rules with parentheses. Without parentheses, your query may not behave as you intended.

Boolean operators must always be capitalized. Lowercase and, or, and not are parsed as search terms.


Comparisons

Comparisons are structured as follows:

[FIELD_NAME] [OP] [VALUE]

The elements of the comparison are as follows:

[FIELD_NAME]: is the path name of a log entry field. Examples of field names include:

  • resource.type
  • resource.labels.zone
  • resource.labels.project_id
  • insertId
  • jsonPayload.httpRequest.protocol
  • labels."compute.googleapis.com/resource_id"


If a component of the path name contains special characters, it must be double-quoted. For example, compute.googleapis.com/resource_id must be double-quoted because it contains a forward slash (/).

[OP]: is one of the following comparison operators.

  • =           -- equal
  • !=          -- not equal
  • > < >= <=   -- numeric ordering
  • :           -- "has" matches any substring in the log entry field
  • =~          -- regular expression search for a pattern
  • !~          -- regular expression search for text that does not match a pattern


[VALUE]: can be a number, a string, a function, or a parenthesized expression. Strings can represent arbitrary text as well as Boolean, enumeration, and byte-string values. Before the comparison, [VALUE] is converted to the type of the field. Use NULL_VALUE for JSON null values.

Use the following syntax to search for a JSON null value:

  • jsonPayload.field = NULL_VALUE      -- includes "field" with null value
  • NOT jsonPayload.field = NULL_VALUE  -- excludes "field" with null value

If [VALUE] is a parenthesized Boolean combination of comparisons, the field name and comparison operator are applied to each element. For example:

  • jsonPayload.cat = ("longhair" OR "shorthair")
  • jsonPayload.animal : ("lovely" AND "pet")


The first comparison determines if the field cat is "longhair" or "shorthair." The second verifies that the field animal's value contains both the words "lovely" and "pet," in any order.

Field path identifiers

Log entries are all instances of the LogEntry type. The identifier that is (or begins) the left-hand side of the comparison must be a LogEntry type field.

The following is the current list of log entry fields. Where applicable, each field is followed by the next level of names for that field:

  • httpRequest { cacheFillBytes, cacheHit, cacheLookup, cacheValidatedWithOriginServer, latency, protocol, referer, remoteIp, requestMethod, requestSize, requestUrl, responseSize, serverIp, status, userAgent }
  • insertId
  • jsonPayload { variable }
  • labels { variable }
  • logName
  • metadata { systemLabels, userLabels }
  • operation { id, producer, first, last }
  • protoPayload { @type, variable }
  • receiveTimestamp
  • resource { type, labels }
  • severity
  • sourceLocation { file, line, function }
  • spanId
  • textPayload
  • timestamp
  • trace

The following are some field path Identifiers that you may use in your comparisons:

resource.type: If your first path identifier is resource, the next identifier must be a field in the MonitoredResource type.

httpRequest.latency: If your first path identifier is httpRequest, the next identifier must be a field in the HttpRequest type.

labels.[KEY]: If your first path identifier is labels, the next identifier, [KEY], must be one of the keys from the key-value pairs in the labels field.

logName: Since the logName field is a string, it cannot be followed by any subfield names.

Special characters

LogEntry fields that include special characters must be quoted. For example:

jsonPayload.":name":apple

jsonPayload."foo.bar":apple

jsonPayload."\"foo\"":apple
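The quoting rule above can be applied mechanically when field paths are built in code. The Python sketch below is one way to do it under two assumptions that the article does not spell out: a path component made only of letters, digits, and underscores needs no quotes, and backslashes inside a quoted component are escaped along with double quotation marks. The helper names quote_segment and field_path are hypothetical.

```python
import re

# Assumption: components matching this pattern can be written without quotes.
_PLAIN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_segment(segment):
    """Double-quote one path component if it contains special characters."""
    if _PLAIN.match(segment):
        return segment
    escaped = segment.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field_path(*segments):
    """Join path components with dots, quoting each one where needed."""
    return ".".join(quote_segment(s) for s in segments)


print(field_path("jsonPayload", ":name"))
print(field_path("jsonPayload", "foo.bar"))
print(field_path("jsonPayload", '"foo"'))
print(field_path("labels", "compute.googleapis.com/resource_id"))
```

Passing each component separately matters: a key such as foo.bar is a single component, and only quoting keeps it from being read as two.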

Monitored resource types

Select a monitored resource type for quicker queries. See Monitored resource types for a list of resource types.

Compute Engine VMs, for example, use the resource type gce_instance, while Amazon EC2 instances use aws_ec2_instance. The example below explains how to limit your queries to both types of VMs:

resource.type = ("gce_instance" OR "aws_ec2_instance")

Missing fields

If you use a field name in a query and it doesn't appear in a log entry, the field is either missing, undefined, or defaulted:

  • If the field is part of the log entry's payload (jsonPayload or protoPayload), or if it is in a label in the log entry's labels section, it is missing. Using a missing field will not result in an error, but all comparisons will fail silently.
    Examples: jsonPayload.nearest_store, protoPayload.name.nickname
  • The field is defaulted if it is defined in the LogEntry type. Comparisons are carried out as though the field existed and had its default value.
    Examples: httpRequest.remoteIp, trace, operation.producer
  • Otherwise, the field is undefined, which is discovered before the query is executed.
    Examples: thud, operation.thud, textPayload.thud

Object and array types

Each log entry field can store a scalar, object, or array.

  • A scalar field contains a single number, such as 174.4 or -1. A string is also regarded as a scalar. Scalar types include fields that may be converted to (or from) strings, such as Duration and Timestamp.
  • An object type is a collection of named values, such as the JSON value:
    {"age": 24, "height": 67}
  • A collection of items of the same type is stored in an array field. A field containing measurements, for example, may include an array of numbers:
    {8.5, 9, 6}

Values and conversions

The first step in evaluating a comparison is to convert the right-hand side value to the type of the log entry field. Scalar field types are permitted in comparisons, along with two additional types whose values are represented as strings: Duration and Timestamp.

Field types and their permitted query value in tabular form

Types of log fields

The type of a log entry field is decided as follows:

Log fields defined in the LogEntry type and in its component types are protocol buffer fields. Protocol buffer fields have explicit types.

Log fields inside protoPayload objects are also protocol buffer fields and have explicit types. The name of the protocol buffer type is stored in the "@type" field of protoPayload.

The types of log fields inside jsonPayload are inferred from the field's value when the log entry is received:

  • Fields with unquoted numbers have the type double.
  • Fields with true or false values are of type bool.
  • Fields whose values are strings have the type string.

Because long (64-bit) integers cannot be represented exactly as double values, they are stored in string fields.

Only protocol buffer fields recognize the Duration and Timestamp types. Those values are stored in string fields elsewhere.
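The inference rules for jsonPayload can be mimicked in a few lines of Python, which also shows why 64-bit integers end up in string fields. This is a sketch of the rules as described above, not of Cloud Logging's implementation. The function inferred_type and the sample payload are assumptions made for illustration.

```python
import json


def inferred_type(value):
    """Return the jsonPayload field type that the rules above would assign."""
    # bool must be tested before numbers: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "double"
    if isinstance(value, str):
        return "string"
    return type(value).__name__  # objects and arrays are not scalar types


payload = json.loads('{"count": 3, "ok": true, "id": "9007199254740993"}')
types = {name: inferred_type(value) for name, value in payload.items()}
print(types)

# A double has a 53-bit significand, so this integer does not survive the trip.
big = 2**53 + 1
lossy = int(float(big)) != big
print(lossy)
```

Since the id above is stored as a string, a comparison against it should also be written with a quoted value.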

Comments

Comments begin with two dashes (--), and any text after the dashes is ignored until the end of the line. Comments can be placed at the start of a filter, between terms, or at the end of a filter.

Example:

-- All of our target users are emitted by Compute Engine instances.
resource.type = "gce_instance"
-- Looking for logs from "alex".
jsonPayload.targetUser = "alex"

Comparison operators

The equality (=,!=) and inequality (<, <=, >, >=) operators have different meanings depending on the underlying type of the left-hand field name.

  • All numeric types: Normal meaning of equality and inequality is applicable to numbers.
  • bool: Equality means having the same Boolean value. true>false defines inequality.
  • enum: Equality means having the same enumeration value. Inequality uses the underlying numeric values of the enumeration literals.
  • Duration: Equality means having the same length of time. Inequality is based on the length of the duration. For example, as durations, "1s" > "999ms".
  • Timestamp: Equality means the same instant in time. If a and b are Timestamp values, a < b means that a is earlier in time than b.
  • bytes: Operands are compared left to right, byte by byte.
  • string: Letter case is ignored in comparisons.

Global restrictions

A global restriction exists when the comparison consists of a single value. The has (:) operator is used in logging to detect if any field in a log entry, or its payload, includes the global restriction. If it does, the comparison is successful.

Example:

The most basic query stated in terms of a global restriction has only one value:

"The Cat in the Hat" 

For a more intriguing query, combine global restrictions with the AND and OR operators. For example, if you want to see all log entries with a field containing cat and another field containing either hat or bat, submit the query as:

(cat AND (hat OR bat))

There are three global restrictions in this case: cat, hat, and bat. These global restrictions are applied separately, and the results are combined just as if the expression had been written without parentheses.
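To make the idea of a global restriction concrete, the Python sketch below approximates it on a log entry represented as a nested dictionary: every scalar value in the entry is checked for the term. It assumes a plain case-insensitive substring test, which is a simplification of the has operator, and the names iter_values, has_anywhere, and matches are hypothetical.

```python
def iter_values(node):
    """Yield every scalar value in a nested dict or list as a string."""
    if isinstance(node, dict):
        for child in node.values():
            yield from iter_values(child)
    elif isinstance(node, list):
        for child in node:
            yield from iter_values(child)
    else:
        yield str(node)


def has_anywhere(entry, term):
    """Approximate a global restriction: does any field contain the term?"""
    needle = term.lower()
    return any(needle in value.lower() for value in iter_values(entry))


def matches(entry):
    """Model of the query: (cat AND (hat OR bat))."""
    return has_anywhere(entry, "cat") and (
        has_anywhere(entry, "hat") or has_anywhere(entry, "bat")
    )


first = {"textPayload": "The Cat sat", "jsonPayload": {"item": "bat"}}
second = {"textPayload": "cat only"}
print(matches(first), matches(second))
```

Each of the three terms is tested on its own across the whole entry, and only then are the three results combined.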

Functions

Built-in functions can be used as global restrictions in queries:

function = identifier ( [ argument { , argument } ] ), where argument is a value, a field name, or a parenthesized expression.

Below are the different functions:  

log_id

The log_id function returns the log entries whose logName field matches the given [LOG_ID] argument:

Syntax:

log_id([LOG_ID])
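The Python sketch below shows how a log ID relates to a full logName value. It rests on an assumption this article does not state: that logName has the form projects/PROJECT_ID/logs/LOG_ID with the log ID URL-encoded, while log_id takes the plain, unencoded ID. The project ID my-project and the helper names are hypothetical.

```python
from urllib.parse import quote


def log_name_for(project_id, log_id):
    """Build the assumed logName form, with the log ID URL-encoded."""
    return f"projects/{project_id}/logs/{quote(log_id, safe='')}"


def log_id_filter(log_id):
    """Build the log_id(...) restriction from the plain log ID."""
    return f'log_id("{log_id}")'


print(log_name_for("my-project", "cloudaudit.googleapis.com/activity"))
print(log_id_filter("cloudaudit.googleapis.com/activity"))
```

Under that assumption, the function form is convenient because the same filter works without hard-coding a project and without encoding the slash by hand.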

source

The source function looks for log entries from a specific resource in the hierarchy of organizations, folders, and Cloud projects.

The source function does not match any of the child resources. For example, using source(folders/folder_123) matches logs from the folder_123 resource rather than logs from the Cloud project resources included within folder_123.

Syntax:

source(RESOURCE_TYPE/RESOURCE_ID)

sample

The sample function chooses a fraction of all log entries:

Syntax: 

sample([FIELD], [FRACTION])

[FIELD] is the name of a log entry field, such as logName or jsonPayload.a_field. The value of the field determines whether the log entry is in the sample. The field type must be string or numeric. insertId is a good choice for [FIELD] because every log entry has a different value for that field.

[FRACTION] is the fraction of the log entries that have values for [FIELD] to include. It is a number greater than 0.0 and no greater than 1.0. For example, if you specify 0.01, the sample contains roughly one percent of all log entries that have values for [FIELD]. If [FRACTION] is 1, all log entries that have values for [FIELD] are chosen.
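The article does not say how Logging decides which entries fall into the sample, only that the decision depends on the field's value. One common way to get that behaviour is to hash the value, and the Python sketch below illustrates the idea. The use of SHA-256, the function name in_sample, and the generated IDs are all assumptions for illustration, not Cloud Logging's actual algorithm.

```python
import hashlib


def in_sample(field_value, fraction):
    """Decide membership from a hash of the field value (illustrative only)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0.0 and at most 1.0")
    digest = hashlib.sha256(str(field_value).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < fraction * 2**64


ids = [f"entry-{i}" for i in range(10_000)]
kept = [i for i in ids if in_sample(i, 0.01)]
print(len(kept))  # close to 100, but not exactly
```

Because the decision depends only on the field value, the same entry is either always in or always out of the sample, which is why a field with a distinct value per entry, such as insertId, spreads the sample evenly.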

ip_in_net

The ip_in_net function determines whether an IP address in a log entry belongs to a subnet. You can use this to determine whether a request came from an internal or external source.

Syntax:

ip_in_net([FIELD], [SUBNET])

[FIELD] is a string-valued log entry field that contains an IP address or range. The field can be repeated, in which case only one of the repeated values needs to have an address or range contained in the subnet.

[SUBNET] is a string constant for an IP address or range. It is an error if [SUBNET] is not a valid IP address or range.
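The membership test that ip_in_net performs can be reproduced locally with Python's standard ipaddress module, which is handy for checking a subnet before using it in a query. This is a sketch of the same idea for single addresses, not the Logging implementation. The sample addresses are arbitrary.

```python
import ipaddress


def ip_in_net(address, subnet):
    """Return True if the address lies inside the subnet.

    Raises ValueError when either argument is not a valid address or range.
    """
    network = ipaddress.ip_network(subnet, strict=False)
    return ipaddress.ip_address(address) in network


print(ip_in_net("10.1.2.3", "10.0.0.0/8"))
print(ip_in_net("8.8.8.8", "10.0.0.0/8"))
```

Testing against a private range such as 10.0.0.0/8 is one way to separate internal requests from external ones.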

Searching by time

You can set specific limits on the date and time of the log entries shown in the interface. For example, if you add the following conditions to your query, the results show only the log entries from the indicated 30-minute period, and you cannot scroll outside that time frame:

timestamp >= "2016-11-29T23:00:00Z" timestamp <= "2016-11-29T23:30:00Z"

When you write a query with a timestamp, you must use dates and times in the format shown above.
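Timestamps in this format can be generated rather than typed by hand. The Python sketch below builds a time-window restriction ending at a given moment. The helper names rfc3339 and window_filter are hypothetical, and the sketch writes the AND explicitly between the two conditions.

```python
from datetime import datetime, timedelta, timezone


def rfc3339(moment):
    """Format an aware datetime as a UTC timestamp such as 2016-11-29T23:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def window_filter(end, minutes):
    """Build a restriction covering the `minutes` before `end`."""
    start = end - timedelta(minutes=minutes)
    return f'timestamp >= "{rfc3339(start)}" AND timestamp <= "{rfc3339(end)}"'


end = datetime(2016, 11, 29, 23, 30, tzinfo=timezone.utc)
window = window_filter(end, 30)
print(window)
```

For a window that ends now, pass datetime.now(timezone.utc) as end.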

Using regular expressions

Regular expressions can be used to build queries and filters for sinks, metrics, and anywhere log filters are used. Regular expressions can be used in the Query builder and with the Google Cloud CLI.

The following are the characteristics of regular expression queries:

  • A regular expression can only match fields of the string type.
  • String normalization is not used; for example, kubernetes is not the same as KUBERNETES. 
  • By default, queries are not anchored and are case-sensitive.
  • Boolean operators can be used on the right side of the regular expression comparison operators, =~ and !~, to connect multiple regular expressions.

The following is the structure of a regular expression query:

Match a pattern:

jsonPayload.message =~ "regular expression pattern"

Does not match a pattern:

jsonPayload.message !~ "regular expression pattern"

The =~ and !~ operators turn the query into a regular expression query. The pattern to match must be enclosed in double quotation marks. To search for a pattern that contains double quotation marks, escape them with a backslash.
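The Python sketch below illustrates two points from this section: what "not anchored and case-sensitive" means in practice, and how double quotation marks in a pattern are escaped when the query string is built. Python's re module is used only as a stand-in, so details of the regular expression dialect may differ from Cloud Logging's. The helper regex_filter is hypothetical.

```python
import re


def regex_filter(field, pattern, negate=False):
    """Build a =~ or !~ comparison, escaping double quotes in the pattern."""
    op = "!~" if negate else "=~"
    escaped = pattern.replace('"', '\\"')
    return f'{field} {op} "{escaped}"'


message = "my-kubernetes-pod started"

unanchored = re.search("kube", message) is not None       # matches mid-string
anchored = re.search("^kube", message) is not None        # must start the value
wrong_case = re.search("KUBERNETES", message) is not None  # case-sensitive

print(unanchored, anchored, wrong_case)
print(regex_filter("jsonPayload.message", 'say "hi"'))
```

If a pattern has to match the whole field value, add the ^ and $ anchors yourself.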

Finding log entries quickly

Logging always indexes certain LogEntry fields, including resource.type, logName, severity, timestamp, and insertId.

You can also add custom indexed fields to any log bucket.

Optimize your queries

Reduce the number of logs, the number of log entries, or the time span of your searches to make them faster. Better yet, you can reduce all three.

Examples:

Use the right log name: Specify the log that contains the log entries you are interested in. Examine one of your log entries to make sure you have the correct log name.

Choose the right log entries: If you know the log entries you're looking for are coming from a particular VM instance, identify it. Examine one of the log entries you wish to search for to ensure that the label names are correct.

Choose the right time period: Set a time period for the search. The GNU/Linux date command is a simple way to determine relevant timestamps in RFC 3339 format.


Troubleshooting

Syntax issues

If you're having trouble with the expressions in your queries, take a look at the following:

  • Your query follows the syntax rules, with matching parentheses and quotation marks.
  • Your log entry field names are spelled correctly.
  • Boolean operators are in uppercase letters (AND, OR, NOT).
  • JSON null values are written as NULL_VALUE.
  • Boolean expressions used as global restrictions or as the right-hand side of comparisons are parenthesized for clarity. For example, the two queries below look equivalent, but they are not:
    insertId = "ABC-1" OR "ABC-2"  -- ERROR!?
    insertId = ("ABC-1" OR "ABC-2")
  • Unquoted text contains no special characters. When in doubt, add double quotation marks.
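Some of the checks above are easy to automate before a query is submitted. The Python sketch below looks for unbalanced parentheses, unterminated quoted strings, and lowercase Boolean operators outside quotes. It is a rough heuristic and not a parser: it ignores -- comments, and the function name check_query is hypothetical.

```python
import re

_QUOTED = re.compile(r'"(?:\\.|[^"\\])*"')
_LOWER_OP = re.compile(r"(?<![\w.])(and|or|not)(?![\w.])")


def check_query(query):
    """Return a list of likely problems; an empty list means none were found."""
    problems = []
    depth = 0
    in_quotes = False
    escaped = False
    for ch in query:
        if in_quotes:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_quotes = False
        elif ch == '"':
            in_quotes = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                problems.append("unmatched closing parenthesis")
            else:
                depth -= 1
    if in_quotes:
        problems.append("unterminated quoted string")
    if depth > 0:
        problems.append("unmatched opening parenthesis")
    # Look for lowercase operators only outside quoted strings.
    for word in _LOWER_OP.findall(_QUOTED.sub('""', query)):
        problems.append(f"lowercase '{word}' is parsed as a search term")
    return problems


print(check_query('resource.type = ("gae_app" OR "gce_instance")'))
print(check_query("severity = ERROR and (cat"))
```

A clean result does not prove that a query is valid, but a non-empty one usually points to the mistake quickly.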

Frequently Asked Questions

What is a log query?

A log query is a Boolean expression written in the Logging query language. It selects a subset of the log entries in the Google Cloud resource you have chosen, such as a Cloud project or folder, and can be used in the Logs Explorer, the Logging API, and the command-line interface.

How do I query logs in GCP?

Enable Show query to view your search phrases within the query expression. After you've entered your search criteria, press the Enter key or click Run query. The query results are presented in the Query results window.

How do I enable GCP Logging?

In the left sidebar, navigate to APIs & auth > APIs. To activate the Google Cloud Logging API, toggle the status button from OFF to ON.

Conclusion

In this article, we have extensively discussed the Logging query language. Our discussion mainly focused on what the query language is, its syntax, and how to implement it efficiently.

We hope this blog has helped you enhance your Google Cloud Platform knowledge. To learn more about Google Cloud Platform concepts, refer to our articles on All about GCP Certifications: Google Cloud Platform | Coding Ninjas Blog.  


Do upvote our blog to help other ninjas grow. Happy Coding!

