Last Updated: Mar 27, 2024

Key Concepts of Cloud Logging

Author Shivani Singh

Introduction

Cloud logging allows you to manage log data from multiple cloud resources from a single location. Log data is critical for assessing and improving cloud performance and security. However, making effective use of log data from complex, multi-cloud architectures can be extremely difficult, and simplicity and granular visibility are frequently required to keep cloud logging efficient.

Cloud logging is based on the generation of log files, which are collections of data that record events that occur in your systems. Requests, transactions, user information, and timestamps are all examples of data that can be found in log files. The specific data collected by logs is determined by how your components are configured. There are several types of logs to collect when performing cloud logging, including event logs, transaction logs, message logs, and audit logs. You can use log management software to ingest, process, and correlate data to make the collection and aggregation process easier.

In this blog, we will discuss the key concepts of Cloud Logging.

Basic concepts

Cloud Logging is a product in Google Cloud's operations suite. It includes log storage, the Logs Explorer user interface, and an API for managing logs programmatically. Logging allows you to read and write log entries, query your logs, and control how your logs are routed, stored, and used.

Log Entries


A log entry records the status of a computer system or describes specific events or transactions that occur in it. Log entries are generated by your own code, Google Cloud services on which your code is running, third-party applications, and the infrastructure on which Google Cloud is based. Certain log entries identify particular events that occur within the system. You can use these log entries to generate messages that reassure users that everything is fine or to provide information when something goes wrong. Other log entries may contain information about transactions handled by a system or component. A load balancer, for example, logs every request it receives. A load balancer may also record information such as the requested URL and the HTTP response code.

A log entry must include at least the following:

  • A timestamp indicating when the event occurred or when it was received by Cloud Logging.
  • The monitored resource that produced the log entry.
  • A payload, also known as a message, which can be either unstructured text or structured text in JSON format.
  • The name of the log to which it belongs.

Log entries can also contain metadata, such as the severity of the entry.
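As a rough illustration of the minimum contents listed above, the sketch below checks a dictionary shaped like a log entry. The field names follow the JSON form of a LogEntry (timestamp, resource, logName, and a payload field); the helper missing_parts and the sample values are hypothetical and are not part of any Google library.

```python
from datetime import datetime, timezone

# Fields every entry needs, plus the payload variants (one of which must be present).
REQUIRED_FIELDS = ("timestamp", "resource", "logName")
PAYLOAD_FIELDS = ("textPayload", "jsonPayload", "protoPayload")


def missing_parts(entry: dict) -> list:
    """Return the names of the minimum parts that the entry lacks."""
    missing = [name for name in REQUIRED_FIELDS if name not in entry]
    if not any(name in entry for name in PAYLOAD_FIELDS):
        missing.append("payload")
    return missing


# A hypothetical entry: project, instance ID, and message are made up.
entry = {
    "timestamp": datetime(2024, 3, 27, 12, 0, tzinfo=timezone.utc).isoformat(),
    "resource": {"type": "gce_instance", "labels": {"instance_id": "1234567890"}},
    "logName": "projects/my-project/logs/syslog",
    "jsonPayload": {"message": "disk usage at 91%"},
    "severity": "WARNING",  # optional metadata
}

print(missing_parts(entry))
```

An entry that passes this check has every part the list above calls for; severity is left out of the check because it is optional metadata.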

Logs

A log is a named collection of log entries within a Google Cloud resource, such as a Cloud project. Logs only exist if they contain log entries.

A log's name is the full path of the resource to which the log entries belong, followed by a log ID. The log ID is either a simple ID, such as syslog, or a structured ID that includes the log's writer, such as compute.googleapis.com/activity.
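A minimal sketch of how such a name can be assembled is shown below. It assumes the common convention that the log ID is URL-encoded when it is embedded in the full name, so that a structured ID containing a slash stays a single path segment. The function name and project ID are hypothetical.

```python
from urllib.parse import quote


def log_name(parent: str, log_id: str) -> str:
    """Join a resource path and a log ID into a full log name.

    The log ID is percent-encoded (assumption: "/" becomes "%2F")
    so that it remains one path segment.
    """
    return f"{parent}/logs/{quote(log_id, safe='')}"


print(log_name("projects/my-project", "syslog"))
print(log_name("projects/my-project", "compute.googleapis.com/activity"))
```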

Retention Period

The retention period is the amount of time that log entries are kept in Cloud Logging. The entries are then deleted. Logging Quotas and Limits lists the retention periods for various types of logs. Some of your logs' retention periods can be customized.

Query and Filter

A query is a Logging query language expression that returns log entries that match the expression. In the Logs Explorer and the Logging API, queries are used to select and view log entries, such as those from a specific VM instance or those arriving in a specific time period with a specific severity level.

A filter is a Logging query language expression that is used in sinks to route logs that match the expression to a storage destination. When creating logs-based metrics, you can also use filters to direct logs that match the expression to Cloud Monitoring.
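To make the idea concrete, the sketch below composes a query string of the kind described above (entries from one resource type at or above a given severity). The helper build_filter is a hypothetical convenience, not part of the Logging API; it only joins comparison clauses with AND.

```python
def build_filter(*clauses: str) -> str:
    """Combine individual comparisons into one expression joined by AND."""
    kept = [clause.strip() for clause in clauses if clause.strip()]
    return " AND ".join(f"({clause})" for clause in kept)


# Hypothetical query: error-level entries from Compute Engine VM instances.
query = build_filter('resource.type="gce_instance"', "severity>=ERROR")
print(query)
```

The same string could be pasted into the Logs Explorer as a query or used as a sink filter.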

Monitored Resources

Each log entry includes the name of a monitored resource to indicate where it came from. Individual Compute Engine VM instances, Google Kubernetes Engine containers, database instances, and so on are examples.

Log Router

All logs, including audit logs, platform logs, and user-written logs, are sent to the Cloud Logging API, where they pass through the Log Router. The Log Router compares each log entry to existing rules to determine which log entries should be ingested (stored) in log buckets, which should be routed to a destination, and which should be excluded (discarded).
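The routing decision can be pictured with a small toy model. This is only a sketch of the idea and not how the Log Router is implemented: each sink is checked independently, an entry goes to a sink's destination when it matches the inclusion rule and no exclusion rule, and an entry matched by no sink is discarded. The Sink class, the sink names, and the destinations are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Predicate = Callable[[dict], bool]


@dataclass
class Sink:
    name: str
    destination: str
    include: Predicate
    exclusions: List[Predicate] = field(default_factory=list)
    disabled: bool = False


def route(entry: dict, sinks: List[Sink]) -> List[str]:
    """Return the destinations that receive the entry (empty list = discarded)."""
    destinations = []
    for sink in sinks:
        if sink.disabled or not sink.include(entry):
            continue
        if any(excluded(entry) for excluded in sink.exclusions):
            continue
        destinations.append(sink.destination)
    return destinations


sinks = [
    Sink("_Default", "log bucket _Default", include=lambda e: True,
         exclusions=[lambda e: e.get("severity") == "DEBUG"]),
    Sink("errors-to-bq", "BigQuery dataset", include=lambda e: e.get("severity") == "ERROR"),
]

print(route({"severity": "ERROR"}, sinks))
print(route({"severity": "DEBUG"}, sinks))
```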

Sinks

Sinks govern how logs are routed by Cloud Logging. You can use sinks to route some or all of your logs to supported destinations or to prevent log entries from being saved in Logging. A sink has a destination and a filter that determines which log entries to route.

Supported Destinations

You can use the Log Router to route specific logs to supported destinations in any Cloud project. Logging supports the following sink destinations:

  • Cloud Storage: JSON files stored in Cloud Storage buckets; a low-cost, long-term storage solution.
  • BigQuery: Tables created in BigQuery datasets; allows for big data analysis.
  • Pub/Sub: JSON-formatted messages delivered to Pub/Sub topics; supports third-party integrations such as Splunk.
  • Cloud Logging: Log entries stored in log buckets; offers storage with customizable retention periods.

Storing, Viewing, and Managing logs

The following section describes how logs are stored in Cloud Logging and how you can view and manage them.

Log Buckets

Cloud Logging stores and organizes log data in log buckets, which are containers in your Google Cloud projects, billing accounts, folders, and organizations. The logs you store in Cloud Logging are indexed, optimized, and delivered so that you can analyze them in real time. Cloud Logging buckets are distinct from the similarly named Cloud Storage buckets.

Logging automatically creates two log buckets for each Cloud project, billing account, folder, and organization: _Required and _Default. Logging automatically creates sinks named _Required and _Default, which route logs to the correspondingly named buckets in the default configuration. You can turn off the _Default sink, which sends logs to the _Default log bucket. 

_Default log bucket

Unless you disable or otherwise modify the _Default sink, any log entry that isn't ingested by the _Required bucket is routed to the _Default bucket by the _Default sink. See Manage sinks for information on how to modify them. The _Default bucket cannot be deleted.

User-defined log buckets

In any Cloud project, you can also create user-defined log buckets. You can route any subset of your logs to any log bucket by applying sinks to your user-defined log buckets, allowing you to choose which Cloud project your logs are stored in and which other logs are stored with them.

Regionalization

Log buckets are regional resources: the infrastructure that stores, indexes, and searches your logs is located in a specific geographical location. Google manages that infrastructure so that your applications are available redundantly across the zones within that region.

Organization policy

To ensure that your organization meets its compliance and regulatory requirements, you can create an organization policy. Using an organization policy, you can specify the regions in which your organization can create new log buckets. You can also prevent your organization from creating new log buckets in certain regions.

Cloud Logging only enforces your newly created organization policy on new log buckets, not on existing log buckets.

Configure and manage sinks

Sinks govern how logs are routed by Cloud Logging. You can use sinks to route some or all of your logs to supported destinations or to prevent log entries from being saved in Logging. 

Prerequisites

Before you build a sink, make the following checks:

  • You have a Google Cloud folder or organization that contains logs that can be viewed in Logs Explorer.
  • For the Google Cloud organization or folder from which you're routing logs, you have one of the following IAM roles:
      • Owner (roles/owner)
      • Logging Admin (roles/logging.admin)
      • Logs Configuration Writer (roles/logging.configWriter)
  • You have a resource in a supported destination, or have the ability to create one.

Create an Aggregated Sink

Do the following to create an aggregated sink for your folder or organization:

Step 1: Navigate to the Logging > Log Router page in the Google Cloud console. Select an existing folder or organization.

Step 2: Select Create sink. Enter the following information in the Sink details panel:

  • Sink name: Enter an identifier for the sink. Keep in mind that you cannot rename the sink after you create it, but you can delete it and create a new one.
  • Sink description (optional): Describe the sink's purpose or use case.

Step 3: Select the sink service and destination in the Sink destination panel:

  • Select sink service: Choose the service to which you want your logs to be routed. Depending on the service you choose, you can select from the following destinations:
      • Cloud Logging bucket: Choose or create a Logging bucket.
      • BigQuery dataset: Choose or create a specific dataset to receive routed logs. Partitioned tables are an additional option.
      • Cloud Storage bucket: Choose or create a Cloud Storage bucket to receive routed logs.
      • Pub/Sub topic: Choose or create a specific topic to receive routed logs.
      • Splunk: Select the Pub/Sub topic for your Splunk service.

If your sink destination is a BigQuery dataset, for example, the sink destination would be as follows:

bigquery.googleapis.com/projects/PROJECT_ID/datasets/DATASET_ID

Step 4: Do the following in the Choose logs to include in the sink panel:

  • Choose Include logs ingested by this resource and all child resources to create an aggregated sink.
  • Enter a filter expression that matches the log entries you want to include in the Build inclusion filter field. If no filter is specified, all logs from the selected resource are routed to the destination.
  • Select Preview logs to verify that you entered the correct filter. This launches the Logs Explorer in a new tab.

Step 5: Select Create sink.
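The sink destination used in Step 3 is a plain string. The sketch below assembles such strings for the four destination types. Only the BigQuery pattern appears in this article; the other three patterns follow the same convention and should be treated as assumptions to verify against the official documentation. The function name and all IDs are hypothetical.

```python
# Destination path patterns; only the "bigquery" entry is taken from the text above.
DESTINATION_TEMPLATES = {
    "storage": "storage.googleapis.com/{bucket}",
    "bigquery": "bigquery.googleapis.com/projects/{project}/datasets/{dataset}",
    "pubsub": "pubsub.googleapis.com/projects/{project}/topics/{topic}",
    "logging": "logging.googleapis.com/projects/{project}/locations/{location}/buckets/{bucket}",
}


def sink_destination(kind: str, **ids: str) -> str:
    """Fill in the destination pattern for the given destination kind."""
    if kind not in DESTINATION_TEMPLATES:
        raise ValueError(f"unsupported destination kind: {kind!r}")
    return DESTINATION_TEMPLATES[kind].format(**ids)


print(sink_destination("bigquery", project="my-project", dataset="my_logs"))
print(sink_destination("storage", bucket="my-gcs-bucket"))
```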

Create Filters for Aggregated Sinks

Like any other sink, your aggregated sink has a filter that selects individual log entries. 

The following notation is used when writing filter comparisons for aggregated sinks:

  • The colon (:) is the substring operator. Don't substitute the = operator for it.
  • Names written in capitals, such as ORGANIZATION_ID, are variables. Replace them with appropriate values.

It should be noted that the length of a filter cannot exceed 20,000 characters.
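As a small sketch of the two points above, the helpers below build a substring comparison with the : operator and reject filters longer than 20,000 characters. Both helpers and the organization ID are hypothetical, and the example filter (matching logs whose name contains an organization's log path) is only one plausible use.

```python
MAX_FILTER_LENGTH = 20_000  # limit stated in the text above


def has(field_name: str, substring: str) -> str:
    """Build a substring comparison using the ':' operator."""
    return f'{field_name}:"{substring}"'


def validate_filter(filter_expr: str) -> str:
    """Return the filter unchanged, or raise if it is too long."""
    if len(filter_expr) > MAX_FILTER_LENGTH:
        raise ValueError(
            f"filter is {len(filter_expr)} characters; the limit is {MAX_FILTER_LENGTH}"
        )
    return filter_expr


# Hypothetical organization ID.
print(validate_filter(has("logName", "organizations/123456789/logs/")))
```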

Set Destination Permissions

To allow your sink to route to its destination, do the following:

Step 1: Obtain the new sink's writer identity, which is an email address. Navigate to the Log Router page and, from the sink's More actions menu, select View sink details. The Sink details panel displays the writer identity.

Step 2: If you have Owner access to the destination, add the writer identity to the destination as follows:

  • For Cloud Storage destinations, add the sink's writer identity to your Cloud Storage bucket and assign it the Storage Object Creator role.
  • For BigQuery destinations, add the sink's writer identity to your dataset and assign it the BigQuery Data Editor role.
  • For Pub/Sub destinations, including Splunk, add the sink's writer identity to your topic and assign it the Pub/Sub Publisher role.
  • For Logging bucket destinations in different Cloud projects, add the sink's writer identity to the destination bucket and grant it the roles/logging.bucketWriter role.

If you don't have Owner access to the sink destination, send the writer identity service account name to someone who does. That person can then add the writer identity to the sink destination by following the instructions in the previous step.
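The role assignments in Step 2 can be summarized as a lookup table. The sketch below maps each destination type to the role ID that, to the best of our knowledge, corresponds to the role named above, and produces an IAM-style binding for a writer identity. The helper name and the service account address are hypothetical placeholders.

```python
# Role IDs assumed to correspond to the role names listed in Step 2.
WRITER_ROLES = {
    "storage": "roles/storage.objectCreator",   # Storage Object Creator
    "bigquery": "roles/bigquery.dataEditor",    # BigQuery Data Editor
    "pubsub": "roles/pubsub.publisher",         # Pub/Sub Publisher
    "logging": "roles/logging.bucketWriter",    # Logs Bucket Writer
}


def binding_for(destination_kind: str, writer_identity: str) -> dict:
    """Return the role binding the destination owner needs to add."""
    return {"role": WRITER_ROLES[destination_kind], "members": [writer_identity]}


# Placeholder writer identity; copy the real one from the Sink details panel.
identity = "serviceAccount:sink-writer@example.iam.gserviceaccount.com"
print(binding_for("bigquery", identity))
```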

View Logs in Sink Destinations

This section describes how to locate log entries that have been routed from Cloud Logging to supported destinations.

Cloud Storage

Step 1: In the Google Cloud console, navigate to the Cloud Storage Browser page.

Step 2: Choose the Cloud Storage bucket that serves as your routing destination.

Routing Frequency

Hourly batches of log entries are saved to Cloud Storage buckets. It could take up to 3 hours for the first entries to appear.

Logs Organization

Logging writes a set of files to a Cloud Storage bucket when you route logs to it.

The files are arranged in directory hierarchies according to log type and date. The log type, referred to as [LOG_ID] in the LogEntry reference, can be a simple name like syslog or a compound name like appengine.googleapis.com/request_log. If these logs were stored in a bucket called my-gcs-bucket, the directories would be named as follows:

my-gcs-bucket/syslog/YYYY/MM/DD/

my-gcs-bucket/appengine.googleapis.com/request_log/YYYY/MM/DD/

Logs from multiple types of resources can be stored in a single Cloud Storage bucket. Individual files can be as large as 3.5 GiB.
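The directory layout shown above can be reproduced with a few lines of code. This is a minimal sketch that builds the directory prefix for a given log ID and date; the function name is hypothetical, and the bucket name is the one used in the example above.

```python
from datetime import date


def log_directory(bucket: str, log_id: str, day: date) -> str:
    """Return the directory under which a day's files for one log are written."""
    return f"{bucket}/{log_id}/{day:%Y/%m/%d}/"


print(log_directory("my-gcs-bucket", "syslog", date(2024, 3, 27)))
print(log_directory("my-gcs-bucket", "appengine.googleapis.com/request_log", date(2024, 3, 27)))
```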

BigQuery

Step 1: Navigate to the Google Cloud console's BigQuery page.

Step 2: Choose the dataset that will be the sink's destination.

Step 3: Choose a table from the dataset. You can view the log entries on the Details tab, or you can query the table to get your data.

Table Organization

Logging creates dated tables to retain the routed log entries when you route logs to a BigQuery dataset. Log entries are placed in tables with names based on the log names and timestamps of the entries. 
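A rough sketch of the naming scheme for dated tables is shown below. It assumes that characters in the log name that are not letters or digits become underscores and that the entry's date is appended as YYYYMMDD. The exact rule is an assumption here, so check a real dataset before relying on it. The function name is hypothetical.

```python
import re
from datetime import date


def table_name(log_id: str, day: date) -> str:
    """Derive a dated table name from a log ID (assumed naming rule)."""
    safe_id = re.sub(r"[^0-9A-Za-z]", "_", log_id)
    return f"{safe_id}_{day:%Y%m%d}"


print(table_name("syslog", date(2017, 5, 23)))
print(table_name("compute.googleapis.com/activity_log", date(2017, 5, 23)))
```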

Pub/Sub

Do the following to view your routed logs as they are streamed through Pub/Sub:

Step 1: Navigate to the Google Cloud console's Pub/Sub page.

Step 2: Find or create a subscription to the log sink's topic and pull a log entry from it. It is possible that you will have to wait for a new log entry to be published.

BigQuery Schema for Logs

This section describes the format and rules that must be followed when routing log entries from Cloud Logging to BigQuery.

Field Naming Conventions

When sending logs to BigQuery, the following naming conventions apply to the log entry fields:

  • The length of log entry field names cannot exceed 128 characters.
  • Log entry field names can only contain alphanumeric characters. Any unsupported characters in field names are removed and replaced with underscore characters. For example, jsonPayload.foo%% would be transformed to jsonPayload.foo__.
  • Even after transformation, log entry field names must begin with an alphanumeric character; any leading underscores are removed.
  • The corresponding BigQuery field names for log entry fields of the LogEntry type are the same as the log entry fields.
  • The corresponding BigQuery field names for any user-supplied log entry fields are normalized to lowercase, but the naming is otherwise preserved.
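The conventions above can be sketched as a small normalization function that works on a single field name segment, such as foo%% in jsonPayload.foo%%. How Logging handles an over-long name is not stated here, so raising an error in that case is an assumption of this sketch. The function name is hypothetical.

```python
import re

MAX_FIELD_NAME_LENGTH = 128


def bigquery_field_name(name: str, user_supplied: bool = True) -> str:
    """Apply the naming conventions to one field name segment."""
    # Replace unsupported characters with underscores.
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", name)
    # Names must begin with an alphanumeric character.
    cleaned = cleaned.lstrip("_")
    # User-supplied fields are normalized to lowercase; LogEntry fields keep their case.
    if user_supplied:
        cleaned = cleaned.lower()
    if not cleaned:
        raise ValueError(f"{name!r} has no usable characters")
    if len(cleaned) > MAX_FIELD_NAME_LENGTH:
        # Assumption: treat an over-long name as an error.
        raise ValueError(f"{name!r} exceeds {MAX_FIELD_NAME_LENGTH} characters")
    return cleaned


print(bigquery_field_name("foo%%"))
print(bigquery_field_name("_Trace-Id"))
print(bigquery_field_name("httpRequest", user_supplied=False))
```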

Payload Fields with @type

Structured data can be found in log entry payloads. An optional type specifier in the following format can be included in any structured field:

@type: type.googleapis.com/[TYPE]

Naming Rules for @type

Structured fields with type specifiers are typically given BigQuery field names with an [TYPE] appended to them. Any string can be used as the value of [TYPE].

The @type naming conventions only apply to the top level of jsonPayload or protoPayload; nested fields are ignored. Logging removes the prefix type.googleapis.com when dealing with top-level structured payload fields.

Audit Logs Fields

The audit log payload fields protoPayload.request, protoPayload.response, and protoPayload.metadata have @type specifiers but are treated as JSON data. That is, their BigQuery schema names are their field names with Json appended to them, and they contain JSON-formatted string data.

Mismatches in Schema

The schema for the destination BigQuery table is determined by the first log entry received by BigQuery. BigQuery creates a table with columns based on the fields and types of the first log entry. A schema mismatch occurs when log entries are written to the destination table and one of the following happens:

  • A subsequent log entry changes the field type of an existing field in the table.
  • A new log entry contains a field that isn't in the current schema, and inserting that field into the destination table would exceed the BigQuery column limit.

If the new field does not exceed the column limit, the destination table can accept it.
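The two mismatch conditions can be expressed as a short check. The sketch below models a table schema as a mapping from field name to type name and takes the column limit as a parameter rather than hard-coding BigQuery's actual limit. It is a toy model of the rules above, not BigQuery's implementation, and all names in it are hypothetical.

```python
def check_entry(schema: dict, entry_fields: dict, column_limit: int) -> str:
    """Classify an incoming entry against the current table schema."""
    # Condition 1: an existing field arrives with a different type.
    for name, type_name in entry_fields.items():
        if name in schema and schema[name] != type_name:
            return f"mismatch: field {name!r} changed type"
    # Condition 2: new fields would push the table past the column limit.
    new_fields = [name for name in entry_fields if name not in schema]
    if len(schema) + len(new_fields) > column_limit:
        return "mismatch: column limit exceeded"
    return "ok"


schema = {"status": "INTEGER", "message": "STRING"}
print(check_entry(schema, {"status": "STRING"}, column_limit=10))
print(check_entry(schema, {"latency": "FLOAT"}, column_limit=10))
print(check_entry(schema, {"latency": "FLOAT"}, column_limit=2))
```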

Troubleshoot Routing and Sinks

This section explains how to use the Google Cloud console to view and troubleshoot common sink-related issues, including configuration errors and unexpected results.

Destination is Missing Logs

The most common sink-related problem is when logs appear to be missing from a sink destination. In some cases, no error is generated, but logs may be inaccessible when you attempt to access them at your destination. Check your sink's system log-based metrics if you suspect it isn't properly routing logs:

  • exports/byte_count: The number of bytes in the log entries that were routed.
  • exports/log_entry_count: The number of log entries that were routed.
  • exports/error_count: The number of log entries that were unable to be routed.

The metrics have labels that record the counts by sink name and destination name and tell you whether your sink is successfully routing log data or failing.

View Errors

Logging provides error messages for incorrectly configured sinks for each of the supported sink destinations. There are several ways to view sink-related errors: examining the error logs for the sink, receiving sink error notifications by email, and using the Activity Stream feature in the Google Cloud console.

Activity Stream

You can use the Google Cloud console Activity Stream to see truncated versions of your sinks' configuration errors. Carry out the following actions:

  • Check that you have the permissions needed to view errors in the Activity Stream; you must have the Owner role on the Google Cloud resource.
  • Navigate to the Activity Stream for the Cloud project or other resource where you created the sink.
  • Select Activity types > Configuration and Resource type > Logging export sink from the Filters panel.
  • Change the Date/Time to see sink errors for the appropriate time period.

Any sink setup errors that apply to the resource appear as Cloud Logging sink configuration errors in the list. Each error includes a link to one of the generated log entries.

Types of Sink Errors

Incorrect Destination

If you configure a sink but then receive a configuration error stating that the destination could not be found when Logging attempts to route logs, consider the following:

  • Your sink configuration contains a misspelling or other formatting error in the specified sink destination. In this case, update the sink's configuration so that it specifies the existing destination correctly.
  • The specified destination may have been deleted. In this case, either change the sink's configuration to use a different, existing destination or recreate the destination with the same name.

In either case, go to the Log Router in the Google Cloud console to resolve the issue.

Managing Sinks Issues

If you disabled a sink to stop log ingestion but still see logs being routed, wait a few minutes for the changes to the sink to take effect.

If a sink attempts to route a log entry but does not have the necessary IAM permissions for the sink's destination, the sink reports an error that you can view and skips the log entry.

Organizational Policy Issues

If you try to route a log entry but run into an organization policy that prevents Logging from writing to the sink's destination, the sink will fail to route to the chosen destination and will report an error. If you notice errors related to organization policies, you can take the following steps:

  • Update the destination's organization policy to remove the constraints that are preventing the sink from routing log entries. This assumes you have the appropriate permissions to update the organization policy. See Creating and editing policies for details.
  • If you cannot update the organization policy, update your sink in the Log Router so that it routes to a destination that complies with the policy.

Quota Issues

When sinks write logs, the Cloud projects in which the sinks were created are subject to destination-specific quotas. If the quotas are reached, the sink stops routing logs to the destination. To resolve quota exhaustion issues, reduce the amount of log data being routed by updating your sink's filter to match fewer log entries. You can use the sample function in your filter to select a subset of the total number of log entries.
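A sampled filter might be put together as sketched below. The sketch assumes the Logging query language form sample(insertId, FRACTION), which keeps roughly that fraction of matching entries. The helper name and the base filter are hypothetical.

```python
def sampled_filter(base_filter: str, fraction: float) -> str:
    """Narrow a filter so that only a fraction of matching entries is routed."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be greater than 0 and at most 1")
    return f"({base_filter}) AND sample(insertId, {fraction})"


# Route roughly a quarter of the matching entries.
print(sampled_filter('resource.type="gce_instance"', 0.25))
```

Sampling trades completeness for volume, so it suits destinations used for trend analysis rather than for audit or compliance records.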

Configure Default Resource Settings for Logging

Specify Storage Region

Log buckets are containers that store and organize log data in your Cloud projects, billing accounts, folders, and organizations. Logging creates two log buckets for each Cloud project, billing account, folder, and organization: _Required and _Default. By default, these buckets are stored in the global location, which means no specific region is set.

To specify the storage location for the _Required and _Default buckets of your organization and its child resources, use the gcloud alpha logging settings update command with the --organization and --storage-location flags:

gcloud alpha logging settings update --organization=ORGANIZATION_ID --storage-location=LOCATION

Replace the variables in the preceding command as follows:

  • ORGANIZATION_ID: The ID of the Google Cloud organization for which you want to set the default.
  • LOCATION: The region in which the data will be stored. See Data regionality: Supported regions for a list of supported storage locations.

Use the gcloud alpha logging settings describe command to view your organization's settings, including the default storage location:

gcloud alpha logging settings describe --organization=ORGANIZATION_ID

Disable the _Default Sink

To prevent logs from being ingested into the organization's _Default buckets, disable all of its _Default sinks. If you disable ingestion into a resource's _Default bucket, the logs that would have been routed to that bucket are removed from Logging storage, unless they are explicitly included in another user-defined sink for that resource.

Use the gcloud alpha logging settings update command with the --organization and --disable-default-sink flags to disable the _Default sinks for your organization and any of its child resources:

gcloud alpha logging settings update --organization=ORGANIZATION_ID --disable-default-sink

Frequently Asked Questions

What is Google Cloud Platform?

Google Cloud Platform (GCP) is Google's cloud platform, which gives users access to cloud systems and computing services. GCP provides a wide range of cloud computing services in the compute, database, storage, migration, and networking domains.

What are the GCP cloud storage libraries and tools?

  • Google Cloud Platform Console, which performs basic object and bucket operations.
  • gsutil command-line tool, which provides a command-line interface for Cloud Storage.
  • Cloud Storage Client Libraries, which provide programming support for various languages such as Java, Ruby, and Python.

What is the pricing model in the GCP cloud?

On Google Cloud, Compute Engine usage is billed based on compute instances, network use, and storage. Virtual machines are charged on a per-second basis, with a minimum charge of one minute. The cost of storage is calculated based on the amount of data stored.

The network cost is calculated based on the amount of data transferred between virtual machine instances communicating with one another over the network.

Conclusion 

To sum it up, in this blog, we discussed Cloud Logging and its key concepts. We also discussed many of the operations that can be performed, such as storing, viewing, and managing logs, configuring and managing sinks, and troubleshooting routing and sinks.

For more content, refer to our guided paths on Coding Ninjas Studio to upskill yourself.

Do upvote our blogs if you find them helpful and engaging!

Happy Learning!
