Code360 powered by Coding Ninjas X Naukri.com
Last Updated: Mar 27, 2024

Creating and Managing Instances in Cloud Spanner

Introduction

Cloud Spanner is a fully managed relational database service that offers transactional consistency at a global scale. It provides automatic, synchronous replication for high availability and supports two SQL dialects: Google Standard SQL and PostgreSQL. It combines transactions, SQL queries, and relational structures with the scalability usually associated with non-relational (NoSQL) databases. Creating and managing instances is a crucial step in Cloud Spanner because instances allocate the resources used by databases. Each Cloud Spanner instance can have multiple databases. Cloud Spanner also provides options to back up and restore databases on demand.

Working with Instances

Spanner is a fully managed service that oversees its own underlying tasks and resources, including monitoring and restarting processes with zero downtime. Spanner does not allow a given instance to be stopped or restarted manually. Creating an instance requires two choices: an instance configuration and a compute capacity. Together they determine where the instance's resources are located and how much compute and storage it has.

An instance configuration defines the geographic placement and replication of the databases in that instance. It can be either regional or multi-regional.

Compute capacity determines the amount of server and storage resources available to the databases in an instance. The compute capacity is specified in terms of processing units or nodes, with 1000 processing units equal to 1 node.

Nodes and Processing units

Instances with fewer than 1000 processing units are built for smaller data sizes, queries, and workloads. They have limited resources, so some workloads may see non-linear scaling and increased latencies. For such instances, Cloud Spanner allows 409.6 GB of data for every 100 processing units. It also allocates server resources in a single server task per zone.

For instances of 1 node or more, Cloud Spanner allows 4 TB of data for each node. It allocates server resources in multiple server tasks per zone, with one task for each 1000 processing units, unlike the single task used by smaller instances. This provides better performance and enables Cloud Spanner to create database splits.
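The capacity figures above can be turned into a small calculation. The following is a minimal sketch, not an official formula: it assumes 1 TB = 1024 GB (so that 409.6 GB per 100 processing units and 4 TB per node describe the same rate), and it assumes capacity is chosen in multiples of 100 processing units below one node and multiples of 1000 above it. The function names are hypothetical.

```python
PROCESSING_UNITS_PER_NODE = 1000
# Article's figure: 409.6 GB per 100 processing units, i.e. 4096 GB (4 TB) per node.
# Assumes 1 TB = 1024 GB.
GB_PER_NODE = 4096


def to_nodes(processing_units: int) -> float:
    """Convert processing units to nodes (1000 processing units = 1 node)."""
    return processing_units / PROCESSING_UNITS_PER_NODE


def storage_limit_gb(processing_units: int) -> float:
    """Storage limit in GB for a compute capacity, using the figures above."""
    if processing_units < 100:
        raise ValueError("compute capacity starts at 100 processing units")
    # Assumption: steps of 100 below one node, steps of 1000 from one node up.
    step = 100 if processing_units < PROCESSING_UNITS_PER_NODE else 1000
    if processing_units % step != 0:
        raise ValueError(f"expected a multiple of {step} processing units")
    return processing_units * GB_PER_NODE / PROCESSING_UNITS_PER_NODE


for pu in (100, 500, 1000, 3000):
    print(f"{pu} PU = {to_nodes(pu)} node(s), up to {storage_limit_gb(pu)} GB")
```

The storage limits may differ in current Cloud Spanner releases, so treat the constants as the article's values rather than fixed facts.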

Regional and Multi-regional Configurations

The instance configuration determines where an instance's data is stored. In a regional configuration, all resources are contained within one Google Cloud region. In a multi-regional configuration, the resources span more than one region. Google Cloud services are available across North America, South America, Europe, Asia, and Australia.

For all regional configurations, Cloud Spanner maintains three read-write replicas, each within a different Google Cloud zone in that region. Each read-write replica contains a full copy of the operational database and can serve both read-write and read-only requests.

If an application needs to read data from multiple geographic locations, or if writes originate from a different location than the reads, a multi-regional configuration might be a better choice. This configuration replicates the database's data in multiple zones across multiple regions, as defined by the instance configuration. Multi-region configurations let applications achieve faster reads in more places at the cost of a small increase in write latency.

Replication

Cloud Spanner uses replicas in different zones to maintain availability even when a single zone fails. It gets byte-level replication automatically from the underlying distributed filesystem: Cloud Spanner writes database mutations to files in this filesystem, which takes care of replicating and recovering the files during a machine or disk failure.

Cloud Spanner also creates replicas of each database split. All of the data in a split is physically stored together in the replica.

The benefits of replication are high data availability across different regions and continents, a single-database experience, and strong consistency. This makes application development and maintenance faster and easier.

Types

Cloud Spanner has three types of replicas: read-write, read-only, and witness replicas. Single-region instances use only read-write replicas, while multi-region instances use a combination of all three types.

Cloud Spanner replicas

  1. Read-write replicas support reads and writes. They maintain a full copy of the data and are used in both single-region and multi-region instances.
     
  2. Read-only replicas support only reads and are used in multi-region instances. They maintain a full copy of the data replicated from read-write replicas.
     
  3. Witness replicas do not maintain a full copy of the data and are used only in multi-region instances. They do not support reads.
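The three replica types can be summarised as a small lookup table. This is only an illustrative sketch of the properties listed above; the class and function names are hypothetical, and the `serves_writes=False` value for witness replicas is an assumption, since the text only states that they hold no full copy and serve no reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaType:
    name: str
    serves_reads: bool
    serves_writes: bool
    full_copy: bool


REPLICA_TYPES = {
    "read-write": ReplicaType("read-write", True, True, True),
    "read-only": ReplicaType("read-only", True, False, True),
    # Assumption: witnesses do not serve client writes either.
    "witness": ReplicaType("witness", False, False, False),
}


def replica_types_for(multi_region: bool) -> tuple:
    """Replica types an instance uses, as described in this section."""
    if multi_region:
        return ("read-write", "read-only", "witness")
    return ("read-write",)


def readable_replicas(multi_region: bool) -> list:
    """Names of the replica types that can serve reads for an instance."""
    return [
        name
        for name in replica_types_for(multi_region)
        if REPLICA_TYPES[name].serves_reads
    ]


print(readable_replicas(multi_region=True))
print(readable_replicas(multi_region=False))
```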

Create and Manage Instances

Users can create instances using the Google Cloud console or the Cloud CLI.

Step 1: Go to Create an Instance on the Console.

Step 2: Enter the instance name, instance ID, configuration, and compute capacity for the instance you want to create.

Step 3: Click on Create.

Options to Edit, List, and Delete are available on the Spanner Instances Page in the Console.
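The same instance can be created from the Cloud CLI. The sketch below only assembles the `gcloud spanner instances create` command as an argument list; it assumes the Cloud CLI is installed and authenticated, and the instance ID, configuration name, and description are placeholder values, not ones taken from this article.

```python
import shlex


def build_create_instance_command(instance_id, config, description, processing_units):
    """Assemble the gcloud command that creates a Spanner instance."""
    return [
        "gcloud", "spanner", "instances", "create", instance_id,
        f"--config={config}",
        f"--description={description}",
        f"--processing-units={processing_units}",
    ]


# Placeholder values for illustration only.
cmd = build_create_instance_command(
    instance_id="test-instance",
    config="regional-us-central1",
    description="Test instance",
    processing_units=1000,
)
print(shlex.join(cmd))

# To actually run it (requires gcloud and a configured project):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems when the description contains spaces.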

Working with Databases

A Cloud Spanner database contains tables, views, and indexes. In the Cloud Spanner hierarchy, a database is a child of an instance and the parent of a schema. Databases inherit properties from their parent instance, including its configuration and the available compute capacity. The properties that can be set for a database are:

  • The dialect: Google Standard SQL or PostgreSQL.

  • An IAM policy that defines the access rules applied to the tables and data inside the database.

  • The type of encryption key: Google-managed or customer-managed.

  • Database policies such as the default leader region, query optimiser version, query optimiser statistics package version, and version retention period.

Create a Database

Step 1: On the Instances page in the Console, select the instance in which to create the database.

Step 2: Click on Create Database, enter the name, and choose a dialect.

Step 3: Optionally, provide DDL statements to define the schema (for Google Standard SQL-dialect databases).

Step 4: Click on Create.

Update a Database

Step 1: On the Instances page in the Console, select the Instance that contains the database to be updated.

Step 2: Select the Database and click on Write DDL.

Step 3: Enter the DDL statements.

Step 4: Click on Submit to apply changes.

Delete a Database

Step 1: On the Instances page in the Console, select the Instance that contains the database to be deleted.

Step 2: Select the Database and click on Delete.
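The console steps for creating, updating, and deleting a database have Cloud CLI counterparts. The sketch below assembles those `gcloud spanner databases` commands as argument lists without running them; the helper names and the sample instance, database, and DDL values are hypothetical, and the check that DDL is only passed at creation for Google Standard SQL databases mirrors Step 3 of the create procedure above.

```python
DIALECTS = ("GOOGLE_STANDARD_SQL", "POSTGRESQL")


def create_database_cmd(instance_id, database_id,
                        dialect="GOOGLE_STANDARD_SQL", ddl=None):
    """Command to create a database, optionally with an initial schema."""
    if dialect not in DIALECTS:
        raise ValueError(f"dialect must be one of {DIALECTS}")
    if ddl and dialect != "GOOGLE_STANDARD_SQL":
        raise ValueError("initial DDL is only passed for Google Standard SQL here")
    cmd = ["gcloud", "spanner", "databases", "create", database_id,
           f"--instance={instance_id}", f"--database-dialect={dialect}"]
    if ddl:
        cmd.append(f"--ddl={ddl}")
    return cmd


def update_ddl_cmd(instance_id, database_id, ddl):
    """Command to apply a schema change to an existing database."""
    return ["gcloud", "spanner", "databases", "ddl", "update", database_id,
            f"--instance={instance_id}", f"--ddl={ddl}"]


def delete_database_cmd(instance_id, database_id):
    """Command to delete a database."""
    return ["gcloud", "spanner", "databases", "delete", database_id,
            f"--instance={instance_id}"]


# Placeholder names and schema for illustration only.
schema = "CREATE TABLE Singers (SingerId INT64 NOT NULL) PRIMARY KEY (SingerId)"
print(create_database_cmd("test-instance", "example-db", ddl=schema))
print(update_ddl_cmd("test-instance", "example-db",
                     "ALTER TABLE Singers ADD COLUMN Name STRING(100)"))
print(delete_database_cmd("test-instance", "example-db"))
```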

Connect 

Compute Engine instances can access the Cloud Spanner API by using a service account that acts on the user's behalf. A service account provides application default credentials for applications, so there is no need to configure each Compute Engine instance with personal user credentials. The service account can be configured on an instance either with full access to all Cloud APIs or with read and write access to Cloud Spanner databases only.

Create and connect a VM to access Cloud Spanner

Configure Instance with full access to all Cloud APIs

This configuration is convenient for development and testing.

Step 1: Go to the Compute Engine VM page and choose the current project.

Step 2: Click on Create Instance.

Step 3: In the Identity and API access section, select Allow full access to all Cloud APIs.

Step 4: Configure and click on Create.

Configure Instance with Service account

This option restricts the instance to specific APIs and roles by using a service account that has permission to access only Cloud Spanner databases. It is recommended for production environments.

Step 1: Select the service account that will act on your behalf to access Cloud Spanner.

Step 2: Select the project on the Compute Engine VM instances page and click on Continue.

Step 3: Click on Create an Instance.

Step 4: In the Identity and API access section, select the service account from the list under Service account. Click on Create.

PGAdapter

Users can connect psql, the command-line client for PostgreSQL, to a PostgreSQL-dialect database in Cloud Spanner. PGAdapter is a proxy that supports the PostgreSQL interface for Cloud Spanner. It exposes an endpoint on localhost that speaks the PostgreSQL wire protocol and translates it into the Cloud Spanner wire protocol. A PostgreSQL client such as psql can connect to a Cloud Spanner database via this proxy.

PGAdapter can run standalone in a VM or be containerised and packaged as a Docker image. Alternatively, the supplied JAR file can be used to create and start a PGAdapter instance inside a Java application.

Backup and Restore

Cloud Spanner's backup and restore features allow users to create backups of databases on demand and restore them to protect against operator and application errors. Backups are encrypted and can be retained for up to a year from the time of their creation.

Using Cloud Console

Step 1: On the instance overview page, select Backup/Restore from the left pane.

Step 2: Click on Create in the Backups table.

Step 3: Fill in the required information and click on Create.

Step 4: To restore a database, select a backup from the table and click on Restore.

Using Cloud CLI

Step 1: Configure gcloud with the current project.

gcloud config set core/project my-project

Step 2: Run the following command to create a backup named backup_example with a retention period of 1 year.

gcloud spanner backups create backup_example --instance=test-instance \
    --database=backup_example_db --retention-period=1y --async

Step 3: To track the progress of the backup, run the command below with the operation ID returned by the previous step.

gcloud spanner operations describe _auto_op_234567 \
    --instance=test-instance --backup=backup_example

Step 4: Restore a database by executing this command.

gcloud spanner databases restore --async \
    --destination-instance=test-instance --destination-database=example-restored \
    --source-instance=test-instance --source-backup=backup_example

Step 5: Run this command to check the progress of the restoration.

gcloud spanner operations describe _auto_op_bb8e360b256b04bf \
    --instance=test-instance --database=example-restored

Point-in-Time Recovery

Cloud Spanner point-in-time recovery (PITR) provides protection against accidental deletions or writes. PITR can recover data from a point in time up to a maximum of 7 days in the past. By default, a database retains all versions of its data and schema for 1 hour. This limit can be increased to 7 days through the version_retention_period option. The version retention period can be set on the Backup/Restore tab of the database Overview page.
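The retention period can also be changed with a DDL statement. The sketch below builds such a statement for a Google Standard SQL-dialect database and checks that the value falls within the 1 hour to 7 day range described above. The helper names are hypothetical, and the accepted unit suffixes (s, m, h, d) are an assumption of this sketch.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
MIN_RETENTION_SECONDS = 3600          # 1 hour, the default
MAX_RETENTION_SECONDS = 7 * 86400     # 7 days, the maximum


def retention_to_seconds(value: str) -> int:
    """Parse a duration such as '36h' or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def build_retention_ddl(database_id: str, retention: str) -> str:
    """Build the ALTER DATABASE statement that sets version_retention_period."""
    seconds = retention_to_seconds(retention)
    if not MIN_RETENTION_SECONDS <= seconds <= MAX_RETENTION_SECONDS:
        raise ValueError("retention period must be between 1h and 7d")
    return (
        f"ALTER DATABASE `{database_id}` "
        f"SET OPTIONS (version_retention_period = '{retention}')"
    )


# 'example-db' is a placeholder database name.
print(build_retention_ddl("example-db", "7d"))
```

The resulting statement can be submitted the same way as any other schema update, for example through Write DDL in the Console.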

Users can recover a portion of the database or the entire database. To recover a portion of the database, perform a stale read that specifies a query condition and a timestamp, then write the results back into the live database. The timestamp must be more recent than the database's earliest_version_time.

To recover an entire database, create a backup or export that specifies a timestamp in the past, then restore or import it into a new database. This is typically used to recover from data corruption, when the entire database must be reverted to a point in time before the corruption occurred.

Frequently Asked Questions

Does Cloud Spanner automatically scale instances based on CPU usage?

Yes, with the help of the Autoscaler, a companion tool for Cloud Spanner that automatically increases or decreases the number of nodes or processing units in one or more Spanner instances based on their utilisation. It monitors the instances and adds or removes compute capacity so that they stay within the recommended maximums for CPU utilisation and storage per node.

What are the benefits of Replication in Cloud Spanner?

The benefits of replication are high data availability across different regions and continents, a single-database experience, and strong consistency. This makes application development and maintenance faster and easier.

How is the performance affected for databases with high retention periods and overwrite rates?

Increased storage utilisation can decrease performance. Cloud Spanner uses additional computing resources to maintain old versions of data, and a longer retention period means that schema versions must also be retained for longer. Such databases can show higher CPU usage and latency, and schema updates take more time to complete.

Conclusion

This blog discusses creating and managing instances and databases in Cloud Spanner. It also covers how to connect a VM to Spanner instances, backup and restore, and point-in-time recovery.

Check out our articles on Cloud Logging in GCP, Monitoring Agent, and Identity Access Management. Explore our Library on Coding Ninjas Studio to gain knowledge on Data Structures and Algorithms, Machine Learning, Deep Learning, Cloud Computing and many more! Test your coding skills by solving our test series and participating in the contests hosted on Coding Ninjas Studio!

Looking for questions from tech giants like Amazon, Microsoft, Uber, etc.? Look at the problems, interview experiences, and interview bundle for placement preparations.

Upvote our blogs if you find them insightful and engaging! Happy Coding!

Thank you
