Table of contents
1. Introduction
2. Replication
2.1. How it works
2.2. Use cases
2.2.1. Isolate serving applications from batch reads
2.2.2. Improve availability
2.2.3. Provide near-real-time backup
2.2.4. Ensure your data has a global presence
2.3. Replication settings
2.3.1. Isolate batch analytics workloads from other applications
2.3.2. Create high availability (HA)
2.3.3. Provide near-real-time backup
2.3.4. Maintain high availability and regional resilience
2.3.5. Store data close to your users
3. Failovers
3.1. Types of failovers
3.1.1. Manual failovers
3.1.2. Automatic failovers
3.2. Manage failovers
3.2.1. Perform a manual failover
3.2.2. Perform an automatic failover
4. Backups
4.1. Features
4.2. Managing backups
4.2.1. Create a table backup
4.2.2. Restore from a table backup
4.2.3. Delete a backup
5. Creating and managing instance labels
5.1. Labels
5.2. Add or update an instance's labels
5.3. Remove a label from an instance
6. Frequently Asked Questions
6.1. What is the gcloud CLI?
6.2. What are the four types of cloud storage?
6.3. What is cloud bursting?
6.4. What is IAM?
7. Conclusion
Last Updated: Mar 27, 2024

Key Concepts of Cloud Bigtable

Introduction

Google Bigtable is a column-oriented, distributed data store that Google built to handle huge amounts of structured data for the company's web services operations. Applications like the Google App Engine Datastore, Google Personalized Search, Google Earth, and Google Analytics all use Google Bigtable as their database. Bigtable was originally designed for applications that need tremendous scalability, and it is meant to be used with petabytes of data.

In this blog, let us start our discussion with replication for Bigtable and then gradually move on to backup and restore.

Replication

Replication for Cloud Bigtable lets you increase the durability and availability of your data by copying it across multiple regions or across multiple zones within the same region. You can also isolate workloads by routing different types of requests to different clusters.

How it works

Bigtable supports replicated clusters in up to eight Google Cloud regions where Bigtable is available. Each zone in a region can contain only one cluster. Because the clusters are spread across multiple zones or regions, you can still access your instance's data even if one Google Cloud zone or region becomes unavailable.

When you create an instance with more than one cluster, Bigtable immediately begins synchronizing your data between the clusters, creating a separate, independent copy of your data in each zone where your instance has a cluster. Similarly, when you add a new cluster to an existing instance, Bigtable copies your existing data from the original cluster's zone to the new cluster's zone, then synchronizes changes to your data between the zones.

Bigtable automatically replicates all data changes, including all of the following kinds of updates:

  • Data updates to existing tables
  • New and deleted tables
  • Additions and deletions of column families
  • Modifications to the garbage collection rules for a column family

Bigtable treats each cluster in your instance as a primary cluster, so you can read from and write to every cluster. You can also configure your instance so that requests from different types of applications are routed to different clusters.

Before you add clusters to an instance, make sure you understand the limitations that apply when you change garbage collection policies for replicated tables.
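
The behaviour described in this section can be pictured with a toy model. The sketch below is not the Bigtable API: the class names, zone names, and the explicit `replicate()` step are assumptions made for illustration, and conflict resolution between concurrent writes is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One replica; holds its own independent copy of the rows."""
    zone: str
    rows: dict = field(default_factory=dict)


class ReplicatedInstance:
    """Toy model: every cluster accepts writes, and changes are copied to the others."""

    def __init__(self, zones):
        if len(set(zones)) != len(zones):
            raise ValueError("only one cluster per zone is allowed")
        self.clusters = {zone: Cluster(zone) for zone in zones}
        self.pending = []  # changes accepted by one cluster but not yet replicated

    def write(self, zone, key, value):
        # Any cluster can take the write; it is visible locally right away.
        self.clusters[zone].rows[key] = value
        self.pending.append((zone, key, value))

    def add_cluster(self, zone, copy_from):
        # A new cluster starts with a copy of the data held by an existing cluster.
        if zone in self.clusters:
            raise ValueError("only one cluster per zone is allowed")
        self.clusters[zone] = Cluster(zone, dict(self.clusters[copy_from].rows))

    def replicate(self):
        # Deliver each pending change to every cluster other than its origin.
        for origin, key, value in self.pending:
            for zone, cluster in self.clusters.items():
                if zone != origin:
                    cluster.rows[key] = value
        self.pending.clear()


# Illustrative zone names only.
instance = ReplicatedInstance(["us-east1-b", "europe-west1-c"])
instance.write("us-east1-b", "user#1", "alice")
before = dict(instance.clusters["europe-west1-c"].rows)  # change not yet replicated
instance.replicate()
after = dict(instance.clusters["europe-west1-c"].rows)
```

In this model a read from the second cluster misses the write until `replicate()` runs, which is the reason some applications prefer to send all of their requests to a single cluster.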

Use cases

Some everyday use cases for Bigtable replication are:

Isolate serving applications from batch reads

When a batch analytics job that performs many large reads shares a single cluster with an application that performs a mix of reads and writes, users of the application may see a performance hit. With replication, you can use app profiles with single-cluster routing to send batch analytics jobs and application traffic to different clusters, so that batch jobs don't affect your application's users.
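
As a rough sketch of this setup, the helper below builds the Google Cloud CLI commands that would create one app profile per workload, each with single-cluster routing. The profile, instance, and cluster IDs are hypothetical, and the flags are given as I understand them, so check `gcloud bigtable app-profiles create --help` before relying on them.

```python
import shlex


def create_app_profile_command(profile_id, instance_id, cluster_id):
    """Build a gcloud command for an app profile with single-cluster routing."""
    return shlex.join([
        "gcloud", "bigtable", "app-profiles", "create", profile_id,
        f"--instance={instance_id}",
        f"--route-to={cluster_id}",
    ])


# Hypothetical IDs: serving traffic goes to one cluster, batch reads to another.
serving_cmd = create_app_profile_command("serving", "my-instance", "cluster-a")
batch_cmd = create_app_profile_command("batch-analytics", "my-instance", "cluster-b")
print(serving_cmd)
print(batch_cmd)
```

Each application then connects with its own app profile ID, so the batch job never lands on the cluster that serves users.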

Improve availability

If an instance has only one cluster, the durability and availability of your data are limited to the zone where that cluster is located. Replication can improve both by keeping separate copies of your data in multiple zones or regions and automatically failing over between clusters when necessary.

Provide near-real-time backup

In some cases, you need to route every request to a single cluster, for instance because you cannot afford to read stale data. You can still use replication by having one cluster handle requests and keeping a second cluster as a near-real-time backup. If the serving cluster becomes unavailable, you can reduce downtime by manually failing over to the backup cluster.

Ensure your data has a global presence

You can set up replication in locations around the world to bring your data closer to your customers. For instance, you can create an instance with replicated clusters in the US, Europe, and Asia, and route application traffic to the closest cluster.

Replication settings

This section revisits the common use cases for Cloud Bigtable replication and outlines the setup that suits each one:

Isolate batch analytics workloads from other applications

Users of the application may see a performance hit when a batch analytics job that executes several large reads on a single cluster is running concurrently with an application that executes a mix of reads and writes. Application traffic and batch analytics jobs can be sent to different clusters using app profiles and single-cluster routing in replication, keeping users of your apps unaffected by batch jobs.

Create high availability (HA)

If an instance has only one cluster, the durability and availability of your data are limited to the zone where that cluster is located. Replication can improve both durability and availability by keeping separate copies of your data in multiple zones or regions and automatically failing over between clusters when necessary.

Provide near-real-time backup

In some cases, you need to route every request to a single cluster, for instance because you cannot afford to read stale data. You can still use replication by having one cluster handle requests and keeping a second cluster as a near-real-time backup. If the serving cluster becomes unavailable, you can reduce downtime by manually failing over to the backup cluster.

Maintain high availability and regional resilience

Suppose your customers are concentrated in two distinct regions of a continent. You want to serve each concentration of customers from Bigtable clusters that are as close to them as possible. You also want your data to be highly available within each region, and you might want a failover option in case one or more of your clusters becomes unavailable.

For this use case, you can create an instance with two clusters in region A and two clusters in region B. This setup provides high availability even if you cannot connect to an entire Google Cloud region. It also provides regional resilience: if one zone becomes unavailable, the other cluster in that zone's region is still available.
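
To see why this layout tolerates both kinds of outage, here is a small sketch that checks which clusters remain after a zone or region fails. The cluster, region, and zone names are placeholders, and the check assumes one cluster per zone, as Bigtable requires.

```python
from collections import Counter

# Hypothetical topology: cluster ID -> (region, zone).
CLUSTERS = {
    "a1": ("region-a", "region-a-zone-1"),
    "a2": ("region-a", "region-a-zone-2"),
    "b1": ("region-b", "region-b-zone-1"),
    "b2": ("region-b", "region-b-zone-2"),
}


def surviving_clusters(clusters, failed_zone=None, failed_region=None):
    """Return the IDs of clusters still reachable after a zone or region outage."""
    return sorted(
        cluster_id
        for cluster_id, (region, zone) in clusters.items()
        if zone != failed_zone and region != failed_region
    )


def has_regional_resilience(clusters):
    """True if every region keeps at least one cluster when any single zone fails.

    With one cluster per zone, this holds when each region has two or more clusters.
    """
    per_region = Counter(region for region, _zone in clusters.values())
    return all(count >= 2 for count in per_region.values())


after_zone_outage = surviving_clusters(CLUSTERS, failed_zone="region-a-zone-1")
after_region_outage = surviving_clusters(CLUSTERS, failed_region="region-a")
```

Losing one zone in region A still leaves a cluster in region A, and losing all of region A still leaves both clusters in region B.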

Store data close to your users

If you have users all over the world, you can reduce latency by running your application near your users and storing your data as close to your application as possible. With Bigtable, you can create clusters in several Google Cloud regions, and your data is automatically replicated across all of them.

Failovers

If a Cloud Bigtable cluster becomes unresponsive, replication lets incoming traffic fail over to another cluster in the same instance. Failovers can be manual or automatic, depending on which app profile an application uses and how that app profile is configured.

The following sections describe how manual and automatic failovers work in an instance that uses replication.

Types of failovers

Manual failovers

If an app profile uses single-cluster routing to send all requests to one cluster, you must use your own judgment to decide when to start failing over to a different cluster.

The following signs could suggest that switching to a different cluster would be beneficial:

  • The cluster begins to report a significant amount of transient system failures.
  • Numerous requests begin to time out.
  • The average response time increases to an unacceptably high degree.
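
These signs can be turned into a simple check that a monitoring script might run. This is only a sketch: the `ClusterHealth` fields and the threshold values are assumptions, and real thresholds depend on your workload.

```python
from dataclasses import dataclass


@dataclass
class ClusterHealth:
    error_rate: float      # fraction of requests failing with transient errors
    timeout_rate: float    # fraction of requests that timed out
    avg_latency_ms: float  # average response time in milliseconds


def failover_reasons(health, max_error_rate=0.05, max_timeout_rate=0.05,
                     max_latency_ms=500.0):
    """Return the warning signs that are present; an empty list means none."""
    reasons = []
    if health.error_rate > max_error_rate:
        reasons.append("high transient error rate")
    if health.timeout_rate > max_timeout_rate:
        reasons.append("many requests timing out")
    if health.avg_latency_ms > max_latency_ms:
        reasons.append("average latency too high")
    return reasons


healthy = failover_reasons(ClusterHealth(0.01, 0.0, 120.0))
degraded = failover_reasons(ClusterHealth(0.20, 0.10, 900.0))
```

The function only reports which signs are present; the decision to fail over remains a human judgment call.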

Automatic failovers

When an app profile uses multi-cluster routing, Bigtable handles failovers automatically. If the nearest cluster cannot process a request, Bigtable routes the traffic to the nearest cluster that is available.

Even if a cluster is down for very little time, automatic failovers can take place. For instance, Bigtable will often retry a request on another cluster if it is routed to one cluster and that cluster responds unreasonably slowly or gives a temporary error.
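
The routing rule can be sketched in a few lines. This models the idea only, not Bigtable's internal implementation; the cluster IDs and the `is_available` callback are hypothetical.

```python
def route_request(clusters_by_distance, is_available):
    """Return the nearest cluster that can serve the request.

    clusters_by_distance: cluster IDs ordered from nearest to farthest.
    is_available: callable that reports whether a cluster can take the request.
    """
    for cluster_id in clusters_by_distance:
        if is_available(cluster_id):
            return cluster_id
    raise RuntimeError("no cluster in the instance is available")


# Hypothetical cluster IDs, ordered by distance from the application server.
ordering = ["us-cluster", "europe-cluster", "asia-cluster"]
down = {"us-cluster"}
chosen = route_request(ordering, lambda cluster_id: cluster_id not in down)
print(chosen)  # europe-cluster
```

Once the nearest cluster recovers, the same rule sends traffic back to it without any configuration change.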

Manage failovers

Perform a manual failover

Use a manual failover if an app profile routes all requests to a single cluster and that cluster becomes unresponsive.
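
In practice, a manual failover means pointing the app profile at a different cluster. The sketch below builds the Google Cloud CLI command for that change; the IDs are hypothetical and the flags are given as I understand them, so confirm them with `gcloud bigtable app-profiles update --help`.

```python
import shlex


def manual_failover_command(profile_id, instance_id, target_cluster_id):
    """Build a gcloud command that reroutes an app profile to another cluster."""
    return shlex.join([
        "gcloud", "bigtable", "app-profiles", "update", profile_id,
        f"--instance={instance_id}",
        f"--route-to={target_cluster_id}",
    ])


# Hypothetical IDs: move the "serving" profile from cluster-a to cluster-b.
failover_cmd = manual_failover_command("serving", "my-instance", "cluster-b")
print(failover_cmd)
```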

Perform an automatic failover

With multi-cluster routing, Bigtable fails over automatically. If an app profile uses multi-cluster routing and the cluster closest to the application server develops a problem, you don't need to do anything. Even if the problem is only temporary, Bigtable automatically routes requests to the nearest healthy cluster until the affected cluster has recovered.

Backups

Backups help you recover from application-level data corruption and from operator errors such as accidentally deleting a table. You can restore from a backup to a new table, either in the instance where the backup was created or in a different instance.

Features

Fully integrated: The Bigtable service handles backups; no import or export is required.

Cost-effective: By using Bigtable backups, you can avoid paying for the services used to export, store, and import data.

Automatic expiration: Each backup has a user-defined expiration date, which can be up to 30 days after the backup is created.

Flexible restore options: You can restore from a backup to a table in a different instance from the one where the backup was created.
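
A script that creates backups could check the expiration rule before calling the service. The sketch below uses the 30-day limit quoted in this article as a default; the function name and constant are hypothetical, and the limit should be checked against the current Bigtable documentation.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION = timedelta(days=30)  # limit as stated in this article


def validate_expiration(created_at, expires_at, max_retention=MAX_RETENTION):
    """Raise ValueError unless expires_at is after creation and within the limit."""
    if expires_at <= created_at:
        raise ValueError("expiration must be after the backup is created")
    if expires_at - created_at > max_retention:
        raise ValueError(f"expiration must be within {max_retention.days} days")
    return expires_at


created = datetime(2024, 3, 1, tzinfo=timezone.utc)
one_week = validate_expiration(created, created + timedelta(days=7))
```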

Managing backups

You can work with Bigtable backups with the help of the following:

  • The console.
  • The Google Cloud CLI.
  • The Cloud Bigtable client libraries.

You can also use the API directly, although we highly advise against doing so unless you have to use a language that the Bigtable client libraries do not support.

Create a table backup

  1. Navigate to the console's Bigtable instances page.
  2. Open the instance list.
  3. Click the instance that contains the table you want to back up.
  4. For the table you want to back up, click Create backup.
  5. If you're using replication, use the dropdown menu to select the Cluster ID of the cluster that should store the backup. (The cluster is pre-selected if you clicked Create backup next to a cluster ID on the Tables page rather than an instance ID.)
  6. Give the backup a unique ID and set an expiration date.
  7. Click Create.
  8. The console displays the Backups page, filtered to show the new backup and its details.
  9. To view the backup's status, click Activity.
  10. When the backup has finished, the status column shows Backup complete.
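
The same backup can be created from the Google Cloud CLI. The helper below is a sketch that assembles the command; all IDs are hypothetical, and the flags (including `--retention-period`) are given as I understand them, so verify them with `gcloud bigtable backups create --help`.

```python
import shlex


def create_backup_command(backup_id, instance_id, cluster_id, table_id,
                          retention="7d"):
    """Build a gcloud command that backs up one table to the chosen cluster."""
    return shlex.join([
        "gcloud", "bigtable", "backups", "create", backup_id,
        f"--instance={instance_id}",
        f"--cluster={cluster_id}",
        f"--table={table_id}",
        f"--retention-period={retention}",
    ])


# Hypothetical IDs for illustration.
backup_cmd = create_backup_command("orders-backup", "my-instance", "cluster-a", "orders")
print(backup_cmd)
```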

Restore from a table backup

  1. Navigate to the console's Bigtable instances page.
  2. Open the instance list.
  3. Click the instance that contains the backup you want to restore.
  4. Click Backups in the left pane.
  5. For the backup you want to restore, click Restore.
  6. Choose the instance to which you want to restore the backup.
  7. Instances that lack enough storage for the new table cannot be selected, and neither can instances in which you do not have permission to create a table.
  8. If the backup is CMEK-protected, the destination instance must also be CMEK-protected.
  9. Enter a unique ID for the table that will be created from the backup. This ID cannot be changed later.
  10. Click Restore.
  11. The console shows the Tables page, filtered to show the new table.
  12. The console displays the restore status for each cluster. When the status column shows Ready for all clusters, the table has been restored and replicated to all clusters in the instance.
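
A restore can also be started from the Google Cloud CLI. The sketch below builds the backup's full resource name and the restore command; every ID is hypothetical, and the flags are given as I understand them, so verify them with `gcloud bigtable instances tables restore --help`.

```python
import shlex


def backup_resource_name(project_id, instance_id, cluster_id, backup_id):
    """Return the full resource name that identifies a backup."""
    return (f"projects/{project_id}/instances/{instance_id}"
            f"/clusters/{cluster_id}/backups/{backup_id}")


def restore_command(source_backup, destination_table, destination_instance):
    """Build a gcloud command that restores a backup into a new table."""
    return shlex.join([
        "gcloud", "bigtable", "instances", "tables", "restore",
        f"--source={source_backup}",
        f"--destination={destination_table}",
        f"--destination-instance={destination_instance}",
    ])


# Hypothetical IDs; the destination table ID must not already exist.
source = backup_resource_name("my-project", "my-instance", "cluster-a", "orders-backup")
restore_cmd = restore_command(source, "orders-restored", "my-other-instance")
print(restore_cmd)
```

Passing a different destination instance mirrors step 6 of the console flow, where you choose the instance that receives the new table.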

Delete a backup

  • Navigate to the console's Bigtable instances page.
  • Click the instance that contains the backup.
  • Click Backups in the left pane.
  • For the backup you want to delete, expand the More menu next to Restore, then click Delete.
  • In the Confirm deletion section, enter the backup ID, then click Delete.
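
A script can imitate the console's confirmation step before it deletes anything. In this sketch the delete command is built only when the typed confirmation matches the backup ID; the IDs are hypothetical and the flags should be verified with `gcloud bigtable backups delete --help`.

```python
import shlex


def delete_backup_command(backup_id, instance_id, cluster_id, typed_confirmation):
    """Build the delete command only if the typed ID matches the backup ID."""
    if typed_confirmation != backup_id:
        raise ValueError("confirmation does not match the backup ID")
    return shlex.join([
        "gcloud", "bigtable", "backups", "delete", backup_id,
        f"--instance={instance_id}",
        f"--cluster={cluster_id}",
    ])


# Hypothetical IDs for illustration.
delete_cmd = delete_backup_command(
    "orders-backup", "my-instance", "cluster-a", typed_confirmation="orders-backup"
)
print(delete_cmd)
```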

Creating and managing instance labels

Labels

A label is a key-value pair that helps you organise your Google Cloud resources. You can attach a label to each resource and then filter resources by their labels. Information about labels is forwarded to the billing system, so you can also break down your billed charges by label.
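
The two uses of labels, filtering and cost breakdown, can be illustrated with plain dictionaries. The instance IDs, labels, and cost figures below are invented for the example and do not come from any real billing export.

```python
from collections import defaultdict

# Hypothetical instances with labels and a monthly cost figure.
INSTANCES = [
    {"id": "orders-prod", "labels": {"env": "prod", "team": "sales"}, "cost": 120.0},
    {"id": "orders-dev", "labels": {"env": "dev", "team": "sales"}, "cost": 15.0},
    {"id": "metrics-prod", "labels": {"env": "prod", "team": "infra"}, "cost": 80.0},
]


def filter_by_label(instances, key, value):
    """Return the IDs of instances whose label `key` equals `value`."""
    return [inst["id"] for inst in instances if inst["labels"].get(key) == value]


def cost_by_label(instances, key):
    """Sum the cost of instances grouped by the value of label `key`."""
    totals = defaultdict(float)
    for inst in instances:
        totals[inst["labels"].get(key, "(unlabeled)")] += inst["cost"]
    return dict(totals)


prod_instances = filter_by_label(INSTANCES, "env", "prod")
team_costs = cost_by_label(INSTANCES, "team")
```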

Add or update an instance's labels

After you create a Bigtable instance, you can add labels to it and update existing labels using the Google Cloud console. Alternatively, you can add or modify labels with the RPC Admin API or the REST Admin API.

To add or update labels for a Bigtable instance using the console:

  1. Open the console and go to the list of Bigtable instances.
  2. Open the instance list.
  3. Check the box next to each instance whose labels you want to modify.
  4. If the Labels panel is not already open, click Show info panel in the top right corner.
  5. Add and update labels as needed.
  6. To add a new label, click Add label, then enter the label's key and value.
  7. To change a label, update its value. The key of an existing label cannot be changed.
  8. Click Save.
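
The add-or-update behaviour can be sketched as a merge of two dictionaries with a validation step. The patterns below are a simplified, ASCII-only reading of Google Cloud's label rules and should be treated as an assumption; the function name is hypothetical.

```python
import re

# Simplified ASCII-only rules (assumption): keys start with a lowercase letter;
# keys and values use lowercase letters, digits, underscores, and hyphens,
# with at most 63 characters. Values may be empty.
KEY_RE = re.compile(r"[a-z][a-z0-9_-]{0,62}")
VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")


def apply_label_edits(current, updates):
    """Return a new label dict with `updates` merged into `current`."""
    merged = dict(current)
    for key, value in updates.items():
        if not KEY_RE.fullmatch(key):
            raise ValueError(f"invalid label key: {key!r}")
        if not VALUE_RE.fullmatch(value):
            raise ValueError(f"invalid label value: {value!r}")
        merged[key] = value  # adds a new label or changes an existing label's value
    return merged


labels = apply_label_edits({"env": "dev"}, {"env": "prod", "team": "sales"})
```

Because a key cannot be edited in place, renaming a label amounts to removing the old key and adding a new one.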

Remove a label from an instance

To remove a label from a Bigtable instance using the console:

  1. Open the console and go to the list of Bigtable instances.
  2. Open the instance list.
  3. Check the box next to each instance whose labels you want to remove.
  4. If the Labels panel is not already open, click Show info panel in the top right corner.
  5. Click the X next to each label that you want to remove.
  6. Click Save.

Frequently Asked Questions

What is the gcloud CLI?

A collection of tools for creating and managing Google Cloud resources is called the Google Cloud CLI. These tools enable you to automate a variety of typical platform operations via scripts, other automation methods, or the command line.

What are the four types of cloud storage?

The four types of cloud storage are public cloud storage, private cloud storage, hybrid cloud storage, and community cloud storage.

What is cloud bursting?

Cloud bursting is associated with hybrid clouds. The idea is that an application normally runs in a local computing environment or a private cloud and bursts into a public cloud when demand exceeds the local capacity.

What is IAM?

Identity and access management (IAM) is a centralised and consistent method to automate access controls, maintain user identities, and adhere to compliance standards in both traditional and containerized settings.

Conclusion

I hope this article gave you insights into the key concepts of Cloud Bigtable supported by Google.
