Table of contents
1. Introduction
2. Modify an instance
2.1. Before you Start
2.2. Configure autoscaling
2.2.1. Enable autoscaling
2.2.2. Disable autoscaling
2.2.3. Change autoscaling settings
2.3. Add or remove nodes manually
2.4. Add a cluster
2.5. Delete a cluster
2.6. Move data to a new location
2.7. Manage app profiles
2.8. Manage labels
2.9. Change an instance's display name
3. Delete an instance
4. Scaling clusters
4.1. Scaling options
4.2. Limitations
4.2.1. Node availability
4.2.2. Delay while nodes rebalance
4.2.3. Latency increases caused by scaling down too quickly
4.2.4. Schema design issues
4.3. How to scale Bigtable programmatically
4.3.1. Monitoring API metrics
4.3.2. Sample code
5. Autoscaling
5.1. Advantages of Autoscaling
5.2. How autoscaling works
6. Frequently Asked Questions
6.1. What is the difference between Cloud Datastore and Cloud Bigtable?
6.2. How is data stored in Bigtable?
6.3. What is Rowkey in Bigtable?
7. Conclusion
Last Updated: Mar 27, 2024

Advanced Features in Cloud Bigtable

Author Sanjana Yadav

Introduction

Cloud Bigtable is a sparsely populated table that can scale to billions of rows and thousands of columns, storing terabytes or even petabytes of data. Each row is indexed by a single value known as the row key. Bigtable is well suited to storing massive volumes of single-keyed data at low latency, and its high read and write throughput makes it an excellent data source for MapReduce operations.

To utilize Cloud Bigtable, you must first establish instances that include clusters to which your apps may connect. Nodes are computing units that handle your data and conduct maintenance duties in each cluster.

We have already learned about creating instances in the previous article.

Let us now learn some advanced features in Cloud Bigtable.  

Modify an instance

Before you Start

If you wish to utilize Bigtable's command-line tools, you must first install the Google Cloud CLI and the cbt CLI.

Configure autoscaling

Enable autoscaling

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Click Edit for the cluster you wish to update under Configure clusters.
  4. Choose Autoscaling. 
  5. Fill in the following values:
    • Minimum number of nodes
    • Maximum number of nodes
    • CPU utilization target
    • Storage utilization target
  6. Click the Save option.
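
If you prefer to configure autoscaling from code rather than the console, the Java admin client in the google-cloud-bigtable library exposes an equivalent call. The following is a minimal sketch only; the project, instance, and cluster IDs are placeholders, and the target values simply mirror the four fields listed above.

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;
import com.google.cloud.bigtable.admin.v2.models.ClusterAutoscalingConfig;

public class EnableAutoscaling {
  public static void main(String[] args) throws Exception {
    String projectId = "my-project";   // placeholder
    String instanceId = "my-instance"; // placeholder
    String clusterId = "my-cluster";   // placeholder

    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create(projectId)) {
      // Switch the cluster to autoscaling with the same four settings the
      // console asks for: node bounds, CPU target, and storage target.
      ClusterAutoscalingConfig config =
          ClusterAutoscalingConfig.of(instanceId, clusterId)
              .setMinNodes(1)
              .setMaxNodes(5)
              .setCpuUtilizationTargetPercent(60)
              .setStorageUtilizationGibPerNode(2560); // storage target in GiB per node
      adminClient.updateClusterAutoscalingConfig(config);
    }
  }
}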

Disable autoscaling

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Click Edit for the cluster you wish to update under Configure clusters.
  4. Select Manual node allocation.
  5. In the Quantity field, enter the number of cluster nodes.
    In general, each cluster in an instance should have the same number of nodes, though there are exceptions; see the Bigtable documentation on nodes and replication for details.
  6. Click the Save option.
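
The equivalent programmatic operation is a single call that switches the cluster back to a fixed node count. A minimal sketch with placeholder IDs, assuming the google-cloud-bigtable Java admin client:

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;

public class DisableAutoscaling {
  public static void main(String[] args) throws Exception {
    String projectId = "my-project";   // placeholder
    String instanceId = "my-instance"; // placeholder
    String clusterId = "my-cluster";   // placeholder

    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create(projectId)) {
      // Turn off autoscaling and pin the cluster at a static node count,
      // equivalent to choosing Manual node allocation in the console.
      int staticNodeCount = 3; // placeholder value
      adminClient.disableClusterAutoscaling(instanceId, clusterId, staticNodeCount);
    }
  }
}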

Change autoscaling settings

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Click Edit for the cluster you wish to update under Configure clusters.
  4. Enter new values for any of the following fields that you want to modify.
    • Minimum number of nodes
    • Maximum number of nodes
    • CPU utilization target
    • Storage utilization target
  5. Click the Save option.
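
Programmatically, changing the settings looks the same as enabling autoscaling: you submit the full configuration you want. A minimal sketch with placeholder IDs and illustrative values, assuming the google-cloud-bigtable Java admin client:

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;
import com.google.cloud.bigtable.admin.v2.models.ClusterAutoscalingConfig;

public class UpdateAutoscalingSettings {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // Raise the node ceiling and loosen the CPU target for a cluster
      // that already has autoscaling enabled.
      adminClient.updateClusterAutoscalingConfig(
          ClusterAutoscalingConfig.of("my-instance", "my-cluster") // placeholders
              .setMinNodes(2)
              .setMaxNodes(10)
              .setCpuUtilizationTargetPercent(70)
              .setStorageUtilizationGibPerNode(2560));
    }
  }
}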

Add or remove nodes manually

When the node scaling mode of a cluster is manual, you can add or delete nodes, and the number of nodes remains constant until you alter it again.

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Click Edit for the cluster you wish to update under Configure clusters.
  4. Select Manual node allocation.
  5. In the Quantity field, enter the number of cluster nodes.
    In general, each cluster in an instance should have the same number of nodes, though there are exceptions; see the Bigtable documentation on nodes and replication for details.
  6. Click the Save option.
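
The console steps above have a one-line programmatic counterpart. A minimal sketch with placeholder IDs, assuming the google-cloud-bigtable Java admin client:

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;

public class ResizeCluster {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // Manually set the cluster to a fixed number of nodes.
      // Only valid for clusters that use manual node allocation.
      adminClient.resizeCluster("my-instance", "my-cluster", 4); // placeholders
    }
  }
}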

Add a cluster

You can add clusters to an existing instance, in up to 8 regions where Bigtable is available. Each zone in a region can contain only one cluster. Your use case determines the best locations for new clusters.

If your instance is CMEK-protected, each new cluster must use the same CMEK key as the instance's existing clusters. Before adding a new cluster to a CMEK-protected instance, identify or create a CMEK key in the region where the new cluster will be located.

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Click Add cluster under Configure clusters. If this button is not enabled, the instance has reached its maximum number of clusters.
  4. Enter a cluster ID and choose a region and zone for the cluster.
  5. Enter the number of cluster nodes. In general, each cluster in an instance should have the same number of nodes, though there are exceptions; see the Bigtable documentation on nodes and replication.
  6. Select or enter a customer-managed encryption key if the instance is CMEK-protected. The cluster and the CMEK key must be in the same region.
  7. Select Add.
  8. Repeat for each subsequent cluster, then click Save. Bigtable establishes the cluster and begins replicating your data to it. As replication begins, CPU consumption may rise.
  9. Examine the replication parameters in the default app profile to see whether they are appropriate for your replication use case.
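
For reference, the same operation can be performed from code. The sketch below adds a cluster to an existing instance using the google-cloud-bigtable Java admin client; the IDs, zone, and node count are placeholders, and the storage type must match the instance's existing clusters.

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;
import com.google.cloud.bigtable.admin.v2.models.CreateClusterRequest;
import com.google.cloud.bigtable.admin.v2.models.StorageType;

public class AddCluster {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // Add a second cluster in another zone; replication to the new
      // cluster starts automatically once it is created.
      adminClient.createCluster(
          CreateClusterRequest.of("my-instance", "my-cluster-2") // placeholders
              .setZone("europe-west1-b")                          // placeholder zone
              .setServeNodes(3)
              .setStorageType(StorageType.SSD));
    }
  }
}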

Delete a cluster

If an instance has multiple clusters, you can remove all but one of them. When you delete all but one cluster, replication is automatically disabled.

Bigtable does not enable you to remove a cluster in the following cases:

  • Bigtable will not let you remove a cluster if one of your app profiles routes all traffic to it. Before deleting the cluster, you must first change or delete that app profile.
  • If you add additional clusters to an existing instance, you won't be able to remove clusters until the initial copy of data to the new clusters is finished.

Below are the steps to delete a cluster:

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Under Configure clusters, choose the cluster to be deleted and click Delete cluster.
  4. To cancel the deletion, click Undo, which remains available until you click Save. Otherwise, click Save.
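
Programmatically, deleting a cluster is a single admin-client call. A minimal sketch with placeholder IDs, assuming the google-cloud-bigtable Java admin client (remember that the last remaining cluster in an instance cannot be deleted):

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;

public class DeleteCluster {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // Permanently removes the cluster and its copy of the data.
      adminClient.deleteCluster("my-instance", "my-cluster-2"); // placeholders
    }
  }
}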

Move data to a new location

To relocate the data in a Bigtable instance to a new zone or region, add a new cluster in the desired location and then delete the cluster in the original location. Bigtable automatically replicates all data to the new cluster, and the cluster you plan to delete stays available until replication completes, so you don't have to worry about queries failing.

Manage app profiles

Application profiles, often known as app profiles, govern how your apps connect to a replication-enabled instance. Every instance that has several clusters has its default app profile. You may also construct various bespoke app profiles for each instance, with a separate app profile for each type of application you run.
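
As an illustration, the sketch below creates a custom app profile that routes all traffic to a single cluster, using the google-cloud-bigtable Java admin client; the profile ID, cluster ID, and description are placeholders made up for the example.

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;
import com.google.cloud.bigtable.admin.v2.models.AppProfile;
import com.google.cloud.bigtable.admin.v2.models.CreateAppProfileRequest;

public class CreateAppProfileExample {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // A single-cluster routing policy sends every request that uses this
      // profile to one specific cluster instead of the nearest one.
      adminClient.createAppProfile(
          CreateAppProfileRequest.of("my-instance", "batch-jobs-profile") // placeholders
              .setRoutingPolicy(AppProfile.SingleClusterRoutingPolicy.of("my-cluster"))
              .setDescription("Routes batch traffic to one cluster"));
    }
  }
}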

Manage labels

Labels are key-value pairs that may be used to organize similar instances and store instance metadata.

Change an instance's display name

To modify an instance's display name, which is used by the console to identify the instance:

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Choose the instance you wish to modify, then click Edit instance.
  3. Change the name of the instance, then click Save.
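
The display name can also be changed from code. A minimal sketch with a placeholder instance ID, assuming the google-cloud-bigtable Java admin client:

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;
import com.google.cloud.bigtable.admin.v2.models.UpdateInstanceRequest;

public class UpdateDisplayName {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // Only the human-readable display name changes; the instance ID is permanent.
      adminClient.updateInstance(
          UpdateInstanceRequest.of("my-instance") // placeholder
              .setDisplayName("Analytics Instance (prod)"));
    }
  }
}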

Delete an instance

When you use one of the Cloud Bigtable client libraries, you may remove an instance programmatically. To manually remove a Bigtable instance, follow these steps:

  1. In the Google Cloud console, open the list of Bigtable instances.
  2. Click Delete instance after selecting the instance to be deleted. A confirmation dialog is shown.

    (Image: delete-instance confirmation dialog, https://cloud.google.com/bigtable/img/delete-instance-confirm.png)
  3. Follow the prompts in the confirmation popup before clicking Delete. The instance is permanently removed.
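
For the programmatic route mentioned above, the Java client deletes an instance in one call. A minimal sketch with placeholder IDs, assuming the google-cloud-bigtable Java admin client; the deletion is permanent, so treat it with the same care as the console flow.

import com.google.cloud.bigtable.admin.v2.BigtableInstanceAdminClient;

public class DeleteInstance {
  public static void main(String[] args) throws Exception {
    try (BigtableInstanceAdminClient adminClient =
        BigtableInstanceAdminClient.create("my-project")) { // placeholder project
      // Permanently deletes the instance, all of its clusters, and all data.
      adminClient.deleteInstance("my-instance"); // placeholder
    }
  }
}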

Scaling clusters

Scaling a cluster is the process of adding or removing nodes in response to changes in the cluster's workload or data storage requirements. Scaling your Cloud Bigtable cluster based on metrics such as CPU utilization can be beneficial. For example, if your cluster is under severe strain and has high CPU utilization, you can add nodes until the CPU utilization decreases. You can also save money by removing nodes from the cluster when they are not needed.

Scaling options

A Bigtable cluster can be scaled in the following ways:

  • Autoscaling
  • Manual node allocation
  • Programmatic scaling

In most circumstances, you should utilize Bigtable's built-in autoscaling feature if you want automated scaling. When you enable this function, Bigtable monitors the cluster in real-time and automatically adjusts the number of nodes according to your preferences.

Limitations

Consider the following constraints before enabling autoscaling or configuring programmatic scaling for your Bigtable cluster.

Node availability

Node quotas apply regardless of whether a cluster uses manual node allocation or autoscaling.

Delay while nodes rebalance

It might take up to 20 minutes under load after adding nodes to a cluster before you see a meaningful increase in the cluster's performance. Consequently, if your workload includes short bursts of high activity, adding nodes to your cluster based on CPU load will not enhance performance since the short burst of activity will be finished by the time Bigtable rebalances your data.

To account for this delay, add nodes to your cluster before its load increases, either programmatically or through the Google Cloud console. This gives Bigtable time to redistribute your data across the new nodes before the load grows. For clusters that use manual node allocation, see Add or remove nodes manually to change the number of nodes.

Latency increases caused by scaling down too quickly

When scaling down the number of nodes in a cluster, try not to reduce the cluster size by more than 10% in a 10-minute period. Scaling down too rapidly might result in performance issues, such as increased latency if the cluster's surviving nodes become suddenly overloaded.

Schema design issues

If your table's schema design is flawed, adding nodes to your Bigtable cluster may not increase performance. For example, if you have many reads or writes to a single row in your table, they will all travel to the same node in your cluster; hence, more nodes will not enhance speed. In contrast, increasing nodes will often enhance speed if your table's reads and writes are uniformly spread across rows.
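
As a concrete illustration of the row-key point, the hypothetical keys below contrast a design that concentrates writes at one end of the key space with one that spreads them; all identifiers are made up for the example.

public class RowKeyDesignExample {
  public static void main(String[] args) {
    String sensorId = "sensor-0042";        // hypothetical identifier
    long timestampMillis = 1711497600000L;  // hypothetical event time

    // Hotspot-prone: a timestamp prefix pushes every new write to the same
    // end of the sorted key range, so one node absorbs most of the traffic.
    String hotspotKey = timestampMillis + "#" + sensorId;

    // Better spread: leading with a high-cardinality identifier distributes
    // writes across the key range, so added nodes can share the load.
    String distributedKey = sensorId + "#" + timestampMillis;

    System.out.println(hotspotKey);
    System.out.println(distributedKey);
  }
}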

How to scale Bigtable programmatically

In some situations, you may want your application to scale your Bigtable cluster itself. This section discusses how to scale a cluster programmatically and includes a code sample to get you started. It also covers certain limitations to be aware of before using programmatic scaling.

Bigtable's Cloud Monitoring API offers several metrics. You may monitor these metrics for your cluster programmatically, then use one of the Bigtable client libraries or the Google Cloud CLI to add or delete nodes according to the metrics. After resizing your cluster, you may track its performance via the console, a Cloud Monitoring custom dashboard, or programmatically.

Monitoring API metrics

The Monitoring API provides a number of metrics for monitoring the present condition of your cluster. The following are some of the most relevant metrics for programmatic scaling:

  • bigtable.googleapis.com/cluster/cpu_load: Cluster's CPU load.
  • bigtable.googleapis.com/cluster/node_count: Number of nodes in the cluster.
  • bigtable.googleapis.com/cluster/storage_utilization: Storage used as a fraction of total storage capacity.
  • bigtable.googleapis.com/server/latencies: Distribution of server request latencies for a table.

Sample code

You can use one of the sample scaling tools, available on GitHub, as a starting point for your own programmatic scaling tool.

The sample tools add nodes to a Bigtable cluster when the cluster's CPU load exceeds a certain threshold and remove nodes when the CPU load falls below a certain threshold. To run the sample tools, follow the instructions on GitHub for each tool.

To retrieve the cluster's CPU load, the sample tools use the following code:

// Query the Cloud Monitoring API for the cluster's CPU load metric over the
// last five minutes, then return the most recent data point.
Timestamp now = timeXMinutesAgo(0);
Timestamp fiveMinutesAgo = timeXMinutesAgo(5);
TimeInterval interval =
    TimeInterval.newBuilder().setStartTime(fiveMinutesAgo).setEndTime(now).build();
String filter = "metric.type=\"" + CPU_METRIC + "\"";
ListTimeSeriesPagedResponse response =
    metricServiceClient.listTimeSeries(projectName, filter, interval, TimeSeriesView.FULL);
return response.getPage().getValues().iterator().next().getPointsList().get(0);

The sample tools then use the Bigtable client library to scale the cluster based on the CPU load:

double latestValue = getLatestValue().getValue().getDoubleValue();
if (latestValue < CPU_PERCENT_TO_DOWNSCALE) {
  // CPU load is low: remove nodes, but never drop below the minimum size.
  int clusterSize = clusterUtility.getClusterNodeCount(clusterId, zoneId);
  if (clusterSize > MIN_NODE_COUNT) {
    clusterUtility.setClusterSize(clusterId, zoneId,
      Math.max(clusterSize - SIZE_CHANGE_STEP, MIN_NODE_COUNT));
  }
} else if (latestValue > CPU_PERCENT_TO_UPSCALE) {
  // CPU load is high: add nodes, but never exceed the maximum size.
  int clusterSize = clusterUtility.getClusterNodeCount(clusterId, zoneId);
  if (clusterSize <= MAX_NODE_COUNT) {
    clusterUtility.setClusterSize(clusterId, zoneId,
      Math.min(clusterSize + SIZE_CHANGE_STEP, MAX_NODE_COUNT));
  }
}

Autoscaling

With manual node allocation, the number of nodes in a cluster stays fixed until you change it. When autoscaling is enabled, Bigtable continually monitors the cluster and automatically adjusts the number of nodes. Autoscaling is available in all Bigtable regions and works on both HDD and SSD clusters.

Advantages of Autoscaling

The following are some of the advantages of autoscaling:

Costs

Because Bigtable reduces the number of nodes in your cluster whenever possible, autoscaling can help you save money and avoid over-provisioning.

Performance

Bigtable's autoscaling feature allows it to automatically add nodes to a cluster when a workload changes or data storage requirements rise. This contributes to the achievement of workload performance goals by ensuring that the cluster has enough nodes to fulfill the required CPU usage and storage needs.

Automation

Autoscaling simplifies management. You don't need to manually monitor and scale the cluster size or develop an application to do so because the Bigtable service does these activities for you.

How autoscaling works

Autoscaling is the process of automatically adjusting the size of a cluster by adding or removing nodes. When you enable autoscaling, Bigtable adjusts the size of your cluster for you: as the cluster's workload or storage requirements change, Bigtable scales up (adds nodes) or scales down (removes nodes).

Bigtable autoscaling calculates the number of nodes needed depending on the following parameters:

  • CPU utilization target
  • Storage utilization target
  • Minimum number of nodes
  • Maximum number of nodes

Each scaling dimension produces a recommended node count, and Bigtable uses the highest of them, within the configured minimum and maximum. For example, if your cluster needs 10 nodes to meet your storage utilization target but 12 nodes to meet your CPU utilization target, Bigtable scales the cluster to 12 nodes.
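
To make that selection rule concrete, here is a small sketch of the "take the highest recommendation, clamped to the configured bounds" idea. It is an illustration only, not Bigtable's actual implementation, and all names are made up.

public class AutoscalingIllustration {
  // Illustrative only: pick the larger of the CPU-based and storage-based
  // recommendations, then clamp the result to the configured node bounds.
  static int recommendedNodeCount(
      int nodesForCpuTarget, int nodesForStorageTarget, int minNodes, int maxNodes) {
    int recommended = Math.max(nodesForCpuTarget, nodesForStorageTarget);
    return Math.min(Math.max(recommended, minNodes), maxNodes);
  }

  public static void main(String[] args) {
    // The example from the text: 10 nodes satisfy the storage target and 12
    // satisfy the CPU target, so the cluster is scaled to 12 (bounds 3..30).
    System.out.println(recommendedNodeCount(12, 10, 3, 30)); // prints 12
  }
}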

Bigtable continually optimizes storage, rebalancing data across nodes as the node count changes, to ensure that traffic is spread evenly and no node is overwhelmed.

Bigtable automatically rebalances the nodes in your cluster after scaling it up for maximum performance. While scaling and rebalancing occur, all requests continue to reach the cluster. 

If a cluster has scaled up to its maximum number of nodes and the CPU utilization target is exceeded, requests may have high latency or fail. If a cluster has scaled up to its maximum number of nodes and the storage utilization limit is exceeded, write requests will fail.

To limit the impact on latency, nodes are removed at a slower rate when a cluster scales down than when it scales up.

Frequently Asked Questions

What is the difference between Cloud Datastore and Cloud Bigtable?

Bigtable is designed for large data volumes and analytics, whereas Datastore is designed to serve high-value transactional data to applications.

How is data stored in Bigtable?

Bigtable is a row-oriented database: all data for a single row is stored together, organized first by column family and then by column. Within a column, cell values are kept in reverse-timestamp order, so reading the most recent value is simple and fast, while reading the oldest value takes longer.

What is Rowkey in Bigtable?

The row key is the single indexed value that identifies each row in a Bigtable table. You can give each dataset a unique row key prefix so that Bigtable stores the related data in a contiguous range of rows, which you can then query by row key prefix.

Conclusion

In this article, we have extensively discussed instances in Cloud Bigtable. Our discussion mainly focused on advanced features of the Cloud Bigtable, such as modifying and deleting instances in Cloud Bigtable. We also learned about scaling clusters and Autoscaling.

We hope this blog has helped you enhance your Google Cloud knowledge. To learn more about Google Cloud concepts, refer to our articles on All about GCP Certifications: Google Cloud Platform | Coding Ninjas Blog.

Refer to our guided paths on the Coding Ninjas Studio platform to learn more about DSA, DBMS, Competitive Programming, Python, Java, JavaScript, etc. 

Refer to the links on problems, top 100 SQL problems, resources, and mock tests to enhance your knowledge.

For placement preparations, visit interview experiences and interview bundle.

Do upvote our blog to help other ninjas grow. Happy Coding!

