Code360 powered by Coding Ninjas X Naukri.com
Last Updated: Mar 27, 2024

Concepts for Cloud Bigtable


Introduction

Before reading this page, you should be familiar with Bigtable's overview. You should also know about instances, clusters, and nodes.

Let us quickly revisit instances before moving ahead:

A Bigtable instance is a data container. 

Instances have one or more clusters in various zones. Each cluster has at least one node.

You should be aware of the following properties of an instance:

The type of storage (SSD or HDD)

When you create an instance, you must specify whether the instance's clusters will store data on solid-state drives (SSD) or hard disk drives (HDD). SSD is often, but not always, the most efficient and cost-effective option.

The application profiles, which are primarily for instances that use replication

Bigtable uses the instance you create to store application profiles, or app profiles. For instances that use replication, app profiles control how your applications connect to the instance's clusters.

Keeping this basic knowledge in mind, let us now learn how to choose a storage type and create an instance in Cloud Bigtable.

Choose between SSD and HDD storage

You can use the Logging query language in the Google Cloud console's Logs Explorer, the Logging API, or the command-line interface. You use the Logging query language to query data and to write filters that create sinks and log-based metrics.

A query is a Boolean expression that specifies a subset of all the log entries in the Google Cloud resource you have chosen, such as a Cloud project or folder.

You can build queries on the indexed LogEntry fields using the logical operators AND and OR. Using the resource.type field as an example, the Logging query language syntax looks like this:

  • Simple restriction: resource.type = "gae_app"
  • Conjunctive restriction: resource.type = "gae_app" AND severity = ERROR
  • Disjunctive restriction: resource.type = "gae_app" OR resource.type = "gce_instance"
  • Alternatively: resource.type = ("gae_app" OR "gce_instance")
  • Complex conjunctive/disjunctive expression: resource.type = "gae_app" AND (severity = ERROR OR "error")

Advantages of SSD storage

When in doubt, go for SSD storage.

There are various reasons why SSD storage is generally preferred for your Bigtable cluster:

  • SSDs are substantially quicker and provide more consistent performance than HDDs.
  • HDD throughput is much lower than SSD throughput.
  • Individual row reads on an HDD are very slow.
  • Unless you're storing massive volumes of data, HDD's cost savings are small compared to the cost of the nodes in your Bigtable cluster.

Disadvantage of SSD storage

One possible disadvantage of SSD storage is that it necessitates additional nodes in your clusters depending on the quantity of data stored. In practice, though, you may require those extra nodes not merely to handle the quantity of data that you're storing but also to keep up with incoming traffic.
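
To make the storage-driven node requirement concrete, here is a minimal sketch of the arithmetic. The per-node storage limits used below (5 TB for SSD, 16 TB for HDD) are assumptions for illustration and should be checked against the current Bigtable quotas; the function name is hypothetical.

```python
import math

# ASSUMPTION: per-node storage limits in TB. Verify these against the
# current Bigtable quotas before relying on the result.
ASSUMED_TB_PER_NODE = {"SSD": 5.0, "HDD": 16.0}


def min_nodes_for_storage(data_tb: float, storage_type: str) -> int:
    """Return the smallest node count whose combined storage limit covers data_tb.

    This only accounts for stored data. Incoming traffic may require
    more nodes than this estimate.
    """
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    limit = ASSUMED_TB_PER_NODE[storage_type.upper()]
    # A cluster always has at least one node.
    return max(1, math.ceil(data_tb / limit))


if __name__ == "__main__":
    for kind in ("SSD", "HDD"):
        print(kind, min_nodes_for_storage(40, kind))
```

Under these assumed limits, the same amount of data needs more SSD nodes than HDD nodes, which is the tradeoff described above.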

Use cases for HDD storage

HDD storage is appropriate for use cases that match the following requirements:

  • You anticipate storing at least 10 TB of data.
  • The data will not be used to power a user-facing or latency-sensitive application.
  • Your workload falls into one of the following categories:
    • Batch workloads with scans and writes, and only occasional random reads of a small number of rows or point reads.
    • Data archival, where you write very large amounts of data and rarely read it.

For example, suppose you intend to store a large amount of historical data for many remote-sensing devices and subsequently utilize the data to create daily reports. In that case, the cost savings from HDD storage may outweigh the performance tradeoff. However, if you want to utilize the data to present a real-time dashboard, using HDD storage is generally not a good idea—reads would be considerably more frequent in this instance, and reads that are not scans are significantly slower with HDD storage.
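
The criteria above can be sketched as a small decision helper. This is only an illustration of the checklist, not an official sizing tool; the `Workload` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    expected_data_tb: float   # how much data you expect to store
    latency_sensitive: bool   # backs a user-facing or latency-sensitive app
    kind: str                 # "batch", "archival", or anything else


def choose_storage(workload: Workload) -> str:
    """Return "HDD" only when every HDD criterion holds; otherwise "SSD"."""
    hdd_ok = (
        workload.expected_data_tb >= 10
        and not workload.latency_sensitive
        and workload.kind in {"batch", "archival"}
    )
    # "When in doubt, go for SSD storage."
    return "HDD" if hdd_ok else "SSD"


# Historical sensor data used for daily reports.
daily_reports = Workload(expected_data_tb=50, latency_sensitive=False, kind="batch")
# The same data behind a real-time dashboard.
dashboard = Workload(expected_data_tb=50, latency_sensitive=True, kind="serving")
```

Here `choose_storage(daily_reports)` selects HDD, while `choose_storage(dashboard)` falls back to SSD because the dashboard is latency-sensitive.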

Switching between SSD and HDD storage

When you create a Bigtable instance, you choose between SSD and HDD storage for the instance. You cannot change the storage type of an existing instance using the Google Cloud console.

If you wish to alter the storage type on which a table is saved, use the backups feature:

  1. Create or plan to use an instance that uses the desired storage type.
  2. Create a backup of the table.
  3. Restore to a new table in the other instance from the backup.
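
The backup-and-restore steps above can be sketched with the gcloud CLI. The helper below only builds the command lines so you can review them before running anything. The flag names reflect my understanding of `gcloud bigtable backups create` and `gcloud bigtable instances tables restore`, so confirm them with `--help`. Every ID shown is a hypothetical placeholder.

```python
from typing import List


def backup_command(backup_id: str, instance_id: str, cluster_id: str,
                   table_id: str, retention: str = "7d") -> List[str]:
    """Build the command that backs up a table in the source instance."""
    return [
        "gcloud", "bigtable", "backups", "create", backup_id,
        f"--instance={instance_id}",
        f"--cluster={cluster_id}",
        f"--table={table_id}",
        f"--retention-period={retention}",
    ]


def restore_command(backup_id: str, source_instance: str, source_cluster: str,
                    dest_table: str, dest_instance: str) -> List[str]:
    """Build the command that restores the backup into the other instance."""
    return [
        "gcloud", "bigtable", "instances", "tables", "restore",
        f"--source={backup_id}",
        f"--source-instance={source_instance}",
        f"--source-cluster={source_cluster}",
        f"--destination={dest_table}",
        f"--destination-instance={dest_instance}",
    ]


if __name__ == "__main__":
    # Hypothetical IDs: an HDD instance "hdd-inst" and an SSD instance "ssd-inst".
    print(" ".join(backup_command("tbl-backup", "hdd-inst", "hdd-c1", "events")))
    print(" ".join(restore_command("tbl-backup", "hdd-inst", "hdd-c1",
                                   "events", "ssd-inst")))
```

Once you have reviewed the output, you could pass each list to `subprocess.run(cmd, check=True)`.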

Create an instance

Before you start

Prepare your environment:

  • Select or create a Google Cloud project on the project selector page of the Google Cloud console.
  • Check that billing is enabled for your Cloud project.
  • Enable the Cloud Bigtable and Cloud Bigtable Admin APIs.
  • Install and initialize the Google Cloud CLI.
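
If you prefer the command line for the preparation steps, the sketch below lists the gcloud commands I would expect to use. The service names `bigtable.googleapis.com` and `bigtableadmin.googleapis.com` are assumptions to verify for your project, and `my-project` is a hypothetical project ID.

```python
import shutil
from typing import List


def gcloud_installed() -> bool:
    """Return True if a gcloud executable is on the PATH."""
    return shutil.which("gcloud") is not None


def setup_commands(project_id: str) -> List[List[str]]:
    """Build the commands that initialize the CLI and enable both APIs."""
    return [
        ["gcloud", "init"],
        ["gcloud", "config", "set", "project", project_id],
        ["gcloud", "services", "enable",
         "bigtable.googleapis.com", "bigtableadmin.googleapis.com"],
    ]


if __name__ == "__main__":
    if not gcloud_installed():
        print("Install the Google Cloud CLI first.")
    for cmd in setup_commands("my-project"):  # hypothetical project ID
        print(" ".join(cmd))
```

The script only prints the commands, so nothing changes in your project until you run them yourself.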


Plan your configuration:

  1. Optional: If you intend to enable replication, do the following steps:
    Take a few moments to go over the replication overview.
    Determine your replication use case.
    Based on your use case and the location of your application and traffic, choose the region or regions in which your instance should be located.
    Determine how you will route incoming requests using application profiles.
     
  2. Optional: If you wish to utilize customer-managed encryption keys (CMEK) instead of the default Google-managed encryption, complete the procedures under Creating a CMEK-enabled instance and have your CMEK key ID ready before creating your new instance.

Create an instance

  1. Navigate to the Create instance page in the console.
     
  2. Give the instance a name. To identify your instance, the console shows this name.
     
  3. Enter the instance ID.  The instance ID is the instance's permanent identification.
     
  4. Click the Continue button.
     
  5. Select whether your clusters will use SSD or HDD storage.
     
  6. Click Continue.
     
  7. For the first cluster, enter a cluster ID. The cluster ID is the cluster's permanent identifier.
     
  8. Select the region and zone in which the initial cluster will run.
     
  9. Select a cluster node scaling mode. 
    1. For Manual node allocation, enter the number of Bigtable nodes for the first cluster. If you're unsure how many nodes you'll need, use the default. You can add more nodes later.
    2. For Autoscaling, enter values for the following settings:
      • Minimum number of nodes
      • Maximum number of nodes
      • CPU utilization target
      • Storage utilization target
         
  10. (Optional) Complete the following steps to protect your instance with CMEK instead of the default Google-managed encryption:
    1. Select Show encryption options.
    2. Mark the checkbox next to Use a customer-managed encryption key (CMEK).
    3. Choose or enter the resource name of the CMEK key that the cluster will use. You cannot add this later.
    4. If you are asked to grant access to the CMEK key's service account, click Grant. To complete this task, your user account must be granted the Cloud KMS Admin role.
    5. Click the Save option.
       
  11. (Optional) Complete the following further steps to enable replication now:
    1. Select Show advanced options.
    2. Click Add cluster, specify the cluster's configuration, and then click Add. Repeat this step to add more clusters to the instance. You can also enable replication later by adding a cluster.
      Each zone in a region can contain only one cluster. If the Add cluster option is disabled, change the zone for your first cluster.
      To create an instance with more than six clusters, first create an instance with six clusters, then add more clusters to it.
       
  12. To create the instance, click Create.
     
  13. Examine the replication parameters in the default app profile to see whether they are appropriate for your replication use case. It is possible that you may need to adjust the default app profile or establish custom app profiles.  
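
The console steps above have a rough command-line equivalent. The sketch below builds a `gcloud bigtable instances create` command for either manual node allocation or autoscaling. The flag and key names (`--cluster-storage-type`, `--cluster-config`, `autoscaling-min-nodes`, and so on) reflect my understanding of the gcloud CLI, so confirm them with `gcloud bigtable instances create --help`. All IDs and zones are hypothetical.

```python
from typing import List, Optional


def cluster_config(cluster_id: str, zone: str, nodes: Optional[int] = None,
                   min_nodes: Optional[int] = None, max_nodes: Optional[int] = None,
                   cpu_target: Optional[int] = None,
                   storage_target: Optional[int] = None) -> str:
    """Build one --cluster-config value (manual if nodes is set, else autoscaling)."""
    parts = [f"id={cluster_id}", f"zone={zone}"]
    if nodes is not None:
        parts.append(f"nodes={nodes}")
    else:
        if min_nodes is None or max_nodes is None or cpu_target is None:
            raise ValueError("autoscaling needs min_nodes, max_nodes and cpu_target")
        if min_nodes > max_nodes:
            raise ValueError("min_nodes must not exceed max_nodes")
        parts += [f"autoscaling-min-nodes={min_nodes}",
                  f"autoscaling-max-nodes={max_nodes}",
                  f"autoscaling-cpu-target={cpu_target}"]
        if storage_target is not None:
            parts.append(f"autoscaling-storage-target={storage_target}")
    return ",".join(parts)


def create_instance_command(instance_id: str, display_name: str,
                            storage_type: str, configs: List[str]) -> List[str]:
    """Build the full command; pass more than one config to enable replication."""
    if storage_type.lower() not in {"ssd", "hdd"}:
        raise ValueError("storage_type must be 'ssd' or 'hdd'")
    cmd = ["gcloud", "bigtable", "instances", "create", instance_id,
           f"--display-name={display_name}",
           f"--cluster-storage-type={storage_type.lower()}"]
    cmd += [f"--cluster-config={c}" for c in configs]
    return cmd


if __name__ == "__main__":
    manual = cluster_config("demo-c1", "us-east1-b", nodes=1)
    scaled = cluster_config("demo-c2", "us-west1-a",
                            min_nodes=1, max_nodes=5, cpu_target=60)
    print(" ".join(create_instance_command("demo-inst", "Demo", "SSD",
                                           [manual, scaled])))
```

Passing two cluster configurations, as in the usage lines, corresponds to enabling replication at creation time. Remember that each cluster must be in a different zone.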

Frequently Asked Questions

What is a Bigtable instance?

A Bigtable instance is a data container. Instances have one or more clusters in various zones. Each cluster has at least one node. A table is associated with an instance, not a cluster or node. You are utilizing replication if you have an instance with several clusters.

What is the difference between Bigtable and BigQuery?

Bigtable is a wide-column NoSQL database designed for high-volume reads and writes. BigQuery, on the other hand, is a large-scale enterprise data warehouse for structured relational data.

How is replication handled in Bigtable?

Bigtable replication improves data availability and durability by copying data across multiple zones within a region or across multiple regions. Replication also helps isolate workloads: application profiles can route different types of requests to separate clusters.

Conclusion

In this article, we have extensively discussed instances in Cloud Bigtable. Our discussion mainly focused on storage types and scenarios for selecting the right type and on creating an instance.

We hope this blog has helped you enhance your Google Cloud knowledge. To learn more about Google Cloud concepts, refer to our article All about GCP Certifications: Google Cloud Platform | Coding Ninjas Blog.

Refer to our guided paths on the Coding Ninjas Studio platform to learn more about DSA, DBMS, Competitive Programming, Python, Java, JavaScript, etc. 

Refer to the links problems, top 100 SQL problems, resources, and mock tests to enhance your knowledge.

For placement preparations, visit interview experiences and interview bundle.

Do upvote our blog to help other ninjas grow. Happy Coding!
