
Anthos clusters on VMware

Author Shivani Singh

Introduction

Anthos clusters on VMware (GKE on-prem) is software that brings Google Kubernetes Engine (GKE) to on-premises data centers. With Anthos clusters on VMware, you can create, manage, and upgrade Kubernetes clusters in your on-premises environment.

With Connect, you can view and access your on-premises and cloud-based Kubernetes clusters from the same Google Cloud console interface.

Anthos clusters on VMware runs in a vSphere environment in your data center. vSphere is VMware's server virtualization platform, and Anthos clusters on VMware uses VMware vCenter Server to manage your clusters.

Architecture

An Anthos clusters on VMware installation includes an admin workstation, an admin cluster, and one or more user clusters. The nodes of these clusters are virtual machines (VMs) running in your vSphere environment, and an installation may be located in one or more vSphere clusters.

[Image: workflow showing the GCP architecture and the user]

Admin cluster

The admin cluster is the foundational layer of Anthos clusters on VMware. It runs the following Anthos clusters on VMware components:

  • Admin cluster control plane. The admin cluster's control plane includes the Kubernetes API server, the scheduler, and several controllers for the admin cluster.
  • Control planes for user clusters. For each user cluster, a node in the admin cluster runs that user cluster's control plane. The control plane consists of the Kubernetes API server, the scheduler, and several controllers for the user cluster.
  • Add-ons. The admin cluster runs several Kubernetes add-ons, including Grafana, Prometheus, and the Google Cloud operations suite. Anthos clusters on VMware runs the add-ons on admin cluster nodes separate from the other control plane components.

User clusters are where your containerized workloads and services are deployed and run.


vSphere requirements

The vSphere infrastructure in your data center hosts Anthos clusters on VMware (GKE on-prem).

Version compatibility, supported versions, and License requirements 

The vSphere requirements depend on the version of Anthos clusters on VMware you're using. vSphere is VMware's server virtualization platform; its components include ESXi and vCenter Server.

Anthos clusters on VMware supports these ESXi and vCenter Server versions:

  • 6.5 Update 3 or later
  • 6.7 Update 3 or later
  • 7.0 Update 1 or later

The versions for ESXi and vCenter Server must be 6.7 Update 3+ or 7.0 Update 1+ if you utilize the vSphere CSI driver.

The following vSphere licenses are required:

  • A vSphere Standard or vSphere Enterprise Plus license. We recommend the Enterprise Plus license because it enables the Distributed Resource Scheduler (DRS). In addition to this license, you must purchase a support subscription of at least one year.
  • A vCenter Server Standard license. In addition to this license, you must purchase a support subscription of at least one year.

Hardware requirements

An array of physical hosts running the VMware ESXi hypervisor serves as the foundation for Anthos clusters on VMware. By default, Anthos clusters on VMware uses Distributed Resource Scheduler (DRS) anti-affinity rules to spread the nodes of your admin cluster and user clusters across at least three physical hosts in your data center. To use this feature, your vSphere environment must meet the following requirements:

  • VMware DRS is enabled. VMware DRS requires the vSphere Enterprise Plus license edition.
  • Your vSphere user account has the Host.Inventory.Create new cluster privilege.
  • At least three physical hosts are available.

If DRS is not enabled, or if there are not at least three hosts where vSphere VMs can be scheduled, set antiAffinityGroups.enabled to false in the admin cluster and user cluster configuration files.
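For reference, the relevant setting in an admin or user cluster configuration file looks roughly like the following sketch; the exact field layout can vary between versions, so treat this as illustrative rather than exact:

antiAffinityGroups:
  enabled: false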

vCenter user account privileges

For the many users in your organization, including your Anthos cluster administrator and the users who develop on those clusters, you can establish custom roles in vCenter or use vCenter system roles.

The vCenter user account that you use to install Anthos clusters on VMware must have the necessary privileges. For instance, a user account with the Administrator role in vCenter has full access to all vCenter objects, which is sufficient for an Anthos cluster administrator. For other vCenter user accounts, you create custom roles that grant your cluster users the required privileges.

  • Determine the minimum set of privileges that your Anthos cluster users must have.
  • Using a user account with administrator access, create a custom vCenter role, grant it that minimum set of privileges, and then assign the custom role to an existing vCenter user account.

Resource requirements for admin workstation, admin cluster, and user clusters

When you first deploy Anthos clusters on VMware, the ESXi hosts in your data center must be able to meet the storage, CPU, and RAM requirements of the virtual machines that you will create. Your data center also needs enough virtual disk space to accommodate the PersistentVolumeClaims (PVCs) created by Prometheus and Google Cloud's operations suite.

These resources are needed for the initial installation of Anthos clusters on VMware:

  • 2280 GB of virtual disk space
  • 36 vCPUs
  • 98241 MB of RAM

Google Cloud project

Log in

The gkeadm command-line tool uses your SDK account property when it creates service accounts, so it is important to set the SDK account property before you run gkeadm to create an admin workstation.

Sign in with any Google Account. This sets your SDK account property:

gcloud auth login

Verify that your SDK account property is set correctly:

gcloud config list
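If the output shows the wrong account, you can switch it explicitly; ACCOUNT below is a placeholder for the email address of the Google Account you want gkeadm to use:

gcloud config set account ACCOUNT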

Grant roles to your SDK account

For gkeadm to create and manage service accounts for you, the Google Account set as your SDK account property needs these IAM roles:

  • resourcemanager.projectIamAdmin
  • serviceusage.serviceUsageAdmin
  • iam.serviceAccountCreator
  • iam.serviceAccountKeyAdmin

You need to have specific permissions on your Cloud project in order to grant roles.
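As a sketch of how such a role can be granted with the gcloud CLI, run a binding command per role; PROJECT_ID and ACCOUNT below are placeholders for your Cloud project ID and the SDK account email:

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:ACCOUNT" \
    --role="roles/resourcemanager.projectIamAdmin"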

Installing Google Cloud CLI

To set up the gcloud CLI and associated tools:

  1. Install the gcloud CLI.
  2. Get the most recent components and versions by running the gcloud CLI update command:
    gcloud components update
  3. Install the kubectl and anthos-auth components:
    gcloud components install kubectl
    gcloud components install anthos-auth

Upgrading Anthos clusters on VMware

You can upgrade directly to any version in the same minor release or in the next minor release. For example, you can upgrade from 1.8.0 to 1.8.3, or from 1.8.1 to 1.9.0.

If you are upgrading to a version beyond the next minor release, you must upgrade through one version of each minor release between your current version and your target version. For example, you cannot upgrade directly from version 1.7.2 to version 1.9.0; you must first upgrade from 1.7.2 to 1.8.x, and then upgrade to 1.9.0.

The admin workstation should be upgraded first, followed by the user clusters, and finally the admin cluster. If you want to keep the admin cluster running on its current version, you do not have to upgrade it right after upgrading the user clusters.

  • Download the gkeadm tool. The gkeadm version must match the target version of your upgrade.
  • Upgrade your admin workstation.
  • From your admin workstation, upgrade your user clusters.
  • From your admin workstation, upgrade your admin cluster (see the command sketch after this list).
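A rough sketch of the commands behind these steps is shown below; the flags and file names are illustrative and depend on your setup, so check the reference for your target version:

gkeadm upgrade admin-workstation --config admin-ws-config.yaml --info-file INFO_FILE
gkectl upgrade cluster --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config USER_CLUSTER_CONFIG
gkectl upgrade admin --kubeconfig ADMIN_CLUSTER_KUBECONFIG --config ADMIN_CLUSTER_CONFIG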

Locating your configuration and information files to prepare for an upgrade

Before you created your admin workstation, you filled in an admin workstation configuration file generated by gkeadm create config. The default name of this file is admin-ws-config.yaml. In addition, gkeadm created an information file for you. The default name of this file is the same as the name of your current admin workstation.

Locate the configuration file and the information file for your admin workstation; you need them to follow the steps in this guide. If the files are in your current directory and have their default names, you do not need to specify them when you run the upgrade commands.

If the files are in another location, or if you have changed their filenames, specify them with the --config and --info-file flags.

You can make a new copy of your output information file if it goes missing.

Install bundle for upgrade

To make a version available for cluster creation or upgrade, you must install the corresponding bundle. Follow these steps to install a bundle for TARGET_VERSION, the version you want to upgrade to.

Run the following command to check your current gkectl and cluster versions:

gkectl version --kubeconfig ADMIN_CLUSTER_KUBECONFIG --details

Look for the following problems and address them as necessary based on the output you receive.

  • If the gkectl version is lower than 1.7, the new upgrade flow is not available directly. Upgrade all of your clusters to 1.6 by following the original upgrade procedure, and then upgrade your admin workstation to 1.7 to start using the new upgrade flow.
  • If the current admin cluster version is more than one minor version lower than the TARGET_VERSION, upgrade all of your clusters so that they are one minor version lower than the TARGET_VERSION.
  • If the gkectl version is lower than the TARGET_VERSION, upgrade the admin workstation to the TARGET_VERSION.

Troubleshooting the upgrade process

If you run into problems while following the recommended upgrade process, use these suggestions to resolve them. These suggestions assume that you started with a version 1.6.2 setup and are working through the recommended upgrade procedure.

Troubleshooting a user cluster upgrade issue

Suppose you upgrade a user cluster, or test version 1.7 in a canary cluster, and discover a problem with 1.7. You learn from Google Support that an upcoming patch release 1.7.x will fix the problem. You can proceed as follows:

  • Continue to use 1.6.2 in production.
  • When the 1.7.x patch release is available, test it in a canary cluster.
  • When you are comfortable with it, upgrade all production user clusters to 1.7.x.
  • Then upgrade the admin cluster to 1.7.x.

Troubleshooting an admin cluster upgrade issue

If you run into problems while upgrading the admin cluster, you must contact Google Support to resolve them.

In the meantime, the new upgrade flow lets you keep benefiting from new user cluster features without being blocked by the admin cluster upgrade, and it lets you reduce how often you upgrade the admin cluster if you choose. For instance, you can use the Container-Optimized OS node pool introduced in version 1.7. Your upgrade procedure could look like this:

  • Upgrade user clusters in production to 1.7.
  • Maintain the admin cluster at version 1.6 and keep getting security updates;
  • Test the upgrade of the admin cluster from version 1.6 to version 1.7 in a test environment and report any concerns;
  • If a 1.7 patch release resolves your problem, you may decide to upgrade the production admin cluster from 1.6 to this 1.7 patch release.

Known issues

Upgrading the admin workstation might fail if the data disk is nearly full

When you upgrade to a new admin workstation, the system tries to back up the current admin workstation locally, so if you run the gkectl upgrade admin-workstation command and the data disk is almost full, the upgrade may fail. If you cannot free up enough space on the data disk, run the gkectl upgrade admin-workstation command with the additional flag --backup-to-local=false so that no local backup of the current admin workstation is created.

Version 1.7.0: Changes to Anthos Config Management updates

In versions earlier than 1.7.0, Anthos clusters on VMware included the images needed to install and upgrade Anthos Config Management. As of version 1.7.0, the Anthos Config Management software is no longer part of the Anthos clusters on VMware bundle and must be added separately. If Anthos Config Management was previously used on your cluster or clusters, the software is not upgraded unless you take action.

Stackdriver references the old version

In versions before 1.2.0-gke.6, a known issue prevents Stackdriver from updating its configuration after cluster upgrades. Because Stackdriver still references the old version, it does not receive the latest improvements to its telemetry pipeline. This can make cluster troubleshooting difficult for Google Support.

Nodes fail to complete their upgrade process

If you have PodDisruptionBudget objects configured that cannot allow any additional disruptions, repeated attempts to upgrade a node to the control plane version may fail. To avoid this failure, we recommend scaling up the Deployment or HorizontalPodAutoscaler so that the node can drain while still respecting the PodDisruptionBudget configuration.

To view all PodDisruptionBudget objects that do not allow any disruptions, run:

kubectl get poddisruptionbudget --all-namespaces -o jsonpath='{range .items[?(@.status.disruptionsAllowed==0)]}{.metadata.name}/{.metadata.namespace}{"\n"}{end}'
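To give a node room to drain, you can then scale up an affected Deployment so that its PodDisruptionBudget allows at least one disruption; a sketch (DEPLOYMENT_NAME, NAMESPACE, and the replica count are placeholders):

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG scale deployment DEPLOYMENT_NAME \
    --namespace NAMESPACE --replicas=3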

About VMware DRS rules enabled in version 1.1.0-gke.6

Starting in version 1.1.0-gke.6, Anthos clusters on VMware automatically creates VMware Distributed Resource Scheduler (DRS) anti-affinity rules for your user cluster's nodes, so that they are spread across at least three physical hosts in your data center.

Make sure the following requirements are met by your vSphere environment before upgrading:

  • VMware DRS is enabled. VMware DRS requires the vSphere Enterprise Plus license edition. To learn how to enable DRS, see Enabling VMware DRS in a cluster.
  • The vSphere username specified in your credentials configuration file has the Host.Inventory.EditCluster privilege.
  • At least three physical hosts are available.

If your vSphere environment does not satisfy these requirements, you can still upgrade, but to upgrade a user cluster from 1.3.x to 1.4.x you must disable anti-affinity groups.

Deploying an application

Creation of deployment

1. Copy the manifest into a file named my-deployment.yaml.
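The original manifest is not reproduced on this page; a minimal sketch of what my-deployment.yaml could contain (using a generic nginx image and three replicas to match the output shown below) is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-container
        image: nginx:1.21
        ports:
        - containerPort: 80

Then create the Deployment by applying the manifest: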

kubectl apply --kubeconfig [USER_CLUSTER_KUBECONFIG] -f my-deployment.yaml


2. Gather fundamental details about your deployment:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get deployment my-deployment


3. The output shows that the Deployment has three available Pods:

NAME            READY   UP-TO-DATE   AVAILABLE   AGE
my-deployment   3/3     3            3           27s


4. List the pods that are part of your deployment:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get pods
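The next section deletes a Service named my-service, whose creation is not shown in this excerpt. One way to create such a Service for this Deployment (a sketch, assuming a NodePort Service exposing port 80) is:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] expose deployment my-deployment \
    --name=my-service --type=NodePort --port=80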

Deleting your Service

1. Delete your Service:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] delete service my-service


2. Make sure your Service has been removed by checking:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get services


3. The output no longer shows my-service.

Deleting your Deployment

1. Delete the Deployment:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] delete deployment my-deployment


2. Confirm the deletion of your deployment:

kubectl --kubeconfig [USER_CLUSTER_KUBECONFIG] get deployments


3. The output no longer shows my-deployment.

Setting up your load balancer for Anthos clusters on VMware

Anthos clusters on VMware (GKE on-prem) clusters can run load balancers in one of three modes: integrated, bundled, or manual. In integrated mode, a cluster uses the F5 BIG-IP load balancer.

In bundled mode, Anthos clusters on VMware provides and manages the load balancer. The load balancer requires very little setup, and you do not need to obtain a license for it.

Advantages of bundled load balancing

Compared to manual load balancing, bundled load balancing offers the following benefits:

  • Both cluster setup and load balancer configuration can be handled by a single team. A cluster administration team, for instance, wouldn't need to rely on a different networking team to purchase, operate, and install the load balancer beforehand.
  • Virtual IP addresses (VIPs) on the load balancer are automatically configured by Anthos clusters on VMware. Anthos clusters on VMware set up the load balancer with VIPs for the Kubernetes API server, the ingress service, and the cluster add-ons during the cluster building process.
  • Dependencies between groups, administrators, and organizations are diminished. Particularly, the group in charge of a cluster is less reliant on the team in charge of the network.

Setting aside virtual IP addresses

Regardless of the load balancing mode, you must set aside several virtual IP addresses (VIPs) that you intend to use for load balancing. In integrated and bundled modes, you specify the VIPs in the Anthos clusters on VMware cluster configuration file, and the F5 BIG-IP or Seesaw load balancer is automatically configured to use them. In manual mode, you must configure your load balancer to use the VIPs yourself.

You must reserve this VIP for your administrative cluster:

  • Kubernetes API server VIP

You must reserve these VIPs for each user cluster you intend to create:

  • A VIP for the Kubernetes API server
  • A VIP for the ingress service

Setting aside nodePort values

On Anthos clusters on VMware clusters, the Kubernetes API server, the ingress service, and the add-on service are all implemented as Kubernetes Services of type NodePort. In integrated and bundled load balancing modes, Anthos clusters on VMware chooses the nodePort values for these Services automatically. In manual load balancing mode, you must specify the nodePort values to use for these Services, as in the sketch below.
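As a sketch of where these values live, the load balancer section of a user cluster configuration file looks roughly like the following; the field names and port numbers here are illustrative and can differ between versions, so check the configuration reference for your release:

loadBalancer:
  vips:
    controlPlaneVIP: "203.0.113.10"
    ingressVIP: "203.0.113.11"
  kind: ManualLB
  manualLB:
    ingressHTTPNodePort: 30243
    ingressHTTPSNodePort: 30879
    controlPlaneNodePort: 30562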

Configuring the load balancer

Anthos clusters on VMware automatically configure the load balancer with the VIPs that you specify in the cluster configuration file when using integrated or bundled load balancing mode. You must configure your load balancer using the VIPs you have selected in manual mode. Depending on the load balancer you're using, there are several configuration options.

Creating Services in your cluster

In integrated or bundled load balancing mode, you can create a Service of type LoadBalancer and specify a VIP for the Service. Anthos clusters on VMware automatically configures the VIP on the load balancer.
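For example, a Service manifest in this mode could look like the following sketch, where 203.0.113.20 stands in for a VIP you have set aside:

apiVersion: v1
kind: Service
metadata:
  name: my-lb-service
spec:
  type: LoadBalancer
  loadBalancerIP: 203.0.113.20
  selector:
    app: my-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080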

You cannot expose a Service of the type LoadBalancer to external clients when using manual load balancing mode. Instead, you can follow these instructions to make a Service accessible to outside clients:

  • Create a Service of type NodePort.
  • Choose a VIP for your Service.
  • Manually configure your load balancer to direct traffic sent to the VIP to your Service.

Storage

With the help of VMware vSphere storage, Kubernetes in-tree volume plugins (or "drivers"), and Container Storage Interface (CSI) drivers, Anthos clusters running on VMware can connect to external block or file storage systems.

Anthos clusters on VMware uses a default Kubernetes StorageClass to provision storage for stateful applications on a vSphere datastore. You can use other StorageClasses to provision different kinds of storage volumes.
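For instance, a PersistentVolumeClaim that relies on the cluster's default StorageClass looks like the following sketch; the name and size are placeholders:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # No storageClassName is set, so the cluster's default StorageClass is used.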

vSphere storage

Anthos clusters on VMware use vSphere storage by default. For its etcd data, the admin cluster needs a pre-provisioned vSphere datastore.

When you create a user cluster, Anthos clusters on VMware uses the vSphere Kubernetes volume plugin to dynamically provision new virtual machine disks (VMDKs) in a vSphere datastore. (Note that prior to version 1.2, admin and user clusters shared the same datastore.)

The vSphere datastores for the admin and user clusters can be backed by NFS, vSAN, or VMFS on a block device, such as an external storage array. In a multi-host environment, each block device must be attached to every host, and the datastore must be mounted on every host using the Mount Datastore on Additional Hosts option.

Container Storage Interface

An open standard API called Container Storage Interface (CSI) enables Kubernetes to make any storage system accessible to workloads running in containers. Workloads can connect directly to a compatible storage device without going through vSphere storage when CSI-compliant volume drivers are installed in Anthos clusters on VMware clusters.

To use CSI in your cluster, you must install the CSI driver provided by your storage vendor. You can then set the driver's StorageClass as the default StorageClass, or configure workloads to use it explicitly. We have partnered with numerous storage vendors to qualify their storage systems with Anthos clusters on VMware.
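As an illustration, a StorageClass for the vSphere CSI driver might look like the following sketch; the storagepolicyname parameter and its value are assumptions that depend on the storage policies defined in your vSphere environment:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: vsphere-csi-gold
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "gold-policy"   # assumed vSphere storage policy name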

Kubernetes in-tree volume plugins

Kubernetes ships with numerous in-tree volume plugins. You can use any of these to provide block or file storage for your stateful workloads. With in-tree plugins, workloads connect directly to storage instead of going through vSphere storage.

Unlike vSphere storage, which dynamically provisions volumes inside a datastore backed by any iSCSI, FC, or NFS storage device, many of the in-tree plugins do not support dynamic provisioning. To use them, you must create PersistentVolumes manually, as in the sketch below.
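For example, a manually created PersistentVolume that uses the in-tree NFS plugin looks like this sketch; the server address and export path are placeholders:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: 198.51.100.50
    path: /exports/data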

Running preflight checks

During installation, you run gkectl create-config to generate an Anthos clusters on VMware configuration file. The configuration file drives your installation: in it, you enter details about your vSphere environment, your network, your load balancer, and your desired cluster layout. You can create a configuration file before or after you set up an admin workstation.

You utilize the file to create your clusters in your on-premises environment once you've adjusted it to suit the requirements of your environment and clusters.

Before creating clusters, run gkectl check-config to validate the configuration file with various preflight checks. If the command returns any FAILURE messages, fix the problems and validate the file again.
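A typical invocation looks like the following sketch, where CONFIG_FILE is the configuration file you generated with gkectl create-config:

gkectl check-config --config CONFIG_FILE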

Traffic between the admin workstation and the test VMs

In its default mode, the preflight check creates test VMs for the cluster. Each test VM runs an HTTP server that listens on port 443 and on the node ports you provided in your configuration file.

The test VMs are assigned several IP addresses. If your configuration file specifies that your cluster nodes get their IP addresses from a DHCP server, the preflight check uses the DHCP server to assign IP addresses to the test VMs. If your configuration file specifies static IP addresses for your cluster nodes, the preflight check assigns the test VMs static IP addresses that you provided in your IP block files.

The preflight check running on the admin workstation sends HTTP requests to the test VMs at the various IP addresses assigned to them. The requests target port 443 and the node ports that you defined in your configuration file.

When should I run preflight checks?

It is good practice to run preflight checks early, before you attempt to create clusters. Preflight checks can help you verify that your network and vSphere environment are configured properly.

If you are running Anthos clusters on VMware version 1.2.0-gke.6, run gkectl check-config twice:

  1. Run gkectl check-config --fast.
  2. Run gkectl prepare.
  3. Run gkectl check-config again, this time without the --fast flag.

The reason check-config needs to run twice is that gkectl prepare uploads the VM template for the cluster node OS image to your vSphere environment, and that VM template must be in place before you can run the full set of validations.

For Anthos clusters on VMware versions 1.2.1 and later, the check-config command itself uploads the VM template, so you can perform all necessary validations before running gkectl prepare:

  1. Run gkectl check-config without the --fast flag.
  2. Run gkectl prepare.

Preflight check results

The following outcomes may be obtained via preflight checks:

SUCCESS

The field and its value were found to be valid.

FAILURE

The field, its value, or both are invalid. If a check returns a FAILURE message, fix the problems and validate the file again.

SKIPPED

The check was skipped, most likely because it does not apply to your configuration. For instance, checks for DNS and node IPs, which apply only to a static IP configuration, are skipped if you are using a DHCP server.

UNKNOWN

The check returned a non-zero exit code. Treat UNKNOWN results as failed checks. UNKNOWN typically means that the check could not run a system utility, such as nslookup or gcloud.

Remove static IP addresses from a cluster

When you create a cluster whose nodes have static IP addresses, you use an IP block file to specify a list of IP addresses. If you later realize that you supplied more IP addresses than necessary, you can remove some of them from the cluster.

Remove IP addresses from a user cluster

Step 1: For each associated user cluster, the admin cluster has an OnPremUserCluster custom resource. In the admin cluster, edit the OnPremUserCluster custom resource for your user cluster:

kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG edit onpremusercluster USER_CLUSTER_NAME \
    --namespace USER_CLUSTER_NAME-gke-onprem-mgmt


Step 2: Delete particular IPs from the ipBlocks section.


Step 3: Save and exit the editor.


Step 4: View all of the Machine objects in the default namespace in your user cluster: 

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get machines --output yaml


Step 5: Delete each Machine object that uses a removed IP address, as in the sketch below.
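For example (MACHINE_NAME is the name shown in the previous output for a Machine that uses one of the removed addresses):

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG delete machine MACHINE_NAME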


Step 6: View the cluster node addresses after a short while:

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG get nodes --output wide


Step 7: Check to make sure no removed IP addresses show up in the result.

myhost2   Ready ... 198.51.100.2
myhost3   Ready ... 198.51.100.3
myhost4   Ready ... 198.51.100.4

Remove IP addresses from an admin cluster

Before you remove IP addresses, make sure you will have enough left over. The admin cluster needs one IP address for its control-plane node, two addresses for add-on nodes, and one additional address for a temporary node used during upgrades. You also need one to three addresses for the control plane of each associated user cluster: the admin cluster needs three nodes for the control plane of each high-availability (HA) user cluster and one node for the control plane of each non-HA user cluster.

Step 1: Determine the IP address used by the admin cluster's control-plane node:

kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get nodes --output wide


Step 2: Edit the OnPremAdminCluster custom resource in the admin cluster:

kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG edit onpremadmincluster --namespace kube-system


Step 3: Delete particular IPs from the ipBlocks section. Make sure you don't change the IP address assigned to the admin cluster's control-plane node.


Step 4: Save and exit the editor.


Step 5: View all of the Machine objects in the default namespace in your admin cluster:

kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get machines --output yaml


Step 6: Delete each Machine object that uses a removed IP address.


Step 7: View the addresses of the cluster nodes:

kubectl --kubeconfig ADMIN_CLUSTER_KUBECONFIG get nodes --output wide


Step 8: Check to make sure no removed IP addresses show up in the result.

gke-admin-master-hdn4z  Ready  control-plane,master  198.51.100.101
gke-admin-node-abcd   Ready ... 198.51.100.103
gke-admin-node-efgh   Ready ... 198.51.100.104
my-user-cluster-ijkl  Ready ... 198.51.100.105
my-user-cluster-mnop  Ready ... 198.51.100.106
my-user-cluster-qrst  Ready ... 198.51.100.107
my-user-cluster-uvwx  Ready ... 198.51.100.108

Security 

Anthos clusters on VMware offers several security features to help protect your workloads, covering the contents of your container image, the container runtime, the cluster network, and access to the cluster API server.

A layered approach best protects your clusters and workloads. You can apply the principle of least privilege to the level of access you grant your users and workloads, and you might need to make trade-offs to achieve the right balance of flexibility and security.

Authentication and Authorization

Through the Cloud console, you can use OpenID Connect (OIDC) or a Kubernetes Service Account token to log into Anthos clusters on VMware clusters.

Use Kubernetes Role-Based Access Control (RBAC) to configure more granular access to Kubernetes resources at the cluster level or within Kubernetes namespaces. RBAC lets you define precisely which operations and resources users and service accounts can access, and lets you manage access for any validated identity.
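As a brief illustration of RBAC (a generic Kubernetes sketch, not specific to Anthos), the following Role and RoleBinding let a single user read Pods in one namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: my-namespace
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: my-namespace
  name: read-pods
subjects:
- kind: User
  name: "alice@example.com"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io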

To further streamline and simplify your authentication and authorization strategy for Kubernetes Engine, Anthos clusters on VMware disables legacy Attribute-Based Access Control (ABAC).

Control plane security

The control plane consists of the Kubernetes API server, the scheduler, the controllers, and the etcd database, where your Kubernetes configuration is stored. In Kubernetes Engine, Google manages and maintains the control plane components; in Anthos clusters on VMware, local administrators manage them.

In Anthos clusters on VMware, all communication takes place over TLS channels, which are governed by three certificate authorities (CAs): etcd, cluster, and org:

  • The etcd CA secures communication between etcd replicas and from the API server to the replicas. This CA is self-signed.
  • The cluster CA secures communication between the API server and internal Kubernetes API clients (kubelets, controllers, schedulers). This CA is self-signed.
  • The org CA is an external CA used to serve the Kubernetes API to external users. You manage this CA.

Node security

Anthos clusters on VMware deploys your workloads in VMware virtual machines, which are attached to your clusters as nodes. The following sections show how to use the node-level security features offered by Anthos clusters on VMware.

Node upgrades

You should upgrade your nodes regularly. Occasionally, security issues in the container runtime, Kubernetes, or the node operating system may require you to upgrade your nodes sooner. When you upgrade your cluster, each node's software is upgraded to the latest versions.

Audit logging

Kubernetes audit logging lets administrators retain, query, process, and receive alerts on events that occur in their Anthos clusters on VMware environments. Administrators can use the logged data for forensic analysis, real-time alerting, or tracking how a fleet of Kubernetes Engine clusters is being used and by whom.

Anthos clusters on VMware automatically log administrative activity. Depending on the kinds of processes you're looking to examine, you can additionally log data access events.

The Connect agent communicates only with the local on-premises API server, and each cluster should have its own set of audit logs. All actions that users perform through Connect from the UI are logged by that cluster.

Encryption

If your Anthos clusters on VMware clusters and workloads connect securely to Google Cloud services over Cloud VPN, use Cloud Key Management Service (Cloud KMS) for key management. Cloud KMS is a cloud-hosted key management service that lets you manage cryptographic keys for your services. Because Cloud KMS is integrated with Identity and Access Management (IAM) and Cloud Audit Logging, you can manage permissions on keys and monitor how they are used. Use Cloud KMS to protect Secrets and other sensitive data that you need to store. Alternatively, you can use one of the following options:

  • Kubernetes Secrets
  • HashiCorp Vault
  • Thales Luna network HSM
  • Google Cloud Hardware Security Module (HSM)

Kubernetes Secrets

Kubernetes Secrets resources in your clusters store sensitive information such as passwords, OAuth tokens, and SSH keys. Storing sensitive data in Secrets is safer than storing it in Pod specs or in unencrypted ConfigMaps. Using Secrets gives you control over how sensitive data is used and reduces the risk of exposing it to unauthorized users.
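For instance, you can create a Secret from literal values with kubectl (a sketch; the names and values are placeholders) and then mount it into a Pod or expose it as environment variables:

kubectl --kubeconfig USER_CLUSTER_KUBECONFIG create secret generic my-credentials \
    --from-literal=username=demo-user \
    --from-literal=password='S3cureP@ss'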

Logging and monitoring

For cluster logging and monitoring, Anthos clusters on VMware (GKE on-prem) offer a variety of alternatives, including cloud-based managed services, open source tools, and verified interoperability with third-party commercial solutions.

Cloud Logging and Cloud Monitoring

Google Cloud's built-in observability solution is the operations suite (formerly Stackdriver). It provides a fully managed logging solution as well as metrics collection, monitoring, dashboarding, and alerting. Cloud Monitoring monitors Anthos clusters on VMware clusters in much the same way as cloud-based GKE clusters.

Cloud Logging and Cloud Monitoring are the right choice for customers who want a single, easy-to-configure, powerful cloud-based observability solution. We strongly recommend them when you run workloads only on Anthos clusters on VMware, or on GKE together with Anthos clusters on VMware. For applications with components running on both Anthos clusters on VMware and traditional on-premises infrastructure, you might consider other solutions for an end-to-end view.

Prometheus and Grafana are two popular open source monitoring tools:

  • Prometheus collects application and system metrics.
  • Alertmanager sends out alerts using a variety of alerting mechanisms.
  • Grafana is a dashboarding tool.

You can enable Prometheus and Grafana on each admin cluster and user cluster. They are recommended for application teams that already have experience with those tools, and for operations teams that want to keep application metrics within the cluster and be able to troubleshoot when network connectivity is lost.

Working of logging and monitoring for Anthos clusters

When you create a new admin or user cluster, logging and monitoring agents are deployed and enabled in that cluster. You can configure the agents' scope of data collection, which focuses on system components.

Each cluster's logging and monitoring agents consist of:

  • GKE metrics agent (gke-metrics-agent). A DaemonSet that sends metrics to the Cloud Monitoring API.
  • Log forwarder (stackdriver-log-forwarder). A Fluent Bit DaemonSet that forwards each machine's logs to Cloud Logging. The log forwarder buffers log entries locally and retries for up to four hours. If the buffer fills up, or if the log forwarder cannot reach the Cloud Logging API for more than four hours, logs are dropped.
  • Global GKE metrics agent (gke-metrics-agent-global). A Deployment that sends metrics to the Cloud Monitoring API.
  • Metadata agent (stackdriver-metadata-agent). A Deployment that sends metadata for Kubernetes resources, such as Pods, Deployments, or nodes, to the Stackdriver Resource Metadata API.

Configuring logging and monitoring agents for Anthos clusters on VMware

Depending on your settings and setup, the agents installed with Anthos clusters on VMware gather information about system components for cluster maintenance and troubleshooting.

System components only (default scope)

By default, the agents collect logs and metrics for Google-provided system components, including performance data such as CPU and memory usage. This covers all workloads in the admin cluster and, in user clusters, workloads in the kube-system, gke-system, gke-connect, istio-system, and config-management-system namespaces.

Optimized metrics (default metrics)

By default, the metrics agents running in the cluster collect and report an optimized set of container and kubelet metrics to Google Cloud's operations suite (formerly Stackdriver). This optimized set requires fewer resources to collect, which improves overall performance and scalability. This matters most for container-level metrics, because of the large number of objects that need to be tracked.

Configuration requirements for logging and monitoring

To view Cloud Logging and Cloud Monitoring data, you must configure the Cloud project that holds the logs and metrics you want to examine. This Cloud project is called your logging-monitoring project.

  1. Make sure your logging-monitoring project has the following APIs enabled:
  • Stackdriver API
  • Cloud Monitoring API
  • Cloud Logging API
  • Config Monitoring for Ops API
  2. On your logging-monitoring project, grant the following IAM roles to your logging-monitoring service account (a sketch of the corresponding commands follows this list):
  • logging.logWriter
  • monitoring.metricWriter
  • stackdriver.resourceMetadata.writer
  • monitoring.dashboardEditor
  • opsconfigmonitoring.resourceMetadata.writer
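A sketch of the corresponding gcloud commands is shown below; PROJECT_ID and the service account email are placeholders, the API service names are the usual ones for these APIs (confirm them for your project), and you repeat the binding command for each role listed above:

gcloud services enable \
    stackdriver.googleapis.com \
    monitoring.googleapis.com \
    logging.googleapis.com \
    opsconfigmonitoring.googleapis.com \
    --project PROJECT_ID

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:LOGGING_MONITORING_SA_EMAIL" \
    --role="roles/logging.logWriter"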

Pricing

Anthos system logs and metrics are free to use.

In an Anthos clusters on VMware cluster, Anthos system logs and metrics include the following:

  • Logs and metrics from all components in an admin cluster
  • Logs and metrics from components in these namespaces in a user cluster: kube-system, gke-system, gke-connect, knative-serving, istio-system, monitoring-system, config-management-system, gatekeeper-system, cnrm-system

Working of Prometheus and Grafana for Anthos clusters

Prometheus and Grafana are disabled by default when an Anthos clusters on VMware cluster is created.

The Prometheus Server runs as two replicas on two different nodes in a highly available configuration. Resource requirements are tuned to support clusters with up to five nodes and up to 30 Pods serving configurable metrics on each node. Prometheus uses a dedicated persistent volume with preallocated disk space to store data for a four-day retention period, plus an additional safety buffer.

You can configure the dedicated monitoring stack for the admin control plane and for each user cluster independently. Each admin and user cluster comes with a monitoring stack that offers a full range of features: Prometheus Server for monitoring and Grafana for observability.

Multi-cluster monitoring

Prometheus and Grafana are installed on the admin cluster and are configured to offer visibility across the whole Anthos clusters on VMware installation, including the admin cluster and each user cluster. As a result, you can:

  • Use a Grafana dashboard to view metrics from the admin cluster and each user cluster.
  • View metrics from individual user clusters on Grafana dashboards; direct queries can access the full resolution of the metrics.
  • Access node-level and workload metrics for user clusters in compiled queries, dashboards, and alerts (workload metrics are limited to workloads running in the kube-system namespace).
  • Set up alerts for specific clusters.

Also see: Kubernetes interview questions

Frequently Asked Questions

Describe GCP storage.

GCP storage is Google's cloud storage solution. You can access your data at any time and from any location. It is highly secure, scalable, and durable, and you can use it to store personal data, application data, client data, and more.

What does Google Cloud Storage mean by a bucket?

Buckets are the basic containers that hold your data. You can use buckets to organize data and control access to it. A bucket's name is globally unique, and the bucket has a location where its contents are stored. A bucket also has a default storage class, which is applied to objects added to the bucket without a storage class specified. You can create and delete buckets without restriction.

What does "bucket" in Google Cloud Storage mean?

Buckets are the most basic kind of data storage. Using buckets, you may organize data and assign control access. Both the bucket's name and the spot where its contents are kept are instantly recognizable. When adding objects to the bucket without specifying a storage class, it also offers a default storage class that is used. The ability to add and remove buckets is unrestricted.

Conclusion

To sum up, in this blog we discussed Anthos clusters on VMware, its architecture, vSphere requirements, version compatibility, supported versions and license requirements, the Google Cloud project, installing the Google Cloud CLI, upgrading Anthos clusters on VMware, and deploying an application. We also covered setting up your load balancer for Anthos clusters on VMware, security, and, last but not least, logging and monitoring.

For more content, refer to our guided paths on Coding Ninjas Studio to upskill yourself.

Do upvote our blogs if you find them helpful and engaging!

Happy Learning!

