Introduction
Puppet is an open-source configuration management tool written in Ruby. It manages infrastructure on physical and virtual machines, helps you control complex infrastructure, and uses a declarative language to describe system configuration. Troubleshooting is a form of problem-solving used to fix broken components or operations on a machine or system. It requires a logical, systematic search for the source of the problem so that the product or process can be returned to operation.
Let's dive into the article to learn about troubleshooting in Puppet.
You can troubleshoot problems with your Puppet Enterprise (PE) installation by following the steps in this blog.
Log locations: You can use the log files produced by the software included with Puppet Enterprise (PE) for troubleshooting.
Troubleshooting installations: You can look for these issues when the installation doesn't work.
Troubleshooting disaster recovery: You can check for these issues if commands for disaster recovery fail.
Troubleshooting puppet infrastructure run commands: If the puppet infrastructure run commands fail, review the logs at /var/log/puppetlabs/installer/bolt_info.log and check for these issues.
Troubleshooting connections between components: You can check for communication, certificate, DNS, and NTP issues if agent nodes cannot get configurations.
Troubleshooting the databases: You can use these techniques to resolve issues with the console's supporting databases.
Troubleshooting SAML connections: Some problems and errors are common when connecting a SAML identity provider to PE. These include failed redirects, rejected communications, and failed group binding.
Troubleshooting backup and restore: You can check for these issues if a backup or restoration fails.
Troubleshooting Code Manager
Troubleshooting Windows: Troubleshoot issues with Windows PE installations, including failed upgrades, unsuccessful installations, manifest application errors, and other problems.
Log locations
You can use the log files generated by the software distributed with Puppet Enterprise (PE) for troubleshooting.
Primary server logs
Agent logs
Console and console services logs
Installer logs
Database logs
Orchestration logs
Troubleshooting installation
When the installation fails, check the following issues.
Misconfigured DNS
DNS stands for Domain Name System. It works like a phonebook for the internet, mapping domain names (like codingninjas.com) to IP addresses. It must be configured correctly for the installation to succeed.
Verify that agents can reach the primary server at the hostname you chose during installation.
Verify that the primary server can reach itself at that same hostname.
If the console and the primary server are on different servers, verify that they can communicate with each other.
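The checks above all start with name resolution. As a minimal sketch, assuming Python 3 is available on the node, the helper below reports whether a hostname resolves from the machine where it runs. The function name can_resolve is illustrative and not part of Puppet, and a successful lookup shows only that DNS works, not that the Puppet service itself is reachable.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver maps hostname to at least one address."""
    try:
        # getaddrinfo raises socket.gaierror when the name cannot be resolved.
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False
```

Run it on an agent node and on the primary server itself, passing the primary server hostname chosen during installation, for example can_resolve("<PRIMARY_HOSTNAME>").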
Misconfigured security settings
For a successful installation, security and firewall settings should be configured correctly.
Verify that inbound traffic is allowed on ports 8140 and 443 on the primary server.
If the primary server has several network interfaces, make sure it accepts traffic through the IP address that corresponds to its valid DNS name.
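One way to confirm that the ports accept connections is to attempt a TCP connection from another machine, such as an agent node. The following is a minimal sketch, assuming Python 3; port_open and the 3-second timeout are illustrative choices, not Puppet settings.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection raises OSError on refusal, timeout, or resolution failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, port_open("<PRIMARY_HOSTNAME>", 8140) and port_open("<PRIMARY_HOSTNAME>", 443) should both return True when the firewall allows inbound traffic and the services are listening.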
Troubleshooting disaster recovery
Disaster recovery means creating a copy (replica) of your primary server. If your disaster recovery commands fail, check for the following issues:
Latency over WAN
Latency is the time data takes to travel from one point on a network to another. The provision and enable commands can fail when the primary server and the replica communicate over a high-latency or lossy connection. If this happens, try re-running the command.
Replica connected to the compiler
The provision command generates an error if you try to provision a replica node that is connected to a compiler. The error looks like this:
Failure during provision command during the puppet agent run on replica 2:
Failed to generate additional resources using 'eval_generate': Error 500 on SERVER: Server Error: Not authorized to call search on /file_metadata/pe_modules with {:rest=>"pe_modules", :links=>"manage", :recurse=>true, :source_permissions=>"ignore", :checksum_type=>"md5"}
Source: /Stage[main]/Puppet_enterprise::Profile::Primary_master_replica/File[/opt/puppetlabs/server/share/installer/modules]
File: /opt/puppetlabs/puppet/modules/puppet_enterprise/manifests/profile/primary_master_replica.pp
Line: 64
To fix this, edit /etc/puppetlabs/puppet/puppet.conf on the replica you want to provision so that the server and server_list settings point to the primary server instead of a compiler.
Server and server_list are both set in the agent configuration file
After a replica is enabled, a warning can appear when both the server and server_list settings are present in the agent configuration file. You can ignore the warning, or hide it by removing the server setting so that only server_list remains in the agent configuration.
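To spot this situation quickly, you can scan the agent configuration for both settings. The following is a minimal sketch, assuming Python 3 and a puppet.conf made of simple key = value lines; the function names and the sample hostnames are hypothetical, and the parser is a simplification rather than Puppet's own configuration loader.

```python
def server_settings(conf_text: str) -> dict:
    """Collect the values of 'server' and 'server_list' from puppet.conf-style text."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and section headers such as [main] or [agent].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in ("server", "server_list"):
            settings[key] = value.strip()
    return settings


def both_set(conf_text: str) -> bool:
    """Return True when both server and server_list appear in the file."""
    found = server_settings(conf_text)
    return "server" in found and "server_list" in found


# Hypothetical file contents, for illustration only.
sample = """[main]
server = primary.example.com
server_list = primary.example.com:8140,replica.example.com:8140
"""
print(both_set(sample))  # prints: True
```

If both_set returns True for the contents of the agent's puppet.conf, removing the server line leaves only server_list, which hides the warning.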
Node groups are empty
When provisioning and enabling a replica, the orchestrator is used to run Puppet on different groups of nodes. If a group has no nodes, the orchestrator reports that there is nothing to do, and the job is marked as failed in the puppet job show output. This is expected and does not indicate a problem.
Troubleshooting puppet infrastructure run commands
When puppet infrastructure run commands fail, review the logs at /var/log/puppetlabs/installer/bolt_info.log, then check for the following issues. Keep in mind that these commands act as the root user on every node they touch.
Running commands as a non-root user
All puppet infrastructure run commands act as the root user on every node touched by the command. To run a puppet infrastructure run command as a non-root user, you must be able to SSH into the impacted nodes as that same non-root user; otherwise the command cannot succeed.
Passing hashes from the command line
When you pass a hash on the command line as part of a puppet infrastructure run command, wrap the hash in quotes, much like a JSON object.
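The quoting itself can be illustrated without any Puppet-specific options. The following is a minimal sketch, assuming Python 3 and a POSIX shell; quote_hash and the parameter names are placeholders invented for this illustration, and the sketch shows only how a JSON-style hash becomes a single quoted shell argument, not which parameters a given command accepts.

```python
import json
import shlex


def quote_hash(params: dict) -> str:
    """Serialize params as JSON and quote the result as one shell argument."""
    return shlex.quote(json.dumps(params))


# Placeholder parameter names, used only to show the quoting.
print(quote_hash({"parameter_one": "value_one", "parameter_two": "value_two"}))
# prints: '{"parameter_one": "value_one", "parameter_two": "value_two"}'
```

The single quotes keep the shell from splitting the hash on spaces or stripping the inner double quotes.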
Troubleshooting connections between components
If agent nodes cannot retrieve configurations, check for the following communication, certificate, DNS, and NTP issues.
Agents can’t reach the primary server
To retrieve configurations, agent nodes must be able to communicate with the primary server.
If an agent cannot reach the primary server, running telnet <PRIMARY_HOSTNAME> 8140 returns a Name or service not known error.
In that case, check the following:
Verify that the primary server is reachable at a DNS name that your agents recognize.
Verify that the pe-puppetserver service is running.
Agents don’t have signed certificates
The primary server must sign agent certificates. If the node's Puppet agent logs contain a warning about unverified peer certificates in the current SSL session, the agent may have submitted a certificate signing request (CSR) that has not been signed yet.
Run the command puppetserver ca list on the primary server to view a list of pending CSRs.
Run the command puppetserver ca sign --certname <NODE_NAME> to sign a node certificate.
Agents aren’t using the primary server’s valid DNS name
Agents trust the primary server only when they contact it at one of the valid hostnames specified when the primary server was installed. If running puppet agent --configprint server on the agent node does not return one of the valid DNS names you chose when installing the primary server, the agent node and the primary server cannot communicate.
To change the primary server hostname on an agent node, open /etc/puppetlabs/puppet/puppet.conf and set the server setting to a valid DNS name.
To reset the primary server's valid DNS names, log in as root and run the following command:
puppet infrastructure run regenerate_primary_certificate --dns_alt_names=<COMMA-SEPARATED_LIST_OF_DNS_NAMES>
Time is out of sync
The time and date must be in sync on agent nodes and the primary server. When time is out of sync, running the date command on the nodes returns inconsistent or incorrect dates. To bring time back in sync, set up NTP. Keep in mind that NTP can behave unreliably on virtual machines.
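Comparing the clocks by eye is error-prone, so it can help to compare epoch seconds instead. The following is a minimal sketch, assuming Python 3 and that you have captured the output of date +%s on the agent and on the primary server at roughly the same moment. The function names and the 60-second tolerance are arbitrary assumptions for illustration, not thresholds defined by Puppet.

```python
def skew_seconds(agent_epoch: str, server_epoch: str) -> int:
    """Return the absolute difference, in seconds, between two epoch timestamps."""
    # int() tolerates surrounding whitespace such as the trailing newline from date.
    return abs(int(agent_epoch) - int(server_epoch))


def in_sync(agent_epoch: str, server_epoch: str, tolerance: int = 60) -> bool:
    """Return True when the two clocks differ by at most tolerance seconds."""
    return skew_seconds(agent_epoch, server_epoch) <= tolerance
```

A large result from skew_seconds suggests that NTP is missing or misbehaving on one of the two machines.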
Node certificates have invalid dates
The time and date must be in sync when certificates are created. Certificates that were signed while time was out of sync carry invalid dates and must be replaced with new ones:
Run puppet agent --test on each impacted agent node to generate a new certificate request.
Run puppetserver ca sign --certname <NODE_NAME> on the primary server to sign each request.
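To see why out-of-sync clocks produce invalid dates, consider a certificate signed by a server whose clock runs ahead of the agent's. The following is a minimal sketch, assuming Python 3 and that the start and end of the certificate's validity window have already been read from the certificate by other means; the function name and all timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def cert_dates_valid(not_before: datetime, not_after: datetime, now: datetime) -> bool:
    """Return True when now falls inside the certificate's validity window."""
    return not_before <= now <= not_after


# Illustrative values: the signing clock runs 10 minutes ahead of the agent clock.
signed_at = datetime(2024, 1, 1, 12, 10, tzinfo=timezone.utc)
expires_at = signed_at + timedelta(days=365)
agent_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

print(cert_dates_valid(signed_at, expires_at, agent_now))  # prints: False
```

From the agent's point of view the certificate is not valid yet, which is why the clocks need to be synced before new certificates are generated.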
Node is reusing a certname
If a new node reuses an old node's certname and the primary server still retains the previous node's certificate, the new node cannot request a new certificate.
Run the following command on the primary server to clear the old node certificate:
puppetserver ca clean --certname <NODE_CERT_NAME>
Run puppet agent --test on each impacted agent node to generate a new certificate request.
Run puppetserver ca sign --certname <NODE_NAME> on the primary server to sign each request.
Agent can’t reach the filebucket server
If the primary server was installed with a certname that does not match its hostname, agents cannot back up files to the filebucket on the primary server.
Orchestrator can’t connect to the PE Bolt server
There are two ways to debug a faulty connection between the orchestrator and the PE Bolt server:
Set the bolt_server_loglevel parameter in the puppet_enterprise::profile::bolt_server class, then run Puppet.
In the file /etc/puppetlabs/bolt-server/conf.d/bolt-server.conf, manually modify the loglevel parameter.
Frequently Asked Questions
Why is Puppet used?
Puppet is an open-source software configuration management and deployment tool. It's most commonly used on Linux and Windows to pull the strings on multiple application servers simultaneously.
Is Puppet still used?
Big-name companies like Oracle, Google, and many others run their data servers using Puppet.
Is Puppet an automation tool?
Yes. Puppet lets you manage and automate the configuration of servers.
Is Puppet easy to use?
Puppet is not an easy tool to manage. Its configurations are written in its own language, the Puppet DSL (domain-specific language).
Conclusion
In this article, we discussed the basics of troubleshooting in Puppet. We covered log locations, installation issues, disaster recovery, puppet infrastructure run commands, and connections between components.
We hope this blog has improved your understanding of troubleshooting in Puppet. If you want to learn more, check out our articles on Ansible vs. Puppet, the best things about DevOps, DevOps tools, and reasons to build a career in DevOps.