Table of contents
1. Introduction
2. Troubleshooting
3. Log locations
4. Troubleshooting installation
  4.1. Misconfigured DNS
  4.2. Misconfigured security settings
5. Troubleshooting disaster recovery
  5.1. Latency over WAN
  5.2. Replica connected to the compiler
  5.3. Server and server_list are both set in the agent configuration file
  5.4. Node groups are empty
6. Troubleshooting puppet infrastructure run commands
  6.1. Running commands as a non-root user
  6.2. Passing hashes from the command line
7. Troubleshooting connections between components
  7.1. Agents can’t reach the primary server
  7.2. Agents don’t have signed certificates
  7.3. Agents aren’t using the primary server’s valid DNS name
  7.4. Time is out of sync
  7.5. Node certificates have invalid dates
  7.6. Node is reusing a certname
  7.7. Agent can’t reach the filebucket server
  7.8. Orchestrator can’t connect to the PE Bolt server
8. Frequently Asked Questions
  8.1. Why is Puppet used?
  8.2. Is Puppet still used?
  8.3. Is Puppet an automation tool?
  8.4. Is Puppet easy to use?
9. Conclusion
Last Updated: Mar 27, 2024

Basic Concept of Troubleshooting in Puppet

Author Aditi

Introduction

Puppet is an open-source configuration management tool written in Ruby. It is used to manage infrastructure on physical or virtual machines, helps you control complex infrastructure, and provides a declarative language for describing system configuration. Troubleshooting is a form of problem-solving used to fix broken components or operations on a machine or system. It requires a logical, systematic search for the source of the problem so that the product or process can be brought back into operation.

Let's dive into the article to learn about troubleshooting in Puppet.

Troubleshooting

Troubleshoot problems with your Puppet Enterprise (PE) installation by following the steps in this blog. The PE troubleshooting areas are:

  • Log locations: You can use the log files produced by the software included with Puppet Enterprise (PE) for troubleshooting. 
  • Troubleshooting installations: You can look for these issues when the installation doesn't work.
  • Troubleshooting disaster recovery: You can check for these issues if commands for disaster recovery fail.
  • Troubleshooting puppet infrastructure run commands: If the puppet infrastructure run commands fail, review the logs at /var/log/puppetlabs/installer/bolt_info.log and check for these issues.
  • Troubleshooting connections between components: You can check for communication, certificate, DNS, and NTP issues if agent nodes cannot get configurations.
  • Troubleshooting the databases: You can use these techniques to resolve issues with the console's supporting databases.
  • Troubleshooting SAML connections: Several common problems and errors can appear when you connect a SAML identity provider to PE. These include failed redirects, rejected communication requests, and unsuccessful group binding.
  • Troubleshooting backup and restore: You can check for these issues if a backup or restoration fails.
  • Troubleshooting Code Manager
  • Troubleshooting Windows: Troubleshoot issues with Windows PE installations, including failed upgrades, unsuccessful installations, manifest application errors, and other problems.

Log locations

You can use the log files generated by the software distributed with Puppet Enterprise (PE) for troubleshooting. The logs fall into these groups:

  • Primary server logs
  • Agent logs
  • Console and console services logs
  • Installer logs
  • Database logs
  • Orchestration logs

Troubleshooting installation

When the installation fails, check the following issues.

Misconfigured DNS

DNS stands for Domain Name System. It works like a phonebook for the internet: it maps domain names (like codingninjas.com) to IP addresses. DNS must be configured correctly for the installation to succeed.

  • Verify that the agents can reach the primary server at the hostname you chose during installation.
  • Verify that the primary server can reach itself at the primary server hostname.
  • If the primary server and the console component are on different servers, verify that they can communicate with each other.
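
The first two checks above can be sketched in shell. This is a minimal sketch, not part of PE: the resolves helper and the PRIMARY_HOSTNAME variable are hypothetical names, and it assumes the getent utility is available on the node. Run it on an agent and on the primary server itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the given hostname resolves on this node.
# Assumes getent is installed (it is on most Linux distributions).
resolves() {
  getent hosts "$1" > /dev/null
}

# PRIMARY_HOSTNAME is a placeholder; set it to your primary server's hostname.
PRIMARY_HOSTNAME="${PRIMARY_HOSTNAME:-localhost}"

if resolves "$PRIMARY_HOSTNAME"; then
  echo "OK: $PRIMARY_HOSTNAME resolves on this node"
else
  echo "FAIL: $PRIMARY_HOSTNAME does not resolve on this node"
fi
```

A name that resolves is only the first step; it does not prove that the Puppet port is reachable.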

Misconfigured security settings

For a successful installation, security and firewall settings should be configured correctly.

  • Verify that the firewall on the primary server allows inbound traffic on ports 8140 and 443.
  • If the primary server has several network interfaces, make sure it accepts traffic through the IP address that corresponds to its valid DNS name.
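
One way to test the two ports from another machine is sketched below. It is an illustration under assumptions: port_open and PRIMARY_HOSTNAME are hypothetical names, and the check relies on bash's /dev/tcp redirection and the coreutils timeout command rather than on any Puppet tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host $1 on port $2
# opens within 5 seconds. Uses bash's built-in /dev/tcp redirection.
port_open() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2> /dev/null
}

# PRIMARY_HOSTNAME is a placeholder; set it to your primary server's hostname.
PRIMARY_HOSTNAME="${PRIMARY_HOSTNAME:-localhost}"

for port in 8140 443; do
  if port_open "$PRIMARY_HOSTNAME" "$port"; then
    echo "port $port on $PRIMARY_HOSTNAME: reachable"
  else
    echo "port $port on $PRIMARY_HOSTNAME: NOT reachable"
  fi
done
```

A port reported as not reachable can mean a firewall rule, a stopped service, or a wrong hostname, so treat the result as a pointer rather than a diagnosis.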

Troubleshooting disaster recovery

Disaster recovery relies on a replica, which is a copy of your primary server. Check for the following issues when your disaster recovery commands fail:

Latency over WAN

Latency is the time data takes to travel from one point to another on a network. The provision and enable commands can fail when the replica and the primary server communicate over a slow, high-latency, or lossy connection. If this happens, try re-running the command.

Replica connected to the compiler

The provision command generates an error if you try to provision a replica node that is connected to a compiler. The error looks like this:

Failure during provision command during the puppet agent run on replica 2:
Failed to generate additional resources using 'eval_generate': Error 500 on SERVER: Server Error: Not authorized to call search on /file_metadata/pe_modules with {:rest=>"pe_modules", :links=>"manage", :recurse=>true, :source_permissions=>"ignore", :checksum_type=>"md5"}
Source: /Stage[main]/Puppet_enterprise::Profile::Primary_master_replica/File[/opt/puppetlabs/server/share/installer/modules]
File: /opt/puppetlabs/puppet/modules/puppet_enterprise/manifests/profile/primary_master_replica.pp
Line: 64

To fix this error, edit /etc/puppetlabs/puppet/puppet.conf on the node you want to provision as a replica so that the server and server_list settings point to the primary server instead of a compiler.

Server and server_list are both set in the agent configuration file

After a replica is enabled, a warning can appear when both the server and server_list settings are present in the agent configuration file. You can ignore the warning, or hide it by removing the server setting so that only server_list remains in the agent configuration.
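
To see whether a node is in this state, a quick check like the following may help. It is a rough sketch: has_both_settings is a hypothetical helper, it only looks for plain "name = value" lines, and it does not distinguish between sections of the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the file sets both server and server_list.
# The first pattern does not match server_list, because "server" must be
# followed directly by optional whitespace and "=".
has_both_settings() {
  local conf="$1"
  grep -Eq '^[[:space:]]*server[[:space:]]*=' "$conf" &&
    grep -Eq '^[[:space:]]*server_list[[:space:]]*=' "$conf"
}

CONF="/etc/puppetlabs/puppet/puppet.conf"
if [ -f "$CONF" ] && has_both_settings "$CONF"; then
  echo "Both server and server_list are set in $CONF"
fi
```

If the message is printed, removing the server line from the agent configuration leaves server_list as the only setting.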

Node groups are empty

When you provision and enable a replica, the orchestrator is used to run Puppet on different groups of nodes. If a node group is empty, the orchestrator reports that there is nothing to do, and the job is marked as failed in the puppet job show output. This is expected behavior and does not indicate a problem.

Troubleshooting puppet infrastructure run commands

When puppet infrastructure run commands fail, review the logs at /var/log/puppetlabs/installer/bolt_info.log, and then check for the following issues. Keep in mind that these commands require you to act as the root user on every node the command touches.

Running commands as a non-root user

All puppet infrastructure run commands require you to act as the root user on all nodes touched by the command. To run a puppet infrastructure run command as a non-root user, you must be able to SSH into the impacted nodes as that same non-root user for the command to succeed.

Passing hashes from the command line

When you pass a hash as part of a puppet infrastructure run command, the hash must be wrapped in quotes, much like a JSON object. For example:

'{"parameter_one": "value_1", "parameter_two": "value_2"}'
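
The reason for the quotes is ordinary shell word splitting, which the short sketch below illustrates. The count_args function is a hypothetical helper that reports how many arguments a command would receive; it is not part of Puppet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of arguments it was called with.
count_args() {
  echo "$#"
}

# Single-quoted: the whole hash reaches the command as one argument.
count_args '{"parameter_one": "value_1", "parameter_two": "value_2"}'

# Unquoted: the shell splits the hash at each space and strips the inner
# double quotes, so the command receives several broken fragments.
count_args {"parameter_one": "value_1", "parameter_two": "value_2"}
```

The first call prints 1. The second prints a larger number, which is why an unquoted hash is not parsed as intended.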

Troubleshooting connections between components

If agent nodes cannot retrieve their configurations, check for the following communication, DNS, certificate, and NTP issues.

Agents can’t reach the primary server

To retrieve configurations, agent nodes must be able to communicate with the primary server.

If the agent cannot reach the primary server, running telnet <PRIMARY_HOSTNAME> 8140 returns a "Name or service not known" error.

In such a case, follow these steps:

  1. Verify that the primary server is reachable at a DNS name that your agents recognize.
  2. Verify that the pe-puppetserver service is running.

Agents don’t have signed certificates

The primary server must sign each agent's certificate. If the node's Puppet agent logs contain a warning about unverified peer certificates in the current SSL session, the agent's CSR (certificate signing request) probably hasn't been signed yet.

  • Run the command puppetserver ca list on the primary server to see a list of pending CSRs.
  • Run the command puppetserver ca sign --certname <NODE_NAME> to sign a node certificate.

Agents aren’t using the primary server’s valid DNS name

Agents trust the primary server only if they contact it at one of the valid hostnames specified when the primary server was installed. Run puppet agent --configprint server on the agent node. If the result isn't one of the valid DNS names you chose when installing the primary server, the agent node and the primary server can't communicate.

  • To change the primary server hostname that an agent uses, open the file /etc/puppetlabs/puppet/puppet.conf on the agent and change the server setting to a valid DNS name.
  • To reset the primary server's valid DNS names, log in as root and run the following command:
puppet infrastructure run regenerate_primary_certificate --dns_alt_names=<COMMA-SEPARATED_LIST_OF_DNS_NAMES>

Time is out of sync

The time and date must be in sync on the agent nodes and the primary server. When time is out of sync, running the date command on the nodes returns incorrect or inconsistent dates. Set up NTP to bring time back into sync, but keep in mind that NTP can behave unreliably on virtual machines.
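
A simple way to quantify the drift is to compare epoch timestamps from the two machines. The sketch below is illustrative only: skew_seconds is a hypothetical helper, and fetching the remote time over SSH is an assumption about your environment, so that line is shown as a comment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: absolute difference, in seconds, between two epoch
# timestamps.
skew_seconds() {
  local diff=$(( $1 - $2 ))
  echo "${diff#-}"   # strip a leading minus sign, if any
}

local_now=$(date +%s)
# On a real system the second value would come from the other node, e.g.:
#   remote_now=$(ssh <AGENT_HOSTNAME> date +%s)   # assumes SSH access
remote_now=$local_now   # stand-in so the sketch runs on a single machine

echo "clock skew: $(skew_seconds "$local_now" "$remote_now") seconds"
```

A difference of a second or two is usually just the delay of the remote call; a difference of minutes or hours points to a real time sync problem.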

Node certificates have invalid dates

The time and date must be in sync when certificates are created. If certificates were signed while time was out of sync, the following command shows invalid dates:

openssl x509 -text -noout -in $(puppet config print --section master ssldir)/certs/<NODE_NAME>.pem
  • On the primary server, delete the certificates with invalid dates by running the following command:
puppetserver ca clean --certname <NODE_CERT_NAME>
  • On the nodes with invalid certificates, delete the SSL directory by running the following command:
rm -r $(puppet config print --section master ssldir)
  • Run puppet agent --test on each impacted agent node to generate a new certificate request.
  • Run puppetserver ca sign --certname <NODE_NAME> on the primary server to sign each request.
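
When the full -text output is more than you need, standard openssl options can print just the validity window. This is a minimal sketch with hypothetical helper names (cert_dates, cert_not_expired) and a hypothetical CERT_FILE variable. Note that -checkend only detects expiry, not a start date that lies in the future.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the notBefore and notAfter dates of a
# PEM certificate.
cert_dates() {
  openssl x509 -noout -dates -in "$1"
}

# Hypothetical helper: succeeds if the certificate has not expired yet.
# It does not detect a notBefore date that is still in the future.
cert_not_expired() {
  openssl x509 -noout -checkend 0 -in "$1" > /dev/null
}

# CERT_FILE is a placeholder; set it to the path of the .pem file to inspect.
if [ -n "${CERT_FILE:-}" ]; then
  cert_dates "$CERT_FILE"
  cert_not_expired "$CERT_FILE" || echo "certificate has expired"
fi
```

Compare the printed dates with the current time on both the node and the primary server to decide whether the certificate needs to be regenerated.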

Node is reusing a certname

If a new node reuses an old node's certname while the primary server still retains the previous node's certificate, the new node cannot request a new certificate.

  • On the primary server, clear the old node certificate by running the following command:
puppetserver ca clean --certname <NODE_CERT_NAME>
  • Run puppet agent --test on each impacted agent node to generate a new certificate request.
  • Run puppetserver ca sign --certname <NODE_NAME> on the primary server to sign each request.

Agent can’t reach the filebucket server

If the primary server was installed with a certname that doesn't match its hostname, agents cannot back up files to the filebucket on the primary server.

Orchestrator can’t connect to the PE Bolt server

There are two ways to debug a faulty connection between the orchestrator and the PE Bolt server:

  • Set the bolt_server_loglevel parameter in the puppet_enterprise::profile::bolt_server class, and then run Puppet.
  • In the file /etc/puppetlabs/bolt-server/conf.d/bolt-server.conf, manually modify the loglevel parameter.

Frequently Asked Questions

Why is Puppet used?

Puppet is an open-source software configuration management and deployment tool. It's most commonly used on Linux and Windows to pull the strings on multiple application servers simultaneously.

Is Puppet still used?

Big-name companies like Oracle, Google, and many others run their data servers using Puppet.

Is Puppet an automation tool?

Yes. You can use Puppet to manage and automate the configuration of servers.

Is Puppet easy to use?

Puppet is not the easiest tool to manage. Its configurations are written in its own language, known as the Puppet DSL (Domain Specific Language).

Conclusion

In this article, we discussed the basic concepts of troubleshooting in Puppet. We covered log locations and troubleshooting installation, disaster recovery, puppet infrastructure run commands, and connections between components.

We hope this blog has helped you improve your understanding of troubleshooting in Puppet. If you want to learn more, check out our articles on Ansible vs. Puppet, DevOps best things, DevOps tools, and reasons to build a career in DevOps.