Puppet vs. Ansible: Game-Changing Differences in Modern Infrastructure Automation

After working extensively with both Puppet and Ansible, I’ve discovered some powerful features in the Puppet ecosystem that deserve more attention. In this post, I’ll share my experiences with Hiera and Puppet Bolt, highlighting how they compare to Ansible’s approach.


Hiera vs. Ansible’s host_vars and group_vars

One of the most significant differences between Puppet and Ansible is how they handle variable hierarchies. Ansible uses host_vars and group_vars directories, which provide a straightforward but somewhat rigid approach to configuration data. In contrast, Puppet’s Hiera offers a much more sophisticated system.

Why Hiera Shines

Hiera’s approach to configuration lookup is fundamentally different and more flexible:

Fact-based lookups: Hiera can use any fact about a system to determine which configuration files to load. This goes far beyond Ansible’s group-based approach, allowing for configurations based on operating system, hardware details, custom facts, or any combination thereof.
Hierarchical merging: By default, Hiera returns the value from the first hierarchy level that defines a key. When you ask for a merge behavior such as unique, hash, or deep, it instead combines the values found at every level, allowing for elegant inheritance patterns. Ansible, by default, replaces a variable wholesale with its highest-precedence definition rather than merging.
Lookup specificity: Without a merge behavior, when Hiera finds a key at one level of the hierarchy, it doesn’t override it with values from less specific levels. This “first match wins” approach ensures that the most specific configuration always takes precedence.

This flexibility makes Hiera an incredibly powerful tool for managing configurations across diverse environments. For instance, you can define a base configuration that applies to all systems, override specific settings for a particular OS, and then further customize for individual hosts—all without repetition.
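To make the difference between the default lookup and a merged lookup concrete, here is a toy model in Python. It is not Hiera’s implementation, just a sketch of the idea; the level names, keys, and values are invented for illustration, and the merge only handles nested hashes.

```python
"""Toy model of Hiera-style lookups; not Hiera's actual implementation."""

# Hierarchy levels, ordered from most specific to least specific.
# Level names, keys, and values are made up for illustration.
HIERARCHY = [
    ("nodes/web01.example.com", {"ntp::servers": ["10.0.0.1"], "app": {"port": 8443}}),
    ("os/RedHat", {"app": {"user": "apache"}, "pkg": "httpd"}),
    ("common", {"ntp::servers": ["pool.ntp.org"],
                "app": {"port": 80, "user": "www"}, "pkg": "apache2"}),
]


def lookup_first(key):
    """Default behaviour: return the value from the first level that has the key."""
    for _name, data in HIERARCHY:
        if key in data:
            return data[key]
    raise KeyError(key)


def deep_merge(high, low):
    """Recursively merge two dicts; values from `high` win on conflict."""
    merged = dict(low)
    for key, value in high.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(value, merged[key])
        else:
            merged[key] = value
    return merged


def lookup_deep(key):
    """Merge behaviour: combine the hashes from every level that has the key."""
    result = {}
    for _name, data in HIERARCHY:
        if key in data:
            # `result` holds the more specific levels seen so far, so it wins.
            result = deep_merge(result, data[key])
    return result


print(lookup_first("app"))  # {'port': 8443}
print(lookup_deep("app"))   # {'port': 8443, 'user': 'apache'}
```

The first call stops at the node level, so the `user` setting from the OS level is never seen. The merged call keeps the node’s port and still picks up the OS-level user, which is the inheritance pattern described above.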

Hiera in Action: A Real-World Example

Let’s look at an actual Hiera configuration to understand its power (example taken from the official Puppet documentation):

---
version: 5
defaults:  # Used for any hierarchy level that omits these keys.
  datadir: data         # This path is relative to hiera.yaml's directory.
  data_hash: yaml_data  # Use the built-in YAML backend.

hierarchy:
  - name: "Per-node data"                   # Human-readable name.
    path: "nodes/%{trusted.certname}.yaml"  # File path, relative to datadir.
                                   # ^^^ IMPORTANT: include the file extension!

  - name: "Per-datacenter business group data" # Uses custom facts.
    path: "location/%{facts.whereami}/%{facts.group}.yaml"

  - name: "Global business group data"
    path: "groups/%{facts.group}.yaml"

  - name: "Per-datacenter secret data (encrypted)"
    lookup_key: eyaml_lookup_key   # Uses non-default backend.
    path: "secrets/%{facts.whereami}.eyaml"
    options:
      pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
      pkcs7_public_key:  /etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem

  - name: "Per-OS defaults"
    path: "os/%{facts.os.family}.yaml"

  - name: "Common data"
    path: "common.yaml"

This configuration demonstrates several key strengths of Hiera:

Multiple layers of specificity: The hierarchy starts with node-specific data (nodes/%{trusted.certname}.yaml), then moves through various levels of abstraction (datacenter + business group, business group alone, datacenter-specific secrets), down to OS-specific settings, and finally to common data that applies to all systems.
Dynamic path generation: Notice how paths include variables like %{trusted.certname}, %{facts.whereami}, and %{facts.os.family}. These are interpolated with actual system facts at runtime, creating a dynamic configuration system that adapts to each node.
Mixed backend support: While most data comes from YAML files, the configuration demonstrates how you can use different backends (like the encrypted eyaml_lookup_key) for sensitive data, with backend-specific options.
Clear progression: When Hiera looks up a value, it starts at the top of this hierarchy and works its way down. Once it finds a value, it stops looking (unless you’re using a merge strategy). This means more specific configurations naturally override more general ones.
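As a rough illustration of the dynamic path generation, the following Python sketch expands the %{...} tokens from the hierarchy above into the list of files that would be consulted for one node. The fact values are invented, and the sketch simplifies things: it raises an error on an unknown fact, which is a choice made for this sketch rather than a description of Hiera.

```python
import re

# Paths copied from the hiera.yaml example above, most specific first.
HIERARCHY_PATHS = [
    "nodes/%{trusted.certname}.yaml",
    "location/%{facts.whereami}/%{facts.group}.yaml",
    "groups/%{facts.group}.yaml",
    "secrets/%{facts.whereami}.eyaml",
    "os/%{facts.os.family}.yaml",
    "common.yaml",
]

TOKEN = re.compile(r"%\{([^}]+)\}")


def resolve(dotted, scope):
    """Walk a dotted key such as 'facts.os.family' through nested dicts."""
    value = scope
    for part in dotted.split("."):
        value = value[part]  # KeyError on unknown facts (a simplification)
    return str(value)


def interpolate(path, scope):
    """Replace every %{...} token in `path` with its value from `scope`."""
    return TOKEN.sub(lambda match: resolve(match.group(1), scope), path)


def candidate_files(scope):
    """Return the data files to consult for a node, in lookup order."""
    return [interpolate(path, scope) for path in HIERARCHY_PATHS]


# Hypothetical node: the certname and fact values are made up.
scope = {
    "trusted": {"certname": "web01.example.com"},
    "facts": {"whereami": "berlin", "group": "ops", "os": {"family": "RedHat"}},
}

for path in candidate_files(scope):
    print(path)
```

For this made-up node the list starts with nodes/web01.example.com.yaml and ends with common.yaml, which is the top-to-bottom order the lookup walks.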

In Ansible, achieving this level of configuration flexibility would require complex variable precedence rules and possibly custom plugins. Hiera makes it declarative and straightforward.

Puppet Bolt: A Fresh Approach to Remote Execution

Puppet Bolt is another gem in the Puppet ecosystem that offers an interesting alternative to Ansible for ad-hoc tasks and orchestration.

Speed and Flexibility

Bolt takes a different approach to remote execution that brings both advantages and limitations:

Script-based execution: While Ansible relies heavily on Python modules, Bolt embraces a more universal approach by simply executing scripts on remote systems. This might seem less sophisticated at first, but it brings remarkable flexibility.
Language agnosticism: Bolt doesn’t care what language your script is written in. Bash, Python, PowerShell, Ruby—it’s all the same to Bolt. This removes the dependency on specific Python versions that can sometimes complicate Ansible deployments.
Simplicity in execution: Sometimes a simple bash script is the easiest solution to a problem. Bolt acknowledges this reality and makes it trivial to run whatever makes the most sense for a given task.
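To give a feel for the script-based model, here is a rough sketch of what a Bolt task written in Python might look like. It assumes the task receives its parameters as a JSON object on stdin and reports its result as JSON on stdout. The task itself (a disk-space check), the `path` parameter, and the function names are invented for illustration.

```python
"""Hypothetical Bolt task: report disk usage for a path."""
import io
import json
import shutil
import sys


def run(params):
    """Build a JSON-serializable result from the task parameters."""
    path = params.get("path", "/")
    usage = shutil.disk_usage(path)
    percent = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {"path": path, "free_bytes": usage.free, "percent_used": percent}


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Assumption: parameters arrive as a single JSON object on stdin.
    raw = stdin.read()
    params = json.loads(raw) if raw.strip() else {}
    # The result goes back to the caller as JSON on stdout.
    json.dump(run(params), stdout)
    stdout.write("\n")


# Demo: feed parameters through a stdin-like stream instead of the real stdin.
demo_out = io.StringIO()
main(io.StringIO('{"path": "/"}'), demo_out)
print(demo_out.getvalue())
```

In a real task file the demo at the bottom would be replaced by a plain `main()` call. The same contract could be met by a Bash or Ruby script, which is the point of the language-agnostic design.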

Limitations and Considerations

Despite its strengths, Bolt isn’t without challenges:

JSON string conversion: One frustration I’ve encountered is that converting JSON strings to objects isn’t natively supported. You either need to use modules or extend Bolt with custom functions. If you’re not interested in diving into Ruby or managing additional module dependencies, this can be a drawback.
Less granular modules: We do lose some of the fine-grained task-level module support that Ansible provides. This is the trade-off for Bolt’s flexibility.

Interactive Use and Documentation

Where Bolt really shines is in its usability:

Standalone operation: Using Bolt without the broader Puppet ecosystem works perfectly fine, making it accessible even if you’re not fully committed to Puppet.
Interactive experience: Bolt feels designed for more interactive use than Ansible, with features like the prompt::menu function that provides interactive dropdown options, allowing users to make selections directly in the terminal during plan execution.
Superior plan discovery: The bolt plan show command provides a clean, organized list of all available plans. When viewing a specific plan, Bolt displays detailed documentation that’s easily added through comments in the code.

Task targeting flexibility: Plans in Bolt are conceptually similar to Ansible playbooks, but with a key difference in execution model. In Bolt, each task within a plan can target a different set of hosts, giving you greater flexibility. This contrasts with Ansible, where a play defines a list of tasks that all run against the same list of hosts. This distinction makes Bolt plans more adaptable when you need different actions on different host groups within the same workflow.

Here’s a simple example showing how Bolt plans can target different sets of hosts for different tasks:

plan my_module::my_deployment(
  TargetSpec $web_servers,
  TargetSpec $db_servers,
  String $version
) {
  # First, update the database servers
  # Pass _catch_errors in the task's argument hash so a failure returns a result set instead of aborting the plan
  $db_results = run_task('database::update', $db_servers, { 'version' => $version, '_catch_errors' => true })

  # Check if database update was successful
  if $db_results.ok {
    # Run a simple command on the web servers to verify they're online
    $web_status = run_command('systemctl status nginx', $web_servers)

    # Deploy the new application version to web servers
    run_task('webapp::deploy', $web_servers, {
      'version' => $version,
      'db_endpoint' => get_db_endpoint($db_results)  # custom helper function, assumed to be defined elsewhere
    })

    # Send success notification
    run_task('notification::send', 'monitoring.example.com', {
      'message' => "Deployment of version ${version} completed successfully",
      'status' => 'success'
    })
  } else {
    # Send failure notification with details
    $error_msg = $db_results.error_set.first.error.message
    run_task('notification::send', 'monitoring.example.com', {
      'message' => "Deployment of version ${version} failed during database update: ${error_msg}",
      'status' => 'failure'
    })

    # Log details for investigation
    out::message("Database update failed: ${error_msg}")

    # Raise an exception that the plan failed
    fail_plan("Deployment failed at database stage")
  }
}

Notice how run_task() and run_command() take a target parameter ($db_servers, $web_servers, or even a specific hostname), allowing each operation to be directed at precisely the right group of hosts. The run_task($task_name, $targets, $args) call signature makes the targeting explicit and flexible.

Plan Functions and Built-in Capabilities

Bolt comes with a rich set of built-in plan functions that extend its capabilities. The Bolt Plan Functions documentation provides a comprehensive reference of these functions, including file manipulation, error handling, and interactive prompts like prompt::menu. These functions make Bolt plans powerful and flexible without requiring external dependencies.

Parameter Handling and Variable Scope

The way Bolt handles parameters and variables feels more like proper programming:

Function-like parameters: Plans have proper parameters, similar to functions in programming languages, with validation checks at the beginning of execution.
Cleaner than Ansible’s -e: Compared to the common practice of injecting variables with -e in Ansible, Bolt’s approach feels more correct and maintainable. Before the plan runs, Bolt checks that required parameters are supplied and that each value matches its declared type.
Sensible variable scoping: Ansible’s variable precedence can sometimes lead to confusion, with -e variables overriding set_fact values in ways that might be unexpected. Bolt follows traditional programming scopes, where variables have a clear context and cannot be reassigned within the same scope.

Inventory Handling

Like Ansible, Bolt supports dynamic inventories:

Language flexibility: Again embracing its language-agnostic approach, Bolt allows you to write dynamic inventories in any programming language. The only requirement is that they return properly formatted JSON.
Task-based inventory sources: Bolt makes it incredibly simple to use a task as an inventory source. As documented in the Bolt inventory reference, you can create a dynamic inventory by simply adding a _plugin: task key at the root level of your inventory file, and then specifying a task key with the name of a module::task to execute. This task returns the inventory data, making it easy to generate target lists from external sources like cloud providers, CMDBs, or custom databases.
Here’s a minimal example:
```
# inventory.yaml
---
_plugin: task
task: my_module::get_targets
```
This simple configuration tells Bolt to execute the my_module::get_targets task and use its output as the inventory for the current run.
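As a sketch of what such a task could look like, here is a small Python example that turns a list of host records into inventory data and prints it as JSON. Several things are assumptions on my part: the host records are hard-coded stand-ins for a cloud API or CMDB query, the grouping by role is just one possible layout, and wrapping the data in a `value` key reflects my understanding of how the task plugin reads results, so check the Bolt inventory reference before relying on it.

```python
"""Hypothetical body of a my_module::get_targets task."""
import json

# Stand-in for data fetched from a cloud provider, CMDB, or database.
HOSTS = [
    {"hostname": "web01.example.com", "role": "web"},
    {"hostname": "web02.example.com", "role": "web"},
    {"hostname": "db01.example.com", "role": "db"},
]


def build_inventory(hosts):
    """Group hosts by role into an inventory-style structure."""
    by_role = {}
    for host in hosts:
        by_role.setdefault(host["role"], []).append(host["hostname"])
    return {
        "groups": [
            {"name": role, "targets": sorted(names)}
            for role, names in sorted(by_role.items())
        ]
    }


def task_output(hosts):
    """Serialize the inventory as the task's JSON result."""
    # Assumption: the plugin reads the data from a "value" key in the result.
    return json.dumps({"value": build_inventory(hosts)}, indent=2)


print(task_output(HOSTS))
```

Swapping the hard-coded list for a real data source is the only part that would change between environments; the rest is plain JSON generation.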

Conclusion

While Ansible remains a popular and powerful tool, my experience with Puppet’s Hiera and Bolt has revealed significant advantages in certain scenarios. Hiera’s sophisticated data hierarchy provides exceptional flexibility for configuration management, while Bolt offers a refreshing approach to remote execution that balances simplicity with power.


If you’re deep in the Ansible ecosystem, it’s still valuable to understand what benefits other tools can bring. Staying open-minded about different approaches helps broaden your technical perspective, can lead to fresh ideas for solving problems, and ultimately makes you a more versatile infrastructure developer. Even if you don’t switch tools, exploring alternative design philosophies can inspire creative solutions and innovations in your own workflows and automation strategies.
