Puppet at Loggly

Jordan Sissel. hacker. Loggly, Inc.

Puppet at Loggly

Our puppet deployment is:


Infrastructure in MVC

A source of Truth feeds the Model and results in an applied configuration.


Style - Modules

I never use ‘import’. I always let puppet determine the path of a class.
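
A minimal sketch of what that looks like in practice, using a hypothetical ssh module (the layout in the comments is an assumption about where the module lives): Puppet's autoloader maps a class name to a file under the module path, so no import is needed.

```puppet
# Assumed layout:
#   <modulepath>/ssh/manifests/init.pp    defines class ssh
#   <modulepath>/ssh/manifests/server.pp  defines class ssh::server

# Include the class by name; the autoloader resolves
# ssh::server to modules/ssh/manifests/server.pp on its own.
include ssh::server
```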

Your puppet repository should look like this:


# Examples:

Style - File Paths

Puppet lets you abstract things away from the madness of your OS, so let it.


class ssh::server {
  file {
    "/etc/ssh/sshd_config":
      #source => "puppet:///modules/ssh/etc/ssh/sshd_config", # Bad!
      source => "puppet:///modules/ssh/sshd_config";          # Good!
  }
}

Style - Avoid Coupling (variables)

Style - Avoid Coupling (resources)

Style - Manifest Files


    # modules/foo/manifests/init.pp
    class foo {
      class bar {
        # Bad!
      }
    }

    # Good
    # modules/foo/manifests/init.pp
    class foo { ... }

    # modules/foo/manifests/bar.pp
    class foo::bar { ... }

Style - Custom Defines

Style - Testing


Two main kinds of truth:

  • Machine truth, “I am a frontend”
  • Sources of machine truth: extlookup (by hostname), RightScale API, EC2 userdata, etc.
  • Deployment truth, “The frontend role has package ‘loggly-frontend’ with version 1.2345”
  • Sources of deployment truth: extlookup (by environment/deployment), RightScale API, etc.
  • Possible synonyms for ‘deployment’: cluster, environment, site.
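
As a rough sketch of how extlookup can serve both kinds of truth (the data directory, the file names, and the `$deployment` variable are assumptions, not taken from Loggly's setup): a host-specific CSV file is searched first, then a per-deployment file, then shared defaults.

```puppet
# manifests/site.pp (sketch)
# Where extlookup finds its CSV files; this directory is an assumption.
$extlookup_datadir = "/etc/puppet/extdata"

# Search order: host-specific file, then the deployment, then defaults.
# $deployment is assumed to be set from your source of truth.
$extlookup_precedence = ["%{fqdn}", "deployment-%{deployment}", "common"]

# A file such as /etc/puppet/extdata/deployment-production.csv
# would then hold one "key,value" line per setting, for example:
#   package/loggly-frontend,1.2345

# Deployment truth: the package version this deployment should run.
$frontend_version = extlookup("package/loggly-frontend")
```

Putting the hostname first in the precedence list lets a single machine override a deployment-wide value without touching any manifest.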

Machine Truth on RightScale

  • Tag each machine with roles: frontend, solr, zookeeper, monitor, etc.
  • Use API to query all machines and tags in a deployment.
  • Also tag with any other machine-specific data (zookeeper id, etc).

Tag(s): role:solr=true

Deployment Truth

Deployment Truth (example)

Truth - Role-based deployment

  • Every application component gets a role.
  • Even tiny stuff like standalone cron jobs.
  • Roles: frontend, mysql-master, db-backup, s3-cleaner, monitor, etc.

Truth - Defining a Role

class loggly::frontend {
  # include any other required classes

  package {
    "loggly-frontend": ensure => extlookup("package/loggly-frontend");
  }

  iptables::rule {
    "allow http": ports => 80;
    "allow https": ports => 443;
  }

  apache::config {
      source => template("loggly/frontend/loggly-frontend.httpd.conf.erb");
  }
}

Truth - Defining a Role

Result of previous slide’s config:

Truth - Defining a Role

All features of a role should be defined in that class.

Nodeless Puppet

Why Nodeless?


Nodeless Puppet (site.pp)

A nodeless site.pp is practically empty.

# manifests/site.pp
node default {
  include truth::enforcer
}

Nodeless Puppet (truth::enforcer)

# modules/truth/manifests/enforcer.pp
class truth::enforcer {
  # For each role, include 1 class that is that role.
  if has_role("statsserver") {
    include loggly::statsserver
  } else {
    # Ensure this server has no 'statsserver' if we are not a statsserver.
    include loggly::statsserver::remove
  }
}
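
For illustration only, the removal class that the enforcer includes could be as small as the sketch below. Its contents are hypothetical: the package name "loggly-statsserver" is an assumption modeled on "loggly-frontend", and a real removal class would likely clean up more than one package.

```puppet
# modules/loggly/manifests/statsserver/remove.pp (hypothetical contents)
class loggly::statsserver::remove {
  # Package name is an assumption, mirroring "loggly-frontend".
  package {
    "loggly-statsserver": ensure => absent;
  }
}
```

Pairing every role with a removal class means a machine that loses a role is actively cleaned up on its next run instead of keeping stale software around.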


Custom Defined Resources

Custom defines let you create your own resource types that wrap other resources.

# Simplified version:
define supervisor::program($command, $user, $notifycmd="", $directory="/",
                           $ensure="present") {
  include supervisor

  file {
      ensure => $file_ensure,
      content => template("supervisor/program.erb"),
      notify => Exec["poke supervisord"];
  }
}


Example Custom Defines - supervisor

supervisor::program {
    command => "/opt/loggly/solrserver/solrserver.sh",
    user => "root", # for ulimit, will drop privs before launching.
    subscribe => Class["loggly::config", "loggly::solr"],
    require => [User["loggly-solrserver"],
                Class["loggly::common", "zeromq::java"]];
}

Example Custom Defines - iptables

iptables::rule {
  "zookeeper client":
    roles => ["solr", "zookeeper"],
    ports => 2181;
  "zookeeper ensemble":
    roles => ["zookeeper"],
    ports => [2888, 3888];
}

Example Custom Defines - nagios

# The '@@' notation means 'export this resource'
# More on exported resources later.
@@nagios::host {
    address => $ipaddress_eth0,
    tag => "deployment::$deployment";
}

@@nagios::check {
  "disk space on $fqdn":
    command => "check-disk-space",
    remote => true,
    host => $fqdn,
    contacts => "pagerduty",
    tag => "deployment::$deployment";
}

Exported Resources

Biggest win:

Treat your infrastructure as a collective (or multiple collectives) rather than as individual, standalone hosts.
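
A minimal sketch of the collective idea, using Puppet's built-in sshkey type rather than anything Loggly-specific (the class name is hypothetical, and storeconfigs must be enabled): every host exports its own SSH host key, and every host collects the keys exported by all the others.

```puppet
# Hypothetical class; requires storeconfigs to be enabled.
class ssh::hostkeys {
  # Export this host's public key so other hosts can collect it.
  @@sshkey {
    $fqdn:
      type => "rsa",
      key  => $sshrsakey;
  }

  # Collect every exported sshkey, including this host's own.
  Sshkey <<| |>>
}
```

No host needs a hand-maintained list of its peers: adding a machine to the collective is enough for everyone else to learn its key on their next run.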

Exported Resources: Nagios

Exporting a check (a custom define):

@@nagios::check {
  "httpinput end-to-end test from $fqdn":
    command => "endtoend-httpinput",   # Command to run
    remote => true,                    # Is an NRPE check
    host => $fqdn,                     # Host to target
    contacts => "pagerduty",           # Contact
    tag => "deployment::$deployment";  # Workaround for bug#5239
}

nagios::command {
    command => "/usr/local/bin/endtoend.py http",
    remote => true; # register this command with NRPE
}

Collecting Exported Checks

Collect all checks exported in our deployment (the <<| ... |>> syntax):

# Query by tag to work around puppet bug #5239
Nagios::Check <<| tag == "deployment::$deployment" |>> {
  notify => Class["nagios::server"]
}


% ls /etc/nagios3/checks.d 
check-frontend1.example.com-disk space on frontend1.example.com.cfg
check-frontend2.example.com-disk space on frontend2.example.com.cfg
check-ops.example.com-disk space on ops.example.com.cfg
check-proxy1.example.com-disk space on proxy1.example.com.cfg
check-proxy2.example.com-disk space on proxy2.example.com.cfg

Code to be released soon.

Exported Resources (caveats)

What if you shut down an EC2 instance and it no longer needs monitoring?

Exported Resources (caveats)

Scaling problems:

Masterless Puppet


Masterless and Storeconfigs

  • I still wanted storeconfigs (for exported resources).
  • No obvious place to run the database.
  • Amazon RDS to the rescue! (Launches a managed MySQL instance for me. Win!)

Masterless Benefits

Masterless Caveats

In general: You have to solve problems already solved with the master:

Masterless: Lifecycle of a Puppet Run

We have a script run via cron doing:

  • apt-get install loggly-puppet
    • This downloads our latest puppet manifests and modules.
    • Needs to be outside of puppet so we can repair ‘broken puppet’ problems like syntax errors or other catalog problems.
  • puppet --environment prerun manifests/prerun.pp
    • Fetches latest truth (Query RightScale for machines/tags/properties)
    • Does basic sanity checking and bootstrapping fixes for the real puppet run.
    • Upgrades puppet, makes storeconfigs work, etc.
  • puppet --storeconfigs manifests/site.pp
    • This is the main puppet run.
    • The ‘environment’ is set in puppet.conf by puppet itself, based on truth.

The End

Find me later: