Light At The End Of The Tunnel

systems administration meanderings

Integrating icinga/nagios with puppet

We are trialling Icinga at work because the Nagios interface is a little behind the times, but the instructions here apply to either. We run a cluster and needed a way to monitor the nodes within it. We use Cobbler to boot each node and install an OS, and puppet to ensure that every node is configured in the same manner. Getting a node added to puppet automatically has been covered elsewhere. The issue we faced was that when those nodes booted they did not add themselves to the monitoring solution, Icinga.

Adding a node to Icinga involves the following steps:

  1. Defining the node (IP address, name, etc.)
  2. Adding some services to that node
  3. Restarting [Icinga](http://www.icinga.org/) to pick up the new information

As the nodes are configured by puppet, it makes sense to push both the node definition and its services to Icinga from puppet. Fortunately, puppet integration with Nagios is well defined, and comes with resource types such as nagios_host and nagios_service that make life easier.
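The mechanism behind this is puppet's exported resources: each node exports its own definition and the monitoring server collects them. A minimal sketch of the idea, assuming stored configurations are enabled and that a `Service['icinga']` resource is declared on the server (on recent puppet releases the Nagios types live in a separate module rather than in core):

```puppet
# On every monitored node: export a host definition into the
# stored-configuration database instead of applying it locally.
@@nagios_host { $fqdn:
  ensure  => present,
  address => $ipaddress,
  use     => 'linux-server',
  target  => "/etc/icinga/objects/host_${hostname}.cfg",
}

# On the monitoring server: collect every exported host and restart
# Icinga whenever one of the collected definitions changes.
Nagios_host <<| |>> {
  notify => Service['icinga'],
}
```

This covers all three steps from the list above: the definition is written, and the `notify` triggers the restart.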

To monitor services on nodes there are a few common methods:

  1. Use [check_by_ssh](http://nagiosplugins.org/man/check_by_ssh)
  2. Use [NRPE](http://nagios.sourceforge.net/docs/3_0/addons.html)
  3. Use [NSCA](http://community.nagios.org/2009/06/11/nagios-setting-up-the-nsca-addon-for-passive-checks/) for passive checks

I originally decided to go with check_by_ssh, as puppet has support for collecting ssh keys, but after several frustrating attempts I realised that it only worked with the root user. I then abandoned that strategy and decided to use check_http_remote instead: I didn’t have to worry about keys, all the check processing would originate on the node itself, and my master would just need to collect and display the results. Although this sounds a lot like NSCA, it has one major advantage: if a check result is not updated within a fixed length of time, the test goes to unknown, thereby indicating a problem.
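I can't show how check_http_remote implements that timeout internally, but stock Nagios and Icinga offer comparable behaviour for passive checks through freshness checking. The sketch below is an illustration only: the service name, the 900 second threshold and the `check_dummy` command object (taking a state and a message as arguments) are all assumptions, not part of my setup.

```puppet
# Hypothetical passive service: the node submits results itself, and the
# server only steps in when no result has arrived for a while.
@@nagios_service { "check_disk_passive_${hostname}":
  use                    => 'generic-service',
  host_name              => $fqdn,
  service_description    => 'check_disk_passive',
  active_checks_enabled  => '0',
  passive_checks_enabled => '1',
  check_freshness        => '1',
  # seconds without a new result before the service counts as stale (assumed value)
  freshness_threshold    => '900',
  # executed only when the result is stale; state 3 is UNKNOWN
  check_command          => 'check_dummy!3!"No result received"',
  target                 => "/etc/icinga/objects/check_disk_passive_${hostname}.cfg",
}
```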

The solution borrows a lot from this blog post and its references; my init.pp is modified to reference Icinga and to use check_http_remote. I originally used SQLite as the backend for the puppet stored configurations, but it could not handle the number of nodes and caused locking issues, so I changed to MySQL.

Below are excerpts from the init.pp I used to create the solution.

[ruby]

# The main icinga monitor class
class icinga {
    include apache
    $icinga_cfgdir = "/etc/icinga/objects"

    package {
            'nagios-http-master':
                    ensure => installed;
    }

    service {
            'ido2db':
                    ensure => running,
                    hasstatus => true,
                    hasrestart => true,
                    subscribe => File[$icinga_cfgdir],
    }

    service {
            'icinga':
                    ensure => running,
                    hasstatus => true,
                    hasrestart => true,
                    subscribe => File[$icinga_cfgdir],
    }

    file {
            # prepare a place for a password for the nagiosadmin
            "/etc/icinga/htpasswd.users":
                    ensure => present,
                    mode => '0640', owner => root, group => apache;
            # disable default debian configurations
            [ "/etc/icinga/objects/localhost.cfg",
              "/etc/icinga/objects/timeperiods.cfg",
              "/etc/icinga/objects/services.cfg" ]:
                    ensure => present,
                    notify => Service[icinga];
            "/etc/icinga/objects/hostgroups.cfg":
                    source => "puppet:///modules/icinga/hostgroups_nagios.cfg",
                    mode => '0644', owner => nagios, group => nagios,
                    notify => Service[icinga];
            # permit external commands from the CGI
            "/etc/icinga":
                    ensure => directory, mode => '0751',
                    owner => nagios, group => nagios,
                    notify => Service[icinga];
            "/etc/icinga/var/rw":
                    ensure => directory, mode => '2710',
                    owner => nagios, group => nagios-cmd,
                    notify => Service[icinga];
            "/etc/icinga/objects":
                    ensure => directory, mode => '0775',
                    owner => nagios, group => nagios,
                    recurse => true,
                    notify => Service[icinga];
            "/etc/icinga/bin":
                    source => "puppet:///modules/icinga/bin/",
                    recurse => true,
                    mode => '0755', owner => root, group => 0;
    }
    # import the various definitions
    Nagios_host <<| |>>
    Nagios_service <<| |>>

    define command($command_line) {
            file { "$icinga_cfgdir/${name}_command.cfg":
                            ensure => present, content => template( "icinga/command.erb" ),
                            mode => '0644', owner => nagios, group => nagios,
                            notify => Service[icinga],
            }
    }

    icinga::command {
            # from ssh.pp
            ssh_port:
                    command_line => '/usr/lib/nagios/plugins/check_ssh -p $ARG1$ $HOSTADDRESS$';
            # from apache2.pp
            http_port:
                    command_line => '/usr/lib/nagios/plugins/check_http -p $ARG1$ -H $HOSTADDRESS$ -I $HOSTADDRESS$';

            check_disk:
                    command_line => '/usr/lib/nagios/plugins/check_http_result -H $HOSTNAME$ -t 30 -C \'/usr/lib64/nagios/plugins/check_disk -m -w $ARG1$ -c $ARG2$ -X tmpfs -X iso9660 -X nfs -X lustre\' -n $SERVICEDESC$ -u http://icinga/nagios-http/';
    }

    define host($ip = $fqdn, $short_alias = $fqdn) {
            @@file {
                    "$icinga_cfgdir/${name}_host.cfg":
                            ensure => present, content => template( "icinga/host.erb" ),
                            mode => '0644', owner => nagios, group => nagios,
                            tag => 'icinga'
            }
    }

    define service($check_command = '',
            $icinga_host_name = $fqdn, $icinga_description = '')
    {
            # this is required to pass icinga's internal checks:
            # every service needs to have a defined host
            include icinga::target
            $real_check_command = $check_command ? {
                    '' => $name,
                    default => $check_command
            }
            $real_icinga_description = $icinga_description ? {
                    '' => $name,
                    default => $icinga_description
            }
            @@file {
                    "$icinga_cfgdir/${icinga_host_name}_${name}_service.cfg":
                            ensure => present, content => template( "icinga/service.erb" ),
                            mode => '0644', owner => nagios, group => nagios,
                            tag => 'icinga'
            }
    }

    define extra_host($ip = $fqdn, $short_alias = $fqdn, $parent = "icinga") {
            $icinga_parent = $parent
            file {
                    "$icinga_cfgdir/${name}_host.cfg":
                            ensure => present, content => template( "icinga/host.erb" ),
                            mode => '0644', owner => nagios, group => nagios,
                            notify => Service[icinga],
            }
    }

}

# include this class in every host that should be monitored by icinga
class itarget {
    include icinga-keys

    user { "nagios":
       ensure   => "present",
       uid      => "366",
       gid      => "366",
       groups   => "nagios",
       home     => "/var/log/nagios",
       shell    => "/bin/bash",
       require  => Group["nagios"]
    }

    group { "nagios":
       ensure => "present",
       gid    => "366",
    }

    package {
            [ 'nagios-plugins', 'nagios-plugins-zmj', 'nagios-snmp-plugins', 'nagios-http-common', 'nagios-http-remote', 'nagios-plugins-setuid', 'nagios-of-plugins', 'nagios-sendnsca' ]:
                    ensure => installed,
    }

    @@nagios_host { $fqdn:
            ensure => present,
            alias => $hostname,
            address => $ipaddress,
            use => "linux-server",
            target => "/etc/icinga/objects/host_${hostname}.cfg",
    }

    @@nagios_service { "check_ping_${hostname}":
            check_command => "check_ping!100.0,20%!500.0,60%",
            use => "generic-service",
            host_name => "$fqdn",
            notification_period => "24x7",
            service_description => "check_ping",
            target => "/etc/icinga/objects/check_ping_${hostname}.cfg",
    }
    @@nagios_service { "check_ssh_${hostname}":
            check_command => "check_ssh",
            use => "generic-service",
            host_name => "$fqdn",
            notification_period => "24x7",
            service_description => "check_ssh",
            target => "/etc/icinga/objects/check_ssh_${hostname}.cfg",
    }
    @@nagios_service { "check_disk_${hostname}":
            check_command => "check_disk!5%!2%",
            use => "generic-service",
            host_name => "$fqdn",
            notification_period => "24x7",
            service_description => "check_disk",
            target => "/etc/icinga/objects/check_disk_${hostname}.cfg",
    }

}

[/ruby]
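The excerpts do not show how the two classes are attached to nodes, nor where the exported files tagged 'icinga' get collected. A minimal sketch of how that wiring might look in site.pp, assuming a monitoring server with the hypothetical name `icinga.example.com`; the collector line is my guess at the matching counterpart to the `@@file` exports, not a copy of the original manifest:

```puppet
# The monitoring server gets the main class and collects everything
# the other nodes exported with the 'icinga' tag.
node 'icinga.example.com' {
  include icinga

  File <<| tag == 'icinga' |>> {
    notify => Service['icinga'],
  }
}

# Every other node exports its host and service definitions.
node default {
  include itarget
}
```

A newly booted node then appears in Icinga after two puppet runs: one on the node to export its resources, and one on the server to collect them and restart the service.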

This has worked effectively; however, it is not perfect. For one thing, I built Icinga from scratch. I also made my life easier by using the nagios user rather than an icinga one, because of the users we already had set up.

