Nagios is a free and open-source application that monitors systems, networks and infrastructure. It offers monitoring and alerting services for servers, switches, applications and services. It alerts users when things go wrong and alerts them again when the problem has been resolved.
Why do we need Nagios?
- Detects all types of network or server issues
- Helps you find the root cause of a problem so that you can apply a permanent fix
- Active monitoring of your entire infrastructure and business processes
- Allows you to monitor and troubleshoot server performance issues
- Helps you to plan for infrastructure upgrades before outdated systems create failures
- Helps you maintain the security and availability of your services
- Can automatically fix problems in a panic situation
Prerequisites:
1. In this demonstration, we will be using CentOS 7.
2. We will be using two machines for our lab: one Nagios server node and one node managed by the Nagios server. For our demonstration, nagios.unixlab.com is the server node and webserver.unixlab.com is the node being managed by Nagios.
3. Both nodes should have at least 2 GB of memory and at least 2 CPU cores.
4. Both servers must be able to communicate with each other over the network.
5. On both servers, set the hostname and update /etc/hosts with the node details:
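As a minimal sketch of this step, the snippet below builds the two /etc/hosts entries in a scratch file so they can be reviewed before being appended. The IP addresses are assumptions taken from the check_nrpe commands used later in this tutorial (192.168.33.80 for the Nagios server, 192.168.33.81 for the web server), and the short aliases are illustrative; adjust them to your own lab.

```shell
# Assumed lab addresses; replace with the real IPs of your nodes.
NAGIOS_IP="192.168.33.80"
WEB_IP="192.168.33.81"

# Build the entries in a scratch file first so they can be reviewed.
HOSTS_SNIPPET="$(mktemp)"
printf '%s %s %s\n' "$NAGIOS_IP" "nagios.unixlab.com" "nagios" > "$HOSTS_SNIPPET"
printf '%s %s %s\n' "$WEB_IP" "webserver.unixlab.com" "webserver" >> "$HOSTS_SNIPPET"
cat "$HOSTS_SNIPPET"

# Then, as root on each node (use the matching hostname on each one):
#   hostnamectl set-hostname nagios.unixlab.com
#   cat "$HOSTS_SNIPPET" >> /etc/hosts
```

Both nodes get the same two entries; only the hostnamectl argument differs between them.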
Nagios Configuration Steps:
Perform steps 1 to 12 on the Nagios server node only (nagios.unixlab.com):
1. Enable the epel-release repository:
2. Install Nagios:
3. Install the Nagios plugins:
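Steps 1 to 3 could look like the sketch below. The package names (epel-release, nagios, nagios-plugins-all) are assumptions based on what the EPEL repository provides for CentOS 7. The commands are wrapped in a hypothetical function, install_nagios_server, so that nothing is installed until you call it as root.

```shell
# Hypothetical helper; package names are assumed from the EPEL repository.
install_nagios_server() {
    # Step 1: enable the EPEL repository.
    yum install -y epel-release || return 1
    # Step 2: install Nagios and its web interface.
    yum install -y nagios || return 1
    # Step 3: install the standard plugin set.
    yum install -y nagios-plugins-all || return 1
}

# Run as root on nagios.unixlab.com:
#   install_nagios_server
```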
4. Now enable the nagios service, start it and check its status:
systemctl enable nagios && systemctl start nagios && systemctl status nagios
5. Now take a backup of the file “/etc/httpd/conf.d/nagios.conf” and update it with the details below:
The nagios.conf file can be downloaded from the following link: download_nagios.conf
6. When you install nagios and the nagios plugins, an Apache user, “nagiosadmin”, is created by default. Let’s set the password for this user:
htpasswd /etc/nagios/passwd nagiosadmin
7. Let’s check that the apache user exists and is part of the nagios group, and change the ownership of the directories “/usr/share/nagios” and “/usr/lib64/nagios” to apache.
8. Now restart Nagios, check the Apache configuration for syntax errors with the “apachectl -t” command and start the Apache service with the “apachectl start” command.
9. Now update your Windows hosts file with the details below:
This can be done by opening your Windows machine’s “C:\Windows\System32\drivers\etc\hosts” file in Notepad with administrator privileges and editing it.
10. Now open your web browser and type “nagios.example.com”. You should see a window pop up asking for a username and password.
Type “nagiosadmin” as the username, along with the password we set earlier.
11. Once you provide the correct credentials, you should see the Nagios dashboard.
12. Now click on the “Hosts” link in the left-hand pane. You should see your Nagios server listed as “localhost”.
So far we have only created the basic structure of the Nagios server. Now we have to configure it to be able to monitor other servers.
Perform steps 13 to 25 on both servers unless mentioned otherwise (nagios.unixlab.com and webserver.unixlab.com):
13. Install the following libraries: gcc, glibc, glibc-common, gd, gd-devel, make, net-snmp and openssl-devel:
yum install -y gcc glibc glibc-common gd gd-devel make net-snmp openssl-devel
14. Now we have to download the Nagios plugins and the NRPE plugin. Create the directory structure below:
15. Now download the plugins using the commands below:
16. Now validate that the plugins have been downloaded:
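Steps 14 to 16 can be sketched as below. The download URLs are assumptions (the usual nagios-plugins.org and NagiosEnterprises GitHub release locations for these versions), and placing the “nagios-download” directory under the home directory is also a guess. The helper name fetch_nagios_sources is hypothetical, and wget is assumed to be installed.

```shell
# Assumed download locations for the two tarballs used in this tutorial.
PLUGINS_URL="https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz"
NRPE_URL="https://github.com/NagiosEnterprises/nrpe/releases/download/nrpe-3.2.1/nrpe-3.2.1.tar.gz"

# Hypothetical helper: create the download directory, fetch both tarballs, list them.
#   $1 = download directory (defaults to an assumed ~/nagios-download)
fetch_nagios_sources() {
    dest="${1:-$HOME/nagios-download}"
    mkdir -p "$dest" || return 1
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dest" && wget "$PLUGINS_URL" "$NRPE_URL" && ls -l ./*.tar.gz )
}

# Run on both servers:
#   fetch_nagios_sources
```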
17. Now first extract “nagios-plugins-2.2.1.tar.gz” with the command “tar -xvf nagios-plugins-2.2.1.tar.gz”. You will see that a directory called “nagios-plugins-2.2.1” has been created.
18. Now cd to the “nagios-plugins-2.2.1” directory and run the commands below in sequence.
19. Now go back to the “nagios-download” directory and extract the “nrpe-3.2.1.tar.gz” tarball. You will see that it has been extracted to the “nrpe-3.2.1” directory.
20. Now go inside the “nrpe-3.2.1” directory and run the command below:
./configure --enable-ssl
21. On node “webserver” only: on “webserver.unixlab.com”, run the commands below:
Validate that the “nagios” user has been created.
22. On both servers, run the commands below in sequence:
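For NRPE 3.x built from source, the sequence for this step typically looks like the sketch below. The make targets are ones provided by the NRPE 3.x Makefile, but which of them were run for this lab is an assumption, and build_nrpe is a hypothetical wrapper. It expects to be called from inside the “nrpe-3.2.1” directory after ./configure has finished.

```shell
# Hypothetical wrapper around the usual NRPE 3.x make targets (assumed sequence).
build_nrpe() {
    # Compile the NRPE daemon and the check_nrpe plugin.
    make all || return 1
    # Install the binaries under /usr/local/nagios.
    make install || return 1
    # Install the sample configuration as /usr/local/nagios/etc/nrpe.cfg.
    make install-config || return 1
    # Install the service definition so that "systemctl start nrpe" works.
    make install-init || return 1
}

# Run as root from inside the nrpe-3.2.1 directory:
#   build_nrpe
```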
23. Now start the nrpe service and check its status. The status will show as failed, but don’t worry: it should come up after the configuration changes and the restart in the following steps.
24. On node “Nagios” only: on nagios.unixlab.com, update the file “/usr/local/nagios/etc/nrpe.cfg” with the details below:
25. On node “webserver” only: on “webserver.unixlab.com”, update the file “/usr/local/nagios/etc/nrpe.cfg” with the details below:
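In nrpe.cfg, the setting that usually needs changing for this kind of setup is allowed_hosts, which lists the addresses permitted to query the NRPE daemon. The following is a minimal sketch, assuming the Nagios server is 192.168.33.80 (the address used in the check_nrpe commands in step 26) and that no other settings need to change. set_allowed_hosts is a hypothetical helper, demonstrated on a scratch file, not on the real nrpe.cfg.

```shell
# Hypothetical helper: rewrite the allowed_hosts line in an nrpe.cfg file.
#   $1 = path to nrpe.cfg, $2 = comma-separated list of allowed addresses
set_allowed_hosts() {
    # -i.bak edits the file in place and keeps a backup copy next to it.
    sed -i.bak "s/^allowed_hosts=.*/allowed_hosts=$2/" "$1"
}

# Demonstration on a scratch file that mimics the relevant part of nrpe.cfg.
DEMO_CFG="$(mktemp)"
printf 'server_port=5666\nallowed_hosts=127.0.0.1,::1\n' > "$DEMO_CFG"
set_allowed_hosts "$DEMO_CFG" "127.0.0.1,192.168.33.80"
grep '^allowed_hosts=' "$DEMO_CFG"

# On the real nodes, as root:
#   set_allowed_hosts /usr/local/nagios/etc/nrpe.cfg "127.0.0.1,192.168.33.80"
```

The same value would work on both nodes in this lab, because the Nagios server also queries its own NRPE daemon through 192.168.33.80 in step 26.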
25. On both nodes (Nagios and webserver): restart the “nrpe” service and check the status again. It should now show as up and running.
Perform the rest of the steps on the Nagios server node only (nagios.unixlab.com):
26. Now validate that the nrpe service is working as expected. The commands below need to be run from nagios.unixlab.com only:
- Run the command “/usr/local/nagios/libexec/check_nrpe -H 192.168.33.80”. It should show the NRPE version.
- Run the command “/usr/local/nagios/libexec/check_nrpe -H 192.168.33.80 -c check_load”. This will check the CPU load average on the local host (nagios.unixlab.com) and show us the details:
- Run the command “/usr/local/nagios/libexec/check_nrpe -H 192.168.33.81 -c check_load”. This will check the CPU load average on the remote host (webserver.unixlab.com) and show us the details:
27. Now go to “/etc/nagios” and create a directory called “conf.d”. Change to this directory.
28. Now create “hosts.cfg” file as below:
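A minimal hosts.cfg for this lab could look like the sketch below. It assumes the webserver address 192.168.33.81 used in step 26 and the linux-server host template from the sample templates.cfg shipped with Nagios; the alias is illustrative. The file is written to a scratch directory first so it can be reviewed before being copied into place.

```shell
# Stage the file in a scratch directory; copy it into place once reviewed.
STAGE_DIR="$(mktemp -d)"
cat > "$STAGE_DIR/hosts.cfg" <<'EOF'
define host {
    use         linux-server            ; assumed template from the sample templates.cfg
    host_name   webserver.unixlab.com
    alias       webserver
    address     192.168.33.81           ; assumed IP of the monitored node
}
EOF
cat "$STAGE_DIR/hosts.cfg"

# As root on nagios.unixlab.com:
#   cp "$STAGE_DIR/hosts.cfg" /etc/nagios/conf.d/hosts.cfg
```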
29. Also create “services.cfg” in the same location as below. The services.cfg file contains the checks we need to perform on remote servers.
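As a sketch, a services.cfg with a single check might look like this. It assumes the generic-service template from the sample templates.cfg and reuses the check_load command already exercised through check_nrpe in step 26. The service description is illustrative, and the check_nrpe command name must match a command definition in your configuration.

```shell
# Stage the file in a scratch directory; copy it into place once reviewed.
STAGE_DIR="$(mktemp -d)"
cat > "$STAGE_DIR/services.cfg" <<'EOF'
define service {
    use                   generic-service       ; assumed template from the sample templates.cfg
    host_name             webserver.unixlab.com
    service_description   CPU Load
    check_command         check_nrpe!check_load ; run check_load on the remote host via NRPE
}
EOF
cat "$STAGE_DIR/services.cfg"

# As root on nagios.unixlab.com:
#   cp "$STAGE_DIR/services.cfg" /etc/nagios/conf.d/services.cfg
```

Further checks can be added as additional define service blocks that pass a different NRPE command name after the exclamation mark.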
30. Now create an “objects” directory inside “/etc/nagios/conf.d” and change into it. Create the “commands.cfg” file as mentioned below:
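A commands.cfg that teaches Nagios how to call check_nrpe could be sketched as follows. The plugin path is the one used in step 26; the command name check_nrpe is an assumption and has to match whatever name your service definitions refer to. The heredoc delimiter is quoted so that the shell leaves the $HOSTADDRESS$ and $ARG1$ macros untouched.

```shell
# Stage the file in a scratch directory; copy it into place once reviewed.
STAGE_DIR="$(mktemp -d)"
mkdir -p "$STAGE_DIR/objects"
cat > "$STAGE_DIR/objects/commands.cfg" <<'EOF'
define command {
    command_name    check_nrpe
    command_line    /usr/local/nagios/libexec/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
EOF
cat "$STAGE_DIR/objects/commands.cfg"

# As root on nagios.unixlab.com:
#   cp "$STAGE_DIR/objects/commands.cfg" /etc/nagios/conf.d/objects/commands.cfg
```

Here $HOSTADDRESS$ expands to the address of the host being checked and $ARG1$ to the text after the first exclamation mark in a service's check_command.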
31. Now go to the “/etc/nagios” directory and update the “nagios.cfg” file with the paths of the configuration files we have just created: hosts.cfg, services.cfg and commands.cfg.
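In nagios.cfg, an individual object file is registered with a cfg_file= line. The sketch below uses a hypothetical helper, add_cfg_file, that appends such a line only if it is not already present, and demonstrates it on a scratch file standing in for the real nagios.cfg. If your nagios.cfg already loads the whole directory through a cfg_dir=/etc/nagios/conf.d line, these entries would load the same files twice, so check for that first.

```shell
# Hypothetical helper: append a cfg_file line unless it is already there.
#   $1 = path to nagios.cfg, $2 = path of the object file to register
add_cfg_file() {
    grep -qxF "cfg_file=$2" "$1" || printf 'cfg_file=%s\n' "$2" >> "$1"
}

# Demonstration on a scratch file standing in for /etc/nagios/nagios.cfg.
DEMO_NAGIOS_CFG="$(mktemp)"
for f in hosts.cfg services.cfg objects/commands.cfg; do
    add_cfg_file "$DEMO_NAGIOS_CFG" "/etc/nagios/conf.d/$f"
done
# Calling it again for the same file adds nothing, because the line already exists.
add_cfg_file "$DEMO_NAGIOS_CFG" "/etc/nagios/conf.d/hosts.cfg"
cat "$DEMO_NAGIOS_CFG"

# On the real server, as root:
#   add_cfg_file /etc/nagios/nagios.cfg /etc/nagios/conf.d/hosts.cfg
```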
32. Check for any configuration errors with the “nagios -v /etc/nagios/nagios.cfg” command. It should show zero warnings and zero errors, as shown below:
nagios -v /etc/nagios/nagios.cfg
33. Now restart the nagios service. Validate that it is up and running.
We are all done.
34. Now open your browser and type “http://nagios.example.com/”. Log in with “nagiosadmin” and the password we set earlier. You will be logged in to the Nagios dashboard.
35. Now click on the “Hosts” link under the “Current Status” section in the left-hand pane. You will see our “webserver.unixlab.com” node listed there.
36. Now click on the server “webserver.unixlab.com”, which will take you to the next screen. This screen displays the monitoring status of our node “webserver”.
37. Explore the options available in the dashboard. You can also visit “https://www.nagios.org/” to get more details.
This concludes our tutorial for “Configuring Nagios Monitoring for Infrastructure Servers”.
Note: You can download all the configuration files used in this tutorial from the following link: download