ITIL Services: server performance monitored with OpsView and Nagios
“We wanted to have full support when and if needed, and now we can completely rely on it!”
The client runs a custom ERP platform to cover its distribution needs. One of the most heavily used modules is the sales force automation application, which uses PostgreSQL as its database back end and iMobileDistribution (iMD) as its client front end.
The same configuration is deployed in 6 physically separate locations. Each deployment has a database and a number of iMD clients used to run the business. Each location runs standalone and is not connected to the other locations.
The customer needs a solution that notifies them immediately about the status of the hosting servers (Linux and Windows), the database servers (PostgreSQL) and the synchronization application. If a server is offline, the sales agents cannot synchronize their mobile devices and sales opportunities are likely to be lost. As there is no dedicated technical staff at each location, the solution would help them take appropriate action and resolve the issue quickly.
To address these needs, we planned to set up a Nagios server to monitor each location, collecting information on specific performance-related parameters. If a parameter falls outside a given interval, a warning or error message is sent by email or SMS. The customer can then take appropriate action to solve the problem, or is at least aware of it and can ask for a resolution.
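The threshold logic described above can be sketched in a few lines. This is only an illustration, not the configuration used in the project: the exit codes 0 to 3 are the standard Nagios plugin states, while the function names and the sample thresholds are hypothetical.

```python
from enum import IntEnum


class State(IntEnum):
    """Exit codes that Nagios plugins use to report a check result."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def classify(value, warn, crit):
    """Return the state for a metric where higher values are worse.

    warn and crit are hypothetical thresholds chosen per parameter.
    """
    if value is None:
        return State.UNKNOWN
    if value >= crit:
        return State.CRITICAL
    if value >= warn:
        return State.WARNING
    return State.OK


def status_line(name, value, warn, crit):
    """Return the state and a one-line, plugin-style message."""
    state = classify(value, warn, crit)
    message = f"{name} {state.name} - value={value} (warn={warn}, crit={crit})"
    return state, message
```

For example, `status_line("load", 5, 4, 8)` yields the WARNING state together with the text `load WARNING - value=5 (warn=4, crit=8)`, which is the kind of message a notification would carry.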
Because the deployment is distributed, we had to implement a way to reach the remote servers without being connected to their internal networks.
Why open source?
Open-source tools with publicly available interface specifications give users an accessible, customizable base for building their own specialized components.
We set up the monitoring server in our central location and used port forwarding and SSH to access the remote networks.
For each location, we connect to the local server, which acts as a gateway for collecting information from the other monitored machines.
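As a rough sketch of the gateway approach, the snippet below builds an SSH local port forward from the monitoring server through a location's gateway to a machine behind it. The host names, addresses, ports and the `monitor` user are hypothetical; only the `ssh -N -L` options are standard OpenSSH.

```python
import shlex


def tunnel_command(gateway, target_host, target_port, local_port, user="monitor"):
    """Build the argv for an SSH local port forward through a gateway.

    -N: do not run a remote command, only forward ports.
    -L: listen on local_port on the monitoring server and forward it to
        target_host:target_port as seen from the gateway.
    """
    return [
        "ssh",
        "-N",
        "-L", f"{local_port}:{target_host}:{target_port}",
        f"{user}@{gateway}",
    ]


# Hypothetical example: reach a PostgreSQL server behind one location's gateway.
cmd = tunnel_command("gw.location1.example.com", "10.0.0.12", 5432, 15432)
print(shlex.join(cmd))
# prints: ssh -N -L 15432:10.0.0.12:5432 monitor@gw.location1.example.com
```

The resulting list can be handed to `subprocess.Popen` to keep the tunnel open while checks run against `localhost:15432`.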
For each server we check:
- CPU load
- free memory
- free disk space
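The three server checks above could be gathered on a Linux host roughly as follows. This is a minimal sketch using only the Python standard library, not the plugins actually deployed. It assumes a `/proc/meminfo`-style source for memory figures, and the helper names are hypothetical. Windows hosts would need a different data source.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def free_memory_percent(meminfo):
    """Percentage of total memory the kernel reports as available."""
    return 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]


def free_disk_percent(path="/"):
    """Percentage of free space on the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def cpu_load_1min():
    """1-minute load average (available on Unix-like systems only)."""
    return os.getloadavg()[0]
```

On a Linux host, `parse_meminfo(open("/proc/meminfo").read())` supplies the dictionary that `free_memory_percent` expects; each value can then be compared against the thresholds configured for that server.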
For each database we gather:
- PostgreSQL status
- PostgreSQL load
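One possible way to implement the two database checks is sketched below. It relies on the `pg_isready` utility that ships with PostgreSQL and maps its documented exit codes to Nagios plugin states. Treating "load" as the share of `max_connections` in use is an assumption, as are the thresholds and function names; the source does not say how load was measured.

```python
import subprocess

# pg_isready exit codes mapped to (Nagios state, message).
# Reporting "rejecting connections" as CRITICAL is a choice made here.
PG_ISREADY_TO_NAGIOS = {
    0: (0, "OK - accepting connections"),
    1: (2, "CRITICAL - rejecting connections"),
    2: (2, "CRITICAL - no response"),
    3: (3, "UNKNOWN - no connection attempt made"),
}


def interpret_pg_isready(returncode):
    """Translate a pg_isready exit code into a Nagios state and message."""
    return PG_ISREADY_TO_NAGIOS.get(
        returncode, (3, f"UNKNOWN - unexpected exit code {returncode}")
    )


def postgres_status(host, port=5432, timeout=5):
    """Run pg_isready against host:port (requires the PostgreSQL client tools)."""
    result = subprocess.run(
        ["pg_isready", "-h", host, "-p", str(port), "-t", str(timeout)],
        capture_output=True,
        text=True,
    )
    return interpret_pg_isready(result.returncode)


def connection_load(active, max_connections, warn=0.8, crit=0.95):
    """Rate load as the fraction of max_connections in use (an assumed metric).

    The counts could come from queries such as
    SELECT count(*) FROM pg_stat_activity and SHOW max_connections.
    """
    if max_connections <= 0:
        return 3, "UNKNOWN - invalid max_connections"
    ratio = active / max_connections
    if ratio >= crit:
        return 2, f"CRITICAL - {active}/{max_connections} connections in use"
    if ratio >= warn:
        return 1, f"WARNING - {active}/{max_connections} connections in use"
    return 0, f"OK - {active}/{max_connections} connections in use"
```

With a tunnel in place, `postgres_status("localhost", 15432)` would check a remote database through the forwarded port; the port number here is hypothetical.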
The information is collated on the Nagios server and made available through the OpsView front end. Notifications are sent by email to the relevant people when any parameter is outside its normal range.