The virtual router (VR) runs services that must stay up until CloudStack disables them.
Currently, if a service in the VR goes down, there is no mechanism to alert the admin or
take action on the crashed service.
This feature is about monitoring the services provided by the VR.
The goal of this feature is to monitor all VR services and ensure they keep running throughout the lifetime of the VR.
On service failure:
a) Restart the service
b) Generate an alert and event indicating the failure
Monitoring the VR services consists of two tasks.
https://issues.apache.org/jira/browse/CLOUDSTACK-4736
Services to be monitored in VR
Note: The monitoring process can monitor only services that run as daemons.
CloudStack sends the router a config file listing the services to be monitored. Services like dnsmasq and haproxy are included
only if the corresponding service is selected in the network offering.
The sshd and webserver services are selected by default from the DB.
New DB table:
table name: monitoring_services
id, uuid: id and uuid
service: general name of the service
process_name: name of the service as it appears in the list of running processes
service_name: name of the service as used in the service path
service_path: service path (Ex: /etc/init.d/<service>)
pidfile: path of the pid file
isDefault: whether the service is monitored by default
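
The selection rule described above (default services plus those enabled in the network offering) can be sketched as follows. This is a minimal illustration, not the CloudStack implementation: the `MonitoringService` class, the `select_services` function and every value in `EXAMPLE_ROWS` are hypothetical and only mirror the column names of the table.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class MonitoringService:
    """One row of monitoring_services (id and uuid omitted for brevity)."""
    service: str        # general name of the service
    process_name: str   # name in the running process list
    service_name: str   # name used in the service path
    service_path: str   # e.g. /etc/init.d/<service>
    pidfile: str        # path of the pid file
    is_default: bool    # monitored regardless of the network offering


# Illustrative rows only; these are NOT the actual table contents.
EXAMPLE_ROWS = [
    MonitoringService("ssh", "sshd", "ssh", "/etc/init.d/ssh", "/var/run/sshd.pid", True),
    MonitoringService("webserver", "apache2", "apache2", "/etc/init.d/apache2", "/var/run/apache2.pid", True),
    MonitoringService("dns", "dnsmasq", "dnsmasq", "/etc/init.d/dnsmasq", "/var/run/dnsmasq.pid", False),
    MonitoringService("loadbalancing", "haproxy", "haproxy", "/etc/init.d/haproxy", "/var/run/haproxy.pid", False),
]


def select_services(rows: Iterable[MonitoringService],
                    offered: Iterable[str]) -> List[MonitoringService]:
    """Keep default services and those enabled in the network offering."""
    offered_set = {name.lower() for name in offered}
    return [r for r in rows if r.is_default or r.service.lower() in offered_set]
```

For example, `select_services(EXAMPLE_ROWS, ["dns"])` would keep the two default rows and the dnsmasq row, and drop haproxy.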
Inside the VR, a Python script reads the config file and periodically checks the status of each service.
The monitor script monitors only services that have a pid file. If multiple processes share the same name, the monitor
checks the process whose pid is recorded in the service's pid file (Ex: /var/run/<servicename>.pid).
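
A pid-file based check of this kind might look like the sketch below. It assumes a Linux-style /proc filesystem, and the function names and the `proc_root` parameter are hypothetical (the parameter exists only so the check can be pointed at a different directory for testing). The actual monitor script may determine the status differently.

```python
import os
from typing import Optional


def read_pid(pidfile: str) -> Optional[int]:
    """Return the pid stored in pidfile, or None if it is missing or invalid."""
    try:
        with open(pidfile) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(pidfile: str, process_name: str, proc_root: str = "/proc") -> bool:
    """True if the pid from pidfile exists and its command line mentions process_name."""
    pid = read_pid(pidfile)
    if pid is None:
        return False
    try:
        # /proc/<pid>/cmdline holds the NUL-separated command line of the process.
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
    except OSError:
        return False  # no such process
    return process_name in cmdline
```

Checking the command line as well as the pid guards against a stale pid file whose pid has been reused by an unrelated process.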
If a service is not running, the script rechecks its status for 5 seconds at 1-second intervals. If the service is still not running,
the monitoring script does the following.
1. Writes a syslog entry about the service failure and restarts the service.
2. If the restart fails, writes an event log entry to syslog.
3. A service whose restart failed is left unmonitored for the next 30 minutes. After 30 minutes the monitor tries to
restart the service again.
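
The recheck, restart and 30-minute back-off steps above can be sketched as below. All names are hypothetical, and the `check`, `restart`, `log` and `sleep` callables are injected so the logic stays independent of any particular service. The sketch also assumes that the `state` dictionary is persisted between runs (for example in a file), which the original text does not specify.

```python
import time

RECHECK_ATTEMPTS = 5          # recheck for 5 seconds ...
RECHECK_INTERVAL = 1          # ... at 1-second intervals
UNMONITOR_SECONDS = 30 * 60   # back-off after a failed restart


def confirm_down(check, attempts=RECHECK_ATTEMPTS,
                 interval=RECHECK_INTERVAL, sleep=time.sleep):
    """Return True only if check() stays False across all rechecks."""
    if check():
        return False
    for _ in range(attempts):
        sleep(interval)
        if check():
            return False
    return True


def monitor_once(name, check, restart, state, now, log=print, sleep=time.sleep):
    """Run one monitoring pass for a single service.

    state maps a service name to the timestamp until which it is unmonitored.
    Returns 'skipped', 'ok', 'restarted' or 'restart_failed'.
    """
    if now < state.get(name, 0):
        return "skipped"                      # still inside the back-off window
    if not confirm_down(check, sleep=sleep):
        return "ok"
    log(f"{name} is not running, restarting")
    if restart() and check():
        state.pop(name, None)
        return "restarted"
    log(f"restart of {name} failed, unmonitoring for 30 minutes")
    state[name] = now + UNMONITOR_SECONDS
    return "restart_failed"
```

In a real script, `log` could be `syslog.syslog`, and `restart` could wrap something like `subprocess.call([service_path, "restart"]) == 0`.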
The monitor script is added to crontab to run every 3 minutes.
1. Advanced zone Isolated networks
2. Basic zone shared network
3. Advanced zone shared network
How logs are sent from the VR to the management server (MS) or to external receivers still needs to be discussed and finalised.
One possible way to send monitor logs from the VR to the MS is:
1. Poll the VR from the management server for logs.
2. The existing VR usage polling threads could also be overloaded for this purpose.
Note: This task is out of scope for the 4.3 release.
No UI changes.
XenServer, KVM, VMware
Since this feature adds new script files, a router reboot is required for existing routers.
https://cwiki.apache.org/confluence/display/CLOUDSTACK/System+VMs+and+services+resiliency