Consul Integration with Opsgenie
11 Sep 2016
In the past I discussed using Consul as a service discovery and configuration management solution, and how to configure alerts on key-value changes.
Opsgenie is a must-have tool for the SRE team of any company beyond a certain scale. It sends alerts on events (a service going down, for example) via email, SMS, mobile push and even phone calls (so it is the enabler of the dreaded pager duty).
At my company we use Opsgenie, and the SRE team is on a constant roster, so different engineers are on call on a rotation basis.
There was a request to integrate service health in Consul with Opsgenie. A quick search on the internet revealed a solution called consul-alerts, which is also the official recommendation from Opsgenie.
After installing it and trying to figure it out, to be frank, I couldn't comprehend it well (or rather, I couldn't get it to deliver notifications successfully). So I decided to go back to first principles and use the native consul watch command and some bash to do the simple job (isn't KISS something never to be forgotten?), and came up with the simplest possible script, which does the job well. This script is invoked from an upstart script (Ubuntu 14.04), which takes care of re-running it if it gets killed somehow.
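As a rough illustration of the bash side, here is a minimal sketch, not the original script. It assumes the rendered template produces one line per service instance in the form "name address port status", and that alerts go to the Opsgenie v2 Alert API with a key held in an OPSGENIE_API_KEY environment variable. The function names build_payload, alerts_for and send_alert, the line format, and the alias scheme are all made up for this example.

```shell
#!/usr/bin/env bash
# Sketch only: function names, line format and env var name are assumptions.
set -euo pipefail

# Build an Opsgenie alert payload for one service instance.
# No JSON escaping is done: values are assumed to contain no quotes or backslashes.
build_payload() {
  local name=$1 address=$2 port=$3 status=$4
  printf '{"message":"consul: %s is %s","alias":"consul-%s-%s-%s","description":"%s at %s:%s reports status %s"}\n' \
    "$name" "$status" "$name" "$address" "$port" "$name" "$address" "$port" "$status"
}

# Read "name address port status" lines from stdin and print one payload
# per instance whose status is anything other than "passing".
alerts_for() {
  local name="" address="" port="" status=""
  while read -r name address port status || [[ -n "$name" ]]; do
    if [[ -z "$name" || "$status" == "passing" ]]; then
      continue
    fi
    build_payload "$name" "$address" "$port" "$status"
  done
}

# POST one payload to Opsgenie; aborts if OPSGENIE_API_KEY is not set.
send_alert() {
  curl -sS -X POST 'https://api.opsgenie.com/v2/alerts' \
    -H "Authorization: GenieKey ${OPSGENIE_API_KEY:?set OPSGENIE_API_KEY}" \
    -H 'Content-Type: application/json' \
    --data "$1"
}

# Possible wiring (assumption), e.g. from a handler started by consul watch:
#   consul-template -once -template 'service.alert.ctmpl:/tmp/services.txt'
#   alerts_for < /tmp/services.txt | while read -r p; do send_alert "$p"; done
```

Using a stable alias per service instance lets Opsgenie de-duplicate repeated alerts for the same failure instead of paging once per run.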
service.alert.ctmpl is simple: it lists all service names with their IP, port and current status (passing, warning, critical etc.).
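A template along these lines could look like the sketch below, written out with a shell heredoc. The template body is an assumption rather than the original file: it uses consul-template's services and service functions with the "any" health filter, and the exact query syntax depends on the consul-template version in use.

```shell
# Write a sketch of service.alert.ctmpl (assumed content, not the original file).
# Each rendered line has the form: name address port status
cat > service.alert.ctmpl <<'EOF'
{{range services}}{{range service (printf "%s|any" .Name)}}{{.Name}} {{.Address}} {{.Port}} {{.Status}}
{{end}}{{end}}
EOF

# Render once against a running Consul agent (requires consul-template):
#   consul-template -once -template 'service.alert.ctmpl:/tmp/services.txt'
```

The "any" filter matters here: by default only healthy instances are returned, which would hide exactly the services worth alerting on.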
It took less than 5 minutes to get this running.