Ready to know about downtime before your customers?
Status List delivers uptime monitoring and professional hosted status pages for sites of all shapes and sizes.
Trusted by 1000+ companies
Let’s set up ZooKeeper monitoring with InfluxDB (a popular time series database and visualizer). In this guide, we’ll get you completely set up to monitor your existing ZooKeeper cluster.
1. Let’s get InfluxDB installed. Head over to the official download page. Install the database, the Telegraf data collector, and the CLI packages.
2. Start the InfluxDB and Telegraf services:
service influxdb start
service telegraf start
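3. Open the InfluxDB dashboard at http://localhost:8086, the default port for InfluxDB 2.x. (Port 8083 served the admin UI only in older 1.x releases, which do not include the influx apply command used in the next step.)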
4. Install the ZooKeeper dashboard template
export ZOOKEEPER_HOST=localhost:2181
influx apply -u https://raw.githubusercontent.com/influxdata/community-templates/master/zookeeper/zookeeper.yml
5. Reload your browser and view the new dashboard. You should see metrics start to appear within a few seconds.
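If no metrics show up, it can help to confirm that ZooKeeper itself is reporting them. The sketch below assumes that Telegraf's ZooKeeper input reads the `mntr` four-letter command, that `nc` (netcat) is installed, and that the server listens on localhost:2181. The function names `mntr_value` and `zk_mntr` are made up for illustration. On ZooKeeper 3.5 and later, `mntr` may need to be enabled through `4lw.commands.whitelist`, and some netcat variants need an extra flag to close the connection.

```shell
#!/bin/sh
# Print the value of one metric from `mntr` output read on stdin.
# `mntr` emits one "key<TAB>value" pair per line, with keys prefixed by zk_
# (for example zk_avg_latency).
mntr_value() {
    awk -F '\t' -v key="$1" '$1 == key { print $2 }'
}

# Ask a ZooKeeper server for its metrics.
# Host and port default to localhost:2181 (an assumption about your setup).
zk_mntr() {
    printf 'mntr' | nc "${1:-localhost}" "${2:-2181}"
}

# Example (needs a running ZooKeeper):
#   zk_mntr localhost 2181 | mntr_value zk_avg_latency
```

An empty result from `zk_mntr` usually points at ZooKeeper or the network rather than at InfluxDB or the dashboard template.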
6. Let’s create a place for your notifications to go. Go to Alerts > Alerts > Notification Endpoints. Click + Create.
a. Give the endpoint a name and description.
b. Choose HTTP, Slack, or PagerDuty as the destination and fill out the appropriate details.
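Before relying on a Slack endpoint, you may want to confirm that the webhook URL accepts messages at all. The following is a minimal sketch, assuming a Slack incoming webhook, which takes a JSON body of the form {"text": "..."}. The function names and the `SLACK_WEBHOOK_URL` variable are hypothetical, and the escaping only covers backslashes and double quotes.

```shell
#!/bin/sh
# Build the JSON body for a Slack incoming webhook: {"text":"..."}.
# Escapes backslashes and double quotes; newlines in the message are not handled.
slack_payload() {
    escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"text":"%s"}' "$escaped"
}

# POST a test message to the webhook URL given as the first argument.
send_test_alert() {
    curl -sS -X POST -H 'Content-Type: application/json' \
        --data "$(slack_payload "$2")" "$1"
}

# Example (SLACK_WEBHOOK_URL is a placeholder for your own webhook):
#   send_test_alert "$SLACK_WEBHOOK_URL" 'Test alert from the ZooKeeper monitoring setup'
```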
7. Configure your alerts. Go to Alerts > Alerts and click + Create. Add threshold and deadman alerts for each of the following (feel free to customize to your needs).
- znode total occupied memory is too large
  - query: approximate_data_size / 1024 / 1024 > 1 * 1024 # more than 1024 MB (1 GB)
  - timebucket: 1m
  - alert title: Instance {{ $labels.instance }} znode total occupied memory is too large
- average latency is too high
  - query: avg_latency > 100
  - timebucket: 1m
  - alert title: Instance {{ $labels.instance }} average latency is too high
- too many znodes created
  - query: znode_count > 1000000
  - timebucket: 1m
  - alert title: Instance {{ $labels.instance }} too many znodes created
- too many open files
  - query: open_file_descriptor_count > 300
  - timebucket: 1m
  - alert title: Instance {{ $labels.instance }} too many open files
- too many connections
  - query: num_alive_connections > 50 # assuming the default maxClientCnxns of 60
  - timebucket: 1m
  - alert title: Instance {{ $labels.instance }} too many connections
- too many watches set
  - query: watch_count > 10000
  - timebucket: 1m
  - alert title: Instance {{ $labels.instance }} too many watches set
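To make the thresholds concrete, here is a rough sketch of the same comparisons in plain shell. It is not how InfluxDB evaluates checks; it only mirrors the arithmetic from the list, including the bytes-to-megabytes conversion, plus a simple deadman test that treats a metric as stale when no sample has arrived within a given number of seconds. The function names are hypothetical, and all values are assumed to be whole numbers.

```shell
#!/bin/sh
# Print one line per breached threshold. Arguments, in order:
#   approximate_data_size (bytes), avg_latency, znode_count,
#   open_file_descriptor_count, num_alive_connections, watch_count
# Integers only: a fractional avg_latency would need rounding first.
check_zookeeper() {
    data_size_bytes=$1
    avg_latency=$2
    znode_count=$3
    open_fds=$4
    connections=$5
    watch_count=$6

    # Same conversion as "approximate_data_size / 1024 / 1024" in the query.
    data_size_mb=$(( $data_size_bytes / 1024 / 1024 ))

    if [ "$data_size_mb" -gt 1024 ]; then echo "znode total occupied memory is too large"; fi
    if [ "$avg_latency" -gt 100 ]; then echo "average latency is too high"; fi
    if [ "$znode_count" -gt 1000000 ]; then echo "too many znodes created"; fi
    if [ "$open_fds" -gt 300 ]; then echo "too many open files"; fi
    if [ "$connections" -gt 50 ]; then echo "too many connections"; fi
    if [ "$watch_count" -gt 10000 ]; then echo "too many watches set"; fi
}

# Deadman: succeeds when the newest sample is older than max_age seconds.
# Arguments: last_seen (epoch seconds), now (epoch seconds), max_age (seconds).
is_stale() {
    [ $(( $2 - $1 )) -gt "$3" ]
}

# Example: 2 GiB of znode data and 150 ms latency breach two rules.
#   check_zookeeper 2147483648 150 10 10 10 10
```

All comparisons are strict, so a value sitting exactly on a threshold (for example, exactly 1024 MB of znode data) does not raise an alert.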
For more details on how to configure alerts, please see the InfluxDB alert documentation.
Congratulations, that’s it! You’re ready to go.
© Status List 2024