netdata/netdata

View on GitHub
src/health/guides/consul/consul_autopilot_server_health_status.md

Summary

Maintainability
Test Coverage
### Understand the alert

The `consul_autopilot_server_health_status` alert triggers when a Consul server in your service mesh is marked `unhealthy`. This can affect the overall stability and performance of the service mesh. Regular monitoring and addressing unhealthy servers are crucial in maintaining a smooth functioning environment.

### What is Consul?

`Consul` is a service mesh solution that provides a full-featured control plane with service discovery, configuration, and segmentation functionalities. It is used to connect, secure, and configure services across any runtime platform and public or private cloud.

### Troubleshoot the alert

Follow the steps below to identify and resolve the issue of an unhealthy Consul server:

1. Check Consul server logs

   Inspect the logs of the unhealthy server to identify the root cause of the issue. You can find logs typically in `/var/log/consul` or use `journalctl` with Consul:

   ```
   journalctl -u consul
   ```

2. Verify connectivity

   Ensure that the unhealthy server can communicate with other servers in the datacenter. Check for any misconfigurations or network issues.

3. Review server resources

   Monitor the resource usage of the unhealthy server (CPU, memory, disk I/O, network). High resource usage can impact the server's health status. Use tools like `top`, `htop`, `iotop`, or `nload` to monitor the resources.

4. Restart the Consul server

   If the issue persists and you cannot identify the root cause, try restarting the Consul server:

   ```
   sudo systemctl restart consul
   ```

5. Refer to Consul's documentation

   Consult the official [Consul troubleshooting documentation](https://developer.hashicorp.com/consul/tutorials/datacenter-operations/troubleshooting) for further assistance.

6. Inspect the Consul UI

   Check the Consul UI for the server health status and any additional information related to the unhealthy server. You can find the Consul UI at `http://<consul-server-ip>:8500/ui/`.

### Useful resources

1. [Consul Documentation](https://www.consul.io/docs)
2. [Running Consul as a Systemd Service](https://learn.hashicorp.com/tutorials/consul/deployment-guide#systemd-service)