src/go/plugin/go.d/modules/elasticsearch/integrations/elasticsearch.md from firehol/netdata

<!--startmeta
custom_edit_url: "https://github.com/netdata/netdata/edit/master/src/go/plugin/go.d/modules/elasticsearch/integrations/elasticsearch.md"
meta_yaml: "https://github.com/netdata/netdata/edit/master/src/go/plugin/go.d/modules/elasticsearch/metadata.yaml"
sidebar_label: "Elasticsearch"
learn_status: "Published"
learn_rel_path: "Collecting Metrics/Search Engines"
most_popular: True
message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
endmeta-->

# Elasticsearch


<img src="https://netdata.cloud/img/elasticsearch.svg" width="150"/>


Plugin: go.d.plugin
Module: elasticsearch

<img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />

## Overview

This collector monitors the performance and health of an Elasticsearch cluster.


It uses [Cluster APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster.html) to collect metrics.

Used endpoints:

| Endpoint               | Description          | API                                                                                                         |
|------------------------|----------------------|-------------------------------------------------------------------------------------------------------------|
| `/`                    | Node info            |                                                                                                             |
| `/_nodes/stats`        | Nodes metrics        | [Nodes stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html) |
| `/_nodes/_local/stats` | Local node metrics   | [Nodes stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-nodes-stats.html) |
| `/_cluster/health`     | Cluster health stats | [Cluster health API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html)   |
| `/_cluster/stats`      | Cluster metrics      | [Cluster stats API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-stats.html)     |

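To check that these endpoints are reachable before enabling the collector, a small shell sketch along the following lines may help. It assumes `curl` is installed. The function names `es_endpoints` and `es_probe` are illustrative and are not part of Netdata or Elasticsearch.

```bash
# Hypothetical helper: list the endpoints from the table above, one per line.
es_endpoints() {
  printf '%s\n' \
    "/" \
    "/_nodes/stats" \
    "/_nodes/_local/stats" \
    "/_cluster/health" \
    "/_cluster/stats"
}

# Hypothetical helper: print the HTTP status code returned by each endpoint.
# The base URL argument is optional and falls back to http://127.0.0.1:9200.
es_probe() {
  local base="${1:-http://127.0.0.1:9200}"
  es_endpoints | while IFS= read -r ep; do
    # curl reports 000 when the connection fails or times out.
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "${base}${ep}")"
    printf '%s %s\n' "${code}" "${ep}"
  done
}
```

Calling `es_probe` (optionally with a base URL such as `http://192.0.2.1:9200`) prints one line per endpoint. A `200` in the first column suggests the endpoint is reachable, while a `401` usually means that authentication is required.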

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.


### Default Behavior

#### Auto-Detection

By default, it detects instances running on localhost by attempting to connect to port 9200:

- http://127.0.0.1:9200
- https://127.0.0.1:9200


#### Limits

By default, this collector monitors only the node it is connected to. To monitor all cluster nodes, set the `cluster_mode` configuration option to `yes`.

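The effect of `cluster_mode` can be pictured as a choice between the two nodes stats endpoints listed in the Overview. The sketch below is an illustration only: it assumes that `cluster_mode` switches the collector from `/_nodes/_local/stats` to `/_nodes/stats`, and `nodes_stats_path` is a hypothetical helper name, not part of Netdata.

```bash
# Hypothetical helper: map a cluster_mode value to the nodes stats endpoint
# it is assumed to select.
nodes_stats_path() {
  case "$1" in
    yes|true) echo "/_nodes/stats" ;;         # all nodes in the cluster
    *)        echo "/_nodes/_local/stats" ;;  # only the connected node (default)
  esac
}

# Example: build the URL that would be queried on a local instance
# when cluster_mode is enabled.
echo "http://127.0.0.1:9200$(nodes_stats_path yes)"
```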

#### Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.


## Metrics

Metrics grouped by *scope*.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.



### Per node

These metrics refer to the cluster node.

Labels:

| Label      | Description     |
|:-----------|:----------------|
| cluster_name | Name of the cluster. Based on the [Cluster name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster-name). |
| node_name | Human-readable identifier for the node. Based on the [Node name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#node-name). |
| host | Network host for the node, based on the [Network host setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#network.host). |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| elasticsearch.node_indices_indexing | index | operations/s |
| elasticsearch.node_indices_indexing_current | index | operations |
| elasticsearch.node_indices_indexing_time | index | milliseconds |
| elasticsearch.node_indices_search | queries, fetches | operations/s |
| elasticsearch.node_indices_search_current | queries, fetches | operations |
| elasticsearch.node_indices_search_time | queries, fetches | milliseconds |
| elasticsearch.node_indices_refresh | refresh | operations/s |
| elasticsearch.node_indices_refresh_time | refresh | milliseconds |
| elasticsearch.node_indices_flush | flush | operations/s |
| elasticsearch.node_indices_flush_time | flush | milliseconds |
| elasticsearch.node_indices_fielddata_memory_usage | used | bytes |
| elasticsearch.node_indices_fielddata_evictions | evictions | operations/s |
| elasticsearch.node_indices_segments_count | segments | segments |
| elasticsearch.node_indices_segments_memory_usage_total | used | bytes |
| elasticsearch.node_indices_segments_memory_usage | terms, stored_fields, term_vectors, norms, points, doc_values, index_writer, version_map, fixed_bit_set | bytes |
| elasticsearch.node_indices_translog_operations | total, uncommitted | operations |
| elasticsearch.node_indices_translog_size | total, uncommitted | bytes |
| elasticsearch.node_file_descriptors | open | fd |
| elasticsearch.node_jvm_heap | inuse | percentage |
| elasticsearch.node_jvm_heap_bytes | committed, used | bytes |
| elasticsearch.node_jvm_buffer_pools_count | direct, mapped | pools |
| elasticsearch.node_jvm_buffer_pool_direct_memory | total, used | bytes |
| elasticsearch.node_jvm_buffer_pool_mapped_memory | total, used | bytes |
| elasticsearch.node_jvm_gc_count | young, old | gc/s |
| elasticsearch.node_jvm_gc_time | young, old | milliseconds |
| elasticsearch.node_thread_pool_queued | generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management | threads |
| elasticsearch.node_thread_pool_rejected | generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management | threads |
| elasticsearch.node_cluster_communication_packets | received, sent | pps |
| elasticsearch.node_cluster_communication_traffic | received, sent | bytes/s |
| elasticsearch.node_http_connections | open | connections |
| elasticsearch.node_breakers_trips | requests, fielddata, in_flight_requests, model_inference, accounting, parent | trips/s |

### Per cluster

These metrics refer to the cluster.

Labels:

| Label      | Description     |
|:-----------|:----------------|
| cluster_name | Name of the cluster. Based on the [Cluster name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster-name). |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| elasticsearch.cluster_health_status | green, yellow, red | status |
| elasticsearch.cluster_number_of_nodes | nodes, data_nodes | nodes |
| elasticsearch.cluster_shards_count | active_primary, active, relocating, initializing, unassigned, delayed_unaasigned | shards |
| elasticsearch.cluster_pending_tasks | pending | tasks |
| elasticsearch.cluster_number_of_in_flight_fetch | in_flight_fetch | fetches |
| elasticsearch.cluster_indices_count | indices | indices |
| elasticsearch.cluster_indices_shards_count | total, primaries, replication | shards |
| elasticsearch.cluster_indices_docs_count | docs | docs |
| elasticsearch.cluster_indices_store_size | size | bytes |
| elasticsearch.cluster_indices_query_cache | hit, miss | events/s |
| elasticsearch.cluster_nodes_by_role_count | coordinating_only, data, data_cold, data_content, data_frozen, data_hot, data_warm, ingest, master, ml, remote_cluster_client, voting_only | nodes |

### Per index

These metrics refer to the index.

Labels:

| Label      | Description     |
|:-----------|:----------------|
| cluster_name | Name of the cluster. Based on the [Cluster name setting](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#cluster-name). |
| index | Name of the index. |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| elasticsearch.node_index_health | green, yellow, red | status |
| elasticsearch.node_index_shards_count | shards | shards |
| elasticsearch.node_index_docs_count | docs | docs |
| elasticsearch.node_index_store_size | store_size | bytes |



## Alerts


The following alerts are available:

| Alert name  | On metric | Description |
|:------------|:----------|:------------|
| [ elasticsearch_node_indices_search_time_query ](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.node_indices_search_time | search performance is degraded, queries run slowly. |
| [ elasticsearch_node_indices_search_time_fetch ](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.node_indices_search_time | search performance is degraded, fetches run slowly. |
| [ elasticsearch_cluster_health_status_red ](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.cluster_health_status | cluster health status is red. |
| [ elasticsearch_cluster_health_status_yellow ](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.cluster_health_status | cluster health status is yellow. |
| [ elasticsearch_node_index_health_red ](https://github.com/netdata/netdata/blob/master/src/health/health.d/elasticsearch.conf) | elasticsearch.node_index_health | node index $label:index health status is red. |

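When one of the cluster health alerts fires, it can help to compare it with what Elasticsearch itself reports. The sketch below extracts the `status` field from a `/_cluster/health` response using only `sed`. `health_status` is a hypothetical helper name, and the sample JSON is a shortened, made-up response rather than real output.

```bash
# Hypothetical helper: read a /_cluster/health JSON response on stdin and
# print the value of its "status" field (green, yellow, or red).
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Made-up, shortened sample response, used here instead of a live request.
sample='{"cluster_name":"example","status":"yellow","timed_out":false}'
printf '%s\n' "${sample}" | health_status
```

Against a live node, the output of `curl -s http://127.0.0.1:9200/_cluster/health` can be piped into `health_status` instead of the sample.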

## Setup

### Prerequisites

No action required.

### Configuration

#### File

The configuration file name for this integration is `go.d/elasticsearch.conf`.


You can edit the configuration file using the `edit-config` script from the
Netdata [config directory](/docs/netdata-agent/configuration/README.md#the-netdata-config-directory).

```bash
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/elasticsearch.conf
```
#### Options

The following options can be defined globally: update_every, autodetection_retry.


<details open><summary>Config options</summary>

| Name | Description | Default | Required |
|:----|:-----------|:-------|:--------:|
| update_every | Data collection frequency in seconds. | 5 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| url | Server URL. | http://127.0.0.1:9200 | yes |
| cluster_mode | Controls whether to collect metrics for all nodes in the cluster or only for the local node. | false | no |
| collect_node_stats | Controls whether to collect nodes metrics. | true | no |
| collect_cluster_health | Controls whether to collect cluster health metrics. | true | no |
| collect_cluster_stats | Controls whether to collect cluster stats metrics. | true | no |
| collect_indices_stats | Controls whether to collect indices metrics. | false | no |
| timeout | HTTP request timeout in seconds. | 2 | no |
| username | Username for basic HTTP authentication. |  | no |
| password | Password for basic HTTP authentication. |  | no |
| proxy_url | Proxy URL. |  | no |
| proxy_username | Username for proxy basic HTTP authentication. |  | no |
| proxy_password | Password for proxy basic HTTP authentication. |  | no |
| method | HTTP request method. | GET | no |
| body | HTTP request body. |  | no |
| headers | HTTP request headers. |  | no |
| not_follow_redirects | Redirect handling policy. Controls whether the client follows redirects. | no | no |
| tls_skip_verify | Server certificate chain and hostname validation policy. Controls whether the client performs this check. | no | no |
| tls_ca | Certification authority that the client uses when verifying the server's certificates. |  | no |
| tls_cert | Client TLS certificate. |  | no |
| tls_key | Client TLS key. |  | no |

</details>

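Before putting connection options into a job, it can be useful to try an equivalent request by hand. The sketch below maps a few of the options above onto `curl` flags that play a similar role. The mapping is an approximation for manual testing, not a description of how the collector builds its requests, and `es_curl_args` and the `ES_*` variables are hypothetical names.

```bash
# Hypothetical helper: print curl flags, one per line, that roughly
# correspond to job options supplied through ES_* environment variables.
es_curl_args() {
  # timeout (the table lists a default of 2)
  printf '%s\n' "--silent" "--max-time" "${ES_TIMEOUT:-2}"
  # username / password
  if [ -n "${ES_USERNAME:-}" ]; then
    printf '%s\n' "--user" "${ES_USERNAME}:${ES_PASSWORD:-}"
  fi
  # tls_skip_verify
  if [ "${ES_TLS_SKIP_VERIFY:-no}" = "yes" ]; then
    printf '%s\n' "--insecure"
  fi
  # tls_ca
  if [ -n "${ES_TLS_CA:-}" ]; then
    printf '%s\n' "--cacert" "${ES_TLS_CA}"
  fi
  # proxy_url
  if [ -n "${ES_PROXY_URL:-}" ]; then
    printf '%s\n' "--proxy" "${ES_PROXY_URL}"
  fi
}
```

In bash 4 or later, the flags can be loaded with `mapfile -t args < <(es_curl_args)` and then passed as `curl "${args[@]}" http://127.0.0.1:9200/`.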
#### Examples

##### Basic single node mode

A basic example configuration.

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200

```
##### Cluster mode

Cluster mode example configuration.

<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
    cluster_mode: yes

```
</details>

##### HTTP authentication

Basic HTTP authentication.

<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
    username: username
    password: password

```
</details>

##### HTTPS with self-signed certificate

Elasticsearch with HTTPS enabled and a self-signed certificate.

<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_skip_verify: yes

```
</details>

##### Multi-instance

> **Note**: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.


<details open><summary>Config</summary>

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200

  - name: remote
    url: http://192.0.2.1:9200

```
</details>



## Troubleshooting

### Debug Mode

**Important**: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the `elasticsearch` collector, run the `go.d.plugin` with the debug option enabled. The output
should give you clues as to why the collector isn't working.

- Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
  your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.

  ```bash
  cd /usr/libexec/netdata/plugins.d/
  ```

- Switch to the `netdata` user.

  ```bash
  sudo -u netdata -s
  ```

- Run the `go.d.plugin` to debug the collector:

  ```bash
  ./go.d.plugin -d -m elasticsearch
  ```

### Getting Logs

If you're encountering problems with the `elasticsearch` collector, follow these steps to retrieve logs and identify potential issues:

- **Run the command** specific to your system (systemd, non-systemd, or Docker container).
- **Examine the output** for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

#### System with systemd

Use the following command to view logs generated since the last Netdata service restart:

```bash
journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep elasticsearch
```

#### System without systemd

Locate the collector log file, typically at `/var/log/netdata/collector.log`, and use `grep` to filter for the collector's name:

```bash
grep elasticsearch /var/log/netdata/collector.log
```

**Note**: This method shows logs from all restarts. Focus on the **latest entries** for troubleshooting current issues.

#### Docker Container

If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:

```bash
docker logs netdata 2>&1 | grep elasticsearch
```