README.adoc
= NGINX-to-Prometheus log file exporter
:tip-caption: :bulb:
:note-caption: :information_source:
:important-caption: :heavy_exclamation_mark:
:caution-caption: :fire:
:warning-caption: :warning:
:toc:
:toc-placement!:
:toc-title:
image:https://img.shields.io/github/workflow/status/martin-helmich/prometheus-nginxlog-exporter/Compile%20&%20Test[GitHub Workflow Status]
image:https://quay.io/repository/martinhelmich/prometheus-nginxlog-exporter/status[link="https://quay.io/repository/martinhelmich/prometheus-nginxlog-exporter",Docker Repository on Quay]
image:https://goreportcard.com/badge/github.com/martin-helmich/prometheus-nginxlog-exporter[link="https://goreportcard.com/report/github.com/martin-helmich/prometheus-nginxlog-exporter", Go Report Card]
image:https://img.shields.io/github/license/martin-helmich/prometheus-nginxlog-exporter[GitHub]
image:https://img.shields.io/badge/donate-PayPal-yellow[link="https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=SEARYHPVS9U5N&source=url", Donate]
Helper tool that continuously reads an NGINX log file (or any kind of similar log file) and exports metrics to https://prometheus.io/[Prometheus].
[discrete]
== Contents
toc::[]
== Usage
You can either use a simple configuration, using command-line flags, or create
a configuration file with a more advanced configuration.
Use the command-line:
[source]
----
$ ./prometheus-nginxlog-exporter \
-format="<FORMAT>" \
-listen-port=4040 \
-namespace=nginx \
[PATHS-TO-LOGFILES...]
----
Use the configuration file:
[source]
----
$ ./prometheus-nginxlog-exporter -config-file /path/to/config.hcl
----
You can verify your config file before deployment, which will exit with shell status indicating success:
[source]
----
$ ./prometheus-nginxlog-exporter -config-file /path/to/config.hcl -verify-config
----
Installation
------------
There are multiple ways to install this exporter.
=== Docker
Docker images for this exporter are available at the quay.io and pkg.github.com
registries:
- `quay.io/martinhelmich/prometheus-nginxlog-exporter:v1`
- `ghcr.io/martin-helmich/prometheus-nginxlog-exporter/exporter:v1`
Have a look at the https://github.com/martin-helmich/prometheus-nginxlog-exporter/releases[releases page]
to see the available versions and how to pull their images. In general, I would
recommend using the `v1` tag instead of `latest`.
Run the exporter as follows (adjust paths like `/path/to/logs` and
`/path/to/config` to your own needs):
[source]
----
$ docker run \
--name nginx-exporter \
-v /path/to/logs:/mnt/nginxlogs \
-p 4040:4040 \
quay.io/martinhelmich/prometheus-nginxlog-exporter \
mnt/nginxlogs/access.log
----
Command-line flags and arguments can simply be appended to the `docker run` command, for example to use a
configuration file:
[source]
----
$ docker run \
--name nginx-exporter \
-p 4040:4040 \
-v /path/to/logs:/mnt/nginxlogs \
-v /path/to/config.hcl:/etc/prometheus-nginxlog-exporter.hcl \
quay.io/martinhelmich/prometheus-nginxlog-exporter \
-config-file /etc/prometheus-nginxlog-exporter.hcl
----
=== DEB and RPM packages
Each https://github.com/martin-helmich/prometheus-nginxlog-exporter/releases[release]
from 1.5.1 or newer provides both DEB and RPM packages.
DEB:
$ wget https://github.com/martin-helmich/prometheus-nginxlog-exporter/releases/download/v1.9.2/prometheus-nginxlog-exporter_1.9.2_linux_amd64.deb
$ apt install ./prometheus-nginxlog-exporter_1.9.2_linux_amd64.deb
RPM:
$ wget https://github.com/martin-helmich/prometheus-nginxlog-exporter/releases/download/v1.9.2/prometheus-nginxlog-exporter_1.9.2_linux_amd64.rpm
$ yum localinstall prometheus-nginxlog-exporter_1.9.2_linux_amd64.rpm
The package come with a dependency on systemd and configure the exporter to be
running automatically:
$ systemctl status prometheus-nginxlog-exporter
$ # systemctl disable prometheus-nginxlog-exporter
$ # systemctl enable prometheus-nginxlog-exporter
The packages drop a configuration file to `/etc/prometheus-nginxlog-exporter.hcl`
which you can adjust to your own needs.
### Manual installation, with systemd
If you do not want to use one of the pre-built packages, you can download the
binary itself and manually configure systemd to start it. You can find an
example unit file for this service
https://github.com/martin-helmich/prometheus-nginxlog-exporter/blob/master/res/package/prometheus-nginxlog-exporter.service[in this repository].
Simply copy the unit file to `/etc/systemd/system`:
$ wget -O /etc/systemd/system/prometheus-nginxlog-exporter.service https://raw.githubusercontent.com/martin-helmich/prometheus-nginxlog-exporter/master/res/package/prometheus-nginxlog-exporter.service
$ systemctl enable prometheus-nginxlog-exporter
$ systemctl start prometheus-nginxlog-exporter
The shipped unit file expects the binary to be located in
`/usr/sbin/prometheus-nginxlog-exporter` (if you sideload the exporter without
using your package manager, you might want to put it to `/usr/local`, instead)
and the configuration file in `/etc/prometheus-nginxlog-exporter.hcl`. Adjust
to your own needs.
### Kubernetes
If you run a logfile-generating service (be it NGINX, or anything that generates
similar access log files) in Kubernetes, you can run the exporter as a sidecar
along your "main" container within the same pod.
The following example shows you how to deploy the exporter as a sidecar,
accepting logs from the main container via syslog:
[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
name: nginx-example
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "4040"
spec:
containers:
- name: web
image: nginx
# ...
- name: exporter
image: docker.pkg.github.com/martin-helmich/prometheus-nginxlog-exporter/exporter:v1
args: ["-config-file", "/etc/prometheus-nginxlog-exporter/config.hcl"]
volumeMounts:
- name: exporter-config
mountPath: /etc/prometheus-nginxlog-exporter
volumes:
- name: exporter-config
configMap:
name: exporter-config
----
In this example, the configuration file is passed via the `exporter-config`
ConfigMap. This might look like follows:
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
name: exporter-config
data:
config.hcl: |
listen {
port = 4040
}
namespace "nginx" {
source = {
syslog {
listen_address = "udp://127.0.0.1:5531"
format = "rfc3164"
}
}
format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""
labels {
app = "default"
}
}
----
The config file instructs the exporter to accept log input via syslog. To
forward logs to the exporter, just instruct your main container to send its
access logs via syslog to `127.0.0.1:5531` (which works, since the main
container and the sidecar share their network namespace).
### Build from source
To build the exporter from source, simply build it with `go get`:
$ go get github.com/martin-helmich/prometheus-nginxlog-exporter
Alternatively, clone this repository and just run `go build`:
$ git clone https://github.com/martin-helmich/prometheus-nginxlog-exporter.git
$ cd prometheus-nginxlog-exporter
$ go build
== Collected metrics
This exporter collects the following metrics. This collector can listen on
multiple log files at once and publish metrics in different namespaces. Each
metric uses the labels `method` (containing the HTTP request method) and
`status` (containing the HTTP status code).
[IMPORTANT]
====
Keep in mind that some of these metrics will require certain values to be present
in your access log format (for example, the `http_upstream_time_seconds` metric
will require your access to contain the variable `$upstream_response_time`.
====
Metrics are exported at the `/metrics` path.
These metrics are exported:
|===
| `<namespace>_http_response_count_total` | The total amount of processed HTTP requests/responses.
| `<namespace>_http_response_size_bytes` | The total amount of transferred content in bytes.
| `<namespace>_http_request_size_bytes` | The total amount of received traffic in bytes. This metrics requires the `$request_length` variable in the log format.
| `<namespace>_http_upstream_time_seconds` | A summary vector of the upstream response times in seconds. Logging these needs to be specifically enabled in NGINX using the `$upstream_response_time` variable in the log format.
| `<namespace>_http_upstream_time_seconds_hist` | Same as `<namespace>_http_upstream_time_seconds`, but as a histogram vector. Also requires the `$upstream_response_time` variable in the log format.
| `<namespace>_http_response_time_seconds` | A summary vector of the total response times in seconds. Logging these needs to be specifically enabled in NGINX using the `$request_time` variable in the log format.
| `<namespace>_http_response_time_seconds_hist` | Same as `<namespace>_http_response_time_seconds`, but as a histogram vector. Also requires the `$request_time` variable in the log format.
|===
Additional labels can be configured in the configuration file (see below).
`<namespace>` can be omitted or overridden - see <<Namespace-as-labels>> for
more information.
== Configuration file
You can specify a configuration file to read at startup. The configuration file
is expected to be either in https://github.com/hashicorp/hcl[HCL] or YAML format. Here's an example file:
[source,hcl]
----
listen {
port = 4040
address = "10.1.2.3"
metrics_endpoint = "/metrics"
}
consul {
enable = true
address = "localhost:8500"
datacenter = "dc1"
scheme = "http"
token = ""
service {
id = "nginx-exporter"
name = "nginx-exporter"
address = "192.168.3.1"
tags = ["foo", "bar"]
}
}
namespace "app1" {
format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""
source {
files = [
"/var/log/nginx/app1/access.log"
]
}
# log can be printed to std out, e.g. for debugging purposes (disabled by default)
print_log = false
# metrics_override = { prefix = "myprefix" }
# namespace_label = "vhost"
labels {
app = "application-one"
environment = "production"
foo = "bar"
}
histogram_buckets = [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]
}
namespace "app2" {
format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\" $upstream_response_time"
source {
files = [
"/var/log/nginx/app2/access.log"
]
}
}
----
The same file as YAML file:
[source,yaml]
----
listen:
port: 4040
address: "10.1.2.3"
metrics_endpoint: "/metrics"
consul:
enable: true
address: "localhost:8500"
datacenter: dc1
scheme: http
token: ""
service:
id: "nginx-exporter"
name: "nginx-exporter"
address = "192.168.3.1"
tags: ["foo", "bar"]
namespaces:
- name: app1
format: "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""
source:
files:
- /var/log/nginx/app1/access.log
# metrics_override:
# prefix: "myprefix"
# namespace_label: "vhost"
labels:
app: "application-one"
environment: "production"
foo: "bar"
histogram_buckets: [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]
- name: app2
format: "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\" $upstream_response_time"
source:
files:
- /var/log/nginx/app2/access.log
----
Advanced features
-----------------
### Namespace as labels
For historic reasons, this exporter exports separate metrics for different
namespaces (because the namespace is part of the metric name). However, in many
(most) cases, it's more convenient to have the same metric name across different
namespaces (with different log formats and names).
This can be done in two steps:
1. Override Prometheus metrics namespace to some common prefix (`metrics_override`)
2. Set label name for nginxlog-exporter's config namespace (`namespace_label`)
[source,hcl]
----
namespace "app1" {
...
metrics_override = { prefix = "myprefix" }
namespace_label = "vhost"
...
}
namespace "app2" {
...
metrics_override = { prefix = "myprefix" }
namespace_label = "vhost"
...
}
----
Exported metrics will have the following format:
[source]
----
myprefix_http_response_count_total{vhost="app1", ...}
myprefix_http_response_count_total{vhost="app2", ...}
...
----
* `prefix` can be set to `""`, resulting metrics like `http_response_count_total{...}`
* `namespace_label` can be omitted - so you have full control on metric format
Some details and history on this can be found in https://github.com/martin-helmich/prometheus-nginxlog-exporter/issues/13[issue #13].
### Custom labels pass-through
Partial case of <<Dynamic-re-labeling>>:
[source,hcl]
----
namespace "app1" {
format = "$remote_addr - $remote_user [$time_local] ... \"$geoip_country_code\" $upstream_addr"
...
relabel "upstream_addr" { from = "upstream_addr" }
relabel "country" { from = "geoip_country_code" }
...
}
----
Exported metrics will have `upstream_addr` and `country` labels.
### Log sources
Currently, the exporter supports reading log data from
1. files
2. syslog
All log sources can be configured on a per-namespace basis using the `source` property.
#### Reading from files
When reading from log files, all that is needed is a `files` property:
```hcl
namespace "test" {
source {
files = ["/var/log/nginx/access.log"]
// ...
}
}
```
#### Reading from syslog
The exporter can also open and listen on a Syslog port and read logs from there. Configuration works as follows:
[source,hcl]
----
namespace "test" {
source {
syslog {
listen_address = "udp://127.0.0.1:8514" <1>
format = "rfc3164" <2>
tags = ["nginx"] <3>
}
// ...
}
}
----
<1> The `listen_address` might be either a TCP or UDP address. UNIX sockets are not supported (yet -- pull requests are welcome)
<2> The `format` may be one of `rfc3164`, `rfc5424`, `rfc6587` or `auto`. If omitted, it will default to `auto`
<3> The `tags` must be specified.
Have a look at http://nginx.org/en/docs/syslog.html[the respective section of the NGINX documentation] on how to set up NGINX to log into syslog.
### Dynamic re-labeling
Re-labeling lets you add arbitrary fields from the parsed log line as labels to your metrics.
To add a dynamic label, add a `relabel` statement to your configuration file:
[source,hcl]
----
namespace "app-1" {
// ...
relabel "host" {
from = "server_name"
whitelist = [ <1>
"host-a.com",
"host-b.de"
]
}
}
----
<1> The `whitelist` property is optional; if set, only the supplied values will be added as label.
All other values will be subsumed under the `"other"` label value. See #16 for a more detailed
discussion around the reasoning.
Dynamic relabeling also allows you to aggregate your metrics by request path (which replaces
the experimental feature originally introduced in #23). The following example splits the content of
the `request` variable at every space (using `split`) and return the second element (index 1) of the
resulting list which is the base for the regex):
[source,hcl]
----
namespace "app1" {
// ...
relabel "request_uri" {
from = "request"
split = 2
separator = " " // <1>
// if enabled, only include label in response count metric (default is false)
only_counter = false
match "^/users/[0-9]+" {
replacement = "/users/:id"
}
match "^/profile" {
replacement = "/profile"
}
}
}
----
<1> The `separator` property is optional; if omitted, the space character (`" "`) will be assumed as separator.
If a match is found, the `replacement` replaces each occurrence of the corresponding match in the original value. Otherwise the processing continues to check the following match statements.
The YAML configuration for relabelings works similar to the HCL configuration:
[source,yaml]
----
namespaces:
- name: app1
relabel_configs:
- target_label: request_uri
from: request
split: 2
separator: ' '
matches:
- regexp: "^/users/[0-9]+"
replacement: "/users/:id"
----
If your regular expression contains groups, you can also use the matched values of those in the `replacement` value:
[source,hcl]
----
relabel "request_uri" {
from = "request"
split = 2
match "^/(users|profiles)/[0-9]+" {
replacement = "/$1/:id"
}
}
----
### File Globs
You can specify one or more wildcards in the source file names, in which case the wildcards will be resolved to the corresponding list of files at startup of the exporter.
Be aware that the list of matches is only evaluated at the start of the program. If a new file is added with a match of one glob filter, you'll have to restart the program for it to be monitored.
Given a config like this:
[source,hcl]
---
namespace "test" {
source {
files = ["/var/log/nginx/*_access.log"]
// ...
}
}
---
And a folder containing these files:
```bash
# /var/log/nginx
main_access.log
main_error.log
virtualhost1_access.log
virtualhost1_error.log
```
The list of files monitored by this namespace will be `/var/log/nginx/main_access.log,/var/log/nginx/virtualhost1_access.log`.
### JSON log_format
You can use the JSON parser by setting the `--parser` command line flag or `parser` config file property to `json`.
== Frequently Asked Questions
> I have started the exporter, but it is not exporting any application-specific metrics!
This may have several issues:
1. Make sure that the access log files that your exporter is listening on are present. The exporter will exit with an error code if a file is present but cannot be opened (for example, due to bad permissions), but will _wait_ for a file if it does not yet exist.
2. Make sure that the exporter can parse the lines from your access log files. Pay attention to the `<namespace>_parse_errors_total` metric, which will indicate how many log lines could not be parsed.
> The exporter exports the `<namespace>_http_response_count_total` metric, but not _[other metric that is mentioned in the README]_!
Most metrics require certain values to be present in the access log files that are not present in the NGINX default configuration. Especially, make sure that the access log contains the http://nginx.org/en/docs/http/ngx_http_upstream_module.html#var_upstream_response_time[`$upstream_response_time`], http://nginx.org/en/docs/http/ngx_http_log_module.html#var_request_time[`$request_time`] and/or http://nginx.org/en/docs/http/ngx_http_core_module.html#variables[`$body_bytes_sent`] variables. These need to be enabled in the NGINX configuration (more precisely, the `log_format` setting) and then added to the format specified for the exporter.
> How can I configure NGINX to export these variables?
Have a look at NGINX's https://www.nginx.com/resources/admin-guide/logging-and-monitoring/[Logging and Monitoring] guide. It contains some good examples that contain the `$request_time` and `$upstream_response_time`:
```
log_format upstream_time '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"'
'rt=$request_time uct="$upstream_connect_time" uht="$upstream_header_time" urt="$upstream_response_time"';
```
Credits
-------
- https://github.com/hpcloud/tail[tail], MIT license
- https://github.com/satyrius/gonx[gonx], MIT license
- https://github.com/prometheus/client_golang[Prometheus Go client library], Apache License
- https://github.com/hashicorp/hcl[HashiCorp configuration language], Mozilla Public License