src/go/plugin/go.d/modules/weblog/metadata.yaml
plugin_name: go.d.plugin
modules:
- meta:
id: collector-go.d.plugin-web_log
plugin_name: go.d.plugin
module_name: web_log
monitored_instance:
name: Web server log files
link: ""
categories:
- data-collection.web-servers-and-web-proxies
icon_filename: webservers.svg
keywords:
- webserver
- apache
- httpd
- nginx
- lighttpd
- logs
most_popular: false
info_provided_to_referring_integrations:
description: ""
related_resources:
integrations:
list: []
overview:
data_collection:
metrics_description: |
This collector monitors web servers by parsing their log files.
method_description: ""
default_behavior:
auto_detection:
description: |
It automatically detects log files of web servers running on localhost.
limits:
description: ""
performance_impact:
description: ""
additional_permissions:
description: ""
multi_instance: true
supported_platforms:
include: []
exclude: []
setup:
prerequisites:
list: []
configuration:
file:
name: go.d/web_log.conf
options:
description: |
Weblog can parse and interpret the following fields (**known fields**):
> [nginx](https://nginx.org/en/docs/varindex.html)
>
> [apache](https://httpd.apache.org/docs/current/mod/mod_log_config.html)
| nginx | apache | description |
|-------------------------|----------|------------------------------------------------------------------------------------------|
| $host ($http_host) | %v | Name of the server which accepted a request. |
| $server_port | %p | Port of the server which accepted a request. |
| $scheme | - | Request scheme. "http" or "https". |
| $remote_addr | %a (%h) | Client address. |
| $request | %r | Full original request line. The line is "$request_method $request_uri $server_protocol". |
| $request_method | %m | Request method. Usually "GET" or "POST". |
| $request_uri | %U | Full original request URI. |
| $server_protocol | %H | Request protocol. Usually "HTTP/1.0", "HTTP/1.1", or "HTTP/2.0". |
| $status | %s (%>s) | Response status code. |
| $request_length | %I | Bytes received from a client, including request and headers. |
| $bytes_sent             | %O       | Bytes sent to a client, including the response header and body.                          |
| $body_bytes_sent | %B (%b) | Bytes sent to a client, not counting the response header. |
| $request_time | %D | Request processing time. |
| $upstream_response_time | - | Time spent on receiving the response from the upstream server. |
| $ssl_protocol | - | Protocol of an established SSL connection. |
| $ssl_cipher | - | String of ciphers used for an established SSL connection. |
Notes:
- Apache `%h` logs the IP address if [HostnameLookups](https://httpd.apache.org/docs/2.4/mod/core.html#hostnamelookups) is Off. The web log collector counts hostnames as IPv4 addresses, so we recommend either disabling HostnameLookups or using `%a` instead of `%h`.
- Since httpd 2.0, unlike 1.3, the `%b` and `%B` format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. It will differ, for instance, if the connection is aborted, or if SSL is used. The `%O` format provided by [`mod_logio`](https://httpd.apache.org/docs/2.4/mod/mod_logio.html) will log the actual number of bytes sent over the network.
- To get `%I` and `%O` working you need to enable `mod_logio` on Apache.
- NGINX logs the URI with query parameters, Apache doesn't.
- `$request` is parsed into `$request_method`, `$request_uri` and `$server_protocol`. If your log format includes `$request`, there is no need to log the other three separately.
- Don't use both `$bytes_sent` and `$body_bytes_sent` (`%O` and `%B` or `%b`). The module does not distinguish between these parameters.
folding:
title: Config options
enabled: true
list:
- name: update_every
description: Data collection frequency.
default_value: 1
required: false
- name: autodetection_retry
description: Recheck interval in seconds. Zero means no recheck will be scheduled.
default_value: 0
required: false
- name: path
description: Path to the web server log file.
default_value: ""
required: true
- name: exclude_path
description: Path to exclude.
default_value: "*.gz"
required: false
- name: url_patterns
description: List of URL patterns.
default_value: "[]"
required: false
detailed_description: |
"URL pattern" scope metrics will be collected for each URL pattern.
Option syntax:
```yaml
url_patterns:
  - name: name1
    pattern: pattern1
  - name: name2
    pattern: pattern2
```
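
As a concrete sketch, the fragment below groups requests into two hypothetical patterns. The names `api` and `static`, the log path, and the expressions are illustrative assumptions. The `~` (regular expression) and `*` (glob) prefixes are assumed to follow the matcher syntax referenced by `url_patterns.pattern`, so verify them against that page before relying on them.

```yaml
path: /var/log/nginx/access.log   # assumed log location
url_patterns:
  - name: api                     # used as the dimension name
    pattern: '~ ^/api/'           # assumed regexp form: URIs starting with /api/
  - name: static
    pattern: '* /static/*'        # assumed glob form: URIs under /static/
```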
- name: url_patterns.name
description: Used as a dimension name.
default_value: ""
required: true
- name: url_patterns.pattern
description: Matched against the full original request URI. For the pattern syntax, see [matcher](https://github.com/netdata/netdata/tree/master/src/go/plugin/go.d/pkg/matcher#supported-format).
default_value: ""
required: true
- name: log_type
description: Log parser type.
default_value: auto
required: false
detailed_description: |
Weblog supports 5 different log parsers:
| Parser type | Description |
|-------------|-------------------------------------------|
| auto | Use CSV and auto-detect format |
| csv         | Comma-separated values                    |
| json | [JSON](https://www.json.org/json-en.html) |
| ltsv | [LTSV](http://ltsv.org/) |
| regexp | Regular expression with named groups |
Syntax:
```yaml
log_type: auto
```
If the `log_type` parameter is set to `auto` (the default), weblog tries to auto-detect the appropriate log parser and log format using the last line of the log file. It:
- checks if format is `LTSV` (using regexp).
- checks if format is `JSON` (using regexp).
- assumes format is `CSV` and tries to find an appropriate `CSV` log format in a predefined list of formats. It tries to parse the line with each of them in the following order (the first one that matches is used from then on):
```sh
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_length $request_time $upstream_response_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_length $request_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_length $request_time $upstream_response_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_length $request_time
$remote_addr - - [$time_local] "$request" $status $body_bytes_sent
```
If you're using the default Apache/NGINX log format, auto-detection will work for you. If it doesn't, you need to set the format manually.
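
When auto-detection fails, a manual CSV format can be sketched as below. This is an illustration under assumptions: the path and the field order are hypothetical and must mirror your server's actual log format, and the delimiter is set explicitly to a space because this example format is space-separated.

```yaml
path: /var/log/nginx/access.log   # assumed log location
log_type: csv
csv_config:
  delimiter: ' '                  # fields in this example are separated by spaces
  # Hypothetical format, written in the same style as the predefined formats above.
  format: '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_time'
```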
- name: csv_config
description: CSV log parser config.
default_value: ""
required: false
- name: csv_config.delimiter
description: CSV field delimiter.
default_value: ","
required: false
- name: csv_config.format
description: CSV log format.
default_value: ""
required: false
detailed_description: ""
- name: ltsv_config
description: LTSV log parser config.
default_value: ""
required: false
- name: ltsv_config.field_delimiter
description: LTSV field delimiter.
default_value: "\\t"
required: false
- name: ltsv_config.value_delimiter
description: LTSV value delimiter.
default_value: ":"
required: false
- name: ltsv_config.mapping
description: LTSV fields mapping to **known fields**.
default_value: ""
required: true
detailed_description: |
The mapping is a dictionary where the key is a field, as in logs, and the value is the corresponding **known field**.
> **Note**: don't use `$` and `%` prefixes for mapped field names.
```yaml
log_type: ltsv
ltsv_config:
  mapping:
    label1: field1
    label2: field2
```
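
For illustration, suppose each log line carries the labels `host`, `req`, `status`, and `reqtime`. These label names are hypothetical and depend entirely on how your server writes its LTSV log. A mapping to known fields might then look like this sketch:

```yaml
log_type: ltsv
ltsv_config:
  mapping:
    host: remote_addr        # known fields are written without the `$` prefix
    req: request
    status: status
    reqtime: request_time
```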
- name: json_config
description: JSON log parser config.
default_value: ""
required: false
- name: json_config.mapping
description: JSON fields mapping to **known fields**.
default_value: ""
required: true
detailed_description: |
The mapping is a dictionary where the key is a field, as in logs, and the value is the corresponding **known field**.
> **Note**: don't use `$` and `%` prefixes for mapped field names.
```yaml
log_type: json
json_config:
  mapping:
    label1: field1
    label2: field2
```
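
As a sketch, assume the server writes JSON objects with the keys `client_ip`, `method`, `uri`, `code`, and `duration`. These key names are hypothetical and must be replaced with the ones in your log. The mapping could then be written as:

```yaml
log_type: json
json_config:
  mapping:
    client_ip: remote_addr     # known fields are written without the `$` prefix
    method: request_method
    uri: request_uri
    code: status
    duration: request_time
```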
- name: regexp_config
description: RegExp log parser config.
default_value: ""
required: false
- name: regexp_config.pattern
description: RegExp pattern with named groups.
default_value: ""
required: true
detailed_description: |
Use a pattern with named subexpressions. The names should be **known fields**.
> **Note**: don't use `$` and `%` prefixes for mapped field names.
Syntax:
```yaml
log_type: regexp
regexp_config:
  pattern: PATTERN
```
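
A possible pattern for a common access log line is sketched below. It assumes Go regular expression syntax, where named groups are written as `(?P<name>...)`, and a hypothetical line layout of client address, timestamp in brackets, quoted request, status, and a numeric body size. Adjust it to your actual log lines.

```yaml
log_type: regexp
regexp_config:
  # Group names are known fields without the `$` prefix.
  # The bracketed timestamp is matched but not captured.
  pattern: '^(?P<remote_addr>[^ ]+) - - \[[^\]]+\] "(?P<request>[^"]+)" (?P<status>[0-9]+) (?P<body_bytes_sent>[0-9]+)$'
```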
examples:
folding:
title: Config
enabled: true
list: []
troubleshooting:
problems:
list: []
alerts:
- name: web_log_1m_unmatched
metric: web_log.excluded_requests
info: percentage of unparsed log lines over the last minute
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
- name: web_log_1m_requests
metric: web_log.type_requests
info: "ratio of successful HTTP requests over the last minute (1xx, 2xx, 304, 401)"
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
- name: web_log_1m_redirects
metric: web_log.type_requests
info: "ratio of redirection HTTP requests over the last minute (3xx except 304)"
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
- name: web_log_1m_bad_requests
metric: web_log.type_requests
info: "ratio of client error HTTP requests over the last minute (4xx except 401)"
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
- name: web_log_1m_internal_errors
metric: web_log.type_requests
info: "ratio of server error HTTP requests over the last minute (5xx)"
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
- name: web_log_web_slow
metric: web_log.request_processing_time
info: average HTTP response time over the last minute
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
- name: web_log_5m_requests_ratio
metric: web_log.type_requests
info: ratio of successful HTTP requests over the last 5 minutes, compared with the previous 5 minutes
link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
metrics:
folding:
title: Metrics
enabled: false
description: ""
availability: []
scopes:
- name: global
description: These metrics refer to the entire monitored application.
labels: []
metrics:
- name: web_log.requests
description: Total Requests
unit: requests/s
chart_type: line
dimensions:
- name: requests
- name: web_log.excluded_requests
description: Excluded Requests
unit: requests/s
chart_type: stacked
dimensions:
- name: unmatched
- name: web_log.type_requests
description: Requests By Type
unit: requests/s
chart_type: stacked
dimensions:
- name: success
- name: bad
- name: redirect
- name: error
- name: web_log.status_code_class_responses
description: Responses By Status Code Class
unit: responses/s
chart_type: stacked
dimensions:
- name: 1xx
- name: 2xx
- name: 3xx
- name: 4xx
- name: 5xx
- name: web_log.status_code_class_1xx_responses
description: Informational Responses By Status Code
unit: responses/s
chart_type: stacked
dimensions:
- name: a dimension per 1xx code
- name: web_log.status_code_class_2xx_responses
description: Successful Responses By Status Code
unit: responses/s
chart_type: stacked
dimensions:
- name: a dimension per 2xx code
- name: web_log.status_code_class_3xx_responses
description: Redirects Responses By Status Code
unit: responses/s
chart_type: stacked
dimensions:
- name: a dimension per 3xx code
- name: web_log.status_code_class_4xx_responses
description: Client Errors Responses By Status Code
unit: responses/s
chart_type: stacked
dimensions:
- name: a dimension per 4xx code
- name: web_log.status_code_class_5xx_responses
description: Server Errors Responses By Status Code
unit: responses/s
chart_type: stacked
dimensions:
- name: a dimension per 5xx code
- name: web_log.bandwidth
description: Bandwidth
unit: kilobits/s
chart_type: area
dimensions:
- name: received
- name: sent
- name: web_log.request_processing_time
description: Request Processing Time
unit: milliseconds
chart_type: line
dimensions:
- name: min
- name: max
- name: avg
- name: web_log.requests_processing_time_histogram
description: Requests Processing Time Histogram
unit: requests/s
chart_type: line
dimensions:
- name: a dimension per bucket
- name: web_log.upstream_response_time
description: Upstream Response Time
unit: milliseconds
chart_type: line
dimensions:
- name: min
- name: max
- name: avg
- name: web_log.upstream_responses_time_histogram
description: Upstream Responses Time Histogram
unit: requests/s
chart_type: line
dimensions:
- name: a dimension per bucket
- name: web_log.current_poll_uniq_clients
description: Current Poll Unique Clients
unit: clients
chart_type: stacked
dimensions:
- name: ipv4
- name: ipv6
- name: web_log.vhost_requests
description: Requests By Vhost
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per vhost
- name: web_log.port_requests
description: Requests By Port
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per port
- name: web_log.scheme_requests
description: Requests By Scheme
unit: requests/s
chart_type: stacked
dimensions:
- name: http
- name: https
- name: web_log.http_method_requests
description: Requests By HTTP Method
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per HTTP method
- name: web_log.http_version_requests
description: Requests By HTTP Version
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per HTTP version
- name: web_log.ip_proto_requests
description: Requests By IP Protocol
unit: requests/s
chart_type: stacked
dimensions:
- name: ipv4
- name: ipv6
- name: web_log.ssl_proto_requests
description: Requests By SSL Connection Protocol
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per SSL protocol
- name: web_log.ssl_cipher_suite_requests
description: Requests By SSL Connection Cipher Suite
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per SSL cipher suite
- name: web_log.url_pattern_requests
description: URL Field Requests By Pattern
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per URL pattern
- name: web_log.custom_field_pattern_requests
description: Custom Field Requests By Pattern
unit: requests/s
chart_type: stacked
dimensions:
- name: a dimension per custom field pattern
- name: custom time field
description: TBD
labels: []
metrics:
- name: web_log.custom_time_field_summary
description: Custom Time Field Summary
unit: milliseconds
chart_type: line
dimensions:
- name: min
- name: max
- name: avg
- name: web_log.custom_time_field_histogram
description: Custom Time Field Histogram
unit: observations
chart_type: line
dimensions:
- name: a dimension per bucket
- name: custom numeric field
description: TBD
labels: []
metrics:
- name: web_log.custom_numeric_field_{{field_name}}_summary
description: Custom Numeric Field Summary
unit: '{{units}}'
chart_type: line
dimensions:
- name: min
- name: max
- name: avg
- name: URL pattern
description: TBD
labels: []
metrics:
- name: web_log.url_pattern_status_code_responses
description: Responses By Status Code
unit: responses/s
chart_type: line
dimensions:
- name: a dimension per pattern
- name: web_log.url_pattern_http_method_requests
description: Requests By HTTP Method
unit: requests/s
chart_type: line
dimensions:
- name: a dimension per HTTP method
- name: web_log.url_pattern_bandwidth
description: Bandwidth
unit: kilobits/s
chart_type: area
dimensions:
- name: received
- name: sent
- name: web_log.url_pattern_request_processing_time
description: Request Processing Time
unit: milliseconds
chart_type: line
dimensions:
- name: min
- name: max
- name: avg