firehol/netdata

View on GitHub
src/go/plugin/go.d/modules/weblog/metadata.yaml

Summary

Maintainability
Test Coverage
plugin_name: go.d.plugin
modules:
  - meta:
      id: collector-go.d.plugin-web_log
      plugin_name: go.d.plugin
      module_name: web_log
      monitored_instance:
        name: Web server log files
        link: ""
        categories:
          - data-collection.web-servers-and-web-proxies
        icon_filename: webservers.svg
      keywords:
        - webserver
        - apache
        - httpd
        - nginx
        - lighttpd
        - logs
      most_popular: false
      info_provided_to_referring_integrations:
        description: ""
      related_resources:
        integrations:
          list: []
    overview:
      data_collection:
        metrics_description: |
          This collector monitors web servers by parsing their log files.
        method_description: ""
      default_behavior:
        auto_detection:
          description: |
            It automatically detects log files of web servers running on localhost.
        limits:
          description: ""
        performance_impact:
          description: ""
      additional_permissions:
        description: ""
      multi_instance: true
      supported_platforms:
        include: []
        exclude: []
    setup:
      prerequisites:
        list: []
      configuration:
        file:
          name: go.d/web_log.conf
        options:
          description: |
            Weblog is aware of how to parse and interpret the following fields (**known fields**):
            
            > [nginx](https://nginx.org/en/docs/varindex.html)
            >
            > [apache](https://httpd.apache.org/docs/current/mod/mod_log_config.html)
            
            | nginx                   | apache   | description                                                                              |
            |-------------------------|----------|------------------------------------------------------------------------------------------|
            | $host ($http_host)      | %v       | Name of the server which accepted a request.                                             |
            | $server_port            | %p       | Port of the server which accepted a request.                                             |
            | $scheme                 | -        | Request scheme. "http" or "https".                                                       |
            | $remote_addr            | %a (%h)  | Client address.                                                                          |
            | $request                | %r       | Full original request line. The line is "$request_method $request_uri $server_protocol". |
            | $request_method         | %m       | Request method. Usually "GET" or "POST".                                                 |
            | $request_uri            | %U       | Full original request URI.                                                               |
            | $server_protocol        | %H       | Request protocol. Usually "HTTP/1.0", "HTTP/1.1", or "HTTP/2.0".                         |
            | $status                 | %s (%>s) | Response status code.                                                                    |
            | $request_length         | %I       | Bytes received from a client, including request and headers.                             |
            | $bytes_sent             | %O       | Bytes sent to a client, including request and headers.                                   |
            | $body_bytes_sent        | %B (%b)  | Bytes sent to a client, not counting the response header.                                |
            | $request_time           | %D       | Request processing time.                                                                 |
            | $upstream_response_time | -        | Time spent on receiving the response from the upstream server.                           |
            | $ssl_protocol           | -        | Protocol of an established SSL connection.                                               |
            | $ssl_cipher             | -        | String of ciphers used for an established SSL connection.                                |

            Notes:
            
            - Apache `%h` logs the IP address if [HostnameLookups](https://httpd.apache.org/docs/2.4/mod/core.html#hostnamelookups) is Off. The web log collector counts hostnames as IPv4 addresses. We recommend either to disable HostnameLookups or use `%a` instead of `%h`.
            - Since httpd 2.0, unlike 1.3, the `%b` and `%B` format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. It will differ, for instance, if the connection is aborted, or if SSL is used. The `%O` format provided by [`mod_logio`](https://httpd.apache.org/docs/2.4/mod/mod_logio.html) will log the actual number of bytes sent over the network.
            - To get `%I` and `%O` working you need to enable `mod_logio` on Apache.
            - NGINX logs URI with query parameters, Apache doesnt.
            - `$request` is parsed into `$request_method`, `$request_uri` and `$server_protocol`. If you have `$request` in your log format, there is no sense to have others.
            - Don't use both `$bytes_sent` and `$body_bytes_sent` (`%O` and `%B` or `%b`). The module does not distinguish between these parameters.
          folding:
            title: Config options
            enabled: true
          list:
            - name: update_every
              description: Data collection frequency.
              default_value: 1
              required: false
            - name: autodetection_retry
              description: Recheck interval in seconds. Zero means no recheck will be scheduled.
              default_value: 0
              required: false
            - name: path
              description: Path to the web server log file.
              default_value: ""
              required: true
            - name: exclude_path
              description: Path to exclude.
              default_value: "*.gz"
              required: false
            - name: url_patterns
              description: List of URL patterns.
              default_value: "[]"
              required: false
              detailed_description: |
                "URL pattern" scope metrics will be collected for each URL pattern. 

                Option syntax:
                
                ```yaml
                url_patterns:
                  - name: name1
                    pattern: pattern1
                  - name: name2
                    pattern: pattern2
                ```
            - name: url_patterns.name
              description: Used as a dimension name.
              default_value: ""
              required: true
            - name: url_patterns.pattern
              description: Used to match against full original request URI. Pattern syntax in [matcher](https://github.com/netdata/netdata/tree/master/src/go/plugin/go.d/pkg/matcher#supported-format).
              default_value: ""
              required: true
            - name: log_type
              description: Log parser type.
              default_value: auto
              required: false
              detailed_description: |
                Weblog supports 5 different log parsers:

                | Parser type | Description                               |
                |-------------|-------------------------------------------|
                | auto        | Use CSV and auto-detect format            |
                | csv         | A comma-separated values                  |
                | json        | [JSON](https://www.json.org/json-en.html) |
                | ltsv        | [LTSV](http://ltsv.org/)                  |
                | regexp      | Regular expression with named groups      |
                
                Syntax:

                ```yaml
                log_type: auto
                ```

                If `log_type` parameter set to `auto` (which is default), weblog will try to auto-detect appropriate log parser and log format using the last line of the log file.
                
                - checks if format is `CSV` (using regexp).
                - checks if format is `JSON` (using regexp).
                - assumes format is `CSV` and tries to find appropriate `CSV` log format using predefined list of formats. It tries to parse the line using each of them in the following order (the first one matches is used later):
                
                  ```sh
                  $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
                  $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
                  $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
                  $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
                  $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
                  $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
                  $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
                  $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
                  $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
                  $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
                  ```
                
                  If you're using the default Apache/NGINX log format, auto-detect will work for you. If it doesn't work you need to set the format manually.
            - name: csv_config
              description: CSV log parser config.
              default_value: ""
              required: false
            - name: csv_config.delimiter
              description: CSV field delimiter.
              default_value: ","
              required: false
            - name: csv_config.format
              description: CSV log format.
              default_value: ""
              required: false
              detailed_description: ""
            - name: ltsv_config
              description: LTSV log parser config.
              default_value: ""
              required: false
            - name: ltsv_config.field_delimiter
              description: LTSV field delimiter.
              default_value: "\\t"
              required: false
            - name: ltsv_config.value_delimiter
              description: LTSV value delimiter.
              default_value: ":"
              required: false
            - name: ltsv_config.mapping
              description: LTSV fields mapping to **known fields**.
              default_value: ""
              required: true
              detailed_description: |
                The mapping is a dictionary where the key is a field, as in logs, and the value is the corresponding **known field**.

                > **Note**: don't use `$` and `%` prefixes for mapped field names.

                ```yaml
                log_type: ltsv
                ltsv_config:
                  mapping:
                    label1: field1
                    label2: field2
                ```
            - name: json_config
              description: JSON log parser config.
              default_value: ""
              required: false
            - name: json_config.mapping
              description: JSON fields mapping to **known fields**.
              default_value: ""
              required: true
              detailed_description: |
                The mapping is a dictionary where the key is a field, as in logs, and the value is the corresponding **known field**.

                > **Note**: don't use `$` and `%` prefixes for mapped field names.

                ```yaml
                log_type: json
                json_config:
                  mapping:
                    label1: field1
                    label2: field2
                ```
            - name: regexp_config
              description: RegExp log parser config.
              default_value: ""
              required: false
            - name: regexp_config.pattern
              description: RegExp pattern with named groups.
              default_value: ""
              required: true
              detailed_description: |
                Use pattern with subexpressions names. These names should be **known fields**.
                
                > **Note**: don't use `$` and `%` prefixes for mapped field names.

                Syntax:

                ```yaml
                log_type: regexp
                regexp_config:
                  pattern: PATTERN
                ```
        examples:
          folding:
            title: Config
            enabled: true
          list: []
    troubleshooting:
      problems:
        list: []
    alerts:
      - name: web_log_1m_unmatched
        metric: web_log.excluded_requests
        info: percentage of unparsed log lines over the last minute
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
      - name: web_log_1m_requests
        metric: web_log.type_requests
        info: "ratio of successful HTTP requests over the last minute (1xx, 2xx, 304, 401)"
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
      - name: web_log_1m_redirects
        metric: web_log.type_requests
        info: "ratio of redirection HTTP requests over the last minute (3xx except 304)"
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
      - name: web_log_1m_bad_requests
        metric: web_log.type_requests
        info: "ratio of client error HTTP requests over the last minute (4xx except 401)"
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
      - name: web_log_1m_internal_errors
        metric: web_log.type_requests
        info: "ratio of server error HTTP requests over the last minute (5xx)"
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
      - name: web_log_web_slow
        metric: web_log.request_processing_time
        info: average HTTP response time over the last 1 minute
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
      - name: web_log_5m_requests_ratio
        metric: web_log.type_requests
        info: ratio of successful HTTP requests over over the last 5 minutes, compared with the previous 5 minutes
        link: https://github.com/netdata/netdata/blob/master/src/health/health.d/web_log.conf
    metrics:
      folding:
        title: Metrics
        enabled: false
      description: ""
      availability: []
      scopes:
        - name: global
          description: These metrics refer to the entire monitored application.
          labels: []
          metrics:
            - name: web_log.requests
              description: Total Requests
              unit: requests/s
              chart_type: line
              dimensions:
                - name: requests
            - name: web_log.excluded_requests
              description: Excluded Requests
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: unmatched
            - name: web_log.type_requests
              description: Requests By Type
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: success
                - name: bad
                - name: redirect
                - name: error
            - name: web_log.status_code_class_responses
              description: Responses By Status Code Class
              unit: responses/s
              chart_type: stacked
              dimensions:
                - name: 1xx
                - name: 2xx
                - name: 3xx
                - name: 4xx
                - name: 5xx
            - name: web_log.status_code_class_1xx_responses
              description: Informational Responses By Status Code
              unit: responses/s
              chart_type: stacked
              dimensions:
                - name: a dimension per 1xx code
            - name: web_log.status_code_class_2xx_responses
              description: Successful Responses By Status Code
              unit: responses/s
              chart_type: stacked
              dimensions:
                - name: a dimension per 2xx code
            - name: web_log.status_code_class_3xx_responses
              description: Redirects Responses By Status Code
              unit: responses/s
              chart_type: stacked
              dimensions:
                - name: a dimension per 3xx code
            - name: web_log.status_code_class_4xx_responses
              description: Client Errors Responses By Status Code
              unit: responses/s
              chart_type: stacked
              dimensions:
                - name: a dimension per 4xx code
            - name: web_log.status_code_class_5xx_responses
              description: Server Errors Responses By Status Code
              unit: responses/s
              chart_type: stacked
              dimensions:
                - name: a dimension per 5xx code
            - name: web_log.bandwidth
              description: Bandwidth
              unit: kilobits/s
              chart_type: area
              dimensions:
                - name: received
                - name: sent
            - name: web_log.request_processing_time
              description: Request Processing Time
              unit: milliseconds
              chart_type: line
              dimensions:
                - name: min
                - name: max
                - name: avg
            - name: web_log.requests_processing_time_histogram
              description: Requests Processing Time Histogram
              unit: requests/s
              chart_type: line
              dimensions:
                - name: a dimension per bucket
            - name: web_log.upstream_response_time
              description: Upstream Response Time
              unit: milliseconds
              chart_type: line
              dimensions:
                - name: min
                - name: max
                - name: avg
            - name: web_log.upstream_responses_time_histogram
              description: Upstream Responses Time Histogram
              unit: requests/s
              chart_type: line
              dimensions:
                - name: a dimension per bucket
            - name: web_log.current_poll_uniq_clients
              description: Current Poll Unique Clients
              unit: clients
              chart_type: stacked
              dimensions:
                - name: ipv4
                - name: ipv6
            - name: web_log.vhost_requests
              description: Requests By Vhost
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per vhost
            - name: web_log.port_requests
              description: Requests By Port
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per port
            - name: web_log.scheme_requests
              description: Requests By Scheme
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: http
                - name: https
            - name: web_log.http_method_requests
              description: Requests By HTTP Method
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per HTTP method
            - name: web_log.http_version_requests
              description: Requests By HTTP Version
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per HTTP version
            - name: web_log.ip_proto_requests
              description: Requests By IP Protocol
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: ipv4
                - name: ipv6
            - name: web_log.ssl_proto_requests
              description: Requests By SSL Connection Protocol
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per SSL protocol
            - name: web_log.ssl_cipher_suite_requests
              description: Requests By SSL Connection Cipher Suite
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per SSL cipher suite
            - name: web_log.url_pattern_requests
              description: URL Field Requests By Pattern
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per URL pattern
            - name: web_log.custom_field_pattern_requests
              description: Custom Field Requests By Pattern
              unit: requests/s
              chart_type: stacked
              dimensions:
                - name: a dimension per custom field pattern
        - name: custom time field
          description: TBD
          labels: []
          metrics:
            - name: web_log.custom_time_field_summary
              description: Custom Time Field Summary
              unit: milliseconds
              chart_type: line
              dimensions:
                - name: min
                - name: max
                - name: avg
            - name: web_log.custom_time_field_histogram
              description: Custom Time Field Histogram
              unit: observations
              chart_type: line
              dimensions:
                - name: a dimension per bucket
        - name: custom numeric field
          description: TBD
          labels: []
          metrics:
            - name: web_log.custom_numeric_field_{{field_name}}_summary
              description: Custom Numeric Field Summary
              unit: '{{units}}'
              chart_type: line
              dimensions:
                - name: min
                - name: max
                - name: avg
        - name: URL pattern
          description: TBD
          labels: []
          metrics:
            - name: web_log.url_pattern_status_code_responses
              description: Responses By Status Code
              unit: responses/s
              chart_type: line
              dimensions:
                - name: a dimension per pattern
            - name: web_log.url_pattern_http_method_requests
              description: Requests By HTTP Method
              unit: requests/s
              chart_type: line
              dimensions:
                - name: a dimension per HTTP method
            - name: web_log.url_pattern_bandwidth
              description: Bandwidth
              unit: kilobits/s
              chart_type: area
              dimensions:
                - name: received
                - name: sent
            - name: web_log.url_pattern_request_processing_time
              description: Request Processing Time
              unit: milliseconds
              chart_type: line
              dimensions:
                - name: min
                - name: max
                - name: avg