Splunk Enterprise

OpenTelemetry question: router operator and parsing metadata out of log.file.path

big6consultant
New Member

I'm having trouble getting parsing to work with a custom OTel collector config. The `log.file.path` should be in one of these two formats:

1. /splunk-otel/app-api-starter-project-template/app-api-starter-project-template-96bfdf8866-9jz7m/app-api-starter-project-template.log
2. /splunk-otel/app-api-starter-project-template/app-api-starter-project-template.log

The first includes the pod name; the second does not.

We are doing it this way so that we index only one application log file per set of directories, rather than picking up a ton of Kubernetes logs that we will never review but would still have to store.

The full OTel config is at the bottom.

We are noticing that regardless of which file path format (1 or 2) a log comes from, the router always takes the default route. The `catchall` attribute in Splunk then holds the value of `log.file.path`, which is always in the first format above (e.g. /splunk-otel/app-api-starter-project-template/app-api-starter-project-template-96bfdf8866-9jz7m/app-api-starter-project-template.log).

- id: catchall
  type: move
  from: attributes["log.file.path"]
  to: attributes["catchall"]

Why is it not going to the `parse-deep-filepath` route, given that the regex should match?

We want to be able to pull out the application name, the pod name, and the namespace, all of which are reflected in the full `log.file.path`.

receivers:
    filelog/mule-logs-volume:
      include: 
      - /splunk-otel/*/app*.log
      - /splunk-otel/*/*/app*.log
      start_at: beginning
      include_file_path: true
      include_file_name: true
      resource: 
        com.splunk.sourcetype: mule-logs
        k8s.cluster.name: {{ k8s_cluster_instance_name }}
        deployment.environment: {{ aws_environment_name }}
        splunk_server: {{ splunk_host }}
      operators:
      - type: router
        id: get-format
        routes:
          - output: parse-deep-filepath
            expr: 'log.file.path matches "^/splunk-otel/[^/]+/[^/]+/app-[^/]+[.]log$"'
          - output: parse-shallow-filepath
            expr: 'log.file.path matches "^/splunk-otel/[^/]+/app-[^/]+[.]log$"'
          - output: nil-filepath
            expr: 'log.file.path matches "^<nil>$"'
        default: catchall
      # Extract metadata from file path
      - id: parse-deep-filepath
        type: regex_parser
        regex: '^/splunk-otel/(?P<namespace>[^/]+)/(?P<pod_name>[^/]+)/(?P<application>[^/]+)[.]log$'
        parse_from: attributes["log.file.path"]
      - id: parse-shallow-filepath
        type: regex_parser
        regex: '^/splunk-otel/(?P<namespace>[^/]+)/(?P<application>[^/]+)[.]log$'
        parse_from: attributes["log.file.path"]
      - id: nil-filepath
        type: move
        from: attributes["log.file.path"]
        to: attributes["nil_filepath"]
      - id: catchall
        type: move
        from: attributes["log.file.path"]
        to: attributes["catchall"]

exporters:
    splunk_hec/logs:
        # Splunk HTTP Event Collector token.
        token: "{{ splunk_token }}"
        # URL to a Splunk instance to send data to.
        endpoint: "{{ splunk_full_endpoint }}"
        # Optional Splunk source: https://docs.splunk.com/Splexicon:Source
        source: "output"
        # Splunk index, optional name of the Splunk index targeted.
        index: "{{ splunk_index_name }}"
        # Maximum HTTP connections to use simultaneously when sending data. Defaults to 100.
        #max_connections: 20
        # Whether to disable gzip compression over HTTP. Defaults to false.
        disable_compression: false
        # HTTP timeout when sending data. Defaults to 10s.
        timeout: 900s
        tls:
          # Whether to skip checking the certificate of the HEC endpoint when sending data over HTTPS. Defaults to false.
          # For this demo, we use a self-signed certificate on the Splunk docker instance, so this flag is set to true.
          insecure_skip_verify: true

processors:
    batch:

extensions:
    health_check:
      endpoint: 0.0.0.0:8080
    pprof:
      endpoint: :1888
    zpages:
      endpoint: :55679
    file_storage/checkpoint:
      directory: /output/
      timeout: 10s
      compaction:
        on_start: true
        directory: /output/
        max_transaction_size: 65_536

service:
    extensions: [pprof, zpages, health_check, file_storage/checkpoint]
    pipelines:
      logs:
        receivers: [filelog/mule-logs-volume]
        processors: [batch]
        exporters: [splunk_hec/logs]
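
One possible cause, offered as an assumption rather than a confirmed diagnosis: in filelog operator expressions, entry fields are normally reached through `attributes[...]`, `resource[...]`, or `body`, so a bare `log.file.path` may not resolve to the file path, and every entry then falls through to the default route. In addition, an operator without an explicit `output` hands its entries to the next operator in the list, so each branch probably needs to jump to a common end point. A minimal sketch of the operators under those assumptions follows; the `end` noop operator is a hypothetical addition that is not in the original config, and the `nil-filepath` route is left out for brevity.

```yaml
operators:
  - type: router
    id: get-format
    routes:
      # Assumption: the file path is read from the entry's attributes map.
      - output: parse-deep-filepath
        expr: 'attributes["log.file.path"] matches "^/splunk-otel/[^/]+/[^/]+/app-[^/]+[.]log$"'
      - output: parse-shallow-filepath
        expr: 'attributes["log.file.path"] matches "^/splunk-otel/[^/]+/app-[^/]+[.]log$"'
    default: catchall
  - id: parse-deep-filepath
    type: regex_parser
    regex: '^/splunk-otel/(?P<namespace>[^/]+)/(?P<pod_name>[^/]+)/(?P<application>[^/]+)[.]log$'
    parse_from: attributes["log.file.path"]
    # Skip the remaining branches instead of falling into the next operator.
    output: end
  - id: parse-shallow-filepath
    type: regex_parser
    regex: '^/splunk-otel/(?P<namespace>[^/]+)/(?P<application>[^/]+)[.]log$'
    parse_from: attributes["log.file.path"]
    output: end
  - id: catchall
    type: move
    from: attributes["log.file.path"]
    to: attributes["catchall"]
  # Hypothetical terminal operator so that every branch ends in the same place.
  - id: end
    type: noop
```

This has not been verified against the collector version used in this thread, so treat it as a starting point for testing.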

big6consultant
New Member

I found a solution by splitting

    filelog/mule-logs-volume:
      include: 
      - /splunk-otel/*/app*.log
      - /splunk-otel/*/*/app*.log

into two separate filelog receivers, like so:

    filelog/mule-logs-volume1:
      include: 
      - /daas-splunk-otel/*/*/dla*.log
      start_at: beginning

    filelog/mule-logs-volume2:
      include: 
      - /daas-splunk-otel/*/dla*.log
      start_at: beginning

and removing all the router operators.
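
For anyone who still needs the metadata fields, the split approach might look roughly like the sketch below. It is not the exact config from this post. The regexes are adapted from the question to the `/daas-splunk-otel/` paths, and `include_file_path: true` plus the updated `receivers` list in the pipeline are assumptions about what the rest of the config would need.

```yaml
receivers:
    # Deep layout: /<namespace>/<pod_name>/<application>.log
    filelog/mule-logs-volume1:
      include:
      - /daas-splunk-otel/*/*/dla*.log
      start_at: beginning
      include_file_path: true   # assumed; needed so log.file.path is available
      operators:
      - type: regex_parser
        regex: '^/daas-splunk-otel/(?P<namespace>[^/]+)/(?P<pod_name>[^/]+)/(?P<application>[^/]+)[.]log$'
        parse_from: attributes["log.file.path"]

    # Shallow layout: /<namespace>/<application>.log
    filelog/mule-logs-volume2:
      include:
      - /daas-splunk-otel/*/dla*.log
      start_at: beginning
      include_file_path: true   # assumed, as above
      operators:
      - type: regex_parser
        regex: '^/daas-splunk-otel/(?P<namespace>[^/]+)/(?P<application>[^/]+)[.]log$'
        parse_from: attributes["log.file.path"]

service:
    pipelines:
      logs:
        # Both receivers must be listed, or one set of files is never read.
        receivers: [filelog/mule-logs-volume1, filelog/mule-logs-volume2]
        processors: [batch]
        exporters: [splunk_hec/logs]
```

Because each receiver only ever sees one path layout, a single regex per receiver is enough and no routing decision is needed.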
