Splunk Search

REGEX Extraction (same log format, different fields in DNS data)

tmarlette
Motivator

I am attempting to extract two fields that are structured the same way in an event but represent two different things: one is a DNS query, the other is a DNS response.

Here is a sample event:

    QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      ".www.couponcabin.com.distil.us."
          QTYPE   A .
          QCLASS  1
        ANSWER SECTION:
        Offset = 0x002f, RR count = 0
        Name      ".www.couponcabin.com.DISTIL[C027].us."
          TYPE   CNAME  .
          CLASS  1
          TTL    34
          DLEN   9
          DATA   .scotch[C020].distil.us.
        Offset = 0x005f, RR count = 1
        Name      ".scotch[C043].DISTIL[C027].us."
          TYPE   CNAME  .
          CLASS  1
          TTL    21
          DLEN   5
          DATA   .us[C056].scotch[C020].distil.us.
        Offset = 0x0077, RR count = 2
        Name      ".us[C05F].scotch[C043].DISTIL[C027].us."
          TYPE   CNAME  .
          CLASS  1
          TTL    51
          DLEN   27
          DATA   .shard1.premium.newjersey[C020].distil.us.
        Offset = 0x00a1, RR count = 3
        Name      ".shard1.premium.newjersey[C043].DISTIL[C027].us."
          TYPE   A  .
          CLASS  1
          TTL    86
          DLEN   4
          DATA   10.10.10.10
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
        Offset = 0x00ca, RR count = 0

Notice that there is a 'QUESTION SECTION' and an 'ANSWER SECTION', both of which have the value 'Name ...'

I am attempting to extract the QUESTION SECTION Name value as the field 'query', and the ANSWER SECTION Name values as the field 'answer'. I know how to make an mv field, I just need the extractions themselves.

Here is what I currently have:

EXTRACT-qa = Name\s+(?<query>\"[^\"]+\")

I use the MV_ADD transform to make this field a multivalue field, however this extracts ALL of the matches, not separating the 'query' field from the 'answers' fields.

Thank you for any help you can provide!

woodcock
Esteemed Legend

Here is a search-bar solution; you should be able to convert it to a conf-file solution:

... | rex "(?s)^[\r\n]*QUESTION\s+SECTION:(?<QUESTION_SECTION>.*?)[\r\n]*ANSWER\s+SECTION:(?<ANSWER_SECTION>.*?)[\r\n]*(?:AUTHORITY\s+SECTION:(?<AUTHORITY_SECTION>.*?)[\r\n]*)(?:ADDITIONAL\s+SECTION:(?<ADDITIONAL_SECTION>.*?)[\r\n]*)?$"
| rex max_match=99 field=QUESTION_SECTION "(?s)[\r\n]+Name\s+(?<query>[^\r\n]+)"
| rex max_match=99 field=ANSWER_SECTION "(?s)[\r\n]+Name\s+(?<answer>[^\r\n]+)"
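
As a way to try the two-stage idea outside Splunk, here is a minimal Python sketch: isolate each section first, then collect every `Name` value inside it. It is not the exact pattern above. It assumes an unanchored first stage, because on the sample as posted the leading whitespace before `QUESTION SECTION:` would stop a pattern anchored with `^` from matching. The names `SECTIONS`, `NAME`, and `split_names` are made up for illustration, and the event is an abbreviated copy of the sample from the question.

```python
import re

# Abbreviated copy of the sample event from the question.
EVENT = """\
    QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      ".www.couponcabin.com.distil.us."
          QTYPE   A .
          QCLASS  1
        ANSWER SECTION:
        Offset = 0x002f, RR count = 0
        Name      ".www.couponcabin.com.DISTIL[C027].us."
          TYPE   CNAME  .
          DATA   .scotch[C020].distil.us.
        Offset = 0x005f, RR count = 1
        Name      ".scotch[C043].DISTIL[C027].us."
          TYPE   CNAME  .
          DATA   .us[C056].scotch[C020].distil.us.
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
        Offset = 0x00ca, RR count = 0
"""

# Stage 1: isolate the sections. re.DOTALL plays the role of (?s).
SECTIONS = re.compile(
    r"QUESTION\s+SECTION:(?P<question>.*?)"
    r"ANSWER\s+SECTION:(?P<answer>.*?)"
    r"(?:AUTHORITY\s+SECTION:|\Z)",
    re.DOTALL,
)

# Stage 2: every quoted Name value within one section
# (findall stands in for max_match / MV_ADD).
NAME = re.compile(r'Name\s+"([^"]+)"')


def split_names(event):
    """Return (query names, answer names) for one event."""
    match = SECTIONS.search(event)
    if match is None:
        return [], []
    return (
        NAME.findall(match.group("question")),
        NAME.findall(match.group("answer")),
    )


query, answer = split_names(EVENT)
print(query)
print(answer)
```

On this abbreviated event, `query` holds the single name from the question section and `answer` holds the two names from the answer section; the `DATA` lines are ignored because they do not start with `Name`.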

tmarlette
Motivator

I attempted this extraction, but it didn't match anything my friend. I'm using RegExr as well, and it doesn't match for either section.

tmarlette
Motivator

OK, I'm working on this, but I can't seem to put a REGEX in transforms.conf that looks through a field. Do you happen to know a way, woodcock?

I have the 'answer_section' field extracted through props.conf. How do I tell Splunk to search through 'answer_section' for another extraction?

woodcock
Esteemed Legend

You do this by stacking the transforms with the correct details. Try this:

In props.conf:

[myDNS]
REPORT-ThisPartIsArbitraryButMustBeUnique = extract_answer_section, answer_section_mv

In transforms.conf:

[extract_answer_section]
REGEX = ANSWER\s+SECTION:([^*]+)AUTHORITY
FORMAT = answer_section::$1

[answer_section_mv]
SOURCE_KEY = answer_section
REGEX = (Name)\s+\"([^\"]+)\"
FORMAT = $1::$2
MV_ADD = true
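
To see what this stacked pair of regexes would pull out, here is a rough Python emulation. It is only a sketch of the regex logic, not of how Splunk applies transforms. Two things it illustrates: the negated class `[^*]+` crosses line breaks on its own, with no `(?s)` needed, and with `(Name)` as `$1` and `FORMAT = $1::$2` the resulting field is named `Name` rather than `answer`. The `fields` dictionary is a stand-in for Splunk's multivalue field, and the event is shortened from the sample.

```python
import re

# Shortened version of the sample event from the question.
EVENT = (
    "QUESTION SECTION:\n"
    'Name      ".www.couponcabin.com.distil.us."\n'
    "ANSWER SECTION:\n"
    "Offset = 0x002f, RR count = 0\n"
    'Name      ".www.couponcabin.com.DISTIL[C027].us."\n'
    "Offset = 0x005f, RR count = 1\n"
    'Name      ".scotch[C043].DISTIL[C027].us."\n'
    "AUTHORITY SECTION:\n"
    "  empty\n"
)

# First transform: capture everything between the two section headers.
# [^*]+ matches any character except "*", including newlines.
outer = re.search(r"ANSWER\s+SECTION:([^*]+)AUTHORITY", EVENT)
answer_section = outer.group(1) if outer else ""

# Second transform: applied to answer_section instead of the raw event.
# FORMAT = $1::$2 means "group 1 is the field name, group 2 the value";
# appending to a list stands in for MV_ADD = true.
fields = {}
for key, value in re.findall(r'(Name)\s+"([^"]+)"', answer_section):
    fields.setdefault(key, []).append(value)

print(fields)
```

If the field should be called `answer`, one option is to drop the `(Name)` capture and write the field name literally in `FORMAT`, for example `FORMAT = answer::$1`.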

tmarlette
Motivator

I took part of this and I think it works well enough.

Here are my settings.

In props.conf:

[myDns]
REPORT-mv = answer_section_MV
EXTRACT-ans_sec = ANSWER\sSECTION:(?<answer_section>[^*]+)AUTHORITY

in transforms.conf

[answer_section_MV]
SOURCE_KEY = answer_section
REGEX = (Name)\s+\"([^\"]+)\"
FORMAT = $1::$2
MV_ADD = true    

When I attempted to extract the 'answer_section' field in transforms.conf, it wouldn't pull out the 'Name' field.

somesoni2
SplunkTrust

There is an attribute called SOURCE_KEY in transforms.conf, but as far as I know it only takes indexed fields (your answer_section field is a search-time field extraction, so you can't use it). You may be able to use the field _raw (the default SOURCE_KEY value) by merging the regex for your answer_section field extraction with these new field extractions.

tmarlette
Motivator

Alrighty, this is what I have now for this.

props.conf

[myDNS]
EXTRACT-ans_sec = ANSWER\sSECTION:(?<answer_section>[^*]+)AUTHORITY
REPORT-fields = answer_section_mv

transforms.conf

[answer_section_mv]
REGEX = Name\s+\"(?<answer>[^\"]+)\"
MV_ADD = true
SOURCE_KEY = field:answer_section

Naturally, the 'answer' field is not being extracted. I'm not sure how to combine the REGEX for the answer_section and the answer fields.

sundareshr
Legend

You could break it up into two regexes, like this (REGEX NOT TESTED):

EXTRACT-q = QUESTION.*Name\s+(?<query>\"[^\"]+\")
EXTRACT-ans = ANSWER.*Name\s+(?<answer>\"[^\"]+\")

and have MV_ADD for ans but not for q.

tmarlette
Motivator

When I use the 'answer' extraction you have here, the REGEX match stops at the end of the line. Is there a way to make it span multiple lines? I tried using [^?N] but that doesn't work either.

sundareshr
Legend

You need to activate multi-line mode matching for the regex by specifying (?m) at the start. Try it like this:

(?m)ANSWER.*Name\s+(?<answer>\"[^\"]+\")

tmarlette
Motivator

Negative, this doesn't work my friend. It doesn't even capture the first 'Name' line. Thank you!
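
A likely reason for this result is that `(?m)` only changes where `^` and `$` match; it is `(?s)` that lets `.` cross line breaks. The Python sketch below shows the difference using the same pattern shape on two `Name` lines taken from the sample event. Python's `re.MULTILINE` and `re.DOTALL` correspond to `(?m)` and `(?s)`. The variable names are made up for illustration, and this demonstrates the regex behaviour only, not Splunk's extraction pipeline.

```python
import re

# Two Name entries from the ANSWER SECTION of the sample event.
TEXT = (
    "ANSWER SECTION:\n"
    "Offset = 0x002f, RR count = 0\n"
    'Name      ".www.couponcabin.com.DISTIL[C027].us."\n'
    "  TYPE   CNAME  .\n"
    "Offset = 0x005f, RR count = 1\n"
    'Name      ".scotch[C043].DISTIL[C027].us."\n'
)

PATTERN = r'ANSWER.*Name\s+"([^"]+)"'

# (?m) / re.MULTILINE: "." still stops at the end of the ANSWER line,
# so the pattern never reaches a Name line.
with_multiline = re.findall(PATTERN, TEXT, re.MULTILINE)

# (?s) / re.DOTALL: "." crosses newlines, but the greedy ".*" runs to the
# last Name, and the literal ANSWER can only match once per event.
with_dotall = re.findall(PATTERN, TEXT, re.DOTALL)

# Cutting out the section first and matching Name on its own finds them all.
section = TEXT.split("ANSWER SECTION:", 1)[1]
every_name = re.findall(r'Name\s+"([^"]+)"', section)

print(with_multiline)
print(with_dotall)
print(every_name)
```

Even with `(?s)`, a single pattern that starts at `ANSWER` yields at most one value per event, which is why the section is usually isolated in one step and the `Name` values pulled out in a second step.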

richgalloway
SplunkTrust

Is there always a single QUESTION SECTION followed by multiple ANSWER SECTIONs? If so, you could take the first value of the multivalue field as 'query' and the remainder as 'answer'.

tmarlette
Motivator

Are you talking about using a | stats first() function of some kind?

richgalloway
SplunkTrust

Just a simple eval query=mvindex(foo, 0).
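
The same first-versus-rest split can be sketched in Python to check the logic against the sample. Here `foo` corresponds to the list returned by `findall`, and the function name `split_query_and_answers` is hypothetical. The sketch assumes the question's `Name` always comes first in the event, which is exactly the condition asked about above.

```python
import re

# Matches every quoted Name value, regardless of section.
NAME = re.compile(r'Name\s+"([^"]+)"')


def split_query_and_answers(raw_event):
    """First Name value is the query; all later ones are answers."""
    names = NAME.findall(raw_event)
    if not names:
        return None, []
    return names[0], names[1:]


# Shortened version of the sample event from the question.
sample = (
    "QUESTION SECTION:\n"
    'Name      ".www.couponcabin.com.distil.us."\n'
    "ANSWER SECTION:\n"
    'Name      ".www.couponcabin.com.DISTIL[C027].us."\n'
    'Name      ".scotch[C043].DISTIL[C027].us."\n'
)

query, answer = split_query_and_answers(sample)
print(query)
print(answer)
```

In SPL, the remainder can be taken the same way with a start and end index, for example `eval answer=mvindex(foo, 1, -1)`.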
