Splunk Search

REGEX Extraction (same log format, different fields in DNS data)

tmarlette
Motivator

I am attempting to extract two fields that are structured identically in an event but represent two different actions: one is a query and the other is a response, both in DNS data.

Here is a sample event:

    QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      ".www.couponcabin.com.distil.us."
          QTYPE   A .
          QCLASS  1
        ANSWER SECTION:
        Offset = 0x002f, RR count = 0
        Name      ".www.couponcabin.com.DISTIL[C027].us."
          TYPE   CNAME  .
          CLASS  1
          TTL    34
          DLEN   9
          DATA   .scotch[C020].distil.us.
        Offset = 0x005f, RR count = 1
        Name      ".scotch[C043].DISTIL[C027].us."
          TYPE   CNAME  .
          CLASS  1
          TTL    21
          DLEN   5
          DATA   .us[C056].scotch[C020].distil.us.
        Offset = 0x0077, RR count = 2
        Name      ".us[C05F].scotch[C043].DISTIL[C027].us."
          TYPE   CNAME  .
          CLASS  1
          TTL    51
          DLEN   27
          DATA   .shard1.premium.newjersey[C020].distil.us.
        Offset = 0x00a1, RR count = 3
        Name      ".shard1.premium.newjersey[C043].DISTIL[C027].us."
          TYPE   A  .
          CLASS  1
          TTL    86
          DLEN   4
          DATA   10.10.10.10
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
        Offset = 0x00ca, RR count = 0

Notice that there is a 'QUESTION SECTION' and an 'ANSWER SECTION', both of which have the value 'Name ...'

I am attempting to extract the QUESTION SECTION Name value as the field 'query', and the ANSWER SECTION Name values as the field 'answer'. I know how to make an mv field, I just need the extractions themselves.

Here is what I currently have:

EXTRACT-qa = Name\s+(?<query>\"[^\"]+\")

I use MV_ADD in a transform to make this a multivalue field, but this extracts ALL of the matches and does not separate the 'query' value from the 'answer' values.

Thank you for any help you can provide!

woodcock
Esteemed Legend

Here is a search-bar solution; you should be able to convert it to a conf-file solution:

... | rex "(?s)^[\r\n]*QUESTION\s+SECTION:(?<QUESTION_SECTION>.*?)[\r\n]*ANSWER\s+SECTION:(?<ANSWER_SECTION>.*?)[\r\n]*(?:AUTHORITY\s+SECTION:(?<AUTHORITY_SECTION>.*?)[\r\n]*)(?:ADDITIONAL\s+SECTION:(?<ADDITIONAL_SECTION>.*?)[\r\n]*)?$"
| rex max_match=99 field=QUESTION_SECTION "(?s)[\r\n]+Name\s+(?<query>[^\r\n]+)"
| rex max_match=99 field=ANSWER_SECTION "(?s)[\r\n]+Name\s+(?<answer>[^\r\n]+)"
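
The two-stage idea above (carve out each section first, then collect the Name lines inside it) can be checked outside Splunk. The following is a minimal Python sketch, not the poster's code: the abbreviated EVENT string, the split_names helper, and the group names are assumptions made for illustration, and Python writes named groups as (?P<name>...) rather than the (?<name>...) form used in rex. This sketch does not anchor on ^, so leading whitespace before QUESTION SECTION does not prevent a match.

```python
import re

# Abbreviated copy of the sample event from the question (assumed representative).
EVENT = """\
    QUESTION SECTION:
        Offset = 0x000c, RR count = 0
        Name      ".www.couponcabin.com.distil.us."
          QTYPE   A .
          QCLASS  1
        ANSWER SECTION:
        Offset = 0x002f, RR count = 0
        Name      ".www.couponcabin.com.DISTIL[C027].us."
          TYPE   CNAME  .
        Offset = 0x005f, RR count = 1
        Name      ".scotch[C043].DISTIL[C027].us."
          TYPE   CNAME  .
        AUTHORITY SECTION:
          empty
        ADDITIONAL SECTION:
        Offset = 0x00ca, RR count = 0
"""

# Stage 1: carve the event into sections. re.DOTALL lets "." cross newlines.
SECTIONS = re.compile(
    r"QUESTION\s+SECTION:(?P<question>.*?)"
    r"ANSWER\s+SECTION:(?P<answer>.*?)"
    r"AUTHORITY\s+SECTION:",
    re.DOTALL,
)

# Stage 2: pull every quoted Name value out of a single section.
NAME = re.compile(r'Name\s+"([^"]+)"')


def split_names(event):
    """Return (query_names, answer_names); both empty if the sections are missing."""
    match = SECTIONS.search(event)
    if match is None:
        return [], []
    return NAME.findall(match.group("question")), NAME.findall(match.group("answer"))


query, answer = split_names(EVENT)
print(query)
print(answer)
```

Keeping the section split and the Name extraction as separate patterns mirrors the three rex calls: the first pattern never needs to know what a Name line looks like, and the second never needs to know where a section ends.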

tmarlette
Motivator

I attempted this extraction, but it didn't match anything, my friend. I'm using RegExr as well, and it doesn't match either section.

tmarlette
Motivator

OK, I'm working on this, but I can't seem to put a REGEX in transforms.conf that looks through a field. Do you happen to know a way, woodcock?

I have the 'answer_section' field extracted through props.conf. How do I tell Splunk to search through 'answer_section' for another extraction?

woodcock
Esteemed Legend

You do this by stacking the transforms with the correct details. Try this:

In props.conf:

[myDNS]
REPORT-ThisPartIsArbitraryButMustBeUnique = extract_answer_section answer_section_mv

In transforms.conf:

[extract_answer_section]
REGEX = ANSWER\s+SECTION:([^*]+)AUTHORITY
FORMAT = answer_section::$1

[answer_section_mv]
SOURCE_KEY = answer_section
REGEX = (Name)\s+\"([^\"]+)\"
FORMAT = $1::$2
MV_ADD = true

tmarlette
Motivator

I took part of this and I think it works well enough.

Here are my settings.

In props.conf:

[myDns]
REPORT-mv = answer_section_MV
EXTRACT-ans_sec = ANSWER\sSECTION:(?<answer_section>[^*]+)AUTHORITY

In transforms.conf:

[answer_section_MV]
SOURCE_KEY = answer_section
REGEX = (Name)\s+\"([^\"]+)\"
FORMAT = $1::$2
MV_ADD = true    

When I attempted to extract the 'answer_section' field in transforms.conf, it wouldn't pull out the 'Name' field.

somesoni2
Revered Legend

There is an attribute called SOURCE_KEY in transforms.conf, but it only takes indexed fields (your answer_section field is a search-time field extraction, so you can't use it). You may be able to use the _raw field (the default SOURCE_KEY value) by merging the regex for your answer_section field extraction with these new field extractions.

tmarlette
Motivator

Alrighty, this is what I have now for this.

props.conf

[myDNS]
EXTRACT-ans_sec = ANSWER\sSECTION:(?<answer_section>[^*]+)AUTHORITY
REPORT-fields = answer_section_mv

transforms.conf

[answer_section_mv]
REGEX = Name\s+\"(?<answer>[^\"]+)\"
MV_ADD = true
SOURCE_KEY = field:answer_section

Naturally, the 'answer' field is not being extracted. I'm not sure how to combine the REGEX for the answer_section and the answer fields.
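
One way to fold the section test into a single pattern that runs against the whole event is to use lookaheads: a Name line counts as the query if an "ANSWER SECTION:" header still lies ahead of it, and as an answer if no such header lies ahead but "AUTHORITY SECTION:" does. The Python sketch below only checks that logic; the shortened EVENT string and the QUERY and ANSWER pattern names are assumptions for illustration. The PCRE form for transforms.conf would presumably be Name\s+\"(?<answer>[^\"]+)\"(?![\s\S]*ANSWER\s+SECTION:)(?=[\s\S]*AUTHORITY\s+SECTION:) with MV_ADD = true and the default SOURCE_KEY, but that has not been tested in Splunk.

```python
import re

# Shortened stand-in for the sample event (assumed representative).
EVENT = (
    'QUESTION SECTION:\n'
    '  Name      ".www.couponcabin.com.distil.us."\n'
    'ANSWER SECTION:\n'
    '  Name      ".www.couponcabin.com.DISTIL[C027].us."\n'
    '  Name      ".scotch[C043].DISTIL[C027].us."\n'
    'AUTHORITY SECTION:\n'
    '  empty\n'
    'ADDITIONAL SECTION:\n'
)

# Query: a Name line with an "ANSWER SECTION:" header somewhere after it.
QUERY = re.compile(r'Name\s+"([^"]+)"(?=[\s\S]*ANSWER\s+SECTION:)')

# Answer: a Name line with no ANSWER header after it, but with the
# AUTHORITY header still ahead (this keeps out any later sections).
ANSWER = re.compile(
    r'Name\s+"([^"]+)"'
    r'(?![\s\S]*ANSWER\s+SECTION:)'
    r'(?=[\s\S]*AUTHORITY\s+SECTION:)'
)

query = QUERY.findall(EVENT)
answer = ANSWER.findall(EVENT)
print(query)
print(answer)
```

Each lookahead rescans the rest of the event for every candidate Name line, so this trades some regex cost for not needing an intermediate answer_section field.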

sundareshr
Legend

You could break it up into two regexes, like this (REGEX NOT TESTED):

EXTRACT-q = QUESTION.*Name\s+(?<query>\"[^\"]+\")
EXTRACT-ans = ANSWER.*Name\s+(?<answer>\"[^\"]+\")

and have MV_ADD for the ans and not q.

tmarlette
Motivator

When I use the 'answer' extraction you have here, the REGEX match stops at the end of the line. Is there a way to make it span multiple lines? I tried using [^?N] but that doesn't work either.

sundareshr
Legend

You need to activate multi-line mode matching for the regex by specifying (?m) at the start. Try it like this:

(?m)ANSWER.*Name\s+(?<answer>\"[^\"]+\")

tmarlette
Motivator

Negative, this doesn't work my friend. It doesn't even capture the first 'Name' line. Thank you!
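
A likely reason for this result: in PCRE-style engines, (?m) only changes where ^ and $ anchor, while (?s) is the flag that lets "." cross newlines. The Python sketch below shows the difference on a shortened event; the EVENT string and variable names are assumptions for illustration. Note that even with (?s), the greedy ".*" lands on the last Name line only, so a single regex of this shape does not return every answer.

```python
import re

# Shortened stand-in for the ANSWER part of the sample event (assumed).
EVENT = (
    'ANSWER SECTION:\n'
    'Offset = 0x002f, RR count = 0\n'
    'Name      ".www.couponcabin.com.DISTIL[C027].us."\n'
    'Offset = 0x005f, RR count = 1\n'
    'Name      ".scotch[C043].DISTIL[C027].us."\n'
)
PATTERN = r'ANSWER.*Name\s+"([^"]+)"'

# (?m) affects only ^ and $; "." still stops at the first newline.
multiline = re.search("(?m)" + PATTERN, EVENT)

# (?s) lets "." match newlines, so ".*" can reach the Name lines.
dotall = re.search("(?s)" + PATTERN, EVENT)

print(multiline)        # no match object, because ANSWER and Name are on different lines
print(dotall.group(1))  # the last Name value, because ".*" is greedy
```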

richgalloway
SplunkTrust

Is there always a single QUESTION SECTION followed by multiple ANSWER SECTIONs? If so, you could take the first value of the multivalue field as 'query' and the remainder as 'answer'.

---
If this reply helps you, Karma would be appreciated.

tmarlette
Motivator

Are you talking about using a | stats first() function of some kind?

richgalloway
SplunkTrust

Just a simple eval query=mvindex(foo, 0).
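
The same first-versus-rest split can be sketched in Python to check the idea: collect every Name value in order, treat index 0 as the query, and keep everything after it as the answers. The EVENT string and the split_first_rest helper are assumptions for illustration. In SPL the remainder would presumably be mvindex(foo, 1, -1), where foo stands for the multivalue field as in the eval above.

```python
import re

# Shortened stand-in for the sample event (assumed representative).
EVENT = (
    'QUESTION SECTION:\n'
    '  Name      ".www.couponcabin.com.distil.us."\n'
    'ANSWER SECTION:\n'
    '  Name      ".www.couponcabin.com.DISTIL[C027].us."\n'
    '  Name      ".scotch[C043].DISTIL[C027].us."\n'
    'AUTHORITY SECTION:\n'
    '  empty\n'
)

NAME = re.compile(r'Name\s+"([^"]+)"')


def split_first_rest(event):
    """Return (query, answers): the first Name value and all later ones."""
    names = NAME.findall(event)      # values come back in event order
    query = names[0] if names else None
    return query, names[1:]


query, answer = split_first_rest(EVENT)
print(query)
print(answer)
```

This relies on the event always carrying exactly one Name in the QUESTION SECTION; an event with two questions would push the second question into the answers.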
