Why does BREAK_ONLY_BEFORE work only for some events?

Saaral
New Member

I have applied the regex below on the heavy forwarders, but it works for only a few events. Many events are not being broken by the regex in BREAK_ONLY_BEFORE.

pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = {\"name\"
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true

Sample logs are below.

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN 

For some events the same stanza on the heavy forwarder works, but for others it does not. Can someone let me know what could be wrong?

sudosplunk
Motivator

Hi,

SHOULD_LINEMERGE must be set to true; BREAK_ONLY_BEFORE is only applied when line merging is enabled. I also made a small adjustment to your regex. Try the following:

props.conf:

BREAK_ONLY_BEFORE = \{\W+name
SHOULD_LINEMERGE = true

Saaral
New Member

Thanks! But how did my stanza work for one event and not for another? Why was it not working for all the events with the same pattern? Also, with the regex you provided, I want to break only at name and at the brace before it. Will this break the event at the name field?

sudosplunk
Motivator

I am not sure how it worked for the first event. When I tested it, your regex did not match the event. The backslash before the quotes must itself be escaped in order to match \".

I updated my regex above. This will look for { before name.
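
The escaping point can be checked outside Splunk. Here is a minimal sketch in Python, whose `re` module handles these particular patterns the same way PCRE does. The sample string is a shortened, hypothetical version of the log line, and the sketch exercises only the regex, not Splunk's event pipeline:

```python
import re

# Shortened, hypothetical sample: the raw log carries a literal backslash
# before every double quote.
raw = r'{\"name\":\"app\",\"level\":30}'

# Original pattern: inside a regex, \" is just a literal ", so this looks
# for {"name" and never accounts for the backslash present in the data.
original = re.search(r'{\"name\"', raw)

# Escaping the backslash itself (\\) matches the raw text.
escaped = re.search(r'\{\\"name\\"', raw)

# The suggested pattern: \W+ consumes both the backslash and the quote.
suggested = re.search(r'\{\W+name', raw)

print(original)             # None
print(escaped.group(0))     # {\"name\"
print(suggested.group(0))   # {\"name
```

The regex has to match the data as it arrives, which is why the backslashes matter here even though the SEDCMD later strips them from the indexed event.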

Saaral
New Member

Hi Surya,

Thanks! I will try to implement it. Also, could you let me know what regex can be applied to the log sample below to break at the name field?

{\"name\":\"\",\"level\":,\"severity\":\"info\",\"time

sudosplunk
Motivator

If the events are multi-line, try (?m)\{\W+name

(?m) - multi-line modifier.
\{ - matches a literal {.
\W+ - matches one or more non-word characters. If you know how many characters sit between { and name, use a bounded quantifier instead; for example, \W{1,3} matches at least 1 and at most 3 non-word characters rather than 1 to unlimited.
name - matches the literal string name (case-sensitive).

Please refer to this page for more details.
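
To see how each piece of the pattern behaves, here is a small Python sketch. The sample strings are hypothetical and shortened, and Python's `re` treats `\W`, `+`, and `{1,3}` the same way for these cases:

```python
import re

plus = re.compile(r'\{\W+name')          # one or more non-word characters
bounded = re.compile(r'\{\W{1,3}name')   # between 1 and 3 non-word characters

# Hypothetical, shortened samples.
escaped = r'{\"name\":\"svc\"}'   # backslash and quote between { and name
quoted = '{"name":"svc"}'         # only a quote between { and name
bare = '{name:svc}'               # nothing between { and name
upper = '{"Name":"svc"}'          # capital N

print(bool(plus.search(escaped)))      # True
print(bool(plus.search(quoted)))       # True
print(bool(plus.search(bare)))         # False: \W+ needs at least one character
print(bool(plus.search(upper)))        # False: the match is case-sensitive
print(bool(bounded.search(escaped)))   # True: two characters fit within {1,3}
```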

If the events are not multi-line:

I would suggest using LINE_BREAKER instead of BREAK_ONLY_BEFORE, because LINE_BREAKER with SHOULD_LINEMERGE = false skips the line-merging step and improves processing speed. If you would like to use LINE_BREAKER, the configs are:

LINE_BREAKER = ([\r\n]+)\{\W+name
SHOULD_LINEMERGE = false

Saaral
New Member

Hi Surya

We tried almost all of the suggestions you provided, but nothing seems to work. Only a few events are being parsed and most are not. The SED command I am applying works for all the events, but the regex does not. I have not used LINE_BREAKER yet, though. Will it work?

sudosplunk
Motivator

Okay, I see what you're doing. I will provide two sets of configs: one for multi-line events and another for single-line events. Please apply the set that fits your use case.

Multi-line events (records with name starting on the same line):

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN 

props.conf:

[your_sourcetype]
    BREAK_ONLY_BEFORE = (?m)\{\W*name
    SHOULD_LINEMERGE = true
    SEDCMD-backslash=s/\\//g
    DATETIME_CONFIG = CURRENT
    KV_MODE = json
    category = Structured
    NO_BINARY_CHECK = true
    TRUNCATE = 0

Single-line events (records with name starting on a new line):

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}
{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN 

props.conf:

[your_sourcetype]
    LINE_BREAKER = ([\r\n]+)\{\W*name
    SHOULD_LINEMERGE = false
    SEDCMD-backslash=s/\\//g
    DATETIME_CONFIG = CURRENT
    KV_MODE = json
    category = Structured
    NO_BINARY_CHECK = true
    TRUNCATE = 0
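
As a rough model of how LINE_BREAKER divides a stream: text matched by the first capturing group is discarded, text before it ends the previous event, and text after it begins the next one. The Python sketch below imitates only that rule so the two record layouts can be compared. `split_like_line_breaker` and the sample records are hypothetical, not Splunk code. The last pattern, which also allows zero newlines after a closing brace, is an assumption that is not from this thread and would need to be verified on a test instance before use:

```python
import re

def split_like_line_breaker(pattern, stream):
    """Rough imitation of LINE_BREAKER: cut the stream at capture group 1
    of each match and drop the text that group matched."""
    events, start = [], 0
    for m in re.finditer(pattern, stream):
        events.append(stream[start:m.start(1)])
        start = m.end(1)
    events.append(stream[start:])
    return [e for e in events if e]  # ignore empty pieces

# Hypothetical, shortened records with literal backslashes, as in the raw log.
rec1 = r'{\"name\":\"a\",\"level\":30}'
rec2 = r'{\"name\":\"b\",\"level\":30}'

on_new_lines = rec1 + '\n' + rec2   # second record starts on a new line
merged = rec1 + rec2                # records run together as }{

suggested = r'([\r\n]+)\{\W*name'
print(len(split_like_line_breaker(suggested, on_new_lines)))   # 2
print(len(split_like_line_breaker(suggested, merged)))         # 1

# Assumption, not from the thread: accept zero or more newlines after a
# closing brace so that a boundary is also found inside "}{".
relaxed = r'\}([\r\n]*)\{\W*name'
print(split_like_line_breaker(relaxed, merged) == [rec1, rec2])         # True
print(split_like_line_breaker(relaxed, on_new_lines) == [rec1, rec2])   # True
```

Under this model, a pattern that requires at least one newline cannot separate records that share a line.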

You can test regex for both BREAK_ONLY_BEFORE and LINE_BREAKER with their respective data samples here.

Also, in your configurations, you're using both INDEXED_EXTRACTIONS and KV_MODE to extract JSON fields. This is not recommended, because the fields are then extracted twice (once at index time and once at search time), resulting in duplicate field values. Please have a look at the links below and use whichever one of the two settings suits your needs.

https://answers.splunk.com/answers/556279/why-would-indexed-extractionsjson-in-propsconf-be.html

https://www.hurricanelabs.com/blog/splunk-case-study-indexed-extractions-vs-search-time-extractions

Saaral
New Member

Hi Surya, the solution that you provided yesterday works only for events that start on a new line. For events that are merged on a single line, it does not work. Will the above stanza work for those merged events within a single line too?

sudosplunk
Motivator

Yes. Use the first set of configs. I am not sure why it did not work the first time. Can you paste the full props.conf that you're using right now? Please use the "code generator" (the icon with 101010) for pasting content.

Saaral
New Member
[empath_app_log]
pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = \{\W+name
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true

Saaral
New Member

This is what we deployed last night. Only the events that start on a new line are being parsed, while the events merged together on a single line are not.

Saaral
New Member
{"name":"utterance.service logger","level":30,"severity":"info","time":"host":"","hostname":"","category":"application","clientCorrelationId":"","userId":"","moduleName":"DisplayUtterancesFsModule","source":"angular","client":"AgentDesktop","type":"application","msg":"utterance does not exist","logId":""}{"name":"utterance.service logger","level":30,"severity":"info","time":,"host":"","hostname":"","category":"application","clientCorrelationId":"","userId":"","moduleName":"","source":"angular","client":"AgentDesktop","type":"application","msg":"utterance does not exist","logId":""}

Saaral
New Member

Above is the sample log that is not being parsed. I pulled it from the Splunk UI.
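
A possible explanation for this symptom, offered as a hypothesis rather than a confirmed diagnosis: with SHOULD_LINEMERGE = true, Splunk first splits the stream into lines and then uses BREAK_ONLY_BEFORE only to decide whether a line starts a new event or joins the previous one. Under that model the setting can keep lines apart but cannot cut one line in two. The Python sketch below imitates that behavior. `merge_like_break_only_before` and the shortened sample record are hypothetical, and the model ignores other merge settings such as MAX_EVENTS:

```python
import re

def merge_like_break_only_before(pattern, stream):
    """Rough model: split on newlines, then open a new event at every
    line that matches the pattern; other lines join the previous event."""
    events = []
    for line in re.split(r'[\r\n]+', stream):
        if not line:
            continue  # skip empty pieces from leading/trailing newlines
        if not events or re.search(pattern, line):
            events.append(line)
        else:
            events[-1] += '\n' + line
    return events

# Shortened, hypothetical record in the form seen in the Splunk UI.
rec = '{"name":"utterance.service logger","level":30,"msg":"utterance does not exist"}'

merged = rec + rec                # two records run together on one line
on_new_lines = rec + '\n' + rec   # two records on separate lines

pattern = r'\{\W+name'
merged_events = merge_like_break_only_before(pattern, merged)
separate_events = merge_like_break_only_before(pattern, on_new_lines)

print(len(merged_events))     # 1: the single line stays one event
print(len(separate_events))   # 2: each line opens its own event
```

The same model would also account for the earlier stanza appearing to work for some events: with SHOULD_LINEMERGE = false, every newline-delimited line becomes its own event regardless of BREAK_ONLY_BEFORE.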

sudosplunk
Motivator

Thanks for the information. Please add (?m), the multi-line modifier, before \{\W+name. This should make Splunk look at each line for the {"name string.
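
One caveat worth checking before deploying: in PCRE, as in Python's `re`, the (?m) flag only changes where ^ and $ are allowed to match. This pattern contains neither anchor, so the flag may make no difference to what it matches. A minimal sketch with a hypothetical, shortened record:

```python
import re

rec = '{"name":"svc","level":30}'   # shortened, hypothetical record
merged = rec + rec                  # two records on one line
on_new_lines = rec + '\n' + rec     # two records on separate lines

def match_positions(pattern, text):
    """Return the start offset of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

plain = r'\{\W+name'
with_m = r'(?m)\{\W+name'

same_on_merged = match_positions(plain, merged) == match_positions(with_m, merged)
same_on_lines = match_positions(plain, on_new_lines) == match_positions(with_m, on_new_lines)

print(same_on_merged, same_on_lines)   # True True
```

Both patterns find the second record inside the merged line, which suggests the regex itself is not the limiting factor.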

Saaral
New Member

Oops! I applied that as well. Below is what is on the server, and it is still not working as I expected.

[empath_app_log]
pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = (?m){\W+name
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true

sudosplunk
Motivator

Hmm. Can you check whether any other setting is taking precedence by running this command: splunk btool props list --debug | grep 'empath_app_log'

Would you mind walking me through your architecture? Is the data flow UF --> HF --> indexer?

Saaral
New Member

The data flow is from the deployment server to the heavy forwarder to the indexers.

sudosplunk
Motivator

Are you collecting logs from the deployment server? In that case, please place the same props.conf along with your inputs.conf on the DS as well. What was the output of the btool command? Did you notice any conflicts?

Saaral
New Member

I am unable to run that command. I don't have that privilege.
