Why does BREAK_ONLY_BEFORE work only for some events?

Saaral · ‎09-06-2018

I have applied the settings below on the heavy forwarders, but the regex in BREAK_ONLY_BEFORE works for only a few events; most events are not getting parsed.

pulldown_type = 1
SEDCMD-backslash=s/\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = {\”name\”
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = false
NO_BINARY_CHECK = true

Sample logs are below.

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN

For some events the same stanza on the heavy forwarder works, but for others it does not. Can someone let me know what could be wrong?

sudosplunk · ‎09-06-2018

Hi,

Your SHOULD_LINEMERGE value must be true, because BREAK_ONLY_BEFORE is only applied when line merging is enabled. I also made a small adjustment to your regex. Try the settings below:

props.conf:

BREAK_ONLY_BEFORE = \{\W+name
SHOULD_LINEMERGE = true

Saaral · ‎09-06-2018

Thanks! But how did my stanza work for one event and not for another? Why did it not work for all the events with the same pattern? Also, with the regex you provided, I want to break only at name and at the brace before it. Will this break the event at the name field?

sudosplunk · ‎09-06-2018

I am not sure how it worked for the first event. Your regex did not match the event. Tested here. The backslash before quotes must be escaped in order to match \".

I updated my regex above. It looks for { before name.
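The escaping point above can be checked outside Splunk. The sketch below uses Python's re module as a stand-in for Splunk's PCRE engine (an assumption; the two agree on these simple patterns). The shortened sample record and the variable names are made up for illustration, and the originally posted pattern is written with straight quotes.

```python
import re

# Raw text as it looks before SEDCMD removes anything: every quote is
# preceded by a literal backslash. (Shortened, illustrative record.)
raw = r'{\"name\":\"\",\"level\":30,\"severity\":\"info\"}'

# Pattern as first posted: in a regex, \" is just a quote, so nothing
# accounts for the literal backslash that sits between { and " in the data.
as_posted = re.compile(r'{\"name\"')

# Escaping the backslash itself lets the pattern see the \" pair.
escaped = re.compile(r'\{\\"name\\"')

# \W+ absorbs any run of non-word characters, so it covers \" as well.
loose = re.compile(r'\{\W+name')

print("as posted:", bool(as_posted.search(raw)))
print("escaped:  ", bool(escaped.search(raw)))
print("loose:    ", bool(loose.search(raw)))
```

Only the last two patterns find the record, which is consistent with the original regex not matching the escaped data.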

Saaral · ‎09-07-2018

Hi Surya

Thanks! I will try to implement it. Also, could you let me know what regex can be applied to the log sample below to break at the name field?

{\"name\":\"\",\"level\":,\"severity\":\"info\",\"time

sudosplunk · ‎09-07-2018

If events are multi-line, then try (?m)\{\W+name

(?m) - multi-line modifier (in PCRE this only changes how ^ and $ match)
\{ - This will look for { literally.
\W+ - This will match one or more non-word characters. If you're sure about the number of characters between { and name, then make use of quantifiers, for example, \W{1,3} - this will look for a minimum of 1 and a maximum of 3 non-word characters instead of 1 to unlimited.
name - This will look for name literally (case-sensitive).
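To see how each piece of the pattern behaves, here is a small sketch using Python's re module as an approximation of Splunk's PCRE engine (an assumption). The sample strings and the value svc are invented for illustration.

```python
import re

escaped = r'{\"name\":\"svc\",\"level\":30}'   # raw data, backslashes intact
unescaped = '{"name":"svc","level":30}'        # same record without backslashes
spaced = '{ \t\n "name":"svc"}'                # five non-word chars before name
capital = '{"Name":"svc"}'                     # different case

unbounded = re.compile(r'\{\W+name')      # one or more non-word characters
bounded = re.compile(r'\{\W{1,3}name')    # between one and three of them
with_m = re.compile(r'(?m)\{\W+name')     # same pattern with the (?m) flag

samples = {"escaped": escaped, "unescaped": unescaped,
           "spaced": spaced, "capital": capital}

for label, text in samples.items():
    print(label,
          bool(unbounded.search(text)),
          bool(bounded.search(text)),
          bool(with_m.search(text)))
```

In this sketch the bounded form rejects the spaced sample, both forms reject the capitalised key, and the (?m) variant gives the same answers as the plain pattern because the pattern contains no ^ or $ anchors.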

Please refer to this page for more details.

If events are not multi-line:

I would suggest using LINE_BREAKER instead of BREAK_ONLY_BEFORE, because LINE_BREAKER will improve processing speed. If you would like to use LINE_BREAKER, the configs are below:

LINE_BREAKER = ([\r\n]+)\{\W+name
SHOULD_LINEMERGE = false
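As a rough model of what this LINE_BREAKER does (a Python sketch, not Splunk's implementation): the text matched by the first capture group is treated as the boundary between events and is discarded. The helper split_events and the sample record with the value svc are hypothetical.

```python
import re
from typing import List

def split_events(raw: str, line_breaker: str) -> List[str]:
    """Hypothetical model of LINE_BREAKER: cut at capture group 1 and drop it."""
    events: List[str] = []
    start = 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[start:m.start(1)])  # text before the group ends an event
        start = m.end(1)                      # text after the group starts the next
    events.append(raw[start:])
    return [e for e in events if e]

record = r'{\"name\":\"svc\",\"level\":30}'
breaker = r'([\r\n]+)\{\W+name'

on_new_lines = record + "\n" + record  # each record starts on its own line
same_line = record + record            # records glued together as }{

print(len(split_events(on_new_lines, breaker)))
print(len(split_events(same_line, breaker)))
```

Under this model the newline-separated input yields two events, while the glued input stays as one, because the pattern requires at least one \r or \n in front of the brace.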

Saaral · ‎09-11-2018

Hi Surya

We tried almost all of the suggestions you provided, but nothing seems to work. Only a few events are being parsed and most are not. The SED command I am applying works for all the events, but the regex does not. I have not tried LINE_BREAKER, though. Will it work?

sudosplunk · ‎09-11-2018

Okay, I see what you're doing. I will provide you with two sets of configs: one for multi-line events and another for single-line events. Please apply these configs per your use case.

Multi line events (records with name starting in same line):

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN

props.conf:

[your_sourcetype]
    BREAK_ONLY_BEFORE = (?m)\{\W*name
    SHOULD_LINEMERGE = true
    SEDCMD-backslash=s/\\//g
    DATETIME_CONFIG = CURRENT
    KV_MODE = json
    category = Structured
    NO_BINARY_CHECK = true
    TRUNCATE = 0

Single line events (records with name starting in new line):

{\"name\":\"\",\"\":,\"severity\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"\":\"UNKNOWN CORRELATION\",\"userId\":\"UNKNOWN USER\",\"moduleName\":\"\",\"\":\"a\",\"client\":\"AgentDesktop\",\"type\":\"application\",\"msg\":\"\",\"\":\"\"}
{\"name\":\"\",\"level\":30,\"\":\"info\",\"time\":,\"host\":\"\",\"hostname\":\"\",\"\":\"\",\"clientCorrelationId\":\"\",\"userId\":\"UNKNOWN

props.conf:

[your_sourcetype]
    LINE_BREAKER = ([\r\n]+)\{\W*name
    SHOULD_LINEMERGE = false
    SEDCMD-backslash=s/\\//g
    DATETIME_CONFIG = CURRENT
    KV_MODE = json
    category = Structured
    NO_BINARY_CHECK = true
    TRUNCATE = 0

You can test regex for both BREAK_ONLY_BEFORE and LINE_BREAKER with their respective data samples here.

Also, in your configurations, you're using both INDEXED_EXTRACTIONS and KV_MODE to extract JSON fields. This is not recommended, as it will extract fields twice, resulting in duplicate field values. Please have a look at the links below and use whichever one of the two settings suits your need.

https://answers.splunk.com/answers/556279/why-would-indexed-extractionsjson-in-propsconf-be.html

https://www.hurricanelabs.com/blog/splunk-case-study-indexed-extractions-vs-search-time-extractions

Saaral · ‎09-11-2018

Hi Surya - The solution that you provided yesterday works only for the events starting on a new line. For the events that are merged in a single line, it does not work. Will the above stanza work for those merged events within a single line too?

sudosplunk · ‎09-11-2018

Yes. Use the 1st set of configs. I am not sure why it did not work the first time. Can you paste the full props.conf that you're using right now? Please use "code generator" (the icon with 101010) for pasting content.

Saaral · ‎09-11-2018

[empath_app_log]
pulldown_type = 1
SEDCMD-backslash=s/\\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = \{\W+name
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true

Saaral · ‎09-11-2018

This is what we deployed last night. Only the events starting on a new line are being parsed, while the events merged together in a single line are not.

Saaral · ‎09-11-2018

{"name":"utterance.service logger","level":30,"severity":"info","time":"host":"","hostname":"","category":"application","clientCorrelationId":"","userId":"","moduleName":"DisplayUtterancesFsModule","source":"angular","client":"AgentDesktop","type":"application","msg":"utterance does not exist","logId":""}{"name":"utterance.service logger","level":30,"severity":"info","time":,"host":"","hostname":"","category":"application","clientCorrelationId":"","userId":"","moduleName":"","source":"angular","client":"AgentDesktop","type":"application","msg":"utterance does not exist","logId":""}

Saaral · ‎09-11-2018

Above is the sample log that is not being parsed. I pulled it from the Splunk UI.
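One possible explanation for the merged records, sketched in Python under stated assumptions: with SHOULD_LINEMERGE = true, the data is first cut into lines by LINE_BREAKER (default ([\r\n]+)), and BREAK_ONLY_BEFORE is then tested against each whole line to decide whether it starts a new event. If that model holds, two records glued together as }{ on one line can never be separated at the merge stage. The helpers merge_lines and split_events, the abbreviated record, and the candidate pattern are all hypothetical; whether Splunk accepts a capture group that can match the empty string should be verified on a test instance.

```python
import re
from typing import List

def merge_lines(raw: str, break_only_before: str) -> List[str]:
    """Hypothetical model of SHOULD_LINEMERGE = true with BREAK_ONLY_BEFORE."""
    pattern = re.compile(break_only_before)
    events: List[str] = []
    for line in re.split(r'[\r\n]+', raw):   # default LINE_BREAKER runs first
        if not line:
            continue
        if pattern.search(line) or not events:
            events.append(line)              # this line opens a new event
        else:
            events[-1] += "\n" + line        # otherwise glue it to the previous one
    return events

def split_events(raw: str, line_breaker: str) -> List[str]:
    """Hypothetical model of LINE_BREAKER: cut at capture group 1 and drop it."""
    events: List[str] = []
    start = 0
    for m in re.finditer(line_breaker, raw):
        events.append(raw[start:m.start(1)])
        start = m.end(1)
    events.append(raw[start:])
    return [e for e in events if e]

# Abbreviated record in its raw, still-escaped form (assumed to be what the
# line breaker sees before SEDCMD runs).
record = r'{\"name\":\"utterance.service logger\",\"level\":30,\"msg\":\"utterance does not exist\"}'
merged = record + record

# The merge stage sees a single line, so only one event comes out.
print(len(merge_lines(merged, r'\{\W+name')))

# Candidate breaker: the group sits between } and { and may be empty.
candidate = r'\}([\r\n]*)\{\W*name'
print(len(split_events(merged, candidate)))
print(len(split_events(record + "\n" + record, candidate)))
```

In this model the candidate pattern separates both the glued and the newline-separated records, which suggests trying a LINE_BREAKER of that shape with SHOULD_LINEMERGE = false.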

sudosplunk · ‎09-11-2018

Thanks for the information. Please add (?m) - the multi-line modifier - before \{\W+name. This will make Splunk look at each line for the {"name string.

Saaral · ‎09-11-2018

Oops! I applied that as well. Below is the one that is on the server, and it is still not working as I expected.

[empath_app_log]
pulldown_type = 1
SEDCMD-backslash=s/\//g
TRUNCATE = 0
BREAK_ONLY_BEFORE = (?m){\W+name
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = json
KV_MODE = json
category = Structured
SHOULD_LINEMERGE = true
NO_BINARY_CHECK = true

sudosplunk · ‎09-11-2018

Hmm. Can you check if any other setting is taking precedence by running this command: splunk btool props list --debug | grep 'empath_app_log'

Do you mind walking me through your architecture? Is the data flow UF --> HF --> Indexer?

Saaral · ‎09-11-2018

The data flow is from the deployment server to the heavy forwarder to the indexers.

sudosplunk · ‎09-11-2018

Are you collecting logs from the deployment server? In that case, please place the same props.conf along with your inputs.conf on the DS as well. What was the output of the btool command? Did you notice any conflicts?

Saaral · ‎09-11-2018

I am unable to run that command. I don't have that privilege.
