Getting Data In

How do I fine-tune JSON extraction from inside a log file using the "Add data" wizard?

tmaire2
New Member

Hello everyone,

I have a Log file with JSON format in it like this :

12:48:12.3194 Info {"message":"Test ListOfEmails execution started","level":"Information","logType":"Default","timeStamp":"2018-11-12T12:48:12.0992011+01:00","fingerprint":"fingerprintID","windowsIdentity":"WindowsIdentity_name","machineName":"machine_name","processName":"Test ListOfEmails","processVersion":"1.0.0.0","jobId":"name_of_the_job","robotName":"name_of_the_robot","machineId":44111,"fileName":"Main"}

When I imported this file (manually) with the Splunk "Add data" wizard, it didn't auto-discover the fields in the JSON part, so I tried "Extract Fields" to extract them. It works for some of the fields but not all of them (such as "machineId" and "fileName"). Whether I try to extract multiple fields at once or one by one, I get the same result; it throws this error:

"The extraction failed. If you are extracting multiple fields, try removing one or more fields. Start with extractions that are embedded within longer text strings."

Then I tried writing my own regex:

^(?:[^ \n]* ){2}\{"\w+":"(?P<message>[^"]+)[^:\n]*:"(?P<level>[^"]+)[^:\n]*:"(?P<logType>\w+)(?:[^"\n]*"){8}(?P<fingerprint>[^"]+)[^:\n]*:"(?P<windowsIdentity>[^"]+)[^:\n]*:"(?P<machineName>[^"]+)[^:\n]*:"(?P<processName>[^"]+)[^:\n]*:"(?P<processVersion>[^"]+)[^:\n]*:"(?P<jobId>[^"]+)[^:\n]*:"(?P<robotName>[^"]+)[^:\n]*:(?P<machineId>[^",]+)[^:\n]*:"(?P<fileName>[^"]+)

It works (it extracts all my fields except a few with very long messages) until I add the last part for "fileName", at which point it gives this error:

Error in 'rex' command: regex="(?ms)^(?:[^ \n]* ){2}\{"\w+":"(?P<message>[^"]+)[^:\n]*:"(?P<level>[^"]+)[^:\n]*:"(?P<logType>\w+)(?:[^"\n]*"){8}(?P<fingerprint>[^"]+)[^:\n]*:"(?P<windowsIdentity>[^"]+)[^:\n]*:"(?P<machineName>[^"]+)[^:\n]*:"(?P<processName>[^"]+)[^:\n]*:"(?P<processVersion>[^"]+)[^:\n]*:"(?P<jobId>[^"]+)[^:\n]*:"(?P<robotName>[^"]+)[^:\n]*:(?P<machineId>[^",]+)[^:\n]*:"(?P<fileName>[^"]+)" has exceeded configured match_limit, consider raising the value in limits.conf

Afterward, I tried removing the "12:48:12.3194 Info " prefix so that only the JSON remained, and it worked like a charm with field auto-discovery (no need for "Extract Fields").

Is there a way in the "Add data" wizard to remove this "12:48:12.3194 Info " prefix so that only the JSON is kept? Is that a good approach? Or is there another way to transform my logs that I haven't thought of?

Thank you in advance for your replies,

Regards,
Thibaut

1 Solution

skalliger
Motivator

Hi,

Nope, there is no way to tune the JSON discovery. However, you can cut _raw before the fields get extracted.
You would want to do something like this in your props.conf and transforms.conf:

props.conf:

[your_sourcetype]
# call the transform class whatever you like (TRANSFORMS-example)
TRANSFORMS-json = json_cut

transforms.conf:

[json_cut]
DEST_KEY = _raw
# timestamp group is non-capturing, so $1 refers to the JSON capture
REGEX = (?:^(?:\d+:){2}\d+\.\d+\s\w+\s)(?<json>[^}]+\})
FORMAT = $1

You may want to tune this RegEx. I just took your example event and matched it.
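To sanity-check the transform outside Splunk, here is a small Python sketch. Python's `re` stands in for Splunk's PCRE engine, the sample event is a shortened version of the one in the question, and the regex keeps the timestamp group non-capturing so the JSON is the only capture:

```python
import json
import re

# The transform's regex, with Splunk's (?<name>) written as Python's (?P<name>).
PREFIX_CUT = re.compile(r'^(?:\d+:){2}\d+\.\d+\s\w+\s(?P<json>[^}]+\})')

event = ('12:48:12.3194 Info {"message":"Test ListOfEmails execution started",'
         '"level":"Information","logType":"Default","machineId":44111,'
         '"fileName":"Main"}')

match = PREFIX_CUT.match(event)
raw = match.group('json')   # what _raw would be rewritten to by the transform
fields = json.loads(raw)    # the remaining JSON now parses cleanly
print(fields['machineId'], fields['fileName'])  # → 44111 Main
```

If the JSON parses here, Splunk's automatic field discovery should handle the rewritten _raw as well.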

Skalli


tmaire2
New Member

Hello Skalliger, Thank you for your help 🙂

Thank you, it works better, although some events are still missing. I found the problem: when I import my data with the "Add data" wizard and leave "Line break" on auto, I get the same number of events as when I import the file with the configuration files, but, as I said, some events are merged together, so I don't get all of them.

Still in the "Add data" wizard, if I select "every line" instead of "auto" for "Line break", it works (all my events are separated correctly). So how can I translate that into the config files? I guess the modification goes in transforms.conf?

edit: I found it: add SHOULD_LINEMERGE = false to props.conf
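For reference, combined with the stanza from the accepted answer, the props.conf entry might then look like this (the sourcetype name is the placeholder from that answer):

```
[your_sourcetype]
# break events on every newline (the default LINE_BREAKER) instead of merging lines
SHOULD_LINEMERGE = false
TRANSFORMS-json = json_cut
```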

Thank you again for your time.
Thibaut


skalliger
Motivator

Thanks for the feedback. It would be nice, however, if you could accept my answer as the answer to the question. I'm trying to get a free .conf pass next year. 🙂

Skalli


tmaire2
New Member

Thank you for your time and help :). I accepted your answer; is that OK now?

Thibaut


tmaire2
New Member

It's working, but not completely: some of the events are missing.

It works for:

    09:54:34.1821 Info {"message":"UiPath_REFrameWork_UiDemo execution started","level":"Information","logType":"Default","timeStamp":"2018-10-08T09:54:34.0170959+02:00","fingerprint":"0fcfd8d0-ad31-47fd-b240-c1ddc9fd4169","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"252fbec2-83d3-4f01-b165-5c728b850989","robotName":"DCPJQQ2","machineId":44772,"fileName":"System1_login"}

But not for :

09:55:11.0503 Info {"message":"UiPath_REFrameWork_UiDemo execution started","level":"Information","logType":"Default","timeStamp":"2018-10-08T09:55:10.9611418+02:00","fingerprint":"41543e91-d14f-48d3-ac9a-d53b3a3c33da","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"54bea4fe-da6c-4c55-aec3-019bd57b037b","robotName":"DCPJQQ2","machineId":44772,"fileName":"InitAllApplications"}

Or :

14:11:05.6823 Info {"message":"UiPath_REFrameWork_UiDemo execution ended","level":"Information","logType":"Default","timeStamp":"2018-10-08T14:11:05.6874037+02:00","fingerprint":"325d2ba7-f8a2-440d-9e8a-70bf6103008a","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"4f6ac200-c4cc-4562-953d-33c7f1e3b00e","robotName":"DCPJQQ2","machineId":44772,"totalExecutionTimeInSeconds":1,"totalExecutionTime":"00:00:01","fileName":"Main"}

Or :

09:54:34.6757 Error {"message":"Invoke Workflow File: Cannot create unknown type '{http://schemas.uipath.com/workflow/activities}GetSecureCredential'.","level":"Error","logType":"Default","timeStamp":"2018-10-08T09:54:34.6747442+02:00","fingerprint":"1953d68a-44f3-4b9f-b10d-df026d4b941e","windowsIdentity":"name","machineName":"DCPJQQ2","processName":"UiPath_REFrameWork_UiDemo","processVersion":"1.0.0.0","jobId":"252fbec2-83d3-4f01-b165-5c728b850989","robotName":"DCPJQQ2","machineId":44772,"fileName":"System1_login"}

For the last one, I guess it's a regex problem because of the "}" inside "message", but for the others I don't know why Splunk doesn't take them, because they are very similar to the first one.

Thanks for the help


skalliger
Motivator

Oh, I didn't see those extra braces. Sorry, then we make it a little easier:

(?<json>\{[^\r\n]+\})

This matches from the first opening brace to the last closing brace on the line (stopping at \n or \r\n), so the braces inside "message" no longer break the extraction.
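To see why a greedy match helps, here is a quick Python check against the failing Error event (a variant of the same idea in Python syntax; the greedy `.+` runs to the last closing brace, so the braces inside "message" stay in the match):

```python
import json
import re

# Greedy .+ backtracks from the end, so the match ends at the LAST closing
# brace and the "{...}" embedded in the "message" value is kept intact.
JSON_PART = re.compile(r'(?P<json>\{.+\})')

event = ('09:54:34.6757 Error {"message":"Invoke Workflow File: Cannot create '
         "unknown type '{http://schemas.uipath.com/workflow/activities}"
         'GetSecureCredential\'.","level":"Error","machineId":44772,'
         '"fileName":"System1_login"}')

fields = json.loads(JSON_PART.search(event).group('json'))
print(fields['level'], fields['fileName'])  # → Error System1_login
```

A non-greedy or brace-excluding pattern would stop at the first "}" inside the message text, producing truncated, unparseable JSON.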

Skalli



tmaire2
New Member

hello skalliger,

Thanks for your response !

Do I modify these files in the default or the local directory? (Sorry, I'm quite new to these config files.) And after that, how can I see these modifications (select my new sourcetype) in the "Add data" wizard? Even after modifying props.conf and transforms.conf in the local or default directory, I still can't see my new sourcetype.
I must be doing something wrong.

Thank you,
Thibaut


skalliger
Motivator

You want to make your modifications inside the local directory. If the files don't exist yet, create them.
The JSON must be read from somewhere, for example via a monitor input on a Universal Forwarder. When defining your inputs.conf to get your data in, you should always define an index and a sourcetype.
That sourcetype is what we refer to from props.conf and transforms.conf.

Did that answer your question?

Skalli


tmaire2
New Member

Thank you Skalliger, it's very clear with a UF. The thing is, we don't have a Splunk infrastructure yet (I use the free license on my machine, without any UF or HF), so for now I just want to understand how to properly get data in. All my logs are on my computer and I import them with the "Add data" wizard.

So, if I'm right, I first need to create an index (or can I use the default one?) and a sourcetype in inputs.conf on the machine where Splunk is installed, then modify the props and transforms files to reference the sourcetype created in inputs.conf. After that, will I see my sourcetype in the "Add data" wizard with the correct transformation applied to my logs?

Thanks for your time,
Thibaut


skalliger
Motivator

As mentioned before, you always want to set an index and a sourcetype, and you don't want to use the main index. 🙂

That's correct: define the data inputs in inputs.conf, create an index in indexes.conf, and there you go.
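A minimal indexes.conf stanza for such an index might look like this (the index name is illustrative; $SPLUNK_DB expands to Splunk's data directory):

```
[uipath]
homePath   = $SPLUNK_DB/uipath/db
coldPath   = $SPLUNK_DB/uipath/colddb
thawedPath = $SPLUNK_DB/uipath/thaweddb
```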

Skalli
