Getting Data In

Why are JSON Wrapped Windows Logs not being read as JSON by "Add Data"?

keenerms
Engager

Hey, I'm very experienced using Splunk as an analyst but not at all on the admin side of things, though I'm trying to learn.  I was recently given a JSON file full of Windows logs to analyze.  Not sure why they gave me the data that way, but they did, and that's how I have to use it.

When I try to upload the file with "Add Data", Splunk does not recognize it as JSON.  If I select json_no_timestamp, it seems to recognize it, but it doesn't break the file up into events.  Every event starts the same way, and I've copied a sample event below (auto-formatted).  Using Regex101, I found a regex that matches the beginning of an event, but adding that to the Event Breaks pattern does not break the events.

I've tried the following Event Breaks patterns, because sometimes when you copy the lines there is whitespace and sometimes there isn't (Splunk, Atom, and Regex101 show the line breaks and whitespace, but when I pasted it into this post there were no line breaks; unsure if that's because of presentation or just copy/paste):
\{\s\"sort\"\:
{\n\s+\"sort\"
\{\r\n\s+\"sort\"\:
{ "sort":

{
    "data": [
        {
            "sort": [
                0
            ],
            "_score": null,
            "_type": "winevtx",
            "_index": "winevtx",
            "_id": "==",
            "_source": {
                "process_id": 488,
                "message": "A Kerberos service ticket was requested.",
                "provider_guid": "{}",
                "log_name": "Security",
                "source_name": "Microsoft-Windows-Security-Auditing",
                "event_data": {
                    "TicketOptions": "0x60810010",
                    "TargetUserName": "JOHN$@LOCAL.LOCAL",
                    "ServiceName": "krbtgt",
                    "IpAddress": "::ffff:10.10.0.1",
                    "TargetDomainName": "LOCAL.LOCAL",
                    "IpPort": "53782",
                    "TicketEncryptionType": "0x12",
                    "LogonGuid": "{}",
                    "TransmittedServices": "-",
                    "Status": "0x0",
                    "ServiceSid": "S-1-5-21-3052363079-1128767895-2942130287-502"
                },
                "beat": {
                    "name": "LOCAL",
                    "version": "5.2.2",
                    "hostname": "LOCAL"
                },
                "thread_id": 1096,
                "@version": "1",
                "@metadata": {
                    "index_local_timestamp": "2017-04-20T06:27:21.283576",
                    "hostname": "LOCAL",
                    "index_utc_timestamp": "2017-04-20T06:27:21.283576",
                    "timezone": "UTC+0000"
                },
                "opcode": "Info",
                "@timestamp": "2017-04-20T06:25:33.801Z",
                "tags": [
                    "beats_input_codec_plain_applied"
                ],
                "type": "wineventlog",
                "computer_name": "LOCAL.LOCAL.local",
                "event_id": 4769,
                "record_number": "127898",
                "level": "Information",
                "keywords": [
                    "Audit Success"
                ],
                "host": "LOCAL",
                "task": "Kerberos Service Ticket Operations"
            }
        }
    ]
}

Every event starts with { "sort": [ 0 ], so I know that's where I want to break it up.  I'm sure I'm missing something simple.  What is it?

Appreciate any assistance.


PickleRick
SplunkTrust

You'd have to post a longer excerpt for us to advise anything. Put it in a code block or use the "preformatted" style so it doesn't get butchered by the browser re-breaking the lines.

But if this is a one-off task, I'd probably try to use Python to load the JSON and output the events into a reasonably formatted file.
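
Something like this rough sketch could do it, assuming the file really looks like your excerpt (a top-level object with a "data" array and the payload under "_source"); the file names are just placeholders:

import json

# Load the whole export; fine for a one-off file that fits in memory.
with open("winevtx_export.json") as f:  # placeholder input name
    doc = json.load(f)

# Write one compact JSON object per line so the event boundaries become trivial.
with open("winevtx_events.json", "w") as out:  # placeholder output name
    for entry in doc.get("data", []):
        # Keep just the log payload and drop the search-index wrapper fields.
        event = entry.get("_source", entry)
        out.write(json.dumps(event) + "\n")

Each line of the output is then one event, and the built-in JSON handling (e.g. the _json sourcetype or KV_MODE = json) should pick up the fields without any custom line breaking.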


keenerms
Engager

Edited it to add a single event, and deleted or modified any items that might be considered identifying.

I considered using a JSON-to-CSV Python script, but I thought that would be more work for a one-off than Splunk.  My assumption is that I'm not understanding something about how Splunk ingests JSON.


PickleRick
SplunkTrust

Well, even if you broke at this "sort" entry, you would not get well-formed JSON as a result. Each event would lack a proper beginning and might instead carry some trailing data.

When ingesting JSON, regardless of whether you want index-time or search-time extractions, each whole event needs to be a well-formed JSON structure, so you have to sort out the event breaking first so your input is split properly. Only then can you worry about whether it's JSON or not. That's relatively easy with single-line JSON but, as you noticed, gets complicated with multiline JSON, since the event boundaries are not well defined. That's why I suggested re-processing it with an external tool to get clearer event separation.

If your events are indented, you could try looking for the unindented closing bracket to find the end of an event. Since you want that bracket to be included in the ending event, you need a non-capturing group.

LINE_BREAKER = (?:^})([\r\n]+)
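
Wrapped into a complete props.conf stanza for your upload, that might look roughly like this (the sourcetype name and every setting other than the LINE_BREAKER are just my guess at a reasonable starting point, so check them against your data):

[winevtx_json]
# Break the stream after an unindented closing brace.
LINE_BREAKER = (?:^})([\r\n]+)
SHOULD_LINEMERGE = false
# Events are large multiline JSON structures, so raise the truncation limit.
TRUNCATE = 100000
# Parse the JSON fields at search time.
KV_MODE = json
# The timestamp lives in the @timestamp field, e.g. "2017-04-20T06:25:33.801Z".
TIME_PREFIX = \"@timestamp\":\s*\"
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3NZ
MAX_TIMESTAMP_LOOKAHEAD = 40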

 
