Getting Data In

How to break JSON data coming from TCP input

preben12
Communicator

Hi

I'm trying to break JSON events coming from a TCP input into separate events.

 {
    "action" : "STOP",
    "source" : "AS_PLANNED",
    "timestamp" : "2017-03-24T08:29:59.977+01:00",
    "productionNumber" : "14801720125",
    "productionType" : "Radio",
    "eventId" : "1179469773327",
    "title" : "Some title",
    "flowPublicationId" : "1179469742812",
    "channelPresentationCode" : "xx",
    "channelPresentationName" : "xxyy",
    "timeAllocationType" : "Segment of program",
    "actualTime" : "2017-03-24T08:30:00.000+01:00",
    "startTimeAnnounced" : "2017-03-24T08:06:00.000+01:00",
    "startTimePlanned" : "2017-03-24T08:06:00.000+01:00",
    "stopTimePlanned" : "2017-03-24T08:30:00.000+01:00",
    "broadcastDate" : "2017-03-24",
    "live" : false,
    "quickReprise" : false,
    "streamingLive" : false,
    "streamingOD" : true,
    "streamingDestination" : " (WEBCMS)",
    "numberOfBlocks" : "8",
    "blockPartNumber" : "5",
    "blockId" : "1179469768813"
  }

Note that the JSON is pretty-printed with spaces and line breaks.
It works fine with the default json sourcetype if I omit the spaces and line breaks, but with the pretty-printed version each event gets split into several events.

I have figured out that I have to create a custom sourcetype and use a custom LINE_BREAKER, as stated here: https://answers.splunk.com/answers/171197/how-to-get-two-lines-of-json-to-break-as-two-event.html.
But I was not able to find the magic regex that copes with the spaces and line breaks.


skoelpin
SplunkTrust

You should apply base configs at the app level in props.conf to get the line breaking you're looking for.

So as a basic example, under /opt/splunk/etc/apps/<APP_NAME>/local/props.conf

You should have the following stanza

[my_sourcetype]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
TZ = GMT
# A performance tweak is to disable SHOULD_LINEMERGE and then set the
# LINE_BREAKER to "line ending characters coming before the opening brace
# of the next JSON object". Only the captured line endings are discarded.
# TIME_PREFIX and TIME_FORMAT must match the timestamp in your own events.
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
LINE_BREAKER = ([\r\n]+)\s*\{
SHOULD_LINEMERGE = False
# 10000 is default, should be set on a case by case basis
TRUNCATE = 10000
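
For the stanza above to apply, the TCP input has to be tagged with that sourcetype, and props.conf has to live on the instance that parses the data (the indexer or a heavy forwarder). A minimal sketch of the matching inputs.conf, assuming a hypothetical port 5514 that is not from this thread:

```ini
# inputs.conf, in the same app as props.conf
# Port 5514 is an assumption for illustration; use the port your sender targets.
[tcp://5514]
# Must match the stanza name used in props.conf
sourcetype = my_sourcetype
```

Restart Splunk after changing these files so the new input and line-breaking rules are picked up.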

Since this is TCP data, it will most likely not have a timestamp baked into the event, so the timestamp will be the time the event was indexed. I would suggest sending the TCP data to a dedicated syslog server that logs it, then installing a forwarder on that syslog server to send the data to Splunk.
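
If the events do carry a usable timestamp, as the sample in the question appears to with its "timestamp" key, the time settings can point at that field instead of falling back to index time. The following is an untested sketch under that assumption; the TIME_PREFIX regex and TIME_FORMAT string are derived from the sample value 2017-03-24T08:29:59.977+01:00, not taken from the original answer:

```ini
[my_sourcetype]
SHOULD_LINEMERGE = false
# Break on line endings that precede the opening brace of the next object
LINE_BREAKER = ([\r\n]+)\s*\{
# Start timestamp parsing right after the "timestamp" key and its opening quote
TIME_PREFIX = "timestamp"\s*:\s*"
# ISO 8601 with milliseconds and a colon-separated UTC offset
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%3N%:z
# The sample timestamp is 29 characters long
MAX_TIMESTAMP_LOOKAHEAD = 30
TRUNCATE = 10000
```

TZ is left out here because the sample timestamp already carries its own offset.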

preben12
Communicator

Works fine. Thanks

sbbadri
Motivator

Try this:

[yoursourcetype]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none

skoelpin
SplunkTrust

Why would you set SHOULD_LINEMERGE = true?

With line merging on, Splunk can still start a new event at any line that contains a date, so the pretty-printed JSON may keep getting split, which defeats the purpose. A better approach would be to capture the entire JSON message as a single event.

Also, why set NO_BINARY_CHECK? This stanza has trouble written all over it.
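
To make the "entire JSON message as a single event" idea concrete, here is a hedged sketch that keeps line merging off and leaves field extraction to search time. It is an illustration, not a tested config from this thread, and the stanza name is the placeholder used above:

```ini
[yoursourcetype]
# One event per JSON object: no line merging, break before each opening brace
SHOULD_LINEMERGE = false
LINE_BREAKER = ([\r\n]+)\s*\{
# Extract JSON fields at search time (this setting is read on the search head)
KV_MODE = json
TRUNCATE = 10000
```

KV_MODE = json and INDEXED_EXTRACTIONS = json are generally not used together, since extracting the same fields at both index time and search time tends to produce duplicate values.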