How to break JSON data coming from a TCP input

preben12
Communicator

Hi

I'm trying to break JSON events coming from a TCP input into separate events.

 {
    "action" : "STOP",
    "source" : "AS_PLANNED",
    "timestamp" : "2017-03-24T08:29:59.977+01:00",
    "productionNumber" : "14801720125",
    "productionType" : "Radio",
    "eventId" : "1179469773327",
    "title" : "Some title",
    "flowPublicationId" : "1179469742812",
    "channelPresentationCode" : "xx",
    "channelPresentationName" : "xxyy",
    "timeAllocationType" : "Segment of program",
    "actualTime" : "2017-03-24T08:30:00.000+01:00",
    "startTimeAnnounced" : "2017-03-24T08:06:00.000+01:00",
    "startTimePlanned" : "2017-03-24T08:06:00.000+01:00",
    "stopTimePlanned" : "2017-03-24T08:30:00.000+01:00",
    "broadcastDate" : "2017-03-24",
    "live" : false,
    "quickReprise" : false,
    "streamingLive" : false,
    "streamingOD" : true,
    "streamingDestination" : " (WEBCMS)",
    "numberOfBlocks" : "8",
    "blockPartNumber" : "5",
    "blockId" : "1179469768813"
  }

Note that the JSON is pretty-printed with spaces and line breaks.
It works fine with the default json sourcetype if I omit the spaces and line breaks, but with the pretty-printed version each event gets split into several events.

I have figured out that I have to create a custom sourcetype and use a custom LINE_BREAKER, as stated here: https://answers.splunk.com/answers/171197/how-to-get-two-lines-of-json-to-break-as-two-event.html.
But I was not able to find the magic regex that handles the spaces and line breaks.


skoelpin
SplunkTrust

You should apply base configs at the app level in props.conf to get the line breaking you're looking for.

As a basic example, under /opt/splunk/etc/apps/<APP_NAME>/local/props.conf

you should have the following stanza:

[my_sourcetype]
TIME_PREFIX = ^
MAX_TIMESTAMP_LOOKAHEAD = 25
TZ = GMT
# A performance tweak is to disable SHOULD_LINEMERGE and then set the 
# LINE_BREAKER to "line ending characters coming before a new time stamp"
# (note the direct link of the TIME_FORMAT to the regex of LINE_BREAKER).
TIME_FORMAT = %Y-%m-%d %H:%M:%S,%3N
LINE_BREAKER = ([\r\n]+){
SHOULD_LINEMERGE = False
# 10000 is default, should be set on a case by case basis
TRUNCATE = 10000
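
To see what this LINE_BREAKER does to a pretty-printed stream, here is a minimal Python sketch. It only imitates the rule that the text matched by the first capture group is discarded and marks the event boundary; it is not Splunk's actual implementation. The function name break_events, the second sample event, and the assumption that each opening brace sits at the start of a line are illustrative and not from the thread.

```python
import json
import re
from typing import List

# Same pattern as the stanza's LINE_BREAKER; the brace is escaped here
# only to make it unambiguous that it is a literal character.
LINE_BREAKER = re.compile(r"([\r\n]+)\{")


def break_events(raw: str) -> List[str]:
    """Split raw text at each match, dropping only the first capture group."""
    events = []
    start = 0
    for match in LINE_BREAKER.finditer(raw):
        # Everything before the captured newlines ends the previous event.
        events.append(raw[start:match.start(1)])
        # The next event starts right after the captured newlines,
        # so the opening brace stays with the event it belongs to.
        start = match.end(1)
    events.append(raw[start:])
    return [event for event in events if event.strip()]


# Two pretty-printed objects back to back; the second one is made up.
sample = (
    '{\n    "action" : "STOP",\n    "eventId" : "1179469773327"\n}\n'
    '{\n    "action" : "START",\n    "eventId" : "1179469773328"\n}\n'
)

events = break_events(sample)
for event in events:
    # Each piece should now be a complete JSON object.
    print(json.loads(event)["action"])
```

If the opening brace is indented in the real feed, the pattern would probably need to allow whitespace before it, for example ([\r\n]+)\s*\{, but that is an assumption to test against your own data.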

Since this is TCP data, it will most likely not have a timestamp baked into the event, so the timestamp will be the time the event was indexed. I would suggest sending the TCP data to a dedicated syslog server and logging it there, then installing a forwarder on that syslog server to send the data to Splunk.
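
That said, the sample event in the question does carry a "timestamp" field, so it may be worth checking its format before deciding how event time should be assigned. A minimal Python sketch, assuming the field always looks like the value shown in the question (the variable names are illustrative):

```python
from datetime import datetime, timedelta

# Value copied from the "timestamp" field of the sample event.
raw_timestamp = "2017-03-24T08:29:59.977+01:00"

# ISO 8601 with milliseconds and a colon-separated UTC offset.
# %z accepts the "+01:00" form on Python 3.7 and later.
parsed = datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%S.%f%z")

print(parsed.isoformat())
print(parsed.utcoffset() == timedelta(hours=1))
```

If the format holds, the TIME_PREFIX and TIME_FORMAT in the stanza could presumably be pointed at that field instead of the start of the event, but the exact values are an assumption and should be checked against the props.conf documentation.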


preben12
Communicator

Works fine. Thanks


sbbadri
Motivator

Try this:

[yoursourcetype]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
CHARSET=UTF-8
INDEXED_EXTRACTIONS=json
KV_MODE=none


skoelpin
SplunkTrust

Why would you set SHOULD_LINEMERGE = true?

This would result in single-line events and defeat the point of Splunk. A better approach would be to capture the entire JSON message as a single event.

Also, why set NO_BINARY_CHECK = true? This stanza has trouble written all over it.
