All Apps and Add-ons

Splunk Add-on for Amazon Web Services: Why are ingested JSON event fields not extracted using a custom sourcetype for a Kinesis stream?

markconlin
Path Finder

Objective
Use the Splunk Add-on for Amazon Web Services to ingest events from an AWS Kinesis stream with a custom sourcetype.

Issue
Ingested JSON event fields are not extracted when using a custom sourcetype.

What I have tried
I created two Kinesis inputs that read from the same stream: one with sourcetype = aws:kinesis (as specified in the documentation here: http://docs.splunk.com/Documentation/AddOns/released/AWS/Kinesis) and one with a custom sourcetype.

The custom-sourcetype events do not have extracted JSON fields (see picture attached).
The standard-sourcetype events do have extracted JSON fields.
I also tested the custom sourcetype by using oneshot to place JSON data into a test index, and the fields were extracted correctly.

Create indices
/opt/splunk/bin# ./splunk add index fromkinesis
/opt/splunk/bin# ./splunk add index bythebookkn
/opt/splunk/bin# ./splunk add index oneshottest

Test sourcetype with oneshot
/opt/splunk/bin# ./splunk add oneshot /opt/splunk/data/test.json -sourcetype myevents -index oneshottest
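A quick search confirms that the oneshot test extracted the nested fields (a sketch; the field names are taken from the sample event further down this thread):

```
index=oneshottest sourcetype=myevents
| stats count by info.event_type
```

If info.event_type comes back populated here but not on the Kinesis index, the sourcetype definition itself is fine and the problem is in the ingestion path.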

Kinesis inputs

/opt/splunk/etc/apps/Splunk_TA_aws/local# cat aws_kinesis_tasks.conf

[bythebookkn]
account = splunk
encoding =
format = CloudWatchLogs
index = bythebookkn
init_stream_position = LATEST
region = us-east-1
sourcetype = aws:kinesis
stream_names = stage-my-events

[fromkinesis]
account = splunk
encoding =
format = CloudWatchLogs
index = fromkinesis
init_stream_position = LATEST
region = us-east-1
sourcetype = myevents
stream_names = stage-my-events

Sourcetype

/opt/splunk/etc/system/local# cat props.conf
TRUNCATE = 800000

[myevents]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = info.created
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3Q
TZ = UTC
detect_trailing_nulls = auto
SHOULD_LINEMERGE = false
KV_MODE = none
AUTO_KV_JSON = false
category = Custom
disabled = false
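For comparison, a search-time variant of the same sourcetype avoids index-time structured extraction entirely (a sketch to deploy on the search head; the TIME_PREFIX regex is an assumption based on the "created" field in the sample event below):

```
[myevents]
SHOULD_LINEMERGE = false
KV_MODE = json
TIME_PREFIX = "created":\s*"
TIME_FORMAT = %Y-%m-%d %H:%M:%S.%3N
TZ = UTC
```

INDEXED_EXTRACTIONS runs in the parsing pipeline, so it only takes effect where the data is parsed; KV_MODE = json is evaluated at search time and is less sensitive to where props.conf is deployed.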



mreynov_splunk
Splunk Employee

I think you need to remove format=CloudWatchLogs because that strips the JSON wrapper. Set it to "none" and try again.
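Applied literally, the suggested change to the stanza above would look like this (a sketch; whether the add-on expects format = none or an empty value may depend on the add-on version):

```
[fromkinesis]
account = splunk
encoding =
format = none
index = fromkinesis
init_stream_position = LATEST
region = us-east-1
sourcetype = myevents
stream_names = stage-my-events
```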


markconlin
Path Finder

@mreynov_splunk This change does not achieve my objective.

The result is a well-extracted JSON document that is a Kinesis event. Meanwhile, my log messages (which are also JSON) end up as a text field named "message" inside the Kinesis JSON and are not parsed as JSON at all.

{
   "logGroup":"STAGE-airborne-boeing-logs",
   "owner":"076263846157",
   "logStream":"json-STAGE-airborne-stage-alfa-i-ba21b342",
   "subscriptionFilters":[
      "stage-airborne-boeing"
   ],
   "messageType":"DATA_MESSAGE",
   "logEvents":[
      {
         "id":"33043166675459237061536447207295493120073654567646199808",
         "message":"{\"info\": {\"event_type\": \"session_custom_period\", \"relativeCreated\": 8933604.59113121, \"process\": 17277, \"period\": 120, \"module\": \"sessions\", \"funcName\": \"save\", \"msecs\": 616.2080764770508, \"message\": \"Save custom session expiration period\", \"filename\": \"sessions.py\", \"levelno\": 20, \"processName\": \"MainProcess\", \"lineno\": 147, \"asctime\": \"2016-12-14 09:13:59,616\", \"msg\": \"Save custom session expiration period\", \"loggername\": \"airborne.core.accounts.sessions\", \"exc_text\": null, \"name\": \"airborne.core.accounts.sessions\", \"thread\": 140696234220752, \"created\": \"2016-12-14 17:13:59.616\", \"threadName\": \"GreenThread-430\", \"session_id\": \"8io3n7e518knrklws112pgdmwfrvyqur\", \"pathname\": \"/home/ubuntu/projects/airborne/airborne/core/accounts/sessions.py\", \"exc_info\": null, \"message_type\": \"accounts\", \"levelname\": \"INFO\"}, \"context\": {}}",
         "timestamp":1481706839000
      }
   ]
}
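One way to unwrap the inner JSON at search time is spath plus mvexpand (a sketch; the field paths are taken from the sample above, and the final table assumes the info.* fields shown in the escaped message):

```
index=fromkinesis sourcetype=myevents
| spath path=logEvents{}.message output=inner
| mvexpand inner
| spath input=inner
| table info.created info.event_type info.message
```

The first spath pulls the escaped message strings out of the logEvents array as a multivalue field, mvexpand splits them into one event per log message, and the second spath parses each string as JSON.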



mreynov_splunk
Splunk Employee

Hmm... it should work if it is proper JSON throughout; that is the first question to answer. If not, then yes, you are in a pickle.

Either way, it makes sense to start from the aws:kinesis sourcetype, because at least it handles the JSON wrapper for you.
Send me a sample and I can try it. (I am assuming the sample above is not how your data looked coming in; I am specifically interested in the backslashes.)


markconlin
Path Finder

Hello... @mreynov_splunk can you help?
Is this an actual bug like before or am I doing something wrong?
