
Splunk Add-on for Amazon Web Services: Why are ingested JSON event fields not extracted when using a custom sourcetype for a Kinesis stream?

markconlin
Path Finder

Objective
Using the Splunk Add-on for Amazon Web Services to ingest events from AWS Kinesis with a custom sourcetype.

Issue
Ingested JSON event fields are not extracted when using a custom sourcetype.

What I have tried
I have created two Kinesis inputs to read from the stream: one with sourcetype = aws:kinesis (as specified in the documentation at http://docs.splunk.com/Documentation/AddOns/released/AWS/Kinesis) and one with a custom sourcetype.

Events with the custom sourcetype do not have extracted JSON fields (see attached screenshot).
Events with the standard sourcetype do have extracted JSON fields.
I have also tested the custom sourcetype by using oneshot to load JSON data into a test index, and there the fields were extracted correctly.

Create indices
/opt/splunk/bin# ./splunk add index fromkinesis
/opt/splunk/bin# ./splunk add index bythebookkn
/opt/splunk/bin# ./splunk add index oneshottest

Test sourcetype with oneshot
/opt/splunk/bin# ./splunk add oneshot /opt/splunk/data/test.json -sourcetype myevents -index oneshottest

Kinesis inputs

/opt/splunk/etc/apps/Splunk_TA_aws/local# cat aws_kinesis_tasks.conf

[bythebookkn]
account = splunk
encoding =
format = CloudWatchLogs
index = bythebookkn
init_stream_position = LATEST
region = us-east-1
sourcetype = aws:kinesis
stream_names = stage-my-events

[fromkinesis]
account = splunk
encoding =
format = CloudWatchLogs
index = fromkinesis
init_stream_position = LATEST
region = us-east-1
sourcetype = myevents
stream_names = stage-my-events

Sourcetype

/opt/splunk/etc/system/local# cat props.conf
TRUNCATE = 800000

[myevents]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = info.created
TIME_FORMAT = %Y-%d-%m %H:%M:%S.%3Q
TZ = UTC
detect_trailing_nulls = auto
SHOULD_LINEMERGE = false
KV_MODE = none
AUTO_KV_JSON = false
category = Custom
disabled = false


mreynov_splunk
Splunk Employee

I think you need to remove format=CloudWatchLogs because that strips the JSON wrapper. Set it to "none" and try again.
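For reference, the suggested change would look like this in aws_kinesis_tasks.conf (a sketch based on the stanza posted above; None is the other documented value for format):

```
[fromkinesis]
account = splunk
encoding =
format = None
index = fromkinesis
init_stream_position = LATEST
region = us-east-1
sourcetype = myevents
stream_names = stage-my-events
```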


markconlin
Path Finder

@mreynov_splunk This change does not achieve my objective.

The result is a well-extracted JSON document that is the Kinesis event itself. Meanwhile, my log messages (which are also JSON) sit inside that Kinesis event as an escaped text field called "message" and are not parsed as JSON at all.

{
   "logGroup":"STAGE-airborne-boeing-logs",
   "owner":"076263846157",
   "logStream":"json-STAGE-airborne-stage-alfa-i-ba21b342",
   "subscriptionFilters":[
      "stage-airborne-boeing"
   ],
   "messageType":"DATA_MESSAGE",
   "logEvents":[
      {
         "id":"33043166675459237061536447207295493120073654567646199808",
         "message":"{\"info\": {\"event_type\": \"session_custom_period\", \"relativeCreated\": 8933604.59113121, \"process\": 17277, \"period\": 120, \"module\": \"sessions\", \"funcName\": \"save\", \"msecs\": 616.2080764770508, \"message\": \"Save custom session expiration period\", \"filename\": \"sessions.py\", \"levelno\": 20, \"processName\": \"MainProcess\", \"lineno\": 147, \"asctime\": \"2016-12-14 09:13:59,616\", \"msg\": \"Save custom session expiration period\", \"loggername\": \"airborne.core.accounts.sessions\", \"exc_text\": null, \"name\": \"airborne.core.accounts.sessions\", \"thread\": 140696234220752, \"created\": \"2016-12-14 17:13:59.616\", \"threadName\": \"GreenThread-430\", \"session_id\": \"8io3n7e518knrklws112pgdmwfrvyqur\", \"pathname\": \"/home/ubuntu/projects/airborne/airborne/core/accounts/sessions.py\", \"exc_info\": null, \"message_type\": \"accounts\", \"levelname\": \"INFO\"}, \"context\": {}}",
         "timestamp":1481706839000
      }
   ]
}
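As a search-time workaround (a sketch, not from the thread; the index name is taken from the configs above), the embedded message field can be unwrapped with spath, which parses JSON from a field when given input=:

```
index=fromkinesis
| spath path=logEvents{}.message output=message
| mvexpand message
| spath input=message
```

This expands each entry of logEvents into its own result and then parses the escaped JSON string in message into fields, though it does not help with index-time extraction.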


mreynov_splunk
Splunk Employee

Hmm... it should work if it is proper JSON throughout; that is the first question to answer. If it is not, then yeah, you are in a pickle.

Either way, it makes sense to start from aws:kinesis, because at least it handles the JSON wrapper for you.
Send me a sample and I can try it. (I am assuming the sample above is not how your data looked coming in; I am specifically interested in the backslashes.)
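One quick way to answer the backslash question (a hypothetical check, not from the thread) is to verify that the message field is an escaped JSON string that needs a second parse:

```python
import json

# A trimmed stand-in for one Kinesis record, with the nested log event
# escaped inside "message" exactly as in the sample above.
raw = '{"logEvents": [{"message": "{\\"info\\": {\\"levelname\\": \\"INFO\\"}}"}]}'

outer = json.loads(raw)  # first parse: the CloudWatch Logs envelope
# "message" comes back as a plain string, not a dict, so a second parse
# is needed to reach the embedded log event's fields.
inner = json.loads(outer["logEvents"][0]["message"])
print(inner["info"]["levelname"])  # prints: INFO
```

If the second json.loads succeeds like this on real data, the backslashes are just JSON string escaping and the payload is proper JSON throughout.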


markconlin
Path Finder

Hello @mreynov_splunk, can you help?
Is this an actual bug, like before, or am I doing something wrong?
