I have my Apache access logs going to CloudWatch Logs in AWS. I used to use the AWS add-on (TA) for Splunk to collect the events from CloudWatch Logs using the built-in API calls. Due to API limits, I switched to a Kinesis stream and a Lambda function that send the events to the Splunk HTTP Event Collector. The data now comes in as a JSON payload that looks like the following:
{"message": "12.156.22.149 - - [09/Dec/2016:20:20:44 -0500] \"-\" 408 - \"-\" \"-\" \"-\"", "aws_account_name": "anon-op-prod"}
Splunk extracts the fields message and aws_account_name, but the event is not recognized as access_combined_wcookie, so the access log fields are not extracted. My thought is to drop the JSON keys before indexing and keep only the contents of message, because I only care about the data.
I thought this would work, but maybe I am misunderstanding how my regex is handled in transforms.
My config is as follows:
transforms.conf:
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = "message": "(.*")"
DEST_KEY = _raw
FORMAT = $1
props.conf:
[access_combined_wcookie]
TRANSFORMS-ACCESS = setnull, setparsing
This doesn't seem to be working. The idea was to replace _raw with whatever matched in the regex capture group, so that only that text gets indexed. I don't get anything, though. Do I need another step to send the event back to the queue, or is the regex flawed? Any help would be appreciated.
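To check the regex on its own, outside Splunk, here is a quick Python sketch. It assumes _raw is exactly the JSON sample shown above, and that Python's re module treats this particular pattern the same way Splunk's regex engine does. It only tests what the capture group returns; it says nothing about the queue routing done by setnull.

```python
import re

# Sample event copied from the post; assumed to be what Splunk sees as _raw.
raw = r'{"message": "12.156.22.149 - - [09/Dec/2016:20:20:44 -0500] \"-\" 408 - \"-\" \"-\" \"-\"", "aws_account_name": "anon-op-prod"}'

# Same pattern as the REGEX line in the [setparsing] stanza.
pattern = re.compile(r'"message": "(.*")"')

match = pattern.search(raw)
captured = match.group(1) if match else None

# Show what would be substituted for $1 in FORMAT.
print(captured)
```

On this sample the pattern does match, but the captured text still carries the JSON escaping (each quote is preceded by a backslash), so it would not be a clean access_combined_wcookie line even if it were written to _raw.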