dynamic sourcetype extraction problems

Ultracpp
Engager

Hi all,
I am trying to set up dynamic sourcetype extraction, but with no luck.

A sample message contains JSON:
{"id":"someid","type":"action"}

This is my config:

inputs.conf:


[tcp://9001]
connection_host = none
source = platform

props.conf:


[source::platform]
TRANSFORMS-sourcetype = platform-st

transforms.conf:


[platform-st]
SOURCE_KEY = source
DEST_KEY = MetaData:Sourcetype
REGEX = \"type\":\"([^\"]+)\"
FORMAT = sourcetype::$1

Thank you

1 Solution

hexx
Splunk Employee

I believe that the problem lies with this configuration parameter:

"SOURCE_KEY = source"

From transforms.conf.spec:

SOURCE_KEY = <string>
* NOTE: This attribute is valid for both index-time and search-time field extractions.
* Optional. Defines the KEY that Splunk applies the REGEX to.
* For search time extractions, you can use this attribute to extract one or more values from the values of another field. You can use any field that is available at the time of the execution of this field extraction.
* For index-time extractions use the KEYs described at the bottom of this file.
* KEYs are case-sensitive, and should be used exactly as they appear in the KEYs list at the bottom of this file. (For example, you would say SOURCE_KEY = MetaData:Host, not SOURCE_KEY = metadata:host .)
* SOURCE_KEY is typically used in conjunction with REPEAT_MATCH in index-time field transforms.
* Defaults to _raw, which means it is applied to the raw, unprocessed text of all events.

The string "source" is not a valid SOURCE_KEY value for an index-time transform, because it is not one of the KEYs listed in the spec. I am assuming that your goal is to extract the value to assign as the sourcetype from the body of your events.

In that case, you should remove the "SOURCE_KEY = source" parameter altogether, which will result in Splunk applying your REGEX to the body of the event (the "_raw" field).
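
For illustration, the corrected stanza might look like the sketch below. It assumes the rest of the configuration stays exactly as posted, and that props.conf and transforms.conf are deployed on the instance that parses the data (an indexer or heavy forwarder), since index-time transforms are applied at parse time.

```ini
# transforms.conf
[platform-st]
# No SOURCE_KEY line here: it defaults to _raw, so REGEX runs against the event body.
DEST_KEY = MetaData:Sourcetype
REGEX = \"type\":\"([^\"]+)\"
FORMAT = sourcetype::$1
```

With the sample event {"id":"someid","type":"action"}, the capture group should be "action", so the event should end up with the sourcetype "action". Changes to index-time settings generally take effect only after a restart, and only for newly indexed data.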

gkanapathy
Splunk Employee

You should not specify SOURCE_KEY = source. Presumably, you want to run the regex against the raw data, not the source field.
