I'm trying to set up a field extraction with transforms.conf. I have a stanza in props.conf for the source:
[source::/path/to/file]
REPORT-some-name = extraction_rule_1, extraction_rule_2...
and in transforms.conf:
[extraction_rule_1]
REGEX = code_1 (?<special_field_1>[^,]*),(?<special_field_2>[^,]*)
SOURCE_KEY = fields:code,field_with_commasepareted_data
I do not get any "special_field_N" in my result.
I also tried to make a calculated field from code and field_with_commasepareted_data. The calculated field works, but the extraction does not run on it for some reason.
The reason I'm trying to do this is that I have around 200 codes, each with a different type of data in the field_with_commasepareted_data field.
Try this:
In props.conf:
REPORT-set = myfieldextract
In transforms.conf:
[myfieldextract]
REGEX = "(\w+)":"([^"]+)
FORMAT = $1::$2
That should extract all the fields and values for you (I think :))
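To see what that REGEX/FORMAT pair would produce, here is a rough Python sketch that runs the same pattern over a trimmed copy of the sample event posted in this thread. It uses Python's re module rather than Splunk's own regex engine, so treat it as an approximation of the matching only, not of how Splunk applies the transform.

```python
import re

# Trimmed copy of the sample event from this thread (several keys omitted).
event = ('{"log_level":"INFO","app_name":"SOME-APP","audit_code":"pm006",'
         '"fields":"9876543212,Name of creditor","status":"ok"}')

# Same pattern as the REGEX line: group 1 is the key, group 2 is the value.
pattern = re.compile(r'"(\w+)":"([^"]+)')

# FORMAT = $1::$2 reads as "field named by group 1, with the value of group 2".
extracted = {m.group(1): m.group(2) for m in pattern.finditer(event)}

for name, value in extracted.items():
    print(f"{name}::{value}")
```

In this sketch "fields" comes out as one comma-separated value, so it would still need a second step to split it per audit_code.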
From documentation:
SOURCE_KEY = <string>
* NOTE: This attribute is valid for both index-time and search-time field
extractions.
* Optional. Defines the KEY that Splunk applies the REGEX to.
* For search time extractions, you can use this attribute to extract one or
more values from the values of another field. You can use any field that
is available at the time of the execution of this field extraction
* For index-time extractions use the KEYs described at the bottom of this
file.
* KEYs are case-sensitive, and should be used exactly as they appear in
the KEYs list at the bottom of this file. (For example, you would say
SOURCE_KEY = MetaData:Host, *not* SOURCE_KEY = metadata:host .)
* If <string> starts with "field:" or "fields:" the meaning is changed.
Instead of looking up a KEY, it instead looks up an already indexed field.
For example, if a CSV field name "price" was indexed then
"SOURCE_KEY = field:price" causes the REGEX to match against the contents
of that field. It's also possible to list multiple fields here with
"SOURCE_KEY = fields:name1,name2,name3" which causes MATCH to be run
against a string comprising of all three values, separated by space
characters.
* SOURCE_KEY is typically used in conjunction with REPEAT_MATCH in
index-time field transforms.
* Defaults to _raw, which means it is applied to the raw, unprocessed text
of all events.
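Reading the quoted text literally, a "fields:" SOURCE_KEY hands the REGEX a single string made of the listed field values separated by spaces. The Python sketch below only models that joining step; build_source_text is a hypothetical helper, and the values are borrowed from the sample event in this thread on the assumption that "code" corresponds to audit_code. The spec wording ("already indexed field") suggests this form is aimed at index-time fields, so it may not behave the same way in a search-time REPORT; that is an inference from the quote, not something tested here.

```python
def build_source_text(indexed_fields, source_key):
    """Return the text a REGEX would see for a field:/fields: SOURCE_KEY."""
    prefix, _, names = source_key.partition(":")
    if prefix not in ("field", "fields"):
        raise ValueError("this sketch only models the field:/fields: forms")
    # Per the quoted spec, multiple values are separated by space characters.
    return " ".join(indexed_fields[name.strip()] for name in names.split(","))


# Assumed values, taken from the sample event in this thread.
indexed = {
    "code": "pm006",
    "field_with_commasepareted_data": "9876543212,Name of creditor",
}

text = build_source_text(indexed, "fields:code,field_with_commasepareted_data")
print(text)  # pm006 9876543212,Name of creditor
```

A pattern written for this string therefore has to expect the code, one space, and then the comma-separated data.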
Sample data:
{"timestamp":"2016-10-28T15:22:43.915+02:00","log_level":"INFO","src_ip":"1.2.3.4","app_name":"SOME-APP","ssn":"1234567890123","session_id":"4E1A043B.C25E.58134D14","agreement_id":"1234567890123","frontend_host":"www.sparebank1.no","market":"pm","thread":"qtp683287027-4312126","request_id":"REQID-7777deb3a68f","log_type":"audit","audit_code":"pm006","fields":"9876543212,Name of creditor","status":"ok"}
The content of "fields" is different for every "audit_code".
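As a way to prototype the per-code rules outside Splunk before writing some 200 transforms, a minimal Python sketch could look like the following. The RULES table and extract_special_fields are hypothetical names, only pm006 is filled in, and the group names are the placeholders from the question. Note that Python's re module needs the (?P<name>...) spelling for named groups, whereas the transforms.conf example above uses (?<name>...).

```python
import json
import re

# Hypothetical rule table: one pattern per audit_code. Only pm006 is defined
# here, and its group names are the placeholders used in the question.
RULES = {
    "pm006": re.compile(r"(?P<special_field_1>[^,]*),(?P<special_field_2>[^,]*)"),
}


def extract_special_fields(raw_event):
    """Apply the rule for the event's audit_code to its "fields" value."""
    event = json.loads(raw_event)
    rule = RULES.get(event.get("audit_code"))
    if rule is None:
        return {}  # no rule defined for this code yet
    match = rule.match(event.get("fields", ""))
    return match.groupdict() if match else {}


# Trimmed copy of the sample event above (most keys omitted).
sample = '{"audit_code":"pm006","fields":"9876543212,Name of creditor","status":"ok"}'
result = extract_special_fields(sample)
print(result)
```

Once a pattern behaves as expected here, it can be carried over to its own transforms.conf stanza.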
Do you have any sample data you can share?