Getting Data In

Can Splunk (Heavy Forwarder or any other Splunk component) filter fields before indexing?

luthfi49
Explorer

Can Splunk filter data at the field level before indexing?

By field level I mean that we want to remove some fields from an event before it is indexed.
From what I know, a heavy forwarder can filter data, but only at the "event level", meaning we can filter out all events of a specific type. We can't filter out only some fields.

I'm a newbie in Splunk and this is my first question. Hopefully it helps others too 🙂

1 Solution

kristian_kolb
Ultra Champion

Yes, logs can most certainly be filtered before indexing, just as you mention. However, the filtering is not based on an extracted field, simply because the fields have not been extracted yet at that point.

The solution is to write a regex similar to the one used for most search-time field extractions, and use it to modify the raw data prior to indexing. It sounds more complicated than it is, but you need to have some grasp of regex syntax. See the example linked below, where parts of session IDs are replaced with ####. You could create a regex that captures your desired field=field_value and replaces it with nothing.

http://docs.splunk.com/Documentation/Splunk/5.0.4/Data/Anonymizedatausingconfigurationfiles#Through_...

Hope this helps,

K

For better help, always post a few sample events.

fbl_itcs
Path Finder

The easiest way to achieve this would be a SEDCMD.

See http://docs.splunk.com/Documentation/Splunk/5.0.4/admin/Propsconf for how to configure an SEDCMD. You can simply replace the parts you want to remove with "nothing".

E.g.:
props.conf:

[xxxxtesfilterxxxxxx]
SEDCMD-test = s/Domain=EPC-SubscriberId=[^,]+,//g
SEDCMD-test2 = s/EPC-SubscriberId=[^,]+,//g

This is untested. Please test in a dev environment before rolling it out to production.
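To check what the two expressions above would do before touching a forwarder, the substitutions can be approximated outside Splunk. This is only a sketch under assumptions: Python's `re.sub` stands in for Splunk's sed-style replacement, the event has the shape of the sample posted in this thread, and `SEDCMD-test` is assumed to run before `SEDCMD-test2`. The name `apply_sedcmds` is made up for illustration.

```python
import re

# Sample event shape as posted in this thread (values are placeholders).
EVENT = (
    "XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013 "
    "Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,"
    "EPC-SubscriberId=ValueEPC,NextField=ValueField"
)

# Same patterns as SEDCMD-test and SEDCMD-test2, in the assumed order.
SED_PATTERNS = [
    r"Domain=EPC-SubscriberId=[^,]+,",
    r"EPC-SubscriberId=[^,]+,",
]


def apply_sedcmds(raw: str) -> str:
    """Mimic s/pattern//g for each pattern: delete every match."""
    for pattern in SED_PATTERNS:
        raw = re.sub(pattern, "", raw)
    return raw


cleaned = apply_sedcmds(EVENT)
print(cleaned)
```

In this sketch the order matters: if the second pattern ran first, it would also match inside the `Domain=EPC-SubscriberId=...` pair and leave a dangling `Domain=` in front of `NextField=NextValue`.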


luthfi49
Explorer

Yes, but I don't just want to change the field's value to null; I want to remove the field entirely.

luthfi49
Explorer

After checking again, the filtering works! The license usage is now smaller than the size of the log file.

Thank you very much for the help!

To make sure, I will try it on another case.

And yes, the field I want to remove is this "field" (I don't know how to highlight it) 🙂

XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013 Field1; Field2;

"Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,"

EPC-SubscriberId=ValueEPC,

"NextField=ValueField"


kristian_kolb
Ultra Champion

I take it that you have read the docs at http://docs.splunk.com/Documentation/Splunk/5.0.4/Data/Anonymizedatausingconfigurationfiles

From what I understand, you want to remove the highlighted portions:

XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013 Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC*,NextField=ValueField*

is that correct?


luthfi49
Explorer

Sample events (I changed some values):
XXXXX: Tue Aug 27 13:25:11 2013, Host:úú Tue Aug 27 13:25:22 2013
Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC,NextField=ValueField

XXXXX: Tue Aug 27 13:25:12 2013, Host:úú Tue Aug 27 13:25:22 2013
Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC,NextField=ValueField

XXXXX: Tue Aug 27 13:25:13 2013, Host:úú Tue Aug 27 13:25:22 2013
Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,EPC-SubscriberId=ValueEPC,NextField=ValueField


luthfi49
Explorer

Here is my props.conf
[xxxxtesfilterxxxxxx]
TRANSFORMS-anonymize = remove-fieldtes

My transforms.conf:
[remove-fieldtes]
REGEX=(?msi)^(.*?)(Domain=.*?)(EPC-SubscriberId.*?)(EPC-SubscriberId=.*?)(\,.*?)$
FORMAT=$1$4
DEST_KEY=_raw
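To see what this REGEX and FORMAT pair would produce, here is a rough check in Python. It is a sketch under assumptions: Python's `re` stands in for Splunk's PCRE engine, the event is the sample posted in this thread, and `rewrite_raw` is a hypothetical helper that mimics `DEST_KEY=_raw` by replacing the whole event with `$1$4` when the pattern matches.

```python
import re

# The REGEX from transforms.conf above, unchanged.
REGEX = re.compile(
    r"(?msi)^(.*?)(Domain=.*?)(EPC-SubscriberId.*?)(EPC-SubscriberId=.*?)(\,.*?)$"
)

# One sample event from this thread (values are placeholders).
EVENT = (
    "XXXXX: Tue Aug 27 13:25:11 2013, Host:úú Tue Aug 27 13:25:22 2013\n"
    "Field1; Field2; Domain=EPC-SubscriberId=ValueDomain,NextField=NextValue,"
    "EPC-SubscriberId=ValueEPC,NextField=ValueField"
)


def rewrite_raw(raw: str) -> str:
    """Approximate FORMAT=$1$4 with DEST_KEY=_raw: keep groups 1 and 4 only."""
    match = REGEX.search(raw)
    if match is None:
        return raw  # no match: the event is left untouched
    return match.group(1) + match.group(4)


rewritten = rewrite_raw(EVENT)
print(rewritten)
```

On the sample event this keeps the leading part of the event plus `EPC-SubscriberId=ValueEPC` and drops everything else, including both `NextField` pairs, so it is worth comparing the result against the fields that are actually meant to survive.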


luthfi49
Explorer

Well, I have tried that and the filtering is still not working.
Actually, my concern is to reduce license usage by removing fields. The link you provided is generally used for masking data on the indexer.
So what I tried is editing props and transforms on the heavy forwarder to mask the field to a null value.
I still don't know why it is not working. I will try again later 🙂


kristian_kolb
Ultra Champion

If you want any help with the construction of the regexes, you will need to provide some sample events. Mask sensitive data as needed.

Good luck!


luthfi49
Explorer

Thanks, I will try this first 🙂


lukejadamec
Super Champion

Yes, you can filter events based on field values within the event. But, it sounds like you are trying to change the value of a field to null, while saving the rest of the event. Is that right?


treinke
Builder

See the following answer. You can use this same type of filtering through RegEx.

http://answers.splunk.com/answers/99905/how-to-forward-only-specific-windows-eventlogs-via-splunk-un...

There are no answer without questions

kristian_kolb
Ultra Champion

As lukejadamec says in his comment, luthfi seems to want to change the contents of (or remove altogether) certain fields in events, not remove whole events based on a field value.
