Getting Data In

HOWTO: throw away portions of an event

Jason
Motivator

I have a very talkative data source, and I want only a few fields from each event rather than the entire event. How do I keep the parts I want and avoid indexing the rest of the event?

1 Solution

Jason
Motivator

You can use a SEDCMD in props.conf to slightly or drastically rewrite data before it gets indexed.

Example: a talkative web server with a data stream coming in as sourcetype iislogs has a bunch of fields we aren't interested in, so let's lighten the load. Default fields:

date
time
s-sitename
s-ip
cs-method
cs-uri-stem
cs-uri-query
s-port
cs-username
c-ip
cs-version
cs(User-Agent)
cs(Cookie)
cs(Referer)
cs-host
sc-status
sc-substatus
sc-win32-status

Desired fields:

date
time
cs-method
cs-uri-stem
cs-uri-query
c-ip
sc-status
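Because the fields arrive in a fixed order, each desired field can be identified by its position in the event. The following is a minimal sketch in Python (not part of the original post) that derives those positions from the two lists above, assuming the fields are written in the listed order and separated by whitespace. The variable names are made up for illustration.

```python
# Field order as listed under "Default fields" above.
default_fields = [
    "date", "time", "s-sitename", "s-ip", "cs-method", "cs-uri-stem",
    "cs-uri-query", "s-port", "cs-username", "c-ip", "cs-version",
    "cs(User-Agent)", "cs(Cookie)", "cs(Referer)", "cs-host",
    "sc-status", "sc-substatus", "sc-win32-status",
]

# Fields listed under "Desired fields" above.
desired_fields = [
    "date", "time", "cs-method", "cs-uri-stem", "cs-uri-query",
    "c-ip", "sc-status",
]

# Zero-based position of each desired field within an event.
positions = {name: default_fields.index(name) for name in desired_fields}

# Everything else is a token the regular expression must skip over.
skipped = [name for name in default_fields if name not in desired_fields]

for name, pos in positions.items():
    print(pos, name)
```

The positions tell you which whitespace-separated tokens need a capture group and which can be matched and discarded.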

Here's how to do it: write a regular expression that matches an entire event and captures the items you want out of it. Next, write a replacement. In this case, I wanted the replacement to include the field names as name="value" pairs so Splunk would pull them out automatically at search time.

In props.conf:

[iislogs]
SEDCMD-lighteniislogs = s/(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})\s\S+\s\S+\s(\S+)\s(\S+)\s(\S+)\s\S+\s\S+\s(\S+)\s.*\s(\S+)\s\S+\s\S+/\1 cs-method="\2" cs-uri-stem="\3" cs-uri-query="\4" c-ip="\5" sc-status="\6"/

This will rewrite the event

2010-11-14 00:02:46 W3SVR1 192.168.2.2 GET /favicon.ico - 80 - 192.0.1.124 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+8.0;very_long_browser_user_agent_string) ID=0E839D;Lots=of_other_cookie_data - w3svr1.client.com 200 0 0

into the event

2010-11-14 00:02:46 cs-method="GET" cs-uri-stem="/favicon.ico" cs-uri-query="-" c-ip="192.0.1.124" sc-status="200"
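It can be worth checking the expression offline before putting it in props.conf. The following is a rough sketch in Python, assuming Python's re module handles the constructs used here (\d, \s, \S, greedy .*, numbered backreferences) the same way the SEDCMD does. The names PATTERN, REPLACEMENT, and lighten are made up for this example and are not part of Splunk.

```python
import re

# Same pattern as the SEDCMD above, split into pieces for readability.
PATTERN = re.compile(
    r'(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})'  # \1: date and time
    r'\s\S+\s\S+'                               # skip s-sitename, s-ip
    r'\s(\S+)\s(\S+)\s(\S+)'                    # \2-\4: cs-method, cs-uri-stem, cs-uri-query
    r'\s\S+\s\S+'                               # skip s-port, cs-username
    r'\s(\S+)'                                  # \5: c-ip
    r'\s.*'                                     # skip cs-version through cs-host
    r'\s(\S+)\s\S+\s\S+'                        # \6: sc-status, then skip the last two fields
)

# Same replacement as the SEDCMD above.
REPLACEMENT = (
    r'\1 cs-method="\2" cs-uri-stem="\3" cs-uri-query="\4" '
    r'c-ip="\5" sc-status="\6"'
)


def lighten(event: str) -> str:
    """Apply the substitution once, like s/// without the g flag."""
    return PATTERN.sub(REPLACEMENT, event, count=1)


# The sample event from this post.
sample = (
    "2010-11-14 00:02:46 W3SVR1 192.168.2.2 GET /favicon.ico - 80 - "
    "192.0.1.124 HTTP/1.1 "
    "Mozilla/4.0+(compatible;+MSIE+8.0;very_long_browser_user_agent_string) "
    "ID=0E839D;Lots=of_other_cookie_data - w3svr1.client.com 200 0 0"
)

print(lighten(sample))
```

Lines that do not match the pattern, such as the # header lines IIS writes at the top of a log file, pass through unchanged in this sketch.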


Genti
Splunk Employee

I would recommend keeping the fields and parts of the event you want and sending the rest of the event to the nullQueue.


Jason
Motivator

I thought nullQueue was for whole events, such as events that matched or didn't match a particular regex? Here we were concerned with dropping a portion of each event.
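To make the distinction concrete, here is a small Python model of the two behaviors as described in this thread: discarding whole events that match a regex versus rewriting every event to keep only part of it. It is only an illustration and not how Splunk implements its queues. The function names and the shortened sample events are invented for the example.

```python
import re


def drop_matching(events, pattern):
    """Whole-event filtering: discard every event the pattern matches."""
    rx = re.compile(pattern)
    return [e for e in events if not rx.search(e)]


def trim_each(events, pattern, replacement):
    """Per-event rewriting: keep every event, but rewrite its text once."""
    rx = re.compile(pattern)
    return [rx.sub(replacement, e, count=1) for e in events]


# Invented, shortened events (date, time, method, URI, status).
events = [
    "2010-11-14 00:02:46 GET /favicon.ico 200",
    "2010-11-14 00:02:47 GET /index.html 200",
]

# Drops the favicon request entirely; the other event is untouched.
kept = drop_matching(events, r"/favicon\.ico")

# Keeps both events, but only the timestamp and URI of each.
trimmed = trim_each(events, r"^(\S+\s\S+)\s\S+\s(\S+)\s\S+$", r"\1 \2")

print(kept)
print(trimmed)
```

The first call changes how many events survive; the second changes what each surviving event contains, which is the case this question is about.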
