Getting Data In

HOWTO: throw away portions of an event

Jason
Motivator

I have a very talkative data source, and I only want a few fields from it, not entire events. How do I keep the parts I want and avoid indexing the rest of each event?

1 Solution

Jason
Motivator

You can use a SEDCMD in props.conf to slightly or drastically rewrite data before it gets indexed.

Example: a talkative web server with a data stream coming in as sourcetype iislogs has a bunch of fields we aren't interested in, so let's lighten the load. Default fields:

date
time
s-sitename
s-ip
cs-method
cs-uri-stem
cs-uri-query
s-port
cs-username
c-ip
cs-version
cs(User-Agent)
cs(Cookie)
cs(Referer)
cs-host
sc-status
sc-substatus
sc-win32-status

Desired fields:

date
time
cs-method
cs-uri-stem
cs-uri-query
c-ip
sc-status

Here's how to do it: write a regular expression that matches an entire event and captures the items you want out of it. Next, write a replacement that puts the captured items back. In this case, I wanted to include the field names in the rewritten event so Splunk would extract them automatically at search time.

In props.conf:

[iislogs]
SEDCMD-lighteniislogs = s/(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})\s\S+\s\S+\s(\S+)\s(\S+)\s(\S+)\s\S+\s\S+\s(\S+)\s.*\s(\S+)\s\S+\s\S+/\1 cs-method="\2" cs-uri-stem="\3" cs-uri-query="\4" c-ip="\5" sc-status="\6"/

This will rewrite the event

2010-11-14 00:02:46 W3SVR1 192.168.2.2 GET /favicon.ico - 80 - 192.0.1.124 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+8.0;very_long_browser_user_agent_string) ID=0E839D;Lots=of_other_cookie_data - w3svr1.client.com 200 0 0

into the event

2010-11-14 00:02:46 cs-method="GET" cs-uri-stem="/favicon.ico" cs-uri-query="-" c-ip="192.0.1.124" sc-status="200"
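
If you want to sanity-check the expression before touching props.conf, here is a minimal sketch in Python that applies the same pattern and replacement to the sample event. It assumes Python's re module treats \d, \s, \S and the \1..\6 backreferences the same way Splunk's regex engine does for this ASCII input; the helper name lighten is made up for illustration and is not part of Splunk.

```python
import re

# Pattern and replacement copied from the SEDCMD above, minus the s/.../.../ wrapper.
PATTERN = re.compile(
    r'(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})\s\S+\s\S+\s(\S+)\s(\S+)\s(\S+)'
    r'\s\S+\s\S+\s(\S+)\s.*\s(\S+)\s\S+\s\S+'
)
REPLACEMENT = (
    r'\1 cs-method="\2" cs-uri-stem="\3" cs-uri-query="\4" '
    r'c-ip="\5" sc-status="\6"'
)


def lighten(event: str) -> str:
    """Hypothetical helper: substitute once, like sed's s/// without the g flag."""
    return PATTERN.sub(REPLACEMENT, event, count=1)


# The sample event from the post, split only to keep lines short.
SAMPLE = (
    "2010-11-14 00:02:46 W3SVR1 192.168.2.2 GET /favicon.ico - 80 - "
    "192.0.1.124 HTTP/1.1 "
    "Mozilla/4.0+(compatible;+MSIE+8.0;very_long_browser_user_agent_string) "
    "ID=0E839D;Lots=of_other_cookie_data - w3svr1.client.com 200 0 0"
)

if __name__ == "__main__":
    print(lighten(SAMPLE))
```

Events that do not match the pattern (for example, IIS header lines starting with #) pass through unchanged, so it is worth testing a few odd-looking lines from your own logs as well.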


Genti
Splunk Employee

I would recommend keeping the fields and parts of the event you want and sending the rest of the event to the nullQueue.


Jason
Motivator

I thought nullQueue was for whole events, such as events that matched or didn't match a particular regex? Here we were concerned with dropping a portion of each event.
