Getting Data In

Keep specific part of a textfile / email and discard the rest

eichfuss
Path Finder

Hi there,

I know the docs and the search function on answers.splunk.com, but I think I'm missing something obvious. I hope someone can point me in the right direction or help me with my problem.

I want to log emails, but instead of indexing the whole mail with all its headers I only want to index part of it. Here is an example of a similar mail.
I just want the part from "Object: Sensor A" to "Time: 2013-01-27 11:58:23" and send the rest to the nullQueue.

Thanks a lot
Cheers, Sven

##################################

Content-Type: multipart/alternative; boundary=Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Transfer-Encoding: 7bit
Subject: blablablablabla
From: Doc Snider blablabla@blablabl.de
Message-Id: 92D35476-1711-4B3451-A4B5-8D14534351E@gmail.com
Date: Mon, 27 Jan 2014 11:30:57 +0100
To: doc@blablabla.de
Mime-Version: 1.0 (1.0)
X-Mailer: iPhone Mail (11A501)

--Apple-Mail-3A77049A-4A01-443F-B1DB-C1AA16C7497D
Content-Type: text/plain;
charset=utf-8
Content-Transfer-Encoding: quoted-printable

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla

############################################

kristian_kolb
Ultra Champion

I guess you could (permanently) remove the unwanted stuff with a sed script, invoked through SEDCMD in props.conf, like so:

props.conf

[your_email_sourcetype]
SEDCMD-keep_sensor_block = s/(?s).*[\r\n](Object:.*?[\r\n]Time:\s[\d-]+\s[\d:]+).*/\1/g
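
A pattern like this can be tried against the sample mail outside Splunk before it goes into props.conf. Here is a minimal sketch using Python's `re` module, which is not Splunk's regex engine, so treat the result as a sanity check only. The names `SAMPLE_MAIL`, `KEEP_BLOCK` and `keep_sensor_block` are made up for illustration, and the `(?s)` flag plus the trailing `.*` are my assumptions for letting the match span several lines and dropping the text after the "Time:" line.

```python
import re

# Sample event, shortened from the mail in the question.
SAMPLE_MAIL = """Subject: blablablablabla
Date: Mon, 27 Jan 2014 11:30:57 +0100

Here are the infos

Object: Sensor A
Temperature: 42
Humidity: 32
Time: 2013-01-27 11:58:23

here is more uninteresting text.
blablablablablabla
"""

# Assumption: (?s) lets "." cross line breaks, and the trailing ".*"
# swallows everything after the timestamp so only group 1 survives.
KEEP_BLOCK = re.compile(
    r"(?s).*[\r\n](Object:.*?[\r\n]Time:\s[\d-]+\s[\d:]+).*"
)


def keep_sensor_block(event: str) -> str:
    """Return only the Object..Time block, or the event unchanged if absent."""
    return KEEP_BLOCK.sub(r"\1", event, count=1)


if __name__ == "__main__":
    print(keep_sensor_block(SAMPLE_MAIL))
```

If the printed text is exactly the four lines from "Object:" to "Time:", the pattern does what the question asks for on this sample. The real test is still to index a sample file in a test index.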

Just make sure the events also get indexed with the correct timestamp, since the header and the message body seem to carry different timestamps. So perhaps you should also add the following to the stanza above:

TIME_FORMAT = %Y-%m-%d %H:%M:%S
TIME_PREFIX = Time:+\s
MAX_TIMESTAMP_LOOKAHEAD = 400
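
To illustrate how I understand these three settings to work together, here is a rough Python sketch. It is not Splunk's implementation. The function name `extract_timestamp` and the fixed-shape helper regex `STAMP_SHAPE` are hypothetical, and the sketch assumes the lookahead window starts right after the prefix match.

```python
import re
from datetime import datetime

# Values copied from the stanza above.
TIME_PREFIX = re.compile(r"Time:+\s")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
MAX_TIMESTAMP_LOOKAHEAD = 400

# Hypothetical helper: the textual shape that TIME_FORMAT describes.
STAMP_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def extract_timestamp(event: str):
    """Find the prefix, then parse a timestamp inside the lookahead window."""
    prefix = TIME_PREFIX.search(event)
    if prefix is None:
        return None
    # Only this many characters after the prefix are considered.
    window = event[prefix.end():prefix.end() + MAX_TIMESTAMP_LOOKAHEAD]
    stamp = STAMP_SHAPE.search(window)
    if stamp is None:
        return None
    return datetime.strptime(stamp.group(0), TIME_FORMAT)


if __name__ == "__main__":
    event = (
        "Date: Mon, 27 Jan 2014 11:30:57 +0100\n"
        "Object: Sensor A\n"
        "Time: 2013-01-27 11:58:23\n"
    )
    print(extract_timestamp(event))
```

The point of the prefix is that the "Date:" header is skipped and the timestamp is taken from the "Time:" line in the body.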

Read more here:

http://docs.splunk.com/Documentation/Splunk/6.0.1/Data/Anonymizedatausingconfigurationfiles


eichfuss
Path Finder

Thanks a lot Kristian,
that's the way.
