
Best practice for indexing files with headers (preamble_regex, field_header_regex, header_field_line_number)

threatanalyst
Engager

I have been trying to understand when it is best practice to use PREAMBLE_REGEX, FIELD_HEADER_REGEX, and/or HEADER_FIELD_LINE_NUMBER when indexing files with headers. I couldn't find answers to the following questions in the documentation:

  1. Will one of these settings ever "override" another?
  2. If I use them all, in which order do they take priority (listed order, some other order)?
  3. Is it best to only use the minimum number of settings required, or should I always try to set all of them?
  4. If a file without actual events still contains the header, how do I avoid Splunk registering the header as a separate event?

For example, I'm trying to parse the following sample output from TZWorks:

usp - full ver: 0.52; Copyright (c) TZWorks LLC
License #-------------- is authenticated for business use and registered to --------------
run time: -------------- [UTC]; Host: -------------
"cmdline: C:\--------------\usp64.exe -csvl2t -partition C:"
note: When comparing timestamps from manual analysis use option [-show_other_times] to see full range of timestamps recovered

date,time,timezone,MACB,source,sourcetype,type,user,host,short,desc,version,filename,inode,notes,format,extra
$sampledata...

I set up the following lines in props.conf (among other settings):

[usp]
PREAMBLE_REGEX = ^(usp|License|run|\"cmdline|\s*$)
FIELD_HEADER_REGEX = ^date
HEADER_FIELD_LINE_NUMBER = 7

These settings seem to work as long as the files are consistent with the sample above. However, when no events are found, the file contains neither the header line ("date,time,timezone,...") nor the $sampledata, and Splunk indexes the first 5 lines as an actual event. Is there a better general approach that would also solve my issue when the file does not contain events?
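
One way to see which lines of the sample the PREAMBLE_REGEX actually covers is to run the pattern over them outside Splunk. The following is a minimal sketch in Python, assuming Python's `re` module treats this particular pattern the same way Splunk's regex engine does; the helper name `unmatched_lines` is made up for illustration. It only shows what the regex matches and says nothing about how Splunk combines the three settings.

```python
import re

# PREAMBLE_REGEX copied from the [usp] stanza above.
PREAMBLE_REGEX = re.compile(r'^(usp|License|run|\"cmdline|\s*$)')

# First seven lines of the sample output, redacted values left as posted.
SAMPLE_LINES = [
    "usp - full ver: 0.52; Copyright (c) TZWorks LLC",
    "License #-------------- is authenticated for business use and registered to --------------",
    "run time: -------------- [UTC]; Host: -------------",
    "\"cmdline: C:\\--------------\\usp64.exe -csvl2t -partition C:\"",
    "note: When comparing timestamps from manual analysis use option "
    "[-show_other_times] to see full range of timestamps recovered",
    "",
    "date,time,timezone,MACB,source,sourcetype,type,user,host,"
    "short,desc,version,filename,inode,notes,format,extra",
]


def unmatched_lines(pattern, lines):
    """Return (line_number, text) pairs for lines the pattern does not match."""
    return [
        (number, text)
        for number, text in enumerate(lines, start=1)
        if not pattern.search(text)
    ]


# Hypothetical helper usage: list the lines the preamble pattern misses.
for number, text in unmatched_lines(PREAMBLE_REGEX, SAMPLE_LINES):
    print(number, text[:40])
```

Under that assumption, the pattern leaves two lines unmatched: line 7 (the field header, as intended) and line 5, the "note:" line, which none of the alternatives cover. The header landing on line 7 agrees with HEADER_FIELD_LINE_NUMBER = 7.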

0 Karma

richgalloway
SplunkTrust

The docs say that the text matched by FIELD_HEADER_REGEX is not included in the header, so your current setting (^date) shouldn't work as written. That it does work tells me that setting is trumped by one of the other two.
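
To make that documented rule concrete, here is a minimal Python sketch that models "the header starts after the matched pattern" on the header line from the question. It is an illustration only, not Splunk's parser, and the function name `header_after_prefix` is hypothetical.

```python
import re

# Header line from the sample file in the question.
HEADER_LINE = (
    "date,time,timezone,MACB,source,sourcetype,type,user,host,"
    "short,desc,version,filename,inode,notes,format,extra"
)


def header_after_prefix(prefix_regex, line):
    """Model of the documented rule: the header starts after the matched prefix.

    Returns the list of field names, or None if the prefix does not match.
    """
    match = re.search(prefix_regex, line)
    if match is None:
        return None
    # Everything the prefix matched is dropped before splitting into fields.
    return line[match.end():].split(",")


fields = header_after_prefix(r"^date", HEADER_LINE)
print(fields[:3])  # ['', 'time', 'timezone']
```

In this model, `^date` consumes the first field name, so "date" would be missing from the extracted header. That the field apparently still shows up in practice is what suggests one of the other settings is taking precedence.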
