Getting Data In

Entire file contents as a single event

keiche
Explorer

I would like to know how to set up Splunk to monitor a local input directory so that each new file added to it (the files contain multiple lines) is ingested as exactly one event containing all of the file's contents. I am able to manipulate the file data to add line breaks if that is the solution.

2 Solutions

gkanapathy
Splunk Employee

Just set the LINE_BREAKER for the sourcetype to something that will never match, such as (?!). You will probably also need to increase MAX_EVENTS (the default is only 500 lines; there isn't a hard limit that I know of) and set TRUNCATE to something larger than the biggest file size (or I think 0 is unlimited).
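
As a rough sketch only, the settings described above might be combined in props.conf as shown here. The sourcetype name whole_file_example and the MAX_EVENTS value are assumptions chosen for illustration, not values from the answer. The pattern is written with an extra pair of parentheses because a later reply in this thread reports that Splunk 4.3 needs a capture group.

```
# props.conf -- illustrative sketch; [whole_file_example] is a hypothetical sourcetype name
[whole_file_example]
# A pattern that can never match, so no line break is ever found.
# The outer parentheses add the capture group reported as required on Splunk 4.3.
LINE_BREAKER = ((?!))
# Raised well above the default, as the answer suggests; the exact value is arbitrary.
MAX_EVENTS = 200000
# 0 is intended to disable truncation, as the answer tentatively suggests.
TRUNCATE = 0
```

The sourcetype would then be assigned to the monitored directory in inputs.conf. It is worth testing against the largest expected file before relying on it.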

ftk
Motivator

I use a regular monitor stanza combined with a custom sourcetype to index full files of interest.

I use the following monitor to index changes to my splunk configs for example (inputs.conf):

[monitor://C:\Program Files\Splunk\etc\...\*.conf]
followTail = False
sourcetype = splunk_config
index = my_custom_index
disabled = false

and define the splunk_config sourcetype in props.conf as such:

[splunk_config]
BREAK_ONLY_BEFORE=goblygook
MAX_EVENTS=200000
DATETIME_CONFIG = NONE
CHECK_METHOD = modtime
pulldown_type = true
LEARN_MODEL = false

This combination will index all files under splunk\etc ending in .conf. The BREAK_ONLY_BEFORE=goblygook setting basically tells Splunk not to break the event (in this case the conf file) until it encounters "goblygook", which shouldn't appear in any of your files.

lguinn2
Legend

Update: Check out this answer to the same question: "Each File as One Single Splunk Event".

[mysinglefilesourcetype]
SHOULD_LINEMERGE = false
LINE_BREAKER = ((*FAIL))
TRUNCATE = 99999999

I think this is newer information.

amfranz
Engager

Regarding gcoles's finding that the first approach does not work with Splunk 4.3:

LINE_BREAKER = (?!)

This approach still works in Splunk 4.3 with a minor modification. The expression needs to be surrounded by an additional pair of parentheses:

LINE_BREAKER = ((?!))

I think this is because Splunk 4.3 requires the regular expression to contain at least one capture group, and earlier Splunk versions did not enforce this. The "(?!)" on its own is merely a lookahead expression; the additional pair of parentheses supplies the capture group.

gcoles
Communicator

As a note to anyone else who may be using this page as a reference, I had been using the LINE_BREAKER directive to do this (as outlined by gkanapathy), but this stopped working when we upgraded our indexers to 4.3. I had to change our props.conf entries for these kinds of inputs to use the method shown by ftk. I verified that the first method fails whether using lightweight or heavy forwarders, as long as the machine that is processing the props.conf for the sourcetype is 4.3.

Yashar_Shah
New Member

This is not enough if you forward data to the indexer. It is only useful for local file monitoring.

lguinn2
Legend

Do you want Splunk to create one event per file, or do you want it to create one event per line?

adobrzeniecki
Path Finder

Is this still good in 2021?
