Getting Data In

Limit the index of a txt file to only last xxx lines

ferio
New Member

I want to forward the data from an Alarm.txt file using a Splunk forwarder, but limit indexing to only the last xxx lines of the file.

The problem is that each line in my txt file has the day of the week, month, date, and time, but no year. The file covers about 5 years, so the same months repeat over and over. Only the end of the file holds the current year's information.

Tue Jan 25 11:53:02, Set Alarm
Tue Jan 25 11:53:15, Set Alarm,
Tue Jan 25 12:02:54, Set Alarm,
Wed Feb 02 08:51:07, Set Alarm,
.....
Thu Dec 04 05:59:13, Set Alarm
Tue Jan 25 12:02:54, Set Alarm,

Can someone guide me on what I need to do so that it only extracts the last part of the file?

Or if someone has a way for me to assign the correct year to each part of this file, that will also work, so that I can index all the data with the correct year information.

Right now Splunk indexes everything in this file as 2017.


woodcock
Esteemed Legend

You might also make use of the followTail setting described here:
https://answers.splunk.com/answers/57819/when-is-it-appropriate-to-set-followtail-to-true.html

Strip the last 4 lines of the file, do the process (but don't restart Splunk yet), put the last 4 lines back in, and then restart Splunk.


woodcock
Esteemed Legend

OK, given this claim:

Starting with the last few lines now and then forwarding every new line would be good. This is an alarm log for a piece of equipment, and I can work without the history.

then proceed as follows:
Set up a dummy indexer on the free license.
Stop Splunk on the forwarder.
Temporarily point the forwarder at the dummy indexer (outputs.conf).
Set up a monitor clause to forward this file to the dummy indexer.
Start Splunk on the forwarder and the file will be forwarded.

You have now updated the fishbucket on the forwarder to the last line of the file. The fishbucket is the DB that Splunk uses to keep track of what has and has not been forwarded from each file.

Stop Splunk on the forwarder.
Undo the changes to outputs.conf.
Restart Splunk on the forwarder.
From this point forward, every new line written to the file will be forwarded.

Now to get the last 4 lines:
Copy the file somewhere else (like /tmp/), remove everything but the last 4 lines, and use oneshot to forward these events:

https://docs.splunk.com/Documentation/SplunkCloud/6.5.1612/Data/MonitorfilesanddirectoriesusingtheCL...
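The copy-and-trim step can be sketched in Python as follows. This is only an illustration: the helper name write_tail, the paths, and the line count are placeholders, not anything Splunk provides.

```python
from collections import deque
from pathlib import Path


def write_tail(src, dst, n=4):
    """Copy only the last n lines of src into dst and return them."""
    with open(src, "r", encoding="utf-8", errors="replace") as f:
        # A deque with maxlen keeps only the most recent n lines in memory,
        # so a large alarm log is never loaded in full.
        tail = list(deque(f, maxlen=n))
    Path(dst).write_text("".join(tail), encoding="utf-8")
    return tail


# Hypothetical usage (paths are placeholders):
#   write_tail("/path/to/Alarm.txt", "/tmp/Alarm_tail.txt", n=4)
# The trimmed copy can then be sent once with the CLI, for example:
#   splunk add oneshot /tmp/Alarm_tail.txt -sourcetype <your_sourcetype> -index <your_index>
# Check the linked documentation for the exact options on your version.
```

Writing to a separate file leaves the monitored Alarm.txt untouched, so the fishbucket position set in the earlier steps is not affected.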

NOTE: you may think that these 2 settings can help you here, but neither can:

MAX_DAYS_AGO
ignoreOlderThan 

hettervik
Builder

I don't know if this is what you're looking for, but you could perhaps set the HEADER_FIELD_LINE_NUMBER in props.conf.

HEADER_FIELD_LINE_NUMBER = integer
- Tells Splunk the line number of the line within the file that contains the header fields.  If set to 0, Splunk attempts to locate the header fields within the file automatically.
- The default value is set to 0.

woodcock
Esteemed Legend

Initially I thought that this was brilliant, but it will only work if he is using INDEXED_EXTRACTIONS, which he probably is not.


hettervik
Builder

INDEXED_EXTRACTIONS, why is that? Though, when I look at the props.conf documentation again, I understand that HEADER_FIELD_LINE_NUMBER has to be configured in props.conf on the instance reading the log file (probably a UF).


woodcock
Esteemed Legend

Look again. The HEADER_FIELD_LINE_NUMBER setting is in the INDEXED_EXTRACTIONS section.


hettervik
Builder

Unless I'm looking in the wrong place, HEADER_FIELD_LINE_NUMBER is in the "Structured Data Header Extraction and configuration" section. It's specified that "all of its settings apply at input time, when data is first read by Splunk." INDEXED_EXTRACTIONS just happens to be one of many settings that can be specified at input time.


woodcock
Esteemed Legend

I agree it is a bit unclear, but the grayed-out section there most definitely applies ONLY to INDEXED_EXTRACTIONS. Look at the other sections and compare.


ferio
New Member

My text file doesn't have a header, but this is good information to know about. Thanks!


hettervik
Builder

I was thinking, if you set HEADER_FIELD_LINE_NUMBER so that Splunk will start reading the file from the line you want Splunk to read from, reading only the last "xxx" lines, wouldn't that work?


ferio
New Member

Is there a command similar to end of file (EOF), or do I have to define a line number?


woodcock
Esteemed Legend

Splunk will automatically assume the year is the current year as long as the month/day is not in the future. As far as only indexing some of the lines, is this a one-time-only forwarding or do you need to start with the last 4 lines now and then forward every new line as it comes in?
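If the history is wanted after all, the other option raised in the question (assigning the correct year to each part of the file before indexing) could be sketched as a preprocessing step in Python. This is a rough sketch under stated assumptions, not a Splunk feature: it assumes the lines are in chronological order, that the last line belongs to a known year, that no gap between consecutive lines exceeds a year, and that every line looks like "Tue Jan 25 11:53:02, Set Alarm". The names sort_key and assign_years are made up for illustration.

```python
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def sort_key(line):
    """Return (month, day, 'HH:MM:SS') parsed from 'Tue Jan 25 11:53:02, ...'."""
    _weekday, month, day, clock = line.split(",", 1)[0].split()
    return (MONTHS[month], int(day), clock)


def assign_years(lines, last_year):
    """Prefix each line with a year, working backwards from the last line.

    Assumes chronological order and that the final line is in last_year.
    """
    year = last_year
    out = []
    next_key = None
    for line in reversed(lines):
        key = sort_key(line)
        # Walking backwards: a line that falls later in the calendar than
        # the line after it must belong to the previous year.
        if next_key is not None and key > next_key:
            year -= 1
        out.append(f"{year} {line}")
        next_key = key
    out.reverse()
    return out
```

The rewritten lines start with the year, so the sourcetype would also need a matching timestamp format in props.conf (something like TIME_FORMAT = %Y %a %b %d %H:%M:%S, which is untested here).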


ferio
New Member

Starting with the last few lines now and then forwarding every new line would be good. This is an alarm log for a piece of equipment, and I can work without the history.
