Getting Data In

How do you index the existing events in a file AND the new events that keep being written to that same file?

peter_gianusso
Communicator

1) Let's say we have a text log file with 50 existing events in it, and the file has NOT been indexed by Splunk.
2) When we add that file as a data input, no events are indexed. The source type is NOT seen on the main search screen. The host is NOT seen on the main search screen.
3) Then we turn on our application and it writes a new event to that SAME EXACT FILE, the source type IS then seen on the main search screen. The host IS then seen on the main search screen. Splunk shows 1 event indexed. As our application continues to write new events, the new and only the new events seem to be indexed.

The Splunk people that we were working with thought that the existing events would have been indexed as well.

The timestamps and formats are the same. This is occurring across our system, not just in one instance.

Our expectation is that the existing 50 events (#1) and the new events would be indexed and searchable. How can we do that?

1 Solution

bwooden
Splunk Employee

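Splunk may have seen that file before (or thinks it has). Does the file have large headers in it? By default Splunk identifies a file by a checksum of its first 256 bytes, so files that share a long identical header can look like a file it has already read.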

You can force Splunk to re-index the entire file with the oneshot command.

/opt/splunk/bin/splunk add oneshot -sourcetype test -index main -source /tmp/some_file.txt

NB: This will index the entire file, so lines from the file already in Splunk will be duplicated.

0 Karma

RicoSuave
Builder

We need to see your input stanza. If followTail is set to true, Splunk starts reading at the end of any new file it sees and does not index the historical data already in it. This is also explained here: http://splunk-base.splunk.com/answers/57819/when-is-it-appropriate-to-set-followtail-to-true
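To make it easier to post the stanza being asked for, here is a minimal shell sketch that pulls a single stanza out of an inputs.conf file. The helper name print_stanza, the file contents, and the paths are all made up for illustration; only the setting names (sourcetype, index, followTail) come from Splunk itself.

```shell
#!/bin/sh
# print_stanza FILE HEADER (hypothetical helper, not a Splunk command):
# prints the stanza whose header line equals HEADER, up to the next "[...]" line.
print_stanza() {
    awk -v want="$2" '
        /^\[/    { printing = ($0 == want) }
        printing { print }
    ' "$1"
}

# Demo on a scratch file with made-up contents, so nothing live is touched.
demo_dir=$(mktemp -d)
cat > "$demo_dir/inputs.conf" <<'EOF'
[monitor:///var/log/myapp/app.log]
sourcetype = myapp
index = main
followTail = 0

[monitor:///var/log/other.log]
sourcetype = other
EOF

stanza=$(print_stanza "$demo_dir/inputs.conf" '[monitor:///var/log/myapp/app.log]')
echo "$stanza"
rm -rf "$demo_dir"
```

With followTail = 0 (or the setting absent) a monitored file is read from the beginning; with followTail = 1 reading starts at the end of the file the first time Splunk sees it. On a live system, `splunk btool inputs list --debug` shows the merged settings and which file each one comes from.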

0 Karma

peter_gianusso
Communicator

I think we resolved this somehow by doing a few things. We removed the custom index reference in the stanza. followTail was already false (0), so that setting was not the cause.

We are new to Splunk and it seems pretty finicky. We are having trouble getting consistent results.

0 Karma

sdaniels
Splunk Employee

Yes, good points. Can you edit your post to include your inputs.conf settings for that sourcetype?

0 Karma

aholzer
Motivator

Make sure you don't have a property such as MAX_DAYS_AGO defined incorrectly in your props.conf.

Also make sure you don't have the ignoreOlderThan property set in your inputs.conf file.

Either of these could explain why the older events are not being processed while the newer events are.
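One way to check for these settings is to search the configuration files for them. The sketch below assumes GNU or BSD grep (for -r and --include); the helper name find_skip_settings and the demo file contents are invented for illustration, and followTail is included only because it was raised earlier in the thread.

```shell
#!/bin/sh
# find_skip_settings DIR (hypothetical helper, not a Splunk command):
# lists .conf lines that set followTail or ignoreOlderThan (inputs.conf)
# or MAX_DAYS_AGO (props.conf), with file name and line number.
find_skip_settings() {
    grep -rnE '^[[:space:]]*(followTail|ignoreOlderThan|MAX_DAYS_AGO)[[:space:]]*=' \
        --include='*.conf' "$1"
}

# Demo on a scratch directory with made-up settings.
demo_dir=$(mktemp -d)
cat > "$demo_dir/inputs.conf" <<'EOF'
[monitor:///var/log/myapp/app.log]
sourcetype = myapp
ignoreOlderThan = 7d
EOF
cat > "$demo_dir/props.conf" <<'EOF'
[myapp]
MAX_DAYS_AGO = 30
EOF

found=$(find_skip_settings "$demo_dir")
echo "$found"
rm -rf "$demo_dir"
```

On a real installation the directory to scan would typically be the etc directory under the Splunk install path, for example `find_skip_settings /opt/splunk/etc`, assuming a default /opt/splunk install.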

0 Karma

sdaniels
Splunk Employee

All of the new events come in no problem?

I assume you tried it a few times and cleaned the index?

Cleaning an index deletes all of its data and you'll never get it back, so be careful. http://docs.splunk.com/Documentation/Splunk/latest/admin/RemovedatafromSplunk

You could create a separate index to test it again if you have everything in the main index. That way, cleaning the index is not really a big deal. I would recommend a call to support if this doesn't get you anything.
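A rough sketch of that separate-index test follows. It only prints the commands instead of running them, because the clean step is destructive. The index name scratch_test, the sourcetype, the file path, and the /opt/splunk default are assumptions to adjust; test_index_plan is a made-up helper name.

```shell
#!/bin/sh
# Prints (does not run) the CLI steps for testing against a throwaway index.
# SPLUNK_BIN falls back to /opt/splunk, which is an assumption about the install.
SPLUNK_BIN="${SPLUNK_HOME:-/opt/splunk}/bin/splunk"

test_index_plan() {
    idx="$1"
    file="$2"
    echo "$SPLUNK_BIN add index $idx"
    echo "$SPLUNK_BIN add monitor $file -sourcetype test -index $idx"
    echo "# search index=$idx and compare the event count with the file"
    echo "$SPLUNK_BIN stop"
    echo "$SPLUNK_BIN clean eventdata -index $idx -f"
    echo "$SPLUNK_BIN start"
}

plan=$(test_index_plan scratch_test /tmp/some_file.txt)
echo "$plan"
```

The clean command needs Splunk to be stopped first, and -f skips the confirmation prompt, so review the printed lines before running any of them by hand. Cleaning the index does not by itself change Splunk's record of which files it has already read, so whether the file is read again on a later test is outside what this sketch covers.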

0 Karma

peter_gianusso
Communicator

That's what the Splunk people are saying as well... we tried that... we even tried it with a Splunk Sales Engineer.

0 Karma

sdaniels
Splunk Employee

When you add an input for a file and choose - 'Continuously index data from a file or directory this Splunk instance can access' - it should index the existing file and then add new events as well.

0 Karma