Solved: Re-processing of old, previously indexed events to...

kris2000 · ‎06-21-2010

Hello everyone,

We have to handle a migration scenario in which old events need to be re-indexed and "re-processed" to extract new fields as specified in the latest transforms.conf/props.conf.

The original source files may not be available (though I believe the Splunk indexers retain the _raw data).

Delete and re-play by the forwarders is not an option because of:

The huge amount of live data coming from the forwarders in addition to the "now required" historical data.
The non-availability of the original log files at the forwarders beyond a certain time.
The large number of log files involved in a distributed production environment.

I read the posting: http://answers.splunk.com/questions/684/after-fixing-props-conf-how-to-re-index-the-same-files-using... but cannot use that method, as delete and re-play is costly and probably not feasible in our environment.

The only other way I can think of to get around this problem is to extract almost all fields at search time, but I am not sure whether this is the best or the standard approach.

Is there a Splunk-supported way of re-indexing/re-processing old events other than delete and re-play?

(or)

Is delete and re-play the only way to re-index old events?

Any suggestions or ideas are greatly appreciated.

Thanks

gkanapathy · ‎06-21-2010

I think your best choice is probably to do search-time extractions. There are very few cases where index-time extraction will gain you anything, and it is (as you can see) much less flexible. Remember, all text in Splunk is already indexed, so indexing specific fields separately is usually superfluous.

kris2000 · ‎06-22-2010

The data is "partly" indexed and extracted.

The fields include custom timestamps, task durations, etc.

For example, each line in the file looks like this:

F1 F2 F3,F4:F5

(F1, F2.. are fields)

My question is this: suppose we initially decide to extract only F1 and F2 using transforms.conf and props.conf, but later decide to also extract the remaining fields (F3, F4, F5) from old data/events by changing transforms.conf.

How do I apply the new rule in transforms.conf to the old data/events to extract/index the remaining fields (F3, F4, F5)?

Simeon · ‎06-21-2010

Can you elaborate on the data a bit further, especially with respect to the specific fields you want to extract (indexed, not indexed, extracted)?

Simeon · ‎06-21-2010

You can apply new search-time field extractions to existing data at any time. However, you cannot alter indexed fields such as sourcetype, source, or index on events that have already been indexed. If you do have indexed fields, there are ways to relabel the data, including the use of tags.
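
For the sample layout described earlier in the thread (F1 F2 F3,F4:F5), a search-time extraction for the remaining fields might look roughly like the sketch below. The sourcetype name my_sourcetype, the class name remaining_fields, and the stanza name extract_f3_f4_f5 are placeholders, not anything from the original configuration. The regex assumes that F1 and F2 contain no whitespace, F3 contains no comma, and F4 contains no colon. Because REPORT- extractions are evaluated when a search runs, they should apply to events that are already indexed, with no re-indexing.

```
# props.conf ("my_sourcetype" is a hypothetical sourcetype name)
[my_sourcetype]
# REPORT-<class> points at a transforms.conf stanza and runs at search time
REPORT-remaining_fields = extract_f3_f4_f5

# transforms.conf (the stanza name is hypothetical)
[extract_f3_f4_f5]
# Skip F1 and F2, then capture F3 (up to the comma), F4 (up to the colon), and F5 (the rest)
REGEX = ^\S+\s+\S+\s+([^,]+),([^:]+):(.*)$
FORMAT = F3::$1 F4::$2 F5::$3
```

These settings would go on the search head (or in an app visible to it). Once they are in place, a search over an old time range for that sourcetype should show F3, F4, and F5 alongside the fields that were extracted before.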

Re-processing of old, previously indexed events to extract additional fields using new transforms.conf/props.conf
