Getting Data In

After fixing props.conf, how to re-index the same files using the new settings?

Justin_Grant
Contributor

I accidentally imported some files into Splunk and the default line-breaking didn't work correctly. Now I want to repeat the import using a fixed props.conf and transforms.conf.

But I know that Splunk won't re-index files that it's already seen (meaning unchanged CRCs). How can I get Splunk to re-import these files?

gkanapathy
Splunk Employee

If it's a one-time re-import, I recommend you first delete the old source with the search command:

source=/full/path/to/file | delete

(remembering that the "delete" capability is not granted to any role, including admin, by default) and then use the oneshot input at the command line, or its equivalent in Manager > Data Input > Files and Directories, option "Index a file on the Splunk server". The CLI is:

./splunk add oneshot /full/path/to/file -sourcetype mysourcetype -index myindex -host myhostparam

The parameters other than the full path to the file (not a relative path) are optional. This is preferable to setting crcSalt to <SOURCE> for a one-time reload, since you don't need to then set it back if you don't want that enabled.
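
For repeated use, the oneshot command above could be wrapped in a small shell function. This is only a sketch: `oneshot`, `SPLUNK_BIN` and `DRY_RUN` are names made up here, not part of Splunk, and by default the function only prints the command instead of running it. It rejects relative paths and adds the optional parameters only when they are given. The delete search still has to be run separately first.

```shell
#!/bin/sh
# Sketch of a wrapper around "splunk add oneshot".
# SPLUNK_BIN and DRY_RUN are hypothetical names used only in this example.
SPLUNK_BIN="${SPLUNK_BIN:-./splunk}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command only, 0 = actually run it

# usage: oneshot FILE [SOURCETYPE] [INDEX] [HOST]
oneshot() {
    os_file="$1"; os_sourcetype="$2"; os_index="$3"; os_host="$4"

    # oneshot needs the full path to the file, not a relative one
    case "$os_file" in
        /*) ;;
        *) echo "oneshot: not a full path: $os_file" >&2; return 1 ;;
    esac

    # Start with the mandatory part, then append each optional parameter
    set -- "$SPLUNK_BIN" add oneshot "$os_file"
    if [ -n "$os_sourcetype" ]; then set -- "$@" -sourcetype "$os_sourcetype"; fi
    if [ -n "$os_index" ]; then set -- "$@" -index "$os_index"; fi
    if [ -n "$os_host" ]; then set -- "$@" -host "$os_host"; fi

    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With the defaults, calling `oneshot /full/path/to/file mysourcetype myindex` just prints the command so it can be checked; setting `DRY_RUN=0` runs it.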

Lowell
Super Champion

This is not an "answer" exactly, but more of a helpful technique that I've found that may be useful as a way of avoiding the situation where reindexing is necessary. I'm guessing that others that stumble across this question may find this helpful to them as well.


Splunk provides a command line tool to let you test your log file configuration (specifically the sourcetype association) without actually indexing anything.

splunk test sourcetype /path/to/my/file.log

This command will dump the props settings that have been associated based on source and sourcetype properties. This helps you know what settings Splunk would use if/when the specified file is indexed. I've found this useful for testing source pattern matching rules, and verifying that all the props settings are set up correctly.

It's also possible to make a config change and then re-run the tool to see if your change had the desired effect without restarting Splunk. I've saved many, many hours by using this tool instead of waiting for Splunk to restart, re-index, and then search, just to find out I had a silly configuration typo.

Hope this approach helps others to avoid re-indexing by becoming more proactive in confirming props.conf settings...

markrobinsonuk
Engager

The 'test' and 'train' commands have been deprecated.

Type "help [object|topic]" to view help on a specific object or topic.

thisissplunk
Builder

What do you do if you want to add many files at once? I need the source field to end up being /full/path/to/file/x.log. I have about 3,000 files I need to handle this way. I cannot stop the indexer just to clean my test index. I'm currently very frustrated by this functionality. I'm just trying to index data again so that it will have an updated source field.
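
One possible approach to the many-files case, sketched here rather than taken from the thread, is to loop over the files and issue one oneshot per file. `oneshot_dir`, `SPLUNK_BIN` and `DRY_RUN` are made-up names, the `*.log` pattern is an assumption, and by default the function only prints the commands. It assumes, as I understand it, that oneshot records the full path it was given as the source.

```shell
#!/bin/sh
# Sketch: one "splunk add oneshot" per *.log file under a directory.
# SPLUNK_BIN, DRY_RUN and oneshot_dir are hypothetical names for this example.
SPLUNK_BIN="${SPLUNK_BIN:-./splunk}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = actually run them

# usage: oneshot_dir DIR INDEX [SOURCETYPE]
oneshot_dir() {
    od_dir="$1"; od_index="$2"; od_sourcetype="$3"

    # A full directory path makes find print full file paths
    case "$od_dir" in
        /*) ;;
        *) echo "oneshot_dir: not a full path: $od_dir" >&2; return 1 ;;
    esac

    # Note: file names containing newlines are not handled by this loop
    find "$od_dir" -type f -name '*.log' | sort | while IFS= read -r od_file; do
        set -- "$SPLUNK_BIN" add oneshot "$od_file" -index "$od_index"
        if [ -n "$od_sourcetype" ]; then
            set -- "$@" -sourcetype "$od_sourcetype"
        fi
        if [ "$DRY_RUN" = "1" ]; then
            echo "$*"
        else
            # Keep the command from reading the file list on stdin
            "$@" </dev/null
        fi
    done
}
```

Running it first with the default `DRY_RUN=1` against a test index shows exactly which commands would be issued before anything is indexed.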

joxley
Path Finder

Read https://answers.splunk.com/answers/72562/how-to-reindex-data-from-a-forwarder.html for a really good answer covering different ways of reindexing the data.

gkanapathy
Splunk Employee

You should use a test index for new data sources anyway, because delete is dangerous and overuse of it will hurt search performance. You can still use oneshot with a test index though, and it makes the testing cycle easier. Using delete with oneshot is really for fixing mistakes.

Justin_Grant
Contributor

Dude, that's a genius idea! The only other way I knew to handle this case was with splunk clean and a test index, but your approach sounds much easier. Thanks!
