Regex in inputs.conf

mtyrefors
Engager

Hi.
I have this "problem":
I get files delivered into the same folder containing the same data, but with different filenames:
APP_20140404 and APP_201404041337.

The first example is APP_DATE (delivered once a day, with all log entries for that day) and the second one is APP_DATE+TIMESTAMP (delivered once every 15 minutes, containing the last 15 minutes of log entries).

I cannot influence which of these files are delivered to me...

Is there any way that I can make sure that I only index the files with a timestamp in the filename?
I suspect that would be some sort of regex in inputs.conf?

1 Solution

DavidHourani
Super Champion

Hello,

You can do this via the graphical interface when adding the data input directly from your Splunk portal.

If you wish to do this via inputs.conf, then simply add the whitelist/blacklist conditions you want. In your example:

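Log_X_20140404.txt -> Regex: \w+_X_\d{8}\.txt

\w+: match one or more word characters (letters, digits or underscore)
_X_: match the literal _X_ after those word characters
\d{8}: match exactly 8 digits after _X_
\.txt: match a literal .txt after the 8 digits (the dot is escaped, because an unescaped . matches any character)

So all in all you're matching WORD_X_8digits.txt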

Log_X_20140404165218.txt -> Regex: \w+_X_\d{14}\.txt

Here it's the same as before, only you are matching 14 digits instead of 8.

So simply use one of those two in your whitelist and files matching only the other pattern won't be indexed.
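
To sanity-check the two patterns before putting one into inputs.conf, a quick test outside Splunk can help. The following is only a sketch in Python: Splunk uses PCRE, and Python's re module should behave the same for patterns this simple. The directory /data/app is hypothetical, and the escaped dot and trailing $ are assumptions added here, not something the thread prescribes. The search is unanchored at the start because the whitelist is matched against the full path.

```python
import re

# Patterns from the answer, with the dot before "txt" escaped and a
# trailing "$" added (both are assumptions made for this sketch).
DATE_ONLY = re.compile(r"\w+_X_\d{8}\.txt$")
DATE_TIME = re.compile(r"\w+_X_\d{14}\.txt$")

# Hypothetical monitored directory; only the file names come from the thread.
paths = [
    "/data/app/Log_X_20140404.txt",
    "/data/app/Log_X_20140404165218.txt",
]

def matching(pattern, candidates):
    """Return the paths in which the pattern is found anywhere in the string."""
    return [p for p in candidates if pattern.search(p)]

print(matching(DATE_ONLY, paths))  # only the daily file
print(matching(DATE_TIME, paths))  # only the date+time file
```

Each pattern selects exactly one of the two file names, so whichever one goes into the whitelist leaves the other file out.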

You can refer to the following site for more help creating your regex: http://regex101.com/

Best regards,
David

MuS
Legend

Hi mtyrefors,

you can use whitelist and/or blacklist in inputs.conf for this:

whitelist = <regular expression>
* If set, files from this input are monitored only if their path matches the specified regex.
* Takes precedence over the deprecated _whitelist attribute, which functions the same way.

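As an example, whitelist only what you want (assumption: the file name is always like in your example, APP_ followed by 12 digits), like this:

[yourmonitorstanza]
whitelist = \w+_\d{12}

Do not add a catch-all blacklist such as blacklist = .+ on top of this: if a file matches both settings, the blacklist wins and nothing is indexed.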
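
How whitelist and blacklist interact can be sketched in a few lines of Python. This is a simplified model, not Splunk's implementation: it assumes the documented rule that a path matching the blacklist is never monitored, even if it also matches the whitelist. The function is_monitored, the directory /data/app and the trailing $ on the whitelist are all hypothetical additions for illustration.

```python
import re

def is_monitored(path, whitelist=None, blacklist=None):
    """Simplified model of monitor filtering: a blacklist match always
    excludes; otherwise the path must match the whitelist, if one is set."""
    if blacklist is not None and re.search(blacklist, path):
        return False
    if whitelist is not None and not re.search(whitelist, path):
        return False
    return True

# Hypothetical directory; the file names are the ones from the question.
daily = "/data/app/APP_20140404"
quarter_hourly = "/data/app/APP_201404041337"

WHITELIST = r"\w+_\d{12}$"  # "$" is an assumption added for this sketch

print(is_monitored(daily, whitelist=WHITELIST))
print(is_monitored(quarter_hourly, whitelist=WHITELIST))
print(is_monitored(quarter_hourly, whitelist=WHITELIST, blacklist=r".+"))
```

Under this model the whitelist alone keeps the 15-minute files and drops the daily file, while a blacklist of .+ excludes every path regardless of the whitelist.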

hope this helps ...

cheers, MuS

mtyrefors
Engager

Hi and thanks.
My requirements have changed a little and I am having some trouble adapting your response to that.

The files look like this.
* Log_X_20140404.txt
* Log_X_20140404165218.txt

That is, Log_X_DATE.txt and Log_X_DATE+TIME.txt.

Can you help?

Thanks.
