Getting Data In

What is the best way to index old data with fixed dates?

changux
Builder

Hi all.

I have a set of logs without a timestamp field, so the timestamp is taken from the current time for each sourcetype (16 in total). One of my users was supposed to put these logs in a local folder once per day, and the Splunk forwarder would send them to the indexer, giving a daily report of the information. Sadly, my user didn't do this, and now I have old data waiting to be indexed under fixed dates, that is:

Oct3/log1.....log16
Oct4/log1.....log16
Oct5/log1.....log16

I have some reports showing the daily activity, so I can't index all the data at the same time 😞
The logs have the same names in every folder (log1.....log16) and don't include any date in their names.

Any suggestions for indexing data under a specific date? My dirty idea for now is to stop the Splunk server, change the server date, start Splunk, index the folder that matches that date, and repeat, changing the system date each time, until all the folders with fixed dates are done.

Thanks!

0 Karma
1 Solution

changux
Builder

Hi all.
I have a solution 🙂

While reading:

https://docs.splunk.com/Documentation/Splunk/6.5.0/Admin/Propsconf

I tried DATETIME_CONFIG = NONE under the proper sourcetype stanza in props.conf and changed the modification time of the source file (this is easy under Linux):

touch -t 201608251513 log1

So,

-rw-r--r--@ 1 user  staff  5442513 Aug 25 15:13 log1

And next:

./splunk add oneshot ~/any/log1 -sourcetype "prueba" -index "test" -host "local"

The events' _time was indexed with the old date/time perfectly 🙂
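
For several dated folders, the two steps above (set the file's modification time, then run a oneshot) could be scripted. This is only a minimal sketch under stated assumptions, not a tested procedure: it assumes folders named YYYYMMDDHHMM, as in /files/splunk/201610151212, interpreted in local time, and that one sourcetype is passed per call. The helper names `oneshot_command` and `backdate_folder` and the `splunk_bin` default are hypothetical. It uses only the CLI flags shown above, and it builds the commands without running them.

```python
import os
from datetime import datetime
from pathlib import Path


def oneshot_command(path, sourcetype, index, host, splunk_bin="./splunk"):
    """Build the same 'add oneshot' command line used above (not executed here)."""
    return [splunk_bin, "add", "oneshot", str(path),
            "-sourcetype", sourcetype, "-index", index, "-host", host]


def backdate_folder(folder, sourcetype, index, host):
    """Set each file's mtime from the folder name and return the oneshot commands.

    Assumption: the folder name is a local-time stamp in YYYYMMDDHHMM form.
    """
    folder = Path(folder)
    stamp = datetime.strptime(folder.name, "%Y%m%d%H%M").timestamp()
    commands = []
    for log in sorted(p for p in folder.iterdir() if p.is_file()):
        # Same effect as: touch -t YYYYMMDDHHMM <file>
        os.utime(log, (stamp, stamp))
        commands.append(oneshot_command(log, sourcetype, index, host))
    return commands
```

Calling `backdate_folder("/files/splunk/201610151212", "prueba", "test", "local")` returns one argument list per file. Each list can be printed for review first, or passed to `subprocess.run` once the result looks right.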

Special thanks to @somesoni2's useful help.

adayton20
Contributor

I'm not sure what the timestamps look like in the old data you mentioned, but you may want to explore telling Splunk how the timestamp is formatted, and then converting it into a standardized format, using the TIME_PREFIX and TIME_FORMAT attributes in your props.conf.

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Configuretimestamprecognition

0 Karma

gcusello
SplunkTrust

Hi changux,
Check whether the timestamp that Splunk assigns to these events is the file's date and time.
If you instead want to assign the current date and time, you have to force it by setting DATETIME_CONFIG = CURRENT in the relevant stanza of props.conf.

Bye.
Giuseppe

0 Karma

somesoni2
SplunkTrust

You might be able to follow these steps to take the timestamp from the name. Of course, you would need to rename files to include the date they belong to.

http://blogs.splunk.com/2009/12/02/configure-splunk-to-pull-a-date-out-of-a-non-standard-filename/

0 Karma

changux
Builder

Sadly, the solution doesn't work. I followed the documentation and got a lot of unexpected errors in the logs:

10-21-2016 15:08:32.831 -0500 ERROR AggregatorMiningProcessor - Uncaught exception in Aggregator, skipping an event: Can't open DateParser XML configuration file "/splunk/splunk/etc/system/local/datetime.xml": No such file or directory - data_source="/files/splunk/201610061212/log1", data_host="Splunk", data_sourcetype="log1"

The props.conf references $SPLUNK_HOME/etc/system/local/datetime.xml, but Splunk insists on /splunk/splunk/...

I made a symbolic link so that the fake path /splunk/splunk resolves, and after a restart:

10-21-2016 15:10:01.956 -0500 WARN  DateParserVerbose - A possible timestamp match (Tue Jan 29 16:15:06 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|151

My datetime.xml looks like:

<define name="_masheddate3" extract="year, month, day, hour, minute">
    <text><![CDATA[source::/files/splunk/(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})]]></text>
</define>

Any idea?

0 Karma

somesoni2
SplunkTrust

Can you share your props.conf as well? I'm hoping that you are putting all these configurations on the indexer/heavy forwarder and restarting it after the change.

0 Karma

changux
Builder

Thanks. I am making my changes directly on the indexer...

[log1]
DATETIME_CONFIG = $SPLUNK_HOME/etc/system/local/datetime.xml
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Custom
disabled = false
pulldown_type = true
CHARSET = ISO-8859-1
crcSalt = <SOURCE>

In some cases I saw an error related to the small size of the file, so I included the crcSalt line...

0 Karma

somesoni2
SplunkTrust

Can you try specifying an absolute path for DATETIME_CONFIG, instead of $SPLUNK_HOME...?

The second error you received suggests that the month and day were recognized but not the year, causing Splunk to treat it as a 2017 date, which is further ahead than the default value of MAX_DAYS_HENCE (2 days).
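
The "acceptable time window" in that warning can be pictured as a simple range check around the current time. The sketch below is only an illustration, not Splunk's actual implementation. It assumes the documented defaults of 2000 days for MAX_DAYS_AGO and 2 days for MAX_DAYS_HENCE, and `within_window` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def within_window(candidate, now, max_days_ago=2000, max_days_hence=2):
    """Return True if candidate lies between now - max_days_ago and now + max_days_hence."""
    earliest = now - timedelta(days=max_days_ago)
    latest = now + timedelta(days=max_days_hence)
    return earliest <= candidate <= latest


# Time at which the warning was written to splunkd.log
now = datetime(2016, 10, 21, 15, 10, 1)

# The candidate reported in the warning: far more than 2000 days in the past
print(within_window(datetime(2002, 1, 29, 16, 15, 6), now))  # False

# The date encoded in the folder name: about six days in the past
print(within_window(datetime(2016, 10, 15, 12, 12), now))    # True
```

Under these assumptions, a timestamp correctly taken from the folder name would fall inside the window without any change to MAX_DAYS_AGO or MAX_DAYS_HENCE.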

0 Karma

changux
Builder

Hi.

This is my datetime.xml:

<define name="_masheddate3" extract="year, month, day">
        <text><![CDATA[source::/files/splunk/(\d{4})(\d{2})(\d{2})/]]></text>
</define>

An example folder is:

/files/splunk/201610151212/

In my splunkd.log:

10-25-2016 14:41:38.308 -0500 WARN  DateParserVerbose - A possible timestamp match (Thu Sep 19 17:33:58 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610181212/log1|host::Splunk|dis_recep_inspeccion|160

The timestamp should be 2016-10-15, but the one Splunk finds is incorrect. Any suggestions?

0 Karma

somesoni2
SplunkTrust

Give this a try

<define name="_masheddate3" extract="month, day, year">
        <text><![CDATA[(?:^|source::)\/files\/splunk\/(\d{4})(\d{2})(\d{2})\/]]></text>
</define>

Also, add <use name="_masheddate3"/> inside the <datePatterns>..</datePatterns> section of datetime.xml.

0 Karma

changux
Builder

I tried with year, month, day and:

10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Thu Jun 13 05:47:54 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122
10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Wed Aug 28 07:14:46 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122
10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Thu Aug 12 17:16:19 2004) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122
10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Sat Oct  5 01:54:52 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122
10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Wed Mar 13 00:20:38 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122
10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Wed Jan 30 10:23:20 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122
10-25-2016 15:43:21.027 -0500 WARN  DateParserVerbose - A possible timestamp match (Tue Jan 29 16:15:06 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610151212/log1|host::Splunk|log1|122

The example folder was:

/files/splunk/201610151212

My files:

<define name="_masheddate3" extract="year, month, day">
        <text><![CDATA[(?:^|source::)\/files\/splunk\/(\d{4})(\d{2})(\d{2})\/]]></text>
</define>

<datePatterns>
...
      <use name="_masheddate3"/>
...

</datePatterns>

😞

0 Karma

somesoni2
SplunkTrust

It should have been year, month, day. I forgot to update that when I took a sample from the default datetime.xml.

Could you try this

<define name="_masheddate3" extract="year, month, day">
         <text><![CDATA[(?:^|source::)\/files\/splunk\/(\d{4})(\d{2})(\d{2})\d{4}\/]]></text>
 </define>
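
The difference between this pattern and the earlier one can be checked outside Splunk. The sketch below runs both against the context string from the splunkd.log warnings, using Python's `re` module as a stand-in. Splunk uses its own regex engine, and whether datetime.xml patterns are matched against exactly this string is an assumption here. `extract_date` is a hypothetical helper name.

```python
import re

# Context string as it appears in the splunkd.log warnings
context = "source::/files/splunk/201610151212/log1"

# Earlier pattern: expects "/" right after YYYYMMDD, but the folder name also has HHMM
earlier = re.compile(r"source::/files/splunk/(\d{4})(\d{2})(\d{2})/")

# Revised pattern: consumes the trailing four digits (HHMM) before the slash
revised = re.compile(r"(?:^|source::)\/files\/splunk\/(\d{4})(\d{2})(\d{2})\d{4}\/")


def extract_date(pattern, text):
    """Return (year, month, day) as strings, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.groups() if match else None


print(extract_date(earlier, context))  # None
print(extract_date(revised, context))  # ('2016', '10', '15')
```

This only confirms that the revised expression captures the intended groups. It does not show that Splunk applies the pattern to the source path at all.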

Also, since you control the format of the date in the folder name, could you try providing an epoch value for the date (the path would be /files/splunk/1476489600 for Oct 15, 2016) and using the following in datetime.xml:

<define name="_masheddate3" extract="utcepoch">
         <text><![CDATA[(?:^|source::)\/files\/splunk\/(\d+)\/]]></text>
 </define>
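
If the epoch route were taken, the folder names could be derived from the existing ones. This is a minimal sketch that assumes the YYYYMMDDHHMM folder names are meant as UTC. `folder_to_epoch` is a hypothetical helper name. The value 1476489600 corresponds to midnight UTC on Oct 15, 2016.

```python
from datetime import datetime, timezone


def folder_to_epoch(name):
    """Convert a YYYYMMDDHHMM folder name to epoch seconds, treating it as UTC."""
    parsed = datetime.strptime(name, "%Y%m%d%H%M").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


print(folder_to_epoch("201610150000"))  # 1476489600
print(folder_to_epoch("201610151212"))  # 1476533520
```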
0 Karma

changux
Builder

Thanks. I tried with the first option, no luck.

10-25-2016 20:49:12.981 -0500 WARN  DateParserVerbose - A possible timestamp match (Sun Feb 16 15:29:36 2003) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610191212/log1|host::Splunk|log1|120
10-25-2016 20:49:12.981 -0500 WARN  DateParserVerbose - A possible timestamp match (Sun Jun  2 02:06:31 2002) is outside of the acceptable time window. If this timestamp is correct, consider adjusting MAX_DAYS_AGO and MAX_DAYS_HENCE. Context: source::/files/splunk/201610191212/log1|host::Splunk|log1|120
10-25-2016 20:49:12.981 -0500 INFO  IndexWriter - idx=cod-analisis Creating hot bucket=hot_v1_12, given event timestamped=1477446552

Using an epoch value in the folder name is not an option for now 😞

0 Karma

changux
Builder

month, day, year or year, month, day ?

0 Karma