I need to index files that are summaries of data for a particular day. The data within each file is basically CSV with a delimiter of ":". There is NO timestamp (no date and no time) in the data, but there is a date in the filename. The filename format is XX_wordshere_20150921.
I know that I can do a search-time extraction to pull the XX out of the source field. How do I get Splunk to use the date in the filename plus a time of 12:00:00 as the timestamp for each event in the file?
I was able to get Splunk to use the date from the filename, but there had to be a time value on the events.
So I now pre-process the file before it is indexed. I wrote a script that builds a timestamp by extracting the date from the filename and adding a static time (it looks like: "2015-11-02 12:00:00"). I then append that timestamp to each event in the file.
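The poster's script is not shown, so the following is only a minimal Python sketch of that kind of pre-processing, not the original. The function names, the quoting of the appended value, and the choice to reuse ":" as the separator are all assumptions made for illustration.

```python
import re
from datetime import datetime


def timestamp_from_filename(filename, static_time="12:00:00"):
    """Build 'YYYY-MM-DD HH:MM:SS' from a YYYYMMDD date at the end of the file name."""
    # Assumes the date follows an underscore and precedes an optional extension.
    match = re.search(r"_(\d{8})(?:\.\w+)?$", filename)
    if match is None:
        raise ValueError(f"no YYYYMMDD date found in {filename!r}")
    day = datetime.strptime(match.group(1), "%Y%m%d")  # also rejects invalid dates
    return f"{day:%Y-%m-%d} {static_time}"


def stamp_events(filename, lines, delimiter=":"):
    """Append the same quoted timestamp to every non-empty event line."""
    stamp = timestamp_from_filename(filename)
    # The timestamp is quoted because it contains ':' itself (an assumption).
    return [f'{line.rstrip()}{delimiter}"{stamp}"' for line in lines if line.strip()]


if __name__ == "__main__":
    for event in stamp_events("XX_wordshere_20151102", ["a:b:c\n", "d:e:f\n"]):
        print(event)
```

Because the appended value contains colons, it may need quoting or a different separator, depending on how the fields are extracted afterwards.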
This is possible in Splunk Enterprise 7.2, making use of the new ingest-time eval. Full documentation is at https://docs.splunk.com/Documentation/Splunk/latest/Data/IngestEval.
Example
File Name: Log_I15_13092018.txt
File Name Format: Log_I15_%d%m%Y.txt
_time value assigned to events: 13/09/2018 00:00:00.000
props.conf
[mysourcetype]
TRANSFORMS=timestampeval
transforms.conf
[timestampeval]
INGEST_EVAL = _time=strptime(replace(source,".*(?=/)/",""),"Log_I15_%d%m%Y.txt")
This takes the "source" metadata value (which is the path and file name), removes the path, then extracts the date from the filename. The time defaults to 00:00:00.
All events in the file will have the same _time when imported.
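To make the two steps of that expression easier to follow, here is a rough Python equivalent. It only mirrors the logic (strip the path, then parse the file name with a format string). It is not how Splunk evaluates INGEST_EVAL, and the function name and sample path are made up for illustration.

```python
import re
from datetime import datetime


def time_from_source(source, name_format="Log_I15_%d%m%Y.txt"):
    """Mimic replace(source, ".*(?=/)/", "") followed by strptime(...)."""
    # Greedy match up to and including the last "/" removes the directory part.
    file_name = re.sub(r".*(?=/)/", "", source)
    # Literal text in the format must match the file name exactly.
    # No time directives are present, so the time part is 00:00:00.
    return datetime.strptime(file_name, name_format)


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    print(time_from_source("/var/log/app/Log_I15_13092018.txt"))
    # prints 2018-09-13 00:00:00
```

Trying the regex and the format string outside Splunk like this can be a quick check before editing transforms.conf, though Splunk's strptime variables are not identical to Python's.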
Hello,
This is my file name format, for example:
2019-04-03T07:33:05.929Z_1.91.0.192_1.88.0.0_5.9.6418.0.zip
Can you help me out with it?
Please note that my files are indexed from an S3 bucket using the AWS app.
Should I use a transforms file, or something in the AWS app?
Hi @sarit_s,
Sorry, I should have explained how mine works; that should help you understand how it applies to your question.
Hope this helps.
Thanks for sharing, @mthomas_splunk ! Great "hack" - much more elegant than using datetime.xml and source.
Here's my example that also worked:
[timestampeval]
INGEST_EVAL = _time=strptime(replace(source,".*(?=/)/",""),"%Y-%m-%dT%H:%M:%S%:z")
I have tried to follow the suggested posts, but it is not working. My filename is AB_countMetrics_20150921.csv. There is no date or time value in the actual data. I'd like the timestamp for each event in the file to be 2015-09-21_12:00:00.000.
I think that the default _masheddate defined in the datetime.xml file should parse the date, but I'm not confident in my regex skills.
For the props.conf settings, do I set the TIME_FORMAT entry to what I want it to be, or to what it is in the file?
I'm thinking the problem is the lack of time value anywhere to be found. Can the hh:mm:ss portion be set to a static value in the datetime.xml file?
Hi,
I'm facing the same issue. I don't have a time or date field in my events, and I want to pick up the date from my filename. I also tried the steps above with changes to the regex in datetime.xml, but there are still no results.
@lyndac: Have you found a solution to this problem?