Including Indextime in Table

dbastidas
New Member

I am a fairly new Splunk user. I have 5 different source types. Each sourcetype represents a unique txt file that is generated every half hour. Each of the five txt files is written to a subdirectory of D:\testing\, e.g. D:\testing\subdir1, D:\testing\subdir2, etc.

I need to accomplish 4 goals:

  1. Search only the last log file generated for each sourcetype (the -30m@m timeframe).
  2. Count the number of Lines in each file (subtracting 1 line from each).
  3. Identify the Indextime for each of the 5 log files.
  4. Display "Sourcetype", "Count", and "Indextime" in one table (sorted by count), for a total of 5 rows and 3 columns of data.

Search #1 - Displays Sourcetype and Count in a table with no problems.

earliest=-30m@m | search source="D:\\testing\\*" | stats sum(linecount) as "linecount" by sourcetype | eval Count=linecount-1 | sort 0 - "Count" | table "sourcetype" "Count"

Search #2 - Displays Sourcetype and Indextime in a table with no problems.

earliest=-30m@m | search source="D:\\testing\\*" | eval "Indextime"=strftime(_indextime,"%+")| table "sourcetype" "Indextime"

Search #3 - When I try to combine both searches into one, I get results similar to Search #1 but with no data in the Indextime column.

earliest=-30m@m | search source="D:\\testing\\*" | stats sum(linecount) as "linecount" by sourcetype | eval Count=linecount-1 | eval "Indextime"=strftime(_indextime,"%+") | sort 0 "Count" | table "sourcetype" "Count" "Indextime"

I've been struggling with this for a couple of days and would appreciate it if someone could help me come up with a solution that I can try.

Note that I have no choice but to use Indextime because there are no timestamps in these txt files.
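
A likely reason the Indextime column in Search #3 comes back empty: stats only outputs the fields it names, so _indextime is no longer available when the later eval runs. One sketch of a way to carry it through — assuming the latest index time per sourcetype is what is wanted, and using idx only as a scratch field name — would be:

earliest=-30m@m | search source="D:\\testing\\*"
| stats sum(linecount) as linecount latest(_indextime) as idx by sourcetype
| eval Count=linecount-1
| eval Indextime=strftime(idx,"%+")
| sort 0 - Count
| table sourcetype Count Indextime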

1 Solution

lguinn2
Legend

First, you do have a timestamp. Splunk always creates a timestamp. If there is no timestamp in the events, Splunk will use the file mod time as the timestamp. If there is no file mod time (for example in a scripted input), Splunk will use the index time.

Second, I think your searches are more complicated than they need to be. Try these instead:

Single search:

source="D:\testing\*" earliest=-30m@m | stats count as Count by sourcetype | eval Count=Count-1 

Combined search:

source="D:\testing\*" earliest=-30m@m 
| stats count as Count latest(_time) as LatestTime by sourcetype source
| sort -LatestTime
| dedup sourcetype
| eval Count=Count-1 

The second search calculates the event count and timestamp for every file (assuming there will be multiple files per sourcetype). It then sorts the table with the most recent sources first. dedup keeps only the first (therefore most recent) entry for each sourcetype.
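
To end up with the three-column table asked for in the question, a few lines could be appended to that combined search — just a sketch, reusing strftime the same way the question's Search #2 did:

| eval Indextime=strftime(LatestTime,"%+")
| sort 0 - Count
| table sourcetype Count Indextime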

BTW, the count function in the stats command will work great if your text file has one line per event, and it is very efficient. However, if you have multi-line events, you can replace "count as Count" with "sum(linecount) as Count" in order to count actual lines instead of events.

lguinn2
Legend

You are not getting the latest count, you are getting the largest count. Look at your sort command. If that's what you want, then okay. But to test it, run the command with and without the dedup in it. You can see that the sorting will be largest count first, and dedup keeps the first event and discards the rest of the same sourcetype...
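
A variant along those lines — a sketch that merges the two searches above, keeping the most recent source per sourcetype first and only then ordering the final table by Count — might look like:

earliest=-30m@m | search source="D:\testing\*"
| stats sum(linecount) as Count latest(_time) as LatestTime by sourcetype source
| sort -LatestTime
| dedup sourcetype
| eval Count=Count-1
| eval Timestamp=strftime(LatestTime,"%D %H:%M %Z")
| sort 0 - Count
| table sourcetype Count Timestamp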


dbastidas
New Member

I was looking to use linecount, since one txt file = one event. You led me in the right direction, thanks! Here is the search I used that allowed me to accomplish all 4 goals.

earliest=-30m@m | search source="D:\testing\*"
| stats sum(linecount) as Count latest(_time) as LatestTime by sourcetype
| sort -Count
| dedup sourcetype
| eval Count=Count-1
| eval Timestamp=strftime(LatestTime,"%D %H:%M %Z")
| table sourcetype Count Timestamp
