Splunk Search

Subsearch Question

Phynyte
New Member

Some background information on this. I have a CSV file that is being loaded every Monday. There are no time stamps in the file so everything has a time stamp of when the file was loaded. I set up my search to compare the previous file loaded vs the most recent. The current search I have will display a person if they show up on the previous report or the new one. I want to split these out into two separate searches. However, the only unique value that I can use to do this is the "_time". Can someone how me figure out a statement to add so I only pull results that has a time stamp the most recent file. Keep in mind I don't have a specific time that the file will be loaded.

**One important note, I'm comparing two files but at any given point there are ONLY 2 total time stamps present no matter how many logs each file has. Because when Splunk loads the file it's loading it with the current time it was loaded.

I was hoping I could use a where statement and for the value of @w1 but it's not possible.

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | table EEN Last_Name | stats count by EEN | where count=1 | table EEN | format] | table _time EEN Last_Name First_Name Job_Title | sort by EEN

Tags (1)
0 Karma
1 Solution

somesoni2
Revered Legend

Try this

This will give your list of persons appearing only on this week's report

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | stats first(_time) as _time] | eventstats count by EEN | where count=1 | table _time EEN Last_Name First_Name Job_Title | sort by EEN

This will give your list of persons appearing only on previous week's report

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | stats last(_time) as _time] | eventstats count by EEN | where count=1 | table _time EEN Last_Name First_Name Job_Title | sort by EEN

Updated answer:

Try this

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | table EEN Last_Name | stats count, first(_time) as _time by EEN | where count=1 | eventstats max(_time) as currentWeekTime | where _time=currentWeekTime | table EEN | format] | table _time EEN Last_Name First_Name Job_Title | sort by EEN

What I am doing here is that along with unique EEN, I am getting their _time value as well in subsearch. Then within subsearch itself taking the max value of _time which will be current week timestamp and then filtering EEN from current week only.

View solution in original post

somesoni2
Revered Legend

Try this

This will give your list of persons appearing only on this week's report

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | stats first(_time) as _time] | eventstats count by EEN | where count=1 | table _time EEN Last_Name First_Name Job_Title | sort by EEN

This will give your list of persons appearing only on previous week's report

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | stats last(_time) as _time] | eventstats count by EEN | where count=1 | table _time EEN Last_Name First_Name Job_Title | sort by EEN

Updated answer:

Try this

index=sensitive host=ActiveEmployeeList earliest=-7d@w1 [search index=sensitive host=ActiveEmployeeList earliest=-7d@w1 | table EEN Last_Name | stats count, first(_time) as _time by EEN | where count=1 | eventstats max(_time) as currentWeekTime | where _time=currentWeekTime | table EEN | format] | table _time EEN Last_Name First_Name Job_Title | sort by EEN

What I am doing here is that along with unique EEN, I am getting their _time value as well in subsearch. Then within subsearch itself taking the max value of _time which will be current week timestamp and then filtering EEN from current week only.

somesoni2
Revered Legend

I guess your current query does 1 and 2 already. For segregation of new employees (with timestamp of this weeks monday) and no-longer-employee (from previous monday's report), (since your said there will be only two possible value of timestamps), your can add '| eventstats max(_time) as CurrentWeek min(_time) as PreviousWeek | eval employeeType=case(_time=CurrentWeek,"New Employee",_time=PreviousWeek,"Terminated Employees") | fields - CurrentWeek, PreviousWeek' to your existing search.

Phynyte
New Member

To try and simplify this query i'm trying to write. I'm trying to do 3 things.
1. Find the EEN's that only appear 1 time when comparing the 2 files
2. Once those EEN's are found there are 2 possible time stamps that can be associated. One for new employees and one for people who are no longer employeed.
3. My goal is to separate those 2 groups by the only unique field between them (time stamp). So all new employees will have Time stamp A and the others time stamp B.
-The challenging part is I can' specify a specific time because this is going to run weekly from Monday to Monday.

0 Karma

Phynyte
New Member

That doesn't work. I think the subsearch needs to contain the "stats count by EEN | where count=1" because that portion compares the two files and removes all of the duplicates.

Then Once I have those EEN's my outer search will need to somehow split the times up.

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...