I have a field in an event called access_date which will be the date of a read or write of an oracle_table. I need to
find the tables by owner and table_name with the latest access_date > 180 days ago.
ie
owner table_name access_date
Joe XYZ_table 01-MAR-2017
Frank XYZ_table 02-JAN-2010
Joe XYZ_table 21-AUG-2016
Sally ABC_table 01-JUN-2013
Sally ABC_table 10-SEP-2015
would return
Frank XYZ_table 02-JAN-2010
Sally ABC_table 10-SEP-2015
I cannot seem this to work - any help will be appreciated
Assuming that the date is a string and not an integer:
| makeresults
| eval raw="Joe XYZ_table 01-MAR-2017::Frank XYZ_table 02-JAN-2010::Joe XYZ_table 21-AUG-2016::Sally ABC_table 01-JUN-2013::Sally ABC_table 10-SEP-2015"
| makemv delim="::" raw
| mvexpand raw
| rename raw AS _raw
| rex "^(?<owner>\S+)\s+(?<table_name>\S+)\s+(?<access_date>\S+)$"
| fields - _*
| rename COMMENT AS "Everything above this line fakes sample data; everything below this line is your solution"
| eval access_date_epoch = strptime(access_date, "%d-%b-%Y")
| eval age = now() - access_date_epoch
| where age > (180 * 24 * 60 * 60) OR isnull(access_date_epoch)
| sort 0 - access_date_epoch
| dedup owner table_name
| fields - age access_date_epoch
Note that Splunk times use Unix/Posix/Epoch time:
https://en.wikipedia.org/wiki/Unix_time
Therefore times older than January 1, 1970 are not possible (less than 0). So the fact that the times disappear is no problem because we can just change the test from | where age > (180 * 24 * 60 * 60)
to | where age > (180 * 24 * 60 * 60) OR isnull(access_date_epoch)
.
Latest access date for XYZ table which is greater than 180 days ago should be 21-AUG-2016 as per data you have provided. Please confirm.
yes, but I want to produce a report that shows this going forward
Maybe I do not understand why 2016 is not picked. In case 2010 is picked for XYZ then the ame should be 2013 for ABC i.e. the earliest access time > 180 days and not the latest time > 180 days.
| eval access_date_epoch=strptime(access_date, "%d-%b-%Y")
| table access_date owner table_name access_date_epoch
| eval duration= now()-access_date_epoch
| eval threshold=180*24*60*60
| where duration>threshold
| sort -access_date_epoch
| dedup table_name
| table acces_date owner table_name
if you look in the example I wanted the latest date by owner and table_name > 180 days ago
2016 is not picked because that owner Joe and table XYZ was accessed in the first event on
01-MAR-2017
Joe XYZ_table 01-MAR-2017
Joe XYZ_table 21-AUG-2016
Like this:
... | stats max(access_date) AS LastAccessDate BY owner table_name
OR, if it comes to you sorted, like this:
... | dedup owner table_name
OR, like this:
... | sort 0 - access_date | dedup owner table_name
This assumes that the access_date is a number, but it probably isn't so you should convert access_date to epoch
in the SQL and then add this to end of your selected solution: | fieldformat access_date = strftime(access_date, "%d-%b-%Y")
and if you cannot do it in the SQL and must do it in the SPL, then add this to the beginning of your selected solution: | eval access_date = strptime(access_date, "%d-%b-%Y")
.
Does it matter that the access_date is Uppercase? MAY verses May
No, just try it.
...| eval access_date = strptime(access_date, "%d-%b-%Y") - I do not get any values at all in the access_date field
if I
...| eval new_access_date = strptime(access_date, "%d-%b-%Y") - I do not get a field new_access_date (shouldn't I get that field?)
this is where I have been stuck
it's something to do with the year...I have some dummy years of 1958 and they will not produce a result with
...| eval access_date = strptime(access_date, "%d-%b-%Y") -
Splunk times use Unix/Posix/Epoch time:
https://en.wikipedia.org/wiki/Unix_time
Therefore times older than January 1, 1970 are not possible (less than 0
). So the fact that the times disappear is no problem because you cah just change your test from | where age > (180 * 24 * 60 * 60)
to | where age > (180 * 24 * 60 * 60) OR isnull(access_date)
as I did in my new answer.
I missed the part about 180 days; stand-by for new answer with fake data (tested).