I am trying to combine the search results from 3 separate sources logs and then compare the results against it against a employee list and display only the items in the employees.csv that don't show up in the search. The employees.csv contains some users that have a special status so I need to exclude those from the search as well. Here is my search term:
|inputlookup employees.csv | search NOT status="*" | search NOT [ search [ search index=main sourcetype="syslog" host="ssh-server" "session opened for user" | dedup userLogin | eval login = upper(userLogin)| fields login] | append [search sourcetype="MDT" | dedup EMP_ID | eval op_code = upper(EMP_ID)| fields op_code] | append [search host="ssh-server" who: | dedup userLogin | eval login = upper(userLogin)| fields login] | append [search source=webapps/var-log/httpd/log* | dedup userLogin | eval login = upper(userLogin)| fields login ] ]
employees.csv ---------- has the following fileds: full_name, login, op_code, status
search NOT status="*"
-------- is used to exclude those employees with a special status from the CSV
search NOT [ search [ search index=main sourcetype="syslog" host="ssh-server" "session opened for user" | dedup userLogin | eval login = upper(userLogin)| fields login
------ pull the ssh logins from the server dedup the userLogin field, convert it to uppercase and call it login to match the CSV field.
search sourcetype="MDT" | dedup EMP_ID | eval op_code = upper(EMP_ID)| fields op_code]
-------- pull the logins from MDT field is called EMP_ID and then converted to uppercase and converted to op_code for the CSV.
search host="ssh-server" who: | dedup userLogin | eval login = upper(userLogin)| fields login
--------- a script dumps the users currently logged in to the syslog with the tag of "who:" in them. This is done to catch users who have been logged in for a long time frame.
search source=webapps/var-log/httpd/log* | dedup userLogin | eval login = upper(userLogin)| fields login
-------- Pull user logins from the web server convert the userLogin to uppercase and call it login to match the CSV file.
I hope this breakdown helps explain my search. The goal of this is to find employees that have not logged into the system in a set time frame. I get good results running each search independently but when I combine them I get bad results so I am missing something.
This is what I ended up with:
(index=main sourcetype="syslog" host="ssh-server" "session opened for user" ) OR (sourcetype="MDT") OR (host="ssh-server" who:) OR (source=webapps/var-log/httpd/log*)
| eval login=upper(userLogin)
| eval op_code=upper(EMP_ID)
| fields login status
| dedup login
| eval status="OK"
| inputlookup employees.csv append=t
| search NOT status=R
| search NOT status=K
| search NOT status=Q
| stats count by login
| where count<2
Thank you to everyone that helped out!
This is what I ended up with:
(index=main sourcetype="syslog" host="ssh-server" "session opened for user" ) OR (sourcetype="MDT") OR (host="ssh-server" who:) OR (source=webapps/var-log/httpd/log*)
| eval login=upper(userLogin)
| eval op_code=upper(EMP_ID)
| fields login status
| dedup login
| eval status="OK"
| inputlookup employees.csv append=t
| search NOT status=R
| search NOT status=K
| search NOT status=Q
| stats count by login
| where count<2
Thank you to everyone that helped out!
I am wanting to get full_name, login, op_code, back as the return results. Currently I am getting several false hits from all of the sources. I haven't been able to even get 2 of them combined before it begins to return false hits.
EDIT:
I accidentally used inputlookup instead of inputcsv. Can you try it now with the search corrected below? NOTE that employees.csv needs to be located ins $SPLUNK_HOME$/var/run/Splunk. If it still fails, can you tell where it fails if you perform the search pipe by pipe?
Would something like this do what you need? It should work if all the logins you want to check against are in your spreadsheet once and only once.
(index=main sourcetype="syslog" host="ssh-server" "session opened for user" ) OR (sourcetype="MDT") OR (host="ssh-server" who:) OR (source=webapps/var-log/httpd/log*)
| eval login=upper(userLogin)
| eval op_code=upper(EMP_ID)
| fields op_code login
| dedup login
| eval status="OK"
| inputcsv employees.csv append=t
| search NOT status="*"
| fields login op_code
| stats count by login values(op_code)
| where count>1
This search:
The logic is that if all logins are in the spreadsheet once and only once, combining a list of those login and a deduplicated list of logins from all of your other sourcetypes should yield a count of 1 for a login that IS in your spreadsheet but NOT in your combined sourcetypes, and a count of 2 for all logins that are in both your spreadhseet and your combined sourcetypes.
I like the way this one looks, but it failed to return any results
Give this a try
|inputlookup employees.csv | search NOT status="*"
| search NOT
[search index=main sourcetype="syslog" host="ssh-server" "session opened for user" | dedup userLogin | eval login = upper(userLogin)| fields login
| append [search host="ssh-server" who: | dedup userLogin | eval login = upper(userLogin)| fields login]
| append [search source=webapps/var-log/httpd/log* | dedup userLogin | eval login = upper(userLogin)| fields login]
| dedup login]
OR NOT [search sourcetype="MDT" | dedup EMP_ID | eval op_code = upper(EMP_ID)| fields op_code]
Explanation
Take all the rows from employees.csv with blank status.
Filter all the records where
EITHER login which matches from
1) index=main sourcetype="syslog" host="ssh-server" "session opened for user"
2) host="ssh-server" who:
3) source=webapps/var-log/httpd/log*
OR op_code matches from
1) sourcetype="MDT"
The queries are structured in the same way.
This resulted in 40 more false hits than the original one. Can you highlight the changes you made?
Could you be more specific? What fields do you want in the final results? What are you getting from your current combined search? When you combine searches one at a time, when do the results fail?
I am trying to get a list of employees that have not logged into those systems in the last x number of days to be displayed. I did test them interdependently but every time I combine them my results are wrong. It is like the method I am using to combine them is not working at all. I am hoping someone here can point out where the error may be.
What results are you trying to get?
Since your individual searches work as expected, try adding them together one at a time until your results go bad. Fix that problem and add the next search. Repeat until you get what you want.