Splunk Search

Which users have done this but have not done that?

Mahieu
Communicator

Hello there,

I'm struggling a bit with the search language, booleans, eventtypes and so on, and I can't find a good way to create a very simple report.

What I'm trying to do is identify users who have connected to server A but have not connected to server B over a selected time period.

The logs look like this:
sourcetype=server_a username=johndoe "New connection to server A"
sourcetype=server_b username=johndoe "New connection to server B"

I'd like to have the list of users for which Splunk can find at least one "New connection to server A" but zero "New connection to server B" over the selected time period.

It looks very simple, but I guess I'm not using the search language efficiently. Of course, the selected time period could be something like 3 months, and there are lots of events, so "transaction" doesn't look like a good option.

Thanks a lot in advance for your help.

Mat

mukeshb
Explorer

Try using subsearches. Something like this should work:

sourcetype=server_a "New connection to server A" NOT [search sourcetype=server_b "New connection to server B" | dedup user | fields user] | table user

The above does the following:
The subsearch returns the users who logged into server B (this assumes the field name is user and is the same for both sourcetypes).
For performance, the subsearch returns only the user field.
The outer search finds the users that are NOT in the subsearch's list and shows them in a table.

Adjust it to your data and look up subsearches in the Splunk docs.
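
As an illustration of adapting the search: the sample events in the question carry the field username rather than user, so a version for those events might look like the sketch below. It assumes username is extracted under the same name in both sourcetypes. The extra dedup before the table only collapses repeated connections by the same user.

```spl
sourcetype=server_a "New connection to server A" NOT [search sourcetype=server_b "New connection to server B" | dedup username | fields username] | dedup username | table username
```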

David
Splunk Employee

It should be substantially faster to run a search like this:

sourcetype=server_a OR sourcetype=server_b username=johndoe | stats count(eval(searchmatch("New connection to server A"))) as ACount count(eval(searchmatch("New connection to server B"))) as BCount | where ACount>0 AND BCount=0

MuS
Legend

You just forgot to use OR sourcetype=server_b in the base search 😉

David
Splunk Employee

Good catch! Fixed.
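
As posted, the search still filters on username=johndoe and has no split-by field, so it returns at most one row, without a user name. A per-user variant could look like the sketch below. It assumes the field is called username in both sourcetypes. It also counts by sourcetype instead of using searchmatch, which is only equivalent because each sourcetype is restricted to its own connection message in the base search.

```spl
(sourcetype=server_a "New connection to server A") OR (sourcetype=server_b "New connection to server B")
| stats count(eval(sourcetype="server_a")) AS ACount count(eval(sourcetype="server_b")) AS BCount BY username
| where ACount > 0 AND BCount = 0
| table username
```

Because everything happens in a single pass over the events, no list of server B users has to be expanded into the search string.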

DalJeanis
Legend

While this would work, a subsearch is something that you should avoid if there's a decent alternative, which in this case is given in David [Splunk]'s answer.

There are many reasons to avoid subsearches. Most obviously, there's a limit on the number of results a subsearch can return, and in this case it turned out to be highly inefficient because the subsearch returned so many results.

Mahieu
Communicator

Hello there,

This works, thank you.
Still, it takes ages to run, even over a short time period.
The reason is that we have lots of users, and I mean LOTS.

When I inspect the search, the original "NOT [search sourcetype=server_b "New connection to server B" | dedup user| fields user]" becomes something like:

NOT ( ( user="xxxx1" ) OR ( da_user="xxxx2" ) OR ( da_user="xxxx3" ) OR ( da_user="xxxxx3" ) OR ( da_username="www") .......)

And this goes on and on: the total number of users in the "NOT" is around 40,000.

I guess this explains why the search takes forever to complete.
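
To look at what the subsearch expands to without opening the job inspector, one option is to run the inner search on its own and append the format command, which subsearches use implicitly. The sketch below assumes the field is username, as in the sample events. The result should be a single row whose search field holds the generated OR expression.

```spl
sourcetype=server_b "New connection to server B" | dedup username | fields username | format
```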

Any suggestions on how to improve the performance here?
Summary indexing does not look like a good option, as I'd need to "remove" information from my summary index in this case. I thought about an intermediary lookup table that would include the username and the last connection time, but I'm not sure it would make things faster.

Thoughts? Suggestions?

Thanks a lot in advance

Mat
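
The intermediary lookup idea could be sketched roughly as follows. The lookup file name server_b_users.csv and the field last_b_connection are hypothetical, and username is assumed to be the field name in both sourcetypes. A scheduled search would first record the latest server B connection per user:

```spl
sourcetype=server_b "New connection to server B"
| stats max(_time) AS last_b_connection BY username
| outputlookup server_b_users.csv
```

The report itself would then only scan server A events and check each user against the lookup:

```spl
sourcetype=server_a "New connection to server A"
| stats count AS ACount BY username
| lookup server_b_users.csv username OUTPUT last_b_connection
| addinfo
| where isnull(last_b_connection) OR last_b_connection < info_min_time
| table username
```

This is only a sketch under two assumptions: the selected time period ends at the present, and the scheduled search covers at least that period each time it rewrites the lookup. Whether it is actually faster would need to be tested on the real data.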
