Hello everyone. I'm trying to get a time chart of unique users from my IIS logs. Our apps are both authenticated and unauthenticated, so a unique user is both a unique username and a unique IP with no username. Its been a long day and I can't figure it out, but here's what I'm working with so far:
earliest=-7d@d latest=@d | timechart span=1d dc(cs_username)
Here's some sample data to help visualize:
cs_username,c_ip
User1,1.1.1.1
User2,2.2.2.2
User3,1.1.1.1
,3.1.1.1
,4.4.4.4
,1.1.1.1
NOTE: As pictured above, there could be duplicate IPs for users, and even a user with username having the same IP as a record containing no username.
Any tips? Thanks everyone!
@mcollins42, can you please try the following:
<YourBaseSearch>
| eval c_username=case(isnull(c_username),"N/A",true(),c_username)
| eval key=c_username."-".c_ip
| timechart span=1d dc(key) as distinctUsers
The way I read your requirements, the c_ip
only matters if there is no cs_username
. The same username logging on from two different IPs counts as only one distinct user. As such, I've added two test cases, a repeat user with a second IP "User2,5.2.2.2" and one more record that is blank cs_username
and a duplicate c_ip
"none,4.4.4.4".
| makeresults
| eval mydata="User1,1.1.1.1 User2,2.2.2.2 User3,1.1.1.1 User2,5.2.2.2 none,3.1.1.1 none,4.4.4.4 none,1.1.1.1 none,4.4.4.4"
| makemv mydata
| mvexpand mydata
| makemv delim="," mydata
| eval cs_username=mvindex(mydata,0)
| eval c_ip=mvindex(mydata,1)
| eval cs_username=if(cs_username="none",null(),cs_username)
| table cs_username,c_ip
| rename COMMENT as "The above enters your test data"
| rename COMMENT as "And here's the code"
| eval User=coalesce(cs_username,"((none))".c_ip)
| stats dc(User)
@mcollins42, can you please try the following:
<YourBaseSearch>
| eval c_username=case(isnull(c_username),"N/A",true(),c_username)
| eval key=c_username."-".c_ip
| timechart span=1d dc(key) as distinctUsers
This is a great solution! Thanks!
@mcollins42, I have converted to answer. Please accept to mark this question as answered!
@mcollins42, does the same field cs_username contain both unique username and a unique IP or are they two separate fields? If they are the same field then your query should give you the result which you are looking for.
Thanks for the quick reply! They are two different fields.
Can you give the field name for IP address? Also does the two fields exist on all events or not. If the cs_username is present will the IP address field be null and vice-versa?
Please give the following a try, assuming your ip address field name is client_ip
(replace with the actual field name)
<YourBaseSearch> earliest=-7d@d latest=@d
| eval user=coalesce(cs_username,client_ip)
| timechart span=1d dc(user) as disctinctCount
I've updated my description with some sample data. There will always be an IP, but may not be a username.
Then why dont you use c_ip for distinct count?
<YourBaseSearch>
| timechart span=1d dc(c_ip) as distinctUsers
Aha! Because if there are multiple usernames connecting from one IP (Or a username, and an unauthenticated person with no username), it needs to be multiple counts. Like this:
User1,1.1.1.1
User2,1.1.1.1
User3,1.1.1.1
,1.1.1.1
When the username is not available, do you want to consider it as generic bucket say "NA" OR you want to associate it with last User which used it?
If there's two different IPs with no usernames, that's two users. Two of the same IPs with no username is one user.
Thanks!