Splunk Search

Is there any more accurate way to correlate it?

xsstest
Communicator

Good moring,everyone.

I have some events. They come from the same sourcetype.I want to get a detailed registration information of the user.But each user action will be split into multiple events, and I want to correlate them

alt text

As above:

the first table: (register_info)

if spl is index=tomcat sourcetype=regsiter "reg_basic_info" "mobile" "udid"

I can get the basic information,Sometimes successful registration, and sometimes registered unsuccessfully, Whether successful or unsuccessful,It will generate this event, This information includes _time、session_id、mobile、UDID、sip、sn

the second table: (register_result)

if spl is index=tomcat sourcetype=regsiter "reg_success" "uid" "mobile"

In this table, I can get the result of registration. If the user is registered successfully, it will generate an event. If the user registered unsuccessfully, this event will not be generated , Yes, here I think you can use session_id to associate the event of tables, and Identify a successful registered user

the third table : (register_location)

if spl is index=tomcat sourcetype=regsiter "reg_location" "province" "city"

In this table,I can get the location information of registered users,It may have multiple location information, depending on the user's different operations.Of course, this event may also be empty, for example: the user does not enable positioning service on the phone, it will not produce any log(event).Here, I think you can use mobileand sn to associate with the event of the first table,but you need to consider,If the location service is not enabled,This event will be empty,That means it doesn't generate this log.

the fourth table: (register_device)

if spl is index=tomcat sourcetype=regsiter "reg_device_info" "udid" "models"

In this table, I can get registered user's device information,I think you can use UDID directly to associate with the event of first table, Because each device has the only device ID,This may produce multiple events,we just have to choose one of them.

The last table, which is all the information I'd like to eventually get for a successful signup user。

You might say, I can do it with the join command, and when I use thejoin command, I find that if the search span is set to a day or more, the result will be inaccurate, and it may miss some results.

Is there a better way to associate this information and not miss any successful registered users?

Tags (1)
0 Karma
1 Solution

DalJeanis
Legend

This combines your first two searches...

index=tomcat sourcetype=register 
    ( "reg_basic_info" "mobile" "udid") 
 OR ( "reg_success" "uid" "mobile")
| fields _time session_id mobile UDID sip sn uid register_result SIP
| stats values(*) as * by session_id
| eval register_result=coalesce(register_result,"failure")

This combines your first three searches. We went with your assumption that the sessionid was not useful here, and assumed that if more than one record was present for a combination of mobile and sn, then the latest() one at time of the query would be used.

index=tomcat sourcetype=register 
    ( "reg_basic_info" "mobile" "udid") 
 OR ( "reg_success" "uid" "mobile")
 OR ( "reg_location" "province" "city")
| fields _time session_id mobile UDID sip sn uid register_result SIP province city district

| rename COMMENT as "mark location records, copy location information to other records, delete location information") 
| eval rectype=if(isnull(city),"keep","location")
| eventstats latest(province) as province latest(city) as city by mobile sn
| where rectype="keep"
| fields - rectype

| rename COMMENT as "roll records together by session") 
| stats values(*) as * by session_id
| eval register_result=coalesce(register_result,"failure")

If you need the last location record before a session or the first one after, then let us know and we can show you how that middle section needs to be recoded, with a sort and a streamstats, to get it.


This combines your four searches, using the same sort of logic with the UDID for device information.

index=tomcat sourcetype=register 
    ( "reg_basic_info" "mobile" "udid") 
 OR ( "reg_success" "uid" "mobile")
 OR ( "reg_location" "province" "city")
 OR ("reg_device_info" "udid" "models")
| fields _time session_id mobile UDID sip sn uid register_result SIP province city district mfr models DevName

| rename COMMENT as "mark location and device records, copy location and device information to other records, delete location and device information") 
| eval rectype=if(isnull(city) AND isnull(DevName),"keep","delete")
| eventstats latest(province) as province latest(city) as city by mobile sn
| eventstats latest(mfr) as mfr latest(models) as models latest(DevName) as DevName by UDID
| where rectype="keep"
| fields - rectype

| rename COMMENT as "roll records together by session") 
| stats values(*) as * by session_id
| eval register_result=coalesce(register_result,"failure")

View solution in original post

woodcock
Esteemed Legend

This worked for me:

| makeresults 
| eval raw="time=2017/11/23 9:27:58.000,table=reg_basic_info,session_id=A,mobile=13888888888,UUID=AAAAAA-AAAAA-BBBBB,sip=12.12.12.12,sn=0123144::time=2017/11/23 9:27:59.000,table=reg_basic_info,session_id=B,mobile=13888888888,UUID=AAAAAA-AAAAA-BBBBB,sip=12.12.12.12,sn=1231444::time=2017/11/23 9:27:58.000,table=reg_success,session_id=A,uid=123,register_result=success,sip=12.12.12.12::time=2017/11/23 9:27:59.000,table=reg_location,session_id=C,mobile=13888888888,province=Bejing,city=Bejing,district=Dongcheng,sn=0123144::time=2017/11/23 10:17:18.000,table=reg_location,session_id=A,mobile=13888888888,province=Bejing,city=Bejing,district=Dongcheng,sn=0447777::time=2017/11/23 9:27:57.000,table=reg_device_info,session_id=D,UUID=AAAAAA-AAAAA-BBBBB,mfr=Apple,models=iPhone,DevBase=My_Iphone::time=2017/11/23 9:31:57.000,table=reg_device_info,session_id=E,UUID=AAAAAA-AAAAA-BBBBB,,mfr=Apple,models=iPhone,DevBase=My_Iphone"
| makemv delim="::" raw
| mvexpand raw
| rename raw AS _raw
| kv
| eval _time=strptime(time, "%Y/%m/%d %H:%M:%S.%3N")
| fields - time
| sort 0 _time
| eventstats values(UUID) AS UUID BY session_id
| eventstats values(DevBase) AS DevBase values(mfr) AS mfr values(models) AS models BY UUID
| stats min(_time) AS _time values(*) AS * BY UUID session_id
| search uid="*"
| table _time mobile sip UUID register_result uid province city district mfr models DevBase

So try this:

(index=tomcat sourcetype=regsiter "reg_basic_info" "mobile" "udid")
OR
(index=tomcat sourcetype=regsiter "reg_success" "uid" "mobile")
OR
(index=tomcat sourcetype=regsiter "reg_location" "province" "city")
OR
(index=tomcat sourcetype=regsiter "reg_device_info" "udid" "models")
| eventstats values(UUID) AS UUID BY session_id
| eventstats values(DevBase) AS DevBase values(mfr) AS mfr values(models) AS models BY UUID
| stats min(_time) AS _time values(*) AS * BY UUID session_id
| search uid="*"
| table _time mobile sip UUID register_result uid province city district mfr models DevBase

DalJeanis
Legend

This combines your first two searches...

index=tomcat sourcetype=register 
    ( "reg_basic_info" "mobile" "udid") 
 OR ( "reg_success" "uid" "mobile")
| fields _time session_id mobile UDID sip sn uid register_result SIP
| stats values(*) as * by session_id
| eval register_result=coalesce(register_result,"failure")

This combines your first three searches. We went with your assumption that the sessionid was not useful here, and assumed that if more than one record was present for a combination of mobile and sn, then the latest() one at time of the query would be used.

index=tomcat sourcetype=register 
    ( "reg_basic_info" "mobile" "udid") 
 OR ( "reg_success" "uid" "mobile")
 OR ( "reg_location" "province" "city")
| fields _time session_id mobile UDID sip sn uid register_result SIP province city district

| rename COMMENT as "mark location records, copy location information to other records, delete location information") 
| eval rectype=if(isnull(city),"keep","location")
| eventstats latest(province) as province latest(city) as city by mobile sn
| where rectype="keep"
| fields - rectype

| rename COMMENT as "roll records together by session") 
| stats values(*) as * by session_id
| eval register_result=coalesce(register_result,"failure")

If you need the last location record before a session or the first one after, then let us know and we can show you how that middle section needs to be recoded, with a sort and a streamstats, to get it.


This combines your four searches, using the same sort of logic with the UDID for device information.

index=tomcat sourcetype=register 
    ( "reg_basic_info" "mobile" "udid") 
 OR ( "reg_success" "uid" "mobile")
 OR ( "reg_location" "province" "city")
 OR ("reg_device_info" "udid" "models")
| fields _time session_id mobile UDID sip sn uid register_result SIP province city district mfr models DevName

| rename COMMENT as "mark location and device records, copy location and device information to other records, delete location and device information") 
| eval rectype=if(isnull(city) AND isnull(DevName),"keep","delete")
| eventstats latest(province) as province latest(city) as city by mobile sn
| eventstats latest(mfr) as mfr latest(models) as models latest(DevName) as DevName by UDID
| where rectype="keep"
| fields - rectype

| rename COMMENT as "roll records together by session") 
| stats values(*) as * by session_id
| eval register_result=coalesce(register_result,"failure")
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...