Splunk Search

Using tstats to extract first element in multivalue field

BarnesLeo
Engager

I have a field that looks something like this in the event viewer:

project_sources: [
    {
        scmEvent: {
            message: message1
            sha: xxxxxx
            ......
    }
    {
        scmEvent: {
            message: message2
            sha: yyyyyyy
            ......
    }
]

My end goal is to extract scmEvent.sha for the first element in the sources array as efficiently as possible. Can I do that somehow with tstats?

These two queries give me the same results:
1. index=myIndex | stats count by project_sources{}.scmEvent.message, sources{}.scmEvent.sha
2. | tstats count where index=myIndex by project_sources{}.scmEvent.message, sources{}.scmEvent.sha

For these cases I will get a table with something like this:
message1 xxxxx 30
message1 yyyyy 30
message2 xxxxx 30
message2 yyyyy 30

So it seems like the stats commands are counting all combinations of fields in element 0 and element 1 of sources.

Can I extract only one of the elements using tstats?
(This may be a trivial thing to do, but I still haven't fully grasped how fields that are dictionaries/arrays work.)

Tags (2)
0 Karma

kamlesh_vaghela
SplunkTrust
SplunkTrust

Hi @BarnesLeo,

Can you please try the following search?

index=YOUR_INDEX | rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

My Sample Search:

| makeresults 
| eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message2\",\"sha\": \"yyyyyyy\"} ] }" 
| append 
    [| makeresults 
    | eval _raw="{\"project_sources\": [{\"scmEvent\": {\"message\": \"message1\",\"sha\": \"xxxxxx\"} },{\"scmEvent\": {\"message\": \"message3\",\"sha\": \"zzzzzz\"} ] }"] 
| kv 
| rename project_sources{}.scmEvent.message as project_sources_message, project_sources{}.scmEvent.sha as source_sha 
| eval temp = mvzip(project_sources_message,source_sha) 
| stats count by _time temp 
| eval project_sources_message=mvindex(split(temp,","),0),source_sha=mvindex(split(temp,","),1) 
| stats sum(count) as count by project_sources_message source_sha

THanks

0 Karma

BarnesLeo
Engager

Thanks for the answer, but this misses the point of the question. I know I can get the query to do what I want with mvzip or spath before using stats. The question is how to do it with tstats since I want maximum performance.

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...