Splunk Search

Splitting a field after combining fields

psomeshwar
Path Finder

So, I have two indexes and sourcetypes with the following fields:

index1 and sourcetype1:

aip = 34.465.45.234

AppVendor = vendor1, vendor2, vendor3 (These are all from different events)

AppName = app2, app3, app1 (All from different events corresponding to position of the vendors above)

AppVersion = 3.0343, 1.354, 2.5465 (Same convention)

index2 and sourcetype2:

jsonevent.external_ip = 34.465.45.234

jsonevent.hostname = Host1

jsonevent.Username = User1


I use the following search:

(index=index1 sourcetype=sourcetype1) OR (index=index2 sourcetype=sourcetype2)

| rename jsonevent.external_ip as exip
| rename aip as agentip
| eval external_ip = coalesce(agentip, exip)
| stats values(jsonevent.hostname) as Hostnames, values(jsonevent.Username) as Users, values(AppVendor) as Vendors, values(AppName) as Applications, values(AppVersion) as Version by external_ip
| search Hostnames=* Users=* Vendors=* Applications=* Version=*


I get the following:

external_ip                    Hostnames         Usernames              Vendors             Applications                Version

34.465.45.234             Host1                    User1                          Vendor1           app1                                1.354
                                                                                                                  Vendor2           app2                                2.5465 
                                                                                                                  Vendor3           app3                                3.0343

 

What I want is the following:

external_ip                    Hostnames         Usernames              Vendors             Applications                Version

34.465.45.234             Host1                    User1                          Vendor1           app2                                3.0343
34.465.45.234             Host1                    User1                          Vendor2           app3                                1.354 
34.465.45.234             Host1                    User1                          Vendor3           app1                                2.5465

Does anyone have any ideas how to achieve this?

Labels (3)
0 Karma

psomeshwar
Path Finder

So, @ITWhisperer and @richgalloway.  I combined both offered solutions into the following which did end up working:

(index=index1 sourcetype=sourcetype1) OR (index=index2 sourcetype=sourcetype2)

| rename jsonevent.external_ip as exip
| rename aip as agentip
| eval external_ip = coalesce(agentip, exip)
| eventstats list(jsonevent.hostname) as Hostnames, list(jsonevent.Username) as Users by external_ip
| eval hostuser = mvzip(Hostnames, Users)
| mvexpand hostuser
| eval HostUser = split(hostuser, ",")
| eval Hostnames=mvindex(HostUser, 0), Users=mvindex(HostUser, 1)
| rename AppVendor as Vendors, AppName as Applications, AppVersion as Version
| where isnotnull(Vendors)
| search Hostnames=*, Users=*
| table external_ip, Hostnames, Users, Vendors, Applications, Version


This gave me the result I wanted, but I also ended up with a caution that the list command reached a limit of 100 and had to truncate some events, and the search as a whole slows to a crawl. Is the list command that resource intensive?

ITWhisperer
SplunkTrust
SplunkTrust

Since your sample data doesn't show more than one host or user for each ip address, guessing that list would be useful to get what you needed was beyond my knowledge. Perhaps you could provide a more representative example of the events you are dealing with next time so we might be able to suggest a suitable solution (hopefully avoiding memory issues if possible).

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Yes, the list function is resource-intensive.  It makes a list of every value for the specified field so that takes time and memory.

---
If this reply helps you, Karma would be appreciated.
0 Karma

richgalloway
SplunkTrust
SplunkTrust

The mvexpand command will split the multi-value fields into separate events.  The problem is doing so breaks the relationships with other multi-value fields.  To work around that, combine the three multi-value fields into a single multi-value field, use mvexpand, then split the fields apart.

| rename jsonevent.external_ip as exip
| rename aip as agentip
| eval external_ip = coalesce(agentip, exip)
| stats values(jsonevent.hostname) as Hostnames, values(jsonevent.Username) as Users, values(AppVendor) as Vendors, values(AppName) as Applications, values(AppVersion) as Version by external_ip
| eval tuple=mvzip(Hostnames, mvzip(Usernames, mvzip(Vendors, mvzip(Applications, Version))))
| mvexpand tuple
| eval tuple=split(tuple, ",")
| eval Hostnames=mvindex(tuple, 0), Usernames=mvindex(tuple, 1), Vendors=mvindex(tuple, 2), Applications=mvindex(tuple, 3),Version=mvindex(tuple, 4)
---
If this reply helps you, Karma would be appreciated.
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

The problem with using values() is that the multivalue fields are each sorted lexicographically independently and the original relationship between the values from the events is lost before the mvzip/mvexpand/mvindex fixup.

richgalloway
SplunkTrust
SplunkTrust

Excellent point.  My answer should use list rather than values.

---
If this reply helps you, Karma would be appreciated.
0 Karma

ITWhisperer
SplunkTrust
SplunkTrust

Try something like this

(index=index1 sourcetype=sourcetype1) OR (index=index2 sourcetype=sourcetype2)

| rename jsonevent.external_ip as exip
| rename aip as agentip
| eval external_ip = coalesce(agentip, exip)
| eventstats values(jsonevent.hostname) as Hostnames, values(jsonevent.Username) as Usernames by external_ip
| rename AppVendor as Vendors, AppName as Applications, AppVersion as Version
| where isnotnull(Vendors)
| table external_ip Hostnames Usernames Vendors Applications Version
0 Karma

psomeshwar
Path Finder

I used this search and it did work, however, something that I probably should have mentioned earlier is that multiple hosts and users are linked to the same external ip, so I am now getting multivalue fields for the Hostnames and Users. Anything that can be done for that? Or should I combine the two fields beforehand, then split them after the eventstats command?

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Get the T-shirt to Prove You Survived Splunk University Bootcamp

As if Splunk University, in Las Vegas, in-person, with three days of bootcamps and labs weren’t enough, now ...

Wondering How to Build Resiliency in the Cloud?

IT leaders are choosing Splunk Cloud as an ideal cloud transformation platform to drive business resilience,  ...