Splunk Search

condense many line response results to a small table

drinkingjimmy
Explorer

I'm working with email response data which comes into my index in individual messages. Each email message can have more than 100 entries in the index.

I'm trying to build tables to make the data easy to read.

This is what some simplified and sanitized results from my search look like:

[01:00:22.164297] x=ABC mod=mail cmd=msg rule=ruleQ subject="Test 123" size=8583
[01:00:22.136496] x=ABC mod=spam cmd=run rule=notspam 
[01:00:22.106325] x=ABC mod=spam cmd=run policy=outbound
[01:00:22.067675] x=ABC mod=mail cmd=attachment file=text.html size=3347 
[01:00:22.039732] x=ABC mod=mail cmd=attachment file=text.txt size=2093 
[01:00:22.010986] x=ABC mod=session cmd=data rcpt=personA@rec.org
[01:00:22.010986] x=ABC mod=session cmd=data rcpt=personB@rec.org
[01:00:22.000234] x=ABC mod=mail sender=noreply@sender.org

Tabled to show how data is structured for columns I care about:

╔═══════════════════╦═════╦══════════╦═══════════╦════════════════════╦═════════════════╦══════╦═════════╗
║        time       ║  x  ║ subject  ║   file    ║       sender       ║      rcpt       ║ size ║  rule   ║
╠═══════════════════╬═════╬══════════╬═══════════╬════════════════════╬═════════════════╬══════╬═════════╣
║ [01:00:22.164297] ║ ABC ║ Test 123 ║           ║                    ║                 ║ 8583 ║ ruleQ   ║
║ [01:00:22.136496] ║ ABC ║          ║           ║                    ║                 ║      ║ notspam ║
║ [01:00:22.106325] ║ ABC ║          ║           ║                    ║                 ║      ║         ║
║ [01:00:22.067675] ║ ABC ║          ║ text.html ║                    ║                 ║ 3347 ║         ║
║ [01:00:22.039732] ║ ABC ║          ║ text.txt  ║                    ║                 ║ 2093 ║         ║
║ [01:00:22.010986] ║ ABC ║          ║           ║                    ║ personA@rec.org ║      ║         ║
║ [01:00:22.010986] ║ ABC ║          ║           ║                    ║ personB@rec.org ║      ║         ║
║ [01:00:22.000234] ║ ABC ║          ║           ║ noreply@sender.org ║                 ║      ║         ║
╚═══════════════════╩═════╩══════════╩═══════════╩════════════════════╩═════════════════╩══════╩═════════╝

This is what I'd like to get back:

╔═══════════════════╦═════╦══════════╦═══════════╦════════════════════╦═════════════════╦══════╦═════════╗
║        time       ║  x  ║ subject  ║   file    ║       sender       ║      rcpt       ║ size ║  rule   ║
╠═══════════════════╬═════╬══════════╬═══════════╬════════════════════╬═════════════════╬══════╬═════════╣
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.html ║ noreply@sender.org ║ personA@rec.org ║ 3347 ║ notspam ║
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.txt  ║ noreply@sender.org ║ personA@rec.org ║ 2093 ║ notspam ║
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.html ║ noreply@sender.org ║ personB@rec.org ║ 3347 ║ notspam ║
║ [01:00:22.164297] ║ ABC ║ Test 123 ║ text.txt  ║ noreply@sender.org ║ personB@rec.org ║ 2093 ║ notspam ║
╚═══════════════════╩═════╩══════════╩═══════════╩════════════════════╩═════════════════╩══════╩═════════╝

As you can see, the transformations I want for the data include:

  • creating a unique row for each person receiving each attachment
  • the size value is for the attachment, while the size of the whole
    message is dropped
  • The time from the entry which contains the subject name is used for each entry
  • The 'rule' from mod=spam AND rule!=null fills in the rule column for each entry, and the rule from the line which contains subject is ignored
  • The subject, sender and rule get copied to every entry
0 Karma
1 Solution

somesoni2
SplunkTrust
SplunkTrust

Give this a try

your base search 
| eval filesize=if(isnotnull(file), file."#".size,null())
| eval rule=if(mod="spam" AND isnotnull(rule),rule,null())
| eval time=if(isnotnull(subject),time,null())
| stats values(filesize) as filesize values(subject) as subject values(sender) as sender values(rcpt) as rcpt values(rule) as rule values(time) as time by x
| mvexpand rcpt | mvexpand filesize
| rex field=filesize "(?<file>[^#]+)#(?<size>.+)" 
| table time x subject file sender rcpt size rule

View solution in original post

woodcock
Esteemed Legend

Too much, too late, but this works:

Spoof data:

| makeresults 
| eval raw="[01:00:22.164297] x=ABC mod=mail cmd=msg rule=ruleQ subject=\"Test 123\" size=8583::[01:00:22.136496] x=ABC mod=spam cmd=run rule=notspam::[01:00:22.106325] x=ABC mod=spam cmd=run policy=outbound::[01:00:22.067675] x=ABC mod=mail cmd=attachment file=text.html size=3347::[01:00:22.039732] x=ABC mod=mail cmd=attachment file=text.txt size=2093::[01:00:22.010986] x=ABC mod=session cmd=data rcpt=personA@rec.org::[01:00:22.010986] x=ABC mod=session cmd=data rcpt=personB@rec.org::[01:00:22.000234] x=ABC mod=mail sender=noreply@sender.org" 
| makemv delim="::" raw 
| mvexpand raw 
| rename raw AS _raw 
| rex "\[(?<time>[^\]]+)\]\s+x=(?<x>\w+)\s+mod=(?<mod>\w+)\s+cmd=(?<cmd>\w+)\s+rule=(?<rule>\w+)(?:\s+subject=\"(?<subject>[^\"]+)\"\s+size=(?<size>\d+))?" 
| rex "\[(?<time>[^\]]+)\]\s+x=(?<x>\w+)\s+mod=(?<mod>\w+)\s+cmd=(?<cmd>\w+)\s+policy=(?<policy>\w+)" 
| rex "\[(?<time>[^\]]+)\]\s+x=(?<x>\w+)\s+mod=(?<mod>\w+)\s+cmd=(?<cmd>\w+)\s+file=(?<file>.*)\s+size=(?<size>\d+)" 
| rex "\[(?<time>[^\]]+)\]\s+x=(?<x>\w+)\s+mod=(?<mod>\w+)\s+cmd=(?<cmd>\w+)\s+rcpt=(?<rcpt>.*)" 
| rex "\[(?<time>[^\]]+)\]\s+x=(?<x>\w+)\s+mod=(?<mod>\w+)\s+sender=(?<sender>.*)" 

Now the solution:

| eval spam_rule=if(mod="spam", rule, null()) 
| eventstats values(spam_rule) AS spam_rule by x 
| eval file_detail = time . ":-:" . file . ":-:" . size . ":-:" . spam_rule 
| fields - _raw _time time cmd mod policy file size rule spam_rule
| eventstats values(file_detail) AS file_detail BY x
| stats values(*) AS * BY file_detail rcpt x
| rex field=file_detail "^(?<time>.*):-:(?<file>.*):-:(?<size>.*):-:(?<rule>.*)$"
| table time x subject file sender rcpt size rule

Note that my solution preserves the time that the spam rule was executed on the file (which was not in the original ask, I know).

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Give this a try

your base search 
| eval filesize=if(isnotnull(file), file."#".size,null())
| eval rule=if(mod="spam" AND isnotnull(rule),rule,null())
| eval time=if(isnotnull(subject),time,null())
| stats values(filesize) as filesize values(subject) as subject values(sender) as sender values(rcpt) as rcpt values(rule) as rule values(time) as time by x
| mvexpand rcpt | mvexpand filesize
| rex field=filesize "(?<file>[^#]+)#(?<size>.+)" 
| table time x subject file sender rcpt size rule

drinkingjimmy
Explorer

I asked the same question on StackOverflow if you want to double dip on the points 🙂

0 Karma

drinkingjimmy
Explorer

This is much better than the direction I was going!

I was trying to use transaction, but couldn't match up the file sizes with the names after the fact.

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

1) You could use min(time) in line 5 and delete line 4.

2) line 3 is more likely to be | eval rule=if(isnull(subject),rule,null())

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...