Splunk Search

How to get " stats count as hitcount " for logs saprated by commas in rows

krdeepu
New Member

Hi All,

Requesting your help with

Log Example.
I have 54 fields separated by comma
The field data is variable and cannot be controlled.
Hence like grep cannot use match commands
Lastly I have zero knowledge on programming

2017/07/06,ab2,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,vsys4,zone5,vpn
2017/07/06,ab3,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssys4,Aone5,vpn
2017/07/06,ab4,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssss4,zone5,vpn
2017/07/06,ab2,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,vsys4,zone5,vpn
2017/07/06,ab3,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssys4,Aone5,vpn
2017/07/06,ab3,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssys4,Aone5,vpn
2017/07/06,ab4,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssss4,zone5,vpn
2017/07/06,ab2,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,vsys4,zone5,vpn
2017/07/05,ab2,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,vsys4,zone5,vpn
2017/07/05,ab3,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssys4,Aone5,vpn
2017/07/05,ab4,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssss4,zone5,vpn
2017/07/05,ab2,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,vsys4,zone5,vpn
2017/07/05,ab3,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssys4,Aone5,vpn
2017/07/05,ab3,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssys4,Aone5,vpn
2017/07/05,ab4,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,ssss4,zone5,vpn
2017/07/05,ab2,ag4,baa24,3456,34567,Server52,10.2.5.24,127.24.57.25,tcp,rst,,,,,,,,,......,,,vsys4,zone5,vpn

I want to get hit counts for similar logs when fields 2,8,9,52 and 53 are same
Keep in mind that these field names cannot be used in command as it will change continuously
For example in field 52 if the name is vsys5 next log may ssys4 for ssss4

NO=========Fields
1=========2017/07/06,
2========= ab2,
3========= ag4,
4========= baa24,

5========= 3456,

6========= 34567,

7========= Server52,

8========= 10.2.5.24,
9========= 127.24.57.25,
10======== tcp,
11======== rst,
! ========= ****
! ========= ****
! ========= ****
! ========= ****
! ========= ****
52======== or or
53======== zone5,
54======== vpn

The result should group together similar logs having fields 2, 8 ,9,53 and 54 and show the total hit counts
The commands should not use any input from logs like vsys, zone as it keeps on changing and these field names are not constant
The command may be based on "," (comma) and OR index number
I should be able to capture the output based on position, whatever be in index 2,8,9,52,53 ........or so on

index2=====index8========index9=====index52=========index53

Desired result will be as below.

Field2====Field 2========== Field 3======Field 4==========Field 5=============hit count
ab2======10.2.5.24======127.24.57.25====vsys4==========zone5===================6
ab3======10.2.5.24======127.24.57.25====ssys4==========zone5===================6
ab3======10.2.5.24======127.24.57.25====ssss4==========zone5===================4

Kindly keep in mind that I have 54 fields separated by comma.

I tried with the below command. But I am not able to get output after field 13.
-----------------------------------------------------------------------------------------------------------------

sourcetype=************** | rex "^(?<index1>[^\,]+)\,(?<index2>[^\,]+)\,(?<index3>[^\,]+)\,(?<index4>[^\,]+)\,(?<index5>[^\,]+)\,(?<index6>[^\,]+)\,(?<index7>[^\,]+)\,(?<index8>[^\,]+)\,(?<index9>[^\,]+)\,(?<index10>[^\,]+)\,(?<index11>[^\,]+)\,(?<index12>[^\,]+)\,(?<index13>[^\s]+)" | stats count as hitcount by index12 index13

Apart from that as I do not have necessary privilege I am unable to upload screenshots.

Hi Dal Jeanis,
Thanks for you advise but your answer is based on the scenario that after 9th field doesn't matter or used field name "src="

The command may be based on "," (comma) and OR index number
I should be able to capture the output based on position, whatever be in index 2,8,9,52,53 ........or so on as I wish

***Keep in mind that these field names cannot be used in command as it will change continuously

For example in field 52 if the name is "vsys5" next log may "ssys4" for "ssss4" **
---------------------------------------------------------------------------------------------------------------------***

The question was modified after input from Dal Jeanis

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

You basically had it. You do not care about any fields after the 9th, so the variable number of fields doesn't matter, as long as you are using the start anchor ^ at the beginning of the rex (which you were).

| rex "^(?<mydate>[\d\/]{10}),[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,[^,]+,(?<src>[^,]+),(?<dest>[^,]+)"
| stats count as hitcount by mydate src dest

By the way, when posting here on the forums be sure to mark your code as code, for example using the button marked 101 010, so the interface doesn't delete words in angle brackets < > that look like html tags.

If you want to keep all the data, and your hitcount won't be in the hundreds, then you could do this...

| stats count as hitcount list(_raw) as fulldata by mydate src dest

This version could also be useful, if you wanted some of the other field data to be retained...

| rex "^(?<mydate>[\d\/]{10}),"
| rex ",(?<allmyfields>[^,]+" max_match=0
| eval src=mvindex(allmyfields,6)
| eval dest=mvindex(allmyfields,7)
| eval someotherfield=mvindex(allmyfields, the field number minus 2)
| stats count as hitcount values(someotherfield) as someotherfield by mydate src dest
0 Karma

Richfez
SplunkTrust
SplunkTrust

You could also do this in your props.conf using INDEXED_EXTRACTIONS = CSV for this sourcetype. There's a lot of options, you could add a FIELD_NAMES of "f01, f02, f03, f04 ... f52, f53" then just ignore all the other extracted fields.

0 Karma

krdeepu
New Member

Hi DalJeanis / Rich,

I tried with all the commands given by both of you but did not work out. Kindly could you read my scenario again. I have modified the scenario to reflect correctly. The src dst or date names does not appear in the logs. Its just that these are separated by comma. I do have millions of hits

It could be because I may have made mistake somewhere.

sourcetype=XXXXXXXX | rex "(([^,]*),)((?<f02>[^,]*),)(([^,]*),){7}((?<f08>[^,]*),)((?<f09>[^,]*),)(([^,]*),){52}((?<f52>[^,]*),)((?<f53>[^,]*),)" | stats count as hitcount by f02 f08 f09 f52 f53


sourcetype=XXXXXXXX
 | rex ",(?<allmyfields>[^,]+")
 | eval f02=mvindex(allmyfields,2)
 | eval f08=mvindex(allmyfields,8)
 | eval f09=mvindex(allmyfields,9)
 | eval f52=mvindex(allmyfields,52)
 | eval f53=mvindex(allmyfields,53)
 | stats count as hitcount by f02 f08 f09 f52 f53
0 Karma

Richfez
SplunkTrust
SplunkTrust

I think DalJeanis' answer is what you need. Not needing what's after field 9 or not doesn't really matter because his example is more or less trivially extended to handle it, especially the last example.

For the first ones, here's an updated rex, more or less his with just a bit of a change to use repeats, that grab more.

 ... | rex "(([^,]*),)((?<f02>[^,]*),)(([^,]*),){7}((?<f08>[^,]*),)((?<f09>[^,]*),)(([^,]*),){42}((?<f52>[^,]*),)((?<f53>[^,]*),)"

That should grab - ASSUMING I did my math right! LOL - the second, eighth, ninth, fifty-second and fifty-third fields into f02, f08, f09, f52 and f53. You can name them whatever you would like. The key is the {42} means "repeat the last thing 42 times", essentially just skipping 42 fields.

You could also use his last example to do anything you'd like by adjusting the various mvindex evals. Like, use

| eval f02=mvindex(allmyfields,2)
| eval f08=mvindex(allmyfields,8)
| eval f09=mvindex(allmyfields,9)
| eval f52=mvindex(allmyfields,52)
| eval f53=mvindex(allmyfields,53)

(I may be off by one on the mvindex - just use whatever numbers you want or need!)

Give those a try, be sure to accept DalJeanis' answer if it ends up it really does work, or add a comment after it if you need to clarify something a bit more.

Happy Splunking!

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...