Splunk Search

Creating a custom SourceType with multiple delimiters (35 fields)

whitehaven
Explorer

Hi all,

I've searched around a bit and I can't seem to find the answer after failing to figure it out myself.

The data I've got has multiple delimiters and I somehow got it to recognise two of them but not the third.

Hopefully this makes sense, but I have 35 fields that I want to delimit like this:
,,,,,,,::|||||||||||||||||||||||||

So basically the first 8 fields should be delimited by commas, the next 2 fields by colons, and the remaining 25 by pipes.

Has anyone had experience or success with this?

Thanks in advance.

0 Karma
1 Solution

gcusello
SplunkTrust

Hi @whitehaven,
I've never come across a situation like this!
Anyway, you can use a regex to extract the fields:

(?<field1>[^\|,:]*)(\||,|:)(?<field2>[^\|,:]*)(\||,|:)(?<field3>[^\|,:]*)(\||,|:)

or, if the number of delimiters of each kind is fixed (one in my example):

(?<field1>[^,]*),(?<field2>[^:]*):(?<field3>[^\|]*)\|

Ciao.
Giuseppe

whitehaven
Explorer

hi @gcusello

Thank you so much for pointing me in the direction of regex; I've never played in that area before.

Using regex101.com with some placeholder values, I've got it working there. I just need to get it conforming to Splunk's syntax now.

This is what I have so far:

Date,Time,Activity:Info:ComputerName|Country|City
2019-11-19,00:00:00,Browsing:Google:PC01|Australia|Brisbane
2019-11-19,00:30:00,Browsing:YouTube:PC02|Australia|Sydney

and the regex I use is:

(?<field1>[^\,]*)(,)(?<field2>[^\,]*)(,)(?<field3>[^\:]*)(:)(?<field4>[^\:]*)(:)(?<field5>[^\|]*)(\|)(?<field6>[^\|]*)(\|)(?<field7>[^\n]*)

I'm sure there's a way to say:
"use commas for the first 8 fields, colons for the next 2 and pipes for the final 25" and somehow add the field names in beforehand, but at this stage I'm probably just going to define all 35 fields once I figure out how to make Splunk recognise these 7 😄
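
One way to avoid typing all 35 groups by hand is to generate the pattern from a short description of the delimiters. The following is only a minimal sketch in Python, not something from the thread: `build_pattern` and its `python_syntax` flag are hypothetical names, and the generated pattern is anchored with `^` and assumes one event per line. Python's `re` module needs `(?P<name>...)` for named groups, so the flag exists only to test the pattern locally; the `(?<name>...)` form is what the posts here use.

```python
import re

def build_pattern(spec, python_syntax=False):
    """Build one named-group regex from (delimiter, count) pairs.

    Each pair adds `count` fields that end at `delimiter`; one final field
    takes whatever follows the last delimiter on the line.
    """
    # The thread uses (?<name>...); Python's re module needs (?P<name>...).
    group_open = "(?P<field" if python_syntax else "(?<field"
    parts = []
    index = 0
    for delim, count in spec:
        escaped = re.escape(delim)  # escapes regex metacharacters such as |
        for _ in range(count):
            index += 1
            parts.append(f"{group_open}{index}>[^{escaped}]*){escaped}")
    # Last field: everything up to the end of the line.
    parts.append(f"{group_open}{index + 1}>.*)")
    return "^" + "".join(parts)

# 7 commas, 2 colons, 25 pipes -> 35 fields, as described in the question.
splunk_regex = build_pattern([(",", 7), (":", 2), ("|", 25)])

# Check the same builder against the 7-field sample line from this post.
sample = "2019-11-19,00:00:00,Browsing:Google:PC01|Australia|Brisbane"
sample_pattern = build_pattern([(",", 2), (":", 2), ("|", 2)], python_syntax=True)
fields = re.match(sample_pattern, sample).groupdict()
```

Printing `splunk_regex` gives a string that could be pasted after `| rex`, provided the real data follows the same delimiter counts.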

0 Karma

whitehaven
Explorer

I've also had success using your second example this way

(?<field1>[^,]*),(?<field2>[^,]*),(?<field3>[^\:]*):(?<field4>[^\:]*):(?<field5>[^\|]*)\|(?<field6>[^\|]*)\|(?<field7>[^\n]*)

Again, still trying to sort the syntax for Splunk 😛

0 Karma

gcusello
SplunkTrust

Hi @whitehaven,
sorry, but I don't understand: what do you mean by "to sort the syntax for Splunk"?
Do you mean using this regex in Splunk searches?

If that is your problem, you can use this regex with the rex command:

my_search
| rex "(?<field1>[^,]*),(?<field2>[^,]*),(?<field3>[^\:]*):(?<field4>[^\:]*):(?<field5>[^\|]*)\|(?<field6>[^\|]*)\|(?<field7>[^\n]*)"
| ...

or by creating a new field and inserting this regex into it.

Ciao.
Giuseppe

0 Karma

whitehaven
Explorer

Mate, you're awesome

I did want to create a SourceType for this instead of doing an inline search, because with 35 fields it looks disgusting, but... it will work for now, so I've added this to the end of the search:

| rex "(?<field1>[^,]*),(?<field2>[^,]*),(?<field3>[^,]*),(?<field4>[^,]*),(?<field5>[^,]*),(?<field6>[^,]*),(?<field7>[^,]*),(?<field8>[^:]*):(?<field9>[^:]*):(?<field10>[^\|]*)\|(?<field11>[^\|]*)\|(?<field12>[^\|]*)\|(?<field13>[^\|]*)\|(?<field14>[^\|]*)\|(?<field15>[^\|]*)\|(?<field16>[^\|]*)\|(?<field17>[^\|]*)\|(?<field18>[^\|]*)\|(?<field19>[^\|]*)\|(?<field20>[^\|]*)\|(?<field21>[^\|]*)\|(?<field22>[^\|]*)\|(?<field23>[^\|]*)\|(?<field24>[^\|]*)\|(?<field25>[^\|]*)\|(?<field26>[^\|]*)\|(?<field27>[^\|]*)\|(?<field28>[^\|]*)\|(?<field29>[^\|]*)\|(?<field30>[^\|]*)\|(?<field31>[^\|]*)\|(?<field32>[^\|]*)\|(?<field33>[^\|]*)\|(?<field34>[^\|]*)\|(?<field35>[^\|]*)"

Do you think it's viable to try and get that info into the SourceType so it is already formatted like this? I've had no luck trying but the search does work.

Thanks so much for your help @gcusello

0 Karma

gcusello
SplunkTrust

Hi @whitehaven,
You could create a field extraction using this regex, so that all these fields are associated with the sourcetype.
Ciao.
Giuseppe
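
For anyone wondering what such a field extraction might look like on disk, search-time extractions can be defined in props.conf with an `EXTRACT-<class> = <regex>` setting under the sourcetype's stanza. The sketch below only renders that text in Python so it can be checked before being copied. It is an illustration under assumptions, not a tested Splunk configuration: `render_props_stanza`, the sourcetype name `my_custom_sourcetype` and the class name `delimited_fields` are all hypothetical, and the regex is the 7-field one posted earlier in this thread.

```python
def render_props_stanza(sourcetype, extraction_class, regex):
    """Return props.conf text with one search-time EXTRACT setting."""
    return f"[{sourcetype}]\nEXTRACT-{extraction_class} = {regex}\n"

# The 7-field regex posted earlier in this thread, unchanged.
SEVEN_FIELD_REGEX = (
    r"(?<field1>[^,]*),(?<field2>[^,]*),"
    r"(?<field3>[^\:]*):(?<field4>[^\:]*):"
    r"(?<field5>[^\|]*)\|(?<field6>[^\|]*)\|(?<field7>[^\n]*)"
)

# Both names below are hypothetical placeholders, not values from the thread.
stanza = render_props_stanza(
    "my_custom_sourcetype", "delimited_fields", SEVEN_FIELD_REGEX
)
print(stanza)
```

The same extraction can also be created through the field extraction pages in Splunk Web, which is probably the easier route for a first attempt.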

0 Karma

gcusello
SplunkTrust

Hi @whitehaven,
ok!
Just one small note: the pipe (|) is a special character and needs to be escaped; the comma (,) and colon (:) don't.

Ciao.
Giuseppe
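
The escaping note above can be checked with a few lines of Python, whose `re` module treats these characters the same way as the PCRE-style patterns in this thread. This is only a small illustration using sample values from the earlier posts. It also shows one detail not stated in the thread: inside a character class such as `[^|]` the pipe is already literal, so the backslash there is optional.

```python
import re

line = "PC01|Australia|Brisbane"

# Unescaped, | is alternation: match "PC01" OR "Australia".
alternation = re.findall(r"PC01|Australia", line)

# Escaped, \| is a literal pipe sitting between the two words.
literal = re.search(r"PC01\|Australia", line).group(0)

# Inside a character class the pipe is already literal, so [^|] and [^\|]
# behave the same; the extra backslash is harmless.
plain_class = re.findall(r"[^|]+", line)
escaped_class = re.findall(r"[^\|]+", line)

# A comma has no special meaning and can be used as it is.
date_time = re.split(r",", "2019-11-19,00:00:00")
```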

0 Karma