split fields upon indexing/ use regex to split fields

mhornste
Path Finder

Hi,

I have source data comma delimited like this from JMeter:

timeStamp,elapsed,label,responseCode,responseMessage,threadName,dataType,success,failureMessage,bytes,sentBytes,grpThreads,allThreads,Latency,Hostname,IdleTime,Connect
2017/02/14 10:28:42.845,2645,"Login,Admin",200,OK,hostname 5-1,text,true,,66355,0,1,8,2109,JMeterServerHost,0,0

I would like to split the following two fields into four:

field name = label
Login,Admin to be split into these fields upon indexing:
--> label = Login
--> ACL = Admin

field name = threadName
hostname 5-1 to be split into these fields upon indexing:
--> targetHost = hostname
--> JMeterThread = 5-1

How can I configure this in props.conf/ transforms.conf to make it work?

I have already tried the following in props.conf (confirmed with a Splunk search):

Eval-TargetHost = substr(threadName, 1, len(threadName)-4)
--> removes the 5-1

Eval-JMeterThread = substr(threadName, len(threadName)-2, 3)
--> removes the hostname

Unfortunately, it doesn't work or I can't access the fields. I'm pretty new to regular expressions and splitting fields in props/ transforms.conf.

In addition, I found a regex which removes everything after the comma (,) in Login,Admin, but I don't know how to split the two parts as well. Here's the regex I have tested: \,.*$. It removes everything after Login, since there is a comma. But how can I create two new fields which contain Login and Admin?

Thanks!


mhornste
Path Finder

Hi,

I have added one stanza for each sourcetype since the host can change. Unfortunately, it is not working (I have restarted Splunk).

If I use the searches with the regular expression (I slightly changed the fields), I can access the fields and the data is displayed properly.

https://i.imgsafe.org/40cb74e826.png

My props.conf looks as follows:

# ignore JMeter Header
TRANSFORMS-IgnoreJMeterHeader = JMeterHeader_ignore
# assign fields
REPORT-JMeter = JMeterFields

[Augsburg]
EXTRACT-label = ^(?<labelNew>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHostNew>[^\s]+)\s(?<JMeterThread>.+) in threadName

[localhost]
EXTRACT-label = ^(?<labelNew>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHostNew>[^\s]+)\s(?<JMeterThread>.+) in threadName

[Augsburg_Workstation]
EXTRACT-label = ^(?<labelNew>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHostNew>[^\s]+)\s(?<JMeterThread>.+) in threadName

[Holeby]
EXTRACT-label = ^(?<labelNew>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHostNew>[^\s]+)\s(?<JMeterThread>.+) in threadName

[Kopenhagen]
EXTRACT-label = ^(?<labelNew>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHostNew>[^\s]+)\s(?<JMeterThread>.+) in threadName

[Oberhausen]
EXTRACT-label = ^(?<labelNew>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHostNew>[^\s]+)\s(?<JMeterThread>.+) in threadName

I have gone through the props.conf documentation and have also checked the splunkd log file but couldn't find a hint. Is there a way to check whether these EXTRACTs are loaded by Splunk?

0 Karma

gvmorley
Contributor

Well done for sticking with it.

Everything looks OK, so my guess would be that whatever you're doing to extract all of your original field names is applied later in the search-time sequence than these EXTRACT lines (a REPORT- transform, for example, runs after inline EXTRACTs). This would mean that label and threadName wouldn't be available yet for the EXTRACT lines in props.conf to reference.

No worries, back to basics.

Just extract them all in the sourcetype stanzas. So delete the EXTRACT lines that you have and try this instead:

EXTRACT-full = ^(?<timeStamp>[^,]*),(?<elapsed>[^,]*),"(?<label>[^,]*),(?<ACL>[^"]*)",(?<responseCode>[^,]*),(?<responseMessage>[^,]*),(?:(?<targetHost>[^\s]*)\s(?<JMeterThread>[^,]*))?,(?<dataType>[^,]*),(?<success>[^,]*),(?<failureMessage>[^,]*),(?<bytes>[^,]*),(?<sentBytes>[^,]*),(?<grpThreads>[^,]*),(?<allThreads>[^,]*),(?<Latency>[^,]*),(?<Hostname>[^,]*),(?<IdleTime>[^,]*),(?<Connect>[^$]*)

That should encompass all of your original fields and the 'splits' that you were looking for in your original question.
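If you'd rather keep the delimiter-based extraction, another option might be to do the splits as extra transforms instead of inline EXTRACTs. This is only a sketch under assumptions: it supposes that label and threadName really do come from the JMeterFields transform, that transforms listed on one REPORT line are applied in the order given, and the stanza names JMeterSplitLabel and JMeterSplitThread are made up for illustration.

```ini
# transforms.conf (sketch; both stanza names below are hypothetical)
[JMeterSplitLabel]
# apply the regex to the already-extracted label field instead of _raw
SOURCE_KEY = label
REGEX = ^(?<labelNew>[^,]+),(?<ACL>.+)

[JMeterSplitThread]
# apply the regex to the already-extracted threadName field
SOURCE_KEY = threadName
REGEX = ^(?<targetHostNew>\S+)\s(?<JMeterThread>.+)

# props.conf (sketch; repeat under each sourcetype stanza you use)
[Augsburg]
# JMeterFields first, so label and threadName exist before the splits run
REPORT-JMeter = JMeterFields, JMeterSplitLabel, JMeterSplitThread
```

The idea is that the two split transforms read fields that JMeterFields has just produced, which an inline EXTRACT cannot do if it is evaluated before the REPORT.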

I'm not sure if there's a way to see, "if the EXTRACTs are loaded by Splunk". But you can use the btool command to see the total configuration that is being applied to a sourcetype.

You need console / command line access, but you could try running:

bin/splunk btool props list Augsburg --debug

This will give you the total config which is being applied to the 'Augsburg' sourcetype/stanza. In a test on my laptop, this gives:

/Applications/Splunk/etc/system/local/props.conf   [test-extract]
/Applications/Splunk/etc/system/default/props.conf ANNOTATE_PUNCT = True
/Applications/Splunk/etc/system/default/props.conf AUTO_KV_JSON = true
/Applications/Splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE = 
/Applications/Splunk/etc/system/default/props.conf BREAK_ONLY_BEFORE_DATE = True
/Applications/Splunk/etc/system/default/props.conf CHARSET = UTF-8
/Applications/Splunk/etc/system/default/props.conf DATETIME_CONFIG = /etc/datetime.xml
/Applications/Splunk/etc/system/local/props.conf   EXTRACT-full = ^(?<timeStamp>[^,]*),(?<elapsed>[^,]*),"(?<label>[^,]*),(?<ACL>[^"]*)",(?<responseCode>[^,]*),(?<responseMessage>[^,]*),(?:(?<targetHost>[^\s]*)\s(?<JMeterThread>[^,]*))?,(?<dataType>[^,]*),(?<success>[^,]*),(?<failureMessage>[^,]*),(?<bytes>[^,]*),(?<sentBytes>[^,]*),(?<grpThreads>[^,]*),(?<allThreads>[^,]*),(?<Latency>[^,]*),(?<Hostname>[^,]*),(?<IdleTime>[^,]*),(?<Connect>[^$]*)
/Applications/Splunk/etc/system/default/props.conf HEADER_MODE = 
/Applications/Splunk/etc/system/default/props.conf LEARN_MODEL = true
/Applications/Splunk/etc/system/default/props.conf LEARN_SOURCETYPE = true
/Applications/Splunk/etc/system/default/props.conf LINE_BREAKER_LOOKBEHIND = 100
/Applications/Splunk/etc/system/default/props.conf MATCH_LIMIT = 100000
/Applications/Splunk/etc/system/default/props.conf MAX_DAYS_AGO = 2000
/Applications/Splunk/etc/system/default/props.conf MAX_DAYS_HENCE = 2
/Applications/Splunk/etc/system/default/props.conf MAX_DIFF_SECS_AGO = 3600
/Applications/Splunk/etc/system/default/props.conf MAX_DIFF_SECS_HENCE = 604800
/Applications/Splunk/etc/system/default/props.conf MAX_EVENTS = 256
/Applications/Splunk/etc/system/default/props.conf MAX_TIMESTAMP_LOOKAHEAD = 128
/Applications/Splunk/etc/system/default/props.conf MUST_BREAK_AFTER = 
/Applications/Splunk/etc/system/default/props.conf MUST_NOT_BREAK_AFTER = 
/Applications/Splunk/etc/system/default/props.conf MUST_NOT_BREAK_BEFORE = 
/Applications/Splunk/etc/system/default/props.conf SEGMENTATION = indexing
/Applications/Splunk/etc/system/default/props.conf SEGMENTATION-all = full
/Applications/Splunk/etc/system/default/props.conf SEGMENTATION-inner = inner
/Applications/Splunk/etc/system/default/props.conf SEGMENTATION-outer = outer
/Applications/Splunk/etc/system/default/props.conf SEGMENTATION-raw = none
/Applications/Splunk/etc/system/default/props.conf SEGMENTATION-standard = standard
/Applications/Splunk/etc/system/default/props.conf SHOULD_LINEMERGE = True
/Applications/Splunk/etc/system/default/props.conf TRANSFORMS = 
/Applications/Splunk/etc/system/default/props.conf TRUNCATE = 10000
/Applications/Splunk/etc/system/default/props.conf detect_trailing_nulls = false
/Applications/Splunk/etc/system/default/props.conf maxDist = 100
/Applications/Splunk/etc/system/default/props.conf priority = 
/Applications/Splunk/etc/system/default/props.conf sourcetype = 

The useful thing about this is you can see which configuration file the config is coming from.

Splunk's config system is layered, and it is sometimes tricky to see what's coming from where.

Have a go with the EXTRACT above, but also try btool for yourself to see if there's some other configuration being picked up.

0 Karma

mhornste
Path Finder

Oh, yes I have restarted splunkd!

0 Karma

mhornste
Path Finder

Hi,

of course, thank you!
props.conf:

#[JMeterOutput]
#Eval-TargetHost = substr(threadName, 1, len(threadName)-4)
#Eval-JMeterThread = substr(threadName, len(threadName)-2, 3)

# ignore JMeter Header
TRANSFORMS-IgnoreJMeterHeader = JMeterHeader_ignore
# assign fields
REPORT-JMeter = JMeterFields


EXTRACT-label = ^(?<label>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHost>[^\s]+)\s(?<JMeterThread>.+) in threadName

transforms.conf:

#
# Discarding the header lines in the jmeter logs (directed to the nullQueue)
#

[JMeterHeader_ignore]
REGEX = ^timeStamp,elapsed,label,responseCode,
DEST_KEY = queue
FORMAT = nullQueue

#
# Extracting CSV fields from JMeter logs
#

[JMeterFields]
DELIMS = ","
#CHECK_FOR_HEADER = true
#HEADER_MODE = firstline
FIELDS = timeStamp,elapsed,label,responseCode,responseMessage,threadName,dataType,success,failureMessage,sentBytes,grpThreads,allThreads,field14,Latency,Hostname,IdleTime,Connect

I have several JMeter sourcetypes since I'm separating several locations by sourcetype. Will I have to add one stanza per sourcetype then?

# Augsburg Server
[monitor://\\jmeterhost\Jmeter\Log\Augsburg\Augsburg.csv]
disabled = 0
host = jmeterhost.fqdn
index = jmeter
sourcetype = Augsburg
followTail = 0
initCrcLength = 512

# local Frontend 1 
[monitor://\\jmeterhost\Jmeter\Log\localhost\localhost.csv]
disabled = 0
host = jmeterhost.fqdn
index = jmeter
sourcetype = localhost
followTail = 0
initCrcLength = 512

# Augsburg Workstation
[monitor://\\jmeterhost\Jmeter\Log\Augsburg_Workstation\workstation.csv]
disabled = 0
host = jmeterhost.fqdn
index = jmeter
sourcetype = Augsburg_Workstation
followTail = 0
initCrcLength = 512

# Location Holeby
[monitor://\\jmeterhost\Jmeter\Log\Holeby\holeby.csv]
disabled = 0
host = jmeterhost.fqdn
index = jmeter
sourcetype = Holeby
followTail = 0
initCrcLength = 512

# Kopenhagen
[monitor://\\jmeterhost\Jmeter\Log\Kopenhagen\kopenhagen.csv]
disabled = 0
host = jmeterhost.fqdn
index = jmeter
sourcetype = Kopenhagen
followTail = 0
initCrcLength = 512

# Oberhausen
[monitor://\\jmeterhost\Jmeter\Log\Oberhausen\oberhausen.csv]
disabled = 0
host = jmeterhost.fqdn
index = jmeter
sourcetype = Oberhausen
followTail = 0
initCrcLength = 512
0 Karma

gvmorley
Contributor

OK,

So in your props.conf file, you need to have the 'EXTRACT' lines under a 'stanza' so that Splunk knows what data to apply them to.

There are a few different ways to define a stanza (Host, Source, Sourcetype).

As a test, just go for the sourcetype of one of your inputs. Such as:

[Augsburg]
EXTRACT-label = ^(?<label>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHost>[^\s]+)\s(?<JMeterThread>.+) in threadName

See if that works (Don't forget to restart Splunk).

0 Karma

gvmorley
Contributor

Then,

If you really have defined the same host for each of the inputs, you can just have one stanza such as

[host::jmeterhost.fqdn]
EXTRACT-label = ^(?<label>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHost>[^\s]+)\s(?<JMeterThread>.+) in threadName

Check out the Admin manual and the props.conf reference for more detail:
http://docs.splunk.com/Documentation/Splunk/6.5.2/Admin/Propsconf

0 Karma

mhornste
Path Finder

Hi,

thanks, both rex commands work quite fine. I have added both lines to the props.conf but unfortunately, the fields don't show up, so it doesn't work.

I have extracted the fields (label and threadName) and can select them.

https://i.imgsafe.org/31a7074be4.png

0 Karma

gvmorley
Contributor

Good to hear that the rex works.

Happy to help with you getting the props.conf working too.

Could you post the whole stanza that you have in there?

For example, if your sourcetype was [jmeter], then the stanza would be something like:

[jmeter]
EXTRACT-label = ^(?<label>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHost>[^\s]+)\s(?<JMeterThread>.+) in threadName

But you may have some other stuff in there too..?

Also, did you re-start Splunk after changing the props.conf?

0 Karma

gvmorley
Contributor

Hi,

Assuming that you've got all of the header fields extracted already, you can do this at Search time with the rex command:

| rex field=label "^(?<label>[^,]+),(?<ACL>.+)"
| rex field=threadName "^(?<targetHost>[^\s]+)\s(?<JMeterThread>.+)"

The equivalent in props.conf would be something like:

EXTRACT-label = ^(?<label>[^,]+),(?<ACL>.+) in label
EXTRACT-targetHost =  ^(?<targetHost>[^\s]+)\s(?<JMeterThread>.+) in threadName

Try the rex version first to see if that works with your data.
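If you'd rather stay with calculated fields, as in your original attempt, something along these lines might also work. It is a sketch under assumptions: label and threadName must already be extracted at search time, the [Augsburg] stanza is just one of your sourcetypes, the setting prefix is written in upper case as EVAL-, and the field names labelNew and targetHostNew are only illustrative.

```ini
# props.conf (sketch; the stanza name and the *New field names are assumptions)
[Augsburg]
# split "Login,Admin" on the comma
EVAL-labelNew = mvindex(split(label, ","), 0)
EVAL-ACL = mvindex(split(label, ","), 1)
# split "hostname 5-1" on the space
EVAL-targetHostNew = mvindex(split(threadName, " "), 0)
EVAL-JMeterThread = mvindex(split(threadName, " "), 1)
```

Unlike the substr() version, this doesn't depend on the thread suffix always being three characters long. Calculated fields are applied after both EXTRACT and REPORT extractions, so they can read fields produced by either.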

0 Karma