Splunk Search

How do I fix my field extraction to account for whitespace in some paths

tkw03
Communicator

Hello

I have some data in a txt file that I am working on extractions for. It extracts fine, except that some of the paths contain spaces, which throws off the rest of the extractions.

For example, this works just fine:

Type      AppliesTo  Path                                            Snap  Hard    Soft  Adv     Used    
---------------------------------------------------------------------------------------------------------
directory DEFAULT    /place/here2/test                                  No    1.00G   -     990.00M 12      

However, this does not:

Type      AppliesTo  Path                                            Snap  Hard    Soft  Adv     Used    
---------------------------------------------------------------------------------------------------------

directory DEFAULT    /place/here/fileservers/host16/App Management No    100.00G -     98.00G  90.073G 

Due to the spaces in the path, the extractions after it don't work.

Here are my props:

[ storage:data ]
CHARSET=UTF-8
DATETIME_CONFIG=CURRENT
FIELD_DELIMITER=whitespace
HEADER_FIELD_LINE_NUMBER=1
LINE_BREAKER=([\r\n]+)
NO_BINARY_CHECK=null
SEDCMD-removeDash=s/---------------------------------------------------------------------------------------------------------//g
SEDCMD-removeDash2=s/^-.*$//g
SHOULD_LINEMERGE=false
disabled=false
pulldown_type=true

I suppose the issue is using whitespace as the delimiter, but if I don't use that I don't get any field extractions. Any ideas?
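The misalignment can be reproduced outside Splunk. Below is a rough Python sketch that assumes the whitespace delimiter behaves like a plain split on runs of whitespace, which only approximates what Splunk does. The names header, split_row, good and bad are made up for illustration.

```python
# Column names from the header line and the two sample rows from the post.
header = "Type AppliesTo Path Snap Hard Soft Adv Used".split()
good = "directory DEFAULT    /place/here2/test                                  No    1.00G   -     990.00M 12      "
bad = "directory DEFAULT    /place/here/fileservers/host16/App Management No    100.00G -     98.00G  90.073G "


def split_row(row):
    """Pair header names with whitespace-separated tokens, position by position."""
    return dict(zip(header, row.split()))


good_fields = split_row(good)
bad_fields = split_row(bad)

# The first row has 8 tokens, one per column.
print(len(good.split()), good_fields["Path"], good_fields["Snap"])
# The second row has 9 tokens: the path is cut at the space and
# "Management" lands in Snap, shifting every later column by one.
print(len(bad.split()), bad_fields["Path"], bad_fields["Snap"])
```

In this approximation the extra token from "App Management" pushes Snap, Hard, Soft, Adv and Used one position to the right, which matches the symptom described above.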

1 Solution

atownson
Explorer

Give the below a shot. You'll need to check the line breaking (LINE_BREAKER) to verify the events are broken properly. And you'll need to list all possible values of the 'Type' field separated by a pipe in the regular expression (EXTRACT). I've listed 'directory' and 'file'. This should give you the correct search-time field extractions.

[storage:data]
CHARSET=UTF-8
DATETIME_CONFIG=CURRENT
LINE_BREAKER=([\r\n]+) *Type +
NO_BINARY_CHECK=null
SHOULD_LINEMERGE=false
disabled=false
pulldown_type=true
EXTRACT-data=^ *(?<Type>directory|file) +(?<AppliesTo>[^ ]+) +(?<Path>.+) +(?<Snap>[^ ]+) +(?<Hard>[^ ]+) +(?<Soft>[^ ]+) +(?<Adv>[^ ]+) +(?<Used>[^ ]+) *$

For a clustered environment:

props.conf on indexers:

 [storage:data]
 CHARSET=UTF-8
 DATETIME_CONFIG=CURRENT
 LINE_BREAKER=([\r\n]+) *Type +
 NO_BINARY_CHECK=null
 SHOULD_LINEMERGE=false
 disabled=false
 pulldown_type=true

props.conf on search heads:

[storage:data]
EXTRACT-data=^ *(?<Type>directory|file) +(?<AppliesTo>[^ ]+) +(?<Path>.+) +(?<Snap>[^ ]+) +(?<Hard>[^ ]+) +(?<Soft>[^ ]+) +(?<Adv>[^ ]+) +(?<Used>[^ ]+) *$
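The regular expression can be checked against the two sample rows before deploying it. This is a minimal Python sketch, not a Splunk test: Python's re module needs (?P<name>...) where the EXTRACT above uses (?<name>...), so the pattern is respelled accordingly. The LAZY variant is a hypothetical tweak that is not part of the answer above.

```python
import re

# The EXTRACT regex from the answer, with (?<name>...) rewritten as
# (?P<name>...) because Python's re module requires that spelling.
GREEDY = re.compile(
    r"^ *(?P<Type>directory|file) +(?P<AppliesTo>[^ ]+) +(?P<Path>.+) +"
    r"(?P<Snap>[^ ]+) +(?P<Hard>[^ ]+) +(?P<Soft>[^ ]+) +(?P<Adv>[^ ]+) +"
    r"(?P<Used>[^ ]+) *$"
)

# Hypothetical variant (not from the thread): a lazy Path, so the capture
# stops at the first point where exactly five columns remain.
LAZY = re.compile(GREEDY.pattern.replace("(?P<Path>.+)", "(?P<Path>.+?)"))

good = "directory DEFAULT    /place/here2/test                                  No    1.00G   -     990.00M 12      "
bad = "directory DEFAULT    /place/here/fileservers/host16/App Management No    100.00G -     98.00G  90.073G "

for name, rx in (("greedy", GREEDY), ("lazy", LAZY)):
    for row in (good, bad):
        m = rx.match(row)
        # repr() makes any trailing padding in Path visible.
        print(name, repr(m.group("Path")), m.group("Snap"), m.group("Used"))
```

Both variants keep "App Management" together and put "No" in Snap. In this Python check the greedy .+ also keeps the column padding at the end of Path for the first row, while the lazy form returns the bare path. Whether trailing spaces in Path matter in your searches is worth verifying in Splunk itself.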


tkw03
Communicator

Question: if a field doesn't exist in the log record, is there a way to force that field to extract as blank?

Sometimes I have a record like this:
directory DEFAULT /ifs/home/home/T/TLO11 No 1.00G 12

And sometimes it's like this:
directory DEFAULT /ifs/home/departments/o56/Dev No 1.00G 921.60M 2.55M


_Tom
Explorer

If you want to get the key with an empty value, use "KEEP_EMPTY_VALS = true" in your extraction stanza in transforms.conf.
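Separately from the Splunk setting, the idea of a column that may be absent and should come back empty can be sketched in Python with optional groups. This sketch rests on several assumptions that are not confirmed in the thread: Snap is always Yes or No (only No appears above), Hard always follows Snap, Used is always last, and any single value in between is assigned to Soft. Once the whitespace is collapsed, the record itself does not say which column is missing, so that last rule is a guess. The names OPTIONAL and parse are hypothetical.

```python
import re

# Soft and Adv are wrapped in optional non-capturing groups so a record
# may omit them. Snap is pinned to Yes|No (an assumption) to give the
# lazy Path a reliable place to stop even when the path contains spaces.
OPTIONAL = re.compile(
    r"^ *(?P<Type>directory|file) +(?P<AppliesTo>\S+) +(?P<Path>.+?) +"
    r"(?P<Snap>Yes|No) +(?P<Hard>\S+)"
    r"(?: +(?P<Soft>\S+))?(?: +(?P<Adv>\S+))?"
    r" +(?P<Used>\S+) *$"
)


def parse(record):
    """Return all eight fields, using an empty string for absent columns."""
    m = OPTIONAL.match(record)
    return m.groupdict(default="") if m else None


short = parse("directory DEFAULT /ifs/home/home/T/TLO11 No 1.00G 12")
medium = parse("directory DEFAULT /ifs/home/departments/o56/Dev No 1.00G 921.60M 2.55M")

print(short["Hard"], repr(short["Soft"]), repr(short["Adv"]), short["Used"])
print(medium["Hard"], repr(medium["Soft"]), repr(medium["Adv"]), medium["Used"])
```

For the first record Soft and Adv come back as empty strings. For the second, 921.60M is placed in Soft and Adv is empty, purely because of the left-to-right rule assumed above. A path that itself contains " No " or " Yes " would defeat the Snap anchor.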
