Splunk Search

How to Rex out junk in a file path?

packet_hunter
Contributor

Scenario:
I have the following field called 'filePath'

/src/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc 

I would like to strip off everything in front of the file (called SomeDocument). The common pattern is the "-1-".

I have had no luck with my newbie REX attempts.

Thank you for your help.


richgalloway
SplunkTrust

This should do it.

"-1-(?<filename>.*)"
---
If this reply helps you, Karma would be appreciated.
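For anyone new to rex, here is a minimal sketch of how that pattern could be wired into a search. It assumes the field is named filePath as in the question; makeresults is only there to generate one sample event so the search can be run without real data.

```spl
| makeresults
| eval filePath="/src/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc"
| rex field=filePath "-1-(?<filename>.*)"
| table filePath filename
```

With that sample value, filename should come back as SomeDocument.doc. Note that rex creates the new field filename and leaves filePath as it was.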


somesoni2
SplunkTrust

This will do it

your base search | rex field=filePath mode=sed "s/(.*)\/(\w+)-1-(.+)$/\1\/\3/g" 

OR

your base search | eval filePath=replace(filePath,"(.*)\/(\w+)-1-(.+)","\1\/\3") 

Updated
Try any of these

| eval filePath=replace(filePath,"(.*)\/([^\/-]+)(\/|-)(.+)","\1/\4")  
| rex field=filePath mode=sed "s/(.*)\/([^\/-]+)(\/|-)(.+)$/\1\/\4/g"

Update#2

I did read the question wrong and was trying to retain the first portion of the path. Apart from the other answers you got, these are additional ways of doing the same thing. The lines before the last line only generate the sample data.

| gentimes start=-1 | eval filePath="/src/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc#/src/lkfdjgsryj3kt4z57RdC/SomeDocument.doc#/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc#/src/temp/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc" | table filePath | makemv filePath delim="#" | mvexpand filePath  | eval orig=filePath
| eval filePath1=replace(filePath,"(.*)(\/|-)(\w+\.\w+)$","\3")  | rex field=filePath mode=sed "s/(.*)(\/|-)(\w+\.\w+)$/\3/g"
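The key idea in that last line is anchoring the pattern at the end of the string with $. As a rough sketch of the same idea using a rex extraction instead of a replacement (the field name fileName is just an illustrative choice, and this assumes the file name is made of word characters with a single dot before the extension):

```spl
| makeresults
| eval filePath="/src/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc#/src/lkfdjgsryj3kt4z57RdC/SomeDocument.doc"
| makemv delim="#" filePath
| mvexpand filePath
| rex field=filePath "[/-](?<fileName>\w+\.\w+)$"
| table filePath fileName
```

Both sample paths should yield SomeDocument.doc in fileName, while filePath stays untouched. A file name that itself contains a hyphen or a space would not be captured in full by this pattern.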

packet_hunter
Contributor

Thank you for the reply. Both work well; however, I have to make my question a bit more challenging now.
The data I am now seeing come in does not all follow the same pattern.
For example:
/src/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc
/src/lkfdjgsryj3kt4z57RdC/SomeDocument.doc

Notice the character before the document is either [/] or [-].

Is it possible to rex / eval from the end?

For example, keep the name and extension on either side of the [.] but drop everything up to and including the last [/] or [-]? The result would be
SomeDocument.extn

Thank you


somesoni2
SplunkTrust

Try the updated answer.


packet_hunter
Contributor

Not quite perfected yet.

Other sample data (before > after):

/src/474702523/xtract/SomeDocument.doc > /src/474702523/Information.doc
/3rBN0S5Z7Cz5dG9K-1-SomeDocument.zip > /1-Information.zip

Here is the code I am using, by the way. Maybe I am jacking something up...

index=main sourcetype=X_cef_syslog eventtype=X | [your code inserted] | stats list(filePath)


packet_hunter
Contributor

If you have time to update this, I do learn from examples. I will also play around with this code and post an update if I can get it to work.

Thank you!!!


jkat54
SplunkTrust
 ... | rex "-1-(?<fileName>.*)" | table fileName

packet_hunter
Contributor

Thank you for the reply. I appreciate your attempt, but the answer does not work for this situation.


packet_hunter
Contributor

Correction, your code is correct. It was my error. Thank you for your response.


jkat54
SplunkTrust
¯\_(ツ)_/¯

richgalloway
SplunkTrust

This should do it.

"-1-(?<filename>.*)"
---
If this reply helps you, Karma would be appreciated.

packet_hunter
Contributor

Thank you for the reply. I appreciate your attempt, but the answer does not work for this situation.


richgalloway
SplunkTrust

Why not? It works with your sample data. Please show the query you're using and we may be able to help get it working.

---
If this reply helps you, Karma would be appreciated.

richgalloway
SplunkTrust

Based on your latest comment to somesoni2, and assuming a filename is always alphanumeric, this rex command will generate a new field called 'filename' containing the desired part of filePath.

... | rex field=filePath "(?<=\/|-1-)(?<filename>\w+\.\w+)" | ...
---
If this reply helps you, Karma would be appreciated.
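One way to check that pattern before running it against real events is to feed it the sample paths posted earlier in this thread. This is only a sketch: makeresults, makemv and mvexpand are used here just to build three test rows, and the "#" delimiter is an arbitrary choice.

```spl
| makeresults
| eval filePath="/src/lkfdjgsryj3kt4z57RdC-1-SomeDocument.doc#/src/474702523/xtract/SomeDocument.doc#/3rBN0S5Z7Cz5dG9K-1-SomeDocument.zip"
| makemv delim="#" filePath
| mvexpand filePath
| rex field=filePath "(?<=\/|-1-)(?<filename>\w+\.\w+)"
| table filePath filename
```

The three rows should show SomeDocument.doc, SomeDocument.doc and SomeDocument.zip in filename. The lookbehind only requires that the match be preceded by "/" or "-1-", and \w+\.\w+ then has to find a dot, which is why directory names such as src or xtract are skipped. Names with spaces, hyphens or more than one dot fall outside the alphanumeric assumption stated above and may not be captured as expected.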

packet_hunter
Contributor

This is what I tried

index=main sourcetype=X_cef_syslog eventtype=X  |  rex field=filePath  "-1-(?<filename>.*)"  | stats list(filePath)

index=main sourcetype=X_cef_syslog eventtype=X  |  rex field=filePath "(?<=\/|-1-)(?<filename>\w+\.\w+)"   | stats list(filePath)

I am probably not doing something right. The problem is not knowing what to ask you guys. I am sure your code would work in other situations; maybe it's my data.

I appreciate your help.


packet_hunter
Contributor

As you were first with a correct answer, I will accept your answer. Thank you.

And thanks to everyone who helped find an answer.


packet_hunter
Contributor

D'oh! I changed filePath to filename and it works great!!!

Sorry for the extra confusion.

Thank you!
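For anyone who lands here with the same symptom: rex writes its capture into the new field filename and leaves filePath unchanged, so a stats list(filePath) at the end keeps showing the original paths. The corrected search presumably looked something like the following. This is reconstructed from the queries posted above, not copied from the poster, and the index, sourcetype and eventtype values are the placeholders used earlier in the thread.

```spl
index=main sourcetype=X_cef_syslog eventtype=X
| rex field=filePath "(?<=\/|-1-)(?<filename>\w+\.\w+)"
| stats list(filename)
```

The same change applies to the simpler "-1-(?<filename>.*)" pattern: whichever rex is used, the later commands need to reference the field named in the capture group.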


richgalloway
SplunkTrust

Please accept an answer.

---
If this reply helps you, Karma would be appreciated.