How do I use regex or replace to remove the first occurrence of a word and replace the second and later occurrences with a comma?
For example, the raw data is:
ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root
I want it to be:
CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0),CRON[2907]: pam_unix(cron:session): session closed for user root
If the leading string occurs only one more time in the event, this will work:
| makeresults
| eval _raw="ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root by (uid=0)"
| rex mode=sed "s/^(\S+)(.*?)\s(\1)/\2, /"
The process for multiple occurrences is more complex. Is the data in that case similar to the example you provided? If not, can you provide an example? Is there a maximum number of occurrences?
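For the multiple-occurrence case, here is a rough sketch of one possible approach, written in Python only so the regex logic can be checked outside Splunk. The function name `strip_first_and_join` is hypothetical, and the sketch assumes the repeated word is always surrounded by whitespace. Python's `re` module is similar to, but not the same engine as, Splunk's, so treat it as an illustration rather than a drop-in search.

```python
import re

def strip_first_and_join(event: str) -> str:
    """Drop the leading word, then turn every later occurrence of it into a comma."""
    # Split the event into its first word and the remainder.
    match = re.match(r"(\S+)\s+(.*)", event, flags=re.DOTALL)
    if match is None:
        return event  # empty or single-word event: nothing to replace
    first_word, rest = match.groups()
    # Escape the word so characters such as "[" or "." are matched literally,
    # and swallow the whitespace on both sides so no stray spaces remain.
    pattern = r"\s+" + re.escape(first_word) + r"\s+"
    return re.sub(pattern, ",", rest)

sample = (
    "ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) "
    "ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
)
print(strip_first_and_join(sample))
```

Because `re.sub` replaces every match by default, the same call handles two, three, or more repeats without knowing the maximum in advance.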
You can run rex twice: the first pass replaces the leading "ubuntu" with nothing, and the second pass replaces the next "ubuntu" with a comma.
(If the string "ubuntu" is not known beforehand, please share more details, such as where in the event it appears, so that the rex can be adjusted.)
(rex mode=sed cannot be tested on the regex101 website. I tested it directly in Splunk and it works fine; please check the screenshot.)
| makeresults
| eval _raw = "ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
| rex mode=sed field=_raw "s#(^ubuntu\s)##"
| rex mode=sed field=_raw "s#ubuntu#,#"
| table _raw
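Since sed mode cannot be tried on regex101, a small Python sketch can stand in as a sanity check for the two passes above. This assumes that a sed expression without the `g` flag replaces only the first match, which `count=1` imitates here. The helper `two_pass` and its `tight` option are hypothetical additions, not part of the search itself.

```python
import re

def two_pass(raw: str, tight: bool = False) -> str:
    """Imitate the two rex passes: strip the leading word, then replace the next one."""
    # Pass 1: same pattern as s#(^ubuntu\s)## (remove the leading "ubuntu" and one space).
    step1 = re.sub(r"^ubuntu\s", "", raw, count=1)
    # Pass 2: same pattern as s#ubuntu#,# by default. With tight=True the surrounding
    # whitespace is consumed too (an assumed variant, not the original pattern).
    second = r"\s*ubuntu\s*" if tight else r"ubuntu"
    return re.sub(second, ",", step1, count=1)

sample = (
    "ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) "
    "ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
)
print(repr(two_pass(sample)))
print(repr(two_pass(sample, tight=True)))
```

In this sketch the plain second pass leaves the spaces that surrounded the word, so the comma ends up as " , ". The `tight=True` variant produces the exact string asked for in the question.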
Hi @cpetterborg, great rex command. I learned a lot from it!
For other rex beginners, let me explain it:
"s/^(\S+)(.*?)\s(\1)/\2, /"
^(\S+) --- captures the first word ("ubuntu") as "\1"
(.*?) --- captures the rest of the line, up to the second "ubuntu" match, as "\2"
\s(\1) --- matches a space followed by the word "ubuntu" again
\2, --- the replacement
Everything before the middle "/" is the pattern to match; everything after it is the replacement.
In the replacement, leave out "\1", write the "\2" match and then a comma ",". That's it.
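To make the capture groups concrete, here is a small Python sketch that runs the same pattern and prints what each group holds. It uses Python's `re` module as a stand-in for Splunk's engine, which is an assumption: the two should agree for a pattern this simple, but that has not been checked inside Splunk.

```python
import re

sample = (
    "ubuntu CRON[2907]: pam_unix(cron:session): session opened for user root by (uid=0) "
    "ubuntu CRON[2907]: pam_unix(cron:session): session closed for user root"
)

# Same pattern as in the sed expression: first word, lazy middle, space + repeated word.
pattern = re.compile(r"^(\S+)(.*?)\s(\1)")
m = pattern.search(sample)

print(repr(m.group(1)))  # the first word
print(repr(m.group(2)))  # everything up to the repeated word (starts with a space)
print(repr(m.group(3)))  # the repeated word, matched through the \1 backreference

# Apply the replacement "\2, " once, as sed does without the g flag.
print(repr(pattern.sub(r"\2, ", sample, count=1)))
```

One detail this shows: group 2 begins with the space that followed the first word, so in this sketch the result carries a leading space and two spaces after the comma. If that matters, adding `\s+` after the first group and after `\1` in the pattern should absorb them.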
Thank you. I saw your original post in email. I'm glad you figured it all out. Congratulations! 🙂 I've upvoted your comment for the fine explanation!