I am trying to do a named extraction of the field sample
for each event, but it is failing for some reason. Please help! Here are the events:
2017-12-06T11:57:03.744000 POSITION 0 lang=Albanian sample="Unë mund të ha qelq dhe nuk më gjen gjë." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T11:40:03.744000 POSITION 1 lang=Arabic sample="أنا قادر على أكل الزجاج و هذا لا يؤلمني." odd=1 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T11:23:03.744000 POSITION 2 lang=Armenian sample="Կրնամ ապակի ուտել և ինծի անհանգիստ չըներ։" constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T11:06:03.744000 POSITION 3 lang=Chinese sample=" 我能吞下玻璃而不傷身體" odd=3 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T10:49:03.744000 POSITION 4 lang=Danish sample="Jeg kan spise glas, det gør ikke ondt på mig." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T10:32:03.744000 POSITION 5 lang=Euro sample="€." odd=5 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T10:15:03.744000 POSITION 6 lang=French sample="Je peux manger du verre, ça ne me fait pas de mal." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T09:58:03.744000 POSITION 7 lang=Georgian sample="მინას ვჭამ და არა მტკივა." odd=7 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T09:41:03.744000 POSITION 8 lang=Greek sample="Μπορώ να φάω σπασμένα γυαλιά χωρίς να πάθω τίποτα." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T09:24:03.744000 POSITION 9 lang=Hawaiian sample="Hiki iaʻu ke ʻai i ke aniani; ʻaʻole nō lā au e ʻeha." odd=9 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T09:07:03.744000 POSITION 10 lang=Hebrew sample="אני יכול לאכול זכוכית וזה לא מזיק לי." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T08:50:03.744000 POSITION 11 lang=Hindi sample="मैं काँच खा सकता हूँ और मुझे उससे कोई चोट नहीं पहुंचती." odd=11 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T08:33:03.744000 POSITION 12 lang=Hindi sample="मैं काँच खा सकता हूँ, मुझे उस से कोई पीडा नहीं होती." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T08:16:03.744000 POSITION 13 lang=Icelandic sample="Ég get etið gler án þess að meiða mig." odd=13 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T07:59:03.744000 POSITION 14 lang=Japanese sample="私はガラスを食べられます。それは私を傷つけません" constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T07:42:03.744000 POSITION 15 lang=Korean sample="나는 유리를 먹을 수 있어요. 그래도 아프지 않아요" odd=15 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T07:25:03.744000 POSITION 16 lang=Macedonian sample="Можам да јадам стакло, а не ме штета." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T07:08:03.744000 POSITION 17 lang=Mongolian sample="Би шил идэй чадна, надад хортой биш" odd=17 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T06:51:03.744000 POSITION 18 lang=Old Norse sample="Ek get etið gler án þess að verða sár." constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
2017-12-06T06:34:03.744000 POSITION 19 lang=Polish sample="Mogę jeść szkło, i mi nie szkodzi." odd=19 constant="double quotes" 'single quotes' \slashes\ `~!@#$%^&*()-_=+{}|;:<>,./? [brackets] <script>alert("raw event unescaped!")</script>
Hi @saurabh_tek11,
Can you please add the configuration below to props.conf and check?
EXTRACT-sample = sample=\"(?<sample>.*?)\"
You can also check it by running the search below.
YOUR_SEARCH | rex field=_raw "sample=\"(?<sample>.*?)\"" | table _time sample
Happy Splunking
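For anyone who wants to check the pattern outside Splunk, here is a minimal sketch in Python. It is only an illustration, not part of the Splunk configuration: the two events are copied from the question with the trailing constant=... part shortened, and extract_sample is a hypothetical helper name. Note that Python's re module writes a named group as (?P<sample>...), whereas Splunk accepts (?<sample>...).

```python
import re

# Two raw events from the question; the long constant=... tail is shortened here.
events = [
    '2017-12-06T11:57:03.744000 POSITION 0 lang=Albanian sample="Unë mund të ha qelq dhe nuk më gjen gjë." constant="double quotes"',
    '2017-12-06T10:32:03.744000 POSITION 5 lang=Euro sample="€." odd=5 constant="double quotes"',
]

# Same idea as the EXTRACT-sample regex, written with Python's (?P<name>...) syntax.
SAMPLE_RE = re.compile(r'sample="(?P<sample>.*?)"')


def extract_sample(event):
    """Return the text between the quotes after sample=, or None if absent."""
    match = SAMPLE_RE.search(event)
    return match.group("sample") if match else None


samples = [extract_sample(event) for event in events]
for value in samples:
    print(value)
```

The backslashes before the double quotes in the rex command are there because the whole regex sits inside a quoted SPL string; in a Python raw string with single quotes they are not needed.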
@kamlesh_vaghela - Thanks. It works in Splunk, but I am also trying to get this extraction working on https://regex101.com.
Close, but it still breaks for lang=Hebrew and lang=Arabic. I am trying to understand both regexes:
"sample=\"(?<sample>[^\"]+)\""
and
"sample=\"(?<sample>.*?)\""
Is everything after the closing angle bracket > the body of the regular expression? And in the second regex, what does the ? after .* mean? Does it make the .* optional?
In "sample=\"(?<sample>[^\"]+)\"", does the ^ inside the character class signify
a negation (anything other than ", i.e. keep going until the next " is found),
OR
the start of the regex (looking for the first ")? If it is the latter, what is the " at the beginning of
\"(?<sample doing?
Please enlighten me.
Hi @saurabh_tek11,
It is difficult for lang=Hebrew and lang=Arabic. I am able to extract the sample value, but with the " included.
@kamlesh_vaghela - This works in Splunk. Thank you. You have also shown me how quickly a named regex extraction can be done in Splunk using erex.
Out of curiosity, can you help me get this extraction working on https://regex101.com?
Depending on the data, I think the following regex would be better.
It matches everything except double quotes: "sample=\"(?<sample>[^\"]+)\""
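As a rough way to compare the two patterns from this thread, the sketch below runs both against the Hebrew and Arabic events (shortened) using Python's re module, which needs (?P<sample>...) instead of (?<sample>...). The event with an empty value and the helper name grab are made up for illustration. The regex engine works on characters in stored order, so a right-to-left value that looks misplaced in a tester's display may still be captured correctly; that is my assumption about the regex101 trouble, not something confirmed in this thread.

```python
import re

LAZY = re.compile(r'sample="(?P<sample>.*?)"')       # as few characters as possible
NEGATED = re.compile(r'sample="(?P<sample>[^"]+)"')  # one or more non-quote characters

hebrew = 'lang=Hebrew sample="אני יכול לאכול זכוכית וזה לא מזיק לי." constant="double quotes"'
arabic = 'lang=Arabic sample="أنا قادر على أكل الزجاج و هذا لا يؤلمني." odd=1 constant="double quotes"'
empty = 'lang=None sample="" constant="double quotes"'  # hypothetical event, not in the data


def grab(pattern, event):
    """Return the captured sample value, or None when the pattern does not match."""
    match = pattern.search(event)
    return match.group("sample") if match else None


for event in (hebrew, arabic, empty):
    print(repr(grab(LAZY, event)), repr(grab(NEGATED, event)))
```

On the events in the question both patterns capture the same text. They differ only on an empty value: .*? captures an empty string, while [^"]+ needs at least one character and so does not match at all.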
After some more study, I understand that [^\"]+
in
"sample=\"(?<sample>[^\"]+)\""
keeps matching until a literal " is reached.
Is this correct, @niketnilay?
Yes, that is correct 🙂 regex101.com also has an explanation of this!
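To make the points raised in this thread concrete, here is a small Python sketch (the test strings are invented for illustration). It shows the two roles of ^ and what the ? after .* does.

```python
import re

text = 'sample="abc" tail'

# Inside [...], a leading ^ negates the class: [^"] is any one character except ".
negated = re.search(r'sample="([^"]+)"', text).group(1)

# Outside a class, ^ is an anchor: it only matches at the start of the string.
anchored_hit = re.search(r'^sample', text) is not None
anchored_miss = re.search(r'^tail', text) is not None

# The ? after .* makes the quantifier lazy (stop at the first possible closing
# quote); it does not make the .* optional.
quoted = 'a "one" and "two"'
lazy = re.search(r'"(.*?)"', quoted).group(1)
greedy = re.search(r'"(.*)"', quoted).group(1)

print(negated, anchored_hit, anchored_miss)
print(lazy)
print(greedy)
```

The literal " right after sample= is just the opening quote of the value; it sits outside the capture group so that the quotes are not included in the extracted field.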