I have the following sample text that's embedded inside a log:
(Response=200) {"log":{"properties":"rob"}}
I am trying the following regex pattern match, which works as expected: it finds log messages that contain the above sample text.
index=<wide search> | regex "[\"{]{1}(log){1}"
However, I want to further cement this search to restrict it down to only log messages, so I want to add additional constraints, like so:
index=<wide search> | regex "[\{\"]{2}(log){1}"
// to match {"log
Or
index=<wide search> | regex "[\"{]{1}(log){1}[\"{]{1}"
// to match "log"
But with either expression, it returns 0 results.
So why is it so sensitive that adding a few more characters to the expression breaks it?
Am I doing something wrong?
Please check your event characters, especially double quotes. Sometimes events will have extended ASCII (UTF-8) or hidden characters which look exactly like normal characters. Go to https://www.browserling.com/tools/text-to-ascii and paste in your event. When I do this using the event you pasted at the top, the site returns this:
40 82 101 115 112 111 110 115 101 61 50 48 48 41 32 123 34 108 111 103 34 58 123 34 112 114 111 112 101 114 116 105 101 115 34 58 34 114 111 98 34 125 125
Go to the actual raw event from within Splunk and paste the text that matches the second line in your original question. If you don't see exactly what I posted above, then your event has a character that may be invalidating your regex.
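An added example, not from the thread, that reproduces the three patterns from the question in Python against constructed events, to show how a character that is invisible or looks like a normal quote lets the one-character pattern keep matching while the two stricter ones fail. The sample events, the zero-width space and the backslash-escaped quotes are assumptions for illustration, not the asker's actual raw data.

```python
import re
import unicodedata

# Constructed sample events (not the original post's raw event).
CLEAN_EVENT = '(Response=200) {"log":{"properties":"rob"}}'
# Same text with a zero-width space hidden between '{' and '"'.
HIDDEN_CHAR_EVENT = '(Response=200) {\u200b"log":{"properties":"rob"}}'
# Same text as it appears when the JSON was stored as a string inside
# a larger JSON event, so every inner quote is backslash-escaped.
ESCAPED_EVENT = '(Response=200) {\\"log\\":{\\"properties\\":\\"rob\\"}}'

# The three patterns from the question as the regex engine sees them,
# i.e. after Splunk's string escaping has turned \" into ".
PATTERNS = {
    "one_char_before": r'["{]{1}(log){1}',
    "two_chars_before": r'[{"]{2}(log){1}',
    "quote_after": r'["{]{1}(log){1}["{]{1}',
}


def suspicious_chars(event):
    """(offset, codepoint, name) for every non-printable or non-ASCII
    character; these are what the text-to-ASCII check makes visible."""
    out = []
    for i, ch in enumerate(event):
        if ord(ch) > 0x7E or ord(ch) < 0x20:
            out.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return out


def match_report(event):
    """Map each pattern name to whether it matches the event."""
    return {name: re.search(p, event) is not None for name, p in PATTERNS.items()}


def codepoints(event):
    """Decimal code of each character, the form the linked tool prints."""
    return " ".join(str(ord(c)) for c in event)


if __name__ == "__main__":
    for label, ev in [("clean", CLEAN_EVENT), ("hidden", HIDDEN_CHAR_EVENT),
                      ("escaped", ESCAPED_EVENT)]:
        print(label, match_report(ev), suspicious_chars(ev))
```

Note that the backslash in the escaped variant is plain ASCII 92, so it shows up in the decimal dump but not in the non-ASCII list; compare the dump character by character rather than only looking for odd codepoints.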
It works for me:
| makeresults
| eval _raw="(Response=200) {\"log\":{\"properties\":\"rob\"}}"
| regex "[\{\"]{2}(log){1}"
Thanks, it's so strange, I've copied what you have and it doesn't work. Strangely, if I set:
[\{\"]{1}
It works, but if I change it back to 2, like what you have:
[\{\"]{2}
It fails... 😕
It must be that my "fake" _raw is somehow different from yours. I always build my stuff using this:
www.RegEx101.com
Just to clarify more, I know that I can create a pattern that matches what I have above (and it works), like so:
index=<wide search> | regex "[\"{]{1}(log){1}"
But I don't care about creating another expression for this text, what I would like to know is WHY my query breaks when I search for the quotation after the keyword, "log"? I'm just trying to understand.
Hi,
It may be a little different, but you can also do this.
| makeresults
| eval raw="(Response=200) {\"log\":{\"properties\":\"rob\"}}"
| rex field=raw "\(Response=(?<response_code>\d+)\) (?<json_field>{.+})$"
| spath input=json_field
Thank you for your response, but my question is, why is this happening? Why is it that when I add an additional pattern searcher, the search fails?
Since you want a slightly looser regex, you can rip out everything before the space as the 'response type', and everything after the "log": as the 'log value', or something similar, but this should work as your regex:
(?<responsetype>.*)\s.*(?<log>log).*{(?<logvalue>.*?)}.*
Basically, it grabs everything before the space as the response type, goes up to "log", and then asserts that the log value is in dictionary form, which I think is fair given your assertion that it's either {"log":{value}} or "log":{value}. This works for both.
Hope this helps!
Thanks but I'm not looking for a looser regex pattern, I want something strict so that I know there's a lower chance for the pattern to pick up logs that I don't want.
My question is, why is this happening? Why is it that when I add an additional pattern searcher, the search fails?