Regular Expression for log4j

royimad
Builder

I need to know whether I can extract the fields of an entire log event using a regular expression, but I don't know how to use one. Can the fields be extracted while the log is being indexed?

My log lines look like this:

Fri Jan 04 2013 13:05:34,114 EST ERROR wavemark.webapp.interceptors.WmExceptionInterceptor   - WaveMarkException occurred wavemark.common.exceptions.WaveMarkException: Error while calling method [getReportData] in delegate [ReportSessionDelegate]
at wavemark.webapp.delegates.ReportSessionDelegate.getReportData(ReportSessionDelegate.java:52)
...

The regular expression that I intend to use is:

/(?<TIMESTAMP>\d{6}\s\d{6}\.\d{3})\s (?<LEVEL>^[E|W|D|I])\s  (?<THREAD>.*?)\/ (?<CLASS>.*?)\s-\s (?<MESSAGE>.*?[\r|\n](?=^[[E|W|D|I]\s\d{6}\s\d{6}\.\d{3}]?))/gxsm

kristian_kolb
Ultra Champion

I'd say that you probably want to express that slightly differently (I must admit that I don't quite follow your regex). In any case, you do NOT want to extract the fields at index time. The way to make the field extractions more permanent is to configure them in props.conf by way of an EXTRACT based on the sourcetype of the events, like so:

props.conf

[log4j]
EXTRACT-some_unique_name = ^(\S+\s+){4}(?<TIMESTAMP>\S+)\s+\S+\s+(?<LEVEL>\w+)\s+ etc etc

You could also try to use the IFX (Interactive Field Extractor).


UPDATE:

Well, the IFX is one such tool, and it will actually create those configurations (in props.conf) for you, but it has some limitations in understanding what you want to extract.

There is, however, no practical limit on how many EXTRACTs you can have in a props.conf stanza, so you can have one EXTRACT for each field you want to extract (instead of doing them all in one go).

This may actually be required if the patterns vary wildly, since a failure in part of an EXTRACT regex will cause all of the extractions to fail.

Having multiple EXTRACTs will let them work independently, but will cause a slight overhead, as the regex processor will have to execute several times.

/K
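
The trade-off between one combined EXTRACT and several independent ones can be tried outside Splunk. The following is a minimal Python sketch, not Splunk configuration: the patterns, the field names TIME and CLASS, and the second log line (which lacks the " - " separator) are hypothetical and chosen only for illustration. Splunk uses PCRE, where named groups are written (?<NAME>...), while Python's re module needs (?P<NAME>...).

```python
import re

# First line is taken from the question; the second is a made-up variant
# without the " - " separator, used only to show a partial mismatch.
GOOD = ("Fri Jan 04 2013 13:05:34,114 EST ERROR "
        "wavemark.webapp.interceptors.WmExceptionInterceptor   - "
        "WaveMarkException occurred")
BAD = "Fri Jan 04 2013 13:05:35,001 EST INFO wavemark.webapp.Startup ready"

# One regex for all fields: if any part fails, nothing is extracted.
COMBINED = re.compile(
    r"^(?:\S+\s+){4}(?P<TIME>\S+)\s+\S+\s+(?P<LEVEL>\w+)\s+(?P<CLASS>\S+)\s+-\s+"
)

# One regex per field: each succeeds or fails on its own.
INDEPENDENT = {
    "TIME": re.compile(r"^(?:\S+\s+){4}(?P<TIME>\S+)"),
    "LEVEL": re.compile(r"\b(?P<LEVEL>ERROR|WARN|INFO|DEBUG)\b"),
    "CLASS": re.compile(r"\s(?P<CLASS>\S+)\s+-\s"),
}


def extract_combined(line):
    """Return all fields from the single combined pattern, or {} on failure."""
    match = COMBINED.search(line)
    return match.groupdict() if match else {}


def extract_independent(line):
    """Run each per-field pattern separately and keep whatever matches."""
    fields = {}
    for name, pattern in INDEPENDENT.items():
        match = pattern.search(line)
        if match:
            fields[name] = match.group(name)
    return fields


for line in (GOOD, BAD):
    print(extract_combined(line))
    print(extract_independent(line))
```

On the second line the combined pattern yields no fields at all, while the independent patterns still recover TIME and LEVEL, at the cost of scanning the line once per pattern.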

martin_mueller
SplunkTrust

You can specify regular expressions for field extraction in props.conf/transforms.conf, but your expression isn't going to work. Just looking at the TIMESTAMP field: six digits, a space, six digits, a dot, and three digits doesn't match your event at all. Further down, your use of ^ and [] looks odd as well.
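
The mismatch is easy to confirm by running the TIMESTAMP part of the original expression against the sample event. Below is a minimal Python sketch under stated assumptions: the CANDIDATE pattern and the TZ group are illustrative guesses based on the single sample line in the question, not a pattern from the thread, and Python's (?P<NAME>...) syntax would have to be written as (?<NAME>...) for Splunk.

```python
import re

# First line of the sample event from the question (stack trace omitted).
SAMPLE = (
    "Fri Jan 04 2013 13:05:34,114 EST ERROR "
    "wavemark.webapp.interceptors.WmExceptionInterceptor   - "
    "WaveMarkException occurred wavemark.common.exceptions.WaveMarkException: "
    "Error while calling method [getReportData] in delegate [ReportSessionDelegate]"
)

# TIMESTAMP part of the original expression: 6 digits, space, 6 digits, dot, 3 digits.
ORIGINAL_TIMESTAMP = re.compile(r"\d{6}\s\d{6}\.\d{3}")

# Hypothetical replacement written against the sample line only.
CANDIDATE = re.compile(
    r"^(?P<TIMESTAMP>\w{3} \w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<TZ>\w+)\s+"
    r"(?P<LEVEL>[A-Z]+)\s+"
    r"(?P<CLASS>\S+)\s+-\s+"
    r"(?P<MESSAGE>.*)"
)


def extract(line):
    """Return the named groups as a dict, or {} if the line does not match."""
    match = CANDIDATE.search(line)
    return match.groupdict() if match else {}


original_matches = ORIGINAL_TIMESTAMP.search(SAMPLE) is not None
fields = extract(SAMPLE)

print("original TIMESTAMP pattern matches:", original_matches)
for name, value in fields.items():
    print(name, "=", value)
```

The sample contains no run of six digits anywhere, so the original TIMESTAMP pattern cannot match, whereas a pattern that follows the actual layout (weekday, month, day, year, time with a comma before the milliseconds) picks up each field.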

martin_mueller
SplunkTrust

For testing, you can either use a dedicated tool such as RegexBuddy or just use the rex command in Splunk.

royimad
Builder

Thanks, Martin. Do you know of an editor where I could build and test my regular expressions for Splunk? Or a good reference, with examples, for building regular expressions? I have a lot of complicated patterns to look for.

kristian_kolb
Ultra Champion

see update /k

bjoernjensen
Contributor

On Windows I use a free tool called "Rad Software Regular Expression Designer". It works for me. For reference material, see e.g. http://www.regular-expressions.info/

royimad
Builder

Thanks, this is working. Do you know of any tool that could help me build my regular expressions for props.conf easily? Some sort of editor where I could test my regular expressions for Splunk?
