Splunk Search

How to write the regex to extract the date fields and another key-value pair from my sample data?

vtsguerrero
Contributor

Hello guys, sup?

We've got this piece of log which is a MySql log and we should not change the layout, but need to extract START_DATE, END_DATE and NUM_ROWS.
As follows a piece of data:

******************************************************
*****************  EXECUTION HYW   *****************
*****************    TRTD08IIL.SQL    *****************
******************************************************

***************   26/02/2015-07:00:06  ***************
Conectting to MYSQL

Execution Data
Process Code : TRTD08IIL
Program           : TRTD08IIL.SQL
Descripton          : Table
Critic      : E
Reference Date : 20140103

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:00:06  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

Table tb.User stats: [num_partitions: 0, num_files: 0, num_rows: 65931447, total_size: 0, raw_data_size: 15793929087]
65931447 Rows loaded to tab_clie
MapReduce Jobs Launched:
Job 0: Map: 18  Reduce: 4   Cumulative CPU: 608.98 sec   HDFS Read: 1037601634 HDFS Write: 604987045 SUCCESS
Job 1: Map: 59  Reduce: 58   Cumulative CPU: 4337.55 sec   HDFS Read: 14707612009 HDFS Write: 29413721 SUCCESS
Total MapReduce CPU Time Spent: 0 days 7 hours 59 minutes 42 seconds 920 msec
OK
Time taken: 2256.431 seconds


***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL

***************   26/02/2015-07:37:47  ***************
Temp Execution Path For File: /C:/TEMP/TRTD08IIL.SQL.TRTD08IIL
0 Karma
1 Solution

richgalloway
SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, Karma would be appreciated.

View solution in original post

richgalloway
SplunkTrust
SplunkTrust

This works on regex101.com:

"(?s)\*{15}\s+(?P<START_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)\s+\*{15}.*\s(?P<NUM_ROWS>\d+)\sRows loaded.*Time taken: .*?seconds.*?\*{15}\s+(?<END_DATE>\d\d\/\d\d\/\d{4}-\d\d:\d\d:\d\d)"
---
If this reply helps you, Karma would be appreciated.

vtsguerrero
Contributor

Thanks @richgalloway! Worked here!

somesoni2
Revered Legend

Is data already ingested and all these lines appear as one event?

0 Karma

vtsguerrero
Contributor

Yeah, max_event was applied with 10.000 in index preview...

0 Karma

theouhuios
Motivator
rex field=_raw "*\s+(?P<Start_date>\S+)\s+"

should give you the time value.

rex field=_raw "(?P<ROLLS>num_rows:\s+\d+)"

This will get the ROLLS.

But you would want num_rows = 65931447 right? for that..

rex field=_raw "num_rows:\s+(?P<num_rows>\d+)\,"

Didn't test these. Should be good though.

0 Karma

theouhuios
Motivator

Sorry about that. Been a while since I posted here. Looks like formatting got messed up.

0 Karma

vtsguerrero
Contributor

I need to extract by regex
should be:

START_DATE = *************** 26/02/2015-07:00:06 ***************
END_DATE = *************** 26/02/2015-07:37:47 ***************
ROLLS = num_rows: 65931447

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...