Hi,
This is a sample from my log file:
Jul 15 23:48:33 Phase 0 running (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
712 4 [ : : 1] 0% Audit.Flow2
0 0 [ : : 12] 0% Flow_1
0 0 [ : : 12] 0% Flow_10
41,417,795 264,261 [ : : 12] 1% Flow_11
41,417,795 264,261 [ : : 12] 1% Flow_11
1,746,882,294 3,158,255 [ : : 12] 0% Flow_12
Jul 15 23:48:33 Phase 0 running (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
712 4 [ : : 1] 0% Audit.Flow2
0 0 [ : : 12] 0% Flow_1
0 0 [ : : 12] 0% Flow_10
41,417,795 264,261 [ : : 12] 1% Flow_11
41,417,795 264,261 [ : : 12] 1% Flow_11
1,746,882,294 3,158,255 [ : : 12] 0% Flow_12
Jul 15 23:48:33 Phase 0 ended (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
712 4 [ : : 1] 0% Audit.Flow2
0 0 [ : : 12] 0% Flow_1
0 0 [ : : 12] 0% Flow_10
41,417,795 264,261 [ : : 12] 1% Flow_11
41,417,795 264,261 [ : : 12] 1% Flow_11
1,746,882,294 3,158,255 [ : : 12] 0% Flow_12
Jul 15 23:48:33 Phase 1 running (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
712 4 [ : : 1] 0% Audit.Flow2
0 0 [ : : 12] 0% Flow_1
0 0 [ : : 12] 0% Flow_10
41,417,795 264,261 [ : : 12] 1% Flow_11
41,417,795 264,261 [ : : 12] 1% Flow_11
1,746,882,294 3,158,255 [ : : 12] 0% Flow_12
The log consists of phases (0, 1), each reported as running, started, and ended.
I want to calculate the maximum CPU time taken by a particular vertex when a phase ended, and the maximum data bytes consumed by each flow.
I have created regexes for the field extraction. The log files come from a Unix environment.
DOS expression for the required data set:
.* Phase \d ended.\r\n(.\r\n)*-{80}\r\n-{80}
Unix expression for the required data set:
.* Phase \d ended.\n(.\n)*-{80}\n-{80}
But this is not working as expected. Splunk does not extract the given pattern as a record.
The record that we are interested in is the one described by the regex above.
It looks like this:
Jul 15 23:48:33 Phase 0 ended (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
712 4 [ : : 1] 0% Audit.Flow2
0 0 [ : : 12] 0% Flow_1
0 0 [ : : 12] 0% Flow_10
41,417,795 264,261 [ : : 12] 1% Flow_11
41,417,795 264,261 [ : : 12] 1% Flow_11
1,746,882,294 3,158,255 [ : : 12] 0% Flow_12
Also, when we try to extract fields for the CPU time of ended phases, the Splunk expression generator picks up seemingly random numbers across different phases.
What is the best way to
1) ensure Splunk considers only ended phases as distinct records
2) for every ended record, extract fields like CPU time, vertex, flow, etc.?
I don't recognize your regular expression syntax. Splunk uses PCRE. I would start with the following and see what happens.
props.conf
[yoursourcetype]
TRUNCATE = 50000
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = true
MAX_TIMESTAMP_LOOKAHEAD = 25
TIME_FORMAT = %b %d %H:%M:%S
EXTRACT-e1 =Phase\s(?<Phase>\d+)\s(?<PhaseStatus>\S+)\s\((?<PhaseSeconds>\d+) seconds\)
REPORT-r1 = extract_fields1,extract_fields2
transforms.conf
[extract_fields1]
REGEX = (?m)([\d,]+)\s+([\d,]+)\s+\[\s*(\d*:\s*\d*\s*:\s*\d*)\]\s+(\d{1,3})%\s+(\S+)$
FORMAT = FlowDataBytes::$1 FlowRecords::$2 FlowStatus::$3 FlowSkew::$4 FlowName::$5
MV_ADD = true
[extract_fields2]
REGEX = (?m)([\d.]+)\s+\[\s*(\d*:\s*\d*)\]\s+(\d{1,3})%\s+(\S+)$
FORMAT = CPUTime::$1 Status::$2 Skew::$3 Vertex::$4
MV_ADD = true
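A quick way to sanity-check patterns of this shape outside Splunk is to run them over a few sample lines. The sketch below uses Python's `re` module, which behaves like PCRE for these constructs. It is only an illustration: the names `CPU_RE`, `FLOW_RE`, and `SAMPLE` are made up here, the skew percentage is assumed to be captured as `(\d{1,3})`, and the spacing in the real log may differ from the pasted sample.

```python
import re

# Assumed patterns, modeled on the two transforms.conf stanzas; the skew
# percentage is captured as (\d{1,3}) and runs of whitespace are allowed.
FLOW_RE = re.compile(
    r"([\d,]+)\s+([\d,]+)\s+\[\s*(\d*:\s*\d*\s*:\s*\d*)\]\s+(\d{1,3})%\s+(\S+)$",
    re.MULTILINE,
)
CPU_RE = re.compile(
    r"([\d.]+)\s+\[\s*(\d*:\s*\d*)\]\s+(\d{1,3})%\s+(\S+)$",
    re.MULTILINE,
)

# A few lines copied from the sample log in the question.
SAMPLE = """Jul 15 23:48:33 Phase 0 ended (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
41,417,795 264,261 [ : : 12] 1% Flow_11
1,746,882,294 3,158,255 [ : : 12] 0% Flow_12
"""

# Each findall() tuple corresponds to $1..$n in the FORMAT line.
for cpu_time, status, skew, vertex in CPU_RE.findall(SAMPLE):
    print("vertex:", vertex, "cpu:", cpu_time, "status:", status, "skew:", skew)

for data_bytes, records, status, skew, flow in FLOW_RE.findall(SAMPLE):
    print("flow:", flow, "bytes:", data_bytes, "records:", records, "skew:", skew)
```

If a line you expect to match produces no tuple here, the pattern (or the whitespace assumption) is the first thing to look at before reindexing.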
TRUNCATE should be set to the maximum number of bytes in any single event. To test these settings, try this search
sourcetype=yoursourcetype
| table Phase PhaseStatus PhaseSeconds CPUTime Status Skew Vertex FlowDataBytes FlowRecords FlowStatus FlowName
Note that you have to reindex the data to create the events properly. I hope that you are using a test instance of Splunk or at least a test index for this...
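To make the intended event boundaries concrete, here is a rough Python sketch of the same idea outside Splunk: start a new event at every timestamped "Phase" line, keep only the events whose status is "ended", and take the maximum CPU time per vertex and the maximum data bytes per flow. It is not how Splunk works internally; `split_events`, `summarize_ended`, and the three patterns are hypothetical names and assumptions based only on the sample in the question.

```python
import re
from collections import defaultdict

# Assumed line shapes, taken from the sample log in the question.
HEADER_RE = re.compile(
    r"^\w{3} +\d{1,2} \d{2}:\d{2}:\d{2} Phase (\d+) (\S+) \((\d+) seconds\)$"
)
CPU_RE = re.compile(r"^\s*([\d.]+)\s+\[[^\]]*\]\s+(\d{1,3})%\s+(\S+)$")
FLOW_RE = re.compile(r"^\s*([\d,]+)\s+([\d,]+)\s+\[[^\]]*\]\s+(\d{1,3})%\s+(\S+)$")


def split_events(text):
    """Group lines into events, starting a new one at each timestamped line."""
    events = []
    for line in text.splitlines():
        if HEADER_RE.match(line):
            events.append([line])
        elif events:
            events[-1].append(line)
    return events


def summarize_ended(text):
    """Return (max CPU time per vertex, max data bytes per flow) for ended phases."""
    max_cpu = defaultdict(float)
    max_bytes = defaultdict(int)
    for event in split_events(text):
        _phase, status, _seconds = HEADER_RE.match(event[0]).groups()
        if status != "ended":
            continue  # ignore "running" snapshots
        for line in event[1:]:
            flow = FLOW_RE.match(line)
            if flow:
                name = flow.group(4)
                size = int(flow.group(1).replace(",", ""))
                max_bytes[name] = max(max_bytes[name], size)
                continue
            cpu = CPU_RE.match(line)
            if cpu:
                vertex = cpu.group(3)
                max_cpu[vertex] = max(max_cpu[vertex], float(cpu.group(1)))
    return dict(max_cpu), dict(max_bytes)


SAMPLE = """Jul 15 23:48:33 Phase 0 running (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
Jul 15 23:48:33 Phase 0 ended (1132 seconds)
CPU Time Status Skew Vertex
0.046 [ : 1] 0% Audit.XYX
0.135 [ : 1] 0% Audit.PQR
Data Bytes Records Status Flow
712 4 [ : : 1] 0% Audit.Flow1
41,417,795 264,261 [ : : 12] 1% Flow_11
"""

cpu_by_vertex, bytes_by_flow = summarize_ended(SAMPLE)
print(cpu_by_vertex)
print(bytes_by_flow)
```

In Splunk itself, the equivalent filter would be a search restricted to PhaseStatus=ended once the events break at the timestamp lines.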
More info on line breaking and on field extraction is available in the Splunk documentation.
Hi iguinn, thanks for your valuable reply. It helps us to some extent.
But when I run that, the result looks like this:
phase phasestatus phaseseconds cputime status skew vertex
1 ended 1174
0 running 465
It displays only phase, phasestatus, and phaseseconds. The rest of the fields are missing, mainly cputime, which is the one we are most interested in.
The main part is to extract the CPU time.
I pasted the code you gave into props.conf and transforms.conf as is.
So I need to know: in transforms.conf you have written
[extract_fields1]. Do we have to put a field value there?
Could you please fix your formatting? Code blocks should be indented by 4 spaces on each line.