Hi All,
I am having trouble breaking up the log file below:
Each log entry starts with id:#################### and ends with id:@@@@@@@@@@@@@@@@
However, as you can see below, the logs can be written to by two different processes at the same time. Is there any way to have the logs broken up as follows:
Start an event with id:#################### and end only when you see the corresponding ending for that id, e.g. id:@@@@@@@@@@@@@@@@
Also, the log entries have both a start and a complete date. Since Splunk by default breaks events on dates, if we switch this off is there a way to force Splunk to just take the start date as the event date?
The start date follows the line id:####################
8656: ################################################
8656: From xxxxxx@xxxxxxxxx 31 Mar 2010 11:33:24 (pid=2768,TSRW.52.2,ditTradeBookManager)
8656: IN ### PagefileUsage: 46,508 K, PeakPagefileUsage: 52,756 K
8656: Fetch Trades For FIRM
8656: This user is GLOBAL hence executing the current Tradeserver Functionality
8656: with condition [BOOK in ('JXH1', 'JXF1') and FIRMfutures.EXPIRY_MONTH >= '01-Mar-2010' and CONTRACT in ('USFF/CBOT', 'AUIBCR/SFE') and FIRMfutures.STATUS = 'WHATIF']
8656: TStradeDB::retrieveTrades returned 0 trades
10152:
10152: ################################################
8656: OUT ### PagefileUsage: 46,508 K, diff : 0 Bytes
10152: From xxxxxx@vxxxxxxxx 31 Mar 2010 11:33:24 (pid=3636,TSRW.52.2,ditTradeBookManager)
8656: OUT ### PeakPagefileUsage: 52,756 K, diff : 0 Bytes
10152: IN ### PagefileUsage: 46,508 K, PeakPagefileUsage: 52,756 K
10152: Fetch Trades For JSW
8656: Completed 31 Mar 2010 11:33:24 (0.031000s elapsed)
8656: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
10152: This user is GLOBAL hence executing the current Tradeserver Functionality
10152: with condition [BOOK in ('JXH1', 'JXF1') and TDtrade.EXPIRY >= '01-Mar-2010' and TDtrade.STATUS = 'WHATIF' and TDtrade.VERSION_DATE = '01-Mar-2010' and CONTRACT in ('NZBL/NZFE', 'AUBL/SFE') and FIRMfutures.STATUS = 'WHATIF']
8656: TStradeDB::retrieveTrades returned 0 trades
8656: OUT ### PagefileUsage: 46,508 K, diff : -68 K
8656: OUT ### PeakPagefileUsage: 52,756 K, diff : 0 Bytes
8656: Completed 31 Mar 2010 11:33:24 (0.032000s elapsed)
8656: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
So from the above I would get the following extracts:
8656: ################################################
8656: From xxxxxx@xxxxxxxx 31 Mar 2010 11:33:24 (pid=2768,TSRW.52.2,ditTradeBookManager)
8656: IN ### PagefileUsage: 46,508 K, PeakPagefileUsage: 52,756 K
8656: Fetch Trades For FIRM
8656: This user is GLOBAL hence executing the current Tradeserver Functionality
8656: with condition [BOOK in ('JXH1', 'JXF1') and FIRMfutures.EXPIRY_MONTH >= '01-Mar-2010' and CONTRACT in ('USFF/CBOT', 'AUIBCR/SFE') and FIRMfutures.STATUS = 'WHATIF']
8656: TStradeDB::retrieveTrades returned 0 trades
8656: OUT ### PagefileUsage: 46,508 K, diff : 0 Bytes
8656: OUT ### PeakPagefileUsage: 52,756 K, diff : 0 Bytes
8656: Completed 31 Mar 2010 11:33:24 (0.031000s elapsed)
8656: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
AND
10152: ################################################
10152: From xxxxxxx@xxxxxxxx 31 Mar 2010 11:33:24 (pid=3636,TSRW.52.2,ditTradeBookManager)
10152: IN ### PagefileUsage: 46,508 K, PeakPagefileUsage: 52,756 K
10152: Fetch Trades For JSW
10152: This user is GLOBAL hence executing the current Tradeserver Functionality
10152: TStradeDB::retrieveTrades returned 0 trades
10152: OUT ### PagefileUsage: 46,508 K, diff : 0 Bytes
10152: OUT ### PeakPagefileUsage: 52,756 K, diff : 0 Bytes
10152: Completed 31 Mar 2010 11:33:24 (0.000000s elapsed)
10152: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
AND
8656: ################################################
8656: From xxxxxxx@xxxxxxxxx 31 Mar 2010 11:33:24 (pid=2768,TSRW.52.2,ditTradeBookManager)
8656: IN ### PagefileUsage: 46,576 K, PeakPagefileUsage: 52,756 K
8656: Fetch Trades For FIRM
8656: This user is GLOBAL hence executing the current Tradeserver Functionality
8656: with condition [BOOK in ('JXH1', 'JXF1') and FIRMfutures.EXPIRY_MONTH >= '01-Mar-2010' and CONTRACT in ('NZBL/NZFE', 'AUBL/SFE') and FIRMfutures.STATUS = 'WHATIF']
8656: TStradeDB::retrieveTrades returned 0 trades
8656: OUT ### PagefileUsage: 46,508 K, diff : -68 K
8656: OUT ### PeakPagefileUsage: 52,756 K, diff : 0 Bytes
8656: Completed 31 Mar 2010 11:33:24 (0.032000s elapsed)
8656: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Ok, so my first answer was way off, so I'm changing it. Yes, that's allowed.
I missed the slightly important fact that we're dealing with interleaved events here: event1 starts and writes a couple of lines, event2 starts and writes a couple of lines, event1 writes some more, event2 writes some more, event1 completes, then event2 completes.
Can Splunk re-order this so that events are separated and concatenated properly? No, it cannot: Splunk reads log files as they are written and cannot be configured to re-order data. However, it would be possible to write a scripted input that does this first and then feeds the data to Splunk.
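As a minimal sketch of what such a de-interleaving pre-processor might look like (the exact line format is an assumption based on the sample above: every line starts with a numeric thread id like "8656:", and an event's last line is the "@@@@..." marker):

```python
import re
import sys
from collections import defaultdict

# Lines look like "8656: some text"; buffer them per thread id and emit
# a complete event only when that id's "@@@@..." end marker arrives.
LINE_RE = re.compile(r"^(\d+):\s?(.*)$")

def deinterleave(lines):
    """Buffer lines per thread id and yield each complete event as one string."""
    buffers = defaultdict(list)
    for raw in lines:
        raw = raw.rstrip("\n")
        m = LINE_RE.match(raw)
        if not m:
            continue  # skip lines without an id prefix
        tid, body = m.groups()
        buffers[tid].append(raw)
        if body.startswith("@"):  # end-of-event marker: flush this id's buffer
            yield "\n".join(buffers.pop(tid))

if __name__ == "__main__":
    for event in deinterleave(sys.stdin):
        print(event, end="\n\n")  # blank line between events
```

A script along these lines could run as a Splunk scripted input (or a cron job writing a cleaned-up file that Splunk monitors), so Splunk only ever sees fully assembled events.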
Alternatively, you could just index the data as single-line events and then use the 'stats' or 'transaction' commands to sort the various lines together in search output.
The final option would be to change the way in which your processes write their log files. Instead of one line at a time, is it possible to write a whole event at once so they are not intermingled? Or even separate log-files would do the trick too.
If you're stuck with this kind of cross-threaded indexing picture, you can still index everything as single-line events and use the search language to put things back together at search time. This often seems like a consolation prize, but in many ways it can actually be a better and more flexible way to go. For instance, with a long-running transaction the start events are already in the index, so your system can detect a problem before the transaction even closes.
One concrete problem is that you end up with some weird things, like half of event 8656's lines being indexed at its own start time and the other half at the start time of 10152. Therefore, for any particular time range you may need to filter out transactions whose end was clipped off, or your statistics might get slightly skewed.
To account for that, I'm going to assume there's some endEvent=1 field that we extract when we see the final line. I'm also assuming that you're correctly extracting the 'eventId' field from the single-line events, and that the other fields involved are being extracted properly.
So here are two ways to do it. For both examples, let's say you're interested in the peakPageFileUsage and elapsedTime fields.
1) The transaction command can put this back together:
sourcetype="foo" | transaction eventId maxspan=1m | search endEvent=1 | fields eventId peakPageFileUsage elapsedTime
(The maxspan value you should use depends on how frequently the eventIds are reused by the system.)
2) And stats is often overlooked, but it can do a lot of this just fine too:
sourcetype="foo" | stats first(endEvent) first(peakPageFileUsage) first(elapsedTime) by eventId | search endEvent=1
(And if you ever have a field that appears multiple times within a single eventId, you could use the values() operator within stats and then operate on the resulting multi-valued field.)
What they said. Basically, you either use transaction after the fact, or you fix how you log by writing log entries atomically. There is also the rather exotic possibility of using transaction and then writing the results to a summary index, but that requires a little extra work and customization.
If you haven't done so already, I would suggest explicitly adding a TIME_FORMAT in your props.conf file (maybe even a TIME_PREFIX too). You don't want Splunk guessing timestamp formats and trying to turn your eventId field into a date/time. You also have dates following the term EXPIRY_MONTH that could be picked up as well, so you probably want to nail this down and keep Splunk from guessing.
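As a sketch of such a stanza (the sourcetype name is a placeholder, and the TIME_PREFIX regex is an assumption based on the sample above, where the timestamp follows the "From xxx@yyy" token):

```
[your_sourcetype]
SHOULD_LINEMERGE = false
TIME_PREFIX = From \S+\s+
TIME_FORMAT = %d %b %Y %H:%M:%S
MAX_TIMESTAMP_LOOKAHEAD = 30
```

SHOULD_LINEMERGE = false matches the single-line-events approach discussed above, and the lookahead limit stops Splunk from wandering into the EXPIRY_MONTH dates further along the line.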
I have one alternate suggestion for example 1 that is less time-range sensitive. Assuming your start and end markers are as clear-cut as stated, you may want to give the transaction command explicit markers: transaction eventId startswith="################################################" endswith="@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@" — then you don't need the | search endEvent=1 part.