Hi, I'm trying to parse some logs generated by Broadsoft SIP servers. The log formats follow a general pattern, but the detail can vary from event to event and field meanings can be context-sensitive.

The events are multiline, broken by a datetime string, and the first portion of each event is pipe-separated. The fields here can differ in number and meaning. If I use DELIMS on the pipe character, it works except for the last field, which runs on into the remainder of the event.

The first thing I'd like to do is stop the delimiter matching at a defined point, which seems to be the newline character. The following transform, using "| or newline", doesn't work. If I make it "| or tab", it works better for the first line but also matches unwanted fields in the remainder of the event (many of whose lines start with a tab).

# delims are pipe OR newline.
FIELDS = "szDateTime" logLevel logType sipField1 sipField2 sipField3

Event sample:

2012.06.21 02:48:15:155 EST | Info       | CallP | SIP Endpoint | +155512345678 | Service Delivery | localHost1234:5678

        Processing Event:

2012.06.21 02:48:15:157 EST | Info       | Accounting

        Time Stamp: Thu Jun 21 02:48:15 EST 2012 (1340264895157)
        Accounting ID: [id]
        Service Name: Call Transfer
        Related Accounting ID: [id]

2012.06.21 02:48:14:773 EST | Info       | SipMedia | +155512345678 | localHost1234:5678

        udp 391 Bytes IN from
SIP/2.0 200 OK
[various amounts (10 - 30+ lines) of SIP information trimmed]

asked 26 Jun '12, 03:42 by inglisn

One Answer:

I think there are several options here, as you seem to have a variable number of varying fields in each event. One solution is to use a combination of props and transforms definitions to pull out the major, high-level extractions on a first pass and then pull out additional fields in a second pass.

You could have a props.conf like this to efficiently break events, extract the timestamp, and call the field extraction pieces:

TIME_FORMAT=%Y.%m.%d %H:%M:%S:%3N %Z
REPORT-field_passes=pass_one, pass_two, pass_three
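
The event-breaking settings themselves are not shown above. As a rough sketch only, the lines below could sit in the same props.conf stanza as the two settings above. The sourcetype name `broadsoft_sip` is hypothetical, and the regex assumes every event begins with a timestamp shaped like `2012.06.21 02:48:15:155 `, as in the sample; adjust both to your data.

```ini
[broadsoft_sip]
# Hypothetical sourcetype name; substitute your own.
# Turn off line merging and rely on LINE_BREAKER alone.
SHOULD_LINEMERGE = false
# Start a new event wherever the newline(s) are followed by a
# timestamp such as "2012.06.21 02:48:15:155 ". The captured
# newlines are discarded between events.
LINE_BREAKER = ([\r\n]+)(?=\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}:\d{3} )
# The timestamp sits at the very start of each event.
TIME_PREFIX = ^
# "2012.06.21 02:48:15:155 EST" is 27 characters long.
MAX_TIMESTAMP_LOOKAHEAD = 30
```

Breaking on the timestamp rather than on every newline keeps the indented detail lines and the SIP message body attached to the header line they belong to.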

The corresponding transforms.conf could look like this: it first pulls out the static, known fields (pass_one), then the colon-separated values (pass_two), and finally runs additional passes against sipFields (extracted in pass_one) to handle anything else:

FORMAT=szDateTime::$1 logLevel::$2 logType::$3 sipFields::$4


# another iteration for variable number of pipe separated values, etc
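
To make the three passes concrete, here is one way the stanzas might be filled in. This is an illustrative sketch, not the configuration from the answer: the regular expressions and the multivalue field name `sipField` are assumptions based only on the sample events in the question, and they have not been run against real Broadsoft logs.

```ini
[pass_one]
# Header line only. Each capture stops at a pipe or a newline and is
# trimmed of trailing spaces, so the last header field cannot run on
# into the indented body. The fourth group (everything after the
# third pipe on that line) is optional.
REGEX = ^([^|\r\n]*[^|\s])[ \t]*\|[ \t]*([^|\r\n]*[^|\s])[ \t]*\|[ \t]*([^|\r\n]*[^|\s])(?:[ \t]*\|[ \t]*([^\r\n]*))?
FORMAT = szDateTime::$1 logLevel::$2 logType::$3 sipFields::$4

[pass_two]
# Indented "Name: value" lines in the body, e.g. "Service Name: Call Transfer".
# With a variable key in FORMAT the regex is expected to match repeatedly;
# key names are cleaned by default (spaces become underscores).
REGEX = (?m)^[ \t]+([^:\r\n]+):[ \t]+([^\r\n]+)
FORMAT = $1::$2

[pass_three]
# Split the remainder of the header line into a multivalue field.
# "sipField" is a hypothetical name chosen for this sketch.
SOURCE_KEY = sipFields
REGEX = ([^|\s](?:[^|]*[^|\s])?)
FORMAT = sipField::$1
MV_ADD = true
```

Because pass_three reads from sipFields instead of the raw event, it only ever sees the first line, which is what the DELIMS attempt in the question was trying to achieve by stopping at the newline.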

answered 26 Jun '12, 05:53 by bwooden

Excellent, thanks.

I came across a similar "2-phase" strategy in a question about FIX logs. It's a really useful way of working with ugly log formats. I can pull out other values with rex in the search command.

You also resolved some other issues on linebreaking I was having.

(27 Jun '12, 04:12) inglisn
Copyright © 2005-2014 Splunk Inc. All rights reserved.