Indexer not properly segmenting GroupWise data from forwarder

Cagey
Engager

I have several GroupWise servers running forwarders to a single index server. For the most part the data is arriving and being indexed, but it is not being segmented properly. The entries are being indexed in groups of 100+ lines, and individual lines are being split across indexed entries. For instance, this entry:

7/12/11 8:43:57.000 AM

/mslocal/mshold/po2be3/4/00065cef.001

08:43:57 496 MTP: panthera.macewan.ca: Transmitting file /gw/podom1/mslocal/mshold/po2be3/4/00065cef.001, Size: 22748

08:43:57 496 MTP: panthera.macewan.ca: End-of-file confirmation packet received

08:43:59 304 MTP: po5.podom1: Returning acknowledge (11)

08:43:59 304 MTP: po5.podom1: Returning acknowledge (11)

Show all 108 lines



The first line is the date/timestamp associated with the entry.

The next line (with no timestamp) comes from the end of the previous entry and has been split off from its timestamp.

The next 4 lines are properly formatted. However, if I were to "show all 108 lines", the final line would be broken at a random spot, with the remainder added at the start of the next entry.



It appears that Splunk is not recognizing the timestamp as the start of an entry and is instead grouping the data into a single indexed entry as it arrives from the forwarder.

How do I solve this???

jbsplunk
Splunk Employee

It looks like you need to tell Splunk more about the format of the timestamp. It isn't uncommon for Splunk to need explicit instruction to recognize timestamps reliably. This configuration is done on the indexer, where the data from your forwarder is parsed.

http://www.splunk.com/base/Documentation/latest/Data/Configuretimestamprecognition

You should probably use TIME_FORMAT and TIME_PREFIX in props.conf. Since this data looks like multiline data, you'd probably use a caret (^) as the prefix so the timestamp is matched at the start of a line. Something like this should work:

[mysourcetype]
TIME_PREFIX = ^
# %I is the 12-hour clock field, which pairs with %p (AM/PM)
TIME_FORMAT = %m/%d/%y %I:%M:%S.%3N %p
MAX_TIMESTAMP_LOOKAHEAD = 22
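
Before committing a format string to props.conf, it can help to sanity-check it outside Splunk. The following is only a rough Python sketch, not Splunk's own parser: it assumes Python's %f is a close enough stand-in for Splunk's %3N, and it uses %I (12-hour clock) because the sample timestamp carries an AM/PM marker. The second timestamp is a made-up example of the longest form this format could take.

```python
from datetime import datetime

# Timestamp exactly as it appears in the question.
sample = "7/12/11 8:43:57.000 AM"

# Python's %f stands in for Splunk's %3N (subsecond digits); %I is the
# 12-hour clock field that pairs with %p (AM/PM).
parsed = datetime.strptime(sample, "%m/%d/%y %I:%M:%S.%f %p")
print(parsed.isoformat())

# Hypothetical longest form: two-digit month and two-digit hour.
longest = "12/12/11 10:43:57.000 AM"

# Compare these lengths with MAX_TIMESTAMP_LOOKAHEAD.
print(len(sample), len(longest))
```

The sample timestamp is 22 characters long, but the two-digit form is 24, so a lookahead of 22 may be too short for some dates and hours; a slightly larger value is probably safer.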

You'll also probably need to use BREAK_ONLY_BEFORE and MUST_BREAK_AFTER in props.conf to define the beginning and end of your events. This will ensure that everything you'd like to be captured in a single event will be contained within that event.

http://www.splunk.com/base/Documentation/latest/admin/Propsconf

BREAK_ONLY_BEFORE = <regular expression>
* When set, Splunk creates a new event only if it encounters a new line that matches the
  regular expression.
* Defaults to empty.

MUST_BREAK_AFTER = <regular expression>
* When set and the regular expression matches the current line, Splunk creates a new event for
  the next input line.
* Splunk may still break before the current line if another rule matches.
* Defaults to empty.
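
To get a feel for what BREAK_ONLY_BEFORE does, here is a small Python sketch that imitates the rule on a few lines from the question. It is an approximation of the behavior described above, not Splunk's implementation. The pattern (a line starting with HH:MM:SS) and the continuation line are assumptions made for illustration; in props.conf the equivalent setting would be something like BREAK_ONLY_BEFORE = ^\d{2}:\d{2}:\d{2}\s, and it only takes effect when SHOULD_LINEMERGE is true.

```python
import re

# Hypothetical pattern: a line starting with HH:MM:SS begins a new event.
BREAK_ONLY_BEFORE = re.compile(r"^\d{2}:\d{2}:\d{2}\s")

def merge_lines(lines):
    """Group lines into events, breaking only before a matching line."""
    events = []
    for line in lines:
        if not events or BREAK_ONLY_BEFORE.search(line):
            events.append([line])        # start a new event
        else:
            events[-1].append(line)      # continuation of the current event
    return ["\n".join(event) for event in events]

lines = [
    "08:43:57 496 MTP: panthera.macewan.ca: End-of-file confirmation packet received",
    "    continuation text without a timestamp (hypothetical)",
    "08:43:59 304 MTP: po5.podom1: Returning acknowledge (11)",
    "08:43:59 304 MTP: po5.podom1: Returning acknowledge (11)",
]

events = merge_lines(lines)
for event in events:
    print(repr(event))
```

Each timestamped line starts its own event, and the line without a timestamp stays attached to the event before it, which is the behavior the original poster is after.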

There may be other line breaking settings which may work better in your instance, but this should give you a good place to start.
