File monitoring questions (top item change)

jcisha
Path Finder

The monitoring target is a log file.
The log file has a peculiar format: a header line followed by the log records.

Log format

000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
00001 | AIX | 6.1 | LCID | xxx
00002 | AIX | 6.1 | LCID | xxx
00003 | AIX | 6.1 | LCID | xxx
00004 | AIX | 6.1 | LCID | xxx

As log records are written, the NWNO item in the header line of the log above changes.
When the file is collected, this causes duplicate indexing.

For example, the search below returns the same record several times:

index = temp 00001

00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx

Each of these is indexed as a separate event.

Is there a way to solve this problem?


yannK
Splunk Employee

This is not simple but can be achieved.

  • First, index your events as multiline events by creating a specific sourcetype. The sourcetype has to be defined in props.conf (or through the data preview), and it specifies how the events are broken.

Here we want each header line, such as 000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004,
to be the beginning of a new event, so we break before the line containing SIZE.

[test]
NO_BINARY_CHECK=1
BREAK_ONLY_BEFORE=\d+ SIZE =
SHOULD_LINEMERGE=true
pulldown_type=1
MAX_EVENTS=256
# we do not expect more than 256 lines per event.
  • Second, index your events using the correct sourcetype.

  • Third, during the search, do the field extraction for the second part of the events.
    With multikv, each line is treated as a separate event (but the fields from the first line are common to all of them).
    Then you can extract the fields from each line (using | as a separator),
    and finally remove the first line to avoid confusion.

source="*test.log" sourcetype=test | multikv noheader=t
| rex "(?<ID>\d+)[^|]*"
| rex "(?<fieldA>\d+)\s\|\s(?<fieldB>\w+)\s\|\s(?<fieldC>[^\|]*)\s\|\s(?<fieldD>\w+)\s\|\s(?<fieldE>\w+)"
| search NOT "LOOP"
| table ID SIZE LOOP WDTH NWNO fieldA fieldB fieldC fieldD fieldE _raw
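
To try the same logic outside Splunk, here is a minimal Python sketch of what the steps above aim for: a header line starts a new block, and each following record line is split on | and combined with the header values. It is only an illustration, not how Splunk implements line merging or multikv. The function name parse_log and the sample text are hypothetical; the fieldA to fieldE names are borrowed from the table command in the search.

```python
import re

# Same pattern as BREAK_ONLY_BEFORE in the props.conf stanza above.
HEADER_RE = re.compile(r"\d+ SIZE =")
# Header items of the form "SIZE = 099999".
HEADER_KV_RE = re.compile(r"(\w+) = (\d+)")
# Illustrative names for the pipe-separated columns (an assumption).
RECORD_FIELDS = ["fieldA", "fieldB", "fieldC", "fieldD", "fieldE"]


def parse_log(text):
    """Return one dict per record line, carrying the header values along."""
    rows = []
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if HEADER_RE.search(line):
            # A header line starts a new block; remember its key/value pairs.
            header = dict(HEADER_KV_RE.findall(line))
            continue
        # Record line: split on the | separator and trim the padding.
        values = [part.strip() for part in line.split("|")]
        row = dict(header)
        row.update(zip(RECORD_FIELDS, values))
        rows.append(row)
    return rows


# Hypothetical sample in the format shown in the question.
SAMPLE = """\
000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
00001 | AIX | 6.1 | LCID | xxx
00002 | AIX | 6.1 | LCID | xxx
"""

if __name__ == "__main__":
    for row in parse_log(SAMPLE):
        print(row)
```

The header line itself is not emitted as a row, which mirrors the search NOT "LOOP" step in the search.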

jcisha
Path Finder

Thank you for the answer, yannK.

Still, is there any way to avoid collecting the duplicates in the first place?
Or is the only option to resolve them at search time?

My concerns are uniqueness and licensing.
