Getting Data In

File monitoring questions (top item change)

jcisha
Path Finder


The monitored input is a log file.
The log file has an unusual format: a header line followed by the log records.

Log format

000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
00001 | AIX | 6.1 | LCID | xxx
00002 | AIX | 6.1 | LCID | xxx
00003 | AIX | 6.1 | LCID | xxx
00004 | AIX | 6.1 | LCID | xxx

As log records are written, the NWNO item in the first line of the log above changes.
When that happens, the file is collected again and the records are indexed in duplicate.

For example, this search:

index = temp 00001

00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx

The same record is indexed as several events.

Is there a way to solve this problem?


yannK
Splunk Employee

This is not simple but can be achieved.

  • First, index your events as multiline events by creating a dedicated sourcetype. Define the sourcetype in props.conf (or through the data preview) and specify how the events should be broken.

Here we want each line like 000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
to be the beginning of a new event, so we break before the line containing SIZE.

[test]
NO_BINARY_CHECK=1
BREAK_ONLY_BEFORE=\d+ SIZE =
SHOULD_LINEMERGE=true
pulldown_type=1
MAX_EVENTS=256
# we do not expect more than 256 lines per event.
  • Second, index your events using that sourcetype.

  • Third, at search time, extract the fields from the second part of each event.
    With multikv, each line is treated as a separate event (the fields from the first line stay common to all of them).
    Then extract the fields from each line (using | as the separator),
    and finally remove the first line to avoid confusion.

source="*test.log" sourcetype=test | multikv noheader=t
| rex "(?<ID>\d+)[^|]*"
| rex "(?<fieldA>\d+)\s\|\s(?<fieldB>\w+)\s\|\s(?<fieldC>[^\|]*)\s\|\s(?<fieldD>\w+)\s\|\s(?<fieldE>\w+)"
| search NOT "LOOP"
| table ID SIZE LOOP WDTH NWNO fieldA fieldB fieldC fieldD fieldE _raw
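
To check the three steps offline before touching Splunk, the logic can be sketched in plain Python. This is only an emulation, not Splunk code: the function names split_events and expand_event are hypothetical, the header pattern is copied from BREAK_ONLY_BEFORE above, and the row pattern assumes the capture groups are named fieldA to fieldE as in the table command.

```python
import re

# Sample copied from the log format in the question.
SAMPLE = """\
000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
00001 | AIX | 6.1 | LCID | xxx
00002 | AIX | 6.1 | LCID | xxx
00003 | AIX | 6.1 | LCID | xxx
00004 | AIX | 6.1 | LCID | xxx
"""

HEADER_START = re.compile(r"\d+ SIZE =")   # same pattern as BREAK_ONLY_BEFORE
HEADER_KV = re.compile(r"(\w+) = (\d+)")   # SIZE = 099999, LOOP = 000000, ...
ROW = re.compile(
    r"(?P<fieldA>\d+)\s\|\s(?P<fieldB>\w+)\s\|\s(?P<fieldC>[^|]*)"
    r"\s\|\s(?P<fieldD>\w+)\s\|\s(?P<fieldE>\w+)"
)


def split_events(text):
    """Step 1: group lines into events; each header line starts a new event."""
    events = []
    for line in text.splitlines():
        if HEADER_START.match(line) or not events:
            events.append([])
        events[-1].append(line)
    return events


def expand_event(lines):
    """Step 3: one dict per data row, with the header fields copied onto it."""
    header = dict(HEADER_KV.findall(lines[0]))
    for line in lines[1:]:          # skip the header line itself
        m = ROW.match(line)
        if m:
            yield {**header, **m.groupdict()}


rows = [row for event in split_events(SAMPLE) for row in expand_event(event)]
for row in rows:
    print(row)
```

Each printed row carries SIZE, LOOP, WDTH and NWNO from the header line plus its own fieldA to fieldE, which is the shape the table command in the search is meant to produce.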

jcisha
Path Finder

Thank you for the answer, yannK.

Still, is there any way to avoid collecting the duplicates in the first place?
Or is handling them at search time, in the search results, the only option?

My concerns are uniqueness of the data and licensing.
