Getting Data In

File monitoring questions (top item change)

jcisha
Path Finder

The monitored input is a log file.
The file has an unusual format: a header line followed by the log records.

Log format:

000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
00001 | AIX | 6.1 | LCID | xxx
00002 | AIX | 6.1 | LCID | xxx
00003 | AIX | 6.1 | LCID | xxx
00004 | AIX | 6.1 | LCID | xxx

As log records are written, the NWNO item in the top line of the log above changes.
When the file is collected, this causes the records to be indexed more than once.

For example, the search

index = temp 00001

returns the same record several times:

00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx
00001 | AIX | 6.1 | LCID | xxx

Each of these lines is a separately indexed event.

Is there a way to solve this problem?


yannK
Splunk Employee

This is not simple, but it can be achieved.

  • First, index your events as multiline events by creating a specific sourcetype. The sourcetype has to be defined in props.conf (or through the data preview), and it specifies how the events are broken.

Here we want each line like 000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
to be the beginning of a new event, so we break before the line containing SIZE:

[test]
NO_BINARY_CHECK=1
BREAK_ONLY_BEFORE=\d+ SIZE =
SHOULD_LINEMERGE=true
pulldown_type=1
MAX_EVENTS=256
# we do not expect more than 256 lines per event.
  • Second, index your events using that sourcetype.

  • Third, at search time, do the field extraction for the second part of the events.
    With multikv, each line is treated as a separate event (but the fields from the first line are shared by all of them).
    Then you can extract the fields from each line (using | as the separator)
    and finally remove the first line to avoid confusion.

source="*test.log" sourcetype=test | multikv noheader=t
| rex "(?<ID>\d+)[^|]*"
| rex "(?<fieldA>\d+)\s\|\s(?<fieldB>\w+)\s\|\s(?<fieldC>[^\|]*)\s\|\s(?<fieldD>\w+)\s\|\s(?<fieldE>\w+)"
| search NOT "LOOP"
| table ID SIZE LOOP WDTH NWNO fieldA fieldB fieldC fieldD fieldE _raw
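
To make the three steps easier to follow outside of Splunk, here is a minimal Python sketch of the same idea: start a new event at each header line, carry the header fields onto every record line, and split each record on the | separator. It only imitates the behaviour on the sample from the question and is not how Splunk implements BREAK_ONLY_BEFORE or multikv. The function names break_events and expand_event are made up for this illustration, and the names fieldA to fieldE are simply the ones listed in the table command above.

```python
import re

# Sample from the question: one header line followed by pipe-delimited records.
SAMPLE = """\
000000 SIZE = 099999 LOOP = 000000 WDTH = 001536 NWNO = 0004
00001 | AIX | 6.1 | LCID | xxx
00002 | AIX | 6.1 | LCID | xxx
00003 | AIX | 6.1 | LCID | xxx
00004 | AIX | 6.1 | LCID | xxx
"""

# Same pattern idea as BREAK_ONLY_BEFORE in the [test] stanza.
HEADER_START = re.compile(r"^\d+ SIZE =")
# Picks up "SIZE = 099999", "LOOP = 000000", ... from the header line.
HEADER_KV = re.compile(r"(\w+) = (\d+)")
# Same shape as the second rex; group names are assumed from the table command.
RECORD = re.compile(
    r"(?P<fieldA>\d+)\s\|\s(?P<fieldB>\w+)\s\|\s(?P<fieldC>[^|]*)"
    r"\s\|\s(?P<fieldD>\w+)\s\|\s(?P<fieldE>\w+)"
)


def break_events(text):
    """Group lines into events, starting a new event at each header line.

    Lines that appear before the first header are skipped.
    """
    events = []
    for line in text.splitlines():
        if HEADER_START.match(line):
            events.append([line])
        elif events:
            events[-1].append(line)
    return events


def expand_event(lines):
    """Return one dict per record line, with the header fields copied in."""
    header = dict(HEADER_KV.findall(lines[0]))
    rows = []
    for line in lines[1:]:  # skip the header line itself
        match = RECORD.match(line)
        if match:
            rows.append({**header, **match.groupdict()})
    return rows


if __name__ == "__main__":
    for event in break_events(SAMPLE):
        for row in expand_event(event):
            print(row)
```

Each printed row holds SIZE, LOOP, WDTH and NWNO from the header next to the five fields of one record, which is the shape the table command in the search is aiming for.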

jcisha
Path Finder

Thank you for the answer, yannK.

Still, is there any way to avoid collecting the duplicates in the first place?
Or is handling them in the search results the only way to solve this?

My concerns are uniqueness of the events and licensing.
