Getting Data In

Remove Multiline Log Duplicates

jerrad
Path Finder

I am trying to figure out an approach to a multiline log file problem I have, the device that generates the file does so like a regular running log file however it is FIFO at the point that it reaches 10MB. The only way I can get this file is via FTP and have Splunk monitor the download path, I managed to get all my multiline event breaking working correctly for the most part aside from a few stray events that are truncated from the source but I can live with that. The issue I have is that if I simply overwrite the file with a newly downloaded copy it duplicates many events since the first 256 bytes of the file has a different CRC than before and so does the last 256 bytes of the file. It's really much of the middle portion that is potentially the same .

Is there any tweak or method anyone can suggest to deal with this situation with the goal of not indexing any duplicate events?

First FTP of File Example

MSCi      MSS01                     2010-12-08  09:43:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:44:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:45:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:46:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:47:09
40+ random lines
END OF REPORT

Second FTP of File ~1 hour later

MSCi      MSS01                     2010-12-08  10:43:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:47:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:46:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:45:09
40+ random lines
END OF REPORT

MSCi      MSS01                     2010-12-08  09:44:09

Thanks

Jerrad

Tags (2)
0 Karma

BenAveling
Path Finder

Rather than have splunk index the ftp'd file, you could perhaps have a script running after each ftp to extract just unique events into a new file and have splunk monitor that.

0 Karma
Get Updates on the Splunk Community!

Updated Team Landing Page in Splunk Observability

We’re making some changes to the team landing page in Splunk Observability, based on your feedback. The ...

New! Splunk Observability Search Enhancements for Splunk APM Services/Traces and ...

Regardless of where you are in Splunk Observability, you can search for relevant APM targets including service ...

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...