Getting Data In

Which method gives the best performance for getting custom script output into Splunk: monitor, SDK, or HTTP Event Collector?

wegscd
Contributor

We're having to write some custom scripts to read/tail binary data, format it into something Splunk-able (k1=v1 k2=v2 k3=v3), and get it into Splunk. This will be running on a machine that will have a Universal Forwarder (UF).

At this point, I see three options for the "get them into Splunk" end:

  1. write a scratch file into a directory that Splunk is monitoring
  2. write against one of the SDKs to push the events into Splunk
  3. use the HTTP Event Collector

I know that #1 performs well (and is easy to troubleshoot and test), but it leaves me with a small scratch file management problem (which is very manageable). Since I am so lazy that I don't even want to solve that problem, I was wondering whether anyone has experience with how well #2 and #3 hold up at 7 million events/1.1 GB a day...
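
For reference, option #1 could look roughly like the sketch below. It is only an illustration under assumptions: the function name `write_batch`, the staging directory, and the file naming are hypothetical, and cleanup of old files (the scratch file management problem mentioned above) is deliberately left out. The idea is to write each batch outside the monitored directory and move it in with a rename, so the forwarder never sees a half-written file.

```python
import os
import tempfile


def write_batch(lines, staging_dir, monitored_dir, name):
    """Write one batch of k=v event lines, then move it into the monitored directory."""
    # Write the whole batch to a temp file outside the monitored directory.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
    # os.replace is atomic when both paths are on the same filesystem.
    final_path = os.path.join(monitored_dir, name)
    os.replace(tmp_path, final_path)
    return final_path
```

Both directories need to live on the same filesystem for the rename to be atomic.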

gcusello
SplunkTrust

I don't know your scripts, but you could probably send the script output directly to Splunk by launching the script as a scripted input in inputs.conf:
[script://yourscript]
....
...

This way the script output goes directly into Splunk.
You only have to set the script permissions correctly.
This is better than files.
Bye.
Giuseppe
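
A scripted input indexes whatever the script writes to standard output, so the script itself only has to print one event per line. Below is a minimal sketch of such a script. The binary record layout (`RECORD`) and the field names are assumptions made up for illustration, since the real data format is not described in this thread.

```python
import struct

# Hypothetical record layout (an assumption, not the poster's real format):
# little-endian uint32 epoch seconds, uint16 sensor id, float32 reading.
RECORD = struct.Struct("<IHf")


def format_record(raw):
    """Turn one fixed-size binary record into a k=v event line."""
    ts, sensor, value = RECORD.unpack(raw)
    return f"ts={ts} sensor={sensor} value={value:.2f}"


def emit(stream, out):
    """Read records from a binary stream and write one event line per record."""
    count = 0
    while True:
        raw = stream.read(RECORD.size)
        if len(raw) < RECORD.size:
            break  # end of data or a truncated trailing record
        out.write(format_record(raw) + "\n")
        out.flush()  # push each event out promptly instead of buffering
        count += 1
    return count
```

In a real script one would call something like `emit(open(path, "rb"), sys.stdout)` from a `__main__` guard, where `path` is whatever binary source is being read.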

wegscd
Contributor

Why is it better than files?

starcher
SplunkTrust

Any of those methods will work fine at the scale of GB/day. At larger scales, writing files runs into the usual universal forwarder issues, such as ulimits and race conditions where large files are log-rotated out of the UF's view before it has finished reading them. I am a fan of HTTP Event Collector (HEC) if you are already working in something like Python, where your data is likely already in a JSON payload format. I have a simple threaded Python class for it. There are customers running HEC at TB/day.

http://blogs.splunk.com/2015/12/11/http-event-collect-a-python-class/
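
As a rough illustration of what posting to HEC involves (this is not the threaded class from the blog post above, just a standard-library sketch), the snippet below builds a batched request for the `/services/collector/event` endpoint. The function names, the `custom:kv` sourcetype, and any URL or token passed in are placeholders chosen for the example.

```python
import json
import urllib.request


def build_hec_request(url, token, events, sourcetype="custom:kv"):
    """Build one POST request carrying a batch of events for HEC."""
    # HEC accepts several event objects concatenated in one request body.
    body = "".join(
        json.dumps({"event": event, "sourcetype": sourcetype}) for event in events
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Splunk {token}",  # HEC token authentication
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Batching many events per request, rather than one request per event, is what keeps the HTTP overhead small at millions of events a day.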

gcusello
SplunkTrust

Because writing the script output to a file and then monitoring that file takes more time to execute.
Bye.
Giuseppe
