When using an API to enrich my data, for example checking MD5 hashes in my logs against VirusTotal's API, how can I control how many requests Splunk sends to my external lookup script? APIs often cap how many items a single request may carry, and since the lookup goes off-box, it's most efficient to fill each request and send in bulk.
Splunk will send all the requests to your external lookup script because Splunk can't know what the script is doing.
However, the script can be smart about what it sends off and what it answers itself from a cache.
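To make the caching idea concrete, here is a minimal sketch of a cache with expiring entries, where only the misses go to the API. `TTLCache`, `lookup_hashes`, and `fetch_remote` are hypothetical names, not part of Splunk or VirusTotal. It also assumes an in-memory store for brevity; if each lookup invocation runs as a separate process, the store would have to live on disk instead.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._store = {}            # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]    # stale entry: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (self.clock() + self.ttl_seconds, value)


def lookup_hashes(hashes, cache, fetch_remote):
    """Answer from the cache where possible; send only misses to fetch_remote.

    fetch_remote is a hypothetical callable that takes a list of hashes
    and returns a dict mapping each hash to its verdict.
    """
    results = {}
    misses = []
    for h in hashes:
        cached = cache.get(h)
        if cached is not None:
            results[h] = cached
        elif h not in misses:       # de-duplicate before going off-box
            misses.append(h)
    if misses:
        for h, verdict in fetch_remote(misses).items():
            cache.put(h, verdict)
            results[h] = verdict
    return results
```

With a TTL of a few hours, repeated hashes inside that window never reach the API, and duplicates within a single call are sent only once.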
I see. While I'm not so sure about lookups, custom search commands are by default configured to receive up to 50000 rows per invocation: http://docs.splunk.com/Documentation/Splunk/6.2.3/Admin/Commandsconf
If there's no way to configure this for a lookup, you could at least convert your lookup script into a custom search command.
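Whichever form the script takes, once it receives many rows in one invocation it can group them into API-sized batches itself. The following is only a sketch under assumptions: rows arrive as CSV with a header line and results go back as CSV, the `md5` and `verdict` field names and the batch size of 4 are placeholders, and `fetch_batch` is a hypothetical function standing in for the real bulk API call.

```python
import csv


def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enrich(infile, outfile, fetch_batch, batch_size=4,
           key_field="md5", out_field="verdict"):
    """Read CSV rows, look up unique hashes in batches, write CSV rows back.

    fetch_batch is a hypothetical callable: list of hashes -> dict of
    hash -> verdict. Field names and batch size are assumptions.
    """
    reader = csv.DictReader(infile)
    rows = list(reader)

    # Unique, non-empty hashes in first-seen order.
    unique = list(dict.fromkeys(
        row[key_field] for row in rows if row.get(key_field)
    ))

    verdicts = {}
    for batch in chunked(unique, batch_size):
        verdicts.update(fetch_batch(batch))   # one off-box call per batch

    fieldnames = list(reader.fieldnames or [])
    if out_field not in fieldnames:
        fieldnames.append(out_field)

    writer = csv.DictWriter(outfile, fieldnames=fieldnames,
                            extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        row[out_field] = verdicts.get(row.get(key_field), "")
        writer.writerow(row)
```

In a real script this might be called as `enrich(sys.stdin, sys.stdout, my_api_call)`. If the script really is handed one row per invocation, the loop degenerates to one call per hash, and batching has to happen upstream of the script.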
According to logs written by my script, Splunk is sending the hashes to the script one at a time. I have caching implemented in the script, so it doesn't call the API for a hash it has already received an answer for in the last few hours.
What I'm after is configuring Splunk to send more than one hash at a time to the script as it goes down the list.