I am trying to run a Python script from Splunk. It takes 3 arguments, is supposed to run calculations on them, and print the results back into Splunk, but I am not getting any results back. There are about 100 events that I want to run the script on.
Splunk command:
... | script python amp2 macaddress timestamp numofaverages
Values of arguments:
macaddress=11:22:33:44:55:66
timestamp=123456789
numofaverages=1,2,3,4,5,6,7,8,9,10
Below is the python script I am trying to run.
import sys
mac=sys.argv[1]
time=sys.argv[2]
avg=sys.argv[3]
avg=avg.split(",")
avgmin=min(avg)
avgmax=max(avg)
count=len(avg)
try:
    avg=map(float, avg)
    avgmean=round((sum(avg)/count),2)
except (NameError, ValueError):
    avgmin="min"
    avgmax="max"
    count="bin count"
    avgmean="mean"
print(mac+","+time+","+avgmin+","+avgmax+","+str(avgmean)+","+str(count))
You can't just call an arbitrary script in a search pipeline. The script must know how to accept Splunk pipeline inputs (unless it ignores them, which yours appears to do, so that's fine), but more importantly it must output them in the right format. As it turns out, the output format is a standard CSV file, including a header that specifies the field names. So basically you need to add a print for the CSV file header that matches the fields you're outputting. In general, you'll be a lot better off using the language-specific CSV libraries as well, rather than printing directly.
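To make the header point concrete, here is a minimal sketch of the same calculation written with Python's csv module, so that the first line of output is a header naming the fields. The function names (summarize, write_results) and the output field names are my own choices for illustration, not anything Splunk requires. It also converts the values to floats before taking min and max, because min/max on the raw strings compare lexicographically (so "9" would come out larger than "10").

```python
import csv
import sys

# Output field names; hypothetical, pick whatever you want to see in Splunk.
FIELDS = ["macaddress", "timestamp", "avgmin", "avgmax", "avgmean", "count"]


def summarize(mac, timestamp, averages):
    """Build one output row from a comma-separated string of numbers."""
    # Convert first, so min/max compare numbers rather than strings.
    values = [float(v) for v in averages.split(",")]
    return {
        "macaddress": mac,
        "timestamp": timestamp,
        "avgmin": min(values),
        "avgmax": max(values),
        "avgmean": round(sum(values) / len(values), 2),
        "count": len(values),
    }


def write_results(rows, out=sys.stdout):
    """Write a CSV header line followed by one line per row."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Same three positional arguments as the original script.
    if len(sys.argv) == 4:
        write_results([summarize(sys.argv[1], sys.argv[2], sys.argv[3])])
```

Run by hand with the three example values, this should print a header line followed by a single data line. Whether Splunk needs anything beyond that in your setup is something to check against the documentation for your version.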
Also, I imagine you understand that this computation can be done in Splunk search language directly and you're merely going through an exercise of getting a simple command to work.
In the Splunk command for the Python script, how do I send the values from a column as arguments? Right now it is only sending the literal strings "macaddress", "timestamp" and "numofaverages" as arguments.
Obhatti, what you want is a Splunk custom search command. Have a look here for details:
http://docs.splunk.com/Documentation/Splunk/6.0/AdvancedDev/SearchScripts
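As a rough illustration of what such a command does differently: instead of reading field values from sys.argv, it works on the events it is handed, one row per event, and pulls each value out by field name. The sketch below shows only that per-event step as a plain function. It assumes the events have already been read into a list of dicts keyed by field name; in a real command that list would presumably come from Splunk's own helper code (for example the splunk.Intersplunk module), and that wiring is an assumption left out here. The function name summarize_events is hypothetical.

```python
def summarize_events(events):
    """Return a copy of each event with avgmin/avgmax/avgmean/count added.

    `events` is assumed to be a list of dicts keyed by field name, one
    dict per Splunk event. Events without a usable numofaverages field
    are skipped.
    """
    output = []
    for event in events:
        raw = event.get("numofaverages", "")
        try:
            values = [float(v) for v in raw.split(",")]
        except ValueError:
            # Field missing, empty, or not purely numeric: skip this event.
            continue
        row = dict(event)  # keep macaddress, timestamp and any other fields
        row["avgmin"] = min(values)
        row["avgmax"] = max(values)
        row["avgmean"] = round(sum(values) / len(values), 2)
        row["count"] = len(values)
        output.append(row)
    return output
```

With the roughly 100 events from the search, this would be called once on the whole list rather than once per event, and the returned rows would then be written back out for Splunk to display.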
Hi gkanapathy, thanks for the reply. Yes, I can do these calculations in Splunk, but this exercise was for me to understand how Splunk interacts with external scripts. Can you give me more information on the CSV header in Python and how to include it in the code?
For example, you can pretty much replace your command with:
... | eval n=split(numofaverages,",") | stats min(n) as avgmin max(n) mean(n) count(n) by macaddress,timestamp
Maybe you want to rename more fields and maybe you want to use the round function to round off the mean, but that's basically it.