I'm using Splunk 4.1.6 and getting started with creating summary data.
Edit: What I'm trying to do is eliminate fields I don't need when writing to a summary index.
I originally tried using the fields command, but I continued to see fields I did not specify in my summary index.
I created a scheduled search which runs daily for "yesterday" and writes to a summary index.
The search (I have replaced my real host names with <hostA_3>, <hostA_4>, <hostB_5>, <hostB_7>):
splunk_server=splunk-uad* index=uad-* host=<hostB>* OR host=<hostA>* sourcetype=access_combined_rsptime NOT netid="-" | dedup netid, clientip, host | sort - _time | stats values(host) AS host by _time, req_time, clientip, netid
This went fine.
However, when I started running some reports against this summary data, I noticed that 2 seemingly random events out of 9,998 have a mangled "orig_host" value:
orig_host="<hostA_3>.itsli.albany.edu <hostB_5>.itsli.albany.edu"
orig_host="<hostA_4>.itsli.albany.edu <hostB_7>.itsli.albany.edu"
It seems Splunk concatenated two of my host names together for two arbitrary events.
I checked the events that the summary events were created from, and none of the original events has a "host" field with those bad values.
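To make my understanding of the search concrete, here is a rough Python model of what I believe "stats values(host) AS host by _time, req_time, clientip, netid" does: it groups events by the by-fields and collects the distinct host values for each group. This is only a sketch of the semantics, not Splunk's implementation. The function name stats_values_host and the sample events (timestamps, IPs, netids, short host names) are made up for illustration.

```python
from collections import defaultdict

def stats_values_host(events):
    """Model of: stats values(host) AS host by _time, req_time, clientip, netid.

    Groups events by the four by-fields and collects the distinct host
    values seen in each group (sorted, as values() returns them).
    """
    groups = defaultdict(set)
    for e in events:
        key = (e["_time"], e["req_time"], e["clientip"], e["netid"])
        groups[key].add(e["host"])
    return {key: sorted(hosts) for key, hosts in groups.items()}

# Hypothetical sample events. The first two differ only in host, so
# "dedup netid, clientip, host" would keep both of them.
events = [
    {"_time": "2011-01-10 08:00:00", "req_time": "0.012",
     "clientip": "10.0.0.1", "netid": "jdoe", "host": "hostA_3"},
    {"_time": "2011-01-10 08:00:00", "req_time": "0.012",
     "clientip": "10.0.0.1", "netid": "jdoe", "host": "hostB_5"},
    {"_time": "2011-01-10 09:30:00", "req_time": "0.020",
     "clientip": "10.0.0.2", "netid": "asmith", "host": "hostA_4"},
]

rows = stats_values_host(events)
for key, hosts in rows.items():
    # A group with more than one host shows up as a multi-valued field.
    print(key, "->", " ".join(hosts))
```

If that reading is right, two source events from different hosts that happen to share the same _time, req_time, clientip and netid would collapse into one summary row carrying both host values, even though no single original event has such a host.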
Am I doing something subtle/ignorant in my summary search that caused this?