We have an index containing access logs from multiple hosts and systems, with different sourcetypes.
When I try to enrich events with information from a dynamic lookup and save them in a summary index with the collect
command, I can't preserve the original source, sourcetype, and host, because the collect command's arguments are treated as literal text, not as field names.
For example, this search:
index=access sourcetype=*_type_access |
lookup xxx AS yyy |
collect index=enriched_access sourcetype=sourcetype
saves the results with the sourcetype set to the literal string "sourcetype", not the original sourcetype.
When I rename the sourcetype field first, the result is the same.
Where am I going wrong?
Hey!
Since I was searching for this topic/solution, I'll just add what I think is the right solution for this case.
To preserve the _time, host, source and sourcetype:
(...)
| collect index=main output_format=hec
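Applied to the search from the question, that would look roughly like the sketch below. The index name, lookup name and field names are taken from the question; output_format=hec is only available in newer Splunk releases, so check the collect documentation for your version before relying on it.

```
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| collect index=enriched_access output_format=hec
```

Note that no sourcetype, source or host argument is passed to collect here; with the hec output format the values from the original events are meant to be carried over.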
Copy the sourcetype value into another field, such as "origSourceType", and write that field to the summary index. You can then search the summary index on the origSourceType field.
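A minimal sketch of that idea, using the search from the question. The names origSourceType, origSource and origHost are just illustrative, and it is worth confirming afterwards that the new fields are actually present on the events in the summary index, since that depends on how collect writes them out.

```
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| eval origSourceType=sourcetype, origSource=source, origHost=host
| collect index=enriched_access
```

If the fields are retained, a search such as index=enriched_access origSourceType=*_type_access should then return the enriched events.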
Totally different approach: keep the lookup data in the lookup, enrich at search time, and skip indexing things twice through collect?
What you're doing feels quite wrong, considering collect would index _raw while the lookup is just adding fields. Have you checked that those lookup output fields are actually retained in the second index?
That being said, https://answers.splunk.com/answers/88926/modify-raw-collect-into-second-index-how-to-best-retain-hos...
Since there may be several sourcetypes, I would try the map command:
| metasearch index=access sourcetype=*_type_access | stats count by sourcetype | map [ search index=access sourcetype=$sourcetype$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$ ]
At least that works in theory; I haven't tested it, though it should work. I used the metasearch command for speed, and the stats command is just there to get the unique list of sourcetypes. Tstats might be a hair faster still, but I'm not spun up on that one /shrug. There are folks who are kinda anti-map, but it is a tool in the tool chest. What it does is this: for each result row from the initial search, map passes the sourcetype as a token to the inner search.
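For what it's worth, the tstats variant of the first stage might look like the sketch below. It is untested as well. The maxsearches value of 20 is an arbitrary example; map runs at most 10 searches by default, so it needs raising if there are more sourcetypes than that.

```
| tstats count where index=access sourcetype=*_type_access by sourcetype
| map maxsearches=20 [ search index=access sourcetype=$sourcetype$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$ ]
```

The tstats stage only produces the distinct sourcetype list; the map stage is unchanged from the search above.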
I tried this out with "host=$host$" in my collect statement and no dice.
Any other ideas?