Is it possible to preserve sourcetype, host, and source when using the collect command?

gots
Path Finder

We have an index with access logs from multiple hosts and systems with different sourcetypes.
When I try to add information from a dynamic lookup to events and save them in a summary index with the collect command, I can't keep the original source, sourcetype, and host, because the collect command's arguments take literal text rather than field values.

For example, search:

 index=access sourcetype=*_type_access | 
 lookup xxx AS yyy |
 collect index=enriched_access sourcetype=sourcetype

saves the results with sourcetype set to the literal string "sourcetype" rather than the original sourcetype.
When I try to rename sourcetype, the result is the same.

Where am I going wrong?

glc_slash_it
Path Finder

Hey!

Since I was searching for this topic/solution, I'll just add what I think is the right solution for this case.

To preserve the _time, host, source and sourcetype:

(...)

| collect index=main output_format=hec
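
Applied to the search from the question, that might look like the sketch below. It is untested, assumes a Splunk version recent enough for collect to accept output_format (older releases don't have it), and reuses the placeholder lookup xxx and field yyy from the original post.

```spl
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| collect index=enriched_access output_format=hec
```

No sourcetype, host, or source argument is passed to collect here, since the point of the hec format, as far as I understand it, is that those values are carried over from each original event.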

 

------------
If this was helpful, some karma would be appreciated.

jvishwak
Path Finder

Copy the sourcetype value into another field, such as "origSourceType", and push that value into the summary index. You can then search the summary index by the origSourceType field.
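
A rough, untested sketch of that idea. The field names origSourceType, origHost, and origSource are only illustrative. The values are appended to _raw on the assumption that collect writes out _raw for events that have one, so fields that exist only at search time might otherwise be lost.

```spl
index=access sourcetype=*_type_access
| lookup xxx AS yyy
| eval _raw=_raw . " origSourceType=\"" . sourcetype . "\" origHost=\"" . host . "\" origSource=\"" . source . "\""
| collect index=enriched_access
```

If search-time key=value extraction picks up the appended pairs in the summary index, a search like this should then find the events (some_type_access is a made-up value):

```spl
index=enriched_access origSourceType="some_type_access"
```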

martin_mueller
SplunkTrust

Totally different approach: Keep the lookup data in the lookup, enrich at search time, skip indexing things twice through collect?

What you're doing feels quite wrong, considering collect would index _raw while the lookup is just adding fields - have you checked that those lookup output fields are actually retained in the second index?

That being said, https://answers.splunk.com/answers/88926/modify-raw-collect-into-second-index-how-to-best-retain-hos...

Runals
Motivator

Since there are perhaps several sourcetypes I would try the map command

| metasearch index=access sourcetype=*_type_access | stats count by sourcetype | map [ search index=access sourcetype=$sourcetype$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$ ]

At least that works in theory; I haven't tested it, but it should work. I used the metasearch command for speed, and the stats command is just there to get the unique list of sourcetypes. Tstats might be a hair faster still, but I'm not spun up on that one /shrug. There are folks who are kinda anti-map, but it is a tool in the tool chest. For each result line from your initial search, map passes the sourcetype as a token to the included search.
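
To carry host through as well, the same idea might be extended as in the sketch below, which is equally untested. It relies on two things: map can only substitute tokens for fields that are present in the rows it receives, so host has to be in the stats by clause, and map stops after 10 searches by default, so maxsearches is raised (100 is an arbitrary value here).

```spl
| metasearch index=access sourcetype=*_type_access
| stats count by sourcetype, host
| map search="search index=access sourcetype=$sourcetype$ host=$host$ | lookup xxx AS yyy | collect index=enriched_access sourcetype=$sourcetype$ host=$host$" maxsearches=100
```

This runs one search per sourcetype and host pair, so it can get slow when there are many combinations.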

gurlest
Path Finder

I tried this out with "host=$host$" in my collect statement, and no dice.

Any other ideas?
