Adding host field when summary indexing?

bojanz
Communicator

Hi,

I have a heavy search on multiple sources that I want to schedule to populate a summary index. I am basically interested in certain events so I want to populate the summary index with only those events. That way I can run searches on the summary index quickly as opposed to the normal index that contains hundreds of millions of events.

I can populate the summary index like this:

index=windows OR index=linux OR index=something my search | addinfo | collect index=mysummaryindex

This works fine, however the problem is that the host field is not saved so I don't know which host generated the event.

Is there a way to add the host field into the summary index as well? The marker option for collect just adds a certain string field which is not useful in this case.

0 Karma

bandit
Motivator

After further testing, this is my favorite solution.

Just add the following after your base search, and orig_host, orig_sourcetype, orig_source and orig_index will all be in your summary index :-)

 | rename _raw as orig_raw
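
For context, here is a rough sketch of how that rename might slot into the scheduled search from the question. The index names and the "my search" placeholder come from the original post. The orig_* field list in the read-back search is only what this answer reports, so treat it as an assumption to verify on your own Splunk version. Lines starting with # are notes, following the convention used elsewhere in this thread, and are not part of the search.

```spl
# populate the summary index; _raw is renamed so that collect writes the fields out as key=value pairs
index=windows OR index=linux OR index=something my search
| addinfo
| rename _raw as orig_raw
| collect index=mysummaryindex

# read back from the summary index and check that the original metadata survived
index=mysummaryindex
| table _time orig_host orig_sourcetype orig_source orig_index orig_raw
```

If orig_host does not show up in the second search, the behavior described above may not apply to your version.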
0 Karma

bandit
Motivator
# a much simpler solution that I got from Splunk guru "D" :-)
# turns out renaming the _raw field corrects the issue of missing some of the "orig" fields, e.g. orig_sourcetype
# this approach is probably not as relevant to Splunk 6, which has many automatic acceleration features
# note: the "| collect" command is optional; it is not needed if you are using the summary index checkbox in a saved search
index=other | rename _time as time | rename _raw as raw | stats count by time raw index host sourcetype source | collect index=collect
0 Karma

bandit
Motivator
# I was having trouble storing the raw event and the original host, sourcetype and source fields in a summary index: they were always overridden with the values from the host that runs the search populating the summary index. Here's one solution.

# step 1 - populate summary index
# search events from an index named "other", prepend the _time, host, sourcetype and source fields to the _raw field with "|" as a delimiter, and write the result into a summary index named "collect"
index=other | eval _raw=_time+"|"+host+"|"+sourcetype+"|"+source+"|"+_raw | collect index=collect

# step 2 - read from summary index named "collect"
# extract time, host, sourcetype and source fields that are stashed in the _raw field in the summary index named "collect"
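# the last group captures everything after the fourth "|", so "|" characters inside the original event are kept
index=collect | rex "(?s)^(?<time1>[^|]+)\|(?<host1>[^|]+)\|(?<sourcetype1>[^|]+)\|(?<source1>[^|]+)\|(?<raw1>.*)"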
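
# optional step 3 - put the extracted values back into the standard fields
The extracted values in step 2 land in new fields (time1, host1, and so on), so the rest of a search still sees the summary index's own host and sourcetype. One possible follow-up, sketched here as an assumption rather than part of the original answer, is to copy them back with eval before reporting. The final stats clause is just a hypothetical report to show the restored fields in use.

```spl
# append to the step 2 search, after the rex
 | eval _time=tonumber(time1), host=host1, sourcetype=sourcetype1, source=source1
 | stats count by host sourcetype
```

This only changes the field values at search time; the events stored in the summary index are not modified.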
0 Karma

ualbanytech
Path Finder

When writing to the summary index, Splunk should have created a new field, "orig_host", and populated it with the original host.

Does that not exist?

0 Karma

ualbanytech
Path Finder

Oh, yeah. I see what you're saying. I was using the stats command.
I'm trying to do something similar, but I also want to eliminate unwanted fields when I write to the summary index. No answer for me so far:

0 Karma

bojanz
Communicator

Nope - I don't see the orig_host field at all. I'm not sure if the collect command adds that, or if it is only added by the sistats/sichart/etc. commands.

0 Karma

MarioM
Motivator

did you try adding:

| fields host

in your search?

0 Karma

bojanz
Communicator

Yes, unfortunately that field does not get stored in the summary index.

It appears that the collect command only stores what's in the marker, and that can be only an arbitrary string.

0 Karma