Getting Data In

How to log results to an index?

ejwade
Contributor

I'm trying to figure out why you would use the various methods for sending search results to an index. Note that I'm not trying to speed up searches; I'm just looking at methods for writing search results to an index.

  1. Pipe to the "collect" command with the default "stash" source type, supposedly avoiding license usage.
  2. Pipe to the "collect" command with a specified source type, incurring license usage.
  3. Pipe to the "sendalert" command with the "logevent" alert action specified, or, if it's a saved search or alert, use the Log Event alert action.

Effectively, I think they all do the same thing. Option 1 seems like a sneaky way around license usage. However, I think "collect" can only be used with summary indexes. Any thoughts on this?
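To make it concrete, here is a rough sketch of what I'm comparing (index, sourcetype, and parameter names are just placeholders, and I'm assuming the Log Event action takes its event/index parameters via sendalert in this form):

  1. ... | collect index=my_summary
  2. ... | collect index=my_summary sourcetype=my_custom_sourcetype
  3. ... | sendalert logevent param.event="something happened" param.index="my_summary"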

1 Solution

bowesmana
SplunkTrust
SplunkTrust

There is another way to write to an index (as @richgalloway says, there is no difference between a normal index and a summary index; "summary index" is just a notional label for an index that typically contains events summarising other data).

If you have a scheduled saved search, you can enable summary indexing from the Edit Summary Indexing option in the Searches, Reports, and Alerts view for that saved search.

https://community.splunk.com/t5/Splunk-Search/How-to-create-summary-index/m-p/632465

This is effectively the same as using collect.
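Behind the scenes that option just sets the summary index action on the scheduled search, so in savedsearches.conf it looks roughly like this (stanza name, schedule, search, and index are placeholders):

[my scheduled search]
enableSched = 1
cron_schedule = */15 * * * *
search = index=web | stats count by status
action.summary_index = 1
action.summary_index._name = my_summary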

Note that when using collect, the documentation is unclear, and in places wrong, about how the time of the collected event gets defined.

I have previously posted this information about collect and time:

-----------------------------------------------------------------------------------------------

Any _time field is totally ignored in the collected event. The best way to get the time into your event is to create _raw and put a time field at the start of it. If _raw exists, it is used as the collected data. You can see the data that would be written before it is ingested by setting spool=f and looking at the raw stash file created on the file system.
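For example, a minimal way to inspect what gets written (index name is a placeholder; with spool=f the stash file should be left under $SPLUNK_HOME/var/run/splunk rather than being indexed):

| makeresults
| eval _raw="_time=" . tostring(now()) . ", message=\"test event\""
| collect index=my_summary spool=f testmode=f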
 
Generally the use of addtime will put the info_* values at the START of the line in the data, so when the file is ingested, these are the first "timestamps" to be found.
 
Note that if you have a saved search that runs on a schedule, it will put an additional field, search_now, at the start of the data.
 
These are true:

* The _time field is NEVER passed if no _raw field is present.
* Using addinfo=t will always PREPEND the fields to the summary row, thus destroying the ability to define _time.
* Scheduled saved searches will always PREPEND the search_* fields, again destroying the ability to define _time.
We normally use this construct in a macro to collect to indexes:

| eval _raw=printf("_time=%d", desired_time_field)
| fields - desired_time_field
| foreach *
    [| eval _raw=_raw.", <<FIELD>>=\"".if(isnull('<<FIELD>>'),"",'<<FIELD>>')."\""
     | fields - "<<FIELD>>" ] 
| collect index=bla source="bla" addtime=f testmode=f

 

and this can reliably be controlled as needed, whether the search is run ad hoc, as a report, or as a scheduled saved search.
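As a usage note, because the event is written as key="value" pairs, the fields come straight back with automatic extraction when you search the target index, e.g.:

index=bla source="bla"
| table _time *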

 

 


ejwade
Contributor

@bowesmana thank you so much. This was extremely helpful. We created a similar macro using your foreach logic. Thank you again!

richgalloway
SplunkTrust
SplunkTrust

The first two are nearly equivalent, the only difference being whether or not ingestion license is charged. There is nothing "supposed" about the stash sourcetype not incurring license usage; it's documented at https://docs.splunk.com/Documentation/Splunk/7.2.6/SearchReference/Collect#Syntax:~:text=By%20specif....

Unlike collect, the sendalert command does not write to an index.  As the command name implies, it sends an alert.  While it may be possible, I've never encountered an alert that writes to an index.

FTR, the collect command can write to any event index to which you have access.  It does not have to be a summary index.  Technically, all event indexes are the same, whether summary or not.

 

---
If this reply helps you, Karma would be appreciated.

ejwade
Contributor

Thank you, @richgalloway. That is good to know that the collect command can write to any event index; it does not have to be a summary index.

The sendalert command invokes an alert action. I'm using it to invoke logevent, which effectively leverages the Log Event alert action.
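For completeness, the ad hoc form I'm using looks roughly like this (index value is a placeholder, and I'm assuming the event and index parameters of the Log Event action carry over to sendalert as param.* arguments):

| makeresults
| sendalert logevent param.event="manual test event" param.index="my_summary"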
