Reporting

Running outputcsv without append option

keerthana_k
Communicator

Hi

We are running an outputcsv command in hourly intervals through a python script. We have not mentioned append option in the query. I would like to know what should be the expected behavior of Splunk. Will the csv file be overwritten every hour? Will the headers alone be retained? Please clarify.

Thanks in Advance.

1 Solution

sideview
SplunkTrust

The given CSV will be overwritten every time outputcsv runs, headers and all. The new headers will simply match the fields in the new results.

It is the same for the outputlookup command.

Also, in a lot of real-world use cases, using the append option on outputcsv itself can produce a lot of duplicates. It is often better to do the appending separately, with a little search language to remove the duplicates as appropriate.

Here is a simple example, where the CSV has a primary key called 'user' and simply maps each user to a "group" field.

<search terms to get the "new" rows mapping users to groups> | stats last(group) as group by user | append [| inputcsv mycsv] | stats first(group) as group by user | outputcsv mycsv

As you can see, each time the search runs it updates the users whose group has changed without duplicating them. It's a very simple example with only two fields, but with a little more attention to the stats commands the same technique generalizes. Note that it's better to put the inputcsv command inside the append subsearch; if you put the actual search in the append instead, you increase the chances of hitting append's limits on execution time or number of rows.
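Since the original question mentions driving the search from a Python script, here is a minimal sketch of the same merge-and-deduplicate idea done on the Python side instead of in SPL. The function name `merge_rows` and the sample user/group data are hypothetical, purely for illustration; the point is that new rows win per key while untouched old rows are retained, mirroring the stats-first-by-user pipeline above.

```python
import csv
import io

def merge_rows(new_rows, existing_csv_text, key="user"):
    """Merge new rows into an existing CSV's rows, keeping the newest
    value per key -- the same effect as the SPL pipeline above:
    updated users are overwritten, everyone else is carried forward."""
    merged = {}
    # Load the old rows first...
    for row in csv.DictReader(io.StringIO(existing_csv_text)):
        merged[row[key]] = row
    # ...then let new rows overwrite any matching key.
    for row in new_rows:
        merged[row[key]] = row
    return list(merged.values())

# Hypothetical data: the existing lookup maps users to groups.
existing = "user,group\nalice,admins\nbob,users\n"
new = [{"user": "alice", "group": "ops"},
       {"user": "carol", "group": "users"}]

result = merge_rows(new, existing)
# alice is updated, bob is kept, carol is added -- no duplicates.
```

You would then write `result` back out with csv.DictWriter, which (like outputcsv without append) rewrites the file, headers and all, on every run.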

