Getting Data In

Decrease bundle size

nawazns5038
Builder

The bundle on the search head has grown to 776 MB and, as a result, it is not getting pushed.
How can we reduce the bundle size? The files are still present in the bundle even if we delete them from the tar.
Why are two bundles of about the same size getting created on their own, even when we don't work on the search head?

Please provide examples of a replication blacklist that can be very specific.

nickhills
Ultra Champion

Normally the last 5 copies of the bundle are dropped onto peers, so it can end up consuming a fair bit of space.
Have you been able to work out what the large files are - are they lookups, and what's the path/name of the files?

If it is a big lookup - e.g. yourApp/lookups/big.csv - then adding the following to distsearch.conf in yourApp/local:

[replicationBlacklist]
bigFile = apps/yourApp/lookups/big.csv

would do the trick. (The blacklist pattern is matched against the file path relative to $SPLUNK_HOME/etc.)

edit: to add link
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Limittheknowledgebundlesize

If my comment helps, please give it a thumbs up!

nawazns5038
Builder

There are tons of files in the bundle. Do you know of any way to view the contents of the tar in descending order of size?
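
For reference, one way to list the members of a tar archive in descending order of size (a minimal sketch; the bundle path shown is hypothetical and will differ on your system, and it assumes the knowledge bundle is a plain tar archive):

tar -tvf /opt/splunk/var/run/yourSH-1234567890.bundle | sort -k3 -rn | head -20

The third column of tar's verbose listing is the file size, so this prints the 20 largest members first.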


nickhills
Ultra Champion

It will be the contents of your SH apps.

You should be able to find prospective big-file candidates by looking through the app folders on the SH.
Large lookup files are normally a good candidate, so you could blacklist *.csv - but you would lose the ability to execute remote lookups.
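
If you did want to blacklist every CSV, a minimal sketch (the stanza name is arbitrary; this assumes lookups sit in each app's lookups folder, and that blacklist patterns are matched against paths relative to $SPLUNK_HOME/etc, with '*' not crossing directory separators):

[replicationBlacklist]
# hypothetical stanza: exclude all CSV lookups in every app from the bundle
noAppLookupCSVs = apps/*/lookups/*.csv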

Another thing to check is that you don't have any binaries or archive files in there by mistake.

If my comment helps, please give it a thumbs up!

nawazns5038
Builder

Yes, I can see some huge lookup files.

Will there be any impact if I blacklist the lookups in the bundle?


nickhills
Ultra Champion

If they are used in lookups, you would have to run them on the SH, not the remote peers:

local
Syntax: local=<bool>
Description: If local=true, forces the lookup to run on the search head and not on any remote peers.
Default: false

https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Lookup
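
For example, a minimal sketch of forcing a single lookup to run on the search head (the base search, the lookup definition name big_lookup, and the field names clientip/hostname are all hypothetical):

index=web sourcetype=access_combined
| lookup local=true big_lookup clientip OUTPUT hostname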

Your best bet is to blacklist the huge ones, and see what impact that has on size/usability.

If my comment helps, please give it a thumbs up!

nawazns5038
Builder

So if we upload a lookup on the search head, it will not get pushed to the search peers, right?

Should we write local=true in the query while using the lookup command?

How do we keep a lookup from getting pushed to the search peers?


nickhills
Ultra Champion

In a distributed env, the lookups are shared with all the peers, so lookup processing is distributed across all the peers.

By default all lookups will do this, including lookups you manually upload.

Generally speaking, large lookup files are discouraged in favour of summary indexing, but this is not always convenient if the lookup data is coming from a remote source (or another search).

What I do is blacklist the big CSV in distsearch, and then on the SH run a scheduled search which does |inputlookup local=true ...|collect... to write it to a summary index. My main searches then use the summary index rather than the lookup.
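
A rough sketch of that pattern (the lookup file big.csv, the summary index lookup_summary, and the source name are hypothetical; the summary index has to exist, and the search would be saved as a scheduled search):

| inputlookup big.csv
| collect index=lookup_summary source=big_lookup_snapshot

The main searches then read from index=lookup_summary instead of invoking the lookup on the peers.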

If my comment helps, please give it a thumbs up!