The bundle on the search head has grown to 776 MB, and as a result it is not getting pushed.
How can we reduce the bundle size? The files reappear in the bundle even if we delete them from the tar.
Why are two bundles of roughly the same size being created on their own, even when we are not working on the search head?
Please provide examples of a replication blacklist that can be made very specific.
Normally the last 5 copies of the bundle are dropped onto peers, so it can end up consuming a fair bit of space.
Have you been able to work out what the large files are - are they lookups, and what's the path/name of the files?
If it were a big lookup - e.g. yourApp/lookups/big.csv -
then in distsearch.conf in yourApp/local
[replicationBlacklist]
bigFile = big.csv
would do the trick
edit: to add link
http://docs.splunk.com/Documentation/Splunk/latest/DistSearch/Limittheknowledgebundlesize
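Since the question asked for very specific examples: per the wildcard syntax described in the docs linked above, blacklist patterns are matched against paths relative to $SPLUNK_HOME/etc. The app and file names below are hypothetical - a sketch, not verified against your environment:
[replicationBlacklist]
# exclude one specific file in one specific app
excludeBigCsv = apps/yourApp/lookups/big.csv
# exclude every CSV under any app's lookups folder
# ('...' matches across path separators, '*' matches within one path segment)
excludeAllCsv = apps/.../lookups/*.csv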
There are tons of files in the bundle. Do you know of any way to view the contents of the tar in descending order of size?
It will be the contents of your SH apps.
You should be able to find prospective big-file candidates by looking through the app folders on the SH.
Large lookup files are normally a good candidate, so you could blacklist *.csv - but you would lose the ability to execute remote lookups.
Another thing to check is that you don't have any binaries or archive files in there by mistake.
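On the earlier question about listing the tar by size: the bundle is just a tar file, so standard shell tools work. The path below is an assumption (bundles normally land under $SPLUNK_HOME/var/run on the SH), and with GNU tar the size is the third field of the verbose listing - adjust to your environment:
# show the 20 largest files in the bundle
tar -tvf $SPLUNK_HOME/var/run/*.bundle | sort -rn -k3 | head -20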
Yeah, I can see some huge lookup files.
Will there be any impact if I blacklist the lookups in the bundle?
If they are used in lookups, you would have to run them on the SH, not the remote peers:
local
Syntax: local=<bool>
Description: If local=true, forces the lookup to run on the search head and not on any remote peers. Default: false.
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/Lookup
Your best bet is to blacklist the huge ones, and see what impact that has on size/usability
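For example, a hypothetical search forcing a lookup to run on the SH (lookup and field names are made up for illustration):
index=web sourcetype=access_combined
| lookup local=true user_info uid OUTPUT username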
Yeah, if we upload a lookup to the search head, it will not get pushed to the search peers, right?
Should we write local=true in the query while using the command?
How do we prevent the lookup from getting pushed to the search peers?
In a distributed environment, lookups are shared with all the peers, so lookup processing is distributed across them.
By default all lookups will do this, including lookups you manually upload.
Generally speaking, large lookup files are discouraged in favour of summary indexing, but this is not always convenient if the lookup data is coming from a remote source (or another search).
What I do is blacklist the big CSV in distsearch.conf, and then on the SH run a scheduled search which does |inputlookup local=true ...|collect...
to write it to a summary index. My main searches then use the SI rather than the lookup.
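A minimal sketch of that scheduled search, assuming a hypothetical lookup file big.csv and a summary index named summary:
| inputlookup local=true big.csv
| collect index=summary
The main searches would then read from index=summary instead of calling the lookup on the peers.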