Solved: HiddenPostProcess limitations

fk319 · ‎10-26-2010

I had 5 summary indexes that I was able to compress into one. It turns out my final index takes about 1/4 of the space.

The problem is that I have about 400 summary events per minute, and I would like to run one search and then just summarize it in each of the 5 charts.

I am only able to process about 24 minutes, whereas I would like to process about 4 hours.

Everything was working fine when I had 5 independent searches, but when I started using HiddenSearch/HiddenPostProcess I started losing data.

I seem to be hitting the 10,000-event limit, and I do not know how to construct my query to get around this issue.

Any ideas?

sideview · ‎10-26-2010

The basic idea is to have the base search never return raw events, but instead end in a | stats count, sum(someField) by foo, bar, baz, bat, where each of the fields you're interested in is represented there somewhere.

The reason is that the stats search will compress the number of rows down a lot (you almost certainly want to put in a bucket command before the stats if you need _time), and even if it doesn't compress it much, stats isn't subject to any limitation on the number of rows, so they'll all be there.
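As a rough sketch of that pattern (not a tested dashboard), the base search in HiddenSearch might look something like the following. The index name my_summary and the one-minute span are assumptions for illustration only; the field names IP, Method, RunTime and ReturnCode are the ones mentioned elsewhere in this thread.

```
index=my_summary
| bucket _time span=1m
| stats count, sum(RunTime) as total_runtime by _time, IP, Method, ReturnCode
```

Each HiddenPostProcess would then re-aggregate those rows rather than the raw events, summing the pre-computed counts instead of counting again. Assuming the base search above, two of the charts could use something like:

```
timechart sum(count) by Method
```

```
stats sum(total_runtime) as total_runtime, sum(count) as count by ReturnCode
| eval avg_runtime = total_runtime / count
```

Note that an average has to be rebuilt from the summed runtime and the summed count; averaging per-minute averages would weight quiet minutes the same as busy ones.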

1) Check out the app 'ui examples for 4.1', which has a view under Advanced XML called 'Using postProcess on dashboards'. That view has a lot more discussion and advice around these issues.

2) And a lot of the same items are discussed here in the docs: http://docs.splunk.com/Documentation/Splunk/4.1/Developer/PostProcess

Although notably the docs only seem to explain half of the reason for using the stats clause in the base search.

baddogdown · ‎11-29-2012

The above link http://www.splunk.com/base/Documentation/latest/Developer/PostProcess is broken. Please can someone fix it.

fk319 · ‎11-02-2010

We upgraded to 4.1.5, and the 50,000 limit was changed.

fk319 · ‎10-27-2010

Nick, that app has some good info, but it does not help me in my case. I will just have to use multiple queries.

fk319 · ‎10-26-2010

OK, I remember now: when I used 'stats', I got the leftmost part of the graph, and when I used 'fields', I got the rightmost part.

The 5 queries are from the same data, but I am presenting the data in different ways: IP, Method, RunTime and ReturnCode. For each of these I also present a second graph in which I group the results a bit.

As for the bucket, I will do that in my next view, where I expand my time window.

I have reviewed your link in 2), but I have not located 1) yet.

Thanks.....
