Deployment Architecture

Error "The maximum number of concurrent running jobs for this historical scheduled search on this cluster has been reached" after upgrade to 8.0.1

gots
Path Finder

We have a Splunk search head cluster. After upgrading from 7.2.6 to 8.0.1 we began to get errors like:
"12-18-2019 16:47:00.816 +0300 INFO SavedSplunker - savedsearch_id="nobody;;", search_type="scheduled", user="admin", app="", savedsearch_name="", priority=default, status=skipped, reason="The maximum number of concurrent running jobs for this historical scheduled search on this cluster has been reached", concurrency_category="historical_scheduled", concurrency_context="saved-search_cluster-wide", concurrency_limit=1, scheduled_time="

Changes in limits.conf didn't give results. We can't raise concurrency_limit above 1.

Current values of the most relevant parameters (in my opinion):

$ /opt/splunk/bin/splunk show config limits | egrep 'max_rt_search_multiplier|max_searches_per_cpu|shc_role_quota_enforcement|shc_syswide_quota_enforcement|base_max_searches'

shc_role_quota_enforcement=false
shc_syswide_quota_enforcement=false
base_max_searches=6
max_rt_search_multiplier=1
max_searches_per_cpu=8
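
For reference, the limits.conf spec describes how these values combine into the per-instance cap on historical searches; a rough sketch of the arithmetic (the CPU count below is an assumption, not our actual hardware):

# max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
# e.g. with 16 CPUs per search head: 8 x 16 + 6 = 134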

It seems that SPL-73386 from https://docs.splunk.com/Documentation/Splunk/latest/ReleaseNotes/KnownIssues is the most relevant for us, but as you can see we set the user to admin and this didn't help.

Maybe the error is not linked to that bug, but to changes in some default values?

1 Solution

gots
Path Finder

We found the problem - it came from our L3 balancer.
The balancer could open the first TCP session to search_head_1, but then send the HTTP packets of another TCP session to search_head_2.
With version 7.x we didn't have this problem. I don't know why - maybe it is the connection between the balancer and the search head web server or something else, but the error appeared with the 8.x version.

So, if you are getting errors like "The search job terminated unexpectedly." on the web interface of a search head behind the balancer, check your balancer and try to change it from L3 (IP level) to L7 (HTTP level).
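
For illustration only, a minimal HAProxy sketch of what an L7 setup with session stickiness could look like (the hostnames, ports, and cookie name are assumptions):

frontend splunkweb
    bind *:8000
    mode http
    default_backend search_heads

backend search_heads
    mode http
    balance roundrobin
    cookie SHCNODE insert indirect nocache   # pin each browser session to one search head
    server sh1 search_head_1:8000 check cookie sh1
    server sh2 search_head_2:8000 check cookie sh2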



jayregu17
Loves-to-Learn Everything

Hi

Could you please explain how you debugged this issue?

Thanks

Jay


codebuilder
Influencer

It appears to me as if the settings within limits.conf are not being honored. Have you verified that the upgrade process did not alter file or directory permissions? Meaning a change from the splunk user to root, for example.

Assuming you are running Splunk as the splunk user, I would run a recursive chown and cycle the SHC.

chown -RP splunk:splunk /opt/splunk
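
Cycling the SHC members could then be done with a rolling restart from the cluster captain, e.g. (a sketch, assuming a standard search head cluster):

/opt/splunk/bin/splunk rolling-restart shcluster-members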
----
An upvote would be appreciated and Accept Solution if it helps!

gots
Path Finder

With the command:

/opt/splunk/bin/splunk show config limits

I am checking the effective values of the variables in limits.conf.

But anyway, when I check the permissions, everything is OK:
$ find /opt/splunk -name limits.conf -exec ls -la {} \;

-rw-r--r-- 1 splunk splunk 35 Dec 18 18:02 /opt/splunk/etc/apps/<application>/default/limits.conf
-r--r--r-- 1 splunk splunk 42 Nov 28 02:31 /opt/splunk/etc/apps/SplunkLightForwarder/default/limits.conf
-r--r--r-- 1 splunk splunk 43109 Nov 28 02:31 /opt/splunk/etc/system/default/limits.conf
-rw-r--r-- 1 splunk splunk 711 Dec 18 13:28 /opt/splunk/etc/system/local/limits.conf
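
For completeness, btool can also show which file each effective value comes from (the egrep pattern just narrows the output):

/opt/splunk/bin/splunk btool limits list --debug | egrep 'base_max_searches|max_searches_per_cpu'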

codebuilder
Influencer

I believe the limit you are hitting is in savedsearches.conf, not limits.conf

max_concurrent = <unsigned integer>
* The maximum number of concurrent instances of this search that the scheduler
  is allowed to run.
* Default: 1

See https://docs.splunk.com/Documentation/Splunk/8.0.0/Admin/Savedsearchesconf
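
For example, a minimal sketch that raises the limit for one scheduled search in an app's local savedsearches.conf (the stanza name here is hypothetical):

# etc/apps/<application>/local/savedsearches.conf
[My Scheduled Search]
max_concurrent = 2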

----
An upvote would be appreciated and Accept Solution if it helps!

gots
Path Finder

Hm, interesting parameter.
I will try it.

Do you know of something like "max_concurrent" for ad hoc searches?

And maybe you know, or can suggest, why the search head cluster would try to run more than one instance of a search at a time?


codebuilder
Influencer

My guess is that the scheduler is trying to kick off a new search before the previous one has completed. You can increase the interval between searches, or implement skewing.

https://docs.splunk.com/Documentation/Splunk/8.0.0/Report/Skewscheduledreportstarttimes
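
For example, a sketch of skewing via the allow_skew setting in the [scheduler] stanza of limits.conf (assuming that is the mechanism the linked page describes; the 10% value is illustrative):

# etc/system/local/limits.conf
[scheduler]
allow_skew = 10%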

----
An upvote would be appreciated and Accept Solution if it helps!

codebuilder
Influencer

Ad hoc search limits can be configured/modified at the role level via the web UI.
The default values are pretty low.

e.g.
Settings > Access controls > Roles > admin
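
If you prefer configuration files, the equivalent role quotas also live in authorize.conf; a sketch with illustrative values (not recommendations):

# etc/system/local/authorize.conf
[role_admin]
srchJobsQuota = 50
rtSrchJobsQuota = 20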

----
An upvote would be appreciated and Accept Solution if it helps!

SamHTexas
Builder

Hello, do you know an SPL query that would help me find a list of my saved but skipped searches in ES, plus the reason for the failure?
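
A sketch that may help, assuming the scheduler logs in the _internal index carry the status and reason fields shown in the error above:

index=_internal sourcetype=scheduler status=skipped
| stats count by savedsearch_name, app, reason
| sort - count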
