Deployment Architecture

Error "The maximum number of concurrent running jobs for this historical scheduled search on this cluster has been reached" after upgrade to 8.0.1

gots
Path Finder

We have a Splunk search head cluster. After upgrading from 7.2.6 to 8.0.1 we began to get errors like:
"12-18-2019 16:47:00.816 +0300 INFO SavedSplunker - savedsearch_id="nobody;;", search_type="scheduled", user="admin", app="", savedsearch_name="", priority=default, status=skipped, reason="The maximum number of concurrent running jobs for this historical scheduled search on this cluster has been reached", concurrency_category="historical_scheduled", concurrency_context="saved-search_cluster-wide", concurrency_limit=1, scheduled_time="

Changes in limits.conf didn't give results. We can't raise concurrency_limit above 1.

Current values of the most relevant parameters (in my opinion):

$ /opt/splunk/bin/splunk show config limits | egrep 'max_rt_search_multiplier|max_searches_per_cpu|shc_role_quota_enforcement|shc_syswide_quota_enforcement|base_max_searches'

shc_role_quota_enforcement=false
shc_syswide_quota_enforcement=false
base_max_searches=6
max_rt_search_multiplier=1
max_searches_per_cpu=8
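
For reference, the limits.conf spec describes how these values combine into the per-instance cap on historical searches; a rough sketch of the arithmetic (the CPU count below is an assumption, not our actual hardware):

# max_hist_searches = max_searches_per_cpu x number_of_cpus + base_max_searches
# e.g. with 16 CPUs per search head: 8 x 16 + 6 = 134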

It seems that SPL-73386 from https://docs.splunk.com/Documentation/Splunk/latest/ReleaseNotes/KnownIssues is the most relevant for us, but as you can see we set the user to admin and this didn't help.

Maybe the error is not linked to that bug, but to changes in some default values?

1 Solution

gots
Path Finder

We found the problem - it came from our L3 balancer.
The balancer could open the first TCP session to search_head_1, but then send the HTTP packets of another TCP session to search_head_2.
With version 7.x we didn't have this problem. I don't know why - maybe it is the connection between the balancer and the search head web server or something else, but the error appeared with the 8.x version.

So, if you are getting errors like "The search job terminated unexpectedly." on the web interface of a search head behind the balancer, check your balancer and try to change it from L3 (IP level) to L7 (HTTP level).
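
For illustration only, a minimal HAProxy sketch of what an L7 setup with session stickiness could look like (the hostnames, ports, and cookie name are assumptions):

frontend splunkweb
    bind *:8000
    mode http
    default_backend search_heads

backend search_heads
    mode http
    balance roundrobin
    cookie SHCNODE insert indirect nocache   # pin each browser session to one search head
    server sh1 search_head_1:8000 check cookie sh1
    server sh2 search_head_2:8000 check cookie sh2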



jayregu17
Loves-to-Learn Everything

Hi

Could you please explain how you debugged this issue?

Thanks

Jay


codebuilder
Influencer

It appears to me as if the settings within limits.conf are not being honored. Have you verified that the upgrade process did not alter file or directory permissions? Meaning a change from the splunk user to root, for example.

Assuming you are running Splunk as the splunk user, I would run a recursive chown and cycle the SHC.

chown -RP splunk:splunk /opt/splunk
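
Cycling the SHC members could then be done with a rolling restart from the cluster captain, e.g. (a sketch, assuming a standard search head cluster):

/opt/splunk/bin/splunk rolling-restart shcluster-members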
----
An upvote would be appreciated and Accept Solution if it helps!

gots
Path Finder

With the command:

/opt/splunk/bin/splunk show config limits

I am checking the effective values of the variables in limits.conf.

But anyway, when I check the permissions, everything is OK:
$ find /opt/splunk -name limits.conf -exec ls -la {} \;

-rw-r--r-- 1 splunk splunk 35 Dec 18 18:02 /opt/splunk/etc/apps/<application>/default/limits.conf
-r--r--r-- 1 splunk splunk 42 Nov 28 02:31 /opt/splunk/etc/apps/SplunkLightForwarder/default/limits.conf
-r--r--r-- 1 splunk splunk 43109 Nov 28 02:31 /opt/splunk/etc/system/default/limits.conf
-rw-r--r-- 1 splunk splunk 711 Dec 18 13:28 /opt/splunk/etc/system/local/limits.conf
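
For completeness, btool can also show which file each effective value comes from (the egrep pattern just narrows the output):

/opt/splunk/bin/splunk btool limits list --debug | egrep 'base_max_searches|max_searches_per_cpu'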

codebuilder
Influencer

I believe the limit you are hitting is in savedsearches.conf, not limits.conf

max_concurrent = <unsigned integer>
* The maximum number of concurrent instances of this search that the scheduler
  is allowed to run.
* Default: 1

See https://docs.splunk.com/Documentation/Splunk/8.0.0/Admin/Savedsearchesconf
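
For example, a minimal sketch that raises the limit for one scheduled search in an app's local savedsearches.conf (the stanza name here is hypothetical):

# etc/apps/<application>/local/savedsearches.conf
[My Scheduled Search]
max_concurrent = 2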

----
An upvote would be appreciated and Accept Solution if it helps!

gots
Path Finder

Hm, interesting parameter.
I will try it.

Do you know of something like "max_concurrent" for ad hoc searches?

And maybe you know, or can suggest, why the search head cluster would try to run more than one instance of a search at a time?


codebuilder
Influencer

My guess is that the scheduler is trying to kick off a new search before the previous one has completed. You can increase the interval between searches, or implement skewing.

https://docs.splunk.com/Documentation/Splunk/8.0.0/Report/Skewscheduledreportstarttimes
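
For example, a sketch of skewing via the allow_skew setting in the [scheduler] stanza of limits.conf (assuming that is the mechanism the linked page describes; the 10% value is illustrative):

# etc/system/local/limits.conf
[scheduler]
allow_skew = 10%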

----
An upvote would be appreciated and Accept Solution if it helps!

codebuilder
Influencer

Ad hoc search limits can be configured/modified at the role level via the web UI.
The default values are pretty low.

e.g.
Settings > Access controls > Roles > admin
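
If you prefer configuration files, the equivalent role quotas also live in authorize.conf; a sketch with illustrative values (not recommendations):

# etc/system/local/authorize.conf
[role_admin]
srchJobsQuota = 50
rtSrchJobsQuota = 20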

----
An upvote would be appreciated and Accept Solution if it helps!

SamHTexas
Builder

Hello, do you know an SPL query that would help me find a list of my saved but skipped searches in ES, plus the reason for the failure?
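
A sketch that may help, assuming the scheduler logs in the _internal index carry the status and reason fields shown in the error above:

index=_internal sourcetype=scheduler status=skipped
| stats count by savedsearch_name, app, reason
| sort - count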
