Getting Data In

Splunk Connection Disconnect

ykpramodhcbt
Path Finder

Hi All,

We have 3 search heads in a search head cluster, which are mapped to an ELB that has an Azure App Proxy over it. When we access Splunk through the App Proxy's URL, we get the message "Disconnected from Splunk Server" after 1 hour. The ELB stickiness settings are set to 8 hours. Our use case is that the dashboard will be left open for a long duration.

However, when we open the respective search head directly through http://DNS:8000, the connection seems to be fine.

Here is a high-level architecture -

[Image: high-level architecture diagram]

regards
Pramodh

0 Karma

nickhills
Ultra Champion

I am just checking - Do you mean ELB as in 'AWS Elastic Load Balancer'?
I only ask because you mention Azure.
Assuming yes..

The stickiness duration is set in the cookie, which returns you to the ELB-backed node you were using on your last visit. However, it has no effect on the connection held on the front end of the ELB.

An ELB also has an idle timeout, which is the period of time the ELB will hold a connection open without receiving any data from the client system (I assume the proxy in your case). By default, this is set to 60 seconds, but it can be increased to 3600 in the ELB settings.
It sounds like your ELB has been configured to the maximum, so it's possible you're bouncing off the hard limit.
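
For reference, the idle timeout described above can also be set through the AWS API rather than the console. The following is only a minimal sketch, assuming a Classic Load Balancer, the `boto3` library, and configured AWS credentials; the load balancer name `splunk-shc-elb` is a placeholder, not something taken from this thread.

```python
def idle_timeout_attributes(seconds):
    """Build the attribute payload for a Classic ELB idle timeout."""
    # As noted above: the default is 60 seconds and the ceiling is 3600.
    if not 1 <= seconds <= 3600:
        raise ValueError("idle timeout must be between 1 and 3600 seconds")
    return {"ConnectionSettings": {"IdleTimeout": seconds}}


def set_idle_timeout(name, seconds):
    """Apply the idle timeout to the named Classic ELB."""
    import boto3  # imported lazily so the helper above works without boto3

    elb = boto3.client("elb")  # "elb" is the Classic Load Balancer API
    return elb.modify_load_balancer_attributes(
        LoadBalancerName=name,
        LoadBalancerAttributes=idle_timeout_attributes(seconds),
    )


# Example call (hypothetical load balancer name):
# set_idle_timeout("splunk-shc-elb", 3600)
```

Note that this changes only how long an idle connection is held; it is independent of the stickiness cookie duration.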

Now with all of that said - if the SH is quiet, it makes no sense to keep connections open as it just ties up resources.

However, if I understand correctly, you have a proxy which is sticky to the ELB. The ELB is landing requests onto your SH servers.
If your proxy is sticky, you will only be using 1/3 of the ELB's capacity, and will be pinning all of your requests to one AZ.

Some testing is required, but I have had very good results from accessing ELBs directly (i.e. with no proxy), although your use case may require the proxy.

You could try disabling stickiness and reducing the idle timeout on the ELB, then configure the proxy to use keep-alives, and make sure it's using a DNS name for the ELB (one which is correctly set up as an AWS alias, not a normal CNAME). If you're not using the ELB for SSL offload, I would just use the AWS-provided ELB FQDN.

But.. if you can, I would get rid of the proxy!
:)

If my comment helps, please give it a thumbs up!
0 Karma

ykpramodhcbt
Path Finder

Thanks for sharing the comment.

  1. Yes, it is an Azure App Proxy layer over an Amazon AWS ELB. The App Proxy layer is there for enterprise security.
  2. The dashboard is planned such that it will be left open on a Display monitor for a long duration. That is the reason

With the stickiness attribute on the ELB, we wanted to keep the connection to the underlying server open without any disconnect.

The search heads are configured with SSL certificates and the Azure App proxy authenticates based on certificate to determine that the site is trusted.

Please let us know whether the same can be achieved some other way.

0 Karma

nickhills
Ultra Champion

That's not what stickiness is for.

Stickiness just means, 'if I was using this node before, when I come back to it in 20 minutes put me back on the same server'.

Is your dashboard refreshing itself every few mins?

If my comment helps, please give it a thumbs up!
0 Karma

ykpramodhcbt
Path Finder

Yes, it refreshes every few minutes.

0 Karma

nickhills
Ultra Champion

I wonder if your proxy is ignoring the stickiness cookie?
If you're reloading the dashboard every few minutes, it seems unlikely that it's idling, unless it's ignoring the cookie and sending requests to the other servers?

You might want to correlate the entries in the Splunk access logs to see which SH is serving requests when the disconnect happens.
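
One way to do that correlation offline is to count matching requests per minute on each search head. The following is a rough sketch, assuming the access log lines carry an Apache-style bracketed timestamp and that all search heads log in the same timezone; the function name, the host labels, and the `needle` substring (for example the dashboard's URL path) are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Matches a bracketed Apache-style timestamp such as
# [12/Mar/2019:10:15:32.123 +0000]; fractional seconds are optional.
TIMESTAMP = re.compile(
    r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2})(?:\.\d+)? [+-]\d{4}\]"
)


def requests_per_minute(lines_by_host, needle):
    """Count log lines containing `needle`, keyed by (minute, host)."""
    counts = Counter()
    for host, lines in lines_by_host.items():
        for line in lines:
            if needle not in line:
                continue
            match = TIMESTAMP.search(line)
            if match is None:
                continue  # skip lines without a recognizable timestamp
            ts = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
            # The UTC offset is ignored (same-timezone assumption).
            counts[(ts.strftime("%Y-%m-%d %H:%M"), host)] += 1
    return counts


def print_timeline(counts):
    """Print one line per (minute, host), in time order."""
    for (minute, host), n in sorted(counts.items()):
        print(f"{minute}  {host}  {n}")
```

If the host serving the dashboard changes around the one-hour mark, that would point at the stickiness cookie not being honoured rather than at an idle timeout.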

Another good test is to point a browser at the ELB and see how that behaves after 60 mins.
I don't like to start early finger pointing, but my money is on the proxy!

If my comment helps, please give it a thumbs up!
0 Karma

ykpramodhcbt
Path Finder

Thanks for sharing your insights.

Yes, that is one test we are running. Open from both ELB and Proxy and see where the timeout is happening.

0 Karma