We are facing an issue when the Website Monitoring app has URLs in its configuration that time out on connection: they appear to block the other monitors from running on schedule.
Scenario: If I have 100 URLs running every 5 minutes that all return some status code (200, 401, 404, etc.), everything works perfectly on schedule.
But when I add a couple of URLs to the same inputs file that return a timeout error, they block all the other monitors from running on schedule. The 5-minute schedule breaks down and the monitors run at irregular intervals (every 10 minutes, 18 minutes, etc.).
Is there a workaround for this scenario?
This is due to the fact that the Website Monitoring input currently runs all of the inputs under a single instance. I have a couple of options in mind that would prevent this sort of issue. I am investigating further in http://lukemurphey.com/issues/1524.
Update:
I have a version of Website Monitoring with inputs re-written to use multi-threading. It isn't released yet as I am still doing testing but I think I will likely have a new version available within a few days. I was able to reproduce the issue you observed but the new version has no problem executing all of the inputs while still keeping up.
Update [2]:
Version 2.0 is up on Splunk-base. I don't have it set as the default yet so you will have to manually select it from the version drop-down. I have it deployed in production and will make it the default version once I run it for a few more days.
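To illustrate why a few timing-out URLs can delay every other check, and how running checks on threads avoids that, here is a minimal sketch. It is not the app's actual code: `check`, `run_sequential`, `run_threaded`, `timed`, and `SIMULATED_TIMEOUT` are hypothetical names, and the network call is replaced by `time.sleep` so the example runs without any real URLs.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Assumption: a short stand-in for a real connection timeout (seconds).
SIMULATED_TIMEOUT = 0.3


def check(url, delay):
    """Hypothetical stand-in for one URL check; `delay` simulates how long it takes."""
    time.sleep(delay)
    status = "timeout" if delay >= SIMULATED_TIMEOUT else 200
    return url, status


def run_sequential(checks):
    """One check at a time: every slow URL delays all the checks behind it."""
    return [check(url, delay) for url, delay in checks]


def run_threaded(checks, workers=8):
    """Checks run on a thread pool, so a slow URL only ties up its own worker."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(check, url, delay) for url, delay in checks]
        # Collect in submission order so results line up with the inputs.
        return [future.result() for future in futures]


def timed(fn, checks):
    """Run `fn(checks)` and return (results, elapsed_seconds)."""
    start = time.monotonic()
    results = fn(checks)
    return results, time.monotonic() - start


if __name__ == "__main__":
    checks = [("http://fast%d.example" % i, 0.01) for i in range(6)]
    checks += [("http://slow%d.example" % i, SIMULATED_TIMEOUT) for i in range(2)]
    _, sequential_time = timed(run_sequential, checks)
    _, threaded_time = timed(run_threaded, checks)
    print("sequential: %.2fs, threaded: %.2fs" % (sequential_time, threaded_time))
```

In the sequential case the total time is the sum of all the delays, so each timing-out URL adds its full timeout to the cycle. With a thread pool the total is closer to the slowest single check, as long as there are enough workers to cover the slow ones.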
OK. Thanks Luke, when can we expect v2.0? 🙂
Haven't figured it out yet. I just made that version based on this Answers post. This issue is the most significant change slated for 2.0 by far.
I'll let you know once I figure out how I'm going to handle this.
Hi Luke, we just saw that the issue http://lukemurphey.com/issues/1524 is resolved, but we don't see v2.0 in the downloads. Can you please check and upload the v2.0 file?
It's on Splunk-base: https://splunkbase.splunk.com/app/1493/
Look for the dropdown at the right side of the release notes, select 2.0, and then click the download button.
Thanks Luke, it looks like it's an issue with IE and Firefox. I tried it in Chrome and was able to download and test. It is running on schedule now, even with high volume.
One observation: for the URLs where the connection failed, response_code is blank. The previous version showed "connection failed" there, but it's blank now. Any idea?
Where is it showing up blank? I have seen that issue on the Status Overview page but I thought I had fixed it there.
It is showing on the Status Overview page. I am getting 400, 404, 200 for the others, but the failed ones show a blank in the response_code column.
OK, I'll test it tonight. I know I saw this issue on my Windows test box, but I was sure I had fixed it. Something must have slipped through.
It turns out I had fixed this, but in a later build than the one I had uploaded to Splunk-base. I have now uploaded the later build to Splunk-base, so that issue should be fixed.
OK, thanks Luke.
If I understand correctly, you're seeing inputs get hung up on URLs that time out (since the other inputs have to wait for that input to time out before they can execute).
Do I have that right?
Correct. I don't have any issue when all of my URLs return some status code. Is there a workaround?
I don't have a workaround yet. I'm looking into a change in the backend that would prevent the inputs from hanging one another up.