Splunk App for VMware: Our Production vCenter has almost 2,000 VMs. Can I point two or more DCNs to one vCenter?

iatwal
Path Finder

I have built out the OVA DCN, bumped the cores up to 8, and doubled the RAM... understandably, we are still getting this error:

2016-11-07 16:03:05,757 ERROR [ta_vmware_collection_worker://gamma:17063] [getJob] job=job_4be4c1baa54611e69173005056ac7e38 of task=hostvmperf has expired and will not be run

I have 2 questions:

Can I point 2 or more DCNs to one vCenter?

Can I build my own DCN on a physical machine to ensure I get all my data?

I'm looking for some suggestions please.


iatwal
Path Finder

I ended up doubling mine to get all the data flowing. Also, go into your dashboards under Search and Reporting... you'll have a Hydra Framework dashboard there... it's pretty helpful.


prakash007
Builder

In my case, the scheduler is running on the Search Head...

1. When I read the docs, I don't find any configs related to hostvmperf_interval and hostvmperf_expiration in \etc\apps\Splunk_TA_vmware\local, but I'm thinking of putting all four changes (hostvmperf_interval, hostvmperf_expiration, hostinv_interval, hostinv_expiration) in \etc\apps\Splunk_TA_vmware\local under a [default] stanza.

2. Did you also change the vCenter timeout value?

Increase the timeout period in the vpxd file on your vCenter: using a text editor, open the vpxd.cfg file, located at C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\vpxd.cfg (C:\ProgramData\VMware\VMware VirtualCenter\vpxd.cfg on Windows 2008).
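For reference, the vpxd.cfg change is an XML edit. This is only a sketch: the element names and the value below follow VMware's KB-style examples and can differ between vCenter versions, so verify against VMware's own documentation before applying it.

```xml
<!-- Fragment to merge into vpxd.cfg, inside the existing top-level <config> element. -->
<!-- <task><timeout> is the vpxd task timeout in seconds; 2000 is an example value, -->
<!-- not a recommendation. -->
<config>
  <task>
    <timeout>2000</timeout>
  </task>
</config>
```

Restart the VMware VirtualCenter Server service after editing vpxd.cfg for the change to take effect.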


iatwal
Path Finder
  1. Yeah, you have to create the local directory and then override the values.

  2. We did end up increasing the timeout value here as well. Try #1 first to see what happens. Use the dashboard I mentioned above; it's a big help.


prakash007
Builder

I made these changes on the Search Head where the scheduler was running; I still see these job expirations on the DC nodes when I look at the Hydra Framework dashboard...

2017-05-09 10:59:51,007 ERROR [ta_vmware_collection_worker://eta:26954] [getJob] job=job_2016307834e011e7be7aecf4bbd1db64 of task=hostvmperf has expired and will not be run

Do I need to make these changes on the DCNs too?


iatwal
Path Finder

Try increasing the timeout value (towards the bottom of this page):

https://docs.splunk.com/Documentation/VMW/3.3.2/Installation/TroubleshoottheSplunkAppforVMware

Problem

Incomplete or no data coming from vCenters that are properly configured and that a DCN connects to. Data collection tasks are failing and/or connections between the DCN and vCenter are closing before all data is transferred. This could be due to one of two issues:

- The collection tasks are taking longer than the vCenter and app expect.
- Collection intervals are currently overloading your Data Collection Nodes (DCNs) and your vCenters.

Resolution

Change collection intervals to reduce the load on your Data Collection Nodes (DCNs) and your vCenters.

Change the time interval for your host inventory job:
1. On the instance where your scheduler is running, navigate to \etc\apps\Splunk_TA_vmware\default.
2. Open the ta_vmware_collection.conf file.
3. Change hostinv_interval and hostinv_expiration from the 900-second default to a larger number (maximum 2700 seconds). Keep hostinv_interval and hostinv_expiration at the same number of seconds.
4. Save your changes and exit.

Change the time interval for host performance data:
1. On the instance where your scheduler is running, navigate to \etc\apps\Splunk_TA_vmware\local.
2. Open the ta_vmware_collection.conf file.
3. Change hostvmperf_interval and hostvmperf_expiration from the 180-second default to a larger number (maximum 1200 seconds). Keep hostvmperf_interval and hostvmperf_expiration at the same number of seconds.
4. Save your changes and exit.
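Putting both changes together, the override file on the scheduler instance might look something like the sketch below. The values shown are just the documented maxima, not recommendations; tune them to your environment.

```ini
# $SPLUNK_HOME/etc/apps/Splunk_TA_vmware/local/ta_vmware_collection.conf
# Example override only -- values are the documented maxima.
[default]
# Host performance data: default 180s, maximum 1200s.
# Keep interval and expiration at the same number of seconds.
hostvmperf_interval = 1200
hostvmperf_expiration = 1200

# Host inventory job: default 900s, maximum 2700s.
hostinv_interval = 2700
hostinv_expiration = 2700
```

Restart Splunk on the scheduler instance after saving so the new intervals take effect.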


Masa
Splunk Employee

Some users are using physical machines with 32 CPU cores for DCNs. You can have multiple DCNs for one vCenter environment; the VMware app's scheduler will take care of which DCN collects which data type.

With one vCenter and 2000 VMs, the vCenter's responses to API calls may often be slow, or the vCenter may sometimes hit its own timeout.


iatwal
Path Finder

We're still getting errors... we have a case open with Splunk. We upgraded to 3.3.1 and are still getting errors. I'll keep this thread updated.


prakash007
Builder

Did you get any resolution for this issue? We are seeing the same error you mentioned above.

We're on Splunk App for VMware v3.3.2 with 5 DCNs pointing to the vCenter. I can see the data in search (sourcetype=vmware:perf*), but the dashboard on the home page says No data.
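When raw events are searchable but the home page shows No data, one common cause is the data landing in an index the app's dashboards aren't reading from. A quick sanity check, as a hedged sketch in plain SPL (no app-specific macros assumed), is to list which indexes the perf sourcetypes actually populated:

```
| tstats count where index=* sourcetype=vmware:perf* by index sourcetype
```

If the data sits in a non-default index, make sure that index is visible to the role and macros the app's dashboards use.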


iatwal
Path Finder

A couple of things you'll need to make sure of: all of your add-ons are installed on the indexers/search heads, and the account you're logging in with has the right role tied to it.

https://docs.splunk.com/Documentation/VMW/3.3.2/Installation/ConfigureuserrolesfortheSplunkAppforVMw...


prakash007
Builder

I did check the roles and they look good... it seems the error is related to this known issue:

VMW-4466    Frequent job expiration leads to not all data being collected; task=hostvmperf has expired and will not be run.

https://docs.splunk.com/Documentation/AddOns/released/VMW/Releasenotes#Known_Issues