In /opt/splunk/var/log/splunk/metrics.log I am seeing this type of log entry for one forwarder:
06-24-2014 13:59:32.428 -0700 INFO DeploymentMetrics - ip=nnn.nnn.nnn.nnn, dns=systemname.dom.uci.edu, hostname=dc2-finaid, mgmt=8089, build=110225, name=deploymentClient, id=connection_nnn.nnn.nnn.nnn_8089_systemname.dom.uci.edu_dc2-finaid_deploymentClient, utsname=windows-x64 scName=XYZ_OUTPUT_9998, appName=XYZ_OUTPUT_9998, fqname=C:\Program Files\Splunk\etc\apps\XYZ_OUTPUT_9998, event=install, status=failed, reason=Failed to install app : C:\Program Files\Splunk\etc\apps\XYZ_OUTPUT_9998. Cannot update application info: /nobody/XYZ_OUTPUT_9998/app/install/state = enabled: Metadata could not be written: /nobody/XYZ_OUTPUT_9998/app/install/state: { }, removable: yes
There are two deployment apps going to the system "systemname" and only one is generating this INFO item. XYZ_OUTPUT_9998 is a global app that goes to all our forwarders through the deployment server, and this is the only forwarder generating this INFO item in our logs. The forwarder is a heavy forwarder on a Windows domain controller. I don't have admin access to it, so I'm looking for ideas to share with the admins who installed the forwarder.
The biggest clue is "Metadata could not be written". To me that suggests one of three things: a permission problem, the Splunk user being over quota on the system disk, or a full disk partition. Most probably the failing app deployment could not create the directory tree it needs to write into, whereas the working app's directory tree already exists.
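To rule out the full-disk case quickly, the admins could run something like the following on the forwarder. This is only a sketch: the function name free_space_report and the 100 MiB threshold are my own choices for illustration, not anything Splunk defines, and the figure reported is for the whole volume, so it may not reflect a per-user quota.

```python
import shutil


def free_space_report(path, min_free_bytes=100 * 1024 * 1024):
    """Return (free_bytes, ok) for the volume that holds `path`.

    `min_free_bytes` is an arbitrary illustrative threshold, not a Splunk setting.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free, usage.free >= min_free_bytes


# Example call (path taken from the log entry in the question):
# free, ok = free_space_report(r"C:\Program Files\Splunk\etc\apps")
# print(f"{free / 1024**2:.0f} MiB free:", "OK" if ok else "LOW")
```

If the volume has plenty of space, that leaves permissions or quota as the more likely causes.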
You say there are two app deployments, only one of which is failing. My question would be: was the working app already present as part of the standard system build, or was it deployed at some earlier point? Either would explain why that app appears to deploy cleanly (its files are already on disk, so a write-permission or disk-capacity problem never comes into play) while a new app cannot be installed.
My first guess would be "check permissions." Specifically, does the Splunk user (that is running the services on the forwarder) have the ability to write to the folders that contain the app?
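One way to test this directly is to try creating a file in the apps folder while logged in as the account that runs the Splunk service. A rough sketch follows; can_write is a hypothetical helper name, and the path in the usage comment is simply the one from the log entry. The result only means something if the script runs as the same account as the service.

```python
import os
import tempfile


def can_write(directory):
    """Try to create and delete a small file in `directory`.

    Returns (True, None) on success, or (False, error_message) on failure.
    """
    try:
        fd, path = tempfile.mkstemp(prefix="write_probe_", dir=directory)
        with os.fdopen(fd, "w") as handle:
            handle.write("probe")
        os.remove(path)
        return True, None
    except OSError as exc:  # covers permission denied, missing path, disk full
        return False, str(exc)


# Example call (run as the Splunk service account):
# print(can_write(r"C:\Program Files\Splunk\etc\apps"))
```

A permission-denied message here would point at the folder's ACLs rather than at the app itself.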
Second thought: what happens if you manually install this app on a test server? An app must have a certain directory structure and the metadata files must also exist. If the structure or file contents are wrong, the install could fail. If this is the problem, you should see the error even on a manual install.
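Before the manual install, a quick look at the app's layout on the deployment server may save time. The sketch below checks for the two files an app conventionally carries, default/app.conf and metadata/default.meta. That list is my assumption about what to look for, not a statement of what Splunk strictly requires, so treat a "missing" result as a hint rather than a verdict. The names EXPECTED_FILES and missing_app_files are made up for this example.

```python
import os

# Conventional files in a Splunk app (an assumed checklist, adjust as needed).
EXPECTED_FILES = [
    os.path.join("default", "app.conf"),
    os.path.join("metadata", "default.meta"),
]


def missing_app_files(app_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under `app_dir`."""
    return [
        rel for rel in expected
        if not os.path.isfile(os.path.join(app_dir, rel))
    ]


# Example call (point it at the app's folder on the deployment server):
# print(missing_app_files("/path/to/deployment-apps/XYZ_OUTPUT_9998"))
```

An empty list means both files are present; it says nothing about whether their contents are valid.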
Finally, a possible fix: on a test server, create a new app with the same name, using the Splunk GUI. Copy in the configuration files from the original app, except for the metadata files. Restart Splunk on the test server, then examine the app's configuration using the Splunk GUI. Modify as needed. Replace the old app with the new one on the deployment server.