inaccurate deployment client dns values

dstaulcu
Builder

When running splunk list deploy-clients on deployment servers, I have noticed that for some deployment-clients the hostname value does not match the host name contained in the dns value. For example, in one record for a computer whose name is truly "host1", the hostname value is accurately presented as "host1", but the host portion of the dns value is inaccurately presented as "host3.domain.com".

According to http://docs.splunk.com/Documentation/Splunk/6.0.1/Admin/Serverclassconf, the dns value is supposed to be derived from a reverse dns lookup. Running splunk in debug mode on the deployment client, I can see that the deployment client is posting the dns value in question to the deployment server, so the inaccurate lookup must be occurring on the deployment client and not the deployment server.

The thing is, when I run nslookup on an affected deployment client, the returned dns name is correct. The PTR record for the ip address in question also appears to be accurate. I even changed the reverse DNS record of another deployment-client to an inaccurate value to see whether that influences what the client presents to the deployment server during phoneHome activity after a splunk restart. It doesn't. At this point I do not think Splunk is truly using a reverse dns lookup for the dns value that deployment clients send to deployment servers.
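The comparison described above (local host name versus the name in the PTR record) can be scripted so it is easy to repeat on several clients. This is only a minimal sketch of an independent check using Python's standard socket module; it is not how Splunk derives its dns value, and the function name compare_names and the sample address in the usage note are made up for illustration.

```python
import socket

def compare_names(hostname, ip, reverse_lookup=socket.gethostbyaddr):
    """Compare a host name with the name returned by a reverse (PTR) lookup.

    reverse_lookup is injectable so the check can be exercised without
    touching a real DNS server; by default it is socket.gethostbyaddr.
    """
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list)
        ptr_name = reverse_lookup(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        ptr_name = None

    # Compare only the first label, case-insensitively
    short_ptr = ptr_name.split(".")[0].lower() if ptr_name else None
    return {
        "hostname": hostname,
        "ip": ip,
        "ptr": ptr_name,
        "consistent": short_ptr == hostname.lower(),
    }
```

For example, compare_names(socket.gethostname(), "10.1.2.3") would report whether the PTR record for that (hypothetical) address begins with the local host name. A result of consistent=True here alongside a wrong dns value in splunk list deploy-clients would support the theory that the value is not coming from a plain reverse lookup.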

On that theory, I've been trying to figure out what splunk uses to derive the dns value on windows-based deployment clients. To do so, I ran process monitor during splunk startup and filtered on events having the expected domain, the expected hostname, or the inaccurate hostname in the details column of the i/o request. What I found is that splunk queries a couple of registry value names for activeComputerName, hostname, and domain. I tried manipulating those values and restarting splunk to determine whether they truly influence what deployment-clients post as the dns value to deployment servers during phoneHome. It turns out those registry values DO have an influence, but instead of uploading the values contained in the registry, the deployment-clients started reporting an IP address in the dns field. At this point I figure splunk is stringing together the dns name from multiple sources, performing some sort of verification, and making a decision on perceived quality before selecting a value to post. It would be nice to know what is being factored in so that I can correct the root cause of the issues in our environment.

Anyway, I wanted to alert the community to this situation in case deployment-apps are being deployed to deployment-clients inexplicably because serverclass whitelist entries match on both hostname and an inaccurate dns value, and to draw upon your experience for what to do moving forward.


dstaulcu
Builder

To work around this problem, I successfully implemented a regular expression in serverclass whitelist entries that excludes DNS- or IP-based values from matching in any case: whitelist.n = [^/.]+$
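To see why that pattern keeps dotted values out, here is a small sketch that runs the same expression through Python's re module. It assumes the whitelist entry is matched against the whole attribute value; that is an assumption about the deployment server's behavior, not something confirmed in this thread, and the helper name matches_whole_value is made up for the example.

```python
import re

# Pattern copied from the whitelist entry above: one or more characters
# that are neither "/" nor ".", running to the end of the value.
PATTERN = re.compile(r"[^/.]+$")

def matches_whole_value(value):
    """Return True only when the entire value matches the pattern.

    Assumption: the whitelist entry is applied to the full attribute value.
    """
    return PATTERN.fullmatch(value) is not None

for value in ["host1", "host3.domain.com", "10.1.2.3"]:
    print(value, matches_whole_value(value))
```

Under full-value matching only the bare host name passes, since FQDNs and IP addresses contain dots. Note that an unanchored search would still find a match at the tail of a dotted value (for instance the last label of an FQDN), so the workaround depends on how the entry is anchored.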

I also submitted an enhancement request:

On splunk deployment servers, in serverclass.conf, within serverclasses, add a feature allowing the splunk admin to specify the deployment client attribute types (clientname,ipaddress,dnshostname,hostname) used in whitelist/blacklist entry matching.

Suggested configuration item name would be matchTypeFilter
Suggested default value (to preserve existing behavior) would be all attribute types in the standard order of processing
Admins should be able to specify multiple attribute types to match against in a delimited format, eg.   matchTypeFilter=clientname,hostname,dnshostname
Matching should be processed in the order of the attributes listed in matchTypeFilter

Adding this feature would be particularly useful in large splunk implementations having many deployment-clients. phoneHome requests appear to be processed serially, and optimizations in the processing of phoneHome requests are critical to improving the scalability of deployment server instances. Reducing the number of attribute types to enumerate (from 4 items to 1) would both improve the speed of matching activities and reduce the opportunity for undesirable matches due to data quality issues, such as stale dns records or a mismatch between hostname and clientname.
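To make the request concrete, here is a rough sketch of the matching behavior I have in mind. Everything in it is hypothetical: matchTypeFilter does not exist in splunk today, the default order shown is simply the order listed in the request (not a documented processing order), and fnmatch-style wildcards stand in for whatever matching the deployment server really does.

```python
import fnmatch

# Hypothetical default: all four attribute types, in the order listed above.
DEFAULT_ORDER = ("clientname", "ipaddress", "dnshostname", "hostname")

def first_match(client, whitelist, match_type_filter=DEFAULT_ORDER):
    """Return (attribute, pattern) for the first whitelist hit, else None.

    Only the attribute types named in match_type_filter are checked,
    in the order given.
    """
    for attr in match_type_filter:
        value = client.get(attr)
        if value is None:
            continue
        for pattern in whitelist:
            # Case-insensitive wildcard comparison (illustrative only)
            if fnmatch.fnmatchcase(value.lower(), pattern.lower()):
                return attr, pattern
    return None

client = {
    "clientname": "host1",
    "hostname": "host1",
    "dnshostname": "host3.domain.com",  # the inaccurate value from the post
    "ipaddress": "10.1.2.3",
}

print(first_match(client, ["host3*"]))
print(first_match(client, ["host3*"], ("clientname", "hostname")))
```

With all attribute types in play, the stale dns value pulls host1 into a serverclass meant for host3. Restricting the filter to clientname and hostname avoids that match and leaves fewer attributes to compare per phoneHome.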


grijhwani
Motivator

Are the client and server using the same DNS servers? Are you using nscd or sssd caching services? When you manipulated the DNS records as a test, did you a) clear all caches, and/or b) allow sufficient time for the changes to propagate?


grijhwani
Motivator

nscd and sssd are name and credential caching servers for the *ix environment, so not relevant to Windows. Sorry, not familiar with Splunk on Windows (other than writing deployment configs, and managing indexes for the traffic).


dstaulcu
Builder

Any questions are good questions at this point, as I am out of ideas. The client and server are not using the same dns servers, but I'm pretty sure the problem is on the client side of things. nscd and sssd don't ring a bell to me, so I guess that's what you were referring to regarding Linux. I'm pretty sure I flushed and re-registered dns as well as restarted the operating system as part of testing. I even changed permissions on the dns entry on the dns server to make sure that my "bad entry test of influence" would stick after re-registering dns on the client side.


grijhwani
Motivator

Dumb questions. I was assuming a Linux installation, but it just sank in that you were mentioning registry keys...
