Getting Data In

Extracting proper hostname from different data sources

bondu
Explorer

We have a custom regex in transforms.conf and props.conf that extracts the correct hostname from the nginx logs; however, it does not work for other sourcetypes. For example, for the sourcetype automatic I added the entries shown below to the transforms and props files, but the hostname is still not extracted correctly.

Example
The actual hostname is as1.br2.la.wiredrive.com; however, the host is being reported as:

host=as3.br2.la.wiredrive.com

Mar 22 15:19:40 as1.br2.la.wiredrive.com appfuel[97326]: package="web" env="production" userId="101603" clientCode="jpmktg" guid="WD-KXMPD" view="update-project-access-log" uid="rBAWhVFMmxck7sxdBL4vAg==" URI="/?routekey=update-project-access-log.json" method="post" scope="private"

host=as3.br2.la.wiredrive.com | sourcetype=automatic | source=/var/log/appfuel.log

transforms.conf

[setnull]
REGEX = \.(mp4|jpg|bz2|png|gif|js|swf|jar|signed|flv|json)
DEST_KEY = queue
FORMAT = nullQueue

[nginx_host]
REGEX = [\d]{2}:[\d]{2}:[\d]{2} (?P<hostname>[^\s]+)\s+nginx:
FORMAT = host::$1
DEST_KEY = MetaData:Host

[appfuel_host]
REGEX = [\d]{2}:[\d]{2}:[\d]{2} (?P<hostname>[^\s]+)\s+automatic:
FORMAT = host::$1
DEST_KEY = MetaData:Host

props.conf

[Nginx]
NO_BINARY_CHECK = 1
pulldown_type = 1
TRANSFORMS-null = setnull, nginx_host, automatic
EXTRACT-HTTPstatus = [^&\n]*&\w+=\w+\s+(?P<HTTPstatus>\w+/\d+\.\d+"\s+\d+)
EXTRACT-UpstreamTime = (?:[^\-\n]*\-){4}"\s+\w+_\w+="\d+\.\d+"\s+(?P<UpstreamTime>[^ ]+)
EXTRACT-RequestTime = (?:[^\-\n]*\-){4}"\s+(?P<RequestTime>[^ ]+)
EXTRACT-BytesSent = (?:[^/\n]*/){6}\d+\.\d+"\s+\d+\s+(?P<BytesSent>[^ ]+)
EXTRACT-StatusOnly = (?:[^"\n]*"){2}\s+(?P<StatusOnly>[^ ]+)
EXTRACT-FIELDNAME = (?i)^(?:[^ ]* ){3}(?P<FIELDNAME>[^ ]+)

[source::/var/log/appfuel.log]
EXTRACT-AppHostname = (?:[^ \n]* ){3}(?P<AppHostname>[^ ]+)
EXTRACT-FIELDNAME = (?i)^(?:[^ ]* ){3}(?P<FIELDNAME>[^ ]+)

Any help is appreciated in advance.
Thank you


kristian_kolb
Ultra Champion

Hmm, there seem to be a few things wrong.

1) the TRANSFORMS-null setting in props.conf references an [automatic] stanza in transforms.conf, but there is none. The stanza you defined is called [appfuel_host].

2) I don't know if it's a good idea to call a sourcetype 'automatic', since that word may be reserved in that context, i.e. it may tell Splunk to figure out the sourcetype as best it can.

3) FIELDNAME is a placeholder name, usually created by the Interactive Field Extractor. It should not be used in config files. Copy/paste?

4) there is no specific sourcetype stanza in props.conf for the events you want to extract fields from. Usually that is better than working with [source::/blah/log.log]

5) the NO_BINARY_CHECK setting will only be honoured in the input phase, which happens on the same instance that reads the files off disk. Usually that will be a forwarder, but of course some files will be read locally by the indexer. This is not related to your other problems.

6) if this data is coming from a forwarder, check the inputs.conf and server.conf files on the forwarder to see if the wrong hostname is explicitly set there. This has been known to happen when server images with an installed forwarder are cloned.
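
As a rough sketch of points 1 and 4, assuming all appfuel events look like the sample above (syslog timestamp, hostname, then appfuel[pid]:), the host override could be wired up along these lines. Note that the sample event has appfuel[97326]: after the hostname rather than automatic:, so the regex here matches on that instead. The class name TRANSFORMS-apphost is an arbitrary label picked for illustration, and the [source::...] stanza is kept from the original post; a dedicated sourcetype stanza could be used in its place.

```
# transforms.conf
[appfuel_host]
# Capture the token that follows the HH:MM:SS timestamp, but only when it
# is followed by the appfuel process tag, e.g. "appfuel[97326]:".
REGEX = \d{2}:\d{2}:\d{2} (\S+)\s+appfuel\[\d+\]:
FORMAT = host::$1
DEST_KEY = MetaData:Host

# props.conf
[source::/var/log/appfuel.log]
# The value must be the name of a stanza that exists in transforms.conf.
TRANSFORMS-apphost = appfuel_host
```

For point 6, these are the settings worth looking at on the forwarder. The values shown are what one would expect on as1; on a cloned image they might still read as3, which is an assumption about the cause and not something the post confirms.

```
# inputs.conf (on the forwarder)
[default]
host = as1.br2.la.wiredrive.com

# server.conf (on the forwarder)
[general]
serverName = as1.br2.la.wiredrive.com
```

Keep in mind that a host override like this is applied at index time, so it only affects events indexed after the change, and it has to live on the instance that parses the data (the indexer or a heavy forwarder).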

Hope this helps,

Kristian
