Getting Data In

Wrong originating host when forwarding syslog from indexer to third party?

petergus
New Member

Hi,

I have a weird problem with forwarding logs from my Apache servers to both Splunk and a third-party syslog server. As soon as I involve the Splunk forwarder installed on the Apache host, I lose the original hostname of my Apache server when forwarding the messages on to a central standard syslog-ng server. This syslog server has a custom input listening on port 5514 that uses the originating host from the syslog header, so I can index the log data against the right host in the database I'm storing it in. But if I skip the Splunk forwarder and use standard syslog from Apache to the indexer, the forwarding works as expected and the originating hostname is retained. I have tried multiple configuration options on the indexer; they are detailed further down. In this environment it is not possible to send the logs straight from the Splunk forwarder to the syslog server, for other reasons.

Any idea how to solve this? This should be standard stuff, IMHO, so I think I must have missed something basic.

apache - 192.168.7.166 # has default log options, apart from the hostname (%V) added first

SplunkIndexer - 192.168.7.157

SyslogServer - 192.168.7.15

Client - 192.168.7.1

The original log on file is:

notroot@apache:~$ tail -f /var/log/apache2/access.log 
apache 192.168.7.166 192.168.7.1 - - [25/Mar/2014:18:59:31 +0100] "GET / HTTP/1.1" 304 209 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36"

NON WORKING SETUP:

apache --SplunkUnivForwarder--> SplunkIndexer:9997 --forward as syslog/TCP output--> SyslogServer:5514

No matter what I do, the log always comes in with a source of 192.168.7.157; it should be 192.168.7.166, since the source of the log is my Apache server.

Log arrives at central syslog server:

2014-03-24T21:01:45+01:00 192.168.7.157 apache 192.168.7.166 192.168.7.1 - - [24/Mar/2014:19:52:48 +0000] "GET / HTTP/1.1" 304 209 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36"

No matter what forwarding config options I try on the indexer, it always comes in with .157; see below for the config files.

WORKING SETUP:

apache --syslog (syslog-ng)--> SplunkIndexer:514 --forward as TCP output--> Unomaly:5514

In Splunk it looks like this, with the timestamp and originating host (apache) correctly seen:

apache apache 192.168.7.166 192.168.7.1 - - [25/Mar/2014:18:59:31 +0100] "GET / HTTP/1.1" 304 209 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36"

After forwarding to the Syslog server the originating host is intact (apache):

unomaly@unomaly:~$ tail -f /var/log/unomaly.in | grep apache
2014-03-25T18:59:32+01:00 apache apache 192.168.7.166 192.168.7.1 - - [25/Mar/2014:18:59:31 +0100] "GET / HTTP/1.1" 304 209 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.152 Safari/537.36"

I suspect this would also work if the Universal Forwarder could speak standard syslog to the indexer, as syslog-ng does in the working example…

I have tried both syslog and tcpout options (some now commented out below) in outputs.conf, as well as syslog_routing with props and transforms files on the indexer:

outputs.conf
#[syslog]
#defaultGroup=allElseGroup

#[syslog:syslogGroup]
#server = 192.168.7.111:5514
#type = tcp

#[syslog:allElseGroup]
#server = 192.168.7.111:5514
#type=tcp

[tcpout:unomaly]
server = 192.168.7.111:5514
sendCookedData = false


====

transforms.conf
[syslogRouting]
REGEX=.
DEST_KEY=_SYSLOG_ROUTING
FORMAT=syslogGroup

===

props.conf
[syslog]
TRANSFORMS-routing=syslogRouting
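For completeness, the forwarder-side input that feeds this pipeline would look something like the following sketch (the monitor path is taken from the tail output above; the sourcetype line is an assumption, access_combined being Splunk's pretrained Apache access-log sourcetype, and Splunk would otherwise auto-detect one):

```
# inputs.conf on the apache host's Universal Forwarder (sketch)
# Path taken from the access log shown above; sourcetype is an assumption
[monitor:///var/log/apache2/access.log]
sourcetype = access_combined
```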

dkuk
Path Finder

Hi,

I think the key point is that you're talking about two separate data input types, UDP and file monitor, so it's conceivable that they will be treated differently.

Each UDP message comes from one specific source (there could be many devices, but each device is unique), so the UDP input can easily set the host on the event in Splunk: it's simply where the UDP message came from. The other reason it may work is that Splunk may well be seeing your UDP source and correctly assuming the sourcetype of "syslog", which it knows how to extract the host for (done at an event-parsing level for you; see more info on this below). Or you may have specified the sourcetype as syslog on the UDP input.
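As a sketch, a UDP syslog input along those lines might look like this on the indexer (port and settings here are assumptions; connection_host = ip tells Splunk to stamp the sending device's IP as the event host, and the explicit sourcetype is optional if auto-detection works):

```
# inputs.conf on the indexer (sketch; port and settings are assumptions)
[udp://514]
sourcetype = syslog
connection_host = ip
```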

Once you change the input to a shared file, Splunk does not necessarily know which host each message is from, i.e. many different hosts may all be logging to that one file.

Splunk does recognise certain formats as pretrained sourcetypes, syslog being one of them, i.e. it will stamp a sourcetype of "syslog" on syslog messages if they fit the expected format for syslog messages.

Once a sourcetype has been stamped on the event, the pre-canned props and transforms that Splunk ships with will, depending on the sourcetype, parse the event and try to stamp the correct host on it, using pattern matching defined for that sourcetype. Don't expect this to just work for all events for all types of data, though; often you will need to define this logic yourself.

This parsing logic is provided by the props.conf and transforms.conf in etc/system/default for sourcetype "syslog". As you can see, the "syslog-host" transform is performing the host definition:

    Props:    
        [syslog]
        pulldown_type = true 
        maxDist = 3
        TIME_FORMAT = %b %d %H:%M:%S
        MAX_TIMESTAMP_LOOKAHEAD = 32
        TRANSFORMS = syslog-host
        REPORT-syslog = syslog-extractions
        SHOULD_LINEMERGE = False

    Transforms: -  
        [syslog-host]
        DEST_KEY = MetaData:Host
        REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
        FORMAT = host::$1

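To see what that transform's regex actually extracts, you can try it outside Splunk; here is a quick sketch using Python's re module on a made-up classic-format syslog line:

```python
import re

# The syslog-host regex from etc/system/default/transforms.conf, quoted above
pattern = r':\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s'

# A made-up example line in classic syslog format
line = 'Mar 25 18:59:31 apache sshd[1234]: Accepted password for notroot'

match = re.search(pattern, line)
print(match.group(1))  # -> apache  (the host token after the timestamp)
```

The `:\d\d\s+` anchor latches onto the seconds field of the timestamp, and the capture group then grabs the following token as the host.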
Probably what is happening here is that, because they are recognised as Apache messages, you are not getting a sourcetype of "syslog" stamped on them. As such, the host isn't extracted, because whatever sourcetype is being set doesn't have pre-canned props and transforms defined. You will either need to define these yourself or find an app that has them for you. Alternatively you could just set "sourcetype = syslog" on your inputs.conf entry for the syslog file(s) you are monitoring. However, you might want your breakdowns less general than this, i.e. have a range of syslog_<vendor> sourcetypes so that you can search and perform field extractions on those sourcetypes later on in the Splunk UI.
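That sourcetype override is a one-line addition to the monitor stanza; a sketch, where the monitored path is a hypothetical example:

```
# inputs.conf (sketch; the monitored path is hypothetical)
[monitor:///var/log/remote/syslog.log]
sourcetype = syslog
```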

A good way to set host is to do it at a source-wide level (more efficient than event-by-event parsing: if every event in a monitored file has the same host value, there's no need to make Splunk parse each event). With syslog you can achieve this by configuring your syslog server to write to a different folder for each host, where the folder name reflects the name of the host, so you end up with folders like below:

/mnt/logs/192.168.1.1/syslog.log
/mnt/logs/192.168.1.2/syslog.log
/mnt/logs/192.168.1.3/syslog.log
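On the syslog-ng side, a per-host layout like that is typically achieved with a macro in the destination path; a sketch, where the source name s_net is an assumption:

```
# syslog-ng.conf (sketch; the source name s_net is an assumption)
destination d_perhost {
    file("/mnt/logs/${HOST}/syslog.log" create_dirs(yes));
};
log { source(s_net); destination(d_perhost); };
```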

In the inputs.conf where you specify the data collection you can tell it to look at a particular segment of the file name for the host.

E.g. the inputs.conf definition for the syslog file monitor would look something like this, where the host is the path segment wildcarded by the "*":

[monitor:///mnt/logs/*/syslog.log]
host_segment = 3

Hope this helps!
