We receive logs from over 700 sources, forwarded by a remote Syslog-ng service and collected by a local Syslog-ng service running on our Splunk indexer. The local service separates the incoming logs by host (in this case host = IP address), with the destination directory determined by the $HOST variable.
I am having trouble with inputs.conf and props.conf, specifically with separating the logs into sourcetypes. The logs come from dozens of sourcetypes, and each sourcetype may include a subset of versions.
I am using PCRE regular expressions to separate sourcetypes by host, since it seems easiest to identify the sources by IP address: I have a list that tells me what each one is.
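Since the type-to-IP mapping comes from a list, it may help to check the regexes against that list before putting them into any config. Below is a minimal sketch in Python. The patterns and addresses are hypothetical placeholders, and Python's re module is close to PCRE for simple patterns like these but not identical.

```python
import re

# Hypothetical patterns and addresses, for illustration only.
TYPE_PATTERNS = {
    "type1": r"^10\.1\.1\.[0-9]{1,2}$",
    "type2": r"^10\.1\.2\.[0-9]{1,3}$",
}


def classify(host, patterns=TYPE_PATTERNS):
    """Return the names of every type whose pattern matches the host."""
    return [name for name, pattern in patterns.items() if re.search(pattern, host)]


def audit(hosts, patterns=TYPE_PATTERNS):
    """Split hosts into those no type claims and those several types claim."""
    unmatched, ambiguous = [], []
    for host in hosts:
        matches = classify(host, patterns)
        if not matches:
            unmatched.append(host)
        elif len(matches) > 1:
            ambiguous.append(host)
    return unmatched, ambiguous


if __name__ == "__main__":
    hosts = ["10.1.1.5", "10.1.2.200", "10.9.9.9"]
    unmatched, ambiguous = audit(hosts)
    print("unmatched:", unmatched)
    print("ambiguous:", ambiguous)
```

Any host reported as unmatched would fall through to whatever catch-all is configured, and any host reported as ambiguous points at two patterns that overlap.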
My first attempt used inputs.conf to separate sourcetypes based on path, filtered by a whitelist regex:
inputs.conf
[monitor:///var/log/syslog-ng/*/messages]
host_segment=4
sourcetype=type1
queue=parsingQueue
disabled=0
followTail=1
index=index1
whitelist=(reg_ex_for_type1)
[monitor:///var/log/syslog-ng/*/messages]
host_segment=4
sourcetype=type2
queue=parsingQueue
disabled=0
followTail=1
index=index1
whitelist=(reg_ex_for_type2)
[monitor:///var/log/syslog-ng/xxx.xxx.xxx.*/messages]
host_segment=4
sourcetype=type3
queue=parsingQueue
disabled=0
followTail=1
index=index1
In this version, everything (all sources from syslog) ends up in sourcetype type3 and the other stanzas are not processed. I verified that no other sourcetypes were created by running the following search:
index=index1 | stats values(host) by sourcetype
I read somewhere on the forums that this cannot be done because the monitor paths are technically the same, even though the whitelist regexes should make them different.
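My second attempt used a single monitor stanza in inputs.conf and tried to assign the sourcetypes in props.conf:
inputs.conf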
[monitor:///var/log/syslog-ng/*/messages]
host_segment=4
followTail=1
index=index1
blacklist=(regex_exclude_certain_hosts)
props.conf (version 1)
[source::/var/log/syslog-ng/(my_specific_type1_regex)/messages]
sourcetype=type1
[source::/var/log/syslog-ng/(my_specific_type2_regex)/messages]
sourcetype=type2
[source::/var/log/syslog-ng/xxx.xxx.xxx.*/messages]
sourcetype=type3
props.conf (version 2)
[source::.../(regex_for_type1)/*]
sourcetype=type1
[source::.../(regex_for_type2)/*]
sourcetype=type2
[source::.../xxx.xxx.xxx.*/*]
sourcetype=type3
In this version I am using a blacklist in inputs.conf to filter out certain logs, which works fine, but neither version of props.conf applies the sourcetypes; the events are now all sourcetyped as syslog.
I have seen that there is another option with props/transforms that involves looking at every event and determining its type by matching it against a regex template of what the event is supposed to look like. However, with over 700 sources and a range of versions within each sourcetype, building a pattern for every source would be a daunting task.
I ended up not using Splunk to separate the sourcetypes. Instead, I created a detailed Syslog-ng config file that handles all the inputs and drops them into different subdirectories. Splunk can then simply be pointed at those directories, since their paths are different.
If anyone wants to know how I prepped Syslog-ng, here is an example config:
syslog-ng.conf
options {
long_hostnames(off);
keep_hostname(yes);
use_dns(no);
owner("root");
group("root");
perm(0640);
dir_owner("root");
dir_group("root");
dir_perm(0750);
create_dirs(yes);
};
source s_remote {
tcp(ip(0.0.0.0) port(514));
udp(ip(0.0.0.0) port(514));
};
filter f_myfilter_1 {
host("^xxx.xxx.xxx.[0-9]{1,2}$");
};
destination d_myfilter_1 {
file("/var/log/syslog-ng/filtered_source/$HOST/messages");
};
destination d_fallback {
file("/var/log/syslog-ng/$HOST/messages");
};
log {
source(s_remote);
filter(f_myfilter_1);
destination(d_myfilter_1);
flags(final);
};
log {
source(s_remote);
destination(d_fallback);
flags(fallback);
};
inputs.conf
[monitor:///var/log/syslog-ng/filtered_source/*/messages]
host_segment=5
sourcetype=filtered:data
queue=parsingQueue
disabled=0
followTail=1
index=myindex
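With dozens of types, writing the filter, destination, and log blocks by hand gets repetitive. The sketch below, in Python, generates them along with the matching monitor stanzas, following the same pattern as the example above. The type names, host patterns, and the per-type directory layout are hypothetical assumptions, not my actual config. The host pattern is written in single quotes on the assumption that syslog-ng treats single-quoted strings literally, so the backslashes need no extra escaping.

```python
# Hypothetical type names and host patterns; replace with the real list.
TYPES = {
    "type1": r"^10\.1\.1\.[0-9]{1,2}$",
    "type2": r"^10\.1\.2\.[0-9]{1,3}$",
}

# One filter/destination/log group per type, mirroring the example config.
SYSLOG_NG_BLOCK = """filter f_{name} {{
    host('{pattern}');
}};
destination d_{name} {{
    file("/var/log/syslog-ng/{name}/$HOST/messages");
}};
log {{
    source(s_remote);
    filter(f_{name});
    destination(d_{name});
    flags(final);
}};
"""

# The host is the fifth path segment: var/log/syslog-ng/<type>/<host>.
INPUTS_BLOCK = """[monitor:///var/log/syslog-ng/{name}/*/messages]
host_segment=5
sourcetype={name}
index=myindex
"""


def render_syslog_ng(types):
    """Return syslog-ng filter, destination, and log blocks for each type."""
    return "\n".join(
        SYSLOG_NG_BLOCK.format(name=name, pattern=pattern)
        for name, pattern in types.items()
    )


def render_inputs(types):
    """Return one Splunk monitor stanza per type directory."""
    return "\n".join(INPUTS_BLOCK.format(name=name) for name in types)


if __name__ == "__main__":
    print(render_syslog_ng(TYPES))
    print(render_inputs(TYPES))
```

The generated log blocks would still need to sit before the fallback log statement, and the options and source sections stay as written by hand.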