Getting Data In

Field extractions problem

wvalente
Explorer

Hi everyone,

I'm a new Splunk user and I need help with field extractions.

My Splunk instance receives data from a syslog server, and at index time the events come out as follows (I've hidden some fields with 'xx'):

2017-06-05T10:59:14-04:00 ins-web01 sshd[32401]: pam_unix(sshd:session): session closed for user xxxxxxx
host =splunk01.infra.xx source =/var/log/remote/auth/ins-web01/sshd.log sourcetype =sshd-too_small

I have two problems here:

1) the host field is incorrect
2) the sourcetype field is incorrect

I've tried to extract the correct host and sourcetype fields, but it does not work.

My regex is ^[^ \n] (?P[^ ]+).*

Could anyone help me?

Tks.

woodcock
Esteemed Legend

You are doing syslog wrong. You should be sending each sourcetype to a different port and then explicitly setting the sourcetype for each port. DO NOT EVER let Splunk automatically set the sourcetype.

Read these and start over.
http://www.georgestarcher.com/splunk-success-with-syslog/
http://docs.splunk.com/Documentation/Splunk/latest/Data/Listofpretrainedsourcetypes

cpetterborg
SplunkTrust

I have an rsyslog server that sends its data to the indexers through HEC, so my situation is similar to yours: the events appear to come from the syslog server rather than from the originating host, although in my case they are classified with the right sourcetype. The approach below should work for the sourcetype as well.

I have a transforms.conf file with the following configuration:

[hostextract]
REGEX = ^\w\w\w \d+ \d\d:\d\d:\d\d (([a-zA-Z]|\d+\.)[^ ]+)
SOURCE_KEY = _raw
DEST_KEY = MetaData:Host
FORMAT = host::$1

That will extract the hostname from the data and set it at index time. It is tied to the data with the following props.conf file configuration:
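One way to sanity-check a REGEX like this before deploying it is to run it against a sample line outside Splunk. The sketch below uses Python's re module, which treats this particular pattern the same way PCRE does. The sample event and the extract_host helper are made up for illustration and do not come from the original post.

```python
import re

# Same pattern as the REGEX line in the [hostextract] stanza.
HOST_RE = re.compile(r"^\w\w\w \d+ \d\d:\d\d:\d\d (([a-zA-Z]|\d+\.)[^ ]+)")

def extract_host(raw):
    """Return the first capture group ($1 in FORMAT), or None if no match."""
    m = HOST_RE.search(raw)
    return m.group(1) if m else None

# Hypothetical event, not taken from the thread.
sample = "Jun 15 10:59:14 fw01 %ASA-6-302013: Built outbound TCP connection"
print(extract_host(sample))
```

Note that the pattern expects exactly one space between the month and the day, so a line whose single-digit day is padded with an extra space ("Jun  5 ...") would not match.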

[cisco:asa]
TRANSFORMS-hostextract = hostextract

You should be able to do the same sort of thing by keying the props.conf stanza on the host instead of the sourcetype, like this:

[host::splunk01.infra.xx]
TRANSFORMS-hostextract = hostextract

Then you should be able to extract the host as above, using a regex that fits your data, such as the following:

REGEX = ^\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d-\d\d:\d\d ([^ ]+)
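As a rough check, the intended pattern (one backslash before each \d, so that ^\d{4} matches the four-digit year) can be tried against the sample event from the question. This is only a sketch using Python's re module; the extract_iso_host helper is a hypothetical name introduced for illustration.

```python
import re

# Intended pattern: ISO 8601 timestamp with a negative UTC offset,
# followed by a space and the hostname (captured as group 1).
ISO_HOST_RE = re.compile(r"^\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d-\d\d:\d\d ([^ ]+)")

def extract_iso_host(raw):
    """Return the hostname captured by group 1, or None if no match."""
    m = ISO_HOST_RE.search(raw)
    return m.group(1) if m else None

# Sample event as posted in the question.
event = ("2017-06-05T10:59:14-04:00 ins-web01 sshd[32401]: "
         "pam_unix(sshd:session): session closed for user xxxxxxx")
print(extract_iso_host(event))
```

Because the pattern hardcodes the minus sign in the offset, timestamps with a positive offset such as +02:00 would not match; replacing that minus sign with [+-] would accept both.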

Add another transform for your sourcetype. I'm going to assume that you need the same sourcetype for all incoming entries; if not, this would have to change. The props.conf:

[host::splunk01.infra.xx]
TRANSFORMS-hostextract = hostextract,sourcetypeextract

And the entry in transforms.conf for the sourcetype might be something like:

[sourcetypeextract]
REGEX = .
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::yoursourcetype

Now, I haven't tested this, but I think it is close.

wvalente
Explorer

Thanks for your support, cpetterborg.

I'll try this and report back on whether it works.

cpetterborg
SplunkTrust

What kind of syslog server (rsyslog or syslog-ng for example) are you using? I know you say in your question that it is a syslog server, but it can make a difference which one.

How are you getting that data from the syslog server to the indexers? Writing it to files and using an HF or UF to send it? Forwarding the data directly? HTTP Event Collector on the indexers?

Where do you have your regex for extracting the host and sourcetype?

wvalente
Explorer

Hi cpetterborg,

I'm using rsyslog and forwarding data directly to Splunk.

To extract the fields, I usually use the field extraction menu, select the incorrect field, and reclassify it with the correct name.

Is this clear?

Thanks.

richgalloway
SplunkTrust

A sourcetype of "xx-too_small" means Splunk does not have enough data to guess about the correct sourcetype to apply. Either you have not specified a sourcetype for that input or the sourcetype specification is in the wrong place.
