I'm struggling to get Splunk 6.0.1 to properly extract fields from vsftpd logs. The log format is space-separated values, like so:
Thu Jun 12 23:50:13 2014 1 11.22.33.44 551 /example.tif a _ o r ftpuser4 ftp 0 * c
Those break down as follows, with each example value followed by its field name:
Thu Jun 12 23:50:13 2014 current-time
1 transfer-time
11.22.33.44 remote-host
551 byte-count
/example.tif filename (this one can be complicated by additional directories in the path, e.g. /images4/example.tif)
a transfer-type
_ special-action-flag
o direction
r access-mode
ftpuser4 username
ftp service-name
0 authentication-method
* authenticated-user-id
c completion-status
What I'm struggling with is that the field extractions sometimes pick up the year from current-time as the transfer-time value, which then shifts every later field by one position.
You can make your extraction more tolerant of the amount of whitespace between fields by replacing \s with \s+, something like this:
(?<current_time>\S+\s+\S+\s+\S+\s+\S+\s+\S+)\s+(?<transfer_time>\S+)\s+(?<remote_host>\S+)\s+...
This way it doesn't matter whether there are one or two spaces between the fields or within the date.
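If you want to sanity-check the pattern outside Splunk, here is a rough Python sketch. A few assumptions on my part: the field names are taken from the list in the question with hyphens turned into underscores, parse_xferlog is just a made-up helper name, and filenames are assumed to contain no whitespace. Note that Python spells named groups (?P<name>...) where Splunk accepts (?<name>...).

```python
import re

# Field names after current_time, in log order (assumed from the question's list,
# hyphens replaced by underscores so they are valid group names).
FIELDS = [
    "transfer_time", "remote_host", "byte_count", "filename",
    "transfer_type", "special_action_flag", "direction", "access_mode",
    "username", "service_name", "authentication_method",
    "authenticated_user_id", "completion_status",
]

# current_time is five whitespace-separated tokens; every separator is \s+,
# so one or two spaces between tokens both match.
PATTERN = re.compile(
    r"^(?P<current_time>\S+\s+\S+\s+\S+\s+\S+\s+\S+)\s+"
    + r"\s+".join(rf"(?P<{name}>\S+)" for name in FIELDS)
    + r"\s*$"
)


def parse_xferlog(line):
    """Return a dict of field name -> value, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


# The sample event from the question, plus a variant with a single-digit day
# (note the two spaces after "Jun").
two_digit_day = "Thu Jun 12 23:50:13 2014 1 11.22.33.44 551 /example.tif a _ o r ftpuser4 ftp 0 * c"
one_digit_day = "Thu Jun  5 23:50:13 2014 1 11.22.33.44 551 /example.tif a _ o r ftpuser4 ftp 0 * c"

for event in (two_digit_day, one_digit_day):
    fields = parse_xferlog(event)
    print(fields["current_time"], "|", fields["transfer_time"], "|", fields["remote_host"])
```

Both events should give transfer_time as 1 and remote_host as 11.22.33.44, regardless of the extra space in the date.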
The problem appears to be rooted in how the logs handle the date: if the day of the month is a single digit, an extra space is inserted after the month (e.g. "Jun  5" rather than "Jun 13"). The Field Extraction UI counts spaces, so it gets thrown off when an event has that extra space for a single-digit day.
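To illustrate the space-counting problem, here is a small Python sketch. It only mimics the effect by splitting on a single space versus on runs of whitespace; it is not how the Field Extraction UI is actually implemented, and the event is the question's sample with the day changed to a single digit.

```python
# Sample event with a single-digit day, so there are two spaces after "Jun".
event = "Thu Jun  5 23:50:13 2014 1 11.22.33.44 551 /example.tif a _ o r ftpuser4 ftp 0 * c"

# Splitting on exactly one space yields an empty token for the extra space,
# which pushes every later field one position to the right.
by_single_space = event.split(" ")

# Splitting on runs of whitespace (the equivalent of \s+) ignores the padding.
by_whitespace = event.split()

# Position 5 is where transfer-time sits when the day has two digits.
print(by_single_space[5])  # the year, "2014"
print(by_whitespace[5])    # the real transfer-time, "1"
```

That shifted position is exactly the symptom in the question: the year lands in transfer-time and everything after it is off by one.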