Splunk is indexing a CSV file that contains an IP address and it looks something like this:
"Windows 7","SSHEFFIER8GDAOC","010.003.002.059","101BD9089D18"
The IP has leading zeros, which I need removed, preferably at index time. Based on what I've seen in the forums here, I added the following SEDCMD line to the relevant props.conf file:
[my:sourcetype]
...(some field extractions)....
EXTRACT-LD_IPAddress = (?:[^"\n]"){17}(?P[^"]+)
SEDCMD = s/(src=|dst=)0([^.]+.)0*([^.]+.)0*([^.]+.)0*(\d+)/\1\2\3\4\5/g
This seems to have no effect on the data. I double checked that the props.conf file was deployed to the indexers. Is there something wrong with the way I did the SEDCMD (what is src & dst?)? Could it be that the SEDCMD needs to be placed before the EXTRACT lines in props.conf? Also, I wonder if the quotes around the IP address could be affecting this?
Thanks for your help.
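On the "what is src & dst?" part: one way to see why the pattern never fires is to run it against the sample line outside Splunk. The sketch below uses Python's `re` module as a stand-in for the sed-style replacement (an assumption; the regex engines are not identical, though this pattern uses nothing engine-specific). The `no_prefix` pattern is a hypothetical variant for illustration only, not something posted in this thread.

```python
import re

# Sample event from the question (a quoted CSV line).
line = '"Windows 7","SSHEFFIER8GDAOC","010.003.002.059","101BD9089D18"'

# Pattern from the original SEDCMD: it can only match when the IP is
# immediately preceded by the literal text "src=" or "dst=".
original = re.compile(r'(src=|dst=)0([^.]+.)0*([^.]+.)0*([^.]+.)0*(\d+)')

# Hypothetical variant (assumption, not from the thread): no key=value
# prefix, dots escaped, each octet limited to 1-3 significant digits.
no_prefix = re.compile(r'\b0*(\d{1,3})\.0*(\d{1,3})\.0*(\d{1,3})\.0*(\d{1,3})\b')

unchanged = original.sub(r'\1\2\3\4\5', line)
fixed = no_prefix.sub(r'\1.\2.\3.\4', line)

print(unchanged == line)  # True: nothing in the line is preceded by src=/dst=
print(fixed)
```

The sample line contains neither `src=` nor `dst=`, so the original expression has nothing to replace no matter where the props.conf is deployed. Whether the variant behaves the same inside a SEDCMD is untested here.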
So here's what's being deployed in the props.conf file to the universal forwarders. I'm getting the data but the SEDCMD isn't removing the leading zeros from IP addresses.
[source::file location]
[sourcetype]
FIELD_DELIMITER=,
FIELD_QUOTE = "
DATETIME_CONFIG = CURRENT
INDEXED_EXTRACTIONS = csv
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
SEDCMD-StripLeadingZeros = s/(src=|dst=)0*([^.]+.)0*([^.]+.)0*([^.]+.)0*(\d+)/\1\2\3\4\5/g
If you're applying INDEXED_EXTRACTIONS=csv then these events are cooked even on universal forwarders... so anything you do with those events would have to be configured there. See http://wiki.splunk.com/Community:HowIndexingWorks section 4, you're going through the structuredparsing queue on the forwarder.
So I copied the SEDCMD line to the props.conf belonging to the SplunkForwarder and reloaded the deploy server. The changes were imported and it reread the CSV file, but it did not remove the leading zeros from the IP address.
Could it be something else? I'm sort of out of ideas at this point. I even tried removing the leading zeros at search time but had a problem (I think maybe the quotes around the IP address, see below). I'd rather have this done at index time. Should I try to use a transform?
(this didn't work either ... | rex field=youripfield mode=sed "s/.0+/./g")
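Independent of the quotes, the search-time expression above has a problem of its own: the `.` is unescaped, so it matches any character, and the first octet has no dot in front of it. A rough check in Python (again only a stand-in for the sed-mode replacement, which is an assumption) shows the effect. The `strip_leading_zeros` helper is a hypothetical alternative, not something from this thread.

```python
import re

ip = "010.003.002.059"

# Same pattern as the rex attempt: the unescaped "." matches any
# character, so "10" in the first octet is swallowed as well.
attempt = re.sub(r'.0+', '.', ip)
print(attempt)  # 0..3.2.59

def strip_leading_zeros(value: str) -> str:
    """Hypothetical helper: drop leading zeros from each run of digits,
    keeping a single 0 when the run is all zeros."""
    return re.sub(r'\b0+(\d)', r'\1', value)

print(strip_leading_zeros(ip))  # 10.3.2.59
```

The same idea might carry over to search time as `s/\b0+(\d)/\1/g` in the rex call, but that is untested here and would also touch any other zero-padded number in the field.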
I've just taken another look at your question - make sure you use SEDCMD-something = ... and not SEDCMD = ...
Hi Martin. Is the "something" part arbitrary? I changed it to SEDCMD-RemoveLeadingZeros, but this didn't help either.
Yeah, that's arbitrary; it's only there to keep multiple SEDCMDs from overwriting each other.
Still no luck! I'm going bald from pulling my hair out!
Also, no indication of what happened in the splunkd.log of the SplunkForwarder.
Copy the index-time settings to the forwarder, yes. Don't move them, in case you ever input such a file locally. Also, don't move the search-time field extractions.
OK. So I should move the SEDCMD from the props.conf on the Indexer to forwarder, right? I suppose I could move the field extractions there as well.
SEDCMD happens at the parsing stage, so it applies on a heavy forwarder or on an indexer. If your forwarder is a heavy forwarder, the log data is already "cooked" when it arrives at the indexer, and the SEDCMD in props.conf there will have no effect. If the forwarder is not a heavy forwarder, this should work on the indexer.
Hi. We are not using a heavy forwarder.
Is this indexed using INDEXED_EXTRACTIONS=csv on the forwarder?
Yes, in the props.conf file.