Hi,
I can anonymize data in Splunk using props.conf and transforms.conf, but I cannot anonymize multiple occurrences within the same log event. I am trying to anonymize IP addresses. My setup and the output are below:
props.conf
[mysourcetype]
TRANSFORMS-anonymizeip = ip_anonymizer
transforms.conf
[ip_anonymizer]
REGEX = (.* )\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}(.*)
FORMAT = $1XXX.XXX.XXX.XXX$2
DEST_KEY = _raw
Log event example (before transform):
2016-03-31 09:03:52 testserv.net ProxySG: E0000 Access Log Connected to 192.168.1.101 and server 192.168.4.12:21.(0) NORMAL_EVENT
Log event example (after transform):
2016-03-31 09:03:52 testserv.net ProxySG: E0000 Access Log Connected to 192.168.1.101 and server XXX.XXX.XXX.XXX:21.(0) NORMAL_EVENT
Only the second IP address is masked.
Does anyone know what must be changed in the config?
Thanks for your help.
SirHill
You can also try SEDCMD in props.conf. To mask all IP addresses in the event, try something like this:
props.conf
[mysourcetype]
SEDCMD-anonymizeip = s/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/XXX.XXX.XXX.XXX/g
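A likely reason the original transform masks only the second address is that the leading `(.* )` is greedy, so it consumes everything up to the last IP in the event, and the single match is then rewritten once. The sketch below reproduces both behaviors on the sample event. It uses Python's `re` module as a stand-in for Splunk's regex engine, and the function names `transform_style` and `sed_style` are hypothetical, chosen only for illustration.

```python
import re

# Sample event from the question.
EVENT = (
    "2016-03-31 09:03:52 testserv.net ProxySG: E0000 Access Log "
    "Connected to 192.168.1.101 and server 192.168.4.12:21.(0) NORMAL_EVENT"
)

IPV4 = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"
MASK = "XXX.XXX.XXX.XXX"


def transform_style(event: str) -> str:
    """Approximate the transforms.conf stanza: one match, rebuilt as $1 + mask + $2."""
    m = re.search(r"(.* )" + IPV4 + r"(.*)", event)
    if m is None:
        return event
    # The greedy (.* ) has already swallowed every IP except the last one.
    return m.group(1) + MASK + m.group(2)


def sed_style(event: str) -> str:
    """Approximate SEDCMD s/.../.../g: replace every non-overlapping match."""
    return re.sub(IPV4, MASK, event)


print(transform_style(EVENT))
print(sed_style(EVENT))
```

The first call leaves 192.168.1.101 in place, as in the question's output, while the second masks both addresses.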
The other solution (REPEAT_MATCH = true) should work, but only after you restart all of your indexers AND it will only apply to NEWLY INDEXED events.
Here is another way to do it (the same "but onlys" apply) in props.conf:
[mysourcetype]
SEDCMD-anonymize_all_IPv4s = s/(\d{1,3}\.){3}\d{1,3}/IPv4_anonymized/g
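As a quick check that the shorter pattern with the repeated group finds the same addresses as the spelled-out one, here is a small sketch run against the sample event from the question. Python's `re` module is an assumption standing in for Splunk's regex engine, and the helper name `found` is hypothetical.

```python
import re

# Sample event from the question.
EVENT = (
    "2016-03-31 09:03:52 testserv.net ProxySG: E0000 Access Log "
    "Connected to 192.168.1.101 and server 192.168.4.12:21.(0) NORMAL_EVENT"
)

SPELLED_OUT = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")
REPEATED = re.compile(r"(\d{1,3}\.){3}\d{1,3}")


def found(pattern: re.Pattern, text: str) -> list:
    """Return the full text of every non-overlapping match."""
    return [m.group(0) for m in pattern.finditer(text)]


# Global replacement, as the trailing /g does in the sed expression.
masked = REPEATED.sub("IPv4_anonymized", EVENT)

print(found(SPELLED_OUT, EVENT) == found(REPEATED, EVENT))
print(masked)
```

Note that neither pattern validates octet ranges, so any four dot-separated groups of one to three digits would also be masked.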
Thanks, I will try again with REPEAT_MATCH = true, but it works fine with SEDCMD.
Perfect, it works fine!
Thanks!
Have you tried the REPEAT_MATCH = true attribute in your transforms.conf stanza?
Cheers, Greg.
I just tried it and it doesn't work; some log events were not collected at all. Reading the transforms.conf documentation, it seems that REPEAT_MATCH is only for field extraction:
NOTE: This attribute is only valid for index-time field extractions.
Am I reading the documentation correctly?