Hi Splunkies,
I have configured transforms.conf as below:
[ABCD]
REGEX = (?m)^(.*)("ABCD":")(\w+(\w{4}["].*))
FORMAT = $1$2xxxxx$4
DEST_KEY = _raw
Together with a props.conf below:
[host::]
TRANSFORMS-ABCD = ABCD
The logs that I have look like this:
"ABCD":"A1234567A", "ABCD":"A1234567B", "ABCD":"A1234567C"
And I intend to change them to:
"ABCD":"xxxxx567A", "ABCD":"xxxxx567B", "ABCD":"xxxxx567C"
However, the above configuration only anonymizes the last value, giving the following:
"ABCD":"A1234567A", "ABCD":"A1234567B", "ABCD":"xxxxx567C"
I have tried on https://regex101.com/, and there seems to be a global flag that will allow me to achieve my intended anonymizing.
I would greatly appreciate hearing from anyone who has gone through the same thing and was able to apply a global flag to the REGEX in transforms.conf.
The .* at the start of your regex is greedy and eats up as much of the string as it can (which ends up in $1) before trying to match the rest of your regex. That's why it only modifies the last occurrence of the pattern.
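To make that concrete with your sample event, the capture groups end up roughly like this (traced by hand against your regex, not run through Splunk):
$1 = "ABCD":"A1234567A", "ABCD":"A1234567B", (including the trailing space)
$2 = "ABCD":"
$4 = 567C"
So $1$2xxxxx$4 rebuilds the whole line with only the third value masked.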
Since you want to write to the raw event, you cannot use REPEAT_MATCH, so you'll need to do this in one go. That means writing a regex that matches the whole string of repeated patterns, and then writing the FORMAT setting accordingly. So if your sample data is accurate, try this:
[ABCD]
REGEX = (?m)^("ABCD":")(\w+(\w{4}["])),\s("ABCD":")(\w+(\w{4}["])),\s("ABCD":")(\w+(\w{4}["]))
FORMAT = $1xxxxx$3, $4xxxxx$6, $7xxxxx$9
DEST_KEY = _raw
But you might be better off taking a look at the SEDCMD method in props.conf for this kind of masking, especially if your sample was really just an example and your actual data varies in how many times the pattern occurs in the raw event and in whether there are any other strings around it.
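For reference, a SEDCMD version might look something like the sketch below. The stanza name your_sourcetype and the class name mask_abcd are placeholders I made up, so swap in whatever sourcetype, source or host stanza matches your data. The regex assumes the value is always a run of word characters followed by a closing quote, as in your sample. The g flag at the end is what makes it replace every occurrence in the event rather than just the first one:
[your_sourcetype]
SEDCMD-mask_abcd = s/("ABCD":")\w+(\w{4}")/\1xxxxx\2/g
This goes in props.conf on the instance that parses the data (indexer or heavy forwarder), and it does not need a transforms.conf stanza at all.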