Cannot transform string with regex

scottkurtosys
New Member

Hi

I am trying to transform a couple of strings that are being captured in my Splunk logs.

The strings are similar to this:

{"Key":"Authorization","Value":["Basic EAAAALhzFAxssvST1j4jBCAynyb3F9kHsHFWvijwNkuBb3pnY0zFtrz61YPlxQkP73l9p9ZusdBBfjSrDXgueEipT8xUuRk3tFPIAnmwFbGxluvRa3szorgtEq6VDXuIZL9RgA=="]},{"Key":"Authorization-Token","Value":["BCDC62F494410A7ABAE80457C9566F37"]}]

I have tested the following regular expressions with a couple of tools, and they seem to match:

"Authorization","Value":\["(Basic)\s[a-zA-Z0-9+\/]+={0,2}"

"Authorization-Token","Value":\["[a-zA-Z0-9+]+"

I have the following in my $SPLUNK_HOME/etc/system/local/props.conf file

[someapp]
TRANSFORMS-anonymize = authorization-anonymizer, authorization-token-anonymizer

And the following in my $SPLUNK_HOME/etc/system/local/transforms.conf file

[authorization-anonymizer]
REGEX = "Authorization","Value":["(Basic)\s[a-zA-Z0-9+\/]+={0,2}"
FORMAT = $1"Value":["Basic ##############################################################################################################################$2
DEST_KEY = _raw

[authorization-token-anonymizer]
REGEX = "Authorization-Token","Value":["[a-zA-Z0-9+]+"
FORMAT = $1"Value":["############################$2
DEST_KEY = _raw

The intention is to replace the strings with # characters, but I have clearly misunderstood something, as the strings are not changing.

Could anyone help?

Thanks

_scott

1 Solution

FrankVl
Ultra Champion

You're using $1 and $2 in your FORMAT values, while the first regex has only 1 capturing group and the second has none. So that doesn't line up, which is probably why these transforms are not getting applied.

I think you need to adjust your regexes, such that you're capturing the parts before and after the string that needs to be anonymized and then specify a format like $1#####$2.
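
The mismatch can be checked outside Splunk. The following is a minimal Python sketch, not Splunk itself: it compiles the two patterns from the question with Python's `re` module (close to, but not identical to, the PCRE engine Splunk uses) and compares the number of capturing groups with the highest `$n` reference in each FORMAT. The `highest_ref` helper is hypothetical, and the FORMAT strings are shortened to four `#` characters for readability.

```python
import re

# The two patterns as tested in the question (with the escaped "[").
basic = re.compile(r'"Authorization","Value":\["(Basic)\s[a-zA-Z0-9+\/]+={0,2}"')
token = re.compile(r'"Authorization-Token","Value":\["[a-zA-Z0-9+]+"')

# FORMAT values from the question, with the run of '#' shortened.
basic_format = '$1"Value":["Basic ####$2'
token_format = '$1"Value":["####$2'


def highest_ref(fmt):
    """Return the largest $n reference in a FORMAT string (0 if none).

    Hypothetical helper for illustration only.
    """
    return max((int(n) for n in re.findall(r"\$(\d+)", fmt)), default=0)


# Pattern.groups is the number of capturing groups in the pattern.
print(basic.groups, highest_ref(basic_format))  # 1 2
print(token.groups, highest_ref(token_format))  # 0 2
```

In both cases the FORMAT refers to `$2`, but neither pattern defines a second capturing group.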

somesoni2
Revered Legend

Give this a try (transforms.conf)

[authorization-anonymizer] 
REGEX =(?m)^(.*"Authorization","Value":\["Basic\s*)[^\"]+(.+)
FORMAT = $1####################$2 
DEST_KEY = _raw 

[authorization-token-anonymizer] 
REGEX =(?m)^(.*"Authorization-Token","Value":\[")[^\"]+(.+)
FORMAT = $1####################$2 
DEST_KEY = _raw
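
To see what these two stanzas would do to an event, here is a rough Python emulation. It is a sketch under assumptions: `apply_transform` is a hypothetical helper that mimics REGEX/FORMAT with DEST_KEY = _raw by replacing the whole event with the expanded FORMAT when the regex matches, Python's `re` stands in for Splunk's PCRE engine, and the sample event uses a shortened, made-up Basic value.

```python
import re


def apply_transform(raw, regex, fmt):
    """Emulate REGEX/FORMAT with DEST_KEY = _raw (hypothetical helper).

    On a match, the event is replaced by FORMAT with each $n expanded
    to capture group n. Without a match, the event is left unchanged.
    """
    m = re.search(regex, raw)
    if m is None:
        return raw
    return re.sub(r"\$(\d+)", lambda ref: m.group(int(ref.group(1))), fmt)


# Patterns and FORMAT copied from the stanzas above.
BASIC_REGEX = r'(?m)^(.*"Authorization","Value":\["Basic\s*)[^\"]+(.+)'
TOKEN_REGEX = r'(?m)^(.*"Authorization-Token","Value":\[")[^\"]+(.+)'
FORMAT = "$1####################$2"

# Shortened sample event; the Basic value here is invented for the example.
SAMPLE = (
    '{"Key":"Authorization","Value":["Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="]},'
    '{"Key":"Authorization-Token","Value":["BCDC62F494410A7ABAE80457C9566F37"]}]'
)

# Apply the two transforms one after the other, as the TRANSFORMS list would.
masked = apply_transform(SAMPLE, BASIC_REGEX, FORMAT)
masked = apply_transform(masked, TOKEN_REGEX, FORMAT)
print(masked)
```

Because `$1` and `$2` capture everything before and after the secret, the rest of the event survives when _raw is overwritten.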

scottkurtosys
New Member

So if I were to attempt to use something like this:

(.*"Authorization","Value":\["Basic\s)(.*={1,2})("]},{"Key":"Authorization-Token","Value":\[")(.{32})(.*)

Where each () capture group matches a section of the whole event.

Could I then use a FORMAT of $1 ##### $3 ##### $5

to hash out the two strings in a single transform?

Or am I still misunderstanding the capture groups and the FORMAT statement?

Also, do quote marks need to be escaped in Splunk regexes?

Thanks

FrankVl
Ultra Champion

Yes, something like that should work. Although there is not much purpose for putting the parts you don't want to keep in a capture group.
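
As an illustration of that point, the sketch below masks both values in one pass and captures only the three parts that are kept, so the replacement needs just three references. It is written in Python with `re.sub` rather than as a Splunk stanza, so treat it as an approximation; in transforms.conf the corresponding FORMAT would presumably be `$1#####$2#####$3`. It assumes the two headers always appear next to each other in this order, as in the sample event, and the Basic value is invented.

```python
import re

# Three capturing groups: before the Basic value, between the two
# secrets, and after the token. The secrets themselves are matched
# by [^"]+ but not captured, because they are thrown away.
SINGLE_REGEX = (
    r'^(.*"Authorization","Value":\["Basic\s)[^"]+'
    r'("\]\},\{"Key":"Authorization-Token","Value":\[")[^"]+(.*)$'
)

# Python's \1, \2, \3 play the role of $1, $2, $3 in a FORMAT value.
REPLACEMENT = r"\1#####\2#####\3"

# Shortened sample event; the Basic value here is invented for the example.
SAMPLE = (
    '{"Key":"Authorization","Value":["Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="]},'
    '{"Key":"Authorization-Token","Value":["BCDC62F494410A7ABAE80457C9566F37"]}]'
)

masked = re.sub(SINGLE_REGEX, REPLACEMENT, SAMPLE)
print(masked)
```

The trade-off of a single transform is that it only matches when both headers are present and adjacent; two separate transforms mask each value independently.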

scottkurtosys
New Member

Thanks for pointing me in the right direction. Have got it working now

🐵
