Splunk Search

Regex/sed replaces and issues with succeeding numbers

alekksi
Communicator

Hi all,

I'm having trouble getting a rex/sed replace to work cleanly. I'm trying to anonymise session IDs so that, in the few places where this has not yet been updated, they will still join with the other session IDs in the logs.

Assuming the session key is abc12345678xyz, where any of the characters can be a number or a letter, the current working regex replace I have is:

rex field=session_id mode=sed "s/^([\d\w]{3})[\d\w]{8}/\1.00000000/" | rex field=session_id mode=sed "s/\.//"

Obviously that's not particularly succinct or efficient, but with the alternative, I get the wrong result:

rex field=session_id mode=sed "s/^([\d\w]{3})[\d\w]{8}/\100000000/"

which gives me "\100000000xyz" instead of "abc00000000xyz" as the replacement.

Is there an easier way to do this? Or is there a way to terminate the backreference, so that I ask for the first captured group rather than the hundred-millionth?

Thanks in advance!
Best regards,
Alex

1 Solution

gcusello
SplunkTrust

Hi alekksi,
you have probably already seen https://docs.splunk.com/Documentation/Splunk/6.5.2/Data/Anonymizedata
Anyway, I'd use something like this:

rex field=session_id mode=sed "s/^(.{11})/10000000000/"

obtaining "10000000000xyz" from "abc12345678xyz". Note that this overwrites the first 11 characters, so the "abc" prefix is not kept.

Bye.
Giuseppe


DEAD_BEEF
Builder

To be clear, is this what you want? Your post is a bit confusing.

EXISTING
session_id abch573jfuixyz

DESIRED
session_id abc00000000xyz

Does the following meet your criteria?

rex field=session_id mode=sed "s/(?<=^.{3}).{8}/00000000/"
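
For anyone who wants to try the lookbehind idea without touching real data, here is an untested sketch using makeresults. It assumes rex in sed mode accepts a fixed-length lookbehind (Splunk's regex engine is PCRE-based); the "original" field and the sample value are only there for illustration. Because the first three characters are matched by the lookbehind rather than captured, the replacement needs no \1, which sidesteps the problem of \1 being followed by digits.

```
| makeresults
| eval session_id="abch573jfuixyz"
| eval original=session_id
| rex field=session_id mode=sed "s/(?<=^.{3}).{8}/00000000/"
| table original session_id
```

The goal is for session_id to read abc00000000xyz, with original keeping the input for comparison.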

alekksi
Communicator

I have seen that -- thank you for the link. We are using it in some places already. That said, this is anonymised at application level -- it will be fixed in a later version, but I still need to use the data at the moment.

Sorry I wasn't clear enough earlier:

This is the string I start with: "abch573jfuixyz"
This is the string I want: "abc00000000xyz"
This is the regex I am currently using: "s/^([\d\w]{3})[\d\w]{8}/\1.00000000/" | "s/\.//"

I realise that I used {3} above; the real prefix is actually 12 characters ({12}), but using 3 in this example is less hassle.


gcusello
SplunkTrust

If the length of your session_id is fixed, you could also use the eval command:

| eval session_id=substr(session_id,1,3)."00000000".substr(session_id,12)
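
A quick way to check this on a throwaway event is sketched below. The fields prefix_len, masked_len and anonymised are hypothetical names introduced only for this example, and the "." operator is used for concatenation so that a prefix made up entirely of digits is not treated as a number. With prefix_len=3 and masked_len=8 the second substr starts at position 12, the same as in the one-liner above.

```
| makeresults
| eval session_id="abch573jfuixyz"
| eval prefix_len=3, masked_len=8
| eval anonymised=substr(session_id,1,prefix_len)."00000000".substr(session_id,prefix_len+masked_len+1)
| table session_id anonymised
```

If the real prefix is longer, only prefix_len needs to change. The literal "00000000" has to stay as long as masked_len for the overall length to be preserved.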

Bye.
Giuseppe

alekksi
Communicator

Of course that's the obvious solution, should've thought of that. Many thanks!
