Looking for suggestions on how to mask email addresses that could be in almost any format in a JSON?

lycollicott
Motivator

I have a JSON with an agonizing amount of PII, mostly email addresses, but they appear in no standard format and at no standard position within the JSON. Here are just some examples of the formats:

\"email\":\"mr.rogers@bubba.com\"
\\\"email\\\":\\\"mr.rogers@bubba.com\\\"
\"loginNameOrEmail\": \"mr.rogers@bubba.com\"
\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"

I need to mask this in props and transforms before it gets indexed, and I need to somehow account for all formats, both known and unknown.

bjcross
Explorer

In props.conf, under the stanza for your sourcetype, add a SEDCMD along these lines:

SEDCMD-email = s/[\w!#$%&'+=?^_`{|}~.-]+@(?:[\w!#$%&'+=?^_`{|}~.-]+)*/XXXXX@EMAIL/g

https://docs.splunk.com/Documentation/Splunk/latest/Data/Anonymizedata#Anonymize_data_through_a_sed_...
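
Before deploying, it may help to try the pattern against the sample strings from the question. The following is a minimal sketch in Python, not Splunk itself: the names `EMAIL_PATTERN`, `mask_emails`, and `SAMPLES` are made up for illustration, and the character class is written with an ASCII backtick. Python's `re` and Splunk's PCRE agree on this syntax as far as I know, but treat the result as a sanity check only.

```python
import re

# Pattern taken from the SEDCMD suggestion, with an ASCII backtick in the
# character classes (assumption). Everything up to the first character that
# is not in the class (such as a backslash or a quote) is treated as part
# of the address.
EMAIL_PATTERN = re.compile(
    r"[\w!#$%&'+=?^_`{|}~.-]+@(?:[\w!#$%&'+=?^_`{|}~.-]+)*"
)
MASK = "XXXXX@EMAIL"


def mask_emails(raw_event: str) -> str:
    """Replace every match of EMAIL_PATTERN in the raw text with MASK."""
    return EMAIL_PATTERN.sub(MASK, raw_event)


# The four formats quoted in the question, kept as raw strings so the
# backslashes stay exactly as they appear in the event.
SAMPLES = [
    r'\"email\":\"mr.rogers@bubba.com\"',
    r'\\\"email\\\":\\\"mr.rogers@bubba.com\\\"',
    r'\"loginNameOrEmail\": \"mr.rogers@bubba.com\"',
    r'\\\"loginNameOrEmail\\\": \\\"mr.rogers@bubba.com\\\"',
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(mask_emails(sample))
```

Because the pattern keys on the address itself rather than on the field name or the level of escaping, all four samples are masked the same way. Note that the trailing `*` also lets a bare `user@` with no domain match, which is probably acceptable for masking but worth knowing.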