Splunk Search

Help in creating regex for encryption/masking of data at index time?

dreschke
Explorer

Hi Splunkers,

I am looking for help writing a regular expression to anonymize data at index time, using either SEDCMD in props.conf or a regular expression in transforms.conf.

Link: https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Data/Anonymizedata

Current Log format: Timestamp | Category | Machine | ApplicationDomain | ProcessId | ProcessName | ThreadId | LogID | UserName | ActionName | Module | AuthorizationStatus | RequestedBy | RequestingURL | QueryString | HTTPVerb | ClientIP| LogEvent="Response",MethodName="get.complete",ActionResult="Success",ApplicationNumber="1234567890",ApplicationLanguage="1",Section="SUMMARY",FirstName="Sherlock",LastName="Holmes",Gender="M",DateOfBirth="7/19/1976",SocialSecurityNumber="123456789",MaritalStatus="0",RaceInformation="Item8",CitizenshipCode="1",County="20",AddressLine1="221 Baker Street",City="Marylebone",State="London"

I want to write a regular expression that masks the values of all key-value pairs that come after ‘,MethodName="get.complete",’ (e.g. ApplicationNumber, FirstName, DateOfBirth, SocialSecurityNumber, MaritalStatus, etc.).

The order of the fields up to MethodName is constant and never changes. Every event has exactly the same order up to “MethodName”, with additional PII elements added after it.

Example of the fixed field order:

Timestamp | Category | Machine | ApplicationDomain | ProcessId | ProcessName | ThreadId | LogID | UserName | ActionName | Module | AuthorizationStatus | RequestedBy | RequestingURL | QueryString | HTTPVerb | ClientIP| LogEvent="Response",MethodName="get.complete",

Note: The position of the fields to be masked may change at times, but they will always be in key-value pair format (e.g. ,ApplicationNumber="1234567890",ApplicationLanguage="1",Section="SUMMARY",FirstName="Sherlock",LastName="Holmes",Gender="M",DateOfBirth="7/19/1976").

These are the two approaches I was planning to use to mask data at index time.

PROPS Example Using SEDCMD Regex:

[sourcetype]

SEDCMD-mask = regex to skip first three key-value pair and mask rest

OR

Transforms Example Using regex:

[ssn-anonymizer]
REGEX = regex to capture ssn
FORMAT = format to mask entire data
DEST_KEY = _raw

Thank you for all of your help and advice.

0 Karma

harsmarvania57
SplunkTrust

Hi @dreschke,

I have provided an answer to this question at https://answers.splunk.com/answers/595001/help-to-modify-existing-regex-to-mask-senstive-pii.html because that user has the same requirement.

DalJeanis
SplunkTrust

Okay, as long as you don't want to retain anything after the method, and the method name is always in quotes, it's pretty straightforward.

 SEDCMD-maskPHI =  s/(MethodName=\"[^\"]+\",).*$/\1/g

The above keeps everything up to and including the comma that follows the MethodName value, and deletes everything after that.

If you can't depend on the quotes, but you can depend on it being followed by a comma, then use this...

 SEDCMD-maskPHI =  s/(MethodName=[^,]+,).*$/\1/g
0 Karma
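
For anyone who wants to check the truncating pattern above before putting it in props.conf, here is a minimal Python sketch of the same substitution. It is only an offline approximation: the function name and the shortened sample event are made up for illustration, and Python's `re` module is not guaranteed to behave identically to the regex engine Splunk uses for SEDCMD.

```python
import re

# Python equivalent of the quoted-value pattern, for offline testing only.
# The escaped quotes (\") in the sed expression are plain quotes here.
TRUNCATE = re.compile(r'(MethodName="[^"]+",).*$')


def truncate_after_method(event: str) -> str:
    """Keep the event up to and including MethodName="...", and drop the rest."""
    return TRUNCATE.sub(r"\1", event, count=1)


# Shortened, hypothetical sample event based on the log format in the question.
sample = (
    'ClientIP| LogEvent="Response",MethodName="get.complete",'
    'ActionResult="Success",FirstName="Sherlock"'
)
print(truncate_after_method(sample))
# prints: ClientIP| LogEvent="Response",MethodName="get.complete",
```

Events that do not contain MethodName are returned unchanged, because the pattern never matches.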

dreschke
Explorer

Thank you for your answer, but we would like to keep the data after the method name by masking it. We do not want to delete or drop the data.

Ex. Timestamp | Category | Machine | ApplicationDomain | ProcessId | ProcessName | ThreadId | LogID | UserName | ActionName | Module | AuthorizationStatus | RequestedBy | RequestingURL | QueryString | HTTPVerb | ClientIP| LogEvent="Response",MethodName="get.complete",ActionResult="######",ApplicationNumber="#######",ApplicationLanguage="#######",Section="#######",FirstName="########",LastName="#######",

0 Karma
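
The masked output requested above (keep every key, replace every value after MethodName) can be prototyped outside Splunk. The sketch below is an assumption-laden illustration, not a tested props.conf or transforms.conf setting: the function name, the fixed-width `######` mask, and the shortened sample event are all hypothetical, and it assumes values never contain an embedded double quote.

```python
import re

# Anchor: everything up to and including MethodName="...", is left untouched.
ANCHOR = re.compile(r'MethodName="[^"]*",')
# A quoted value in a key="value" pair (assumes no embedded double quotes).
KV_VALUE = re.compile(r'="[^"]*"')


def mask_after_method(event: str, mask: str = "######") -> str:
    """Mask every quoted value that appears after the MethodName pair."""
    match = ANCHOR.search(event)
    if match is None:
        # No anchor found: return the event unchanged rather than guess.
        return event
    head, tail = event[: match.end()], event[match.end():]
    # Only the tail is rewritten, so LogEvent and MethodName stay readable.
    return head + KV_VALUE.sub(lambda _: f'="{mask}"', tail)


# Shortened, hypothetical sample event based on the log format in the question.
sample = (
    'ClientIP| LogEvent="Response",MethodName="get.complete",'
    'ActionResult="Success",FirstName="Sherlock",DateOfBirth="7/19/1976"'
)
print(mask_after_method(sample))
```

Splitting the event at the anchor first keeps the value regex simple and avoids touching anything before MethodName. Whether the same logic fits in a single SEDCMD or REGEX/FORMAT pair depends on the regex features available at index time, which this thread does not confirm.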

nishitdarade
Explorer

@DalJeanis did you get a chance to review derek's comment?

0 Karma

nishitdarade
Explorer

@mtulett_splunk can you help answer this?

0 Karma