Hi Splunkers,
I am looking for help modifying our current regex to meet our updated project criteria.
Link: https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/Data/Anonymizedata
Current Log format: Value1 | Value2 | Value3 | Value4 | Value5 | Value6 | Value7 | Value8 | Value9 | Value10 | Value11 | Value12 | ClientIP|
LogEvent="Response",MethodName="get.complete",ActionResult="Success",ApplicationNumber="1234567890",ApplicationLanguage="1",Section="SUMMARY",FirstName="jhon",LastName="doe",Gender="M",DateOfBirth="7/19/1993",SocialSecurityNumber="123456789",MaritalStatus="0",RaceInformation="Item8",CitizenshipCode="1",County="20",AddressLine1="221 Street",City="Washington",State="USA"
I want to write a regular expression that masks every key-value pair (essentially PII data) that appears after ,MethodName="get.complete", (e.g. ApplicationNumber, FirstName, DateOfBirth, SocialSecurityNumber, MaritalStatus, etc.).
The order of the fields up to MethodName is constant and never changes. Every event has exactly the same order up to "MethodName", with additional PII elements added after it.
Note: The position of the fields to be masked may change at times, but they will always be in key-value pair format (e.g. ,ApplicationNumber="1234567890",ApplicationLanguage="1",Section="SUMMARY",FirstName="Sherlock",LastName="Holmes",Gender="M",DateOfBirth="7/19/1976").
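To make the expected result concrete, here is a rough Python sketch of the transformation I am after. It is not Splunk configuration, and the helper name mask_after_method_name and the mask string are made up for illustration. It assumes everything after MethodName="...", should be masked, so ActionResult is masked too in this sketch.

```python
import re

# Hypothetical helper, for illustration only; this is not Splunk config.
# PREFIX keeps everything up to and including MethodName="...", untouched.
PREFIX = re.compile(r'^(.*?MethodName="[^"]*",)(.*)$', re.DOTALL)
# PAIR matches one key="value" pair; the key is captured, the value is not.
PAIR = re.compile(r'(\w+)="[^"]*"')


def mask_after_method_name(event: str, mask: str = "########") -> str:
    """Mask the value of every key="value" pair that follows MethodName."""
    m = PREFIX.match(event)
    if m is None:
        # No MethodName in the event: leave it unchanged.
        return event
    head, tail = m.groups()
    # Keep each key, replace each value, wherever the pair sits in the tail.
    return head + PAIR.sub(lambda p: f'{p.group(1)}="{mask}"', tail)


sample = ('LogEvent="Response",MethodName="get.complete",ActionResult="Success",'
          'FirstName="jhon",SocialSecurityNumber="123456789"')
print(mask_after_method_name(sample))
```

Because the pairs are matched by shape rather than by position, the result does not depend on the order of the PII fields.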
The following are the solutions I was planning to use to mask the data at index time.
PROPS Example Using SEDCMD Regex:
[sourcetype]
SEDCMD-mask = regex to skip first three key-value pairs and mask the rest
OR
Transforms Example Using regex:
[ssn-anonymizer]
REGEX = regex to capture ssn
FORMAT = format to mask entire data
DEST_KEY = _raw
Our current approaches are not meeting this requirement.
1. The expression below drops all values after MethodName instead of masking them.
SEDCMD-maskPHI = s/(MethodName=\"[^\"]+\",).*$/\1/g
2. The regex below masks all key-value pairs after the last |, but we need to mask only what follows MethodName="get.complete".
SEDCMD-maskall = s/(\w+)="(?:(?:(?!\s*?\|).)*?)"(?!.*\|)/\1="########"/g
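The behavior of the two attempts above can be reproduced outside Splunk with a short Python sketch. This is only an approximation: Python's re module is not the engine Splunk uses for SEDCMD, the sample event is shortened, and V1 | V2 stands in for the pipe-delimited prefix.

```python
import re

# Shortened, made-up event with a pipe-delimited prefix (an assumption).
sample = ('V1 | V2 | LogEvent="Response",MethodName="get.complete",'
          'FirstName="jhon",SocialSecurityNumber="123456789"')

# Attempt 1: ".*$" sits outside the capture group, so the tail is matched
# and then replaced by \1 alone, which deletes it instead of masking it.
attempt1 = re.sub(r'(MethodName="[^"]+",).*$', r'\1', sample)
print(attempt1)

# Attempt 2: the trailing (?!.*\|) only requires that no pipe follows the
# pair, so every pair after the last "|" is masked, MethodName included.
attempt2 = re.sub(r'(\w+)="(?:(?:(?!\s*?\|).)*?)"(?!.*\|)',
                  r'\1="########"', sample)
print(attempt2)
```

Neither pattern ties the masking to the position of MethodName while keeping the keys, which is the part I am missing.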
Thank you for all of your help and advice.
[Edit: fixed formatting and used the code button so characters no longer are being eaten.]