I have looked at answers for this already, but when I try any of them, my search still shows the unmasked data.
Sample data:
10.98.112.52 - myid [06/Oct/2015:09:42:39 -0400] "GET /mySeal/VUEIT/myPortal/ApplicationSelection/CheckDeatilsSSN?SSN=123-45-6789&TAXID=&RecipientType=Agent&_= HTTP/1.1" 200 22
Tried the following:
props.conf:
[source::WebProxy]
TRANSFORMS-anonymize = ssn-web-anonymizer
transforms.conf:
[ssn-web-anonymizer]
REGEX = (?m)^(.*)SSN=\d\d\d\-\d\d\-\d\d\d\d(\&.*)$
FORMAT = $1SSN=###-##-####$2
DEST_KEY = _raw
Also tried:
props.conf:
[source::WebProxy]
SEDCMD-hidessn=s/(SSN=\d{3})\-(\d{2})\-(\d{4})/SSN=xxx\-xx\-xxxx/g
What am I doing wrong?
Again I need the data masked no matter where the search is being done.
Data that has already been indexed in Splunk is immutable. If this were not so, Splunk would be fairly useless for compliance purposes. The best that you can do is to use the delete command to hide the data (as I said, it is immutable, so it doesn't really get deleted, but it does become unsearchable). You could also use roles to limit who has access to the data.
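As a rough sketch of the delete approach, something like the search below could be used. The index name normal_index is a placeholder (it is the one used in the example search in this answer), and the source is taken from the question's props.conf stanza. Run the search without the final delete first and check that it returns only the events you mean to hide. Note that delete needs a role with the can_delete capability, which even admin does not have by default.

```spl
index=normal_index source=WebProxy
| regex _raw="SSN=\d{3}-\d{2}-\d{4}"
| delete
```

The regex step keeps only events that actually contain a populated SSN parameter, so requests with an empty SSN= stay searchable.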
One other crazy idea is to manually modify the data and then reindex it by sending it into a summary index using the collect command, and then use both indexes in your searches until the data ages out, like this:
index=normal_index OR index=my_SI_hack ...
This has the benefit of not costing you any license, so you could slam 11 months of backlog in and not go over (and then delete the original events).
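A minimal sketch of that re-masking search, assuming the same placeholder index names as above (normal_index and my_SI_hack) and that my_SI_hack already exists. It rewrites _raw at search time with rex in sed mode and writes the masked copies out with collect:

```spl
index=normal_index source=WebProxy
| regex _raw="SSN=\d{3}-\d{2}-\d{4}"
| rex mode=sed field=_raw "s/SSN=\d{3}-\d{2}-\d{4}/SSN=###-##-####/g"
| collect index=my_SI_hack
```

The license-free part depends on collect keeping its default stash sourcetype; if you override the sourcetype, the data counts against your license. Try it on a small time range first and confirm that the timestamps and the fields you rely on look right in my_SI_hack before running it over the whole backlog and deleting the originals.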
Did this explain what you were seeing?
Again, I need already-indexed data masked no matter where or how the search is being done.
It seems that if this is not possible, I would have to delete all data for the past 11 months, and that really is not an acceptable option.
Both of your solutions work fine for me. Did you make these changes on the indexer or heavy forwarder, and did you restart the Splunk instance there?
Also, this will mask future data only; historical data will remain as it is.