I found out that one of my web logs that Splunk's been eating contains data that I need to mask out. So, I've got two problems to solve:
(a) Removing the sensitive data (though not the WHOLE event) from already-indexed data, and
(b) Making it so newly-indexed data has this same data masked.
What's my best way to approach this? It's data like this that I'm trying to mask:
173.103.16.2 - - [10/Jun/2011:16:09:27 -0500] "GET /admin/load-scripts.jsp?c=1&failedPassword=FAILEDPASSWORDIWANTTOMASK&otheroptions=3
This is a simple trick to mask data at search time. Get the part of the event to mask with a "rex" command, then modify the "_raw" field with the masked data.
From the original event, mask the last 5 digits of accountNumber. Original event:
2016-04-06 12:24:06,Event [Event=UpdateBillingProvQuote, timestamp=1337891259, properties={JMSCorrelationID=NA, JMSMessageID=ID:ESP-PD.F4CB3B4B9EF87:AA49A1BD, orderType=FeatureChange, quotePriority=NORMAL, conversationId=ESB~16214F4A71D1DA77:E35B0544:0F2958EEF3F0:B580, credits=NA, JMSReplyTo=pub.esb.genericasync.response, timeToLive=-1, serviceName=UpdateBillingProvisioning, esn=7F758AD4A3B86F, accountNumber=900013479, MethodName=InternalEvent, AdapterName=UpdateBillingProvQuote, meid=NA, orderNumber=19256698, quoteNumber=75909847, ReplyTo=NA, userName=temordia, EventConversationID=NA, mdn=5789374447, accountType=PrePaid, marketCity="ARVADA", marketState=CO, marketZip=80006, billingCycle=27, autoBillPayment=T, phoneCode=HE4G, phoneType=Android, phoneName="HTC Evo 4G", planCode=ULPRE50, planType=PrePaid, planPrice=50.00, planName="Unlimited Prepaid", planDescription="Nationwide Prepaid Unlimited Minutes", networkProviderName=Splunktel}]
New search:
index=oidemo sourcetype=business_event | rex "^(?<head>.*accountNumber=\d+)\d{5}(?<tail>,.*)$" | eval _raw=head."XXXX".tail
The new event now looks like this:
2016-04-06 12:24:06,Event [Event=UpdateBillingProvQuote, timestamp=1337891259, properties={JMSCorrelationID=NA, JMSMessageID=ID:ESP-PD.F4CB3B4B9EF87:AA49A1BD, orderType=FeatureChange, quotePriority=NORMAL, conversationId=ESB~16214F4A71D1DA77:E35B0544:0F2958EEF3F0:B580, credits=NA, JMSReplyTo=pub.esb.genericasync.response, timeToLive=-1, serviceName=UpdateBillingProvisioning, esn=7F758AD4A3B86F, accountNumber=9000XXXX, MethodName=InternalEvent, AdapterName=UpdateBillingProvQuote, meid=NA, orderNumber=19256698, quoteNumber=75909847, ReplyTo=NA, userName=temordia, EventConversationID=NA, mdn=5789374447, accountType=PrePaid, marketCity="ARVADA", marketState=CO, marketZip=80006, billingCycle=27, autoBillPayment=T, phoneCode=HE4G, phoneType=Android, phoneName="HTC Evo 4G", planCode=ULPRE50, planType=PrePaid, planPrice=50.00, planName="Unlimited Prepaid", planDescription="Nationwide Prepaid Unlimited Minutes", networkProviderName=Splunktel}]
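A more compact variant of the same search-time trick, as a sketch: "rex" in sed mode can rewrite "_raw" in one step, so the head and tail fields are not needed. It uses the same index and sourcetype as the search above and, like that search, only changes what this one search displays; the indexed data is untouched.

```
index=oidemo sourcetype=business_event
| rex mode=sed field=_raw "s/(accountNumber=\d+)\d{5}/\1XXXX/"
```

The capture group keeps the leading digits and the five digits after it are replaced, so the rest of the event (including the comma after the account number) is left as it was.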
This works perfectly!!
I downvoted this post because it only works if you write every search for the users.
You are right, there is no way to mask the data at search time. This solution is only for hiding the data in a specific dashboard panel or report.
Lisa, I do agree with your comments, but we also had users who required that visibility of the original raw data be limited to certain roles. So "masking" at search time with "rex", together with disabling drilldown, was the solution we adopted.
Do you know of any other search time solution?
Regards,
Marco
There is no way to mask the data for only a subset of users at search time, unless you are going to write every search for that subset of users and prevent those users from accessing the search bar in any way.
One alternative could be to route only the sensitive data to a special index. Most of the data could then go to indexes that are widely visible and that users can search. The sensitive data would go in a special index that only some roles could access. For others to access the special index, they could be required to use dashboards, etc. that limit/mask their access. You would still need to be careful with those dashboards, etc. to make sure that techniques like drill-down would not compromise the security of the data.
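As a rough sketch of that routing idea, the stanzas below send any event containing "failedPassword=" to a separate index at index time and let only one role search it. The sourcetype name "access_combined", the index name "sensitive_web", and the role name "security_analyst" are all assumptions for illustration; substitute your own.

```
# props.conf (on the indexer or heavy forwarder)
# "access_combined" is an assumed sourcetype name
[access_combined]
TRANSFORMS-route_sensitive = route_sensitive_events

# transforms.conf
# "sensitive_web" is a hypothetical index; it must already exist
[route_sensitive_events]
REGEX = failedPassword=
DEST_KEY = _MetaData:Index
FORMAT = sensitive_web

# authorize.conf
# "security_analyst" is a hypothetical role allowed to search the restricted index
[role_security_analyst]
srchIndexesAllowed = sensitive_web
```

For this to restrict anything, the other roles must not list "sensitive_web" (or a wildcard that matches it) in their own srchIndexesAllowed. Routing only affects data indexed after the change.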
This solution does not meet my definition of "masking"
This hides the data for just this search alone.
So this solution will work only in a dashboard and only if you have also disabled drill-down and disabled "open in search." A user who drills down, or who uses the magnifying glass to "open in search", will be able to circumvent the masking.
Thus my earlier answer.
If you’re willing to use a third-party tool (Eclipse GUI) for masking, you can mask it next time (before you re-index it) with this one:
You can mask sensitive data at index time. (Ask more questions if that's not sufficient information!)
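For the failedPassword example in the question, a minimal sketch of index-time masking is a SEDCMD setting in props.conf. The sourcetype name "access_combined" is an assumption; use whatever sourcetype your web logs actually have.

```
# props.conf (on the indexer or heavy forwarder)
# "access_combined" is an assumed sourcetype name
[access_combined]
# Replace the failedPassword value, up to the next "&", whitespace, or quote
SEDCMD-mask_failed_password = s/failedPassword=[^&\s"]+/failedPassword=XXXXXXXX/g
```

This only applies to events indexed after the setting is in place.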
However, once the data has been indexed, there is no way to change it. Not possible.
All you can do is delete the data and re-index it. You can't mask it at search time.
I know that isn't the answer that you wanted... sorry!
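For the delete step, a sketch under assumptions: the index name "web" and sourcetype "access_combined" are placeholders. Run the search without "| delete" first and check that it returns only the events you mean to remove.

```
index=web sourcetype=access_combined "failedPassword" | delete
```

The "delete" command needs a role with the delete capability (the built-in "can_delete" role has it). It makes the matching events unsearchable rather than removing them from disk, and it deletes whole events, so the original log then has to be re-indexed with index-time masking in place.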
I downvoted this post because there is now a way to mask sensitive data at search time as well. Please see the last answer below.
I disagree with your down-vote. See my comment below.
Being able to hide the data in a single search does not mask it. For the "trick" to work, users cannot be allowed to access the search bar.