How can I anonymize certain fields from data inputs in Splunk Cloud when ingesting logs from AWS S3 buckets?
Hi there,
You can use regular expressions to match the fields and mask them with settings in props.conf and transforms.conf. If you'd like help with the .conf settings, please share some details about your architecture and a few sample events.
Please refer to this link for instructions.
Are those the same steps for Splunk Cloud as for Splunk Enterprise?
I'm looking at some CloudFront logs being ingested through an S3 bucket input into a Splunk Cloud instance. I found some notes at https://answers.splunk.com/answers/149597/im-struggling-with-how-i-should-be-doing-inputs-and-also-p... that might apply, but I've also found notes saying it's impossible to anonymize this data after indexing. Do I need to transform it with a sed script before Splunk Cloud is even in the picture, or is there a configuration I'm missing (a field transform?)?
Per documentation, "To anonymize data with Splunk Cloud, you must configure a Splunk Enterprise instance as a heavy forwarder and anonymize the incoming data with that instance before sending it to Splunk Cloud. You can follow the instructions in this topic on the heavy forwarder."
Yes. Once data is indexed, you can't change it, so you should mask your data before it reaches the indexers in Splunk Cloud.
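For what it's worth, here is a rough sketch of what the masking on the heavy forwarder could look like with props.conf and transforms.conf. It assumes the S3 input itself runs on the heavy forwarder, that the events arrive with the sourcetype aws:cloudfront:accesslogs (an assumption; check what your input actually assigns), and that the field to hide is the client IP address. The stanza name mask-client-ip and the x.x.x.x placeholder are made up for illustration, so adjust the regex to your own sample events before relying on it.

```
# props.conf on the heavy forwarder
# Sourcetype name is an assumption; use whatever your S3 input assigns.
[aws:cloudfront:accesslogs]
TRANSFORMS-anonymize = mask-client-ip

# transforms.conf on the heavy forwarder
# "mask-client-ip" is a hypothetical stanza name.
[mask-client-ip]
# Capture everything before and after the first IPv4 address in the event.
REGEX = (?m)^(.*?)\b\d{1,3}(?:\.\d{1,3}){3}\b(.*)$
# Rewrite the raw event with the address replaced by a placeholder.
FORMAT = $1x.x.x.x$2
DEST_KEY = _raw
```

Note that this transform only replaces the first address it finds in each event. If the value can appear more than once, a SEDCMD-<class> setting in props.conf with a sed-style s/.../.../g expression may be the simpler option, since the g flag replaces every match.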