We need to index content that may contain inline gzip (or otherwise compressed) data. We do not need to search on the compressed data, but we do need to be able to read it back out of Splunk and have it still be valid for decompression and display.
I've done some searching through the documentation and knowledge base but have not found any pages that address the topic of gzip content mingled into text log content.
In our case, the file Splunk is forwarding has a message delimiter that we use for our line breaker, then one line of data that we parse with a REPORT regex, then the content of the message that we are handling. That content, which includes line breaks, usually has some plain-text headers, some other text, and then a body that might be JSON, XML, or gzip or otherwise compressed data.
We control both the writing and the use of the content, so for example it would be possible for us to Base64-encode any binary content before we write it to the log file, then have our application decode it just prior to use, making the log content plain text the rest of the way through.
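To make the Base64 idea concrete, here is a minimal Python sketch of the round trip we have in mind. The function names `encode_for_log` and `decode_from_log` are made up for illustration and are not part of Splunk or of our application; the sketch only assumes that the encoded text comes back out of Splunk unchanged apart from possible added whitespace.

```python
import base64
import gzip


def encode_for_log(payload: bytes) -> str:
    """Turn binary content into a single line of ASCII text for the log file."""
    return base64.b64encode(payload).decode("ascii")


def decode_from_log(text: str) -> bytes:
    """Recover the original bytes from text read back out of the index."""
    # Drop any whitespace or line breaks picked up along the way,
    # then reject anything that is not valid Base64.
    cleaned = "".join(text.split())
    return base64.b64decode(cleaned, validate=True)


if __name__ == "__main__":
    original = b'{"status": "ok", "items": [1, 2, 3]}'
    compressed = gzip.compress(original)      # binary body, as our writer would produce it

    logged = encode_for_log(compressed)       # plain text that goes into the log file
    restored = decode_from_log(logged)        # what our application does before use

    assert gzip.decompress(restored) == original
    print(logged)
```

Base64 output is about one third larger than the binary input, so this trades some index volume for content that is plain text from end to end.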
We would appreciate your advice and recommendations on how best to accomplish this.