I've been evaluating Splunk against a custom application consisting of a cluster of Tomcat instances running two separate applications (which partially share classes) plus some front-end and back-end Apache instances. I imported a month's worth of log data (all at once) as a test, and have been playing around with it.
Firstly, it seems that Splunk reduces the data to about half its original size. If possible I'd like the indexing to be even more efficient, as a lot of the log data is duplicated.
Looking at how Splunk has processed the incoming data from the Tomcat application (log4j), it seems to have parsed only the timestamp and nothing further (it's possible I'm missing something), so a log line is pretty much treated as an opaque string. I later used field extraction (from the search) to pull out fields such as the log level, the actual Java class, etc., but the concept is still a bit foreign to me (despite reading through a lot of documentation).
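To make that concrete, the log lines look roughly like this (names and values are made up):

    2014-03-12 10:15:32,123 INFO  [http-8080-1] com.example.app.OrderService - Order processed

and the kind of search-time extraction I mean is along these lines (the field names log_level, thread and java_class are just my own choices):

    sourcetype=tomcat_app | rex "^\S+\s+\S+\s+(?<log_level>\w+)\s+\[(?<thread>[^\]]+)\]\s+(?<java_class>\S+)"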
If I keep importing more data, will Splunk automatically apply the extracted fields to new events with the same sourcetype?
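In other words, as I understand the docs, saving the extraction amounts to a props.conf entry along these lines (sourcetype name made up), which would then apply at search time to every event of that sourcetype:

    [tomcat_app]
    EXTRACT-log4j = ^\S+\s+\S+\s+(?<log_level>\w+)\s+\[(?<thread>[^\]]+)\]\s+(?<java_class>\S+)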
If I specified the field extractions so that they happen at index time instead, would that reduce the size of the indexes stored on disk?
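If I've read the documentation correctly, index-time extraction would mean something like the following in props.conf and transforms.conf (again, stanza and field names are made up), with the fields written into the index alongside the raw event:

    # props.conf
    [tomcat_app]
    TRANSFORMS-log4j = tomcat_log4j_fields

    # transforms.conf
    [tomcat_log4j_fields]
    REGEX = ^\S+\s+\S+\s+(\w+)\s+\[([^\]]+)\]\s+(\S+)
    FORMAT = log_level::$1 thread::$2 java_class::$3
    WRITE_META = true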
This last point in particular confuses me, as the documentation mentions that field extraction at index time can actually increase the size of the indexes. If I have 500 Java classes producing a million log lines a day, wouldn't separating the class name out of the bulk of the log line during indexing actually reduce the index size (especially if the alternative is a saved search producing a dashboard out of the same data anyway)?