We have a set of applications and infrastructure, on the order of thousands of nodes, whose statistics, diagnostics, and log information are written to a geo-distributed Cassandra cluster. We need near-real-time and historical analysis and viewing of this data. What mechanism could we use to get this data into Splunk, and what advantage, if any, would that offer? Is there a plan to support Cassandra? Is anyone working on it?
There is a great deal of advantage to this approach, and several companies already use existing work to do exactly what you describe at this kind of scale. There may be a more fully supported app or product SKU for this in the future, but in the meantime feel free to use this Splunk app to get started: https://github.com/esatterly/splunk-cassandra. Also, if you can wait until the end of next week, the updated CQL 3 driver and code will be added to it; right now they are still in testing.
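As for a mechanism, one generic option (separate from the app linked above, and only a rough sketch) is a Splunk scripted input: Splunk runs a script on an interval and indexes whatever it prints to stdout, and it extracts key=value pairs from events at search time. The example below assumes rows arrive as dicts with a `ts` timestamp column. `fetch_rows` is a hypothetical stand-in for a real Cassandra query, and the column names and sample row are made up for illustration.

```python
import sys
from datetime import datetime, timezone


def format_event(row):
    """Render one row (a dict) as a single timestamped key="value" line."""
    # Millisecond-precision UTC timestamp at the start of the event line.
    ts = row["ts"].astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    # Emit remaining columns in sorted order; swap double quotes so values stay parseable.
    fields = " ".join(
        '{}="{}"'.format(key, str(value).replace('"', "'"))
        for key, value in sorted(row.items())
        if key != "ts"
    )
    return "{} {}".format(ts, fields)


def fetch_rows():
    """Hypothetical placeholder: replace with a query against your Cassandra cluster."""
    return [
        {
            "ts": datetime(2013, 1, 1, 12, 0, 0, tzinfo=timezone.utc),  # made-up sample data
            "node": "app-01",
            "metric": "cpu_pct",
            "value": 42.5,
        },
    ]


def main(out=sys.stdout):
    # A scripted input only needs to write one event per line to stdout.
    for row in fetch_rows():
        out.write(format_event(row) + "\n")


if __name__ == "__main__":
    main()
```

A real version would also need to remember the last timestamp it read so that each run only emits new rows; that bookkeeping is left out here.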