Splunk on Hadoop

riteshbansal · ‎03-21-2013

Hello Team,

I would like to know what kind of connectivity Splunk has with Hadoop and HDFS.

I noticed that the index creation part of Splunk takes a good amount of time, so I would like to know the following:

Is it possible to install Splunk over HDFS? If we have weblog data on HDFS, can Splunk index creation be done using MR (MapReduce) jobs?
How does Splunk store the data? If I have connected it to multiple servers to fetch web log data, will it pull all the data to the local server, create the index there, and store the index locally as well?

Thanks in advance,
Ritesh

araitz · ‎03-21-2013

See http://www.splunk.com/view/hadoop-connect/SP-CAAAHA3

Splunk itself does not run on HDFS, but Hadoop Connect facilitates interaction with it.

We also have Hadoop Ops for monitoring and troubleshooting Hadoop deployments: http://splunk-base.splunk.com/apps/57004/splunk-app-for-hadoopops

Splunk stores data in a distributed fashion on machines called 'indexers'. Generally, indexers are separate machines from the ones where the data is created. You can use a 'forwarder' to get data from production machines to indexers. Many indexers can be searched at the same time from a machine configured as a 'search head'.
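
To make the data flow concrete, here is a toy Python sketch of the roles described above. It is not Splunk code and does not use any Splunk API: the class names (`Indexer`, `Forwarder`, `SearchHead`), the round-robin distribution, and the sample log lines are all assumptions made for illustration. It only shows the idea that events end up stored on the indexers rather than on one local server, and that a search head queries every indexer and merges the results.

```python
from itertools import cycle


class Indexer:
    """Toy stand-in for an indexer: stores the events it receives locally."""

    def __init__(self, name):
        self.name = name
        self.events = []

    def index(self, event):
        self.events.append(event)

    def search(self, term):
        # Plain substring match; real indexing is far more involved.
        return [e for e in self.events if term in e]


class Forwarder:
    """Toy forwarder: ships events from a production machine to indexers."""

    def __init__(self, indexers):
        # Round-robin is an assumption chosen to keep the sketch simple.
        self._targets = cycle(indexers)

    def send(self, event):
        next(self._targets).index(event)


class SearchHead:
    """Toy search head: asks every indexer and merges the answers."""

    def __init__(self, indexers):
        self.indexers = indexers

    def search(self, term):
        results = []
        for indexer in self.indexers:
            results.extend(indexer.search(term))
        return results


indexers = [Indexer("idx1"), Indexer("idx2")]
forwarder = Forwarder(indexers)
for line in ["GET /home 200", "GET /cart 500", "POST /login 200"]:
    forwarder.send(line)

search_head = SearchHead(indexers)
hits = search_head.search("200")
print(hits)
```

In this sketch no single machine holds all of the data: each hypothetical indexer keeps only the events sent to it, and the search is fanned out across all of them.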
