
How Much Latency Does Hadoop Connect Add?

David
Splunk Employee

I'm trying to figure out how much additional delay Hadoop Connect would add on top of my existing Splunk log latency when getting data into Hadoop. That is, most of my logs are available in Splunk within seconds of being created. How long before they would be available in Hadoop? Minutes, hours, or days?

Thank you

1 Solution

rdagan_splunk
Splunk Employee

With Hadoop Connect Export, the minimum allowed export frequency is every 5 minutes.
So at best, an export search starts every 5 minutes. As the job runs, Splunk processes chunks of data received from the search and writes compressed files locally on the search head. These files are moved to HDFS when a file reaches 64 MB, when the files cumulatively consume more than 1 GB, or when the search completes successfully.
Therefore, for a short search with few results, I would say you will get a new file in HDFS roughly every 6 minutes. For larger result sets, it will take longer for a file to reach 64 MB and for that 64 MB to be moved into HDFS.
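
To make the timing above concrete, here is a rough Python sketch of the flush rule and a back-of-the-envelope latency estimate. It is only an illustration under stated assumptions: the function and constant names are hypothetical (they are not Hadoop Connect settings or APIs), the thresholds are the ones quoted above, and the latency model simply assumes an event that just missed one scheduled search has to wait for the next one to run and transfer its output.

```python
# Illustrative sketch only; names are hypothetical, not Hadoop Connect APIs.
# Thresholds are the ones quoted in the answer above.
MIN_EXPORT_INTERVAL_S = 5 * 60      # minimum schedule: one export search every 5 minutes
FILE_FLUSH_BYTES = 64 * 1024 ** 2   # a local file is moved once it reaches 64 MB
TOTAL_FLUSH_BYTES = 1024 ** 3       # ...or once local files exceed 1 GB in total


def should_move_to_hdfs(file_bytes, total_local_bytes, search_done):
    """Return True if any of the three conditions described above holds."""
    return (
        file_bytes >= FILE_FLUSH_BYTES
        or total_local_bytes > TOTAL_FLUSH_BYTES
        or search_done
    )


def worst_case_latency_s(search_runtime_s, transfer_s,
                         interval_s=MIN_EXPORT_INTERVAL_S):
    """Assumed model: an event indexed just after a search starts waits one
    full interval for the next search, then for that search to finish and
    for its file to be copied to HDFS."""
    return interval_s + search_runtime_s + transfer_s


if __name__ == "__main__":
    # Hypothetical small export: 45 s of search time plus 15 s of transfer.
    print(worst_case_latency_s(45, 15) / 60, "minutes")
```

With those made-up runtime and transfer numbers the estimate comes out at 6 minutes, in line with the rough figure above; real values depend on search size, search head load, and HDFS throughput.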

