Hunk does control the number of mappers: the input format it uses determines the number of splits, which in turn determines the number of mappers. As far as I can tell from the Splunk source code, it has no CombineFileInputFormat support, so unless Hunk adopts it in its code, we won't get this feature.
Hunk should seriously consider adding this feature, since small files are inevitable when we batch load at short intervals. The small-file problem has already been solved in Hadoop with CombineFileInputFormat and in Hive with CombineHiveInputFormat. It should be simple and efficient for Hunk to adopt the same approach.
Preparing the data as large files beforehand is not a good option.
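
To show why combining matters, here is a rough sketch of the idea behind CombineFileInputFormat: pack many small files into one split until a size cap is reached, so that one mapper handles many files. This is not Hunk or Hadoop code. The class and method names (CombineSplitSketch, combine) are made up for illustration, and the sketch ignores things the real input format does, such as grouping blocks by node and rack locality and breaking up files larger than the cap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: greedy packing of file sizes into combined splits.
public class CombineSplitSketch {

    // Returns one inner list per split; each inner list holds the sizes of the files in it.
    static List<List<Long>> combine(long[] fileSizes, long maxSplitSize) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long size : fileSizes) {
            // Close the current split if adding this file would exceed the cap.
            if (!current.isEmpty() && currentSize + size > maxSplitSize) {
                splits.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] files = new long[100];      // 100 small files
        Arrays.fill(files, 4 * mb);        // 4 MB each (assumed sizes)

        // Without combining: one split, and so one mapper, per file.
        System.out.println("one split per file: " + files.length + " mappers");
        // With combining at an assumed 128 MB cap: far fewer splits.
        System.out.println("combined: " + combine(files, 128 * mb).size() + " mappers");
    }
}
```

With these assumed numbers, 100 mappers drop to 4, because 32 files of 4 MB fit into each 128 MB split.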