We would like to know how to use the three different explicit modes correctly and how to use the implicit ones correctly in the context of Hunk.
https://answers.splunk.com/answers/239341/in-hunk-verbose-mode-vs-smart-mode-for-vix-virtual.html touches on the subject...
In most cases, you should use smart mode. The other modes are useful for diagnosing problems. For example, if Hunk is able to talk to HDFS but not able to run MR jobs due to configuration problems, then searches should run successfully in verbose mode, but not in smart mode. Verbose mode will also cause the search.log to contain more information.
Great. About the implicit search modes part of the question - how do we know for sure that the "regular" Splunk search was converted to a Hunk MapR based search?
Maybe I'm not clear - we want the users to know whether a MapR job was produced and that they are not running Hunk in Splunk mode. How can they know for sure that they are in Hunk mode?
Raanan explained that adding the stats command triggers the MapR job, but I'm not sure whether it's the only way.
Regards,
Dan
Sorry for losing track of this thread. I did not mean to leave your question unanswered for so long.
You can tell how a search was done via either the job inspector or the search log. To see the first, after running your search, on the results page go to Job -> Inspect Job. From there, you can also click on the link "search.log".
If you run a search with a reporting command, like "index=my_vix | stats count", you will normally get a "mixed mode" search. This means Hunk will launch a Map Reduce job, and stream data while waiting for the job to complete. In the job inspector, you should see entries for:

erp..MR
erp..stream.bytes
which give you information about the Map Reduce and streaming parts of the search, respectively. If you look in search.log, you will find lines that look like this (note the word "mixed"):
03-14-2016 11:25:27.508 INFO ExternalResultProvider - provider=, mode.config=report, mode.search=mixed
...
03-14-2016 11:25:29.225 INFO ERP. - SplunkMR$SearchHandler - Search mode: mixed
Now if you run a search like "index=my_vix", it should run as a pure streaming search. You should see that the job inspector has a line for stream.bytes, but not for MR. In the search.log, you should find something like:
03-14-2016 11:25:17.403 INFO ExternalResultProvider - provider=, mode.config=report, mode.search=stream
....
03-14-2016 11:25:18.272 INFO ERP. - SplunkMR$SearchHandler - Search mode: stream
Finally, we can force the original search ("index=my_vix | stats count") to be a pure reporting search (i.e. MR job but no streaming of data), by adding this line to the provider stanza:
vix.splunk.search.mixedmode = 0
Now the job inspector will have an entry for MR, but not for stream.bytes, and the search.log should have these lines:
03-14-2016 11:32:20.462 INFO ExternalResultProvider - provider=, mode.config=report, mode.search=report
03-14-2016 11:32:21.599 INFO ERP. - SplunkMR$SearchHandler - Search mode: report
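If you want to check this without reading the log by eye, the lines above can be scanned with a small script. This is only a rough sketch in Python, assuming the ExternalResultProvider lines keep the "mode.search=" format shown above; the function names and the path handling are made up for illustration and are not part of Hunk.

```python
import re

# Matches the "mode.search=<value>" field on ExternalResultProvider lines,
# assuming the log format shown in the examples above.
MODE_RE = re.compile(r"ExternalResultProvider\b.*\bmode\.search=(\w+)")


def search_mode(log_text):
    """Return the last mode.search value found in the log text, or None."""
    mode = None
    for line in log_text.splitlines():
        match = MODE_RE.search(line)
        if match:
            mode = match.group(1)
    return mode


def launched_mapreduce(log_text):
    """Return True/False for a known mode, or None if no mode line was found.

    Per the explanation above: "mixed" and "report" involve a Map Reduce job,
    while "stream" does not.
    """
    mode = search_mode(log_text)
    if mode is None:
        return None
    return mode in ("mixed", "report")


def check_search_log(path):
    """Hypothetical helper: read a search.log file and report on it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return launched_mapreduce(handle.read())
```

You would call check_search_log() with the path to the search.log you downloaded from the job inspector. A result of None means no mode line was found, so it says nothing either way.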
Hope that helps.
Excellent info Keith! Can you get this documented? Thanks.
Thanks Becky. I've passed along a request to my Documentation group.
Thank you Keith!!! much appreciated.
What has been bothering me is the fact that the end users don't know whether they run the MapR job or not. I believe Raanan opened an enhancement request to make this information available via the UI.
Great. Can I use the Fast mode for Hunk?
Yes, Fast mode does work with Hunk. Fast mode disables field discovery, so Hunk will only know about index-time fields and the required fields configured in the provider or index. This may speed up your searches, depending on the search itself, how the data is stored, the size of the data to be searched, etc.
http://docs.splunk.com/Documentation/Hunk/6.3.3/Hunk/distributableandnondistributablesearchcommands
Smart mode is the default and recommended setting for VIX searches. It maintains search behavior based on whether your search contains transforming commands. When searching virtual indexes we recommend that you search in smart mode, as it is more efficient.
If you use verbose mode to search a VIX, note that Hunk does not start a MapReduce job for that search. This is because verbose mode searches search for all events as well as any reports that you might be running. The benefits of MapReduce jobs in that case are minimal and in some cases can have a negative impact on your searches.