About distributed search.

yutaka1005
Builder

In my environment, I have two indexers and one search head.

I believe that commands such as "search", "dedup", and "transaction" are processed by the indexers in a distributed search.

But are subsearch-style commands such as "map" and "join" also processed by the indexers?
Could anyone tell me?

mattymo
Splunk Employee

Hi yutaka1005!

I recommend checking out this doc on "Types of Commands"

http://docs.splunk.com/Documentation/Splunk/latest/Search/Typesofcommands

and

"command types"

https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Commandsbytype

Together, these will give you an in-depth tour of the various types of search commands available to you and how they function.

Technically the indexers will be involved in all the commands you mentioned, as they will return events to the search head for further processing.

To your question specifically, join is listed as a centralized streaming command, which means it is run on the search head as events come back from the indexers.

Map is not listed, but I would guess it is in the same category based on how I've seen it used.

- MattyMo

woodcock
Esteemed Legend

I am quite certain that dedup occurs in both places and works map-reduce style: an initial, reduced, local-scope dedup occurs on each indexer, and the final, aggregated, global-scope dedup occurs on the search head. Because map kicks off new searches, things must start at that point on the search head, but its work is map-reduced as well. Using join should always be avoided, so let's not even talk about that (use stats, streamstats, etc. instead).

Why is this important to you? Get your search working FIRST, then optimize it later. Just be sure to get it working WITHOUT using join or transaction and you should be fine.
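To make the two-stage dedup idea concrete, here is a minimal Python sketch (not Splunk code). It only models the data flow described above: a local dedup per indexer, then a global dedup on the search head. The function names, the `host` field, and the sample events are all hypothetical, and nothing here should be read as how Splunk implements dedup internally.

```python
from typing import Dict, Iterable, List

Event = Dict[str, str]


def dedup(events: Iterable[Event], field: str) -> List[Event]:
    """Keep the first event seen for each distinct value of `field`.

    Events that lack the field are dropped in this sketch.
    """
    seen = set()
    kept = []
    for event in events:
        key = event.get(field)
        if key is None or key in seen:
            continue
        seen.add(key)
        kept.append(event)
    return kept


def distributed_dedup(per_indexer_events: List[List[Event]], field: str) -> List[Event]:
    # "Map" phase: each indexer removes duplicates within its own events.
    partial = [dedup(events, field) for events in per_indexer_events]
    # "Reduce" phase: the search head merges the partial results and
    # removes duplicates that span indexers.
    merged = [event for events in partial for event in events]
    return dedup(merged, field)


# Hypothetical sample data for two indexers.
indexer_a = [{"host": "web1"}, {"host": "web1"}, {"host": "web2"}]
indexer_b = [{"host": "web2"}, {"host": "db1"}]

result = distributed_dedup([indexer_a, indexer_b], "host")
print([e["host"] for e in result])
```

The local pass only shrinks what each indexer has to send back; the second pass is still needed because the same value (here `web2`) can appear on more than one indexer.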


yutaka1005
Builder

Does that mean the search head collects the results that each indexer has already deduped, and then runs dedup on them again?

And are you talking about a "dedup" inside a "map" command, like this?
main search | map search="... | dedup"

yutaka1005
Builder

Hi mmodestino_splunk!
Thank you for your kind answer.

I read some of the documentation you pointed me to, but it seems it will take me some time to understand it.

I now understand that the commands I mentioned are processed by the indexers.
I also understand that the search inside a join command is processed by the indexers, the results are returned to the search head, and the join itself is processed there.
I also understand that map falls into a similar category.

But there is one point I am still wondering about.
Dedup is described as a centralized streaming command.
Is this command processed by the search head?


esix_splunk
Splunk Employee

Dedup is processed on the Search Head side.


mattymo
Splunk Employee

^^^

Dedup requires the peers to return all the results to a central location (the search head) so that we can dedup. It is streaming because we can do it as the results come in.
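As a rough illustration of "centralized" plus "streaming", here is a small Python sketch (not Splunk code). The dedup state lives in one place, standing in for the search head, and each unique event is emitted as soon as it arrives rather than after all results are in. The function name, the `peer` and `user` fields, and the sample stream are hypothetical.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, str]


def centralized_streaming_dedup(incoming: Iterable[Event], field: str) -> Iterator[Event]:
    """Yield the first event per value of `field` as events arrive."""
    seen = set()  # single shared state, standing in for the search head
    for event in incoming:
        key = event.get(field)
        if key is None or key in seen:
            continue
        seen.add(key)
        # Emitted immediately; no need to wait for the full result set.
        yield event


# Hypothetical interleaved stream of results returned by two peers.
stream = iter([
    {"peer": "idx1", "user": "alice"},
    {"peer": "idx2", "user": "alice"},
    {"peer": "idx2", "user": "bob"},
    {"peer": "idx1", "user": "bob"},
])

unique = list(centralized_streaming_dedup(stream, "user"))
print(unique)
```

If each peer kept its own `seen` set and nothing ran centrally, `alice` and `bob` would each come back twice, which is why a final pass has to happen in a single location.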

- MattyMo

yutaka1005
Builder

Thank you for your comments !

I understood about dedup!
