Hi
I have one machine with Splunk installed, so the search head and one indexer are set to default. I need to make three indexers but keep it all on the same hardware.
I am unsure of what I have to change, as most of the docs I read assume I have a new machine, and in this case I don't.
Thanks in advance
Robert
Why do you want to have three indexers in the same server?
Unless the machine has 60+ CPUs and a ton of memory, you may be about to shoot yourself in the foot.
While running a single SH and indexer together on the same box is supported (and common), multiple indexers on the same machine will just be competing for resources. That's why you need a lot of memory and CPU.
It can be done, but you will have to make a lot of manual configuration changes, especially to port numbers.
OK,
I would not recommend it, and @mmodestino explains it in detail in this answer about installing forwarders on the same machine.
You can use the same principles as described in the answer:
https://answers.splunk.com/answers/521464/how-to-run-multiple-universal-forwarders-on-a-sing.html
Hi
Thanks for the answer, but it's not clear to me how to install a second indexer - that answer is about a forwarder.
For example, if I were installing a second indexer on a different machine, how would I do that? Then, once I get that working, I can copy it back to my main machine and change a port, perhaps.
Rob
Indexe*r*s or indexes? Who did you discuss this with at .conf? Perhaps it would be good to tag that person, so he/she can chip in with some additional context around the suggested solution.
If you really need to run multiple Splunk indexer instances on one machine, wouldn't it perhaps be a better solution to run a number of VMs on that beefy piece of hardware you have?
And apart from needing plenty of CPU and memory, how is your storage set up on that box? If all those indexers are going to hit the same physical disk, that's probably still not going to give very satisfactory results.
Hi
Thanks for your help.
I am looking for indexers (not indexes). I have a 6 TB SSD, so I have a really good box, but I am just not using it.
I understand the VM option, but before I try that I would like to try two or three indexers on the same machine to see if it helps performance. I also have a beta site that I can test this on.
I can't remember the name of the person that was helping me at .conf - sorry 😞
If you can tell me what files to change so I can test, that would be great, as I can't find any docs on this.
Cheers
Robert
What you want is more indexes, and not peers, correct?
You just need to create those indexes; there is no reason to install more than one instance of Splunk on the same machine.
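If it is just indexes you need, they can be created in the UI (Settings -> Indexes) or directly in indexes.conf. A minimal sketch - the index name and paths here are example values, not from your environment:

```ini
# $SPLUNK_HOME/etc/system/local/indexes.conf (or an app's local directory)
# "my_new_index" is an example name - each new index needs its own stanza
# with home, cold, and thawed paths.
[my_new_index]
homePath   = $SPLUNK_DB/my_new_index/db
coldPath   = $SPLUNK_DB/my_new_index/colddb
thawedPath = $SPLUNK_DB/my_new_index/thaweddb
```

Restart Splunk (or create the index through the UI, which does not need a restart) and the index is available to search and to route data into.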
Thanks. But how do I create them, and how do I manage ports, as they will all be on the same machine?
Hi Robert, now I understand the reason you want more peers (indexer machines), and it is a bad idea.
You have a search that is very slow, and someone told you that having another peer will result in better performance. But that only works if the peer is another machine: you double the power for retrieving events (log lines), but not the computing power.
After gathering all those events, it is the search head that will process your SPL query.
What I recommend is to take a look at your query - how about summarizing your information?
Example:
Instead of looking over 30 days, how about looking at only one day each day, like earliest=-1d@d latest=@d, always saving the returned value in a summary index or a CSV.
You can schedule this search to run every day in the morning, or in whatever time range suits you.
And then you can use this summary index or CSV in your dashboard, search, or alert.
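Putting those pieces together, a scheduled summarizing search might look something like this sketch (the index and field names are placeholders, not taken from your environment):

```spl
index=your_index earliest=-1d@d latest=@d
| stats max(Elapsed) AS Elapsed count BY host
| collect index=your_summary_index
```

The `collect` command writes the results into a summary index; dashboards and alerts then search `index=your_summary_index`, which is far cheaper than recomputing over the raw events each time.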
If you want, I can show you more examples, give me more information about your query.
Hi
Thanks for your help. The SPL is large and only looks over 24 hours (I can't reduce it, and I need all the data it is returning).
It is tstats, and I was told that if I have multiple indexers it will help to get the data quicker.
This is the SPL
| tstats summariesonly=true max(MXTIMING.Elapsed) AS Elapsed max(MXTIMING.CPU) AS CPU max(MXTIMING.CPU_PER) AS CPU_PER values(MXTIMING.RDB_COM1) AS RDB_COM values(MXTIMING.RDB_COM_PER1) AS RDB_COM_PER max(MXTIMING.Memory_V2) AS Memory max(MXTIMING.Elapsed_C) AS Elapsed_C values(source) AS source_MXTIMING avg(MXTIMING.Elapsed) AS average, count(MXTIMING.Elapsed) AS count, stdev(MXTIMING.Elapsed) AS stdev, median(MXTIMING.Elapsed) AS median, exactperc95(MXTIMING.Elapsed) AS perc95, exactperc99.5(MXTIMING.Elapsed) AS perc99.5, min(MXTIMING.Elapsed) AS min,earliest(_time) as start, latest(_time) as stop FROM datamodel=MXTIMING_V9 WHERE
host=QCST_RSAT_V41
AND MXTIMING.Elapsed > 0 OR 1=1
GROUPBY _time MXTIMING.Machine_Name MXTIMING.Context+Command MXTIMING.NPID MXTIMING.Date MXTIMING.Time MXTIMING.MXTIMING_TYPE_DM source MXTIMING.UserName2 MXTIMING.source_path MXTIMING.Command3 MXTIMING.Context3 span=1s
| rename MXTIMING.Context+Command as Context+Command
| rename MXTIMING.NPID as NPID
| rename MXTIMING.MXTIMING_TYPE_DM as TYPE
| rename MXTIMING.Date as Date
| rename MXTIMING.Time as Time
| rename MXTIMING.Machine_Name as Machine_Name
| rename MXTIMING.UserName2 as UserName
| rename MXTIMING.source_path as source_path
| eval Date=strftime(strptime(Date,"%Y%m%d"),"%d/%m/%Y")
| eval Time = Date." ".Time
| eval FULL_EVENT=Elapsed_C
| eval FULL_EVENT=replace(FULL_EVENT,"\d+.\d+","FULL_EVENT")
| join Machine_Name NPID type=left
[| tstats summariesonly=true count(SERVICE.NPID) AS count2 values(source) AS source_SERVICES FROM datamodel=SERVICE_V6 WHERE ( host=QCST_RSAT_V41 earliest=1539054000 latest=1539212400) AND SERVICE.NICKNAME IN (*)
GROUPBY SERVICE.Machine_Name SERVICE.NICKNAME SERVICE.NPID
| rename SERVICE.NPID AS NPID
| rename SERVICE.NICKNAME AS NICKNAME
| rename SERVICE.Machine_Name as Machine_Name
| table NICKNAME NPID source_SERVICES Machine_Name ]
| lookup MXTIMING_BASE.csv Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert
| appendpipe
[| where isnull(Threshold)
| rename TYPE AS BACKUP_TYPE
| eval TYPE="*"
| lookup MXTIMING_BASE.csv Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert
| rename BACKUP_TYPE AS TYPE]
| sort Threshold
| dedup Time, NPID,Context+Command
| where Elapsed > Threshold OR isnull('Threshold')
| fillnull Tags
| eval Tags=if(Tags=0,"PLEASE_ADD_TAG",Tags)
| makemv Tags delim=","
| eval Tags=split(Tags,",")
| search Tags IN (*)
| eval source_SERVICES_count=mvcount(split(source_SERVICES, " "))
| eval NICKNAME=if(source_SERVICES_count > 1, "MULTIPLE_OPTIONS_FOUND",NICKNAME)
| search
| timechart bins=1000 max(Elapsed) by Tags limit=20
This is the job inspect
Search job inspector
This search is still running and is approximately 100% complete.
(SID: admin__admin_bXVyZXhfbWxj__baseSearch_1539267931.395925) search.log
Execution costs
Duration (seconds) Component Invocations Input count Output count
5.16 .execute_input.flush_prestats 5 895,975 895,975
52.08 command.tstats 89 956,472 1,446,184
41.01 command.tstats.query_tsidx 10 - -
11.04 command.tstats.execute_input 44 956,472 -
0.00 dispatch.check_disk_usage 4 - -
0.00 dispatch.createdSearchResultInfrastructure 1 - -
0.00 dispatch.evaluate 1 - -
0.01 dispatch.evaluate.rename 8 - -
0.00 dispatch.evaluate.eval 1 - -
0.00 dispatch.evaluate.tstats 1 - -
0.00 dispatch.evaluate.noop 1 - -
27.54 dispatch.fetch 45 - -
0.00 dispatch.optimize.FinalEval 1 - -
0.03 dispatch.optimize.matchReportAcceleration 1 - -
0.04 dispatch.optimize.optimization 1 - -
0.00 dispatch.optimize.reparse 1 - -
0.00 dispatch.optimize.toJson 1 - -
0.00 dispatch.optimize.toSpl 1 - -
10.31 dispatch.preview 1 - -
5.20 dispatch.preview.tstats.execute_output 1 - -
3.91 dispatch.preview.command.rename 8 483,976 483,976
0.67 dispatch.preview.command.eval 1 60,497 60,497
0.52 dispatch.preview.write_results_to_disk 1 - -
41.04 dispatch.stream.local 45 - -
0.02 dispatch.writeStatus 14 - -
0.14 startup.configuration 1 - -
0.18 startup.handoff 1 - -
Search job properties
Server info: Splunk 7.0.3, splunk:8000, Thu Oct 11 15:27:05 2018 User: admin
Try this one; it is set to get the last hour - see if it's fast enough.
| tstats summariesonly=true
max(MXTIMING.Elapsed) AS Elapsed
max(MXTIMING.CPU) AS CPU
max(MXTIMING.CPU_PER) AS CPU_PER
values(MXTIMING.RDB_COM1) AS RDB_COM
values(MXTIMING.RDB_COM_PER1) AS RDB_COM_PER
max(MXTIMING.Memory_V2) AS Memory
max(MXTIMING.Elapsed_C) AS Elapsed_C
values(source) AS source_MXTIMING
avg(MXTIMING.Elapsed) AS average
count(MXTIMING.Elapsed) AS count
stdev(MXTIMING.Elapsed) AS stdev
median(MXTIMING.Elapsed) AS median
exactperc95(MXTIMING.Elapsed) AS perc95
exactperc99.5(MXTIMING.Elapsed) AS perc99.5
min(MXTIMING.Elapsed) AS min,earliest(_time) as start
latest(_time) as stop
FROM datamodel=MXTIMING_V9
WHERE
host=QCST_RSAT_V41
AND (MXTIMING.Elapsed > 0 OR 1=1)
AND earliest=-1h@h
GROUPBY _time MXTIMING.Machine_Name MXTIMING.Context+Command MXTIMING.NPID MXTIMING.Date MXTIMING.Time MXTIMING.MXTIMING_TYPE_DM source MXTIMING.UserName2 MXTIMING.source_path MXTIMING.Command3 MXTIMING.Context3 span=1s
| rename MXTIMING.Context+Command as Context+Command
| rename MXTIMING.NPID as NPID
| rename MXTIMING.MXTIMING_TYPE_DM as TYPE
| rename MXTIMING.Date as Date
| rename MXTIMING.Time as Time
| rename MXTIMING.Machine_Name as Machine_Name
| rename MXTIMING.UserName2 as UserName
| rename MXTIMING.source_path as source_path
| eval Date=strftime(strptime(Date,"%Y%m%d"),"%d/%m/%Y")
| eval Time = Date." ".Time
| eval FULL_EVENT=Elapsed_C
| eval FULL_EVENT=replace(FULL_EVENT,"\d+.\d+","FULL_EVENT")
| join Machine_Name NPID type=left
[| tstats summariesonly=true count(SERVICE.NPID) AS count2 values(source) AS source_SERVICES FROM datamodel=SERVICE_V6 WHERE ( host=QCST_RSAT_V41 earliest=1539054000 latest=1539212400) AND SERVICE.NICKNAME IN (*)
GROUPBY SERVICE.Machine_Name SERVICE.NICKNAME SERVICE.NPID
| rename SERVICE.NPID AS NPID
| rename SERVICE.NICKNAME AS NICKNAME
| rename SERVICE.Machine_Name as Machine_Name
| table NICKNAME NPID source_SERVICES Machine_Name ]
| lookup MXTIMING_BASE.csv Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert
| appendpipe
[| where isnull(Threshold)
| rename TYPE AS BACKUP_TYPE
| eval TYPE="*"
| lookup MXTIMING_BASE.csv Context_Command AS "Context+Command" Type as "TYPE" OUTPUT Tags CC_Description Threshold Alert
| rename BACKUP_TYPE AS TYPE]
| sort Threshold
| dedup Time, NPID,Context+Command
| where Elapsed > Threshold OR isnull('Threshold')
| fillnull Tags
| eval Tags=if(Tags=0,"PLEASE_ADD_TAG",Tags)
| makemv Tags delim=","
| eval Tags=split(Tags,",")
| search Tags IN (*)
| eval source_SERVICES_count=mvcount(split(source_SERVICES, " "))
| eval NICKNAME=if(source_SERVICES_count > 1, "MULTIPLE_OPTIONS_FOUND",NICKNAME)
| timechart bins=1000 max(Elapsed) by Tags limit=20
Hi
I have time controls on the search, so the less time I look over, the quicker it is.
$time_selection.earliest$
Did you change something else in the query as well to improve performance?
Yes, there was an empty search near the end, but the main point is:
Create a scheduled search to run every hour with earliest=-1h@h and save the result using:
| outputlookup append=t some_name.csv
And if you want to search this CSV, just use:
| inputlookup some_name.csv
| where _time> relative_time(now(),"-1d@d")
or another search.
HI
Sorry for the delay in getting back to you.
The issue with that solution is that a user can search over any time range, and it won't be on the hour. So they can want any time, not just the data we store in a CSV.
Thanks for your answer; I will try to install another indexer on another machine.
Rob
I have a 60-core box but only use 25% of it. So after discussions at .conf we think it best to try to add more indexers onto the same machine.
The issue is I don't know how to do it.
On a Linux box, you create a new Splunk directory (/opt/splunkidx2, for example) and extract the Splunk tarball there. Before you start Splunk, review the .conf files and change all port numbers to something else. You'll mainly want web.conf, but also look for wherever ports 8089 and 9997 are defined. Be sure to make the changes in the respective local directories so they are not lost during your next upgrade.
Then you'll need to create a new /etc/init.d/splunkidx2 file so the new indexer starts after a reboot.
Repeat these steps for additional indexers.
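For instance, the port overrides for a second instance might look like the following. The port numbers are just examples - pick any free ports on the box:

```ini
# /opt/splunkidx2/etc/system/local/web.conf
[settings]
httpport     = 8001            # Splunk Web port (default 8000)
mgmtHostPort = 127.0.0.1:8090  # splunkd management port (default 8089)
```

```ini
# /opt/splunkidx2/etc/system/local/inputs.conf
# Receiving port for forwarder traffic (default 9997)
[splunktcp://9998]
```

Each additional instance gets its own directory and its own set of ports.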
When all of the indexers are running, sign in to your search head, go to Settings -> Distributed Search, and add each indexer as a search peer. Then each search will be performed by all indexers.
Please understand that the performance gains from having multiple indexers are based entirely on the data being evenly distributed among those indexers. If all of your data is on one indexer (as it is now), adding indexers will not help.
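If you do go down this road, your forwarders can spread events across the instances via load-balanced outputs. A sketch of the forwarder-side outputs.conf - the host name and ports are examples matching the multi-instance layout above:

```ini
# outputs.conf on the forwarder - group name, host, and ports are examples.
# Listing several receivers in one tcpout group makes the forwarder
# auto-load-balance, switching targets every autoLBFrequency seconds.
[tcpout]
defaultGroup = local_indexers

[tcpout:local_indexers]
server = myhost:9997,myhost:9998,myhost:9999
autoLBFrequency = 30
```

Only data indexed after this change is distributed; events already sitting on the original indexer stay where they are.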
Thanks for this good information.
So you are right - I have made three new indexers, but it's not faster. I am using tstats off data models in my search.
So if I set up multiple indexers, can I add index clustering with 100% replication so all new indexers will have a copy of the data?
Cheers
Rob
Can you kindly elaborate on your requirements? Why would you need three indexers on the same server?
Hi.
After going to .conf and talking to someone at Meet the Experts, we went over my issue. I have large tstats commands taking a long time. He said this can be reduced with multiple indexers. At the moment I have a 60-core Intel machine and we only use 25% of it. So I am looking to add more indexers onto it.