ML - random forest regressor

Sukisen1981
Champion

OK, so I have run a random forest regressor on my sample data, trying to predict field A from the fields Dealers and Orders. I need help interpreting the results.
Under Fit Model Parameters Summary it gives me 2 rows:

feature    importance
Dealers    0.219929427086
Orders     0.780070572914

So, given that I am happy with the R-squared value (0.9467), does this prediction mean
field A = 0.219929427086 * Dealers + 0.780070572914 * Orders?


aoliner_splunk
Splunk Employee

Feature importance in a random forest is different from the coefficients in a model like LinearRegression. Unfortunately, there's no simple equation like that to write down for a random forest: it is an ensemble of many different regression trees, each of which is (even by itself) not a simple linear equation. The importances only rank how much each field contributed to the trees' splits relative to the other fields, which is why they sum to 1.

You can find a decent explanation of how the importance is calculated, and should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
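
To make the distinction concrete, here is a minimal pure-Python sketch of how an impurity-based importance falls out of a single small regression tree. It is not MLTK's or scikit-learn's actual implementation: the data, the function names (sse, best_split, build, predict), and the depth limit are all made up for illustration. A random forest averages this kind of score over many trees grown on resampled data.

```python
# Illustrative only: one tiny regression tree on invented (dealers, orders) data.

def sse(ys):
    """Sum of squared errors of the targets around their mean."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(rows, ys):
    """Return (gain, feature_index, threshold) of the split that reduces SSE most."""
    best = (0.0, None, None)
    parent = sse(ys)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            gain = parent - sse(left) - sse(right)
            if gain > best[0]:
                best = (gain, f, t)
    return best

def build(rows, ys, depth, gains):
    """Grow a tree, adding each split's SSE reduction to gains[feature]."""
    gain, f, t = best_split(rows, ys)
    if depth == 0 or f is None:
        return sum(ys) / len(ys)  # leaf: predict the mean target
    gains[f] += gain
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], depth - 1, gains),
            build([r for r, _ in right], [y for _, y in right], depth - 1, gains))

def predict(node, row):
    """Walk the if/else splits down to a leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

# Hypothetical training data: (dealers, orders) -> A
rows = [(1000, 10), (2000, 20), (3000, 30), (4000, 40),
        (5000, 50), (6000, 60), (1500, 55), (5500, 15)]
ys = [3 * o + d / 200 for d, o in rows]

gains = [0.0, 0.0]
tree = build(rows, ys, 2, gains)
importance = [g / sum(gains) for g in gains]  # normalized, so it sums to 1

tree_pred = predict(tree, (6000, 63))
linear_guess = importance[0] * 6000 + importance[1] * 63

print("importance (dealers, orders):", importance)
print("tree prediction:", tree_pred)
print("importances misused as coefficients:", linear_guess)
```

The importances are shares of the total error reduction, not slopes, so the last number has no reason to agree with the tree's prediction. The prediction itself comes from following the splits to a leaf, which is what the apply command does for you with the saved model.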


aljohnson_splun
Splunk Employee

For the quibblers interested, this Stack Overflow question goes into more detail on how it's calculated: https://stackoverflow.com/questions/15810339/how-are-feature-importances-in-randomforestclassifier-d...


Sukisen1981
Champion

OK, so then how do I predict field A using dealers and orders?


aoliner_splunk
Splunk Employee

When you call the fit command, supply 'into your_model_name'. Then, you can use that model later with the apply command:
[training data] | fit RandomForestRegression A from dealers orders into my_model
[new data] | apply my_model


Sukisen1981
Champion

Sorry, that's not clear to me.
I have already saved my model as "sc" in the MLTK app.
Now the customer is asking me what the predicted value of field A is when dealers=6000 and orders=63.


aoliner_splunk
Splunk Employee

... | eval dealers=6000 | eval orders=63 | apply sc
Then look at the value of the field called predicted(A) in the results.
