ok so I have run a random forest regressor on my sample data, trying to predict field A based on fields Dealers &Orders . I need help on interpreting the results.
Under Fit Model Parameters Summary it gives me 2 rows
feature importance
Dealers 0.219929427086
Orders 0.780070572914
so given that I am happy with R square value (0.9467), does this prediction mean
field A=0.219929427086 * Dealers + 0.780070572914 * Orders
??
Feature importance with a random forest is different from the coefficients in a model like LinearRegression. Unfortunately, there's no simple equation like that to write down for a random forest; it's many many different regression trees, each of which are (even by themselves) not a simple linear equation.
You can find a decent explanation of how the importance is calculated, and should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
Feature importance with a random forest is different from the coefficients in a model like LinearRegression. Unfortunately, there's no simple equation like that to write down for a random forest; it's many many different regression trees, each of which are (even by themselves) not a simple linear equation.
You can find a decent explanation of how the importance is calculated, and should be interpreted, here:
http://scikit-learn.org/stable/modules/ensemble.html#feature-importance-evaluation
For the quibblers interested, this SO goes into more detail on how its calculated: https://stackoverflow.com/questions/15810339/how-are-feature-importances-in-randomforestclassifier-d...
ok so , then how do i predict field A using dealers and orders?
When you call the fit command, supply 'into your_model_name'. Then, you can use that model later with the apply command:
[training data] | fit RandomForestRegression A from dealers orders into my_model
[new data] | apply my_model
sorry , not clear.
I have already saved my model as "sc" in the MLTK app.
Now, customer is asking me what is the predicted value for field A when dealers=6000 and orders= 63
... | eval dealers=6000 | eval orders=63 | apply sc
Then look at the value of the field called predicted(A)