Hello,
I am currently attempting to run an Experiment using the "Predict Numeric Fields" format with 109 possible fields to use for predicting. I am able to get a decent R^2 value, but I have noticed that Splunk does not seem to report a t-score or p-value for the coefficients, so I cannot judge which of my coefficient values are meaningful.
I have researched the "score" command, but I have not found a way to apply it to multiple fields at once.
Does Splunk have a built-in way to assign a score to multiple coefficients in a regression algorithm?
I am looking for a solution in the Machine Learning Toolkit v3.3, but I also have access to a local copy with MLTK v4.1.
Thanks for your question.
You will need to nest your | score commands today, sort of like this:
| inputlookup track_day.csv
| sample partitions=100 seed=1234
| search partition_number > 70
| apply example_vehicle_type as DT_prediction probabilities=true
| multireport
[| score confusion_matrix vehicleType against DT_prediction]
[| score roc_auc_score vehicleType against "probability(vehicleType=2013 Audi RS5)" pos_label="2013 Audi RS5"]
Keep an eye out for MLTK updates; we have some great stuff coming out to make these easier for you!
Hi,
Did you get a chance to look into the score command documentation? Let me know if this helps: https://docs.splunk.com/Documentation/MLApp/4.1.0/User/ScoreCommand
I have looked through the available options and tried a few, specifically the Pearson and Spearman Regression scoring methods, which do return a p-value. However, each call returns only one score at a time, which does not help when there are 109 fields to evaluate.
I'm continuing to research and try the methods under "Classification":
Accuracy
Confusion matrix
F1-score
Precision
Precision-Recall-F1-Support
Recall
ROC-AUC-score
ROC-curve
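For anyone who wants to cross-check coefficient significance outside Splunk, here is a minimal Python sketch of how a t-score per coefficient is usually derived in ordinary least squares. It is not an MLTK feature and does not reproduce what the "Predict Numeric Fields" assistant does internally; the function names (invert, ols_t_scores) are made up for illustration. It also assumes the p-value can be approximated with a normal distribution instead of Student's t, which is only reasonable when there are many more rows than predictor fields.

```python
import math


def invert(matrix):
    """Invert a small square matrix with Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_scores(rows, target):
    """Fit y = b0 + b1*x1 + ... by least squares.

    Returns one (coefficient, std_error, t_score, approx_p_value) tuple per
    term, intercept first. The p-value is a two-sided NORMAL approximation
    (an assumption of this sketch), not an exact Student's t p-value.
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]  # intercept column
    n, k = len(X), len(X[0])
    if n <= k:
        raise ValueError("need more rows than coefficients")
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * target[i] for i in range(n)) for a in range(k)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    # Residual variance with n - k degrees of freedom.
    residuals = [target[i] - sum(X[i][a] * beta[a] for a in range(k))
                 for i in range(n)]
    sigma2 = sum(e * e for e in residuals) / (n - k)
    results = []
    for a in range(k):
        std_error = math.sqrt(sigma2 * xtx_inv[a][a])
        # With zero residual variance there is no noise to compare against.
        t_score = beta[a] / std_error if std_error > 0 else float("inf")
        approx_p = math.erfc(abs(t_score) / math.sqrt(2))
        results.append((beta[a], std_error, t_score, approx_p))
    return results
```

To use it, export the training rows from Splunk, pass the predictor values as a list of rows and the target field as a list, and compare the t-scores across fields. With 109 predictors the normal equations can be poorly conditioned, so a statistics package would be the safer choice for real work.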