Is it possible to use cross-validation in the Machine Learning Toolkit and Showcase app?

nnetz
New Member

Hello,

Is it possible to use cross-validation in the Machine Learning Toolkit and Showcase app?

aljohnson_splun
Splunk Employee

The assistants for Predict Numeric Fields and Predict Categorical Fields do 2-fold cross validation for you, automatically. You can select the train-test ratio of your choosing.

melonman
Motivator

Well, if you are looking for automated cross-validation, or a single command that performs it, the answer is probably no at the moment.

Here is what I do for now.

For example, for k-fold cross-validation with K=5, you can split your data into 5 partitions using the sample command.

... search to get your dataset | sample partitions=5

This adds a partition_number field to the dataset, so you can filter on that number to select part of the data.
Then use one partition (1/5 of the data) to create the model (training) and the rest of the data for testing.

... search to get your dataset | sample partitions=5 | where partition_number=0 | fit ... into your_model | ..

And test with the rest:

... search to get your dataset | sample partitions=5 | where partition_number!=0 | apply your_model | ..

Then calculate the errors and consolidate the results from each validation run.

Maybe you can automate this with Splunk's scheduling features (scheduled searches, summary indexing, plus a dashboard).

gabrcg
New Member

To apply k-fold cross-validation (using 5 folds, as in the example above), you should train on 4 folds and then test on the remaining fold. The code example does the opposite, so it should be:

Train with 4 folds

 | sample partitions=5 seed=1 | where partition_number!=0 | fit ... into your_model

Test with 1 fold

 | sample partitions=5 seed=1 | where partition_number=0 | apply your_model
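
To score a fold, the test search can be extended to compute an error metric. What follows is only a sketch for a numeric target. The algorithm (LinearRegression), the field names target, feature_a and feature_b, the model name cv_fold0 and the output field predicted are all placeholders that do not come from the posts above, so substitute your own. First train on every partition except 0:

```
... search to get your dataset
| sample partitions=5 seed=1
| where partition_number!=0
| fit LinearRegression target from feature_a feature_b into cv_fold0
```

Then apply that model to the held-out partition and compute the root mean squared error (RMSE) for the fold:

```
... search to get your dataset
| sample partitions=5 seed=1
| where partition_number=0
| apply cv_fold0 as predicted
| eval squared_error=pow(target - predicted, 2)
| stats avg(squared_error) as mse
| eval rmse=sqrt(mse), fold=0
```

Repeating both searches for each of the remaining partition numbers, with a separate model name per fold, and averaging the per-fold rmse values gives the cross-validated estimate. For a categorical target you would swap the squared-error step for an accuracy-style comparison of the predicted and actual fields.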

aljohnson_splun
Splunk Employee

Make sure you set a seed in the sample command! Otherwise the training and test searches may partition the data differently. E.g.

| sample partitions=5 seed=42
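
One quick way to inspect the partitioning is to count events per partition. This is a minimal sketch that reuses the placeholder base search from the earlier posts. It assumes the base search returns the same events in the same order each time, in which case running it twice with the same seed should give the same counts:

```
... search to get your dataset
| sample partitions=5 seed=42
| stats count by partition_number
```

If the counts change between runs, the train and test searches are probably not seeing the same folds, and the validation results should not be trusted.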