In response to your comment about unsupervised learning, there are two commands you might find useful.
kmeans
Performs k-means clustering on the fields you specify (or on all numeric fields if none are specified), partitioning the events into k clusters. Each cluster is defined by its mean value, and each event is assigned to the cluster with the nearest mean. Events in the same cluster are moved next to each other in the results, and you can optionally display the cluster number for each event.
Note that kmeans only works with numeric fields. Example (the distance-type option is dt, and the trailing argument is the field to cluster on):
... | kmeans k=4 dt=cosine count
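If you want to inspect which cluster each event lands in, the cfield option names the field that holds the cluster number. A rough sketch, assuming a numeric field called response_time (swap in your own field):

... | kmeans k=4 cfield=cluster_id response_time | stats count by cluster_id

The stats at the end summarizes how many events fell into each cluster, which is a quick sanity check on whether your choice of k is reasonable.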
cluster (see wiki: agglomerative clustering)
The cluster command groups events together based on how similar they are to each other. Unless you specify a different field, it clusters on the contents of the _raw field. The default grouping method (match=termlist) breaks each event down into terms and computes the vector between events. Set a higher threshold value for t if you want the command to be more discriminating about which events it groups together.
The cluster command appends two new fields to each event. You can name these fields with the countfield and labelfield parameters, which default to cluster_count and cluster_label. cluster_count is the number of events in the cluster, i.e. the cluster size. Each event is assigned the cluster_label of the cluster it belongs to; for example, if the search returns 10 clusters, they are labeled 1 through 10.
Note that cluster works on textual data. It's actually what powers the Patterns tab.
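To make that concrete, here's a sketch that clusters raw events with a fairly strict threshold, surfaces the biggest clusters first, and keeps one representative event per cluster (field names are the defaults described above; tune t for your data):

... | cluster t=0.8 showcount=true | sort -cluster_count | table cluster_count cluster_label _raw

Lowering t merges more events into fewer, looser clusters; raising it splits them into more, tighter ones.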
As @jeffland mentions, there are a number of algorithms available in the ML Toolkit. At the time of writing, BIRCH, DBSCAN, SpectralClustering, and KMeans are all available for unsupervised tasks. Check out the docs for the ML Toolkit as well.
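With the ML Toolkit you go through the fit and apply commands rather than a dedicated search command. A rough sketch, assuming MLTK is installed and you have numeric fields bytes_in and bytes_out (the field and model names here are placeholders; check your MLTK version's docs for each algorithm's supported options):

... | fit KMeans bytes_in bytes_out k=4 into my_kmeans_model
... | apply my_kmeans_model

fit trains and saves the model while labeling the training events; apply reuses the saved model to assign clusters to new events in later searches.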