cluster

Use this command to group similar log records. The cluster command uses machine learning to group log records based on how similar they are to each other. Clustering significantly reduces the number of log entries that you have to explore and makes outliers easy to spot. Grouped log entries are presented as message signatures.

Syntax

cluster [<cluster_options>]

In the above syntax, cluster_options is of the format:

[similarity=<similarity_value>]

Parameters

The following table lists the parameters used in cluster_options, along with their descriptions.

Parameter Description

similarity_value

Specifies a threshold that controls how sensitive the algorithm is to differences between messages during clustering. It is a number in the range [0.00, 1.00] that indicates the fraction of words that must match in two messages that belong to the same cluster; the remainder is the fraction of words that can differ. For example, a value of 0.67 indicates that in a message of 10 words, up to 3 differences are allowed. If similarity is not specified, a default value of 0.67 is used.
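The worked example above (0.67, 10 words, up to 3 differences) is consistent with reading the value as the fraction of words that must match. The following Python sketch reproduces that arithmetic under this reading. The function name allowed_differences, the rounding down, and the use of whole percentages are assumptions made for illustration; they are not taken from the product's implementation.

```python
def allowed_differences(word_count: int, similarity: float = 0.67) -> int:
    """Return how many words may differ between two messages of the given
    length that still land in the same cluster (illustrative only)."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be in the range [0.00, 1.00]")
    # Assumption: the threshold has two-decimal precision, so work in whole
    # percent. This avoids floating-point surprises such as 1 - 0.9 != 0.1.
    percent_same = round(similarity * 100)
    # Assumption: fractional differences are rounded down (3.3 becomes 3).
    return word_count * (100 - percent_same) // 100


print(allowed_differences(10))       # default 0.67 -> 3
print(allowed_differences(10, 1.0))  # exact match required -> 0
print(allowed_differences(20, 0.5))  # half the words may differ -> 10
```

Under this reading, raising the value makes clustering stricter and tends to produce more, smaller clusters, while lowering it merges more messages together.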


The following command performs a cluster analysis on all the fatal logs.

Severity = fatal | cluster 

The following command performs a cluster analysis on all fatal logs, and returns the resulting clusters sorted in ascending order of Count.

Severity = fatal | cluster | sort Count
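To make the shape of the last query concrete, the following Python sketch imitates its three stages (filter, cluster, sort) on a handful of made-up records. It is a toy illustration only: the actual cluster command uses machine learning that is not described here, whereas this sketch assumes a simple position-by-position word comparison and a greedy, single-pass grouping. The names word_similarity and toy_cluster and the sample records are hypothetical.

```python
def word_similarity(a: str, b: str) -> float:
    """Fraction of word positions at which two messages agree."""
    words_a, words_b = a.split(), b.split()
    longest = max(len(words_a), len(words_b))
    if longest == 0:
        return 1.0
    matches = sum(x == y for x, y in zip(words_a, words_b))
    return matches / longest


def toy_cluster(messages, similarity=0.67):
    """Greedy grouping: each message joins the first cluster whose sample
    message is similar enough, otherwise it starts a new cluster."""
    clusters = []  # each cluster: {"sample": str, "count": int}
    for msg in messages:
        for cluster in clusters:
            if word_similarity(cluster["sample"], msg) >= similarity:
                cluster["count"] += 1
                break
        else:
            clusters.append({"sample": msg, "count": 1})
    return clusters


# Made-up (severity, message) records for illustration.
records = [
    ("fatal", "Disk /dev/sda1 is full on host alpha"),
    ("fatal", "Disk /dev/sdb2 is full on host beta"),
    ("info", "Backup completed on host alpha"),
    ("fatal", "Out of memory in worker 7"),
    ("fatal", "Disk /dev/sdc3 is full on host gamma"),
]

# Severity = fatal | cluster | sort Count
fatal = [message for severity, message in records if severity == "fatal"]
result = sorted(toy_cluster(fatal), key=lambda c: c["count"])

for cluster in result:
    print(cluster["count"], cluster["sample"])
```

Sorting by ascending count puts the rarest groups first, which is where outliers such as the single out-of-memory message show up.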