dedup

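Use the dedup command to remove results that contain an identical combination of field values. Duplicates are evaluated in the current order of the results, such as the order produced by the sort command, so the results that appear first are the ones retained.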

Syntax

dedup <dedup_options> <field_name> [, <field_name>, ...]

Parameters

The following table lists the parameters used in this command, along with their descriptions.

Parameter Description

field_name

Specify the field whose values must be checked for duplicates.

dedup_options

Syntax:

[count = <count>] [includenulls = [true|false]] [consecutive = [true|false]]

count: Specifies the number of results to retain for each combination of field values. The default value is 1.

includenulls: Specifies whether to include results in which the dedup fields are null. The default value is false.

consecutive: Specifies whether to remove only those results whose duplicate combinations of values are consecutive. The default value is false.
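
As a minimal sketch of the multi-field form shown in the syntax, the following query would keep one result for each distinct combination of two field values. It assumes that the log records carry the fields Client Host City and Source IP, which are the field names used in the example in this topic; substitute the fields available in your own logs:

* | dedup 'Client Host City', 'Source IP'

Because no sort command precedes dedup here, the result that is kept for each combination depends on the default order of the search results.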

The following query groups logs by each unique combination of client host city and source IP, calculates the sum of content size for each group, sorts the groups in descending order of content size, and finally removes duplicate rows for each client host city. Because dedup keeps the first row that it encounters for each value, only the row with the highest content size is retained for each client host city:

* | stats sum('Content Size') as 'Content Size' by 'Client Host City', 'Source IP'
    | sort -'Content Size'
    | dedup 'Client Host City'

With the above query, the resulting records table has three columns: Client Host City, Source IP, and Content Size.

If you specify the dedup option count = 2, then up to two rows are retained for each value of Client Host City.
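
For illustration, the earlier query could be adapted as follows. This is a sketch that places the option before the field name, as the syntax above indicates:

* | stats sum('Content Size') as 'Content Size' by 'Client Host City', 'Source IP'
    | sort -'Content Size'
    | dedup count = 2 'Client Host City'

Given the descending sort, the rows kept for each client host city would be the two source IPs with the largest total content size.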

If you specify the dedup option includenulls = true, then rows in which the Client Host City value is null are also included.
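
A sketch of the same query with this option, again assuming that the option precedes the field name as in the syntax above:

* | stats sum('Content Size') as 'Content Size' by 'Client Host City', 'Source IP'
    | sort -'Content Size'
    | dedup includenulls = true 'Client Host City'

This variant can be useful when some log records have no city information and you still want those groups to appear in the results.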

If you specify the dedup option consecutive = true, then a row is removed only when it has the same Client Host City value as the row immediately before it.
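
A sketch of the same query with this option, under the same assumption about option placement:

* | stats sum('Content Size') as 'Content Size' by 'Client Host City', 'Source IP'
    | sort -'Content Size'
    | dedup consecutive = true 'Client Host City'

Because the rows are ordered by content size rather than by city, a client host city can still appear more than once in the output whenever its rows are separated by a row for another city.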