=============
dedup
=============

.. rubric:: Table of contents

.. contents::
   :local:
   :depth: 2


Description
============
| Use the ``dedup`` command to remove duplicate documents, identified by the values of the specified fields, from the search result.


Syntax
============
dedup [int] <field-list> [keepempty=<bool>] [consecutive=<bool>]


* int: optional. The ``dedup`` command retains multiple events for each combination of field values when you specify <int>. The number for <int> must be greater than 0. If you do not specify a number, only the first occurring event is kept. All other duplicates are removed from the results. **Default:** 1
* keepempty: optional. If true, keeps the document if any field in the field-list has a NULL value or is MISSING. **Default:** false.
* consecutive: optional. If true, removes only events with duplicate combinations of values that are consecutive. **Default:** false.
* field-list: mandatory. The comma-delimited field list. At least one field is required.


Example 1: Dedup by one field
=============================

This example shows how to dedup documents by the gender field.

PPL query::

    os> source=accounts | dedup gender | fields account_number, gender;
    fetched rows / total rows = 2/2
    +------------------+----------+
    | account_number   | gender   |
    |------------------+----------|
    | 1                | M        |
    | 13               | F        |
    +------------------+----------+

Example 2: Keep 2 duplicate documents
=====================================

This example shows how to dedup documents by the gender field while keeping 2 documents for each value.

PPL query::

    os> source=accounts | dedup 2 gender | fields account_number, gender;
    fetched rows / total rows = 3/3
    +------------------+----------+
    | account_number   | gender   |
    |------------------+----------|
    | 1                | M        |
    | 6                | M        |
    | 13               | F        |
    +------------------+----------+

Example 3: Keep or Ignore the empty field by default
====================================================

This example shows how to dedup documents by the email field while keeping documents whose value is null.

PPL query::

    os> source=accounts | dedup email keepempty=true | fields account_number, email;
    fetched rows / total rows = 4/4
    +------------------+-----------------------+
    | account_number   | email                 |
    |------------------+-----------------------|
    | 1                | amberduke@pyrami.com  |
    | 6                | hattiebond@netagy.com |
    | 13               | null                  |
    | 18               | daleadams@boink.com   |
    +------------------+-----------------------+

This example shows how to dedup documents by the email field while ignoring documents whose value is empty, which is the default behavior.

PPL query::

    os> source=accounts | dedup email | fields account_number, email;
    fetched rows / total rows = 3/3
    +------------------+-----------------------+
    | account_number   | email                 |
    |------------------+-----------------------|
    | 1                | amberduke@pyrami.com  |
    | 6                | hattiebond@netagy.com |
    | 18               | daleadams@boink.com   |
    +------------------+-----------------------+


Example 4: Dedup in consecutive document
=========================================

This example shows how to dedup only consecutive documents that share the same gender value.

PPL query::

    os> source=accounts | dedup gender consecutive=true | fields account_number, gender;
    fetched rows / total rows = 3/3
    +------------------+----------+
    | account_number   | gender   |
    |------------------+----------|
    | 1                | M        |
    | 13               | F        |
    | 18               | M        |
    +------------------+----------+

Limitation
==========
The ``dedup`` command is not rewritten to OpenSearch DSL; it is executed only on the coordination node.
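
Because the command runs on the coordination node rather than being pushed down to DSL, its behavior can be reasoned about as a single pass over the incoming rows. The following Python sketch is an illustrative model of the option semantics described in the Syntax section, not the plugin's implementation. The function name ``dedup_rows``, its parameters, and the choice that rows with empty values do not interrupt a consecutive run are assumptions made for illustration. The sample rows are taken from the example outputs above.

.. code-block:: python

    from typing import Any, Dict, List, Optional, Sequence, Tuple

    Row = Dict[str, Any]


    def dedup_rows(
        rows: Sequence[Row],
        fields: Sequence[str],
        allowed: int = 1,
        keepempty: bool = False,
        consecutive: bool = False,
    ) -> List[Row]:
        """Hypothetical model of ``dedup [int] <field-list> [keepempty] [consecutive]``."""
        if allowed < 1:
            raise ValueError("the number of retained duplicates must be greater than 0")
        if not fields:
            raise ValueError("at least one field is required")

        kept: List[Row] = []
        seen: Dict[Tuple[Any, ...], int] = {}       # counts per key (global mode)
        prev_key: Optional[Tuple[Any, ...]] = None  # last key seen (consecutive mode)
        run_length = 0

        for row in rows:
            # A missing field and a NULL value are both modeled as None.
            key = tuple(row.get(name) for name in fields)

            if any(value is None for value in key):
                # keepempty=true passes the row through; otherwise it is dropped.
                if keepempty:
                    kept.append(row)
                # Assumption: empty rows do not interrupt a consecutive run.
                continue

            if consecutive:
                # Only adjacent rows with the same key count as duplicates.
                run_length = run_length + 1 if key == prev_key else 1
                prev_key = key
                if run_length <= allowed:
                    kept.append(row)
            else:
                # Any earlier row with the same key counts as a duplicate.
                seen[key] = seen.get(key, 0) + 1
                if seen[key] <= allowed:
                    kept.append(row)

        return kept


    # Sample rows assembled from the example outputs in this document.
    accounts: List[Row] = [
        {"account_number": 1, "gender": "M", "email": "amberduke@pyrami.com"},
        {"account_number": 6, "gender": "M", "email": "hattiebond@netagy.com"},
        {"account_number": 13, "gender": "F", "email": None},
        {"account_number": 18, "gender": "M", "email": "daleadams@boink.com"},
    ]

    for row in dedup_rows(accounts, ["gender"], consecutive=True):
        print(row["account_number"], row["gender"])

Under these assumptions, the model returns the same account numbers as the outputs shown in Examples 1 through 4.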