$sample

The `$sample` stage is used in aggregation pipelines to randomly select a specified number of documents from a collection. It is useful for testing, data analysis, and generating random subsets of data for machine learning.

Syntax

{
  $sample: { size: <number> }
}

Parameters

size (number, required)

The number of documents to randomly select from the collection.
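As a minimal sketch of how the `size` parameter might be checked before a pipeline is sent to the server, the helper below builds a `$sample` stage in plain JavaScript. The function name `buildSampleStage` is hypothetical and not part of any driver, and the check assumes that `size` should be a positive integer.

```javascript
// Hypothetical helper (not part of any driver API): builds a $sample stage
// after checking that `size` is a positive integer (an assumption of this sketch).
function buildSampleStage(size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError(`$sample size must be a positive integer, got ${size}`);
  }
  return { $sample: { size } };
}

// A pipeline that samples five documents and keeps only their _id values.
const pipeline = [buildSampleStage(5), { $project: { _id: 1 } }];

// In mongosh this could be run as: db.stores.aggregate(pipeline)
console.log(JSON.stringify(pipeline));
```

Validating on the client is optional. It simply surfaces a bad `size` value before the query reaches the database.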

Examples

Sample Data

{
  "_id": "0fcc0bf0-ed18-4ab8-b558-9848e18058f4",
  "name": "First Up Consultants | Beverage Shop",
  "sales": { "totalSales": 75670 }
}

Randomly select five documents

This query randomly selects five documents and projects only their `_id` values. Because the selection is random, the IDs returned differ from run to run.

Query:

db.stores.aggregate([{
  $sample: { size: 5 }
}, {
  $project: { _id: 1 }
}])

Output:

[
  { "_id": "f7ae8b40-0c66-4e80-9261-ab31bbabffb4" },
  { "_id": "25350272-6797-4f98-91f8-fe79084755c7" },
  { "_id": "c7fd1d22-1a29-4cb0-9155-1ad71d600c2b" },
  { "_id": "e602b444-9519-42e3-a2e1-b5a3da5f6e64" },
  { "_id": "189c239a-edca-434b-baae-aada3a27a2c5" }
]
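The stage can also follow other stages, so the sample is drawn from a filtered set rather than the whole collection. The sketch below is illustrative only: it assumes the `sales.totalSales` field from the sample document, and the threshold of 50000 and the variable names are hypothetical.

```javascript
// Hypothetical threshold; `sales.totalSales` is the field shown in the sample document.
const minTotalSales = 50000;

const filteredSamplePipeline = [
  // Narrow the population first so the sample is drawn only from matching stores.
  { $match: { "sales.totalSales": { $gte: minTotalSales } } },
  // Then pick three of the matching documents at random.
  { $sample: { size: 3 } },
  // Keep only the fields of interest in the output.
  { $project: { name: 1, "sales.totalSales": 1 } }
];

// In mongosh this could be run as: db.stores.aggregate(filteredSamplePipeline)
console.log(JSON.stringify(filteredSamplePipeline, null, 2));
```

Placing `$match` before `$sample` matters: reversing the two stages would sample three documents first and then filter them, which could return fewer than three results.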
