Introduction to MongoDB $sample Operator
The $sample operator is an aggregation pipeline stage in MongoDB that randomly selects a specified number of documents from its input and passes them on as output. It is useful for displaying random documents and for testing.
Syntax
The syntax of the $sample operator is as follows:
{
  $sample: {
    size: <positive integer>
  }
}
Here, size is required and specifies the number of documents to randomly select. It must be a positive integer.
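As a minimal sketch of the rule above, the helper below builds a $sample stage and rejects any size that is not a positive integer before the pipeline is sent to the server. The function name sampleStage is hypothetical and not part of MongoDB; it only wraps the stage syntax shown earlier.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $sample stage and
// validates that size is a positive integer, as the syntax requires.
function sampleStage(size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("$sample size must be a positive integer, got: " + size);
  }
  return { $sample: { size: size } };
}

// Build a stage that asks for 3 random documents.
const stage = sampleStage(3);
console.log(JSON.stringify(stage));
```

In mongosh the returned stage could be passed directly to an aggregation, for example db.users.aggregate([sampleStage(3)]), assuming a collection named users exists.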
Use Cases
The $sample operator is mainly used in the following scenarios:
- Randomly selecting documents for display
- Randomly sampling data sets
- Testing, such as drawing random documents to simulate production-like data
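For the sampling scenarios above, it is common to narrow the input first and then sample from what remains. The sketch below shows one way to do that by placing a $match stage before $sample. The helper name randomSubsetPipeline, the age field, and the filter values are assumptions made for illustration only.

```javascript
// Hypothetical helper: filters documents with $match, then randomly
// selects up to `size` of the documents that passed the filter.
function randomSubsetPipeline(filter, size) {
  return [
    { $match: filter },
    { $sample: { size: size } },
  ];
}

// Example: a pipeline for up to 3 random documents whose age is 21 or more.
// The age field is an assumed field name.
const pipeline = randomSubsetPipeline({ age: { $gte: 21 } }, 3);
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the resulting array could be passed to db.<collection>.aggregate(pipeline) on any collection that has the assumed field.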
Example
Here is an example of using the $sample operator.
Assume there is a users collection containing the following documents:
{ "_id": 1, "name": "Alice", "age": 28 }
{ "_id": 2, "name": "Bob", "age": 35 }
{ "_id": 3, "name": "Charlie", "age": 42 }
{ "_id": 4, "name": "David", "age": 19 }
{ "_id": 5, "name": "Eva", "age": 25 }
The following aggregation pipeline randomly selects 2 documents:
db.users.aggregate([{ $sample: { size: 2 } }])
The output is two randomly chosen documents. Because the selection is random, each run may return a different pair. One possible result is:
{ "_id": 2, "name": "Bob", "age": 35 }
{ "_id": 4, "name": "David", "age": 19 }
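To make the effect of the example concrete without a running database, the following sketch imitates it on an in-memory array. It is an illustration only: the function sampleDocuments is hypothetical, it uses a partial Fisher-Yates shuffle, and it does not represent how the MongoDB server implements $sample.

```javascript
// Illustration only: mimics the effect of { $sample: { size: n } } on an
// in-memory array. This is NOT MongoDB's server-side implementation.
function sampleDocuments(docs, size, rng = Math.random) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const pool = docs.slice(); // copy so the caller's array keeps its order
  const count = Math.min(size, pool.length);
  // Partial Fisher-Yates shuffle: each step moves one randomly chosen
  // remaining document into position i.
  for (let i = 0; i < count; i++) {
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, count);
}

// The same five documents as in the users collection above.
const users = [
  { _id: 1, name: "Alice", age: 28 },
  { _id: 2, name: "Bob", age: 35 },
  { _id: 3, name: "Charlie", age: 42 },
  { _id: 4, name: "David", age: 19 },
  { _id: 5, name: "Eva", age: 25 },
];

const picked = sampleDocuments(users, 2);
console.log(picked); // two of the five documents; which two varies per run
```

Running the script several times prints different pairs, which mirrors the run-to-run variation of the aggregation shown above.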
Conclusion
The $sample operator is a useful aggregation pipeline stage in MongoDB for randomly selecting a specified number of documents. It suits scenarios such as displaying random documents, sampling data sets, and testing. Note that $sample can be expensive on large collections, particularly when it has to read and randomly sort all of its input, so weigh that cost against the benefit for your workload.