Introduction to MongoDB $substrBytes Operator
The $substrBytes operator is a string aggregation operator in MongoDB that extracts a substring from a string. Unlike the $substrCP operator, which counts in UTF-8 code points, the $substrBytes operator counts in UTF-8 bytes. The two give the same result for ASCII text. For strings containing multi-byte characters, the start index and the end of the requested range must fall on character boundaries; otherwise the operation fails with an error.
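The difference between counting bytes and counting code points can be seen outside MongoDB as well. The following sketch, written in plain JavaScript rather than as an aggregation pipeline, compares the two counts for a few sample strings. The sample values and the helper name measure are illustrative assumptions, not part of MongoDB.

```javascript
// Compare code point count and UTF-8 byte count for sample strings.
// `measure` is a hypothetical helper used for illustration only.
const encoder = new TextEncoder();

function measure(text) {
  return {
    text,
    codePoints: [...text].length,        // string iteration yields code points
    bytes: encoder.encode(text).length   // length of the UTF-8 encoding
  };
}

// "John" (ASCII), "café" (one 2-byte character), "日本語" (three 3-byte characters)
const samples = ["John", "caf\u00e9", "\u65e5\u672c\u8a9e"].map(measure);
console.log(samples);
```

For an ASCII string such as "John" both counts are equal, so byte positions and code point positions address the same characters. For the other two samples the counts diverge, which is where the choice between $substrBytes and $substrCP starts to matter.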
Syntax
The syntax for the $substrBytes operator is as follows:
{ $substrBytes: [ <string>, <start>, <length> ] }
Where:
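string: The original string from which to extract the substring.
start: The zero-based byte index at which the substring begins. It must not fall in the middle of a multi-byte character.
length: The length of the substring to extract, in bytes. The resulting end position must not fall in the middle of a multi-byte character.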
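To make the byte-offset rules concrete, here is a minimal sketch in plain JavaScript that imitates how a byte-based substring behaves on UTF-8 text. The function name substrBytesSketch is hypothetical, and this is not MongoDB's implementation. It assumes non-negative integer arguments and rejects any range that would split a multi-byte character.

```javascript
// Illustrative emulation of byte-based slicing over UTF-8 text.
// Not MongoDB's implementation; `substrBytesSketch` is a hypothetical name.
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

function substrBytesSketch(str, start, length) {
  if (!Number.isInteger(start) || !Number.isInteger(length) || start < 0 || length < 0) {
    throw new RangeError("this sketch only accepts non-negative integers");
  }
  const bytes = utf8Encoder.encode(str);
  const end = Math.min(start + length, bytes.length);

  // UTF-8 continuation bytes have the bit pattern 10xxxxxx.
  const splitsCharacter = (i) => i < bytes.length && (bytes[i] & 0xc0) === 0x80;

  if (splitsCharacter(start)) {
    throw new RangeError("start index falls inside a multi-byte character");
  }
  if (splitsCharacter(end)) {
    throw new RangeError("end index falls inside a multi-byte character");
  }
  return utf8Decoder.decode(bytes.subarray(start, end));
}

console.log(substrBytesSketch("John", 0, 2));      // Jo
console.log(substrBytesSketch("caf\u00e9", 3, 2)); // é (a single 2-byte character)
```

Calling substrBytesSketch("caf\u00e9", 0, 4) throws in this sketch, because byte 4 is the second byte of "é". That is the kind of range the boundary restriction on start and length rules out.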
Use Cases
The $substrBytes operator is typically used to extract a portion of a string at known byte offsets, such as a fixed-width field from a string containing log information. It works best when the text is ASCII or when the offsets are known to fall on character boundaries. For strings with arbitrary multi-byte characters, the $substrCP operator is usually the safer choice.
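As a sketch of the log-parsing use case, suppose a hypothetical logs collection whose line field starts with a fixed-width ASCII timestamp, for example "2024-01-15 10:32:07 Disk full". The collection name, the field names, and the log layout are assumptions made for illustration. Because the prefix is pure ASCII, byte offsets and character offsets coincide, so the date and time can be cut out by position.

```javascript
// Hypothetical pipeline: split a fixed-width timestamp prefix out of `line`.
// Assumes every `line` value starts with "YYYY-MM-DD HH:MM:SS ".
const logPipeline = [
  {
    $project: {
      date: { $substrBytes: ["$line", 0, 10] }, // bytes 0-9, e.g. "2024-01-15"
      time: { $substrBytes: ["$line", 11, 8] }  // bytes 11-18, e.g. "10:32:07"
    }
  }
];

// In mongosh, the pipeline would be run against the assumed collection as:
// db.logs.aggregate(logPipeline)
```

Keeping the offsets in one pipeline definition makes it easier to adjust them if the log layout changes.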
Examples
Suppose the users collection contains the following documents:
{ "_id": 1, "name": "John", "address": "123 Main St, Anytown, USA" }
{ "_id": 2, "name": "Alice", "address": "456 Second St, Othertown, USA" }
The following example uses the $substrBytes operator to extract the first two bytes from the name field and stores the result in a new field, name_short:
db.users.aggregate([
  {
    $project: {
      name: 1,
      name_short: { $substrBytes: ["$name", 0, 2] }
    }
  }
])
After executing the above aggregation pipeline, the following results will be obtained:
{ "_id": 1, "name": "John", "name_short": "Jo" }
{ "_id": 2, "name": "Alice", "name_short": "Al" }
The following example uses the $substrBytes operator to extract 10 bytes starting from the fifth byte (index 4) of the address field and stores the result in a new field, address_short:
db.users.aggregate([
  {
    $project: {
      address: 1,
      address_short: { $substrBytes: ["$address", 4, 10] }
    }
  }
])
After executing the above aggregation pipeline, the following results will be obtained:
{ "_id": 1, "address": "123 Main St, Anytown, USA", "address_short": "Main St, A" }
{ "_id": 2, "address": "456 Second St, Othertown, USA", "address_short": "Second St," }
Conclusion
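The $substrBytes operator is a string aggregation operator in MongoDB that extracts a substring from a string based on UTF-8 byte positions. It takes the source string, a zero-based start index, and a length in bytes, and it is most dependable on ASCII text or wherever the offsets are known to land on character boundaries.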