Alvin Lucillo

Projecting a substring

/ 1 min read

Substring functions vary across programming languages. In MongoDB, we can extract a substring with $substrBytes, which takes three arguments: the string expression (here, a field of the document), the zero-based starting byte index, and the number of bytes to return. Note that $substrBytes slices the value by UTF-8 bytes, not by characters. If the starting index or the end of the slice falls in the middle of a multi-byte character, the operation fails with an error. I will discuss multi-byte characters in upcoming entries.

// test data
db.getCollection("unicode_demo").insertMany([
	{ label: "with_emoji", text: "Hi☺!" }, // ☺ (U+263A) takes 3 bytes in UTF-8
]);

db.getCollection("unicode_demo").aggregate([
	{
		$project: {
			label: 1,
			text: 1,
			substr_bytes: { $substrBytes: ["$text", 0, 2] },
		},
	},
]);

Output:

[
  {
    _id: ObjectId('69b2a1f53e54149a74d5346f'),
    label: 'with_emoji',
    text: 'Hi☺!',
    substr_bytes: 'Hi'
  }
]
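
The slice above works because bytes 0 and 1 are the single-byte characters H and i. To see where other byte ranges would land, here is a minimal sketch in plain Node.js (no database needed) that maps each character of the test string to its UTF-8 byte offset. The helpers utf8Layout and isBoundarySafe are hypothetical names I made up for illustration; they are not part of MongoDB, and the check only covers ranges that stay inside the string.

```javascript
// Hypothetical helper (not a MongoDB API): list each character with the
// UTF-8 byte offset where it starts and the number of bytes it occupies.
function utf8Layout(text) {
	const layout = [];
	let offset = 0;
	// for...of iterates by Unicode code point, not by UTF-16 code unit.
	for (const ch of text) {
		const size = Buffer.byteLength(ch, "utf8");
		layout.push({ ch, start: offset, size });
		offset += size;
	}
	return layout;
}

// Hypothetical helper: true when both ends of the byte range
// [start, start + length) fall on character boundaries inside the string.
function isBoundarySafe(text, start, length) {
	const boundaries = new Set([0]);
	for (const { start: s, size } of utf8Layout(text)) {
		boundaries.add(s + size);
	}
	return boundaries.has(start) && boundaries.has(start + length);
}

console.log(utf8Layout("Hi☺!"));
console.log(isBoundarySafe("Hi☺!", 0, 2)); // range used in the example above
console.log(isBoundarySafe("Hi☺!", 0, 3)); // would end inside ☺
```

For "Hi☺!", the layout shows ☺ starting at byte 2 and occupying 3 bytes, so the character boundaries are 0, 1, 2, 5, and 6. The range (0, 2) is reported as safe, while (0, 3) is not, because it would end in the middle of ☺.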