[Mongoose] Streams Process Only the Default batchSize of Documents When Sorting by an Unindexed Field
This post covers a nasty Mongoose issue: if you sort by an unindexed field and read the results with a stream, only the default batchSize of documents is processed, and then the stream ends as if nothing went wrong.
When retrieving all documents with find(), the callback's first argument, err, was null, yet the last element of the returned array was not a real document: it contained $err and code.
User.find().sort({
  createdAt: -1
}).setOptions({
  noCursorTimeout: true
}).exec(function(err, docs) {
  console.log(err);
  console.log(docs.length);
  console.log(docs[docs.length - 1]);
  /**
  Output of the console.log calls above:
  null
  1001
  { '$err': 'getMore runner error: Overflow sort stage buffered data usage of 33555763 bytes exceeds internal limit of 33554432 bytes',
    code: 17406,
    ...
  **/
});
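Because err is null in this situation, one defensive option is to inspect the last element of the result yourself. The following is only a minimal sketch, not a Mongoose API: the names toPlain and splitErrorDocument are hypothetical, and it assumes the error marker can be read either as a plain property or through toObject() on a hydrated document.

```javascript
// Hypothetical helper (not part of Mongoose): return a plain-object view of a
// document, using toObject() when the value looks like a hydrated document.
function toPlain(doc) {
  return doc && typeof doc.toObject === 'function' ? doc.toObject() : doc;
}

// Hypothetical helper: if the last element carries a '$err' string, remove it
// from the results and surface it as an Error with the server's code attached.
function splitErrorDocument(docs) {
  if (docs.length === 0) {
    return { docs: docs, error: null };
  }
  const last = toPlain(docs[docs.length - 1]);
  if (last && typeof last.$err === 'string') {
    const error = new Error(last.$err);
    error.code = last.code;
    return { docs: docs.slice(0, -1), error: error };
  }
  return { docs: docs, error: null };
}
```

Inside the exec() callback you could call splitErrorDocument(docs) and treat a non-null error the same way as a non-null err.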
The same happens with stream(): the last document delivered to the data handler contained $err and code.
var stream = User.find()
  .sort({ createdAt: -1 })
  .setOptions({ noCursorTimeout: true })
  .stream();
stream.on('data', function(user){
  /**
  Only the default batchSize of 1000 documents is retrieved. The 1001st "document" carries $err and code, and the stream then completes normally:
  { '$err': 'getMore runner error: Overflow sort stage buffered data usage of 33555763 bytes exceeds internal limit of 33554432 bytes',
    code: 17406,
    ...
  **/
});
stream.on('error', function(err){
  // The error event is never emitted
});
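Since the error event never fires, the data handler is the only place where the failure is visible. As a rough sketch, a wrapper like the one below could turn the $err document into an explicit error callback. The name guardDataHandler and the callbacks handleUser and handleError are hypothetical, and the toObject() check is an assumption about how the error fields are exposed on a hydrated document.

```javascript
// Hypothetical wrapper (not a Mongoose API): wraps a 'data' handler so that a
// document carrying a '$err' string is reported through onError instead of
// being passed to onDoc as if it were a normal document.
function guardDataHandler(onDoc, onError) {
  let failed = false;
  return function (doc) {
    if (failed) {
      return; // ignore anything delivered after the error marker
    }
    const plain = doc && typeof doc.toObject === 'function' ? doc.toObject() : doc;
    if (plain && typeof plain.$err === 'string') {
      failed = true;
      const error = new Error(plain.$err);
      error.code = plain.code;
      onError(error);
      return;
    }
    onDoc(doc);
  };
}

// Possible usage with a query stream (handleUser and handleError are placeholders):
// stream.on('data', guardDataHandler(handleUser, handleError));
// stream.on('error', handleError);
```

This keeps the normal processing path unchanged and only adds a check per document, at the cost of relying on the shape of the error document shown in the output above.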
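The root cause is the missing index: without one, MongoDB has to sort in memory, and the error message shows that sort exceeding its internal limit of 33554432 bytes (32 MiB). Creating an index on the sort field solves it.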
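// path() is a Schema method, so call it on the schema that the User model is
// compiled from (called UserSchema here for illustration), before mongoose.model().
UserSchema.path('createdAt').index(true);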
If you would rather not create an index because of index size concerns, you can sort by _id instead, which is always indexed and, for default ObjectIds, roughly follows insertion order.
User.find().sort({
  _id: -1
}).setOptions({
  noCursorTimeout: true
}).exec(function(err, docs) {});
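To see why _id can stand in for a creation-time sort, here is a small sketch that decodes the 4-byte seconds timestamp at the start of a default ObjectId by hand. The function name objectIdToDate is made up for illustration, and it only applies when _id is a driver-generated ObjectId. In practice the driver's ObjectId type offers getTimestamp() for the same purpose.

```javascript
// Illustrative helper: extract the creation time embedded in an ObjectId.
// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new TypeError('expected a 24-character hex ObjectId string');
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}
```

Because the timestamp has one-second resolution, documents created within the same second are not guaranteed to sort in exact creation order.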
Unless you happen to notice that only 1000 documents come back, you will never realize an error occurred, which looks like poor error handling on Mongoose's part…
That’s all from the Gemba.