[Mongoose] Trouble with Stream Processing Only Default batchSize When Sorting by Unindexed Field

Tadashi Shigeoka ·  Mon, October 26, 2015

I’ll introduce a troublesome issue in Mongoose where specifying an unindexed field for sort and retrieving with stream only processes the default batchSize of documents and then completes normally.

mongoose | マングース

Case of Bulk Retrieval with find()

When retrieving all documents with find(), the first argument err in the callback contained no value, but the last element of the array contained $err and code.

User.find().sort({
  createdAt: -1
}).setOptions({
  noCursorTimeout: true
}).exec(function(err, docs) {
  console.log(err);
  console.log(docs.length);
  console.log(docs[docs.length - 1]);

/**
Console.log output results below:

null
1001
{ '$err': 'getMore runner error: Overflow sort stage buffered data usage of 33555763 bytes exceeds internal limit of 33554432 bytes',
  code: 17406,
...
**/
});

Case of Retrieving All Documents with stream()

The same applies when retrieving with stream() - the last retrieved document contained $err and code.

var stream = User.find()
.sort({ createdAt: -1 })
.setOptions({ noCursorTimeout: true })
.stream();

stream.on('data', function(user){
/**
Only retrieves default batchSize 1000 documents, gets data containing $err and code as the 1001st document, and completes normally
{ '$err': 'getMore runner error: Overflow sort stage buffered data usage of 33555763 bytes exceeds internal limit of 33554432 bytes',
  code: 17406,
...
**/
});

stream.on('error', function(err){
  // Error event is not called
});

Solutions

Create an index for the field specified in sort

Since the lack of an index is the root problem, creating an index will solve it.

User.path('createdAt').index(true);

Specify a field with an existing index for sort

If you don’t want to create an index due to index size concerns, you can specify the default _id for sort.

User.find().sort({
  _id: -1
}).setOptions({
  noCursorTimeout: true
}).exec(function(err, docs) {});

Thoughts

Unless you notice that only 1000 documents are being retrieved, you won’t realize an error has occurred, which seems like poor error handling implementation in Mongoose…

Reference Information

That’s all from the Gemba.