Chapter 5. Administration Tips
Tip #39: Manually clean up your chunks collections
GridFS keeps file contents in a collection of chunks, called fs.chunks by default. Each document in the files collection points to one or more documents in the chunks collection. It’s good to check every once in a while and make sure that there are no “orphan” chunks, that is, chunks floating around with no link to a file. This could occur if the database was shut down in the middle of saving a file (the fs.files document is written after the chunks).
To check over your chunks collection, choose a time when there’s little traffic (as you’ll be loading a lot of data into memory) and run something like:
> var cursor = db.fs.chunks.find({}, {"_id" : 1, "files_id" : 1});
> while (cursor.hasNext()) {
...     var chunk = cursor.next();
...     if (db.fs.files.findOne({_id : chunk.files_id}) == null) {
...         print("orphaned chunk: " + chunk._id);
...     }
... }
This will print out the _ids for all orphaned chunks.
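The loop above does one fs.files lookup per chunk. As a rough sketch of an alternative, the function below looks up each distinct files_id once and then lists the chunks that belong to the missing files. The function name findOrphanedChunks and the shape of the returned objects are illustrative assumptions, not part of GridFS or the shell. It also assumes that the list of distinct files_id values is small enough to be returned as a single array.

```javascript
// Sketch only: findOrphanedChunks is a hypothetical helper, not a shell built-in.
// `db` is the database handle (the global `db` in the mongo shell).
function findOrphanedChunks(db) {
    var orphans = [];
    // distinct() returns an array of every files_id referenced by some chunk.
    var fileIds = db.fs.chunks.distinct("files_id");
    fileIds.forEach(function (fileId) {
        // findOne() returns null when no fs.files document has this _id.
        if (db.fs.files.findOne({_id : fileId}) === null) {
            // Every chunk pointing at the missing file is an orphan candidate.
            db.fs.chunks.find({files_id : fileId}, {_id : 1}).forEach(function (chunk) {
                orphans.push({chunkId : chunk._id, filesId : fileId});
            });
        }
    });
    return orphans;
}
```

In the shell, calling findOrphanedChunks(db) returns the candidates as an array, so they can be printed or reviewed before anything is deleted.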
Now, before you go through and delete all of the orphaned chunks, make sure that they are not parts of files that are currently being written! You should check db.currentOp() and the fs.files collection for recent uploadDates.
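One way to make that pre-delete check repeatable is sketched below. It counts fs.files documents uploaded within a recent window and counts in-progress operations whose namespace ends in fs.chunks. The function name checkUploadActivity, the quietMinutes parameter, and the returned field names are assumptions made for illustration. The sketch also assumes that db.currentOp() returns a document with an inprog array whose entries carry an ns field, and that GridFS uses the default fs prefix.

```javascript
// Sketch only: checkUploadActivity is a hypothetical helper, not a shell built-in.
// quietMinutes is how far back to look for recent uploads (an assumed parameter).
function checkUploadActivity(db, quietMinutes) {
    var cutoff = new Date(Date.now() - quietMinutes * 60 * 1000);

    // Files whose metadata was written after the cutoff suggest ongoing uploads.
    var recentUploads = db.fs.files.find({uploadDate : {$gt : cutoff}}).toArray().length;

    // In-progress operations that touch the chunks collection.
    var activeChunkOps = db.currentOp().inprog.filter(function (op) {
        return typeof op.ns === "string" && /\.fs\.chunks$/.test(op.ns);
    }).length;

    return {
        recentUploads : recentUploads,
        activeChunkOps : activeChunkOps,
        // "quiet" only means neither signal fired; it is not a guarantee.
        quiet : recentUploads === 0 && activeChunkOps === 0
    };
}
```

A result with quiet set to false is a reason to wait and re-run the orphan check later rather than delete anything.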
Tip #40: Compact databases with repair
In Tip #31: Do not depend on repair to recover data, we cover why you usually shouldn’t use repair to actually repair your data (unless you’re in dire straits). However, repair can be used to compact databases.
Note
Hopefully this tip will become irrelevant soon, once the ...