All that glitters
MongoDB Gotchas (a couple)

A friend asked me last night for my favorite “Mongo Gotcha.”  He of course had one in mind, but I came up with a few on the spot, so I thought I’d jot them down.  (If you have more good ones, feel free to add them in the comments, and I’ll update.)

Replication Gotcha #1: (this was his) If you lose a majority of your replica nodes, those left will be unable to assume Primary roles (because they can’t come up with a voting majority), and your cluster will be down.  This is a good thing (prevents nasty segmentation problems), but if you’re in that situation, the only way to recover is to restart a node in a non-replica-set mode, alter its config manually to grant it more votes, and restart.  You won’t be able to do anything that writes to config without a Primary, so you can’t add arbiters or anything else in this mode.
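
The voting arithmetic behind this is easy to sketch.  What follows is only the majority math, not MongoDB's actual election code, and it assumes every member carries one vote unless you've changed that; the function names are mine.

```python
def majority_needed(total_votes: int) -> int:
    """Votes required to elect a Primary: a strict majority of all configured votes."""
    return total_votes // 2 + 1


def can_elect_primary(total_votes: int, reachable_votes: int) -> bool:
    """True if the members that can still see each other hold a voting majority."""
    return reachable_votes >= majority_needed(total_votes)


if __name__ == "__main__":
    # Assumed setup: five members with one vote each.
    # Losing two members leaves a majority; losing three does not.
    for survivors in (3, 2):
        print(f"{survivors} of 5 up -> can elect: {can_elect_primary(5, survivors)}")
```

Note that the majority is counted against the configured total, not against whoever happens to be alive, which is why the survivors can't simply elect one of themselves.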

Replication Gotcha #2: When adding new replicas, pay very close attention to the size of your dataset and the size of your oplog.  You can run `db.printReplicationInfo()` on the primary node to see how many hours of oplog the node has.  When you bring a new node into a replica set cold, it will first load all data from the Primary (or from a Secondary if you tell it to, which I highly recommend), and then start “catching up” by getting oplog data shipped to it.  I’ve gone to the trouble of dropping a new node into a cluster, only to have it overrun the available oplog by 15 minutes (if the initial sync takes so long that the capped oplog rolls over in the interim, the node is permanently unable to catch up).  The best strategy is to take a snapshot/backup of an existing replica and start new nodes from it using `--fastsync` (they skip the initial download and can get into oplog syncing quickly).  Also, try to make sure you have a ton of extra oplog (`--oplogSize` on the command line, I think).
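
Before dropping a cold node in, it's worth doing the back-of-the-envelope math.  Here's a rough sketch of that check; every number in it is hypothetical, the function names are mine, and the 2x safety factor is an arbitrary assumption rather than any official guidance.

```python
def oplog_window_hours(oplog_size_mb: float, write_rate_mb_per_hour: float) -> float:
    """Roughly how many hours of writes fit in the capped oplog before it rolls over."""
    if write_rate_mb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / write_rate_mb_per_hour


def initial_sync_hours(dataset_gb: float, copy_rate_gb_per_hour: float) -> float:
    """Roughly how long a cold node needs to copy the full dataset."""
    if copy_rate_gb_per_hour <= 0:
        raise ValueError("copy rate must be positive")
    return dataset_gb / copy_rate_gb_per_hour


def sync_is_safe(window_hours: float, sync_hours: float, safety_factor: float = 2.0) -> bool:
    """True if the oplog window covers the initial sync with some headroom.

    safety_factor is an assumed margin, not a MongoDB-recommended value.
    """
    return window_hours >= sync_hours * safety_factor


if __name__ == "__main__":
    # Hypothetical cluster: 10 GB oplog, 512 MB/hour of oplog writes,
    # 500 GB dataset copied at 50 GB/hour.
    window = oplog_window_hours(10240, 512)
    sync = initial_sync_hours(500, 50)
    print(f"oplog window: {window:.1f}h, initial sync: {sync:.1f}h, "
          f"safe: {sync_is_safe(window, sync)}")
```

Take the real oplog window from the shell output on your primary and measure the copy rate from a previous sync, rather than trusting estimates like these.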

Sharding Gotcha: Understand the heck out of your shard keys, and think about them very carefully.  Mongo distributes data according to key ranges within the total keyspace of all shard keys.  If your keys are too random, you’ll get nicely distributed writes but put excessive load on the balancer when it has to act, because there is no simple way to segment the keyspace in order to re-jigger chunk distribution.  If they’re too sequential, you’ll end up writing all new data to only one shard, because all new keys land in a single key range, which then forces the balancer to do all the work of distributing data across the cluster.  I’ve seen the absolute worst of both worlds: 90% of my existing data used a random value for the shard key, and then I changed to a semi-sequential key (which was actually a great key construction, by itself).  In the overall keyspace, including all of my random keys, all of my new “good” shard keys ended up writing to a single node, forcing the balancer to do the work.  But of course, all of the rest of my data was randomly distributed, so the balancer became (a) overworked and (b) horribly inefficient.  Ouch.
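
To make the key-range behavior concrete, here's a toy model of range-based routing.  It is a sketch under loud assumptions: fixed split points, chunks handed to shards round-robin, and no balancer at all, so it only shows where writes land, not how Mongo splits or migrates chunks.  All names and numbers are made up for illustration.

```python
import bisect
import random
from collections import Counter


def build_chunk_map(split_points, num_shards):
    """Assign chunks to shards round-robin (an assumption of this toy model).

    N split points divide the keyspace into N + 1 chunks.
    """
    return [i % num_shards for i in range(len(split_points) + 1)]


def shard_for_key(key, split_points, chunk_to_shard):
    """Find the chunk whose key range contains `key`, then return its shard."""
    chunk = bisect.bisect_right(split_points, key)
    return chunk_to_shard[chunk]


def write_distribution(keys, split_points, chunk_to_shard):
    """Count how many writes each shard receives for the given keys."""
    return Counter(shard_for_key(k, split_points, chunk_to_shard) for k in keys)


if __name__ == "__main__":
    # Hypothetical keyspace 0..999 cut into four chunks on four shards.
    split_points = [250, 500, 750]
    chunk_map = build_chunk_map(split_points, num_shards=4)

    # Random keys spread across every key range.
    rng = random.Random(42)
    random_keys = [rng.randrange(1000) for _ in range(1000)]
    print("random keys:    ", dict(write_distribution(random_keys, split_points, chunk_map)))

    # Sequential keys all sort after the last split point, so every
    # write lands in the final chunk, which lives on shard 3.
    sequential_keys = range(1000, 1100)
    print("sequential keys:", dict(write_distribution(sequential_keys, split_points, chunk_map)))
```

The sequential case is the hot spot described above: every new key falls into the top range, so one shard takes all the writes until chunks get moved elsewhere.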

I should be clear that MongoDB is an awesome piece of technology, and 10gen is already working to make some of these gotchas less gotcha-ish.  But knowing the details of your technology stack is crucial to managing anything at scale (and at GameChanger, that’s the priority of the hour).

Fun.
