TL;DR: pros and cons for me personally. Your mileage may vary.
- MongoDB is a lot more flexible than MySQL, which is its main benefit and its main drawback at the same time.
- Can scale horizontally.
- Is faster, not because of the underlying technology but because best practices encourage developers to avoid normalization. For example, a user -> address relationship would be stored in the same document, whereas in MySQL it would live in separate tables. When the data is queried, MongoDB returns it from a single document; MySQL has to look it up and join it together.
- Data duplication is encouraged. I advise caution: implement a mechanism for updating the duplicated data, and don't postpone it.
- MongoDB has an "aggregation pipeline" which dramatically eases data transformation and computation and it is more readable.
- MongoDB has no SQL JOINs, but it supports JOIN-like operations through the $lookup aggregation stage.
- MongoDB is not vulnerable to SQL injection because there is no SQL, although unsanitized user input can still lead to NoSQL (query operator) injection.
- No foreign keys in MongoDB. If one document references another, you have to keep the data in sync yourself in the application layer.
- I'm probably a fan and biased.
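To illustrate the aggregation pipeline and the JOIN-like operation from the list above, here is a minimal sketch of a pipeline definition. The collection names (users, cats) and the fields active, owner_id, and first_name are assumptions made up for this example, and the pipeline is only built here, not run against a database.

```python
# Hypothetical collections: "users" and "cats", where each cat stores an "owner_id".
pipeline = [
    # Keep only active users (the "active" field is an assumption).
    {"$match": {"active": True}},
    # $lookup is the JOIN-like stage: a left outer join from users to cats.
    {
        "$lookup": {
            "from": "cats",
            "localField": "_id",
            "foreignField": "owner_id",
            "as": "cats",
        }
    },
    # Compute a new field from the joined array.
    {"$addFields": {"cat_count": {"$size": "$cats"}}},
    # Return only the fields we care about.
    {"$project": {"first_name": 1, "cat_count": 1}},
]

# With a driver such as pymongo this would be executed as:
#   db.users.aggregate(pipeline)
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

Each stage receives the output of the previous one, which is what makes a pipeline easy to read from top to bottom.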
I'm a developer, not a data engineer or data scientist. I don't do data analytics. Yet.
MongoDB is very flexible, and that is a blessing and a curse. It will store any data you provide. By default, there is no validation; you can set up validation rules on the database, or use an ODM such as Mongoose (if you're using Node.js) to help with validation in the application layer.
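As a rough sketch of what database-level validation rules can look like, the snippet below builds a $jsonSchema validator. The users collection and the email and age fields are hypothetical, and the driver call is shown only as a comment because it needs a running server.

```python
# A $jsonSchema validator document. Field names are assumptions for illustration.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["email", "age"],
        "properties": {
            "email": {"bsonType": "string"},
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}

# With pymongo, the rules could be attached when the collection is created:
#   db.create_collection("users", validator=user_validator)
# With such a validator in place, a document whose "age" is the string "42"
# would be rejected instead of being stored silently.
```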
Documents (think rows in MySQL) can have a versioning key, which tells the app layer which version of the document structure it is dealing with. This is useful when the schema changes and you need to migrate user data to match. With MySQL, one would change the database layer (add or change columns) and then write a long-running job to migrate the data, which could introduce some downtime or cause bad UX. The same can be done in MongoDB. Alternatively, one can use the version key and write a middleware that migrates a document to the new structure when the user accesses the part of the application that requires the change. This way there are no long-running jobs, and it can be done in a serverless and stateless environment. The drawback is that documents in the collection stay inconsistent with each other until all users have been migrated. Which approach is better depends on the specific use case.
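The on-access migration described above could look roughly like this. It is a sketch using plain dictionaries in place of real documents; the schema_version key and the v1 -> v2 change (splitting name into first_name and last_name) are invented for the example.

```python
CURRENT_SCHEMA_VERSION = 2  # hypothetical latest version of the user document


def migrate_user(doc: dict) -> dict:
    """Return a copy of doc upgraded to CURRENT_SCHEMA_VERSION."""
    migrated = dict(doc)
    # Documents written before versioning was introduced count as version 1.
    version = migrated.get("schema_version", 1)

    if version < 2:
        # Hypothetical change v1 -> v2: split "name" into first and last name.
        first, _, last = migrated.pop("name", "").partition(" ")
        migrated["first_name"] = first
        migrated["last_name"] = last
        version = 2

    migrated["schema_version"] = version
    return migrated


old_doc = {"_id": 1, "name": "Ada Lovelace"}
new_doc = migrate_user(old_doc)
print(new_doc)
```

A middleware would call a function like migrate_user right after reading the document and write the result back only when the version has changed, so each user is migrated the first time they touch the affected feature.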
In MySQL, we strive to normalize data. MongoDB is not so straightforward: it encourages data denormalization and duplication. Take the user -> address relation, for example. In MySQL the address would go into a separate table, and you'd call it a day. With MongoDB, it depends more on how you query the data. If the application frequently needs a user's address, it makes sense to store it in the same document (row).
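The difference between the two layouts can be sketched with plain Python dictionaries standing in for tables and documents. All names and values here are made up for illustration.

```python
# Normalized, MySQL-like layout: two "tables" linked by user_id.
users = {1: {"id": 1, "name": "Ada"}}
addresses = {10: {"id": 10, "user_id": 1, "city": "London"}}


def city_normalized(user_id: int) -> str:
    # A second lookup is needed to find the address, the equivalent of a JOIN.
    address = next(a for a in addresses.values() if a["user_id"] == user_id)
    return address["city"]


# Denormalized, MongoDB-like layout: the address is embedded in the user document.
user_doc = {
    "_id": 1,
    "name": "Ada",
    "address": {"city": "London", "street": "12 Example St"},
}


def city_embedded(doc: dict) -> str:
    # Everything arrives with the single document that was read.
    return doc["address"]["city"]


print(city_normalized(1), city_embedded(user_doc))
```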
This can be extended even further by utilizing the "computed pattern". Let's say we need to know how many cats a user has. Cats are stored in another collection (table). So we query the number of cats that belong to the user and store the result in the user document as cat_count. As another example, let's say the application needs to display the owner of a cat. Owners of pets don't change that often, so it would make sense to embed the required owner data, e.g. first name, last name, and address, into the cat document.
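Both ideas, the stored cat_count and the owner data embedded in the cat document, can be sketched as follows. Lists of dictionaries stand in for the two collections, and the helper names refresh_cat_count and embed_owner are hypothetical.

```python
# Stand-ins for the "users" and "cats" collections; all values are made up.
users = [{"_id": 1, "first_name": "Ada", "last_name": "Lovelace", "address": "London"}]
cats = [
    {"_id": 100, "name": "Tom", "owner_id": 1},
    {"_id": 101, "name": "Felix", "owner_id": 1},
]


def refresh_cat_count(user: dict, all_cats: list) -> None:
    # Computed pattern: count once, store the result on the user document.
    user["cat_count"] = sum(1 for cat in all_cats if cat["owner_id"] == user["_id"])


def embed_owner(cat: dict, owner: dict) -> None:
    # Copy only the owner fields the application displays alongside the cat.
    cat["owner"] = {key: owner[key] for key in ("first_name", "last_name", "address")}


refresh_cat_count(users[0], cats)
for cat in cats:
    embed_owner(cat, users[0])

print(users[0]["cat_count"], cats[0]["owner"])
```

The trade-off is that both helpers have to be run again whenever the underlying data changes, for example when a cat is added or an owner moves, which is the update mechanism duplicated data needs.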
MongoDB is available as a managed service on the majority of cloud providers, so hosting shouldn't be much of a problem. Atlas (MongoDB's managed service) offers a free tier on a shared cluster, which is enough to get started. Running it yourself is a different story. I haven't done it myself, so I can't comment much on that.
Where MongoDB is great and where it isn't.
From my point of view, MongoDB is a good general-purpose database with many use cases, such as gaming, big data, and analytics (including real-time analytics).
In my use cases, I haven't found many negative things to say, but these are the complaints I've seen from other people:
- No comfortable support. Few apps these days require this.
- Bad search performance. This is also not true: Atlas users have a dedicated search option, while $text is a legacy query operator meant to be used in self-managed deployments.
- Data duplication. I often see it mentioned as a con, probably because the authors haven't used it or have mismanaged their data to the point of corruption. If you're not comfortable with data duplication, just don't use it. You can use a NoSQL database with SQL-like design patterns and reach for NoSQL features only where you need them.
- MongoDB doesn't enforce a schema. It will store what you give it, and this can become a big problem. Validation rules can be set up, but in practice this would probably be done in the ODM/ORM layer. By default, it won't throw an error if you try to store age as a string instead of an int.
There are some limitations one wouldn't think about before starting:
- Max document size is 16 MB. In normal use cases this should be fine, although a MySQL LONGTEXT column supports up to 4 GB of text data.
- The maximum document nesting depth is 100 levels.
These are the limitations I think are worth mentioning. There are more, which you can look up yourself, but in general, I think that if you hit MongoDB's other limits, either you're doing something wrong or you have an interesting problem on your hands.
MongoDB is the most popular NoSQL database, and I think it should be considered for a new project. As with everything, it has pros and cons, but in my experience the pros outweigh the cons.
These are my thoughts on MongoDB.
Let's get over this already