TL;DR: pros and cons for me personally. Your mileage may vary.
- MongoDB is a lot more flexible than MySQL, which is its main benefit and its main drawback at the same time.
- Can scale horizontally.
- Is faster, though not because of the underlying technology: best practices encourage developers to avoid normalization. For example, a user -> address relationship would be stored in the same document, whereas in MySQL it would live in separate tables. When the data is queried, it comes back from a single document; in MySQL it has to be looked up and then joined together.
- Data duplication is encouraged. I advise caution: implement a mechanism for keeping duplicated data up to date, and don't postpone it.
- MongoDB has an "aggregation pipeline", which dramatically eases data transformation and computation and is more readable.
- MongoDB doesn't support SQL-style JOINs, but it supports join-like operations through the `$lookup` aggregation stage.
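- MongoDB has no SQL injection vulnerability, because there is no SQL. Unsanitized user input can still be abused inside queries (NoSQL injection), so validate it anyway.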
- No foreign keys in MongoDB. If one document references another, you have to handle that relationship in the application layer.
- I'm probably a fan and biased.
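To make the aggregation pipeline and join points above more concrete, here is a minimal sketch of a pipeline written as a plain Python list, the way pymongo accepts it. The collection names (`users`, `cats`), the field names (`active`, `owner_id`) and the helper `users_by_cat_count` are all made up for illustration, and `db` is assumed to be a pymongo `Database`.

```python
# Hypothetical collections: "users", and "cats" that point back to users via owner_id.
pipeline = [
    # 1. Keep only active users (assumed boolean field).
    {"$match": {"active": True}},
    # 2. Join-like step: pull each user's cats into an array field called "cats".
    {
        "$lookup": {
            "from": "cats",
            "localField": "_id",
            "foreignField": "owner_id",
            "as": "cats",
        }
    },
    # 3. Reshape the output: keep the name and compute the number of cats.
    {"$project": {"first_name": 1, "cat_count": {"$size": "$cats"}}},
    # 4. Users with the most cats first.
    {"$sort": {"cat_count": -1}},
]


def users_by_cat_count(db):
    """Run the pipeline; `db` is assumed to be a pymongo Database."""
    return list(db["users"].aggregate(pipeline))
```

Each stage is a small dictionary that feeds its output into the next one, which is a large part of why I find pipelines easier to read than a long SQL statement.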
I'm a developer, not a data engineer or data scientist. I don't do data analytics. Yet.
MongoDB is very flexible, and that is both a blessing and a curse. It will store any data you provide. By default there is no validation: you either set up validation rules on the database, or use an ODM such as Mongoose (if you're using Node.js) to help with validation in the application layer.
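As a rough idea of what database-side validation rules look like, here is a sketch of a `$jsonSchema` validator passed to pymongo's `create_collection`. The collection name `users`, the fields `first_name` and `age`, and the helper `create_users_collection` are assumptions for the example, not something from a real project.

```python
# Validation rules for a hypothetical "users" collection.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["first_name", "age"],
        "properties": {
            "first_name": {"bsonType": "string"},
            # With this rule in place, storing age as a string is rejected.
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}


def create_users_collection(db):
    """Create the collection with the validator; `db` is assumed to be a pymongo Database."""
    return db.create_collection("users", validator=user_validator)
```

Without a validator like this, the same insert with `age` set to `"42"` goes through silently.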
Documents (think rows in MySQL) can have a versioning key. It tells the app layer which version of a document it is dealing with, which is useful when the schema changes and you need to migrate user data to reflect the change. With MySQL, one would make the change in the database layer (add or change columns) and then write a long-running job to migrate the data, which could introduce some downtime or cause bad UX. This can be done in MongoDB too. Alternatively, one can use the version key and write a middleware that migrates a document to the new structure when the user accesses the part of the application that requires it. This way there are no long-running jobs, and it works in serverless and stateless environments. The drawback is that documents in the collection are inconsistent with each other until all users have migrated themselves. Which approach fits depends on the specific use case.
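The migrate-on-access idea could look roughly like the sketch below. Everything in it is hypothetical: the `schema_version` field name, the version numbers, and the example change (splitting a single `name` field into `first_name` and `last_name`). It works on plain dictionaries so the logic is visible without a database.

```python
from copy import deepcopy

CURRENT_VERSION = 2  # hypothetical latest schema version


def _v1_to_v2(doc):
    """Example change: split "name" into first_name / last_name."""
    first, _, last = doc.pop("name", "").partition(" ")
    doc["first_name"], doc["last_name"] = first, last
    return doc


# Maps a version to the function that upgrades it by one step.
MIGRATIONS = {1: _v1_to_v2}


def migrate(doc):
    """Return a copy of `doc` upgraded to CURRENT_VERSION, one step at a time."""
    doc = deepcopy(doc)
    version = doc.get("schema_version", 1)  # assume unversioned documents are v1
    while version < CURRENT_VERSION:
        doc = MIGRATIONS[version](doc)
        version += 1
        doc["schema_version"] = version
    return doc
```

A middleware would call `migrate` on every document it loads and write the result back only when the version changed, for example with pymongo's `replace_one({"_id": doc["_id"]}, doc)`.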
In MySQL we strive to normalize data. In MongoDB it is not so straightforward: it encourages denormalization and duplication. Take a user -> address relation. In MySQL this would be two separate tables, and you'd call it a day. With MongoDB it depends more on how you query the data. If the application frequently needs a user's address, it makes sense to store it in the same document (row) as the user.
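Here is a small sketch of the two shapes, using plain Python dictionaries to stand in for tables and documents. The names and values are invented for the example.

```python
# Normalized, MySQL-style: two "tables" linked by user_id.
users = {1: {"id": 1, "first_name": "Ada"}}
addresses = {1: {"user_id": 1, "city": "London", "street": "12 Example St"}}


def user_with_address_joined(user_id):
    """Two lookups plus a manual merge, roughly what a JOIN does for you."""
    user = dict(users[user_id])
    address = addresses[user_id]
    user["address"] = {k: v for k, v in address.items() if k != "user_id"}
    return user


# Embedded, MongoDB-style: the address lives inside the user document,
# so a single read returns everything the page needs.
user_doc = {
    "_id": 1,
    "first_name": "Ada",
    "address": {"city": "London", "street": "12 Example St"},
}
```

Both approaches end up with the same data in the application; the embedded version just gets there in one read.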
This can be taken even further with the "computed pattern". Let's say we need to know how many cats a user has. Cats are stored in another collection (table), so we count the cats that belong to the user and store the result in the user document as `cat_count`. Another example: let's say the application needs to display the owner of a cat. Owners of pets don't change that often, so it makes sense to embed the required owner data, e.g. first name, last name and address, in the cat document.
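Both ideas, and the update mechanism that duplicated data needs, can be sketched in memory like this. The helpers `add_cat` and `update_owner_address` and all field names other than `cat_count` are hypothetical.

```python
# In-memory stand-ins for the "users" and "cats" collections.
users = {
    1: {
        "_id": 1,
        "first_name": "Ada",
        "last_name": "Lovelace",
        "address": "12 Example St",
        "cat_count": 0,  # computed value, kept up to date on every write
    }
}
cats = []


def add_cat(owner_id, name):
    """Store the cat with a copy of its owner's data and bump the owner's counter."""
    owner = users[owner_id]
    cats.append({
        "name": name,
        "owner_id": owner_id,
        # Duplicated on purpose so showing a cat needs no second lookup.
        "owner": {k: owner[k] for k in ("first_name", "last_name", "address")},
    })
    owner["cat_count"] += 1


def update_owner_address(owner_id, address):
    """Change the address and propagate it to every embedded copy."""
    users[owner_id]["address"] = address
    for cat in cats:
        if cat["owner_id"] == owner_id:
            cat["owner"]["address"] = address
```

Against a real database the counter would typically be bumped with `$inc` in an `update_one` call and the copies refreshed with `update_many`, but the shape of the logic stays the same.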
MongoDB is available as a managed service on most cloud providers, so hosting shouldn't be much of a problem. Atlas (the managed MongoDB service) offers a free tier on a shared cluster, which is enough to get started. If you want to run it yourself, it's a different story. I haven't done that myself, though, so I can't comment much on it.
Where MongoDB is great and where it isn't.
From my point of view MongoDB is a good general-purpose database with many use cases, such as gaming, analytics, big data, and real-time analytics.
In my use cases I haven't found many negative things to say, but these are the complaints I've seen from other people:
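- No transaction support. This is no longer true: MongoDB has supported multi-document transactions since version 4.0. Besides, fewer and fewer apps these days require this.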
- Bad search performance. This is also not true: for Atlas users there is a dedicated full-text search feature. `$text` is a legacy query operator, meant to be used in self-managed deployments.
- Data duplication. I often see it mentioned as a con, probably because the authors haven't used it or have mismanaged their data to the point of corruption. If you're not comfortable with data duplication, just don't use it. NoSQL databases can be used with SQL-like design patterns, adding NoSQL features only where you need them.
- MongoDB doesn't enforce a schema. It will store what you give it, and this can become a big problem. Validation rules can be set up on the database, but in practice this is probably done in the ODM/ORM layer. By default it won't throw an error if you try to store `age` as a string instead of an int.
There are some limitations one wouldn't think about before starting out:
- Max document size is 16 MB. In normal use cases that should be fine, although MySQL's `LONGTEXT` supports up to 4 GB of text data per row.
- Maximum document nesting depth is 100 levels.
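If deeply nested data is a real possibility, a pre-insert check along these lines is one way to catch it early. This is only a sketch: the helper `nesting_depth` is made up, and the way it counts levels (a flat document is depth 1, every nested object or array adds 1) is my assumption and may not match exactly how MongoDB counts them.

```python
MAX_DEPTH = 100  # the nesting limit mentioned above


def nesting_depth(value):
    """Depth of nested dicts/lists: a scalar is 0, a flat document is 1."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0


def is_too_deep(doc):
    """True if the document is nested deeper than MAX_DEPTH (assumed counting)."""
    return nesting_depth(doc) > MAX_DEPTH
```

The 16 MB limit applies to the BSON-encoded document, so it has to be checked against the encoded size rather than with a helper like this.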
These are the limitations I think are worth mentioning. There are more, which you can look up yourself, but in general I think that if you hit MongoDB's other limits, either you're doing something wrong or you have an interesting problem on your hands.
MongoDB is the most popular NoSQL database, and I think it should be considered for a new project. Like everything, it has pros and cons, but in my experience the pros outweigh the cons.
These are my thoughts on MongoDB.