MongoDB Assignment Help

What is MongoDB?

  1. A scalable, high-performance, open-source, document-oriented database
  2. Written in C++
  3. Stores JSON-like documents
  4. Includes a powerful query language
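
To make "JSON like documents" and the query language more concrete, here is a minimal Python sketch. The document contents, the field names, and the `matches` helper are all hypothetical. The helper only imitates top-level equality and the `$gt` operator from MongoDB's filter syntax; it is an illustration, not how the server evaluates queries.

```python
import json

# A hypothetical document: nested fields and arrays live in one record.
student = {
    "name": "Asha",
    "age": 21,
    "courses": ["databases", "statistics"],
    "address": {"city": "Pune", "country": "India"},
}

def matches(document, query):
    """Return True if `document` satisfies `query`.

    Supports only top-level equality and the $gt operator, as a toy
    imitation of a MongoDB-style filter.
    """
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict) and "$gt" in condition:
            if value is None or not value > condition["$gt"]:
                return False
        elif value != condition:
            return False
    return True

# The document serializes to JSON text and back without loss.
as_json = json.dumps(student)
restored = json.loads(as_json)

older_than_18 = matches(student, {"age": {"$gt": 18}})
named_ravi = matches(student, {"name": "Ravi"})
```

The filter is itself a JSON-like structure, which is why the query language is described as document based.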

MongoDB in a nutshell:

  1. Document-oriented storage
  2. JSON-style documents with dynamic schemas offer simplicity and power
  3. Full index support: index on any attribute, just as you are used to
  4. Rich, document-based query language
  5. Flexible aggregation and data processing
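
The points about dynamic schemas, indexing on any attribute, and aggregation can be sketched in plain Python. This is only an in-memory analogy under stated assumptions: the `orders` data and the `build_index` and `sum_by` helpers are hypothetical, and a real MongoDB index or aggregation pipeline works differently internally. The grouping step is roughly what a `$group` stage with `$sum` expresses.

```python
from collections import defaultdict

# Hypothetical collection: documents need not share the same fields.
orders = [
    {"_id": 1, "customer": "ann", "total": 30, "coupon": "SPRING"},
    {"_id": 2, "customer": "bob", "total": 45},
    {"_id": 3, "customer": "ann", "total": 25, "gift": True},
]

def build_index(documents, field):
    """Map each value of `field` to the documents that contain it."""
    index = defaultdict(list)
    for doc in documents:
        if field in doc:
            index[doc[field]].append(doc)
    return index

def sum_by(documents, key_field, value_field):
    """Group documents by `key_field` and sum `value_field` per group."""
    totals = defaultdict(int)
    for doc in documents:
        totals[doc[key_field]] += doc[value_field]
    return dict(totals)

# Look up by an indexed attribute instead of scanning every document.
customer_index = build_index(orders, "customer")
ann_order_ids = [doc["_id"] for doc in customer_index["ann"]]

# Aggregate totals per customer.
totals_per_customer = sum_by(orders, "customer", "total")
```

Any field could be passed to `build_index`, which mirrors the idea of indexing on any attribute.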

MongoDB connector for Apache Spark:

  1. Native Scala connector, certified by Databricks
  2. Exposes all Spark APIs and libraries
  3. Efficient data filtering with predicate pushdown, secondary indexes, and in-database aggregations
  4. Locality awareness to reduce data movement
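
Predicate pushdown is easier to see with a small example. The sketch below is plain Python and does not use the connector or Spark; `STORED_DOCUMENTS`, `read_all`, `read_filtered`, and `is_hot` are hypothetical names chosen for illustration. It only shows the idea that filtering at the source moves fewer documents than filtering after transfer.

```python
# Hypothetical data held by the database side.
STORED_DOCUMENTS = [
    {"city": "Pune", "temp_c": 31},
    {"city": "Oslo", "temp_c": 4},
    {"city": "Lima", "temp_c": 19},
    {"city": "Cairo", "temp_c": 35},
]

def read_all():
    """Without pushdown: every document is shipped to the caller."""
    return list(STORED_DOCUMENTS)

def read_filtered(predicate):
    """With pushdown: the source applies the predicate before shipping."""
    return [doc for doc in STORED_DOCUMENTS if predicate(doc)]

def is_hot(doc):
    """Predicate used by both strategies."""
    return doc["temp_c"] > 30

# Filter after transfer: all documents move, then most are discarded.
transferred_without = read_all()
hot_without = [doc for doc in transferred_without if is_hot(doc)]

# Filter at the source: only the matching documents move.
transferred_with = read_filtered(is_hot)
```

Both strategies return the same result; the difference is how much data crosses the boundary between the database and the processing engine.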

RDBMS and MongoDB differ:

Comparison between MySQL and MongoDB:

Written in:

  MySQL: C++, C
  MongoDB: C++, C, JavaScript

Main points:

MySQL (table, row, column)

  1. Full-text searching and indexing
  2. Integrated replication support
  3. Triggers
  4. Subselects
  5. Query caching
  6. SSL support
  7. Different storage engines with various performance characteristics

MongoDB (document oriented; collection, document, field)

  1. Auto-sharding
  2. Native replication
  3. In-memory speed
  4. Embedded data model support
  5. Comprehensive secondary indexes
  6. Rich query language support
  7. Support for various storage engines

Best used for:

MySQL

  1. Data structure fits tables and rows
  2. Strong dependence on multi-row transactions
  3. Relatively small datasets

MongoDB

  1. High write loads
  2. Unstable schema
  3. The database is expected to grow large
  4. Data is location based
  5. High availability in an unstable environment is required
  6. No dedicated database administrators

Advantages:

MySQL

  1. Atomic transaction support
  2. JOIN support
  3. Mature solution
  4. Privilege and password security system

MongoDB

  1. Document validation
  2. Integrated storage engines
  3. Shortened time between primary failure and recovery

Disadvantages:

MySQL

  1. Tough scaling
  2. Stability concerns
  3. Development is not community driven

MongoDB

  1. Not the best option for apps with complex transactions
  2. Not a drop-in replacement for legacy solutions
  3. Young solution
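
The contrast between "table, row, column" and "collection, document, field", together with the embedded data models point, can be illustrated with a short Python sketch. It assumes SQLite (from the standard library) as a stand-in for MySQL, and the `authors` and `posts` tables and the `author_document` structure are hypothetical examples, not taken from any real schema.

```python
import sqlite3

# Relational shape: two tables linked by a key (SQLite stands in for MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.execute("INSERT INTO authors VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(10, 1, "Intro"), (11, 1, "Indexes")],
)

# Reading an author together with their posts requires a JOIN across rows.
rows = conn.execute(
    "SELECT a.name, p.title FROM authors a "
    "JOIN posts p ON p.author_id = a.id ORDER BY p.id"
).fetchall()
conn.close()

# Document shape: the same data embedded in a single document.
author_document = {
    "_id": 1,
    "name": "Ann",
    "posts": [{"title": "Intro"}, {"title": "Indexes"}],
}

titles_from_rows = [title for _, title in rows]
titles_from_document = [post["title"] for post in author_document["posts"]]
```

Both shapes hold the same information. The relational form keeps posts in their own table and reassembles them with a JOIN, while the document form stores them inside the author record so a single read returns everything.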