Application Development Series
Back to Basics – Introduction

Daniel Roberts
@dmroberts
#MongoDBBasics
Introduction
• About the Webinar Series
• Data Model
• Query Model
• Scalability
• Availability

• Deployment Architectures
• Performance
• Next Session
2
Series Outline & Approach
• Split into 2 sections
– Application Development (4 parts)
•
•
•
•

Schema Design
Interacting with the database query and update operators
Indexing
Reporting

– Operations (3 parts)
• Deployment – scale out and high availability
• Monitoring and performance tuning
• Backup and recovery

3
Application Overview
• Content Management System
– Will utilise :
•
•
•
•
•
•

Query & update operators
Aggregation Framework
Geospatial queries
Pre Aggregated reports for fast analytics
Polymorphic documents
And more…

• Take away framework
• An approach that you can reuse in your own
applications
4
Q&A
• Virtual Genius Bar
– Use the chat to post
questions
– EMEA Solution
Architecture team are
on hand
– Make use of them
during the sessions!!!
5
MongoDB
Operational Database

7
Document Data Model
Document - Collections
Relational - Tables

8

{ first_name: „Paul‟,
surname: „Miller‟,
city: „London‟,
location: {
type: “Point”,
coordinates :
[-0.128, 51.507]
},
cars: [
{ model: „Bentley‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
}
}
Document Model
• Agility and flexibility – dynamic schema
– Data models can evolve easily
– Companies can adapt to changes quickly

• Intuitive, natural data representation
– Remove impedance mismatch
– Many types of applications are a good fit

• Reduces the need for joins, disk seeks
– Programming is more simple
– Performance can be delivered at scale
9
Simplify development

10
Simplify development

11
Rich database interaction

12
Query Model
Shell and Drivers
Drivers
Drivers for most popular
programming languages and
frameworks

Java

Ruby

JavaScript

Python

Shell
Command-line shell for
interacting directly with
database

14

Perl

Haskell

> db.collection.insert({company:“10gen”,
product:“MongoDB”})
>
> db.collection.findOne()
{
“_id”
: ObjectId(“5106c1c2fc629bfe52792e86”),
“company”
: “10gen”
“product”
: “MongoDB”
}
MongoDB is full featured
Queries

• Find Paul’s cars
• Find everybody in London with a car
built between 1970 and 1980

Geospatial

• Find all of the car owners within 5km of
Trafalgar Sq.

Text Search

• Find all the cars described as having
leather seats

Aggregation

• Calculate the average value of Paul’s
car collection

Map Reduce

• What is the ownership pattern of colors
by geography over time? (is purple
trending up in China?)

15

{ first_name: „Paul‟,
surname: „Miller‟,
city: „London‟,
location: {
type: “Point”,
coordinates :
[-0.128, 51.507]
},
cars: [
{ model: „Bentley‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
}
}
Query Example
Rich Queries

• Find Paul’s cars
• Find everybody in London with a car
built between 1970 and 1980

db.cars.find({
first_name: „Paul‟
})
db.cars.find({
city: „London‟,
”cars.year" : {
$gte : 1970,
$lte : 1980
}
})

16

{ first_name: „Paul‟,
surname: „Miller‟,
city: „London‟,
location: {
type: “Point”,
coordinates :
[-0.128, 51.507]
},
cars: [
{ model: „Bentley‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
}
}
Geo Spatial Example
Geospatial

• Find all of the car owners within 5km of
Trafalgar Sq.

db.cars.find( {
location:
{ $near :
{ $geometry :
{
type: 'Point' ,
coordinates :
[-0.128, 51.507]
}
},
$maxDistance :5000
}
})

17

{ first_name: „Paul‟,
surname: „Miller‟,
city: „London‟,
location: {
type: “Point”,
coordinates :
[-0.128, 51.507]
},
cars: [
{ model: „Bentley‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
}
}
Aggregation Framework Example
Aggregation

• Calculate the average value of Paul’s
car collection

db.cars.aggregate( [
{$match : {"first_name" : "Paul"}},
{$project : {"first_name":1,"cars":1}},
{$unwind : "$cars"},
{ $group : {_id:"$first_name",
average : {
$avg : "$cars.value"}}}
])
{ "_id" : "Paul", "average" : 215000 }

18

{ first_name: „Paul‟,
surname: „Miller‟,
city: „London‟,
location: {
type: “Point”,
coordinates :
[-0.128, 51.507]
},
cars: [
{ model: „Bentley‟,
year: 1973,
value: 100000, … },
{ model: „Rolls Royce‟,
year: 1965,
value: 330000, … }
}
}
Scalability
Automatic Sharding

• Three types of sharding: hash-based, range-based, tagaware
• Increase or decrease capacity as you go
• Automatic balancing

20
Query Routing

• Multiple query optimization models
• Each sharding option appropriate for different apps
21
Availability
Availability Considerations
• High Availability – Ensure application availability during
many types of failures
• Disaster Recovery – Address the RTO and RPO goals
for business continuity
• Maintenance – Perform upgrades and other maintenance
operations with no application downtime

23
Replica Sets
• Replica Set – two or more copies
• “Self-healing” shard
• Addresses many concerns:
- High Availability
- Disaster Recovery
- Maintenance

24
Replica Set Benefits

Business Needs

High Availability

Automated failover

Disaster Recovery

Hot backups offsite

Maintenance

Rolling upgrades

Low Latency

Locate data near users

Workload Isolation

Read from non-primary replicas

Data Privacy

Restrict data to physical location

Data Consistency

25

Replica Set Benefits

Tunable Consistency
Performance
Performance

Better Data
Locality
27

In-Memory
Caching

In-Place
Updates
Summary
• Document Model
– Simplify development
– Simplify scale out
– Improve performance

• MongoDB
– Rich general purpose database
– Built in High Availability and Failover
– Built in scale out

28
Next Week – 6th February
• Matt Bates
– Schema design for the CMS application
• Collections
• Design decisions

– Application architecture
• Example technologies
• RESTful interface
• We‟ve chosen python for the examples

– Code Examples
29
Webinar: Getting Started with MongoDB - Back to Basics

Webinar: Getting Started with MongoDB - Back to Basics

  • 1.
    Application Development Series Backto Basics – Introduction Daniel Roberts @dmroberts #MongoDBBasics
  • 2.
    Introduction • About theWebinar Series • Data Model • Query Model • Scalability • Availability • Deployment Architectures • Performance • Next Session 2
  • 3.
    Series Outline &Approach • Split into 2 sections – Application Development (4 parts) • • • • Schema Design Interacting with the database query and update operators Indexing Reporting – Operations (3 parts) • Deployment – scale out and high availability • Monitoring and performance tuning • Backup and recovery 3
  • 4.
    Application Overview • ContentManagement System – Will utilise : • • • • • • Query & update operators Aggregation Framework Geospatial queries Pre Aggregated reports for fast analytics Polymorphic documents And more… • Take away framework • An approach that you can reuse in your own applications 4
  • 5.
    Q&A • Virtual GeniusBar – Use the chat to post questions – EMEA Solution Architecture team are on hand – Make use of them during the sessions!!! 5
  • 6.
  • 7.
  • 8.
    Document Data Model Document- Collections Relational - Tables 8 { first_name: „Paul‟, surname: „Miller‟, city: „London‟, location: { type: “Point”, coordinates : [-0.128, 51.507] }, cars: [ { model: „Bentley‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } } }
  • 9.
    Document Model • Agilityand flexibility – dynamic schema – Data models can evolve easily – Companies can adapt to changes quickly • Intuitive, natural data representation – Remove impedance mismatch – Many types of applications are a good fit • Reduces the need for joins, disk seeks – Programming is more simple – Performance can be delivered at scale 9
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
    Shell and Drivers Drivers Driversfor most popular programming languages and frameworks Java Ruby JavaScript Python Shell Command-line shell for interacting directly with database 14 Perl Haskell > db.collection.insert({company:“10gen”, product:“MongoDB”}) > > db.collection.findOne() { “_id” : ObjectId(“5106c1c2fc629bfe52792e86”), “company” : “10gen” “product” : “MongoDB” }
  • 15.
    MongoDB is fullfeatured Queries • Find Paul’s cars • Find everybody in London with a car built between 1970 and 1980 Geospatial • Find all of the car owners within 5km of Trafalgar Sq. Text Search • Find all the cars described as having leather seats Aggregation • Calculate the average value of Paul’s car collection Map Reduce • What is the ownership pattern of colors by geography over time? (is purple trending up in China?) 15 { first_name: „Paul‟, surname: „Miller‟, city: „London‟, location: { type: “Point”, coordinates : [-0.128, 51.507] }, cars: [ { model: „Bentley‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } } }
  • 16.
    Query Example Rich Queries •Find Paul’s cars • Find everybody in London with a car built between 1970 and 1980 db.cars.find({ first_name: „Paul‟ }) db.cars.find({ city: „London‟, ”cars.year" : { $gte : 1970, $lte : 1980 } }) 16 { first_name: „Paul‟, surname: „Miller‟, city: „London‟, location: { type: “Point”, coordinates : [-0.128, 51.507] }, cars: [ { model: „Bentley‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } } }
  • 17.
    Geo Spatial Example Geospatial •Find all of the car owners within 5km of Trafalgar Sq. db.cars.find( { location: { $near : { $geometry : { type: 'Point' , coordinates : [-0.128, 51.507] } }, $maxDistance :5000 } }) 17 { first_name: „Paul‟, surname: „Miller‟, city: „London‟, location: { type: “Point”, coordinates : [-0.128, 51.507] }, cars: [ { model: „Bentley‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } } }
  • 18.
    Aggregation Framework Example Aggregation •Calculate the average value of Paul’s car collection db.cars.aggregate( [ {$match : {"first_name" : "Paul"}}, {$project : {"first_name":1,"cars":1}}, {$unwind : "$cars"}, { $group : {_id:"$first_name", average : { $avg : "$cars.value"}}} ]) { "_id" : "Paul", "average" : 215000 } 18 { first_name: „Paul‟, surname: „Miller‟, city: „London‟, location: { type: “Point”, coordinates : [-0.128, 51.507] }, cars: [ { model: „Bentley‟, year: 1973, value: 100000, … }, { model: „Rolls Royce‟, year: 1965, value: 330000, … } } }
  • 19.
  • 20.
    Automatic Sharding • Threetypes of sharding: hash-based, range-based, tagaware • Increase or decrease capacity as you go • Automatic balancing 20
  • 21.
    Query Routing • Multiplequery optimization models • Each sharding option appropriate for different apps 21
  • 22.
  • 23.
    Availability Considerations • HighAvailability – Ensure application availability during many types of failures • Disaster Recovery – Address the RTO and RPO goals for business continuity • Maintenance – Perform upgrades and other maintenance operations with no application downtime 23
  • 24.
    Replica Sets • ReplicaSet – two or more copies • “Self-healing” shard • Addresses many concerns: - High Availability - Disaster Recovery - Maintenance 24
  • 25.
    Replica Set Benefits BusinessNeeds High Availability Automated failover Disaster Recovery Hot backups offsite Maintenance Rolling upgrades Low Latency Locate data near users Workload Isolation Read from non-primary replicas Data Privacy Restrict data to physical location Data Consistency 25 Replica Set Benefits Tunable Consistency
  • 26.
  • 27.
  • 28.
    Summary • Document Model –Simplify development – Simplify scale out – Improve performance • MongoDB – Rich general purpose database – Built in High Availability and Failover – Built in scale out 28
  • 29.
    Next Week –6th February • Matt Bates – Schema design for the CMS application • Collections • Design decisions – Application architecture • Example technologies • RESTful interface • We‟ve chosen python for the examples – Code Examples 29