IT 24
DATABASE MANAGEMENT SYSTEMS
ALDERSGATE COLLEGE
LECTURE 1
FUNDAMENTAL DATABASE CONCEPTS
WHAT IS A DATABASE?
• A database is any collection of data.
• A DBMS is a software system
designed to maintain a database.
• We use a DBMS when
• there is a large amount of data
• security and integrity of the data are important
• many users access the data concurrently
EXAMPLE DATABASE APPLICATION
• Consider a Phone Company, such as AT&T
• Kinds of information they deal with:
customer records
billing information
employee records
management records
customer service orders
switching and wiring diagrams
CONCERNS OF A DATABASE USER
• With all that data,
AT&T must be concerned with questions such as:
• Where is the information kept?
• How is the data structured?
• How is the data kept consistent?
• How is the data described?
• How is the data kept secure?
• How do different pieces of data interrelate?
WHY USE A DBMS?
• Without a DBMS, we'd have:
data stored as bits on disks
organized as files
Access by a collection
of ad hoc programs
in C++, Java, PHP, etc.
users of
the data
There is no control or
coordination of what
these programs do
with the data
WHY USE A DBMS?
• With a DBMS, we have:
data stored as bits on disks
organized as files
users of
the data
DBMS provides control
and coordination to
protect the data.
DBMS
applications
DBMS STRUCTURE
data
users of
the data
data
dictionary
data
definition
processor
query processor
security manager
concurrency manager
index manager
application
program(s)
application
program(s)
application
program(s)
application
program(s)
internal/implementation view
external/application view
DBMS
software
components
data
description
DBMS LANGUAGES
users of
the data
application
program(s)
application
program(s)
application
program(s)
application
program(s)
data
data
dictionary
data
definition
processor
query processor
security manager
concurrency manager
index manager
DDL:
data
definition
language
system
configuration
languages
QL: query language
DML: data manipulation language
GPL: general purpose languages
ADVANTAGES OF USING A DBMS
• Anything you can do with a DBMS,
you can do with a file system, a network
and a heap of C code
• So why spend the money to buy a DBMS?
• there is a well defined collection of capabilities common to a certain
class of applications
• for applications in this class, the DBMS already has these capabilities
and probably does them better than you could with home-brewed
code
DATABASE
FUNCTIONALITY
ADVANTAGES OF USING A DBMS
users of
the data
application
program(s)
application
program(s)
application
program(s)
application
program(s)
data
data
dictionary
data
definition
processor
query processor
security manager
concurrency manager
index manager
software operating
between the data and
the applications can
provide many
capabilities
in a generic way
PERSISTENCE
• A DBMS provides
persistent objects, types and data structures
• persistent = having a lifetime longer than
the programs that use the data
• any information that fits the data model
of a particular DBMS
can be made persistent with little effort
• data model = concepts that can be used to describe the data
CONCURRENCY
• A DBMS supports access by concurrent users
• concurrent = happening at the same time
• concurrent access, particularly writes (data changes),
can result in inconsistent states
(even when the individual operations are correct)
• the DBMS can check the actual operations of concurrent users, to
prevent activity that will lead to inconsistent states
ACCESS CONTROL
• A DBMS can restrict access to authorized users
• security policies often require control that is more fine-grained than
that provided by a file system
• since the DBMS understands the data structure, it can enforce fairly
sophisticated and detailed security policies
• on subsets of the data
• on subsets of the available operations
REDUNDANCY CONTROL
• A DBMS can assist in controlling redundancy
• redundancy = multiple copies of the same data
• with file storage, it's often convenient to store multiple copies of the
same data, so that it's "local" to other data and applications
• this can cause many problems:
• wasted disk space
• inconsistencies
• need to enter the data multiple times
COMPLEX SEMANTICS
• A DBMS supports representation
of complex relationships and integrity constraints
• the semantics (meaning) of an application often
includes many relationships and rules
about the relative values of subsets of the data
• these further restrict the possible instances of the database
• relationships and constraints can be defined as part of the schema
BACKUP AND RECOVERY
• A DBMS can provide backup and recovery
• backup = snapshots of the data particular times
• recovery = restoring the data to a consistent state
after a system crash
• the higher level semantics (relationships and constraints)
can make it difficult to restore a consistent state
• transaction analysis can allow a DBMS to reconstruct a consistent
state from a number of backups
VIEWS AND INTERFACES
• A DBMS can support
multiple user interfaces and user views
• since the DBMS provides a well-defined data model and a persistent data
dictionary, many different interfaces can be developed to access the same
data
• data independence ensures that these UIs will not be made invalid by most
changes to the data
• new user views can be supported as new schemas defined against the
conceptual schema
ADVANTAGES OF USING A DBMS
• persistent objects, types and data structures
• control of concurrent users
• controlling of redundancy
• restricting access (security)
• representation of complex relationships
and integrity constraints
• backup and recovery
• multiple user interfaces and user views
DATABASE USERS
AND ROLES
DBMS ROLES
users of
the data
application
program(s)
application
program(s)
application
program(s)
application
program(s)
data
data
dictionary
data
definition
processor
query processor
security manager
concurrency manager
index manager
database
designer
application developers
system
administrator
(and DB
designer)
DBMS
system
developers
DBMS ROLES
• Actors On the Scene
(people interested in the actual data):
• database administrators
• database designers
• systems analysts and application programmers
• end users
ACTORS ON THE SCENE
• Database Administrators
• acquiring a DBMS
• managing the system
• acquiring HW and SW to support the DBMS
• authorizing access (security policies)
• managing staff, including DB designers
ACTORS ON THE SCENE
• Database Designers
• identifying the information of interested
in the Universe of Discourse (UoD)
• designing the database conceptual schema
• designing views for particular users
• designing the physical data layout and logical schema
• adjusting data parameters for performance
ACTORS ON THE SCENE
• Systems Analysts and Application Programmers
(generic database developers)
• provide specialized knowledge to optimize database usage
• provide generic (canned) application programs
ACTORS ON THE SCENE
• End Users
• casual users: ad-hoc queries
• naïve or parametric users: canned queries such as menus for a phone
company customer service agent
• sophisticated users: people who understand the system and the data
and use it in many novel ways
• standalone users: people who use personal easy-to-use databases for
personal data
DBMS ROLES
• Actors Behind the Scene:
people who maintain the environment
but aren't interested in the actual data
• DBMS designers and implementers
• tools developers
• operators and maintenance personnel
• database researchers
ACTORS BEHIND THE SCENE
• DBMS designers and implementers
• work for the company that supplies the DBMS
(i.e. Microsoft , Oracle, Sybase, MySQL …)
• programmers and engineers
• design and implement the DBMS
ACTORS BEHIND THE SCENE
• Tools Developers
• design and implement DBMS add-ons or plug-ins
• may work for DBMS supplier or be independent
• kinds of tools: database design aids, performance monitoring tools,
user and designer interfaces
ACTORS BEHIND THE SCENE
• Operators and maintenance personnel
• run and maintain the computer environment in which a DBMS
operates
• probably work for the database administrator (DBA)
ACTORS BEHIND THE SCENE
• Database Researchers
• academic or industrial researchers
• develop new theory, new designs, new data models and new
algorithms to improve future database management systems
DATA
ORGANIZATION
SCHEMAS AND INSTANCES
• A database instance is the collective values of
all database objects at some point in time
• also called the (data) instance or (database) state
• A schema describes the database and
defines the possible instances
• also called the data definition, data dictionary, or meta-data
CONCEPTUAL DATA MODELS
• A data model describes the possible schemas
(essentially the meta-schema)
• A DBMS is designed around a particular data model
• this is what allows all system components (and humans)
to understand the schema and data
• possible data models
• relational,
object-oriented, object-relational,
entity-relationship,
semantic, network, hierarchical, etc.
PHYSICAL DATA MODELS
• A physical data model describes the way
in which data is stored in the computer
• typically only of interest to database designers, implementers and
maintainers …not end users
• must provide a well-defined structure that can be mapped to the
conceptual schema
• allows optimization strategies to be defined generically
THREE-SCHEMA ARCHITECTURE
ExternalView ExternalView ExternalView
Internal Schema
Conceptual Schema generic view
physical view
user-specific
views
DATA INDEPENDENCE
• physical data independence
• conceptual and external schema are defined
in terms of the data model,
rather than the actual data layout
• ensures that conceptual and external schemas
are not affected by changes to the physical data layout
• logical data independence
• ensures that changes to the conceptual schema
don't affect the external views
• (this is not always achievable)
TRANSACTIONS
TRANSACTIONS
• transaction = an indivisible unit of data processing
• All transactions must have the ACID properties:
• Atomicity: all or nothing
• Consistency: no constraint violations
• Isolation: no interference from other concurrent transactions
• Durability: committed changes must not be lost
due to any kind of failure
ATOMIC TRANSACTIONS
•Fred wants to move $200 from
his savings account to his checking account.
1) Money must be subtracted from savings account.
2) Money must be added to checking count.
If both happen, Fred and the bank are both happy.
If neither happens, Fred and the bank are both happy.
If only one happens, either Fred or the bank will be unhappy.
Fred’s transfer must be all or nothing.
TRANSACTIONS ARE ATOMIC
• Transactions must be atomic (indivisible)
• the DBMS must ensure atomicity
• everything happens, or nothing happens
• boundaries of transaction (in time)
are generally set by the application …
the DBMS has no means of determining
the intention of a transaction
CORRECT TRANSACTIONS
• Wilma tries to withdraw $1000 from account 387.
constraint:
account.Balance must be non-negative
any transaction withdrawing
more than $652.55 from acct 387
will violate this constraint
No PIN Balance
Account
s
101
896
5
10965.78
387
664
3
652.55
543 4287 8720.12
Wilma’s transaction cannot be accepted.
TRANSACTIONS ARE CONSISTENT
• A transaction must leave the database
in an valid or consistent state
• valid state == no constraint violations
• A constraint is a declared rule defining specifying
database states
• Constraints may be violated temporarily …
but must be corrected
before the transaction completes
CONCURRENT TRANSACTIONS
• Fred is withdrawing $500 from account 543.
• Wilma’s employer is depositing $1983.23 to account 543.
• These transactions are happening at the same time.
No PIN Balance
Account
s
101
896
5
10965.78
387
664
3
652.55
543 4287 8720.12
No PIN Balance
Account
s
101
896
5
10965.78
387
664
3
652.55
543 4287 10233.35
Combined result of both
transactions must be correct
TRANSACTIONS ARE ISOLATED
• If two transactions occur at the same time,
the cumulative effect must be the same as
if they had been done in isolation
• ($8720.12 - $500) + $1983.23 = $10233.35
• ($8720.12 + $1983.23) - $500 = $10233.35
•
• Ensuring isolation is the task of concurrency control
happen
concurrently
DURABLE TRANSACTIONS
• Wilma deposits $50,000 to account 387.
• Later, the bank’s computer crashes due to a lightning storm.
No PIN Balance
Account
s
101
896
5
10965.78
387
664
3
652.55
543 4287 8720.12
No PIN Balance
Account
s
101
896
5
10965.78
387
664
3
50652.55
543 4287 8720.12
Wilma’s deposit cannot be lost.
TRANSACTIONS ARE DURABLE
• Once a transaction's effect
on the database state has been committed,
it must be permanent
• The DBMS must ensure persistence,
even in the event of system failures
• Sources of failure:
• computer or operating system crash
• disk failure
• fire, theft, power outage, earthquake, operator errors, …
TRANSACTIONS • transaction = an indivisible unit of
data processing
• All transactions must have the ACID
properties:
• Atomicity: all or nothing
• Consistency: no constraint violations
• Isolation: no interference from other
concurrent transactions
• Durability: committed changes must not
be lost
due to any kind of failure

Lecture-01-Fundamental-Database-Concepts.pptx.pdf

  • 1.
    IT 24 DATABASE MANAGEMENTSYSTEMS ALDERSGATE COLLEGE LECTURE 1 FUNDAMENTAL DATABASE CONCEPTS
  • 2.
    WHAT IS ADATABASE? • A database is any collection of data. • A DBMS is a software system designed to maintain a database. • We use a DBMS when • there is a large amount of data • security and integrity of the data are important • many users access the data concurrently
  • 3.
    EXAMPLE DATABASE APPLICATION •Consider a Phone Company, such as AT&T • Kinds of information they deal with: customer records billing information employee records management records customer service orders switching and wiring diagrams
  • 4.
    CONCERNS OF ADATABASE USER • With all that data, AT&T must be concerned with questions such as: • Where is the information kept? • How is the data structured? • How is the data kept consistent? • How is the data described? • How is the data kept secure? • How do different pieces of data interrelate?
  • 5.
    WHY USE ADBMS? • Without a DBMS, we'd have: data stored as bits on disks organized as files Access by a collection of ad hoc programs in C++, Java, PHP, etc. users of the data There is no control or coordination of what these programs do with the data
  • 6.
    WHY USE ADBMS? • With a DBMS, we have: data stored as bits on disks organized as files users of the data DBMS provides control and coordination to protect the data. DBMS applications
  • 7.
    DBMS STRUCTURE data users of thedata data dictionary data definition processor query processor security manager concurrency manager index manager application program(s) application program(s) application program(s) application program(s) internal/implementation view external/application view DBMS software components data description
  • 8.
    DBMS LANGUAGES users of thedata application program(s) application program(s) application program(s) application program(s) data data dictionary data definition processor query processor security manager concurrency manager index manager DDL: data definition language system configuration languages QL: query language DML: data manipulation language GPL: general purpose languages
  • 9.
    ADVANTAGES OF USINGA DBMS • Anything you can do with a DBMS, you can do with a file system, a network and a heap of C code • So why spend the money to buy a DBMS? • there is a well defined collection of capabilities common to a certain class of applications • for applications in this class, the DBMS already has these capabilities and probably does them better than you could with home-brewed code
  • 10.
  • 11.
    ADVANTAGES OF USINGA DBMS users of the data application program(s) application program(s) application program(s) application program(s) data data dictionary data definition processor query processor security manager concurrency manager index manager software operating between the data and the applications can provide many capabilities in a generic way
  • 12.
    PERSISTENCE • A DBMSprovides persistent objects, types and data structures • persistent = having a lifetime longer than the programs that use the data • any information that fits the data model of a particular DBMS can be made persistent with little effort • data model = concepts that can be used to describe the data
  • 13.
    CONCURRENCY • A DBMSsupports access by concurrent users • concurrent = happening at the same time • concurrent access, particularly writes (data changes), can result in inconsistent states (even when the individual operations are correct) • the DBMS can check the actual operations of concurrent users, to prevent activity that will lead to inconsistent states
  • 14.
    ACCESS CONTROL • ADBMS can restrict access to authorized users • security policies often require control that is more fine-grained than that provided by a file system • since the DBMS understands the data structure, it can enforce fairly sophisticated and detailed security policies • on subsets of the data • on subsets of the available operations
  • 15.
    REDUNDANCY CONTROL • ADBMS can assist in controlling redundancy • redundancy = multiple copies of the same data • with file storage, it's often convenient to store multiple copies of the same data, so that it's "local" to other data and applications • this can cause many problems: • wasted disk space • inconsistencies • need to enter the data multiple times
  • 16.
    COMPLEX SEMANTICS • ADBMS supports representation of complex relationships and integrity constraints • the semantics (meaning) of an application often includes many relationships and rules about the relative values of subsets of the data • these further restrict the possible instances of the database • relationships and constraints can be defined as part of the schema
  • 17.
    BACKUP AND RECOVERY •A DBMS can provide backup and recovery • backup = snapshots of the data particular times • recovery = restoring the data to a consistent state after a system crash • the higher level semantics (relationships and constraints) can make it difficult to restore a consistent state • transaction analysis can allow a DBMS to reconstruct a consistent state from a number of backups
  • 18.
    VIEWS AND INTERFACES •A DBMS can support multiple user interfaces and user views • since the DBMS provides a well-defined data model and a persistent data dictionary, many different interfaces can be developed to access the same data • data independence ensures that these UIs will not be made invalid by most changes to the data • new user views can be supported as new schemas defined against the conceptual schema
  • 19.
    ADVANTAGES OF USINGA DBMS • persistent objects, types and data structures • control of concurrent users • controlling of redundancy • restricting access (security) • representation of complex relationships and integrity constraints • backup and recovery • multiple user interfaces and user views
  • 20.
  • 21.
    DBMS ROLES users of thedata application program(s) application program(s) application program(s) application program(s) data data dictionary data definition processor query processor security manager concurrency manager index manager database designer application developers system administrator (and DB designer) DBMS system developers
  • 22.
    DBMS ROLES • ActorsOn the Scene (people interested in the actual data): • database administrators • database designers • systems analysts and application programmers • end users
  • 23.
    ACTORS ON THESCENE • Database Administrators • acquiring a DBMS • managing the system • acquiring HW and SW to support the DBMS • authorizing access (security policies) • managing staff, including DB designers
  • 24.
    ACTORS ON THESCENE • Database Designers • identifying the information of interested in the Universe of Discourse (UoD) • designing the database conceptual schema • designing views for particular users • designing the physical data layout and logical schema • adjusting data parameters for performance
  • 25.
    ACTORS ON THESCENE • Systems Analysts and Application Programmers (generic database developers) • provide specialized knowledge to optimize database usage • provide generic (canned) application programs
  • 26.
    ACTORS ON THESCENE • End Users • casual users: ad-hoc queries • naïve or parametric users: canned queries such as menus for a phone company customer service agent • sophisticated users: people who understand the system and the data and use it in many novel ways • standalone users: people who use personal easy-to-use databases for personal data
  • 27.
    DBMS ROLES • ActorsBehind the Scene: people who maintain the environment but aren't interested in the actual data • DBMS designers and implementers • tools developers • operators and maintenance personnel • database researchers
  • 28.
    ACTORS BEHIND THESCENE • DBMS designers and implementers • work for the company that supplies the DBMS (i.e. Microsoft , Oracle, Sybase, MySQL …) • programmers and engineers • design and implement the DBMS
  • 29.
    ACTORS BEHIND THESCENE • Tools Developers • design and implement DBMS add-ons or plug-ins • may work for DBMS supplier or be independent • kinds of tools: database design aids, performance monitoring tools, user and designer interfaces
  • 30.
    ACTORS BEHIND THESCENE • Operators and maintenance personnel • run and maintain the computer environment in which a DBMS operates • probably work for the database administrator (DBA)
  • 31.
    ACTORS BEHIND THESCENE • Database Researchers • academic or industrial researchers • develop new theory, new designs, new data models and new algorithms to improve future database management systems
  • 32.
  • 33.
    SCHEMAS AND INSTANCES •A database instance is the collective values of all database objects at some point in time • also called the (data) instance or (database) state • A schema describes the database and defines the possible instances • also called the data definition, data dictionary, or meta-data
  • 34.
    CONCEPTUAL DATA MODELS •A data model describes the possible schemas (essentially the meta-schema) • A DBMS is designed around a particular data model • this is what allows all system components (and humans) to understand the schema and data • possible data models • relational, object-oriented, object-relational, entity-relationship, semantic, network, hierarchical, etc.
  • 35.
    PHYSICAL DATA MODELS •A physical data model describes the way in which data is stored in the computer • typically only of interest to database designers, implementers and maintainers …not end users • must provide a well-defined structure that can be mapped to the conceptual schema • allows optimization strategies to be defined generically
  • 36.
    THREE-SCHEMA ARCHITECTURE ExternalView ExternalViewExternalView Internal Schema Conceptual Schema generic view physical view user-specific views
  • 37.
    DATA INDEPENDENCE • physicaldata independence • conceptual and external schema are defined in terms of the data model, rather than the actual data layout • ensures that conceptual and external schemas are not affected by changes to the physical data layout • logical data independence • ensures that changes to the conceptual schema don't affect the external views • (this is not always achievable)
  • 38.
  • 39.
    TRANSACTIONS • transaction =an indivisible unit of data processing • All transactions must have the ACID properties: • Atomicity: all or nothing • Consistency: no constraint violations • Isolation: no interference from other concurrent transactions • Durability: committed changes must not be lost due to any kind of failure
  • 40.
    ATOMIC TRANSACTIONS •Fred wantsto move $200 from his savings account to his checking account. 1) Money must be subtracted from savings account. 2) Money must be added to checking count. If both happen, Fred and the bank are both happy. If neither happens, Fred and the bank are both happy. If only one happens, either Fred or the bank will be unhappy. Fred’s transfer must be all or nothing.
  • 41.
    TRANSACTIONS ARE ATOMIC •Transactions must be atomic (indivisible) • the DBMS must ensure atomicity • everything happens, or nothing happens • boundaries of transaction (in time) are generally set by the application … the DBMS has no means of determining the intention of a transaction
  • 42.
    CORRECT TRANSACTIONS • Wilmatries to withdraw $1000 from account 387. constraint: account.Balance must be non-negative any transaction withdrawing more than $652.55 from acct 387 will violate this constraint No PIN Balance Account s 101 896 5 10965.78 387 664 3 652.55 543 4287 8720.12 Wilma’s transaction cannot be accepted.
  • 43.
    TRANSACTIONS ARE CONSISTENT •A transaction must leave the database in an valid or consistent state • valid state == no constraint violations • A constraint is a declared rule defining specifying database states • Constraints may be violated temporarily … but must be corrected before the transaction completes
  • 44.
    CONCURRENT TRANSACTIONS • Fredis withdrawing $500 from account 543. • Wilma’s employer is depositing $1983.23 to account 543. • These transactions are happening at the same time. No PIN Balance Account s 101 896 5 10965.78 387 664 3 652.55 543 4287 8720.12 No PIN Balance Account s 101 896 5 10965.78 387 664 3 652.55 543 4287 10233.35 Combined result of both transactions must be correct
  • 45.
    TRANSACTIONS ARE ISOLATED •If two transactions occur at the same time, the cumulative effect must be the same as if they had been done in isolation • ($8720.12 - $500) + $1983.23 = $10233.35 • ($8720.12 + $1983.23) - $500 = $10233.35 • • Ensuring isolation is the task of concurrency control happen concurrently
  • 46.
    DURABLE TRANSACTIONS • Wilmadeposits $50,000 to account 387. • Later, the bank’s computer crashes due to a lightning storm. No PIN Balance Account s 101 896 5 10965.78 387 664 3 652.55 543 4287 8720.12 No PIN Balance Account s 101 896 5 10965.78 387 664 3 50652.55 543 4287 8720.12 Wilma’s deposit cannot be lost.
  • 47.
    TRANSACTIONS ARE DURABLE •Once a transaction's effect on the database state has been committed, it must be permanent • The DBMS must ensure persistence, even in the event of system failures • Sources of failure: • computer or operating system crash • disk failure • fire, theft, power outage, earthquake, operator errors, …
  • 48.
    TRANSACTIONS • transaction= an indivisible unit of data processing • All transactions must have the ACID properties: • Atomicity: all or nothing • Consistency: no constraint violations • Isolation: no interference from other concurrent transactions • Durability: committed changes must not be lost due to any kind of failure