Parallel and distributed computing lecture 4

INTRODUCTION TO DISTRIBUTED
SYSTEMS
Engr. Muhammad Ahsan Raees

INTRODUCTION TO DISTRIBUTED
SYSTEMS
 Definition
 Motivation for Distributed system
 Architectural Categories
 Characteristics, Issues, Goals
 Advantages
 Disadvantages

DEFINITION
 A distributed system is a collection of independent
computers, interconnected via a network, that can
collaborate on a task.
 It can also be characterized as a collection of
autonomous computers that communicate over a
network and have the following features:
 No common physical clock
 Enhanced reliability
 Increased performance/cost ratio
 Access to geographically remote data and resources
 Scalability

DEFINITION (CONTD.)
 A distributed system is a collection of independent
entities that cooperate to solve a problem that
cannot be solved individually.
 At its simplest, it is a collection of computers.
 The computers in a distributed computing system (DCS)
share neither a common memory nor a common physical
clock. They can communicate only by message passing,
which requires a communication network.
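
A minimal sketch of message passing, assuming two threads in one process stand in for two separate computers and a local socket pair stands in for the communication network. The message fields (`type`, `values`, `result`) and the function names are hypothetical, chosen only for illustration.

```python
import json
import socket
import threading


def worker(conn):
    """Receive one request message and reply with a result message.

    The worker learns about the request only from the bytes it receives;
    nothing is read from the sender's variables.
    """
    request = json.loads(conn.recv(1024).decode("utf-8"))
    reply = {"type": "sum_reply", "result": sum(request["values"])}
    conn.sendall(json.dumps(reply).encode("utf-8"))
    conn.close()


def run_exchange(values):
    """Send a request over the channel and wait for the reply message."""
    a_end, b_end = socket.socketpair()  # two connected endpoints
    thread = threading.Thread(target=worker, args=(b_end,))
    thread.start()
    request = {"type": "sum_request", "values": values}
    a_end.sendall(json.dumps(request).encode("utf-8"))
    reply = json.loads(a_end.recv(1024).decode("utf-8"))
    thread.join()
    a_end.close()
    return reply
```

Calling `run_exchange([1, 2, 3])` returns the reply message produced by the other endpoint. In a real distributed system the two endpoints would run on different machines and the messages would cross a network.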

Definition of a Distributed System
A distributed system is (Tanenbaum):
A collection of independent computers that
appears to its users as a single coherent system.
A distributed system is (Lamport):
One in which the failure of a computer you
didn't even know existed can render your own
computer unusable.

Overview
 A distributed system connects autonomous processors
through a communication network.
 The software components that run on each computer
use the local operating system and network protocol
stack.
 The distributed software is termed middleware.
 A distributed execution is the execution of processes
across the distributed system to collectively achieve a
common goal.

Centralized system
 All data and computational resources are kept and
controlled in a single central place, such as a server, in
a centralized system.
 Applications and users connect to this hub in order to
access and handle data.
 Although this configuration is easy to maintain and
secure, if too many users access it simultaneously or if
the central server malfunctions, it could become a
bottleneck.

Motivation for Distributed Systems
 Inherently distributed computation: in many
applications, such as money transfer in banking or
reaching consensus among geographically distant
parties, the computation is inherently distributed.
 Resource sharing: resources such as peripherals and
complete data sets can be shared.
 Access to geographically remote data and resources,
such as a bank database or a supercomputer.
 Enhanced reliability: resources and executions can be
replicated to enhance reliability.

Architectures of Distributed Systems
 Client-Server Architecture
 Peer-to-Peer (P2P) Architecture
 Three-Tier Architecture
 Microservices Architecture
 Service-Oriented Architecture (SOA) (software
architecture)
 Event-Driven Architecture (software architecture)

Client-Server Architecture
 In this setup, servers provide resources or services, and
clients request them. Clients and servers communicate
over a network.
 Examples: Web applications, where browsers (clients)
request pages from web servers.
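
The request/response pattern can be sketched as follows, assuming a toy TCP server on the local machine that serves one request and then stops. The `PAGES` table, the plain-text "protocol" (the client sends a page name, the server sends the page body), and all function names are hypothetical; this is not HTTP.

```python
import socket
import threading

# Hypothetical resources held by the server.
PAGES = {"index.html": "<h1>Home</h1>"}


def serve_once(listener):
    """Server side: accept one client, read its request, send a response."""
    conn, _addr = listener.accept()
    with conn:
        name = conn.recv(1024).decode("utf-8")
        body = PAGES.get(name, "404 not found")
        conn.sendall(body.encode("utf-8"))


def request_page(host, port, name):
    """Client side: connect, send the page name, return the response."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(name.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo(name):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    host, port = listener.getsockname()
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    response = request_page(host, port, name)
    server.join()
    listener.close()
    return response
```

The client never touches `PAGES` directly; it only sees what the server chooses to send back over the network connection.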

Peer-to-Peer (P2P) Architecture
 Each node, or "peer," in the network acts as both a
client and a server, sharing resources directly with each
other.
 Examples: File-sharing networks like BitTorrent,
where files are shared between users without a central
server.

Three-Tier Architecture
 This model has three layers: presentation (user
interface), application (business logic), and data
(database). Each layer is separated to allow easier
scaling and maintenance.
 Examples: Many web applications use this to separate
user interfaces, logic processing, and data storage.

Microservices Architecture
 The application is split into small, independent
services, each handling specific functions. These
services communicate over a network, often using
REST APIs or messaging.
 Examples: Modern web applications like Netflix or
Amazon, where different services handle user
accounts, orders, and recommendations independently.

Service-Oriented Architecture (SOA)
 Similar to microservices, SOA organizes functions as
services. However, SOA typically uses an enterprise
service bus (ESB) to manage communication between
services.
 Examples: Large enterprise applications in finance or
government, where different services handle various
aspects of business processes.

Event-Driven Architecture
 Components interact by sending and responding to
events rather than direct requests. An event triggers
specific actions or processes in various parts of the
system.
 Examples: Real-time applications like IoT systems,
where sensors trigger actions based on detected events.
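
A minimal in-process sketch of the idea, assuming a hypothetical `EventBus` class and a made-up `temperature_high` event. Real event-driven systems usually place a message broker between components; here a dictionary of handlers plays that role.

```python
class EventBus:
    """Routes each published event to every handler subscribed to its type."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, event_type, handler):
        self._handlers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        # The publisher does not know which components react, or how many.
        for handler in self._handlers.get(event_type, []):
            handler(payload)


actions = []
bus = EventBus()
bus.subscribe("temperature_high", lambda e: actions.append("fan on in " + e["room"]))
bus.subscribe("temperature_high", lambda e: actions.append("alert sent for " + e["room"]))

# A sensor detects an event and publishes it; it makes no direct request.
bus.publish("temperature_high", {"room": "lab", "celsius": 31})
```

After the `publish` call, `actions` holds one entry per subscriber. Adding a third reaction requires only another `subscribe` call, not a change to the sensor code.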

Architectural Categories
Computer architectures that consist of multiple
interconnected processors are basically of two types:
1) Tightly coupled systems
2) Loosely coupled systems

TIGHTLY COUPLED SYSTEMS
In these systems, there is a single system-wide
primary memory (address space) that is shared
by all the processors. Tightly coupled systems
are usually referred to as parallel processing
systems.
[Figure: four CPUs connected through interconnection hardware to a system-wide shared memory]

LOOSELY COUPLED SYSTEMS
 In these systems, the processors do not
share memory, and each processor has its
own local memory. Loosely coupled systems
are referred to as distributed computing
systems, or simply distributed systems.
[Figure: four CPUs, each with its own local memory, connected by a communication network]

CHARACTERISTICS OF DISTRIBUTED
SYSTEM
Concurrency
No global clock
Independent failures
More reliable
Fault tolerant
Scalable

EXAMPLES OF DISTRIBUTED
SYSTEMS
 Database Management System
 Automatic Teller Machine Network
 Internet/World-Wide Web
 Mobile and Ubiquitous Computing

AUTOMATIC TELLER MACHINE
NETWORK

INTERNET
[Figure: the Internet. Labels: intranet, ISP, backbone, satellite link, desktop computer, server, network link]

WEB SERVERS AND WEB BROWSERS
[Figure: web browsers and web servers connected through the Internet]
Web servers: www.google.com, www.uu.se, www.w3c.org
Browser requests:
http://www.google.com/search?q=lyu
http://www.uu.se/
http://www.w3c.org/Protocols/Activity.html (Protocols/Activity.html in the file system of www.w3c.org)

MOBILE AND UBIQUITOUS
COMPUTING
[Figure: mobile and ubiquitous computing. Labels: laptop, mobile phone, printer, camera, Internet, host intranet, home intranet, GSM/GPRS gateway, wireless LAN, host site]

Distributed System
A distributed system can be organized as middleware: the
middleware layer extends over multiple machines
and offers each application the same interface.

GOALS: COMMON CHARACTERISTICS
 Making resources accessible
 Openness
 Transparency
 Security
 Scalability
 Failure Handling
 Concurrency
 Heterogeneity

Making resources accessible
• The main goal of a distributed system is to make it
easy for users (and applications) to access
remote resources, and to share them in a controlled
and efficient way.
• Resources can be just about anything, but typical
examples include printers, computers,
storage facilities, data, files, Web pages, and
networks.
• A common reason to share resources is economics.

OPENNESS
 An open distributed system is a system that
offers services according to standard rules that
describe the syntax and semantics of those
services.
 Detailed interfaces of components need to be
published.
 New components have to be integrated with
existing components. An open distributed system
should also be extensible.
 Differences in data representation of interface
types on different processors (of different
vendors) have to be resolved.
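
One common way to resolve such differences is to agree on a fixed wire format that every machine encodes to and decodes from. The sketch below assumes a hypothetical record of one unsigned 32-bit sensor id and one 64-bit floating-point reading, sent in network (big-endian) byte order using Python's standard `struct` module.

```python
import struct

# "!" selects network byte order with standard sizes and no padding,
# "I" is an unsigned 32-bit integer, "d" is a 64-bit float.
RECORD = struct.Struct("!Id")


def encode(sensor_id, reading):
    """Convert local values into the agreed wire representation."""
    return RECORD.pack(sensor_id, reading)


def decode(data):
    """Convert wire bytes back into local values on the receiving machine."""
    return RECORD.unpack(data)


wire = encode(7, 21.5)
```

Because the layout is fixed by the format string rather than by the sender's processor, a receiver with a different native byte order still decodes the same values.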

TRANSPARENCY
 Distributed systems should be perceived by users
and application programmers as a whole rather
than as a collection of cooperating components.
 Transparency is the ability to hide the fact that
processes and resources are distributed.
 Transparency has several aspects, each representing
a property that distributed systems should have.

Transparency in a Distributed
System

ACCESS TRANSPARENCY
 Enables local and remote information objects
to be accessed using identical operations.
 Example: File system operations in NFS.
 Example: Navigation in the Web.
 Example: SQL Queries

LOCATION TRANSPARENCY
 Enables information objects to be accessed
without knowledge of their location.
 Example: File system operations in NFS
 Example: Pages in the Web
 Example: Tables in distributed databases

CONCURRENCY TRANSPARENCY
 Enables several processes to operate
concurrently using shared information
objects without interference between them.
 Example: Automatic teller machine network
 Example: Database management system

REPLICATION TRANSPARENCY
 Enables multiple instances of information
objects to be used to increase reliability and
performance without knowledge of the
replicas by users or application programs
 Example: Distributed DBMS
 Example: Mirroring Web Pages.

FAILURE TRANSPARENCY
 Enables the concealment of faults
 Allows users and applications to complete
their tasks despite the failure of other
components.
 Partial failure transparency is achievable
but complete failure transparency is not
possible
 Example: Database Management System

MIGRATION TRANSPARENCY
 Allows the movement of information objects
within a system without affecting the
operations of users or application programs.
 Relocation transparency: a stronger form in which
resources can be relocated while they are being
accessed without the user or application noticing
anything.

PERFORMANCE TRANSPARENCY
 Allows the system to be reconfigured to
improve performance as loads vary.
 Load should be evenly distributed among
all the machines.
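
One simple way to keep load even is to send each new job to the machine that currently has the least work. The sketch below assumes a hypothetical `LeastLoadedBalancer` class, machine names `m1` to `m3`, and a unit cost per job; ties are broken by machine name so the result is predictable.

```python
class LeastLoadedBalancer:
    """Assigns each job to the machine with the smallest current load."""

    def __init__(self, machines):
        self.load = {machine: 0 for machine in machines}

    def assign(self, cost=1):
        # Choose the least-loaded machine; break ties by name.
        target = min(self.load, key=lambda m: (self.load[m], m))
        self.load[target] += cost
        return target


balancer = LeastLoadedBalancer(["m1", "m2", "m3"])
placements = [balancer.assign() for _ in range(6)]
```

With equal job costs this behaves like round-robin; with unequal costs it steers new jobs away from machines that are already busy. The caller only sees `assign()`, not how the load is spread.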

SCALING TRANSPARENCY
 Allows the system and applications to
expand in scale without change to the
system structure or the application
algorithms.
 Example: World-Wide-Web
 Example: Distributed Database

HETEROGENEITY
 Variety and differences in
 Networks
 Computer hardware
 Operating systems
 Programming languages
 Implementations by different developers

SECURITY
 In a distributed system, clients send requests
to access data and other resources managed by
servers in the network:
 Doctors requesting records from hospitals
 Users purchasing products through electronic commerce
 Security is required for:
 Concealing the contents of messages (security and
privacy)
 Identifying a remote user or other agent correctly
(authentication)
 New challenges:
 Denial-of-service attacks
 Security of mobile code

FAILURE HANDLING (FAULT
TOLERANCE)
 Hardware, software and networks fail!
 Distributed systems must maintain
availability even at low levels of
hardware/software/network reliability.
 Fault tolerance is achieved by
 recovery
 redundancy
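
Redundancy and recovery can be combined in a simple failover read: the same data is kept on several replicas, and a failed read is recovered by trying the next replica. The `Replica` class, the `ReplicaDown` exception, and the account data are hypothetical, and a replica failure is simulated with a flag rather than a real crash.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot answer."""


class Replica:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = dict(data)  # each replica holds its own copy
        self.up = up

    def read(self, key):
        if not self.up:
            raise ReplicaDown(self.name)
        return self.data[key]


def read_with_failover(replicas, key):
    """Return (value, names of failed replicas), trying replicas in order."""
    failed = []
    for replica in replicas:
        try:
            return replica.read(key), failed
        except ReplicaDown:
            failed.append(replica.name)  # recover by moving to the next copy
    raise ReplicaDown("all replicas failed: " + ", ".join(failed))


data = {"account-17": 100}
replicas = [Replica("r1", data, up=False), Replica("r2", data), Replica("r3", data)]
value, failed = read_with_failover(replicas, "account-17")
```

The caller still gets its value even though `r1` is down, which is the availability goal stated above. The sketch does not address keeping the replicas consistent after updates.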

CONCURRENCY
 Components in distributed systems are
executed in concurrent processes.
 Components access and update shared
resources (e.g. variables, databases, device
drivers).
 Integrity of the system may be violated if
concurrent updates are not coordinated.
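
The need for coordination can be illustrated with a shared balance that several threads update. This sketch assumes threads in a single process as a stand-in for concurrent components, and a hypothetical `Account` class; the lock makes each read-modify-write step indivisible.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # Without the lock, two threads could both read the same old
        # balance and one of the two updates would be lost.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_deposits(account, n_threads, per_thread):
    """Start n_threads workers that each make per_thread deposits of 1."""
    def work():
        for _ in range(per_thread):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance
```

With the lock in place, `run_deposits(Account(0), 4, 1000)` always ends with every deposit counted. Across separate machines a local lock is not available, so the same guarantee has to come from the system itself, for example from database transactions.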

SCALABILITY
 The scalability of a system can be measured along at
least three dimensions:
 Size scalability: more users and resources can
easily be added to the system.
 Geographical scalability: the users and resources
may lie far apart.
 Administrative scalability: the system remains
easy to manage even if it spans many
independent administrative organizations.

SCALING TECHNIQUES
 Hiding communication latencies
 Asynchronous communication
 Moving more work to the client machine
 Distribution
 Distribution involves taking a component, splitting it
into smaller parts, and subsequently spreading those
parts across the system. An excellent example of
distribution is the Internet Domain Name System
(DNS).
 Replication
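
The distribution technique can be sketched with a toy name space split into zones, loosely modelled on DNS. This is a simplified illustration, not the real DNS protocol: the `ZoneServer` class, the zone names, and the address (taken from a documentation-only range) are all hypothetical.

```python
class ZoneServer:
    """Holds only one zone: its own records plus delegations to child zones."""

    def __init__(self, name):
        self.name = name
        self.records = {}   # host name -> address, for this zone only
        self.children = {}  # zone suffix -> ZoneServer responsible for it

    def resolve(self, hostname, path=None):
        """Return (address, list of servers consulted along the way)."""
        path = (path or []) + [self.name]
        if hostname in self.records:
            return self.records[hostname], path
        for suffix, server in self.children.items():
            if hostname.endswith("." + suffix):
                return server.resolve(hostname, path)  # delegate downwards
        raise KeyError(hostname)


root = ZoneServer("root")
edu = ZoneServer("edu")
uni = ZoneServer("example.edu")
root.children["edu"] = edu
edu.children["example.edu"] = uni
uni.records["www.example.edu"] = "192.0.2.10"

address, consulted = root.resolve("www.example.edu")
```

No single server stores the whole name space, so each zone can be managed and scaled separately.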

4. BASIC DESIGN ISSUES
 Specific issues for distributed
systems:
 Naming
 Communication
 Software structure
 System architecture
 Workload allocation
 Consistency maintenance

NAMING
 A name is resolved when it is translated into a form
that can be interpreted to reference a resource or object.
 Communication identifier (IP address + port number)
 Name resolution may involve several translation steps
 Design considerations
 Choice of name space for each resource type
 Name service to resolve resource names to
communication identifiers
 Name services include naming context
resolution, hierarchical structure, and resource
protection
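
A minimal sketch of name resolution with more than one translation step, assuming a hypothetical name service made of two tables: `ALIASES` maps a user-friendly name to a service name, and `DIRECTORY` maps a service name to a communication identifier (IP address and port). All names and addresses are made up for illustration.

```python
ALIASES = {"printer": "print-service.office"}
DIRECTORY = {"print-service.office": ("192.0.2.20", 9100)}


def resolve(name, max_steps=5):
    """Return ((ip, port), list of names visited during resolution)."""
    steps = [name]
    for _ in range(max_steps):
        if name in DIRECTORY:
            return DIRECTORY[name], steps
        if name not in ALIASES:
            raise LookupError("unknown name: " + name)
        name = ALIASES[name]  # one translation step
        steps.append(name)
    raise LookupError("too many translation steps")


identifier, steps = resolve("printer")
```

Here `printer` is first translated to a service name and only then to an address and port that a client could actually connect to.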

COMMUNICATION
 Separated components communicate through
sending processes and receiving processes, both for
data transfer and for synchronization.
 Message passing: send and receive primitives
 synchronous (blocking)
 asynchronous (non-blocking)
 Abstractions defined: channels, sockets, ports.
 Communication patterns: client-server
communication (e.g., RPC, function shipping)
and group multicast
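
The difference between blocking and non-blocking receive can be sketched with an in-process queue standing in for a channel. The function names and the 0.05-second send delay are assumptions made for the example.

```python
import queue
import threading
import time


def try_receive(channel):
    """Non-blocking receive: return a message, or None if none has arrived."""
    try:
        return channel.get_nowait()
    except queue.Empty:
        return None


def blocking_receive(channel, timeout=2.0):
    """Blocking receive: wait until a message arrives (or the timeout expires)."""
    return channel.get(timeout=timeout)


def delayed_send(channel, message, delay):
    time.sleep(delay)
    channel.put(message)


def demo():
    channel = queue.Queue()
    first = try_receive(channel)        # returns at once, nothing sent yet
    sender = threading.Thread(target=delayed_send, args=(channel, "hello", 0.05))
    sender.start()
    second = blocking_receive(channel)  # waits for the sender
    sender.join()
    return first, second
```

The non-blocking call lets the receiver continue with other work when no message is available, while the blocking call also synchronizes the receiver with the sender.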

SOFTWARE STRUCTURE
 Layers in centralized computer systems:
Applications
Middleware
Operating system
Computer and network hardware

SOFTWARE STRUCTURE
 Layers and dependencies in distributed systems:
Applications
Distributed programming support
Open services
Open system kernel services
Computer and network hardware

Challenges
• Performance
• Concurrency
• Failures
• Scalability
• System updates/growth
• Heterogeneity
• Openness
• Multiplicity of ownership, authority
• Security
• Quality of service/user experience
• Transparency
• Debugging

ADVANTAGES OF DISTRIBUTED
SYSTEM
 Information sharing among distributed
users
 Resource sharing
 Extensibility and incremental growth
 Shorter response time and higher throughput
 Higher reliability
 Better flexibility in meeting users’ needs
 Better price/performance ratio
 Scalability
 Transparency

DISADVANTAGES OF DISTRIBUTED
SYSTEM
 Difficulties of developing distributed
software
 Networking problems
 Security problems
 Performance
 Openness
 Reliability and fault tolerance

NEXT LECTURE
 Introduction to Big Data
 Big Data Sources
 5 V’s of Big Data
 Big Data Processing Frameworks (Hadoop,
Spark, and NoSQL Databases)
 Introduction to Apache Hadoop Stack (HDFS,
MapReduce, Sqoop, Zookeeper, HBase, Hive, Pig)

REFERENCES:
 Tanenbaum, Andrew S., and Maarten van
Steen. Distributed Systems: Principles and
Paradigms. Prentice-Hall, 2007.
 Sinha, Pradeep K. Distributed Operating Systems:
Concepts and Design. PHI Learning Pvt. Ltd.,
1998.
 NOC: Distributed Systems, NPTEL
