PROCESSOR ALLOCATION 
BY RITU RANJAN SHRIVASTWA 
Distributed Systems
WHAT YOU WILL LEARN 
Why Distributed Systems need processor allocation 
How the performance of Distributed Systems can be enhanced by using different processor allocation strategies 
What issues we face while designing a processor allocation strategy 
MOTIVATION 
• We are talking about distributed systems, hence multiple connected machines 
• A good allocation algorithm is always appreciated: 
• Speeds up computation 
• Makes proper use of resources 
• Minimizes CPU idle time 
• The concept of using idle workstations is only a weak attempt at recapturing the wasted cycles 
• A single 1000-MIPS CPU may be much more expensive than 100 10-MIPS CPUs, so the price/performance ratio of the latter is much better. (It may also not be possible to build a single CPU of much higher performance.) 
The highest-performance system has 3,120,000 cores at 2.2 GHz, delivering 54,902.4 TFLOP/s.
ALLOCATION MODELS 
• Before talking about processor allocation, we make some assumptions about the allocation model: 
• All machines are identical, or at least code-compatible 
• They differ at most in speed (MIPS or FLOPS) 
• Homogeneity (same architecture) 
• The system is fully connected (this doesn't always mean a wire to each system, just that transport connections can be established between any pair) 
• New work is generated when a process decides to fork or otherwise create a sub-process 
PROCESSOR ALLOCATION STRATEGIES 
• NONMIGRATORY 
• A process, when created, is assigned to a machine where it stays until it terminates, no matter how overloaded that machine becomes or how many other machines are idle. 
• MIGRATORY 
• In contrast, a process can be moved to another machine even after it has started executing, allowing better load balancing. 
• Although migratory strategies provide better load balancing, they have a major impact on system design. 
AN EXAMPLE OF PROCESSOR ALLOCATION 
TO GIVE AN IDEA OF THE NEED 
Suppose process A takes 10 sec on Processor 1 and 6 sec on Processor 2, while process B takes 30 sec on Processor 1 and 8 sec on Processor 2. 

Mean Response Time: 
Processor1 <- A, Processor2 <- B : (10 + 8)/2 = 9 sec 
Processor1 <- B, Processor2 <- A : (30 + 6)/2 = 18 sec 

Q. Which allocation is better? Clearly the first: it halves the mean response time.
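To make the arithmetic concrete, here is a minimal Python sketch that recomputes the mean response time of each one-to-one assignment. The run times are the ones from the example above; the exhaustive enumeration of assignments is just an illustrative device, not something from the slides.

```python
from itertools import permutations

# Run time (in seconds) of each process on each processor, from the example above.
run_time = {
    ("A", "Processor1"): 10, ("A", "Processor2"): 6,
    ("B", "Processor1"): 30, ("B", "Processor2"): 8,
}

processes = ["A", "B"]
processors = ["Processor1", "Processor2"]

# Try every one-to-one assignment of processes to processors.
for placement in permutations(processors, len(processes)):
    mapping = dict(zip(processes, placement))
    mean_response = sum(run_time[(p, mapping[p])] for p in processes) / len(processes)
    print(mapping, "-> mean response time:", mean_response, "sec")
```

Running this prints 9.0 sec for the first assignment and 18.0 sec for the second, matching the slide.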
ISSUES IN PROCESSOR ALLOCATION 
• Design Issues 
• Deterministic vs Heuristic Algorithms 
• Centralized vs Distributed Algorithms 
• Optimal vs Sub-optimal Algorithms 
• Local vs Global Algorithms 
• Sender-initiated vs Receiver-initiated Algorithms 
• Implementation Issues 
DETERMINISTIC VS HEURISTIC 
ALGORITHMS 
• Deterministic 
• All information regarding the processes is known in advance (for example: computing requirements, file requirements, communication requirements, etc.) 
• Complete information is not always available, but it can be approximated. For example, in banking, insurance, or airline-reservation systems, today's work is much like yesterday's, so the nature of the workload can at least be statistically characterized. 
• Heuristic 
• The workload is completely unpredictable 
• Requests for work may change dramatically from hour to hour, or even minute to minute 
CENTRALIZED VS DISTRIBUTED 
• Centralized 
• Collecting all the information at one place (one machine) allows better decisions to be made, but it is less robust and can put a heavy load on the central machine. 
• Distributed 
• The opposite of centralized (may also be termed decentralized): there is no central machine, and the algorithm runs on all the machines. 
OPTIMAL VS SUB-OPTIMAL 
• Depends upon the first two issues 
• Are we trying to find the best solution, or simply an acceptable one? 
• Optimal solutions can be found in both centralized and distributed systems, but finding them is usually costly, since it involves collecting more information and processing it more thoroughly. 
• In practice, we use heuristic, distributed, sub-optimal solutions 
LOCAL VS GLOBAL 
• Deciding whether to keep a newly born (forked) process on the same machine or transfer it to another 
• Crude algorithms keep the newly born process on the same machine as long as that machine's workload is below a threshold value. But this technique may be far from optimal. 
• A better approach is to keep information about all the machines and use it to decide where the new process should go. This can give a slightly better result than the local technique, but at a much higher cost (see the sketch after this list). 
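As a rough sketch of the difference, a local policy looks only at its own load, while a global policy compares all machines. The threshold, machine names, and load numbers below are invented for illustration and are not taken from the slides.

```python
# Hypothetical ready-process counts for three machines (invented for the example).
loads = {"M1": 6, "M2": 2, "M3": 5}
THRESHOLD = 4  # assumed load threshold for the crude local policy

def local_policy(origin: str) -> str:
    """Local algorithm: keep the new process if the origin machine is below the threshold."""
    return origin if loads[origin] < THRESHOLD else "transfer-somewhere-else"

def global_policy() -> str:
    """Global algorithm: place the new process on the least loaded machine overall."""
    return min(loads, key=loads.get)

print(local_policy("M1"))  # -> "transfer-somewhere-else" (M1 is above the threshold)
print(global_policy())     # -> "M2" (globally least loaded, at the cost of knowing all loads)
```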
SENDER-INITIATED VS RECEIVER-INITIATED 
ALGORITHMS 
• This issue deals with the location policy 
• Once the transfer policy has decided that a process should not run where it was created, the location policy comes into play: who takes the initiative in finding it a home? 
• Sender-initiated: an overloaded machine looks for another machine to take some of its work. Receiver-initiated: an idle or lightly loaded machine asks other machines for work. 
(Figure: sender-initiated vs receiver-initiated location policy) 
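A minimal sketch of the two approaches follows. The probe limit, threshold, and random probing are assumptions chosen for illustration; they are not specified in the slides.

```python
import random

THRESHOLD = 4      # assumed load threshold
PROBE_LIMIT = 3    # assumed number of machines to probe before giving up

def sender_initiated(my_load, other_loads):
    """An overloaded sender probes a few machines, looking for a lightly loaded receiver."""
    if my_load <= THRESHOLD:
        return None  # not overloaded, keep the process
    for machine in random.sample(list(other_loads), min(PROBE_LIMIT, len(other_loads))):
        if other_loads[machine] < THRESHOLD:
            return machine  # transfer a process here
    return None  # no willing receiver found, keep the process

def receiver_initiated(my_load, other_loads):
    """An underloaded receiver probes a few machines, asking an overloaded sender for work."""
    if my_load >= THRESHOLD:
        return None  # busy enough already
    for machine in random.sample(list(other_loads), min(PROBE_LIMIT, len(other_loads))):
        if other_loads[machine] > THRESHOLD:
            return machine  # request work from this machine
    return None

loads = {"M2": 1, "M3": 7, "M4": 5}                       # example loads seen from machine M1
print(sender_initiated(my_load=6, other_loads=loads))     # -> "M2" (the only machine below the threshold)
print(receiver_initiated(my_load=1, other_loads=loads))   # -> "M3" or "M4" (an overloaded sender)
```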
IMPLEMENTATION ISSUES 
• Calculating the workload (not an easy task) 
• One way is to count the total number of processes and use that count as the load. But even on an idle system there are various background processes, so the process count says little about the current load. 
• A second way is to count just the running or ready processes. 
• A more direct measure is the fraction of time the CPU is busy. This can be obtained by setting a timer that generates periodic interrupts, each of which records whether the CPU is busy or idle. Con: interrupts are switched off while the kernel executes critical code, so some samples are missed and the true CPU usage tends to be underestimated. (A small sketch of this sampling idea follows.) 
• Another implementation issue is accounting for the overhead of the algorithm itself (for example, the cost of transferring processes). Measuring this is not easy, so most algorithms ignore it. 
• Finally, the complexity of the algorithm is an issue: an algorithm may produce better allocations, but if its running time or implementation complexity eats up the gain, the overall outcome may be no better than that of an existing, simpler algorithm. 
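As a rough, user-level illustration of the periodic-sampling idea: real systems record the CPU state inside a timer interrupt handler, whereas the cpu_is_busy stub and its 30% busy probability below are invented placeholders.

```python
import random
import time

SAMPLE_INTERVAL = 0.01   # seconds between "timer interrupts" (assumed)
SAMPLES = 200

def cpu_is_busy() -> bool:
    """Placeholder for reading the real CPU state at interrupt time.
    Here we simply simulate a machine that is busy ~30% of the time."""
    return random.random() < 0.30

busy_samples = 0
for _ in range(SAMPLES):
    time.sleep(SAMPLE_INTERVAL)   # stand-in for waiting for the next timer interrupt
    if cpu_is_busy():
        busy_samples += 1

# Estimated CPU utilization = fraction of samples in which the CPU was busy.
print(f"Estimated CPU utilization: {busy_samples / SAMPLES:.0%}")
```

If some samples never happen (as when interrupts are disabled in kernel critical sections), the busy count only shrinks, which is why this method tends to underestimate true usage.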
PROCESSOR ALLOCATION 
ALGORITHMS 
• There are many algorithms, such as: 
• A GRAPH-THEORETIC DETERMINISTIC ALGORITHM 
• A CENTRALIZED ALGORITHM 
• A HIERARCHICAL ALGORITHM 
• A SENDER-INITIATED DISTRIBUTED HEURISTIC ALGORITHM 
• A RECEIVER-INITIATED DISTRIBUTED HEURISTIC ALGORITHM 
• A BIDDING ALGORITHM 
• In this part we will study only: 
• A GRAPH-THEORETIC DETERMINISTIC ALGORITHM 
A GRAPH-THEORETIC DETERMINISTIC 
ALGORITHM 
• Recall the assumptions of deterministic algorithms 
• Here the communication requirements are known in advance 
• There can be more processes than processors 
• In that case, multiple processes are allocated to one processor 
• The system can be represented as a weighted graph 
• Each node is a process 
• Each arc (edge) represents the flow of messages between two processes, weighted by the amount of traffic 
• Let's take a scenario where there are 3 processors and 9 processes 
A GRAPH-THEORETIC DETERMINISTIC 
ALGORITHM (CONTD.) 
• The weighted graph would look like this: (figure: weighted graph of the 9 processes, with edge weights giving the message traffic between process pairs) 
A GRAPH-THEORETIC DETERMINISTIC 
ALGORITHM (CONTD.) 
• The problem is thus reduced to finding a way to partition (i.e., cut) the graph into k disjoint sub-graphs, subject to certain constraints (e.g., total CPU and memory requirements below some limit for each sub-graph) 
• Arcs joining two sub-graphs represent network traffic 
• Arcs joining two processes within the same sub-graph can be ignored, as they represent intra-machine communication 
• The goal is to find the partitioning that minimizes the network traffic while meeting all the constraints (see the sketch after this list) 
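A minimal sketch of the idea for a toy instance follows. The graph, its edge weights, the two-processor setup, and the brute-force search are all invented for illustration; they are not the 9-process, 3-processor example from the slides, and real algorithms use smarter partitioning than exhaustive search.

```python
from itertools import product

# Hypothetical weighted communication graph: (process, process) -> message traffic.
edges = {
    ("A", "B"): 3, ("A", "D"): 2, ("B", "C"): 2, ("B", "E"): 4,
    ("C", "F"): 5, ("D", "E"): 1, ("E", "F"): 2,
}
processes = sorted({p for edge in edges for p in edge})
K = 2            # number of processors (assumed)
MAX_PER_CPU = 3  # assumed capacity constraint, standing in for the CPU/memory limits

def network_traffic(assignment):
    """Total weight of edges whose endpoints are on different processors (the cut)."""
    return sum(w for (p, q), w in edges.items() if assignment[p] != assignment[q])

def feasible(assignment):
    """Respect the per-processor capacity limit."""
    placements = list(assignment.values())
    return all(placements.count(cpu) <= MAX_PER_CPU for cpu in range(K))

# Brute force over all assignments; only workable for toy-sized graphs.
candidates = (
    dict(zip(processes, placement))
    for placement in product(range(K), repeat=len(processes))
)
best = min((a for a in candidates if feasible(a)), key=network_traffic)
print(best, "-> network traffic =", network_traffic(best))
```

The number of possible assignments grows exponentially with the number of processes, so anything beyond toy sizes needs heuristic partitioning instead of brute force.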
A GRAPH-THEORETIC DETERMINISTIC 
ALGORITHM (CONTD.) 
(Figure: one way of partitioning the graph to allocate the 9 processes to CPU 1, CPU 2, and CPU 3) 
Network traffic = Σ (weights of the edges that cross partition boundaries) = 30 
We can also partition the graph differently, as we will see on the next slide. 
A GRAPH-THEORETIC DETERMINISTIC 
ALGORITHM (CONTD.) 
(Figure: an alternative partitioning of the graph, again allocating the 9 processes to CPU 1, CPU 2, and CPU 3) 
Network traffic = Σ (weights of the edges that cross partition boundaries) = 28 
Clearly, this different partitioning reduces the network traffic. 
POST QUESTIONS OR COMMENTS BELOW 