PHASES OF DISTRIBUTED
QUERY PROCESSING -I
PRESENTATION MEMBERS
-ANU ISSAC
-CRYSTAL CUTHINHO
-LEON D’SOUZA
-NEVIL D’SOUZA
-ANDREA FURTADO
• SQL , Parsing , The SQL query determines what data is to be found, but does not define the method by
which the data manager searches the database.
• Indexing, hashing, Views , ...
OBJECTIVES OF QUERY PROCESSING
• First and main objective of query Processing, "To convert high level language(SQL) to low level
language(Relational Algebra)".
• In distributed systems, the query written on single machine but are actually executed on different local
database. as the local database provides the data to execute the query.
• A query can be processed using various techniques, such that the technique should be efficient to
execute the query in distributed environment.
• To minimize the overall cost of execution.(I/O cost + CPU cost +Communication cost).
• to minimize the time required to execute the query.(I/O time + CPU time +Communication time).
• This phase brings query into normalized form in order of easier processing of query.
• its specifically focus on the where clause of query.
2. ANALYSIS
• Lexical & Syntactical analysis
• Verification of relations & attributes
• Conflict between operations
• Checks if result is possible
2. ANALYSIS
SEMANTIC ANALYSIS :
1) Connection Graph:
aka Query Graph
2. ANALYSIS
SEMANTIC ANALYSIS :
2) Join Graph:
-Subgraph of connection graph
-Only join operations considered
3. SIMPLIFICATION
• Detects redundant predicates
• Transforms queries – makes them simple & efficient
• Not at the cost of semantic correctness
• Checks for factors responsible for redundancy
QUERY RESTRUCTURING
1. Rewrites Query into equivalent Relational Algebra
2. Makes use of a Query Tree or Operator Tree
• Leaf node for every relation in the query
• Non-leaf node for intermediate relation we can generate
• Root node for the result of a query
3.Sequence of operation is from Leaf towards the Root
4.Transformation Rules are applied .
EXAMPLE
• Q) Find the names of employees other than Raj who worked on CAD/CAM project for either one or two
years.
• Query:
SELECT Ename
FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
AND ENAME ≠ “Raj”
AND PROJ.PNAME=“CAD/CAM”
AND (DUR=12 OR DUR=24);
ΠENAME (P)
σDUR=12 ORDUR=24(S)
σPNAME=“CAD/CAM”(S)
σENAME≠“RAJ”(S)
Pno(J) ENO
PROJ
ASG
EMP
RESTRUCTURING
FRAGMENTATION:
• Forming relational algebraic queries to be used on fragmented relations.
• Generation of fragmented query is done by replacing the global relations with
fragmented relation in the query tree of distributed query.
• The generic tree still has some scope for reconstruction and simplification.
• Generic tree is used to generate a simpler and optimized query by using
reduction technique.
• Type of fragmentation determines the reduction technique to be used.
REDUCTION FOR HORIZONTAL FRAGMENTATION:
• Reduction of generic tree is done using either selection operation
or join operation.
• Selection operation: Produces an intermediate relation that
remains empty if there is contradiction between selection
predicate and definition of the fragment.
• Join operation: Detects useless join operation by commuting joins
with union operation.
EXAMPLE
• Consider the schemas:
EMP(ENO, ENAME, TITLE) , ASG(ENO, PNO, RESP, DUR)
• Consider the following query and fragmentation: Query: SELECT *
FROM EMP, ASG WHERE EMP.ENO=ASG.ENO
• Horizontal fragmentation:
∗ EMP1 = σENO ≤ “ E3”(EMP) ASG1 = σENO ≤ ” E3”(ASG)
EMP2 = σ “ E3””<ENO<” E6”(EMP) ASG2 = σENO> ” E3”(ASG) –
EMP3 = σENO> ” E6”(EMP)
USING SELECTION OPERATION
• Consider the query: SELECT * FROM EMP WHERE ENO=”E5”
• Here the leaf node that corresponds to Employee relation in the generic tree
can be replaced by reconstruction rule.
• The selection predicate has a contradiction with the definition of EMP1 and
EMP3 fragment and produces empty result.
Department(deptno,dname,location)
• DEPT1= σ deptno<=10(Department)
• DEPT2= σ deptno>10(Department)
• Assume that the fragmention of employee relation is derived from department
• EMPi=Employee deptno DEPTi i=1,2
Select * from Employee,Department where depno>10 and Employee.deptno=Department.deptno.
Employee.deptno=Department.deptno
deptno>10
U
U
DEPT2
EMP1 EMP2
EMP2
Employee.deptno=Department.deptno
DEPT2DEPT1
Phases of distributed query processing

Phases of distributed query processing

  • 1.
    PHASES OF DISTRIBUTED QUERYPROCESSING -I PRESENTATION MEMBERS -ANU ISSAC -CRYSTAL CUTHINHO -LEON D’SOUZA -NEVIL D’SOUZA -ANDREA FURTADO
  • 2.
    • SQL ,Parsing , The SQL query determines what data is to be found, but does not define the method by which the data manager searches the database. • Indexing, hashing, Views , ...
  • 4.
    OBJECTIVES OF QUERYPROCESSING • First and main objective of query Processing, "To convert high level language(SQL) to low level language(Relational Algebra)". • In distributed systems, the query written on single machine but are actually executed on different local database. as the local database provides the data to execute the query. • A query can be processed using various techniques, such that the technique should be efficient to execute the query in distributed environment. • To minimize the overall cost of execution.(I/O cost + CPU cost +Communication cost). • to minimize the time required to execute the query.(I/O time + CPU time +Communication time). • This phase brings query into normalized form in order of easier processing of query. • its specifically focus on the where clause of query.
  • 5.
    2. ANALYSIS • Lexical& Syntactical analysis • Verification of relations & attributes • Conflict between operations • Checks if result is possible
  • 6.
    2. ANALYSIS SEMANTIC ANALYSIS: 1) Connection Graph: aka Query Graph
  • 7.
    2. ANALYSIS SEMANTIC ANALYSIS: 2) Join Graph: -Subgraph of connection graph -Only join operations considered
  • 8.
    3. SIMPLIFICATION • Detectsredundant predicates • Transforms queries – makes them simple & efficient • Not at the cost of semantic correctness • Checks for factors responsible for redundancy
  • 9.
    QUERY RESTRUCTURING 1. RewritesQuery into equivalent Relational Algebra 2. Makes use of a Query Tree or Operator Tree • Leaf node for every relation in the query • Non-leaf node for intermediate relation we can generate • Root node for the result of a query 3.Sequence of operation is from Leaf towards the Root 4.Transformation Rules are applied .
  • 10.
    EXAMPLE • Q) Findthe names of employees other than Raj who worked on CAD/CAM project for either one or two years. • Query: SELECT Ename FROM PROJ, ASG, EMP WHERE ASG.ENO=EMP.ENO AND ASG.PNO=PROJ.PNO AND ENAME ≠ “Raj” AND PROJ.PNAME=“CAD/CAM” AND (DUR=12 OR DUR=24);
  • 11.
  • 12.
  • 13.
    FRAGMENTATION: • Forming relationalalgebraic queries to be used on fragmented relations. • Generation of fragmented query is done by replacing the global relations with fragmented relation in the query tree of distributed query. • The generic tree still has some scope for reconstruction and simplification. • Generic tree is used to generate a simpler and optimized query by using reduction technique. • Type of fragmentation determines the reduction technique to be used.
  • 14.
    REDUCTION FOR HORIZONTALFRAGMENTATION: • Reduction of generic tree is done using either selection operation or join operation. • Selection operation: Produces an intermediate relation that remains empty if there is contradiction between selection predicate and definition of the fragment. • Join operation: Detects useless join operation by commuting joins with union operation.
  • 15.
    EXAMPLE • Consider theschemas: EMP(ENO, ENAME, TITLE) , ASG(ENO, PNO, RESP, DUR) • Consider the following query and fragmentation: Query: SELECT * FROM EMP, ASG WHERE EMP.ENO=ASG.ENO • Horizontal fragmentation: ∗ EMP1 = σENO ≤ “ E3”(EMP) ASG1 = σENO ≤ ” E3”(ASG) EMP2 = σ “ E3””<ENO<” E6”(EMP) ASG2 = σENO> ” E3”(ASG) – EMP3 = σENO> ” E6”(EMP)
  • 16.
    USING SELECTION OPERATION •Consider the query: SELECT * FROM EMP WHERE ENO=”E5” • Here the leaf node that corresponds to Employee relation in the generic tree can be replaced by reconstruction rule. • The selection predicate has a contradiction with the definition of EMP1 and EMP3 fragment and produces empty result.
  • 18.
    Department(deptno,dname,location) • DEPT1= σdeptno<=10(Department) • DEPT2= σ deptno>10(Department) • Assume that the fragmention of employee relation is derived from department • EMPi=Employee deptno DEPTi i=1,2 Select * from Employee,Department where depno>10 and Employee.deptno=Department.deptno.
  • 19.