Graph Database
Prepared By: Ali Rajab

Content
- Graph Database
- Types of Graph Databases
- Graph Database Query Language
- Neo4j Graph Database
- Case Study: Social Recommendations

Graph Database
- A graph database system, or simply a graph database, is a system specifically designed
  for managing graph-like data following the basic principles of database systems.
- Graph databases are gaining relevance in industry because of their use in domains that
  require graph and network analytics (social networking, master data management,
  geospatial applications, recommendations).
- Popular graph databases include Amazon Neptune, Azure Cosmos DB, Neo4j, and Titan.

Graph Database Model
- Data structures for the schema and/or instances are modeled as graphs.
- Data manipulation is expressed by graph-oriented operations (i.e., a graph query
  language).
- Integrity constraints are defined over the graph structure.

Types of Graph Databases
- There are several different graph data models, including:
  - Property graphs
  - Hypergraphs
  - Triple stores

Property Graph Data Model
- A property graph has the following characteristics:
  - It contains nodes and relationships.
  - Nodes contain properties (key-value pairs).
  - Nodes can be labeled with one or more labels.
  - Relationships are named and directed, and always have a start and end node.
  - Relationships can also contain properties.
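
The characteristics above can be sketched as two small Python classes. This is only an illustration: the class names, field names, and sample data are assumptions and do not come from the slides or from any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with labels and key-value properties."""
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship that always has a start and an end node."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data, not taken from the slides.
alice = Node("n1", {"Person"}, {"name": "Alice"})
acme = Node("n2", {"Company"}, {"name": "Acme"})
works_for = Relationship("WORKS_FOR", alice, acme, {"since": 2019})

print(works_for.start.properties["name"], works_for.name, works_for.end.properties["name"])
# prints: Alice WORKS_FOR Acme
```

Because `start` and `end` are required fields, a relationship cannot be created without both endpoints, which mirrors the rule that a relationship always has a start and an end node.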

- The formal definition of a property graph relies on the following notation:
  - L is an infinite set of labels (for nodes and edges).
  - P is an infinite set of property names.
  - V is an infinite set of atomic values.
  - T is a finite set of datatypes (e.g., integer).
  - Given a set X, SET+(X) denotes the set of all finite subsets of X, excluding the
    empty set.
  - Given a value v ∈ V, the function type(v) returns the datatype of v.
  - Values in V are distinguished by writing them as quoted strings.
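
As a rough illustration of SET+(X) and type(v), the sketch below enumerates the non-empty subsets of a small finite set and maps Python values to datatype names. The function names `set_plus` and `value_type`, and the datatype names other than integer, are assumptions made for this example.

```python
from itertools import chain, combinations


def set_plus(xs):
    """Return all non-empty subsets of the finite collection xs, as frozensets."""
    items = list(xs)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(1, len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def value_type(v):
    """Stand-in for type(v): map a Python value to a datatype name in T."""
    datatypes = {int: "integer", float: "float", str: "string", bool: "boolean"}
    return datatypes[type(v)]


print(len(set_plus({"Author", "Person", "Editor"})))  # 2**3 - 1 = 7
print(value_type(42))  # integer
```

For a finite X with k elements, SET+(X) has 2^k - 1 members, since only the empty set is excluded.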

- A property graph is a tuple G = (N, E, ρ, λ, σ) where:
  - N is a finite set of nodes (also called vertices).
  - E is a finite set of edges such that E has no elements in common with N.
  - ρ : E → (N × N) is a total function that associates each edge in E with a pair of
    nodes in N.
  - λ : (N ∪ E) → SET+(L) is a partial function that associates a node/edge with a set
    of labels from L.
  - σ : (N ∪ E) × P → SET+(V) is a partial function that associates nodes/edges with
    properties, and for each property it assigns a set of values from V.
- Given two nodes n1, n2 ∈ N and an edge e ∈ E such that ρ(e) = (n1, n2), we say that
  n1 and n2 are the “source node” and the “target node” of e, respectively.

Example
N = {n1, n2, n3, n4, n5, n6, n7}
E = {e1, e2, e3, e4, e5, e6, e7}
ρ(e3) = (n5, n2)
λ(n1) = {Author}, σ(n1, fname) = “Mariano”, σ(n1, lname) = “Consens”
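
The tuple G = (N, E, ρ, λ, σ) can be sketched in Python with sets and dictionaries. The class below is a hypothetical illustration, not a reference implementation. Because the example lists only one value of ρ, the instance is restricted to a fragment (nodes n1, n2, n5 and edge e3) so that ρ stays total without inventing endpoints for the other edges.

```python
from dataclasses import dataclass


@dataclass
class PropertyGraph:
    nodes: set    # N
    edges: set    # E
    rho: dict     # ρ: edge -> (source node, target node), total on E
    lam: dict     # λ: node or edge -> non-empty set of labels, partial
    sigma: dict   # σ: (node or edge, property name) -> non-empty set of values, partial

    def validate(self):
        """Check the structural conditions stated in the definition."""
        assert self.nodes.isdisjoint(self.edges), "N and E must be disjoint"
        assert set(self.rho) == self.edges, "ρ must be total on E"
        for source, target in self.rho.values():
            assert source in self.nodes and target in self.nodes
        objects = self.nodes | self.edges
        assert all(o in objects and labels for o, labels in self.lam.items())
        assert all(o in objects and values for (o, _), values in self.sigma.items())

    def source(self, e):
        return self.rho[e][0]

    def target(self, e):
        return self.rho[e][1]


# Fragment of the example: only the facts stated on the slide are used.
g = PropertyGraph(
    nodes={"n1", "n2", "n5"},
    edges={"e3"},
    rho={"e3": ("n5", "n2")},
    lam={"n1": {"Author"}},
    sigma={("n1", "fname"): {"Mariano"}, ("n1", "lname"): {"Consens"}},
)
g.validate()
print(g.source("e3"), g.target("e3"))  # n5 n2
```

Storing λ and σ as dictionaries makes them partial by construction: a node or edge without labels or properties has no entry.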

Hypergraph Model
- A hypergraph is a generalized graph model in which a relationship (called a
  hyperedge) can connect any number of nodes.
- The hypergraph model allows any number of nodes at either end of a relationship.
- Hypergraphs can be useful where the domain consists mainly of many-to-many
  relationships.
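
A minimal sketch of a directed hyperedge, assuming it can be represented as a name plus a set of nodes at each end. The `Hyperedge` class, the `to_binary_edges` helper, and the ownership data are hypothetical and serve only to show how one hyperedge stands in for several binary relationships.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hyperedge:
    """A directed hyperedge with any number of nodes at each end."""
    name: str
    sources: frozenset
    targets: frozenset


# Hypothetical data: two people jointly own three vehicles.
owns = Hyperedge(
    "OWNS",
    frozenset({"Alice", "Bob"}),
    frozenset({"Mini", "Range Rover", "Prius"}),
)


def to_binary_edges(edge):
    """Expand one hyperedge into the equivalent directed binary edges."""
    return sorted((s, edge.name, t) for s in edge.sources for t in edge.targets)


binary = to_binary_edges(owns)
print(len(binary))  # 2 sources x 3 targets = 6
```

The single hyperedge captures the many-to-many fact compactly, while the property graph equivalent needs six separate relationships.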

Triples
- Triple stores come from the Semantic Web movement, where researchers are interested
  in large-scale knowledge inference by adding semantic markup to the links that
  connect web resources.
- A triple is a subject-predicate-object data structure.
- Using triples, we can capture facts such as “Ginger dances with Fred” and “Fred
  likes ice cream.” Individually, single triples are semantically rather poor, but
  en masse they provide a rich dataset from which to harvest knowledge and infer
  connections.
- Triple stores typically provide SPARQL capabilities to reason about stored RDF data.
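
The two facts above can be stored as (subject, predicate, object) tuples and queried with a simple wildcard match. This is an in-memory sketch only; it is not SPARQL or RDF, and the `match` helper is a hypothetical name.

```python
# Facts from the slide, stored as (subject, predicate, object) triples.
triples = {
    ("Ginger", "dances with", "Fred"),
    ("Fred", "likes", "ice cream"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


print(match(triples, subject="Fred"))  # [('Fred', 'likes', 'ice cream')]
print(match(triples, obj="Fred"))      # [('Ginger', 'dances with', 'Fred')]
```

Querying for Fred as subject and as object returns both facts about him, which is the kind of connection that only emerges when triples are considered together.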

Graph Database Query Language
- A query takes a graph (called the target graph) as input, applies graph pattern
  matching, and returns a table of values as output.
- The syntax of the query language is based on four main clauses: SELECT, FROM,
  MATCH, and WHERE.
- These clauses allow basic pattern matching queries to be expressed. Additionally,
  the query language introduces the UNMATCH and UNION clauses to support the negation
  and union of graph patterns, respectively.

Pattern Matching
- The core feature of the language is its support for expressing a graph pattern that
  is matched against the target graph.
- The SELECT clause defines the output of the query (specifically, a table of values).
- The FROM clause defines the input graph.
- The MATCH clause defines a graph pattern.
- The WHERE clause defines conditions over the graph pattern.
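
The roles of the four clauses can be imitated in plain Python for a pattern consisting of a single edge. This is a sketch under stated assumptions: the `run_query` function, the graph layout, and the sample data are hypothetical, and the code does not implement the query language's actual syntax.

```python
# FROM: the target graph, here node properties plus (source, label, target) edges.
# All data below is hypothetical.
graph = {
    "props": {
        "a1": {"name": "Ann"},
        "a2": {"name": "Raj"},
        "p1": {"title": "Graphs", "year": 2018},
        "p2": {"title": "Trees", "year": 2012},
    },
    "edges": [("a1", "writes", "p1"), ("a2", "writes", "p1"), ("a2", "writes", "p2")],
}


def run_query(graph, match_label, where, select):
    """Evaluate a one-edge pattern (x)-[match_label]->(y) against the graph."""
    table = []
    for source, label, target in graph["edges"]:
        if label != match_label:                      # MATCH
            continue
        binding = {"x": graph["props"][source], "y": graph["props"][target]}
        if where(binding):                            # WHERE
            table.append(select(binding))             # SELECT
    return table


rows = run_query(
    graph,
    match_label="writes",
    where=lambda b: b["y"]["year"] > 2015,
    select=lambda b: (b["x"]["name"], b["y"]["title"]),
)
print(rows)  # [('Ann', 'Graphs'), ('Raj', 'Graphs')]
```

The result is a table of values, one row per match of the pattern that satisfies the condition.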

Example
The query returns the names of authors who have papers in common, i.e., the
co-authorship relationship.
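
The query itself is not reproduced in the text. As a stand-in, the sketch below computes the same co-authorship result in plain Python over hypothetical (author, paper) pairs; the `coauthors` function and the data are assumptions, not the original query.

```python
from itertools import combinations

# Hypothetical (author, paper) pairs standing in for "author writes paper" edges.
writes = [("Ann", "p1"), ("Raj", "p1"), ("Raj", "p2"), ("Lea", "p2"), ("Tom", "p3")]


def coauthors(writes):
    """Return sorted pairs of distinct authors that share at least one paper."""
    authors_by_paper = {}
    for author, paper in writes:
        authors_by_paper.setdefault(paper, set()).add(author)
    pairs = set()
    for authors in authors_by_paper.values():
        pairs.update(combinations(sorted(authors), 2))
    return sorted(pairs)


print(coauthors(writes))  # [('Ann', 'Raj'), ('Lea', 'Raj')]
```

In graph terms, this matches the pattern in which two different author nodes are connected to the same paper node.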

Neo4j Graph Database
- Neo4j uses a property graph database model, which consists of the following:
  - Nodes describe entities (discrete objects) of a domain.
  - Nodes can have zero or more labels to define (classify) what kind of nodes they
    are.
  - Relationships describe a connection between a source node and a target node.

Cypher Query Language (CQL)
- Cypher is Neo4j’s graph query language; it allows users to store and retrieve data
  from the graph database.
- It is designed to be easily read and understood by developers, database
  professionals, and business stakeholders.
- It enables a user (or an application acting on behalf of a user) to ask the database
  to find data that matches a specific pattern.
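
To give a feel for how a Cypher pattern reads, the sketch below holds two Cypher statements in Python strings, one that stores data and one that retrieves it. Nothing is executed: running them would need a Neo4j server and a client driver, which are not shown. The labels Author and Paper, the WROTE relationship, and the property names are hypothetical.

```python
# Cypher text held in Python strings. The labels, relationship type, and
# property names are hypothetical and chosen only for illustration.
CREATE_QUERY = """
CREATE (a:Author {name: $author})-[:WROTE]->(p:Paper {title: $title})
"""

FIND_COAUTHORS_QUERY = """
MATCH (a:Author {name: $author})-[:WROTE]->(:Paper)<-[:WROTE]-(b:Author)
RETURN DISTINCT b.name AS coauthor
"""

# Parameters are kept separate from the query text instead of being
# concatenated into it.
params = {"author": "Ann", "title": "Graphs"}

for query in (CREATE_QUERY, FIND_COAUTHORS_QUERY):
    print(query.strip())
```

The MATCH pattern draws the shape being searched for, with nodes in parentheses and relationships in square brackets between arrows.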

Case Study: Social Recommendations
- Talent.net is a social recommendations application that enables users to discover
  their own professional network and identify other users with particular skill sets.
- Users work for companies, work on projects, and have one or more interests or
  skills. Based on this information, Talent.net can describe a user’s professional
  network by identifying other subscribers who share his or her interests.
- Searches can be restricted to the user’s current company or extended to encompass
  the entire subscriber base.
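
The shared-interest search described above can be sketched over an in-memory dictionary. The subscriber data, the `shared_interest_network` function, and its `same_company_only` flag are hypothetical and are not part of Talent.net.

```python
# Hypothetical subscribers; the names, companies, and interests are invented.
users = {
    "Ann": {"company": "Acme", "interests": {"graphs", "java", "travel"}},
    "Raj": {"company": "Acme", "interests": {"graphs", "java"}},
    "Lea": {"company": "Startup Ltd", "interests": {"graphs", "travel", "art"}},
    "Tom": {"company": "Acme", "interests": {"cooking"}},
}


def shared_interest_network(users, name, same_company_only=False):
    """Rank other users by the number of interests shared with `name`."""
    me = users[name]
    ranked = []
    for other, profile in users.items():
        if other == name:
            continue
        if same_company_only and profile["company"] != me["company"]:
            continue
        shared = me["interests"] & profile["interests"]
        if shared:
            ranked.append((other, len(shared)))
    # Most shared interests first; ties broken alphabetically.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


print(shared_interest_network(users, "Ann"))
# [('Lea', 2), ('Raj', 2)]
print(shared_interest_network(users, "Ann", same_company_only=True))
# [('Raj', 2)]
```

The flag corresponds to restricting the search to the user's current company versus searching the entire subscriber base.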

- Talent.net can also identify individuals with specific skills who are directly or
  indirectly connected to the current user. Such searches are useful when looking for
  a subject matter expert for a current engagement.
- Based on people’s interests and skills, and their work history, the application can
  suggest likely candidates to include in one’s professional network.
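
Finding people with a given skill who are directly or indirectly connected to a user is a bounded graph traversal. The sketch below uses breadth-first search over hypothetical connection and skill data; the `find_experts` function and the `max_hops` parameter are assumptions made for illustration.

```python
from collections import deque

# Hypothetical connections between people and their skills.
connections = {
    "Ann": {"Raj"},
    "Raj": {"Ann", "Lea"},
    "Lea": {"Raj", "Tom"},
    "Tom": {"Lea"},
    "Zoe": set(),
}
skills = {
    "Ann": {"java"},
    "Raj": {"python"},
    "Lea": {"neo4j"},
    "Tom": {"neo4j"},
    "Zoe": {"neo4j"},
}


def find_experts(start, skill, max_hops):
    """Breadth-first search for people with `skill` within `max_hops` of `start`."""
    distance = {start: 0}
    queue = deque([start])
    found = []
    while queue:
        person = queue.popleft()
        if person != start and skill in skills[person]:
            found.append((person, distance[person]))
        if distance[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(connections[person]):
            if neighbour not in distance:
                distance[neighbour] = distance[person] + 1
                queue.append(neighbour)
    return found


print(find_experts("Ann", "neo4j", max_hops=2))  # [('Lea', 2)]
print(find_experts("Ann", "neo4j", max_hops=3))  # [('Lea', 2), ('Tom', 3)]
```

Zoe has the skill but is never returned because she is not connected to Ann, which illustrates why the search is driven by relationships and not by a scan over all subscribers.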

Figure: Sample of the Talent.net social network

References
- ROBINSON, Ian; WEBBER, Jim; EIFREM, Emil. Graph Databases: New Opportunities for
  Connected Data. O'Reilly Media, Inc., 2015.
- ANGLES, Renzo. The Property Graph Database Model. In: AMW. 2018.

Thank you
