Conversation
fabratu
left a comment
There was a problem hiding this comment.
In order to keep the PR manageable, I suggest to fix all remaining issues concerning the integration in this PR and move optimization and additional thoughts to another PR.
Concerning building / testing: It seems that DHB is not correctly added as a submodule or there is some ill-defined logic in CMakeLists.txt. CI does fetch all submodules, however DHB is not found.
|
@HazelCC Could you clarify and elaborate what exactly you mean with the point: Write tests to check the consistency of unweighted DHBGraph objects.? |
An undirected graph object should store directed edges in both directions. Previously there were issues where logical bugs causing inconsistency for undirected graph, therefore it's good to have more tests addressing this. |
For now, we have referenced you as taking over for this task. Is that fine? |
For sure. I'd happy to do that. |
Including all changes necessary for integrating the DHB graph data structure represented by a single graph class. Does not include tests nor necessary refactorings to reuse Edge type.
This enables us to reuse the Edge type for the DHB Graph class.
…cached boolean value.
c-bebop
left a comment
There was a problem hiding this comment.
Looks very good, now just some small fixes.
bernlu
left a comment
There was a problem hiding this comment.
We should probably discuss some of these comments, some are quite detailed and maybe out of scope for this PR.
An additional general comment: I think it would be nice to add time complexity to the documentation for (almost) all functions.
| /** | ||
| * A weighted edge used for the graph constructor with | ||
| * initializer list syntax. | ||
| */ |
There was a problem hiding this comment.
I am not sure what this comment is supposed to tell me. The 'initializer list' part is obvious from reading the code. A short explanation of this struct would be more useful, or maybe just no comment at all?
| struct WeightedEdge : Edge { | ||
| edgeweight weight; | ||
|
|
||
| // Needed by cython |
There was a problem hiding this comment.
I guess this function should never be used in c++? Then the comment should clarify this and tell the user to never call this constructor.
| // forward declaration to randomization/CurveballImpl.hpp | ||
| namespace CurveballDetails { | ||
| class CurveballMaterialization; | ||
| } |
| if (!directed) { | ||
| is_inserted_out = m_dhb_graph.insert(u, v, edge_data, do_update); | ||
| is_inserted_in = m_dhb_graph.insert(v, u, edge_data, do_update); | ||
| } else { | ||
| is_inserted = m_dhb_graph.insert(u, v, edge_data, do_update); | ||
| } |
There was a problem hiding this comment.
Reorder for readability
| if (!directed) { | |
| is_inserted_out = m_dhb_graph.insert(u, v, edge_data, do_update); | |
| is_inserted_in = m_dhb_graph.insert(v, u, edge_data, do_update); | |
| } else { | |
| is_inserted = m_dhb_graph.insert(u, v, edge_data, do_update); | |
| } | |
| if (directed) { | |
| is_inserted = m_dhb_graph.insert(u, v, edge_data, do_update); | |
| } else { | |
| is_inserted_out = m_dhb_graph.insert(u, v, edge_data, do_update); | |
| is_inserted_in = m_dhb_graph.insert(v, u, edge_data, do_update); | |
| } |
| edgesIndexed(edgesIndexed), // edges are not indexed by default | ||
| nodeAttributeMap(this), edgeAttributeMap(this) { | ||
|
|
||
| m_dhb_graph = dhb::Matrix<EdgeData>(n); |
| void DHBGraph::removeAllEdges() { | ||
| m_dhb_graph = dhb::Matrix<EdgeData>(m_dhb_graph.vertices_count()); | ||
| m = 0; | ||
| storedNumberOfSelfLoops = 0; | ||
| omega = 0; | ||
| } |
There was a problem hiding this comment.
this function should invalidate all edgeAttributes as well.
| * BEWARE: Running time is \Theta(1) for DHB implementation but \Theta(deg(u)) for fallback | ||
| * NetworKit graph DS. |
There was a problem hiding this comment.
i think this fallback nk graph is outdated information?
| * | ||
| * @param u Node. | ||
| * @param i index; should be in [0, degreeOut(u)) | ||
| * @return @a edge weight to the i-th (outgoing) neighbor of @a u, or @c +inf if no such |
There was a problem hiding this comment.
The function actually returns the default edge weight and not inf when the neighbor does not exist.
| * @param u Node. | ||
| * @param i index; should be in [0, degreeOut(u)) | ||
| * @return pair: i-th (outgoing) neighbor of @a u and the corresponding | ||
| * edge weight, or @c defaultEdgeWeight if unweighted. |
There was a problem hiding this comment.
we should add that none is returned when the neighbor does not exist.
| * | ||
| * This function is not supported for DHB. Please use getIthNeighborWithWeight(node u, index i). | ||
| */ | ||
| std::pair<node, edgeweight> getIthNeighborWithWeight(Unsafe, node u, index i) const; |
There was a problem hiding this comment.
why is this function declared twice, but once we claim that it is not supported while the other one is implemented?
There was a problem hiding this comment.
For @c-bebop : Look whether all Unsafe methods are marked as unsupported / unimplemented.
There was a problem hiding this comment.
Pull Request Overview
This PR integrates the Dynamic Hashed Blocks (DHB) data structure into NetworKit to accelerate dynamic graph operations. Key changes include:
- Adding a new DHBGraph implementation and test (DHBGraphGTest).
- Updating source files such as EdgeIterators.cpp to add DHB-specific specializations.
- Modifying multiple CMakeLists and CI configuration files to incorporate DHB as a submodule and update linking and build settings.
Reviewed Changes
Copilot reviewed 10 out of 13 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| networkit/cpp/graph/test/CMakeLists.txt | Added DHBGraphGTest to the test suite. |
| networkit/cpp/graph/EdgeIterators.cpp | Added DHBGraph-specific template specializations for edge iterators. |
| networkit/cpp/graph/CMakeLists.txt | Included DHBGraph.cpp in source files for compilation. |
| include/networkit/graph/Graph.hpp | Removed duplicate edge type definitions (moved to Edge.hpp). |
| include/networkit/graph/Edge.hpp | Introduced new definitions for Edge and WeightedEdge types. |
| (Other files) | Updated build scripts, submodules (.gitmodules) and CI configuration accordingly. |
Comments suppressed due to low confidence (1)
networkit/cpp/graph/EdgeIterators.cpp:22
- The DHBGraph specialization of validEdge omits the Unsafe parameter found in the Graph version's getIthNeighbor call; please confirm that this API difference is intentional and that it correctly reflects the DHBGraph interface design.
}
| target_link_libraries(networkit PRIVATE | ||
| dhb | ||
| OpenMP::OpenMP_CXX | ||
| PUBLIC | ||
| tlx) |
There was a problem hiding this comment.
Review the linking order and the use of PRIVATE vs. PUBLIC for the added 'dhb' dependency to ensure that transitive dependencies are handled correctly.
| target_link_libraries(networkit PRIVATE | |
| dhb | |
| OpenMP::OpenMP_CXX | |
| PUBLIC | |
| tlx) | |
| target_link_libraries(networkit PUBLIC | |
| dhb | |
| tlx | |
| PRIVATE | |
| OpenMP::OpenMP_CXX) |
|
There are still some review comments but.. are we merging this one? |
|
I see that there have been some review remarks by @bernlu . We are working on this feature for more than 1 year now. The merge should have happened this spring. I am sure I can address some of the comments. I would like us to focus on a merge ASAP as I need this feature for my work and I would be happy to have this merged into |
|
True, not so easy this one - formost having a final decision. :) If it's possible for you, address the comments you see fit, and I will handle the rest. |
| * TODO: | ||
| * - [ ] Review the passing by value semantics for all lambdas passed | ||
| * to a method such as f(L handle). Instead, we might want to pass | ||
| * all lambdas by rvalue-reference as in f(L&& handle). |
There was a problem hiding this comment.
Please mark the DHBGraph class as experimental. It suffices to mark it in the documentation for C++ by adding @note to users.
This PR integrate the data structure Dynamic Hashed Blocks (DHB) into Networkit. This improves NetworKit for dynamic graph operations.
TLTR:
For a detailed description of DHB, refer to the paper: A Fast Data Structure for Dynamic Graphs Based on Hash-Indexed Adjacency Blocks (2022)
Why?
Experiments have shown that DHB outperforms competing dynamic graph structures in edge insertions, updates, deletions, and traversal operations. This integration allows users to leverage DHB to gain performance for dynamic graph operations compared to the (vanilla) Graph class.
How?
This integration extends the functionalities of the default
Graphclass by supporting (almost) all methods implemented inGraph. When NetworKit is build using DHB, the necessary resources will be pulled in as a sub-module, and the associated code branches will be compiled. If the respective NetworKit modules are not compiled using the DHB integration,DHBGraphwill fall back using the implementation of the defaultGraphclass.Measurement:
Experiments run on commodity hardware (MacOS, M1, 8 GB RAM) have been run to get some baseline measurements on the performance of default
Graphclass vs. theDHBGraphclass. For edge insertions, sequential weight update, and sequential edge removal,DHBGraphoutperformsGraphclass in an unweighted graph setting.TODOs:
Review the passing by value semantics for all lambdas passed to a method such asf(L handle). Instead, we might want to pass all lambdas by rvalue-reference as inf(L&& handle).omp parallel forscheduler (guided?) we use.Review possibilities to use a concurrent/parallel version of functions that iterate over edges/vertices. For example, we might makeremoveNodeto use the concurrent version offorNeighbors.edgeIteratorandnodeIterator. (@fabratu)unweightedundirectedDHBGraphobjects. Anunweightedundirected graph should store one edge in both direction. (@HazelCC)