Adding GraphParams to be able to save graph parameters of index to SavedParams #786
… vector type to SavedParams
Pull request overview
This PR adds a GraphParams struct that persists the graph configuration parameters (l_build, alpha, backedge_ratio, vector_dtype) alongside the BfTreeProvider index. The parameters are saved and loaded as part of the index's SavedParams, so the DiskANNIndex configuration can be reconstructed on load.
Changes:
- Introduced `GraphParams` struct with fields for l_build, alpha, backedge_ratio, and vector_dtype
- Added optional `graph_params` field to `BfTreeProvider`, `BfTreeProviderParameters`, and `SavedParams`
- Updated save/load implementations to persist and restore `graph_params`
- Updated all tests and documentation examples to set `graph_params: None`
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| diskann-providers/src/model/graph/provider/async_/bf_tree/provider.rs | Added GraphParams struct, integrated graph_params field into BfTreeProvider/BfTreeProviderParameters/SavedParams, updated save/load logic, updated all doc examples and tests |
| diskann-providers/src/model/graph/provider/async_/bf_tree/mod.rs | Exported GraphParams in public API |
/// Graph configuration parameters persisted alongside the index.
/// These are needed to reconstruct the `DiskANNIndex` config on load.
#[derive(Serialize, Deserialize, Clone, Debug)]
The GraphParams struct derives Debug while other serialized parameter structs like SavedParams, BfTreeParams, and QuantParams do not. For consistency, consider either removing Debug from GraphParams or adding it to the other parameter structs.
pub struct GraphParams {
    pub l_build: usize,
    pub alpha: f32,
    pub backedge_ratio: f32,
The GraphParams struct fields lack documentation. Consider adding doc comments to each field to explain what they represent. For example, what is l_build, what does alpha control, what is backedge_ratio used for, and what format should vector_dtype be in?
pub struct GraphParams {
    pub l_build: usize,
    pub alpha: f32,
    pub backedge_ratio: f32,

pub struct GraphParams {
    /// Graph build parameter controlling the size of the candidate list
    /// used during graph construction (larger values usually improve recall
    /// at the cost of higher build time and memory usage).
    pub l_build: usize,
    /// Tuning factor that controls how aggressively edges are pruned during
    /// graph construction, trading off graph sparsity against search quality.
    pub alpha: f32,
    /// Ratio of backward (reverse) edges to forward edges to retain in the
    /// graph, used to improve connectivity and robustness of the index.
    pub backedge_ratio: f32,
    /// Identifier for the vector element data type stored in the index,
    /// for example `"f32"` for 32-bit floating point vectors.
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #786 +/- ##
==========================================
- Coverage 89.00% 89.00% -0.01%
==========================================
Files 428 428
Lines 78417 78417
==========================================
- Hits 69795 69793 -2
- Misses 8622 8624 +2
    pub l_build: usize,
    pub alpha: f32,
    pub backedge_ratio: f32,
    pub vector_dtype: String,
Is this the data type as in f32/f16 etc.? It's probably better for it to be an enum instead of a raw string.
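To make the enum suggestion concrete, here is a minimal sketch of what a typed replacement for the raw string could look like. The name `VectorDType` and its variant set are assumptions made for illustration and are not taken from the PR. The sketch uses only the standard library; in the real code the enum would presumably derive `Serialize`/`Deserialize` like the other parameter structs.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical element-type enum (name and variants are assumptions,
/// not part of the PR under review).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum VectorDType {
    F32,
    F16,
    U8,
    I8,
}

impl fmt::Display for VectorDType {
    /// Writes the canonical lowercase name, e.g. "f32".
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            VectorDType::F32 => "f32",
            VectorDType::F16 => "f16",
            VectorDType::U8 => "u8",
            VectorDType::I8 => "i8",
        };
        f.write_str(name)
    }
}

impl FromStr for VectorDType {
    type Err = String;

    /// Parses the canonical name; any other string is rejected up front
    /// instead of being stored and failing later at load time.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "f32" => Ok(VectorDType::F32),
            "f16" => Ok(VectorDType::F16),
            "u8" => Ok(VectorDType::U8),
            "i8" => Ok(VectorDType::I8),
            other => Err(format!("unknown vector dtype: {other}")),
        }
    }
}
```

With a type like this, a misspelled dtype becomes a parse error at the boundary rather than an arbitrary string carried inside the saved parameters.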
This PR addresses the following issue:
We want to save alpha, l_build, backedge_ratio and vector_dtype somewhere and the best place to do it (in my opinion) is SavedParams.
For that we need to save GraphParams in BfTreeProviderParameters and in BfTreeProvider. This is what this PR does.
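The shape described above can be sketched roughly as follows. This is a simplified stand-in, not the PR's code: `SavedParams` is reduced to the single new field, the serde derives are left out to keep the sketch dependency-free, and `resolve_graph_params` is a hypothetical helper showing one way a loader might treat an index saved without graph parameters.

```rust
/// Mirrors the fields listed in the PR; serde derives omitted here.
#[derive(Clone, Debug, PartialEq)]
pub struct GraphParams {
    pub l_build: usize,
    pub alpha: f32,
    pub backedge_ratio: f32,
    pub vector_dtype: String,
}

/// Simplified stand-in for the real SavedParams: only the new optional
/// field is shown, all other fields are omitted.
#[derive(Clone, Debug, PartialEq)]
pub struct SavedParams {
    pub graph_params: Option<GraphParams>,
}

/// Hypothetical helper (an assumption, not in the PR): use the persisted
/// graph parameters when present, otherwise fall back to values supplied
/// by the caller.
pub fn resolve_graph_params(saved: &SavedParams, fallback: GraphParams) -> GraphParams {
    saved.graph_params.clone().unwrap_or(fallback)
}
```

Keeping the field optional is what lets existing tests and doc examples set `graph_params: None`; whether a loader should fall back to defaults or return an error in that case is a design choice the PR text does not settle.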