Increase unit test coverage for diskann-tools crate#763
Conversation
…helper, tracing, gen_associated_data_from_range, data_type, and graph_data_types Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #763 +/- ##
==========================================
+ Coverage 89.00% 89.16% +0.15%
==========================================
Files 428 431 +3
Lines 78417 78739 +322
==========================================
+ Hits 69795 70207 +412
+ Misses 8622 8532 -90
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
|
@arrayka Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub. You can ask me to try again later by mentioning me in a new comment. If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: Sorry for the inconvenience! |
…atibility Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
…unique file paths Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR increases unit test coverage across the diskann-tools crate by adding targeted tests for utility modules, while also making a couple of small implementation adjustments to improve testability and input validation behavior.
Changes:
- Added new unit tests for multiple
diskann-tools/src/utils/*modules (errors, tracing, params, bitmap utilities, recall helpers, data generation). - Made
gen_associated_data_from_rangegeneric overStorageWriteProviderto enable in-memory provider usage in tests. - Fixed
int8/uint8radius validation logic inrandom_data_generator(previous condition was unreachable).
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| diskann-tools/src/utils/tracing.rs | Adds unit tests for init_test_subscriber and guard scoping. |
| diskann-tools/src/utils/search_index_utils.rs | Adds tests for RecallBoundsError display/conversion and KRecallAtN getters. |
| diskann-tools/src/utils/random_data_generator.rs | Fixes radius validation logic and adds broader unit tests (including Fp16). |
| diskann-tools/src/utils/parameter_helper.rs | Adds tests for get_num_threads(Some/None). |
| diskann-tools/src/utils/generate_synthetic_labels_utils.rs | Switches label output to use a storage provider and adds tests using VirtualStorageProvider::new_memory(). |
| diskann-tools/src/utils/gen_associated_data_from_range.rs | Generalizes storage provider type and adds range generation tests. |
| diskann-tools/src/utils/filter_search_utils.rs | Adds tests for SerializableBitSet conversion and bitmap matching edge cases. |
| diskann-tools/src/utils/cmd_tool_error.rs | Adds tests for CMDToolError formatting and From conversions. |
Comments suppressed due to low confidence (2)
diskann-tools/src/utils/random_data_generator.rs:40
- The radius validation error message says the radius cannot be "greater than 127 and less than or equal to 0", but the condition is now
(radius > 127.0 || radius <= 0.0). This message is logically incorrect/misleading for users (e.g., radius=128 triggers it). Consider updating the message (or splitting into two messages) to match the actual validation.
if (data_type == DataType::Int8 || data_type == DataType::Uint8)
&& (radius > 127.0 || radius <= 0.0)
{
return Err(CMDToolError {
details:
"Error: for int8/uint8 datatypes, radius (L2 norm) cannot be greater than 127 and less than or equal to 0"
.to_string(),
});
diskann-tools/src/utils/random_data_generator.rs:75
- Same issue as above in
write_random_data_writer: the validation condition is(radius > 127.0 || radius <= 0.0)but the error text uses "greater than 127 and less than or equal to 0". Updating the message will also require adjusting the unit tests that assert on this string.
if (data_type == DataType::Int8 || data_type == DataType::Uint8)
&& (radius > 127.0 || radius <= 0.0)
{
return Err(CMDToolError {
details:
"Error: for int8/uint8 datatypes, radius (L2 norm) cannot be greater than 127 and less than or equal to 0"
.to_string(),
});
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
This PR significantly improves test coverage for the diskann-tools crate, increasing overall coverage from 52% to 63% with function coverage at 75%.
Changes Made
New Tests Added:
cmd_tool_error.rs (91% coverage, was 0%)
parameter_helper.rs (100% coverage, was 0%)
tracing.rs (70% coverage, was 0%)
gen_associated_data_from_range.rs (98% coverage, was 0%)
filter_search_utils.rs (77% coverage, was 63%)
search_index_utils.rs (68% coverage, was 66%)
random_data_generator.rs (95% coverage, was 77%)
generate_synthetic_labels_utils.rs (92% coverage, was 87%)
Removed Per Review Feedback:
data_type.rsandgraph_data_types.rs(enums and derived traits only)cmd_tool_error.rsFixed for CI Compatibility:
VirtualStorageProvider::new()calls to useVirtualStorageProvider::new_memory()for compatibility withvirtual_storagefeature flag used in CIrelative_contrast.rsandbuild_disk_index.rsgen_associated_data_from_rangefunction generic overStorageWriteProvidertrait to support both FileStorageProvider (CLI usage) and VirtualStorageProvider (tests)Coverage Summary
Overall Metrics:
Modules with 90%+ Coverage (7 total):
Modules Not Covered:
These modules are better suited for integration tests rather than unit tests and would require significant test infrastructure setup.
Known Issues
radius > 127.0 && radius <= 0.0can never be true. This should be fixed in a separate PR (likely should be||instead of&&).Testing
All 61 tests pass successfully:
Tests pass with and without CI features:
cargo test -p diskann-tools --libcargo test -p diskann-tools --lib --features diskann-providers/virtual_storagecargo build -p diskann-tools --bin gen_associated_data_from_rangeCode Quality
-D warningscargo fmt --checkOriginal prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.