[WIP] feat: add automatic precision checking for backward pass #101
Closed
chen2021673 wants to merge 3 commits into InfiniTensor:master from
Add PrecisionCheckLevel enum (NONE/FUNCTION/MODULE) to GlobalEnv for fine-grained control of precision checking. Register backward hooks for both Function and Module levels to check gradients during backward pass.

Key changes:
- Add PrecisionCheckLevel to GlobalEnv with env var support (INFINI_PRECISION_CHECK=module/function)
- Register backward pre/post hooks in Module::operator() on output tensor grad_fn
- Register precision check hooks in Function::Apply based on precision level
- Add HookHandle base class definition
- Update Module::Forward call syntax from Forward() to operator()
- Simplify CheckTensors output format

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
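For readers who want a concrete picture of the level selection this commit describes, here is a minimal sketch of mapping the INFINI_PRECISION_CHECK value to a three-state enum. The enum name and the accepted values (`module`, `function`) come from the commit message. The function names, the case-insensitive matching, and the fallback to NONE for unknown values are assumptions made for illustration, not the code in this PR.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Stand-in for the enum the commit adds to GlobalEnv (levels taken from the commit message).
enum class PrecisionCheckLevel { NONE, FUNCTION, MODULE };

// Hypothetical helper: map a textual setting to a level.
// Assumption: matching is case-insensitive and unknown or empty values mean NONE.
PrecisionCheckLevel ParsePrecisionCheckLevel(std::string value) {
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (value == "function") {
        return PrecisionCheckLevel::FUNCTION;
    }
    if (value == "module") {
        return PrecisionCheckLevel::MODULE;
    }
    return PrecisionCheckLevel::NONE;
}

// Hypothetical helper: read the level from the INFINI_PRECISION_CHECK environment variable.
// An unset variable disables checking.
PrecisionCheckLevel PrecisionCheckLevelFromEnv() {
    const char *raw = std::getenv("INFINI_PRECISION_CHECK");
    return raw != nullptr ? ParsePrecisionCheckLevel(raw) : PrecisionCheckLevel::NONE;
}
```

Keeping the string parsing separate from the environment lookup means the same mapping can be reused if the setting later arrives from another source.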
…te hook types

- Replace environment variables (INFINI_PRECISION_CHECK, INFINI_PRECISION_CHECK_ALL_RANKS) with command-line flags (--precision_check, --precision_check_all_ranks)
- Move hook type definitions from global scope into Function class to eliminate duplication between function.h and function_hook.h
- Update GlobalEnv to accept precision check parameters and propagate through InitAllEnv
- Update precision_checker to use GlobalEnv instead of getenv()
- Add gflags definitions to gpt2 and llama3 examples
- Fix module operator() calls to use correct syntax

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
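The move from getenv() to parameters passed through a global environment object can be sketched as below. Everything here is hypothetical: `EnvSketch`, `PrecisionCheckSettings`, and `ShouldCheck` are illustrative names, not the project's GlobalEnv interface. The rule that only rank 0 checks when `--precision_check_all_ranks` is off is an assumption inferred from the flag name. Flag parsing itself (gflags in the examples) is left out so the sketch depends only on the standard library.

```cpp
#include <string>

// Hypothetical settings holder filled from the parsed command-line flags.
struct PrecisionCheckSettings {
    std::string level = "none";  // value of --precision_check
    bool all_ranks = false;      // value of --precision_check_all_ranks
};

// Hypothetical stand-in for a process-wide environment object.
class EnvSketch {
public:
    static EnvSketch &Instance() {
        static EnvSketch env;
        return env;
    }

    // Called once at startup, after the command line has been parsed.
    void Init(const PrecisionCheckSettings &settings) { settings_ = settings; }

    // Consumers query the environment object instead of calling getenv().
    // Assumption: with all_ranks off, only rank 0 performs checks.
    bool ShouldCheck(int rank) const {
        if (settings_.level == "none") {
            return false;
        }
        return settings_.all_ranks || rank == 0;
    }

private:
    EnvSketch() = default;
    PrecisionCheckSettings settings_;
};
```

Routing the values through one initialization call keeps the checker free of direct environment access, which is the direction the commit describes.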
Add automatic precision checking for backward pass via INFINI_PRECISION_CHECK environment variable. Hooks are registered on gradient functions to detect NaN/Inf and monitor gradient statistics during backpropagation.
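As an illustration of the kind of check a backward hook could run on a gradient, the sketch below counts NaN/Inf entries and computes simple statistics over the finite values. It operates on a plain `std::vector<float>` as a stand-in for a gradient tensor. `GradStats` and `SummarizeGradient` are hypothetical names, and the choice of statistics (min, max, mean) is an assumption; the PR's actual CheckTensors output may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of one gradient buffer.
struct GradStats {
    std::size_t nan_count = 0;
    std::size_t inf_count = 0;
    float min = 0.0f;   // minimum over finite values (0 if there are none)
    float max = 0.0f;   // maximum over finite values (0 if there are none)
    double mean = 0.0;  // mean over finite values (0 if there are none)

    bool HasNonFinite() const { return nan_count + inf_count > 0; }
};

// Scan the buffer once, counting non-finite entries and accumulating
// statistics over the remaining finite values.
GradStats SummarizeGradient(const std::vector<float> &grad) {
    GradStats stats;
    std::size_t finite_count = 0;
    double sum = 0.0;
    for (float v : grad) {
        if (std::isnan(v)) {
            ++stats.nan_count;
            continue;
        }
        if (std::isinf(v)) {
            ++stats.inf_count;
            continue;
        }
        if (finite_count == 0) {
            stats.min = v;
            stats.max = v;
        } else {
            if (v < stats.min) stats.min = v;
            if (v > stats.max) stats.max = v;
        }
        sum += v;
        ++finite_count;
    }
    if (finite_count > 0) {
        stats.mean = sum / static_cast<double>(finite_count);
    }
    return stats;
}
```

A hook would call something like `SummarizeGradient` on each gradient it receives and report when `HasNonFinite()` returns true.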
