fix: pass full sequence to stopping criteria in generation loop #104
Closed
Jameswlepage wants to merge 1 commit into CodeWithKyrian:main
Conversation
The stopping criteria was incorrectly receiving `$generatedInputIds` (only the newly generated token from the current step) instead of `$allInputIds` (the full sequence including the prompt and all generated tokens). This caused `MaxLengthCriteria` to never trigger, because it was always checking a sequence of length 1, which would never exceed `max_length`.
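The bug can be sketched as follows. This is a minimal stand-in for the generation loop, not the library's actual code; only the variable names `$allInputIds` / `$generatedInputIds` and the `MaxLengthCriteria` length check come from this PR, and the sampling step is a hypothetical placeholder.

```php
<?php
// Simplified length-based criterion, mirroring what MaxLengthCriteria checks.
$maxLength = 5;
$stoppingCriteria = fn(array $inputIds): bool => count($inputIds) >= $maxLength;

$promptIds = [101, 2023];   // hypothetical prompt token ids
$allInputIds = $promptIds;  // full sequence: prompt + all generated tokens

while (true) {
    // Stand-in for sampling one token from the model.
    $generatedInputIds = [rand(0, 30000)];
    $allInputIds[] = $generatedInputIds[0];

    // Bug: passing $generatedInputIds meant the criterion always saw a
    // sequence of length 1 and never fired. Fix: pass $allInputIds.
    if ($stoppingCriteria($allInputIds)) {
        break;
    }
}

echo count($allInputIds), "\n"; // 5 — generation stops at max_length
```

With the buggy call (`$stoppingCriteria($generatedInputIds)`) the loop above would never terminate, which matches the runaway-generation behavior described in the PR.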
What:

Pass the full sequence to the stopping criteria in the generation loop.

Description:

The stopping criteria in the generation loop was incorrectly receiving `$generatedInputIds` (only the newly generated token from the current step) instead of `$allInputIds` (the full sequence including the prompt and all generated tokens). This caused `MaxLengthCriteria` to never trigger based on sequence length, because it was always checking a sequence of length 1, which would never exceed `max_length`. As a result, text generation would run indefinitely until hitting memory limits or an EOS token.

The Fix:

The generation loop now passes `$allInputIds` to the stopping criteria instead of `$generatedInputIds`.

Testing:

Verified that the `maxNewTokens` parameter now correctly limits generation length.
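The failure mode can also be shown in isolation. Below, a simplified length criterion (an assumption, not the library's `MaxLengthCriteria` class itself) is applied to both the single newest token and the full sequence:

```php
<?php
// Simplified stand-in for a max-length stopping check.
$maxLength = 20;
$exceeds = fn(array $ids): bool => count($ids) >= $maxLength;

$allInputIds = range(1, 50);               // 50 tokens: prompt + generated
$generatedInputIds = [end($allInputIds)];  // only the newest token

var_dump($exceeds($generatedInputIds)); // bool(false) — length 1 never triggers
var_dump($exceeds($allInputIds));       // bool(true)  — full sequence triggers
```

Checking the length-1 array can never succeed for any `max_length` greater than 1, which is why generation only stopped on EOS or memory exhaustion before this fix.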