text.c: Improve the performance of text rendering#101
Open
TheGag96 wants to merge 2 commits into devkitPro:master
...by doing the following:

- Create caches around `fontGetCharWidthInfo` and `fontCalcGlyphPos` for ASCII characters, because those are pretty slow. (Non-English languages do not get this benefit - perhaps an LRU cache for non-ASCII glyphs could be made to help here.)
- Instead of doing one draw call per glyph, try to batch them as much as possible.
- Because the system font is so fragmented (5 glyphs per texture), this requires collecting glyphs and sorting them before drawing batches to avoid costly texture swaps. (citro2d has the function `C2D_TextOptimize` for this exact reason.)

This lets me maintain 60 FPS on my o3DS with the 3D turned on in most but not all circumstances. Enough glyphs on screen can still cause dropped frames.
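To illustrate the two ideas above, here is a minimal sketch in plain C. It is not the code from this PR: `SketchGlyph`, `slow_lookup`, `get_glyph`, `collect_glyphs`, `by_sheet` and `count_texture_binds` are all hypothetical names, and `slow_lookup` is only a stand-in for the real `fontGetCharWidthInfo` / `fontCalcGlyphPos` calls, with made-up widths and the "5 glyphs per texture" layout assumed from the description. It shows a lazily filled ASCII cache and how sorting collected glyphs by sheet reduces the number of texture binds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-glyph record. Each glyph carries its own pen position,
 * so reordering the array does not change where anything is drawn. */
typedef struct {
    int sheet;  /* index of the texture sheet holding the glyph */
    int width;  /* advance width in pixels (made-up value in this sketch) */
    int x;      /* pen position, filled in while collecting */
} SketchGlyph;

/* Stand-in for the slow font queries. Assumes 5 glyphs per sheet.
 * The call counter only exists to make the effect of the cache visible. */
static int slow_lookup_calls = 0;

static SketchGlyph slow_lookup(unsigned codePoint)
{
    slow_lookup_calls++;
    SketchGlyph g = { (int)(codePoint / 5), 8, 0 };
    return g;
}

/* Lazily filled cache for the 128 ASCII code points. */
static SketchGlyph ascii_cache[128];
static bool ascii_cached[128];

static SketchGlyph get_glyph(unsigned codePoint)
{
    if (codePoint >= 128)
        return slow_lookup(codePoint); /* non-ASCII: no cache in this sketch */
    if (!ascii_cached[codePoint]) {
        ascii_cache[codePoint] = slow_lookup(codePoint);
        ascii_cached[codePoint] = true;
    }
    return ascii_cache[codePoint];
}

/* Collect glyphs for a byte string instead of drawing them one by one. */
static size_t collect_glyphs(const char *text, SketchGlyph *out, size_t cap)
{
    assert(text != NULL && out != NULL);
    size_t n = 0;
    int pen = 0;
    for (; *text != '\0' && n < cap; text++) {
        SketchGlyph g = get_glyph((unsigned char)*text);
        g.x = pen;
        pen += g.width;
        out[n++] = g;
    }
    return n;
}

/* qsort comparator: order glyphs by the sheet they live on. */
static int by_sheet(const void *a, const void *b)
{
    const SketchGlyph *ga = a;
    const SketchGlyph *gb = b;
    return (ga->sheet > gb->sheet) - (ga->sheet < gb->sheet);
}

/* Number of texture binds a draw loop would need for this glyph order:
 * one bind each time the sheet differs from the previous glyph's sheet. */
static int count_texture_binds(const SketchGlyph *glyphs, size_t n)
{
    int binds = 0;
    for (size_t i = 0; i < n; i++)
        if (i == 0 || glyphs[i].sheet != glyphs[i - 1].sheet)
            binds++;
    return binds;
}
```

A caller would run `collect_glyphs`, then `qsort(glyphs, n, sizeof glyphs[0], by_sheet)`, and then emit one batch per run of equal `sheet` values. With the stand-in layout, a string that alternates between two sheets needs one bind per glyph before sorting and only two binds afterwards.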
UPDATE: One more commit from another breakthrough:
As it turns out, the system font texture sheets are all 128x32 pixels and adjacent in memory! We can reinterpret the memory starting at sheet 0 and describe a much bigger texture that encompasses all of the ASCII glyphs, and make our cache use that instead of the individual sheets. This will massively improve performance by reducing texture swaps within a piece of text, down to 0 if it's all English. We don't need any extra linear allocating to do this!
The coalescing will be applied to all characters / glyph sheets up until the last `glyphInfo.nSheets % 32` sheets. This means that there are more operations per glyph being done in `textGetGlyphPosFromCodePoint`, but this is probably offset by the savings from not switching textures as often. And, this won't matter for English text, which has these results cached.
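The extra per-glyph work could look roughly like the sketch below. This is an illustration under assumptions, not the PR's implementation: `SheetLocation` and `locate_sheet` are hypothetical names, and the grouping of 32 sheets per combined texture is inferred only from the `% 32` above (32 sheets of 32 rows would give a 128x1024 texture). The trailing `nSheets % 32` sheets are left as individual 128x32 textures.

```c
#include <assert.h>

enum {
    SHEET_HEIGHT_PX  = 32, /* per the description: each sheet is 128x32 */
    SHEETS_PER_GROUP = 32  /* assumption, inferred from the "% 32" above */
};

/* Hypothetical result type: which texture a sheet ends up in, and where. */
typedef struct {
    int coalesced;       /* 1 if the sheet is part of a combined texture */
    int textureIndex;    /* group index if coalesced, else the sheet index */
    int yOffsetPx;       /* first row of the sheet inside that texture */
    int textureHeightPx; /* height of the texture the glyph is sampled from */
} SheetLocation;

static SheetLocation locate_sheet(int sheetIndex, int nSheets)
{
    assert(sheetIndex >= 0 && sheetIndex < nSheets);

    /* Everything before the last (nSheets % 32) sheets gets coalesced. */
    int coalescedSheets = nSheets - (nSheets % SHEETS_PER_GROUP);
    SheetLocation loc;

    if (sheetIndex < coalescedSheets) {
        loc.coalesced       = 1;
        loc.textureIndex    = sheetIndex / SHEETS_PER_GROUP;
        loc.yOffsetPx       = (sheetIndex % SHEETS_PER_GROUP) * SHEET_HEIGHT_PX;
        loc.textureHeightPx = SHEETS_PER_GROUP * SHEET_HEIGHT_PX;
    } else {
        /* Remainder sheets stay as standalone textures. */
        loc.coalesced       = 0;
        loc.textureIndex    = sheetIndex;
        loc.yOffsetPx       = 0;
        loc.textureHeightPx = SHEET_HEIGHT_PX;
    }
    return loc;
}
```

A glyph's vertical texture coordinates would then need to be offset by `yOffsetPx` and rescaled against `textureHeightPx` rather than the 32-pixel sheet height, which is the kind of additional arithmetic the paragraph above refers to.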