text.c: Improve the performance of text rendering #101

Open
TheGag96 wants to merge 2 commits into devkitPro:master from TheGag96:faster-text
TheGag96 commented Jan 10, 2026

...by doing the following:

  • Create caches around `fontGetCharWidthInfo` and `fontCalcGlyphPos` for ASCII characters, because those are pretty slow. (Non-English languages do not get this benefit; perhaps an LRU cache for non-ASCII glyphs could be made to help here.)
  • Instead of doing one draw call per glyph, try to batch them as much as possible.
  • Because the system font is so fragmented (5 glyphs per texture), this requires collecting glyphs and sorting them before drawing batches to avoid costly texture swaps. (citro2d has the function `C2D_TextOptimize` for this exact reason.)
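
The ASCII cache from the first bullet can be sketched as follows. This is a minimal illustration, not the code in this PR: `CachedGlyph`, `slow_glyph_lookup`, and `get_glyph` are hypothetical names, and the stand-in lookup returns made-up values in place of what `fontGetCharWidthInfo` and `fontCalcGlyphPos` would produce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-glyph data; real code would store whatever the
   font queries return for a glyph. */
typedef struct {
    int sheet;  /* texture sheet index */
    int width;  /* advance width in pixels */
} CachedGlyph;

#define ASCII_COUNT 128

static CachedGlyph g_cache[ASCII_COUNT];
static bool        g_cached[ASCII_COUNT];
static int         g_slow_calls;  /* counts slow lookups, for illustration */

/* Stand-in for the slow font queries (values are invented). */
static CachedGlyph slow_glyph_lookup(uint32_t cp)
{
    assert(cp <= 0x10FFFF);  /* valid Unicode code point */
    g_slow_calls++;
    CachedGlyph info = { (int)(cp / 5), 8 };
    return info;
}

/* Returns glyph data, filling the cache lazily for ASCII code points.
   Non-ASCII code points always take the slow path. */
static CachedGlyph get_glyph(uint32_t cp)
{
    if (cp >= ASCII_COUNT)
        return slow_glyph_lookup(cp);
    if (!g_cached[cp]) {
        g_cache[cp]  = slow_glyph_lookup(cp);
        g_cached[cp] = true;
    }
    return g_cache[cp];
}
```

With this shape, repeated ASCII characters cost one slow lookup each, while every non-ASCII character still pays the full price on each use, which is the limitation the first bullet mentions.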

This lets me maintain 60 FPS on my o3DS with 3D turned on in most, but not all, circumstances. With enough glyphs on screen, frames can still drop.
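
To illustrate why sorting before batching helps, here is a rough sketch that counts texture binds for a queue of glyphs before and after sorting by sheet. The `QueuedGlyph` struct and the function names are hypothetical and not taken from text.c; the sketch assumes each queued glyph carries its own position, so reordering does not change what is drawn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical queued glyph: its texture sheet and where to draw it. */
typedef struct {
    int   sheet;
    float x, y;
} QueuedGlyph;

/* qsort comparator: order glyphs by sheet so equal sheets become adjacent. */
static int compare_by_sheet(const void *a, const void *b)
{
    const QueuedGlyph *ga = a, *gb = b;
    return (ga->sheet > gb->sheet) - (ga->sheet < gb->sheet);
}

/* Counts the texture binds a draw loop would need for this order:
   one bind each time the sheet differs from the previous glyph's sheet. */
static size_t count_texture_binds(const QueuedGlyph *glyphs, size_t n)
{
    size_t binds = 0;
    for (size_t i = 0; i < n; i++)
        if (i == 0 || glyphs[i].sheet != glyphs[i - 1].sheet)
            binds++;
    return binds;
}

/* Sorts the queue in place; each run of equal sheets can then be one batch. */
static void sort_glyphs_for_batching(QueuedGlyph *glyphs, size_t n)
{
    assert(glyphs != NULL || n == 0);
    if (n > 1)
        qsort(glyphs, n, sizeof *glyphs, compare_by_sheet);
}
```

For a queue whose sheets alternate 0, 1, 0, 1, 0, the unsorted order needs five binds and the sorted order needs two.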

UPDATE: One more commit from another breakthrough:

As it turns out, the system font texture sheets are all 128x32 pixels and adjacent in memory! We can reinterpret the memory starting at sheet 0 as a much bigger texture that encompasses all of the ASCII glyphs, and make our cache use that instead of the individual sheets. This will massively improve performance by reducing texture swaps within a piece of text, down to 0 if it's all English. No extra linear allocation is needed to do this!
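
The coordinate arithmetic behind treating adjacent sheets as one texture might look like the sketch below. It assumes the sheets end up stacked vertically in the combined texture, with rows counted from the first sheet downward, and it ignores the GPU's actual texel layout and any vertical flip; `TexCoord` and `combined_tex_coord` are hypothetical names.

```c
#include <assert.h>

#define SHEET_WIDTH  128
#define SHEET_HEIGHT 32

typedef struct { float u, v; } TexCoord;

/* Maps a pixel (x, y) on one sheet to normalised coordinates in a texture
   made of `sheet_count` sheets stacked vertically (an assumed layout). */
static TexCoord combined_tex_coord(int sheet, int x, int y, int sheet_count)
{
    assert(sheet_count > 0 && sheet >= 0 && sheet < sheet_count);
    assert(x >= 0 && x <= SHEET_WIDTH && y >= 0 && y <= SHEET_HEIGHT);

    /* Each earlier sheet contributes SHEET_HEIGHT rows above this one. */
    int combined_y      = sheet * SHEET_HEIGHT + y;
    int combined_height = sheet_count * SHEET_HEIGHT;

    TexCoord tc;
    tc.u = (float)x / SHEET_WIDTH;
    tc.v = (float)combined_y / (float)combined_height;
    return tc;
}
```

The horizontal coordinate is unchanged by the merge; only the vertical one is rescaled, because the combined texture keeps the 128-pixel width and grows in height.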

The coalescing applies to all characters / glyph sheets except the last `glyphInfo.nSheets % 32` sheets. This means more operations are done per glyph in `textGetGlyphPosFromCodePoint`, but that is probably offset by the savings from switching textures less often. It also won't matter for English text, whose results are cached.
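
One way to read the `% 32` rule is that sheets are grouped 32 at a time (giving 128x1024 combined textures), and the leftover sheets that do not fill a whole group keep being used individually. The sketch below encodes that reading; the group size, `SheetLocation`, and `locate_sheet` are assumptions for illustration, not the PR's actual code.

```c
#include <assert.h>
#include <stdbool.h>

#define SHEETS_PER_GROUP 32  /* assumption, inferred from the "% 32" rule */

typedef struct {
    bool coalesced;  /* true: sheet is part of a combined texture */
    int  texture;    /* combined-texture index, or the plain sheet index */
    int  row;        /* sheet's slot inside the combined texture (0 if not coalesced) */
} SheetLocation;

/* Decides whether a sheet falls in a full group of 32 or in the leftover tail. */
static SheetLocation locate_sheet(int sheet, int n_sheets)
{
    assert(n_sheets > 0 && sheet >= 0 && sheet < n_sheets);

    int leftover        = n_sheets % SHEETS_PER_GROUP;
    int coalesced_count = n_sheets - leftover;

    SheetLocation loc;
    if (sheet < coalesced_count) {
        loc.coalesced = true;
        loc.texture   = sheet / SHEETS_PER_GROUP;
        loc.row       = sheet % SHEETS_PER_GROUP;
    } else {
        loc.coalesced = false;
        loc.texture   = sheet;
        loc.row       = 0;
    }
    return loc;
}
```

The extra division, remainder, and branch per glyph are the kind of added per-glyph work the paragraph above refers to; for cached ASCII glyphs that work would be done once.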
