perf: optimize drawPixel() (#748)

## Summary

Ref https://github.com/crosspoint-reader/crosspoint-reader/pull/737

This PR further reduce ~25ms from rendering time, testing inside the
Setting screen:

```
master:
[68440] [GFX] Time = 73 ms from clearScreen to displayBuffer

PR:
[97806] [GFX] Time = 47 ms from clearScreen to displayBuffer
```

And in extreme case (fill the entire screen with black or gray color):

```
master:
[1125] [   ] Test fillRectDither drawn in 327 ms
[1347] [   ] Test fillRect drawn in 222 ms

PR:
[1334] [   ] Test fillRectDither drawn in 225 ms
[1455] [   ] Test fillRect drawn in 121 ms
```

Note that
https://github.com/crosspoint-reader/crosspoint-reader/pull/737 is NOT
applied on top of this PR. But with 2 of them combined, it should reduce
from 47ms --> 42ms

## Details

This PR based on the fact that function calls are costly if the function
is small enough. For example, this simple call:

```
  int rotatedX = 0;
  int rotatedY = 0;
  rotateCoordinates(x, y, &rotatedX, &rotatedY);
```

Generated assembly code:

<img width="771" height="215" alt="image"
src="https://github.com/user-attachments/assets/37991659-3304-41c3-a3b2-fb967da53f82"
/>

This adds ~10 instructions just to prepare the registers prior to the
function call, plus some more instructions for the function's
epilogue/prologue. Inlining it removing all of these:

<img width="1471" height="832" alt="image"
src="https://github.com/user-attachments/assets/b67a22ee-93ba-4017-88ed-c973e28ec914"
/>

Of course, this optimization is not magic. It's only beneficial under 3
conditions:
- The function is small, not in size, but in terms of effective
instructions. For example, the `rotateCoordinates` is simply a jump
table, where each branch is just 3-4 inst
- The function has multiple input arguments, which requires some move to
put it onto the correct place
- The function is called very frequently (i.e. critical path)

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? **NO**
This commit is contained in:
Xuan-Son Nguyen
2026-02-08 19:05:42 +01:00
committed by Dave Allie
parent 4f0a3aa4dd
commit 6909f127b4
3 changed files with 52 additions and 55 deletions

View File

@@ -33,12 +33,12 @@ class GfxRenderer {
RenderMode renderMode;
Orientation orientation;
bool fadingFix;
uint8_t* frameBuffer = nullptr;
uint8_t* bwBufferChunks[BW_BUFFER_NUM_CHUNKS] = {nullptr};
std::map<int, EpdFontFamily> fontMap;
void renderChar(const EpdFontFamily& fontFamily, uint32_t cp, int* x, const int* y, bool pixelState,
EpdFontFamily::Style style) const;
void freeBwBufferChunks();
void rotateCoordinates(int x, int y, int* rotatedX, int* rotatedY) const;
template <Color color>
void drawPixelDither(int x, int y) const;
template <Color color>
@@ -55,6 +55,7 @@ class GfxRenderer {
static constexpr int VIEWABLE_MARGIN_LEFT = 3;
// Setup
void begin(); // must be called right after display.begin()
void insertFont(int fontId, EpdFontFamily font);
// Orientation control (affects logical width/height and coordinate transforms)
@@ -72,6 +73,7 @@ class GfxRenderer {
// void displayWindow(int x, int y, int width, int height) const;
void invertScreen() const;
void clearScreen(uint8_t color = 0xFF) const;
void getOrientedViewableTRBL(int* outTop, int* outRight, int* outBottom, int* outLeft) const;
// Drawing
void drawPixel(int x, int y, bool state = true) const;
@@ -125,6 +127,4 @@ class GfxRenderer {
// Low level functions
uint8_t* getFrameBuffer() const;
static size_t getBufferSize();
void grayscaleRevert() const;
void getOrientedViewableTRBL(int* outTop, int* outRight, int* outBottom, int* outLeft) const;
};