Files
crosspoint-reader-mod/lib/Epub/Epub/htmlEntities.h
Uri Tauber 1abe307f20 perf: Optimize HTML entities lookup to O(log(n)) (#1194)
## Summary

**What is the goal of this PR?** Replace the linear scan of
`lookupHtmlEntity` with a simple binary search to improve lookup
performance.

**What changes are included?**
`lib/Epub/Epub/Entities/htmlEntities.cpp`: 
 - Sorted the `ENTITY_LOOKUP` array.
 - Added a compile-time assertion to guarantee the array remains sorted.
 - Rewrote `lookupHtmlEntity` to use a binary search.

## Additional Context

Benchmarked on my x64 laptop (probably will be different on RISC-V)
```
=== Benchmark (53 entities x 10000 iterations) ===

Version           Total time   Avg per lookup
----------------------------------------------
linear          236.97 ms total     447.11 ns/lookup
binary search    22.09 ms total      41.68 ns/lookup

=== Summary ===

Binary search is 10.73x faster than linear scan.
```

This is a simplified alternative to #1180, focused on keeping the
implementation clean, and maintainable.

### AI Usage


Did you use AI tools to help write this code? _**< NO >**_

---------

Co-authored-by: Zach Nelson <zach@zdnelson.com>
2026-02-26 12:55:31 -06:00

10 lines
370 B
C++

// based on
// https://github.com/atomic14/diy-esp32-epub-reader/blob/2c2f57fdd7e2a788d14a0bcb26b9e845a47aac42/lib/Epub/RubbishHtmlParser/htmlEntities.cpp
#pragma once
#include <string>
// Lookup a single HTML entity (including & and ;) and return its UTF-8 value
// Returns nullptr if entity is not found
const char* lookupHtmlEntity(const char* entity, size_t len);