## Summary

**What is the goal of this PR?**

Replace the linear scan in `lookupHtmlEntity` with a simple binary search to improve lookup performance.

**What changes are included?**

`lib/Epub/Epub/Entities/htmlEntities.cpp`:
- Sorted the `ENTITY_LOOKUP` array.
- Added a compile-time assertion to guarantee the array remains sorted.
- Rewrote `lookupHtmlEntity` to use a binary search.

## Additional Context

Benchmarked on my x64 laptop (results will probably differ on RISC-V):

```
=== Benchmark (53 entities x 10000 iterations) ===
Version          Total time          Avg per lookup
----------------------------------------------
linear           236.97 ms total     447.11 ns/lookup
binary search     22.09 ms total      41.68 ns/lookup

=== Summary ===
Binary search is 10.73x faster than linear scan.
```

This is a simplified alternative to #1180, focused on keeping the implementation clean and maintainable.

### AI Usage

Did you use AI tools to help write this code? _**< NO >**_

---------

Co-authored-by: Zach Nelson <zach@zdnelson.com>
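
The three changes listed in the summary (sorted table, compile-time sortedness check, binary search) can be sketched roughly as follows. This is not the code from `htmlEntities.cpp`: the table contents, the `EntityPair` struct, and the names `kEntities`, `compareNames`, `isSorted`, `compareKey`, and `lookupEntitySketch` are all hypothetical and chosen for illustration only. It assumes C++14 or later so that `constexpr` functions may contain loops.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical table entry: entity text (including '&' and ';') and its UTF-8 value.
struct EntityPair {
  const char* name;
  const char* value;
};

// Abbreviated, illustrative table. It must be sorted by byte-wise order of `name`.
constexpr EntityPair kEntities[] = {
    {"&amp;", "&"},
    {"&copy;", "\xC2\xA9"},
    {"&gt;", ">"},
    {"&lt;", "<"},
    {"&nbsp;", "\xC2\xA0"},
    {"&quot;", "\""},
};
constexpr std::size_t kEntityCount = sizeof(kEntities) / sizeof(kEntities[0]);

// constexpr stand-in for strcmp on two NUL-terminated strings.
constexpr int compareNames(const char* a, const char* b) {
  while (*a != '\0' && *a == *b) {
    ++a;
    ++b;
  }
  return static_cast<unsigned char>(*a) - static_cast<unsigned char>(*b);
}

// True if every name is strictly greater than the one before it.
constexpr bool isSorted(const EntityPair* table, std::size_t count) {
  for (std::size_t i = 1; i < count; ++i) {
    if (compareNames(table[i - 1].name, table[i].name) >= 0) return false;
  }
  return true;
}

// Compilation fails if someone adds an entry out of order.
static_assert(isSorted(kEntities, kEntityCount), "entity table must stay sorted by name");

// Three-way compare of a length-delimited key against a NUL-terminated name,
// using the same byte-wise ordering as compareNames.
inline int compareKey(const char* key, std::size_t len, const char* name) {
  for (std::size_t i = 0; i < len; ++i) {
    if (name[i] == '\0') return 1;  // name is a proper prefix of key
    const unsigned char k = static_cast<unsigned char>(key[i]);
    const unsigned char n = static_cast<unsigned char>(name[i]);
    if (k != n) return k < n ? -1 : 1;
  }
  return name[len] == '\0' ? 0 : -1;  // equal, or key is a proper prefix of name
}

// Binary search over the sorted table; returns nullptr when the entity is unknown.
inline const char* lookupEntitySketch(const char* entity, std::size_t len) {
  std::size_t lo = 0;
  std::size_t hi = kEntityCount;
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    const int cmp = compareKey(entity, len, kEntities[mid].name);
    if (cmp == 0) return kEntities[mid].value;
    if (cmp < 0) {
      hi = mid;
    } else {
      lo = mid + 1;
    }
  }
  return nullptr;
}
```

With a table of 53 entries, as in the benchmark above, a search like this needs at most 6 comparisons per lookup, whereas a linear scan needs up to 53. The strict `>= 0` test in `isSorted` also rejects duplicate names, which is a design choice of this sketch rather than something the PR states.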
10 lines
370 B
C++
// based on
// https://github.com/atomic14/diy-esp32-epub-reader/blob/2c2f57fdd7e2a788d14a0bcb26b9e845a47aac42/lib/Epub/RubbishHtmlParser/htmlEntities.cpp

#pragma once
#include <string>

// Lookup a single HTML entity (including & and ;) and return its UTF-8 value
// Returns nullptr if entity is not found
const char* lookupHtmlEntity(const char* entity, size_t len);