Files
crosspoint-reader-mod/lib/Epub/Epub.h
Uri Tauber 30d8a8d011 feat: slim footnotes support (#1031)
## Summary
**What is the goal of this PR?** Implement support for footnotes in epub
files.
It is based on #553, but simplified — removed the parts which
complicated the code and burden the CPU/RAM. This version supports basic
footnotes and lets the user jump from location to location inside the
epub.

**What changes are included?**
- `FootnoteEntry` struct — A small POD struct (number[24], href[64])
shared between parser, page storage, and UI.
- Parser: `<a href>` detection (`ChapterHtmlSlimParser`) — During a
single parsing pass, internal epub links are detected and collected as
footnotes. The link text is underlined to hint navigability.
Bracket/whitespace normalization is applied to the display label (e.g.
[1] → 1).
- Footnote-to-page assignment (`ChapterHtmlSlimParser`, `Page`) —
Footnotes are attached to the exact page where their anchor word
appears, tracked via a cumulative word counter during layout, surviving
paragraph splits and the 750-word mid-paragraph safety flush.
- Page serialization (`Page`, `Section`) — Footnotes are
serialized/deserialized per page (max 16 per page). Section cache
version bumped to 14 to force a clean rebuild.
- Href → spine resolution (`Epub`) — `resolveHrefToSpineIndex()` maps an
href (e.g. `chapter2.xhtml#note1`) to its spine index by filename
matching.
- Footnotes menu + activity (`EpubReaderMenuActivity`,
`EpubReaderFootnotesActivity`) — A new "Footnotes" entry in the reader
menu lists all footnote links found on the current page. The user
scrolls and selects to navigate.
- Navigate & restore (`EpubReaderActivity`) — `navigateToHref()` saves
the current spine index and page number, then jumps to the target. The
Back button restores the saved position when the user is done reading
the footnote.

  **Additional Context**

**What was removed vs #553:** virtual spine items
(`addVirtualSpineItem`, `isVirtualSpineItem`), two-pass parsing,
`<aside>` content extraction to temp HTML files, `<p class="note">`
paragraph note extraction, `replaceHtmlEntities` (master already has
`lookupHtmlEntity`), `footnotePages` / `buildFilteredChapterList`,
`noterefCallback` / `Noteref` struct, and the stack size increase from 8
KB to 24 KB (not needed without two-pass parsing and virtual file I/O on
the render task).
 
**Performance:** Single-pass parsing. No new heap allocations in the hot
path — footnote text is collected into fixed stack buffers (char[24],
char[64]). Active runtime memory is ~2.8 KB worst-case (one page × 16
footnotes × 88 bytes, mirrored in `currentPageFootnotes`). Flash usage
is unchanged at 97.4%; RAM stays at 31%.
   
**Known limitations:** When clicking a footnote, it jumps to the start
of the HTML file instead of the specific anchor. This could be
problematic for books that don't have separate files for each footnote.
(no element-id-to-page mapping yet - will be another PR soon).

---

### AI Usage

Did you use AI tools to help write this code? _**< PARTIALLY>**_
Claude Opus 4.6 was used to do most of the migration, I checked manually
its work, and fixed some stuff, but I haven't review all the changes
yet, so feedback is welcomed.

---------

Co-authored-by: Arthur Tazhitdinov <lisnake@gmail.com>
2026-02-26 08:47:34 -06:00

77 lines
2.8 KiB
C++

#pragma once
#include <Print.h>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>
#include "Epub/BookMetadataCache.h"
#include "Epub/css/CssParser.h"
class ZipFile;
class Epub {
// the ncx file (EPUB 2)
std::string tocNcxItem;
// the nav file (EPUB 3)
std::string tocNavItem;
// where is the EPUBfile?
std::string filepath;
// the base path for items in the EPUB file
std::string contentBasePath;
// Uniq cache key based on filepath
std::string cachePath;
// Spine and TOC cache
std::unique_ptr<BookMetadataCache> bookMetadataCache;
// CSS parser for styling
std::unique_ptr<CssParser> cssParser;
// CSS files
std::vector<std::string> cssFiles;
bool findContentOpfFile(std::string* contentOpfFile) const;
bool parseContentOpf(BookMetadataCache::BookMetadata& bookMetadata);
bool parseTocNcxFile() const;
bool parseTocNavFile() const;
void parseCssFiles() const;
public:
explicit Epub(std::string filepath, const std::string& cacheDir) : filepath(std::move(filepath)) {
// create a cache key based on the filepath
cachePath = cacheDir + "/epub_" + std::to_string(std::hash<std::string>{}(this->filepath));
}
~Epub() = default;
std::string& getBasePath() { return contentBasePath; }
bool load(bool buildIfMissing = true, bool skipLoadingCss = false);
bool clearCache() const;
void setupCacheDir() const;
const std::string& getCachePath() const;
const std::string& getPath() const;
const std::string& getTitle() const;
const std::string& getAuthor() const;
const std::string& getLanguage() const;
std::string getCoverBmpPath(bool cropped = false) const;
bool generateCoverBmp(bool cropped = false) const;
std::string getThumbBmpPath() const;
std::string getThumbBmpPath(int height) const;
bool generateThumbBmp(int height) const;
uint8_t* readItemContentsToBytes(const std::string& itemHref, size_t* size = nullptr,
bool trailingNullByte = false) const;
bool readItemContentsToStream(const std::string& itemHref, Print& out, size_t chunkSize) const;
bool getItemSize(const std::string& itemHref, size_t* size) const;
BookMetadataCache::SpineEntry getSpineItem(int spineIndex) const;
BookMetadataCache::TocEntry getTocItem(int tocIndex) const;
int getSpineItemsCount() const;
int getTocItemsCount() const;
int getSpineIndexForTocIndex(int tocIndex) const;
int getTocIndexForSpineIndex(int spineIndex) const;
size_t getCumulativeSpineItemSize(int spineIndex) const;
int getSpineIndexForTextReference() const;
size_t getBookSize() const;
float calculateProgress(int currentSpineIndex, float currentSpineRead) const;
CssParser* getCssParser() const { return cssParser.get(); }
int resolveHrefToSpineIndex(const std::string& href) const;
};