feat: dict based Hyphenation (#305)
## Summary

* Adds (optional) hyphenation for English, French, German, and Russian.

## Additional Context

* The included hyphenation dictionaries add approximately 280 KB to flash usage (German alone takes 200 KB).
* The trie-encoded dictionaries are adopted from the hypher project (https://github.com/typst/hypher).
* Soft hyphens (and other explicit hyphens) take precedence over dict-based hyphenation. Overall, the hyphenation rules are quite aggressive, as I believe that makes more sense on our smaller screen.

---------

Co-authored-by: Dave Allie <dave@daveallie.com>
parent 5fef99c641
commit 8824c87490
2
.gitignore
vendored
@ -4,3 +4,5 @@
.vscode
lib/EpdFont/fontsrc
*.generated.h
build
**/__pycache__/
66
docs/hyphenation-trie-format.md
Normal file
@ -0,0 +1,66 @@
# Hypher Binary Tries

CrossPoint embeds the exact binary automata produced by
[Typst's `hypher`](https://github.com/typst/hypher).

## File layout

Each `.bin` blob is a single self-contained automaton:

```
uint32_t root_addr_be; // big-endian offset of the root node
uint8_t levels[];      // shared "levels" tape (dist/score pairs)
uint8_t nodes[];       // node records packed back-to-back
```

The size of the `levels` tape is implicit. Individual nodes reference slices
inside that tape via 12-bit offsets, so no additional pointers are required.

### Node encoding

Every node starts with a single control byte:

- Bit 7 – set when the node stores scores (`levels`).
- Bits 5-6 – stride of the target deltas (1, 2, or 3 bytes, big-endian).
- Bits 0-4 – transition count (values ≥ 31 spill into an extra byte).

If the `levels` flag is set, two more bytes follow. Together they encode a
12-bit offset into the global `levels` tape and a 4-bit length. Each byte in the
levels tape packs a distance/score pair as `dist * 10 + score`, where `dist`
counts how many UTF-8 bytes we advanced since the previous digit.

After the optional levels header come the transition labels (one byte per edge),
followed by the signed target deltas. Targets are stored as relative offsets
from the current node address. Deltas up to ±128 fit in a single byte; larger
distances grow to 2 or 3 bytes. The runtime walks the transitions with a simple
linear scan and materializes the absolute address by adding the decoded delta
to the current node's base.

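For illustration only, here is a minimal sketch of unpacking that control byte. The
struct and function names are hypothetical; the firmware's real decoder is
`decodeState()` in `lib/Epub/Epub/hyphenation/LiangHyphenation.cpp`.

```
#include <cstddef>
#include <cstdint>

// Sketch: decode the node header described above (names are illustrative).
struct NodeHeader {
  bool hasLevels;     // bit 7: node carries packed dist/score pairs
  uint8_t stride;     // bits 5-6: bytes per target delta (0 is read as 1)
  size_t childCount;  // bits 0-4: transition count, 31 spills into an extra byte
};

NodeHeader decodeHeader(const uint8_t* node, size_t& pos) {
  const uint8_t control = node[pos++];
  NodeHeader h;
  h.hasLevels = (control >> 7) != 0;
  h.stride = (control >> 5) & 0x03u;
  if (h.stride == 0) h.stride = 1;
  h.childCount = control & 0x1Fu;
  if (h.childCount == 31) h.childCount = node[pos++];  // overflow byte
  return h;
}
```

As a worked example of the levels packing, a tape byte with value `31` decodes to
`dist = 3` and `score = 1`.
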
## Embedding blobs into the firmware

The helper script `scripts/generate_hyphenation_trie.py` acts as a thin
wrapper: it reads the hypher-generated `.bin` files, formats them as `constexpr`
byte arrays, and emits headers under
`lib/Epub/Epub/hyphenation/generated/`. Each header defines the raw data plus a
`SerializedHyphenationPatterns` descriptor so the reader can keep the automaton
in flash.

To refresh the firmware assets after updating the `.bin` files, run:

```
./scripts/generate_hyphenation_trie.py \
  --input lib/Epub/Epub/hyphenation/tries/en.bin \
  --output lib/Epub/Epub/hyphenation/generated/hyph-en.trie.h

./scripts/generate_hyphenation_trie.py \
  --input lib/Epub/Epub/hyphenation/tries/fr.bin \
  --output lib/Epub/Epub/hyphenation/generated/hyph-fr.trie.h

./scripts/generate_hyphenation_trie.py \
  --input lib/Epub/Epub/hyphenation/tries/de.bin \
  --output lib/Epub/Epub/hyphenation/generated/hyph-de.trie.h

./scripts/generate_hyphenation_trie.py \
  --input lib/Epub/Epub/hyphenation/tries/ru.bin \
  --output lib/Epub/Epub/hyphenation/generated/hyph-ru.trie.h
```
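Each emitted header looks roughly like the sketch below. The identifiers and the
include path are illustrative (the real names come from the generator); the
descriptor simply wraps a pointer to the blob and its size, which is what
`LiangHyphenation.cpp` consumes via `patterns.data` and `patterns.size`.

```
// Illustrative shape of a generated header; actual names and bytes are emitted by the script.
#pragma once

#include "../SerializedHyphenationTrie.h"  // assumed location of SerializedHyphenationPatterns

constexpr uint8_t en_us_trie_data[] = {
    0x00, 0x00, 0x12, 0x34,  // placeholder bytes, not real trie content
    /* ... rest of the hypher blob ... */
};

constexpr SerializedHyphenationPatterns en_us_patterns{en_us_trie_data, sizeof(en_us_trie_data)};
```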
@ -74,6 +74,7 @@ bool Epub::parseContentOpf(BookMetadataCache::BookMetadata& bookMetadata) {
|
||||
// Grab data from opfParser into epub
|
||||
bookMetadata.title = opfParser.title;
|
||||
bookMetadata.author = opfParser.author;
|
||||
bookMetadata.language = opfParser.language;
|
||||
bookMetadata.coverItemHref = opfParser.coverItemHref;
|
||||
bookMetadata.textReferenceHref = opfParser.textReferenceHref;
|
||||
|
||||
@ -348,6 +349,15 @@ const std::string& Epub::getAuthor() const {
|
||||
return bookMetadataCache->coreMetadata.author;
|
||||
}
|
||||
|
||||
const std::string& Epub::getLanguage() const {
|
||||
static std::string blank;
|
||||
if (!bookMetadataCache || !bookMetadataCache->isLoaded()) {
|
||||
return blank;
|
||||
}
|
||||
|
||||
return bookMetadataCache->coreMetadata.language;
|
||||
}
|
||||
|
||||
std::string Epub::getCoverBmpPath(bool cropped) const {
|
||||
const auto coverFileName = "cover" + cropped ? "_crop" : "";
|
||||
return cachePath + "/" + coverFileName + ".bmp";
|
||||
|
||||
@ -44,6 +44,7 @@ class Epub {
|
||||
const std::string& getPath() const;
|
||||
const std::string& getTitle() const;
|
||||
const std::string& getAuthor() const;
|
||||
const std::string& getLanguage() const;
|
||||
std::string getCoverBmpPath(bool cropped = false) const;
|
||||
bool generateCoverBmp(bool cropped = false) const;
|
||||
std::string getThumbBmpPath() const;
|
||||
|
||||
@ -9,7 +9,7 @@
|
||||
#include "FsHelpers.h"
|
||||
|
||||
namespace {
|
||||
constexpr uint8_t BOOK_CACHE_VERSION = 4;
|
||||
constexpr uint8_t BOOK_CACHE_VERSION = 5;
|
||||
constexpr char bookBinFile[] = "/book.bin";
|
||||
constexpr char tmpSpineBinFile[] = "/spine.bin.tmp";
|
||||
constexpr char tmpTocBinFile[] = "/toc.bin.tmp";
|
||||
@ -87,8 +87,9 @@ bool BookMetadataCache::buildBookBin(const std::string& epubPath, const BookMeta
|
||||
|
||||
constexpr uint32_t headerASize =
|
||||
sizeof(BOOK_CACHE_VERSION) + /* LUT Offset */ sizeof(uint32_t) + sizeof(spineCount) + sizeof(tocCount);
|
||||
const uint32_t metadataSize = metadata.title.size() + metadata.author.size() + metadata.coverItemHref.size() +
|
||||
metadata.textReferenceHref.size() + sizeof(uint32_t) * 4;
|
||||
const uint32_t metadataSize = metadata.title.size() + metadata.author.size() + metadata.language.size() +
|
||||
metadata.coverItemHref.size() + metadata.textReferenceHref.size() +
|
||||
sizeof(uint32_t) * 5;
|
||||
const uint32_t lutSize = sizeof(uint32_t) * spineCount + sizeof(uint32_t) * tocCount;
|
||||
const uint32_t lutOffset = headerASize + metadataSize;
|
||||
|
||||
@ -100,6 +101,7 @@ bool BookMetadataCache::buildBookBin(const std::string& epubPath, const BookMeta
|
||||
// Metadata
|
||||
serialization::writeString(bookFile, metadata.title);
|
||||
serialization::writeString(bookFile, metadata.author);
|
||||
serialization::writeString(bookFile, metadata.language);
|
||||
serialization::writeString(bookFile, metadata.coverItemHref);
|
||||
serialization::writeString(bookFile, metadata.textReferenceHref);
|
||||
|
||||
@ -289,6 +291,7 @@ bool BookMetadataCache::load() {
|
||||
|
||||
serialization::readString(bookFile, coreMetadata.title);
|
||||
serialization::readString(bookFile, coreMetadata.author);
|
||||
serialization::readString(bookFile, coreMetadata.language);
|
||||
serialization::readString(bookFile, coreMetadata.coverItemHref);
|
||||
serialization::readString(bookFile, coreMetadata.textReferenceHref);
|
||||
|
||||
|
||||
@ -9,6 +9,7 @@ class BookMetadataCache {
|
||||
struct BookMetadata {
|
||||
std::string title;
|
||||
std::string author;
|
||||
std::string language;
|
||||
std::string coverItemHref;
|
||||
std::string textReferenceHref;
|
||||
};
|
||||
|
||||
@ -5,11 +5,50 @@
|
||||
#include <algorithm>
|
||||
#include <cmath>
|
||||
#include <functional>
|
||||
#include <iterator>
|
||||
#include <limits>
|
||||
#include <vector>
|
||||
|
||||
#include "hyphenation/Hyphenator.h"
|
||||
|
||||
constexpr int MAX_COST = std::numeric_limits<int>::max();
|
||||
|
||||
namespace {
|
||||
|
||||
// Soft hyphen byte pattern used throughout EPUBs (UTF-8 for U+00AD).
|
||||
constexpr char SOFT_HYPHEN_UTF8[] = "\xC2\xAD";
|
||||
constexpr size_t SOFT_HYPHEN_BYTES = 2;
|
||||
|
||||
bool containsSoftHyphen(const std::string& word) { return word.find(SOFT_HYPHEN_UTF8) != std::string::npos; }
|
||||
|
||||
// Removes every soft hyphen in-place so rendered glyphs match measured widths.
|
||||
void stripSoftHyphensInPlace(std::string& word) {
|
||||
size_t pos = 0;
|
||||
while ((pos = word.find(SOFT_HYPHEN_UTF8, pos)) != std::string::npos) {
|
||||
word.erase(pos, SOFT_HYPHEN_BYTES);
|
||||
}
|
||||
}
|
||||
|
||||
// Returns the rendered width for a word while ignoring soft hyphen glyphs and optionally appending a visible hyphen.
|
||||
uint16_t measureWordWidth(const GfxRenderer& renderer, const int fontId, const std::string& word,
|
||||
const EpdFontFamily::Style style, const bool appendHyphen = false) {
|
||||
const bool hasSoftHyphen = containsSoftHyphen(word);
|
||||
if (!hasSoftHyphen && !appendHyphen) {
|
||||
return renderer.getTextWidth(fontId, word.c_str(), style);
|
||||
}
|
||||
|
||||
std::string sanitized = word;
|
||||
if (hasSoftHyphen) {
|
||||
stripSoftHyphensInPlace(sanitized);
|
||||
}
|
||||
if (appendHyphen) {
|
||||
sanitized.push_back('-');
|
||||
}
|
||||
return renderer.getTextWidth(fontId, sanitized.c_str(), style);
|
||||
}
|
||||
|
||||
} // namespace
|
||||
|
||||
void ParsedText::addWord(std::string word, const EpdFontFamily::Style fontStyle) {
|
||||
if (word.empty()) return;
|
||||
|
||||
@ -25,10 +64,19 @@ void ParsedText::layoutAndExtractLines(const GfxRenderer& renderer, const int fo
|
||||
return;
|
||||
}
|
||||
|
||||
// Apply fixed transforms before any per-line layout work.
|
||||
applyParagraphIndent();
|
||||
|
||||
const int pageWidth = viewportWidth;
|
||||
const int spaceWidth = renderer.getSpaceWidth(fontId);
|
||||
const auto wordWidths = calculateWordWidths(renderer, fontId);
|
||||
const auto lineBreakIndices = computeLineBreaks(pageWidth, spaceWidth, wordWidths);
|
||||
auto wordWidths = calculateWordWidths(renderer, fontId);
|
||||
std::vector<size_t> lineBreakIndices;
|
||||
if (hyphenationEnabled) {
|
||||
// Use greedy layout that can split words mid-loop when a hyphenated prefix fits.
|
||||
lineBreakIndices = computeHyphenatedLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths);
|
||||
} else {
|
||||
lineBreakIndices = computeLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths);
|
||||
}
|
||||
const size_t lineCount = includeLastLine ? lineBreakIndices.size() : lineBreakIndices.size() - 1;
|
||||
|
||||
for (size_t i = 0; i < lineCount; ++i) {
|
||||
@ -42,17 +90,11 @@ std::vector<uint16_t> ParsedText::calculateWordWidths(const GfxRenderer& rendere
|
||||
std::vector<uint16_t> wordWidths;
|
||||
wordWidths.reserve(totalWordCount);
|
||||
|
||||
// add em-space at the beginning of first word in paragraph to indent
|
||||
if ((style == TextBlock::JUSTIFIED || style == TextBlock::LEFT_ALIGN) && !extraParagraphSpacing) {
|
||||
std::string& first_word = words.front();
|
||||
first_word.insert(0, "\xe2\x80\x83");
|
||||
}
|
||||
|
||||
auto wordsIt = words.begin();
|
||||
auto wordStylesIt = wordStyles.begin();
|
||||
|
||||
while (wordsIt != words.end()) {
|
||||
wordWidths.push_back(renderer.getTextWidth(fontId, wordsIt->c_str(), *wordStylesIt));
|
||||
wordWidths.push_back(measureWordWidth(renderer, fontId, *wordsIt, *wordStylesIt));
|
||||
|
||||
std::advance(wordsIt, 1);
|
||||
std::advance(wordStylesIt, 1);
|
||||
@ -61,8 +103,21 @@ std::vector<uint16_t> ParsedText::calculateWordWidths(const GfxRenderer& rendere
|
||||
return wordWidths;
|
||||
}
|
||||
|
||||
std::vector<size_t> ParsedText::computeLineBreaks(const int pageWidth, const int spaceWidth,
|
||||
const std::vector<uint16_t>& wordWidths) const {
|
||||
std::vector<size_t> ParsedText::computeLineBreaks(const GfxRenderer& renderer, const int fontId, const int pageWidth,
|
||||
const int spaceWidth, std::vector<uint16_t>& wordWidths) {
|
||||
if (words.empty()) {
|
||||
return {};
|
||||
}
|
||||
|
||||
// Ensure any word that would overflow even as the first entry on a line is split using fallback hyphenation.
|
||||
for (size_t i = 0; i < wordWidths.size(); ++i) {
|
||||
while (wordWidths[i] > pageWidth) {
|
||||
if (!hyphenateWordAtIndex(i, pageWidth, renderer, fontId, wordWidths, /*allowFallbackBreaks=*/true)) {
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
const size_t totalWordCount = words.size();
|
||||
|
||||
// DP table to store the minimum badness (cost) of lines starting at index i
|
||||
@ -140,6 +195,138 @@ std::vector<size_t> ParsedText::computeLineBreaks(const int pageWidth, const int
|
||||
return lineBreakIndices;
|
||||
}
|
||||
|
||||
void ParsedText::applyParagraphIndent() {
|
||||
if (extraParagraphSpacing || words.empty()) {
|
||||
return;
|
||||
}
|
||||
|
||||
if (style == TextBlock::JUSTIFIED || style == TextBlock::LEFT_ALIGN) {
|
||||
words.front().insert(0, "\xe2\x80\x83");
|
||||
}
|
||||
}
|
||||
|
||||
// Builds break indices while opportunistically splitting the word that would overflow the current line.
|
||||
std::vector<size_t> ParsedText::computeHyphenatedLineBreaks(const GfxRenderer& renderer, const int fontId,
|
||||
const int pageWidth, const int spaceWidth,
|
||||
std::vector<uint16_t>& wordWidths) {
|
||||
std::vector<size_t> lineBreakIndices;
|
||||
size_t currentIndex = 0;
|
||||
|
||||
while (currentIndex < wordWidths.size()) {
|
||||
const size_t lineStart = currentIndex;
|
||||
int lineWidth = 0;
|
||||
|
||||
// Consume as many words as possible for current line, splitting when prefixes fit
|
||||
while (currentIndex < wordWidths.size()) {
|
||||
const bool isFirstWord = currentIndex == lineStart;
|
||||
const int spacing = isFirstWord ? 0 : spaceWidth;
|
||||
const int candidateWidth = spacing + wordWidths[currentIndex];
|
||||
|
||||
// Word fits on current line
|
||||
if (lineWidth + candidateWidth <= pageWidth) {
|
||||
lineWidth += candidateWidth;
|
||||
++currentIndex;
|
||||
continue;
|
||||
}
|
||||
|
||||
// Word would overflow — try to split based on hyphenation points
|
||||
const int availableWidth = pageWidth - lineWidth - spacing;
|
||||
const bool allowFallbackBreaks = isFirstWord; // Only for first word on line
|
||||
|
||||
if (availableWidth > 0 &&
|
||||
hyphenateWordAtIndex(currentIndex, availableWidth, renderer, fontId, wordWidths, allowFallbackBreaks)) {
|
||||
// Prefix now fits; append it to this line and move to next line
|
||||
lineWidth += spacing + wordWidths[currentIndex];
|
||||
++currentIndex;
|
||||
break;
|
||||
}
|
||||
|
||||
// Could not split: force at least one word per line to avoid infinite loop
|
||||
if (currentIndex == lineStart) {
|
||||
lineWidth += candidateWidth;
|
||||
++currentIndex;
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
lineBreakIndices.push_back(currentIndex);
|
||||
}
|
||||
|
||||
return lineBreakIndices;
|
||||
}
|
||||
|
||||
// Splits words[wordIndex] into prefix (adding a hyphen only when needed) and remainder when a legal breakpoint fits the
|
||||
// available width.
|
||||
bool ParsedText::hyphenateWordAtIndex(const size_t wordIndex, const int availableWidth, const GfxRenderer& renderer,
|
||||
const int fontId, std::vector<uint16_t>& wordWidths,
|
||||
const bool allowFallbackBreaks) {
|
||||
// Guard against invalid indices or zero available width before attempting to split.
|
||||
if (availableWidth <= 0 || wordIndex >= words.size()) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Get iterators to target word and style.
|
||||
auto wordIt = words.begin();
|
||||
auto styleIt = wordStyles.begin();
|
||||
std::advance(wordIt, wordIndex);
|
||||
std::advance(styleIt, wordIndex);
|
||||
|
||||
const std::string& word = *wordIt;
|
||||
const auto style = *styleIt;
|
||||
|
||||
// Collect candidate breakpoints (byte offsets and hyphen requirements).
|
||||
auto breakInfos = Hyphenator::breakOffsets(word, allowFallbackBreaks);
|
||||
if (breakInfos.empty()) {
|
||||
return false;
|
||||
}
|
||||
|
||||
size_t chosenOffset = 0;
|
||||
int chosenWidth = -1;
|
||||
bool chosenNeedsHyphen = true;
|
||||
|
||||
// Iterate over each legal breakpoint and retain the widest prefix that still fits.
|
||||
for (const auto& info : breakInfos) {
|
||||
const size_t offset = info.byteOffset;
|
||||
if (offset == 0 || offset >= word.size()) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const bool needsHyphen = info.requiresInsertedHyphen;
|
||||
const int prefixWidth = measureWordWidth(renderer, fontId, word.substr(0, offset), style, needsHyphen);
|
||||
if (prefixWidth > availableWidth || prefixWidth <= chosenWidth) {
|
||||
continue; // Skip if too wide or not an improvement
|
||||
}
|
||||
|
||||
chosenWidth = prefixWidth;
|
||||
chosenOffset = offset;
|
||||
chosenNeedsHyphen = needsHyphen;
|
||||
}
|
||||
|
||||
if (chosenWidth < 0) {
|
||||
// No hyphenation point produced a prefix that fits in the remaining space.
|
||||
return false;
|
||||
}
|
||||
|
||||
// Split the word at the selected breakpoint and append a hyphen if required.
|
||||
std::string remainder = word.substr(chosenOffset);
|
||||
wordIt->resize(chosenOffset);
|
||||
if (chosenNeedsHyphen) {
|
||||
wordIt->push_back('-');
|
||||
}
|
||||
|
||||
// Insert the remainder word (with matching style) directly after the prefix.
|
||||
auto insertWordIt = std::next(wordIt);
|
||||
auto insertStyleIt = std::next(styleIt);
|
||||
words.insert(insertWordIt, remainder);
|
||||
wordStyles.insert(insertStyleIt, style);
|
||||
|
||||
// Update cached widths to reflect the new prefix/remainder pairing.
|
||||
wordWidths[wordIndex] = static_cast<uint16_t>(chosenWidth);
|
||||
const uint16_t remainderWidth = measureWordWidth(renderer, fontId, remainder, style);
|
||||
wordWidths.insert(wordWidths.begin() + wordIndex + 1, remainderWidth);
|
||||
return true;
|
||||
}
|
||||
|
||||
void ParsedText::extractLine(const size_t breakIndex, const int pageWidth, const int spaceWidth,
|
||||
const std::vector<uint16_t>& wordWidths, const std::vector<size_t>& lineBreakIndices,
|
||||
const std::function<void(std::shared_ptr<TextBlock>)>& processLine) {
|
||||
@ -191,5 +378,11 @@ void ParsedText::extractLine(const size_t breakIndex, const int pageWidth, const
|
||||
std::list<EpdFontFamily::Style> lineWordStyles;
|
||||
lineWordStyles.splice(lineWordStyles.begin(), wordStyles, wordStyles.begin(), wordStyleEndIt);
|
||||
|
||||
for (auto& word : lineWords) {
|
||||
if (containsSoftHyphen(word)) {
|
||||
stripSoftHyphensInPlace(word);
|
||||
}
|
||||
}
|
||||
|
||||
processLine(std::make_shared<TextBlock>(std::move(lineWords), std::move(lineXPos), std::move(lineWordStyles), style));
|
||||
}
|
||||
@ -17,16 +17,24 @@ class ParsedText {
|
||||
std::list<EpdFontFamily::Style> wordStyles;
|
||||
TextBlock::Style style;
|
||||
bool extraParagraphSpacing;
|
||||
bool hyphenationEnabled;
|
||||
|
||||
std::vector<size_t> computeLineBreaks(int pageWidth, int spaceWidth, const std::vector<uint16_t>& wordWidths) const;
|
||||
void applyParagraphIndent();
|
||||
std::vector<size_t> computeLineBreaks(const GfxRenderer& renderer, int fontId, int pageWidth, int spaceWidth,
|
||||
std::vector<uint16_t>& wordWidths);
|
||||
std::vector<size_t> computeHyphenatedLineBreaks(const GfxRenderer& renderer, int fontId, int pageWidth,
|
||||
int spaceWidth, std::vector<uint16_t>& wordWidths);
|
||||
bool hyphenateWordAtIndex(size_t wordIndex, int availableWidth, const GfxRenderer& renderer, int fontId,
|
||||
std::vector<uint16_t>& wordWidths, bool allowFallbackBreaks);
|
||||
void extractLine(size_t breakIndex, int pageWidth, int spaceWidth, const std::vector<uint16_t>& wordWidths,
|
||||
const std::vector<size_t>& lineBreakIndices,
|
||||
const std::function<void(std::shared_ptr<TextBlock>)>& processLine);
|
||||
std::vector<uint16_t> calculateWordWidths(const GfxRenderer& renderer, int fontId);
|
||||
|
||||
public:
|
||||
explicit ParsedText(const TextBlock::Style style, const bool extraParagraphSpacing)
|
||||
: style(style), extraParagraphSpacing(extraParagraphSpacing) {}
|
||||
explicit ParsedText(const TextBlock::Style style, const bool extraParagraphSpacing,
|
||||
const bool hyphenationEnabled = false)
|
||||
: style(style), extraParagraphSpacing(extraParagraphSpacing), hyphenationEnabled(hyphenationEnabled) {}
|
||||
~ParsedText() = default;
|
||||
|
||||
void addWord(std::string word, EpdFontFamily::Style fontStyle);
|
||||
|
||||
@ -4,12 +4,14 @@
|
||||
#include <Serialization.h>
|
||||
|
||||
#include "Page.h"
|
||||
#include "hyphenation/Hyphenator.h"
|
||||
#include "parsers/ChapterHtmlSlimParser.h"
|
||||
|
||||
namespace {
|
||||
constexpr uint8_t SECTION_FILE_VERSION = 9;
|
||||
constexpr uint8_t SECTION_FILE_VERSION = 10;
|
||||
constexpr uint32_t HEADER_SIZE = sizeof(uint8_t) + sizeof(int) + sizeof(float) + sizeof(bool) + sizeof(uint8_t) +
|
||||
sizeof(uint16_t) + sizeof(uint16_t) + sizeof(uint16_t) + sizeof(uint32_t);
|
||||
sizeof(uint16_t) + sizeof(uint16_t) + sizeof(uint16_t) + sizeof(bool) +
|
||||
sizeof(uint32_t);
|
||||
} // namespace
|
||||
|
||||
uint32_t Section::onPageComplete(std::unique_ptr<Page> page) {
|
||||
@ -31,14 +33,15 @@ uint32_t Section::onPageComplete(std::unique_ptr<Page> page) {
|
||||
|
||||
void Section::writeSectionFileHeader(const int fontId, const float lineCompression, const bool extraParagraphSpacing,
|
||||
const uint8_t paragraphAlignment, const uint16_t viewportWidth,
|
||||
const uint16_t viewportHeight) {
|
||||
const uint16_t viewportHeight, const bool hyphenationEnabled) {
|
||||
if (!file) {
|
||||
Serial.printf("[%lu] [SCT] File not open for writing header\n", millis());
|
||||
return;
|
||||
}
|
||||
static_assert(HEADER_SIZE == sizeof(SECTION_FILE_VERSION) + sizeof(fontId) + sizeof(lineCompression) +
|
||||
sizeof(extraParagraphSpacing) + sizeof(paragraphAlignment) + sizeof(viewportWidth) +
|
||||
sizeof(viewportHeight) + sizeof(pageCount) + sizeof(uint32_t),
|
||||
sizeof(viewportHeight) + sizeof(pageCount) + sizeof(hyphenationEnabled) +
|
||||
sizeof(uint32_t),
|
||||
"Header size mismatch");
|
||||
serialization::writePod(file, SECTION_FILE_VERSION);
|
||||
serialization::writePod(file, fontId);
|
||||
@ -47,13 +50,14 @@ void Section::writeSectionFileHeader(const int fontId, const float lineCompressi
|
||||
serialization::writePod(file, paragraphAlignment);
|
||||
serialization::writePod(file, viewportWidth);
|
||||
serialization::writePod(file, viewportHeight);
|
||||
serialization::writePod(file, hyphenationEnabled);
|
||||
serialization::writePod(file, pageCount); // Placeholder for page count (will be initially 0 when written)
|
||||
serialization::writePod(file, static_cast<uint32_t>(0)); // Placeholder for LUT offset
|
||||
}
|
||||
|
||||
bool Section::loadSectionFile(const int fontId, const float lineCompression, const bool extraParagraphSpacing,
|
||||
const uint8_t paragraphAlignment, const uint16_t viewportWidth,
|
||||
const uint16_t viewportHeight) {
|
||||
const uint16_t viewportHeight, const bool hyphenationEnabled) {
|
||||
if (!SdMan.openFileForRead("SCT", filePath, file)) {
|
||||
return false;
|
||||
}
|
||||
@ -74,16 +78,19 @@ bool Section::loadSectionFile(const int fontId, const float lineCompression, con
|
||||
float fileLineCompression;
|
||||
bool fileExtraParagraphSpacing;
|
||||
uint8_t fileParagraphAlignment;
|
||||
bool fileHyphenationEnabled;
|
||||
serialization::readPod(file, fileFontId);
|
||||
serialization::readPod(file, fileLineCompression);
|
||||
serialization::readPod(file, fileExtraParagraphSpacing);
|
||||
serialization::readPod(file, fileParagraphAlignment);
|
||||
serialization::readPod(file, fileViewportWidth);
|
||||
serialization::readPod(file, fileViewportHeight);
|
||||
serialization::readPod(file, fileHyphenationEnabled);
|
||||
|
||||
if (fontId != fileFontId || lineCompression != fileLineCompression ||
|
||||
extraParagraphSpacing != fileExtraParagraphSpacing || paragraphAlignment != fileParagraphAlignment ||
|
||||
viewportWidth != fileViewportWidth || viewportHeight != fileViewportHeight) {
|
||||
viewportWidth != fileViewportWidth || viewportHeight != fileViewportHeight ||
|
||||
hyphenationEnabled != fileHyphenationEnabled) {
|
||||
file.close();
|
||||
Serial.printf("[%lu] [SCT] Deserialization failed: Parameters do not match\n", millis());
|
||||
clearCache();
|
||||
@ -115,7 +122,8 @@ bool Section::clearCache() const {
|
||||
|
||||
bool Section::createSectionFile(const int fontId, const float lineCompression, const bool extraParagraphSpacing,
|
||||
const uint8_t paragraphAlignment, const uint16_t viewportWidth,
|
||||
const uint16_t viewportHeight, const std::function<void()>& progressSetupFn,
|
||||
const uint16_t viewportHeight, const bool hyphenationEnabled,
|
||||
const std::function<void()>& progressSetupFn,
|
||||
const std::function<void(int)>& progressFn) {
|
||||
constexpr uint32_t MIN_SIZE_FOR_PROGRESS = 50 * 1024; // 50KB
|
||||
const auto localPath = epub->getSpineItem(spineIndex).href;
|
||||
@ -172,14 +180,15 @@ bool Section::createSectionFile(const int fontId, const float lineCompression, c
|
||||
return false;
|
||||
}
|
||||
writeSectionFileHeader(fontId, lineCompression, extraParagraphSpacing, paragraphAlignment, viewportWidth,
|
||||
viewportHeight);
|
||||
viewportHeight, hyphenationEnabled);
|
||||
std::vector<uint32_t> lut = {};
|
||||
|
||||
ChapterHtmlSlimParser visitor(
|
||||
tmpHtmlPath, renderer, fontId, lineCompression, extraParagraphSpacing, paragraphAlignment, viewportWidth,
|
||||
viewportHeight,
|
||||
viewportHeight, hyphenationEnabled,
|
||||
[this, &lut](std::unique_ptr<Page> page) { lut.emplace_back(this->onPageComplete(std::move(page))); },
|
||||
progressFn);
|
||||
Hyphenator::setPreferredLanguage(epub->getLanguage());
|
||||
success = visitor.parseAndBuildPages();
|
||||
|
||||
SdMan.remove(tmpHtmlPath.c_str());
|
||||
|
||||
@ -15,7 +15,7 @@ class Section {
|
||||
FsFile file;
|
||||
|
||||
void writeSectionFileHeader(int fontId, float lineCompression, bool extraParagraphSpacing, uint8_t paragraphAlignment,
|
||||
uint16_t viewportWidth, uint16_t viewportHeight);
|
||||
uint16_t viewportWidth, uint16_t viewportHeight, bool hyphenationEnabled);
|
||||
uint32_t onPageComplete(std::unique_ptr<Page> page);
|
||||
|
||||
public:
|
||||
@ -29,10 +29,10 @@ class Section {
|
||||
filePath(epub->getCachePath() + "/sections/" + std::to_string(spineIndex) + ".bin") {}
|
||||
~Section() = default;
|
||||
bool loadSectionFile(int fontId, float lineCompression, bool extraParagraphSpacing, uint8_t paragraphAlignment,
|
||||
uint16_t viewportWidth, uint16_t viewportHeight);
|
||||
uint16_t viewportWidth, uint16_t viewportHeight, bool hyphenationEnabled);
|
||||
bool clearCache() const;
|
||||
bool createSectionFile(int fontId, float lineCompression, bool extraParagraphSpacing, uint8_t paragraphAlignment,
|
||||
uint16_t viewportWidth, uint16_t viewportHeight,
|
||||
uint16_t viewportWidth, uint16_t viewportHeight, bool hyphenationEnabled,
|
||||
const std::function<void()>& progressSetupFn = nullptr,
|
||||
const std::function<void(int)>& progressFn = nullptr);
|
||||
std::unique_ptr<Page> loadPageFromSectionFile();
|
||||
|
||||
179
lib/Epub/Epub/hyphenation/HyphenationCommon.cpp
Normal file
@ -0,0 +1,179 @@
|
||||
#include "HyphenationCommon.h"
|
||||
|
||||
#include <Utf8.h>
|
||||
|
||||
namespace {
|
||||
|
||||
// Convert Latin uppercase letters (ASCII plus Latin-1 supplement) to lowercase
|
||||
uint32_t toLowerLatinImpl(const uint32_t cp) {
|
||||
if (cp >= 'A' && cp <= 'Z') {
|
||||
return cp - 'A' + 'a';
|
||||
}
|
||||
if ((cp >= 0x00C0 && cp <= 0x00D6) || (cp >= 0x00D8 && cp <= 0x00DE)) {
|
||||
return cp + 0x20;
|
||||
}
|
||||
|
||||
switch (cp) {
|
||||
case 0x0152: // Œ
|
||||
return 0x0153; // œ
|
||||
case 0x0178: // Ÿ
|
||||
return 0x00FF; // ÿ
|
||||
case 0x1E9E: // ẞ
|
||||
return 0x00DF; // ß
|
||||
default:
|
||||
return cp;
|
||||
}
|
||||
}
|
||||
|
||||
// Convert Cyrillic uppercase letters to lowercase
|
||||
// Cyrillic uppercase range 0x0410-0x042F maps to lowercase by adding 0x20
|
||||
// Special case: Cyrillic capital IO (0x0401) maps to lowercase io (0x0451)
|
||||
uint32_t toLowerCyrillicImpl(const uint32_t cp) {
|
||||
if (cp >= 0x0410 && cp <= 0x042F) {
|
||||
return cp + 0x20;
|
||||
}
|
||||
if (cp == 0x0401) {
|
||||
return 0x0451;
|
||||
}
|
||||
return cp;
|
||||
}
|
||||
|
||||
} // namespace
|
||||
|
||||
uint32_t toLowerLatin(const uint32_t cp) { return toLowerLatinImpl(cp); }
|
||||
|
||||
uint32_t toLowerCyrillic(const uint32_t cp) { return toLowerCyrillicImpl(cp); }
|
||||
|
||||
bool isLatinLetter(const uint32_t cp) {
|
||||
if ((cp >= 'A' && cp <= 'Z') || (cp >= 'a' && cp <= 'z')) {
|
||||
return true;
|
||||
}
|
||||
|
||||
if (((cp >= 0x00C0 && cp <= 0x00D6) || (cp >= 0x00D8 && cp <= 0x00F6) || (cp >= 0x00F8 && cp <= 0x00FF)) &&
|
||||
cp != 0x00D7 && cp != 0x00F7) {
|
||||
return true;
|
||||
}
|
||||
|
||||
switch (cp) {
|
||||
case 0x0152: // Œ
|
||||
case 0x0153: // œ
|
||||
case 0x0178: // Ÿ
|
||||
case 0x1E9E: // ẞ
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
bool isCyrillicLetter(const uint32_t cp) { return (cp >= 0x0400 && cp <= 0x052F); }
|
||||
|
||||
bool isAlphabetic(const uint32_t cp) { return isLatinLetter(cp) || isCyrillicLetter(cp); }
|
||||
|
||||
bool isPunctuation(const uint32_t cp) {
|
||||
switch (cp) {
|
||||
case '-':
|
||||
case '.':
|
||||
case ',':
|
||||
case '!':
|
||||
case '?':
|
||||
case ';':
|
||||
case ':':
|
||||
case '"':
|
||||
case '\'':
|
||||
case ')':
|
||||
case '(':
|
||||
case 0x00AB: // «
|
||||
case 0x00BB: // »
|
||||
case 0x2018: // ‘
|
||||
case 0x2019: // ’
|
||||
case 0x201C: // “
|
||||
case 0x201D: // ”
|
||||
case 0x00A0: // no-break space
|
||||
case '{':
|
||||
case '}':
|
||||
case '[':
|
||||
case ']':
|
||||
case '/':
|
||||
case 0x203A: // ›
|
||||
case 0x2026: // …
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
bool isAsciiDigit(const uint32_t cp) { return cp >= '0' && cp <= '9'; }
|
||||
|
||||
bool isExplicitHyphen(const uint32_t cp) {
|
||||
switch (cp) {
|
||||
case '-':
|
||||
case 0x00AD: // soft hyphen
|
||||
case 0x058A: // Armenian hyphen
|
||||
case 0x2010: // hyphen
|
||||
case 0x2011: // non-breaking hyphen
|
||||
case 0x2012: // figure dash
|
||||
case 0x2013: // en dash
|
||||
case 0x2014: // em dash
|
||||
case 0x2015: // horizontal bar
|
||||
case 0x2043: // hyphen bullet
|
||||
case 0x207B: // superscript minus
|
||||
case 0x208B: // subscript minus
|
||||
case 0x2212: // minus sign
|
||||
case 0x2E17: // double oblique hyphen
|
||||
case 0x2E3A: // two-em dash
|
||||
case 0x2E3B: // three-em dash
|
||||
case 0xFE58: // small em dash
|
||||
case 0xFE63: // small hyphen-minus
|
||||
case 0xFF0D: // fullwidth hyphen-minus
|
||||
return true;
|
||||
default:
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
bool isSoftHyphen(const uint32_t cp) { return cp == 0x00AD; }
|
||||
|
||||
void trimSurroundingPunctuationAndFootnote(std::vector<CodepointInfo>& cps) {
|
||||
if (cps.empty()) {
|
||||
return;
|
||||
}
|
||||
|
||||
// Remove trailing footnote references like [12], even if punctuation trails after the closing bracket.
|
||||
if (cps.size() >= 3) {
|
||||
int end = static_cast<int>(cps.size()) - 1;
|
||||
while (end >= 0 && isPunctuation(cps[end].value)) {
|
||||
--end;
|
||||
}
|
||||
int pos = end;
|
||||
if (pos >= 0 && isAsciiDigit(cps[pos].value)) {
|
||||
while (pos >= 0 && isAsciiDigit(cps[pos].value)) {
|
||||
--pos;
|
||||
}
|
||||
if (pos >= 0 && cps[pos].value == '[' && end - pos > 1) {
|
||||
cps.erase(cps.begin() + pos, cps.end());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
while (!cps.empty() && isPunctuation(cps.front().value)) {
|
||||
cps.erase(cps.begin());
|
||||
}
|
||||
while (!cps.empty() && isPunctuation(cps.back().value)) {
|
||||
cps.pop_back();
|
||||
}
|
||||
}
|
||||
|
||||
std::vector<CodepointInfo> collectCodepoints(const std::string& word) {
|
||||
std::vector<CodepointInfo> cps;
|
||||
cps.reserve(word.size());
|
||||
|
||||
const unsigned char* base = reinterpret_cast<const unsigned char*>(word.c_str());
|
||||
const unsigned char* ptr = base;
|
||||
while (*ptr != 0) {
|
||||
const unsigned char* current = ptr;
|
||||
const uint32_t cp = utf8NextCodepoint(&ptr);
|
||||
cps.push_back({cp, static_cast<size_t>(current - base)});
|
||||
}
|
||||
|
||||
return cps;
|
||||
}
|
||||
25
lib/Epub/Epub/hyphenation/HyphenationCommon.h
Normal file
@ -0,0 +1,25 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
struct CodepointInfo {
|
||||
uint32_t value;
|
||||
size_t byteOffset;
|
||||
};
|
||||
|
||||
uint32_t toLowerLatin(uint32_t cp);
|
||||
uint32_t toLowerCyrillic(uint32_t cp);
|
||||
|
||||
bool isLatinLetter(uint32_t cp);
|
||||
bool isCyrillicLetter(uint32_t cp);
|
||||
|
||||
bool isAlphabetic(uint32_t cp);
|
||||
bool isPunctuation(uint32_t cp);
|
||||
bool isAsciiDigit(uint32_t cp);
|
||||
bool isExplicitHyphen(uint32_t cp);
|
||||
bool isSoftHyphen(uint32_t cp);
|
||||
void trimSurroundingPunctuationAndFootnote(std::vector<CodepointInfo>& cps);
|
||||
std::vector<CodepointInfo> collectCodepoints(const std::string& word);
|
||||
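// Illustrative example (not part of this diff): for the raw token "«mot[12],»",
// collectCodepoints() yields one CodepointInfo per Unicode scalar with its UTF-8
// byte offset, and trimSurroundingPunctuationAndFootnote() strips the guillemets,
// the trailing comma, and the "[12]" footnote reference, leaving only the
// codepoints of "mot" for the hyphenator.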
97
lib/Epub/Epub/hyphenation/Hyphenator.cpp
Normal file
@ -0,0 +1,97 @@
|
||||
#include "Hyphenator.h"
|
||||
|
||||
#include <vector>
|
||||
|
||||
#include "HyphenationCommon.h"
|
||||
#include "LanguageRegistry.h"
|
||||
|
||||
const LanguageHyphenator* Hyphenator::cachedHyphenator_ = nullptr;
|
||||
|
||||
namespace {
|
||||
|
||||
// Maps a BCP-47 language tag to a language-specific hyphenator.
|
||||
const LanguageHyphenator* hyphenatorForLanguage(const std::string& langTag) {
|
||||
if (langTag.empty()) return nullptr;
|
||||
|
||||
// Extract primary subtag and normalize to lowercase (e.g., "en-US" -> "en").
|
||||
std::string primary;
|
||||
primary.reserve(langTag.size());
|
||||
for (char c : langTag) {
|
||||
if (c == '-' || c == '_') break;
|
||||
if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
|
||||
primary.push_back(c);
|
||||
}
|
||||
if (primary.empty()) return nullptr;
|
||||
|
||||
return getLanguageHyphenatorForPrimaryTag(primary);
|
||||
}
|
||||
|
||||
// Maps a codepoint index back to its byte offset inside the source word.
|
||||
size_t byteOffsetForIndex(const std::vector<CodepointInfo>& cps, const size_t index) {
|
||||
return (index < cps.size()) ? cps[index].byteOffset : (cps.empty() ? 0 : cps.back().byteOffset);
|
||||
}
|
||||
|
||||
// Builds a vector of break information from explicit hyphen markers in the given codepoints.
|
||||
std::vector<Hyphenator::BreakInfo> buildExplicitBreakInfos(const std::vector<CodepointInfo>& cps) {
|
||||
std::vector<Hyphenator::BreakInfo> breaks;
|
||||
|
||||
// Scan every codepoint looking for explicit/soft hyphen markers that are surrounded by letters.
|
||||
for (size_t i = 1; i + 1 < cps.size(); ++i) {
|
||||
const uint32_t cp = cps[i].value;
|
||||
if (!isExplicitHyphen(cp) || !isAlphabetic(cps[i - 1].value) || !isAlphabetic(cps[i + 1].value)) {
|
||||
continue;
|
||||
}
|
||||
// Offset points to the next codepoint so rendering starts after the hyphen marker.
|
||||
breaks.push_back({cps[i + 1].byteOffset, isSoftHyphen(cp)});
|
||||
}
|
||||
|
||||
return breaks;
|
||||
}
|
||||
|
||||
} // namespace
|
||||
|
||||
std::vector<Hyphenator::BreakInfo> Hyphenator::breakOffsets(const std::string& word, const bool includeFallback) {
|
||||
if (word.empty()) {
|
||||
return {};
|
||||
}
|
||||
|
||||
// Convert to codepoints and normalize word boundaries.
|
||||
auto cps = collectCodepoints(word);
|
||||
trimSurroundingPunctuationAndFootnote(cps);
|
||||
const auto* hyphenator = cachedHyphenator_;
|
||||
|
||||
// Explicit hyphen markers (soft or hard) take precedence over language breaks.
|
||||
auto explicitBreakInfos = buildExplicitBreakInfos(cps);
|
||||
if (!explicitBreakInfos.empty()) {
|
||||
return explicitBreakInfos;
|
||||
}
|
||||
|
||||
// Ask language hyphenator for legal break points.
|
||||
std::vector<size_t> indexes;
|
||||
if (hyphenator) {
|
||||
indexes = hyphenator->breakIndexes(cps);
|
||||
}
|
||||
|
||||
// Only add fallback breaks if needed
|
||||
if (includeFallback && indexes.empty()) {
|
||||
const size_t minPrefix = hyphenator ? hyphenator->minPrefix() : LiangWordConfig::kDefaultMinPrefix;
|
||||
const size_t minSuffix = hyphenator ? hyphenator->minSuffix() : LiangWordConfig::kDefaultMinSuffix;
|
||||
for (size_t idx = minPrefix; idx + minSuffix <= cps.size(); ++idx) {
|
||||
indexes.push_back(idx);
|
||||
}
|
||||
}
|
||||
|
||||
if (indexes.empty()) {
|
||||
return {};
|
||||
}
|
||||
|
||||
std::vector<Hyphenator::BreakInfo> breaks;
|
||||
breaks.reserve(indexes.size());
|
||||
for (const size_t idx : indexes) {
|
||||
breaks.push_back({byteOffsetForIndex(cps, idx), true});
|
||||
}
|
||||
|
||||
return breaks;
|
||||
}
|
||||
|
||||
void Hyphenator::setPreferredLanguage(const std::string& lang) { cachedHyphenator_ = hyphenatorForLanguage(lang); }
|
||||
24
lib/Epub/Epub/hyphenation/Hyphenator.h
Normal file
@ -0,0 +1,24 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <string>
|
||||
#include <vector>
|
||||
|
||||
class LanguageHyphenator;
|
||||
|
||||
class Hyphenator {
|
||||
public:
|
||||
struct BreakInfo {
|
||||
size_t byteOffset;
|
||||
bool requiresInsertedHyphen;
|
||||
};
|
||||
// Returns byte offsets where the word may be hyphenated. When includeFallback is true, all positions obeying the
|
||||
// minimum prefix/suffix constraints are returned even if no language-specific rule matches.
|
||||
static std::vector<BreakInfo> breakOffsets(const std::string& word, bool includeFallback);
|
||||
|
||||
// Provide a publication-level language hint (e.g. "en", "en-US", "ru") used to select hyphenation rules.
|
||||
static void setPreferredLanguage(const std::string& lang);
|
||||
|
||||
private:
|
||||
static const LanguageHyphenator* cachedHyphenator_;
|
||||
};
|
||||
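// Illustrative usage of the Hyphenator API declared above (example only, not part
// of this change); the returned offsets are byte positions within the UTF-8 word:
//
//   Hyphenator::setPreferredLanguage("en-US");  // primary subtag "en" selects the English patterns
//   const auto breaks = Hyphenator::breakOffsets("hyphenation", /*includeFallback=*/false);
//   for (const auto& b : breaks) {
//     // b.byteOffset marks a legal split point; b.requiresInsertedHyphen tells the
//     // renderer whether it must draw a '-' at the break.
//   }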
23
lib/Epub/Epub/hyphenation/LanguageHyphenator.h
Normal file
@ -0,0 +1,23 @@
|
||||
#pragma once
|
||||
|
||||
#include "LiangHyphenation.h"
|
||||
|
||||
// Generic Liang-backed hyphenator that stores pattern metadata plus language-specific helpers.
|
||||
class LanguageHyphenator {
|
||||
public:
|
||||
LanguageHyphenator(const SerializedHyphenationPatterns& patterns, bool (*isLetterFn)(uint32_t),
|
||||
uint32_t (*toLowerFn)(uint32_t), size_t minPrefix = LiangWordConfig::kDefaultMinPrefix,
|
||||
size_t minSuffix = LiangWordConfig::kDefaultMinSuffix)
|
||||
: patterns_(patterns), config_(isLetterFn, toLowerFn, minPrefix, minSuffix) {}
|
||||
|
||||
std::vector<size_t> breakIndexes(const std::vector<CodepointInfo>& cps) const {
|
||||
return liangBreakIndexes(cps, patterns_, config_);
|
||||
}
|
||||
|
||||
size_t minPrefix() const { return config_.minPrefix; }
|
||||
size_t minSuffix() const { return config_.minSuffix; }
|
||||
|
||||
protected:
|
||||
const SerializedHyphenationPatterns& patterns_;
|
||||
LiangWordConfig config_;
|
||||
};
|
||||
42
lib/Epub/Epub/hyphenation/LanguageRegistry.cpp
Normal file
@ -0,0 +1,42 @@
|
||||
#include "LanguageRegistry.h"
|
||||
|
||||
#include <algorithm>
|
||||
#include <array>
|
||||
|
||||
#include "HyphenationCommon.h"
|
||||
#include "generated/hyph-de.trie.h"
|
||||
#include "generated/hyph-en.trie.h"
|
||||
#include "generated/hyph-fr.trie.h"
|
||||
#include "generated/hyph-ru.trie.h"
|
||||
|
||||
namespace {
|
||||
|
||||
// English hyphenation patterns (3/3 minimum prefix/suffix length)
|
||||
LanguageHyphenator englishHyphenator(en_us_patterns, isLatinLetter, toLowerLatin, 3, 3);
|
||||
LanguageHyphenator frenchHyphenator(fr_patterns, isLatinLetter, toLowerLatin);
|
||||
LanguageHyphenator germanHyphenator(de_patterns, isLatinLetter, toLowerLatin);
|
||||
LanguageHyphenator russianHyphenator(ru_ru_patterns, isCyrillicLetter, toLowerCyrillic);
|
||||
|
||||
using EntryArray = std::array<LanguageEntry, 4>;
|
||||
|
||||
const EntryArray& entries() {
|
||||
static const EntryArray kEntries = {{{"english", "en", &englishHyphenator},
|
||||
{"french", "fr", &frenchHyphenator},
|
||||
{"german", "de", &germanHyphenator},
|
||||
{"russian", "ru", &russianHyphenator}}};
|
||||
return kEntries;
|
||||
}
|
||||
|
||||
} // namespace
|
||||
|
||||
const LanguageHyphenator* getLanguageHyphenatorForPrimaryTag(const std::string& primaryTag) {
|
||||
const auto& allEntries = entries();
|
||||
const auto it = std::find_if(allEntries.begin(), allEntries.end(),
|
||||
[&primaryTag](const LanguageEntry& entry) { return primaryTag == entry.primaryTag; });
|
||||
return (it != allEntries.end()) ? it->hyphenator : nullptr;
|
||||
}
|
||||
|
||||
LanguageEntryView getLanguageEntries() {
|
||||
const auto& allEntries = entries();
|
||||
return LanguageEntryView{allEntries.data(), allEntries.size()};
|
||||
}
|
||||
26
lib/Epub/Epub/hyphenation/LanguageRegistry.h
Normal file
@ -0,0 +1,26 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <string>
|
||||
|
||||
#include "LanguageHyphenator.h"
|
||||
|
||||
struct LanguageEntry {
|
||||
const char* cliName;
|
||||
const char* primaryTag;
|
||||
const LanguageHyphenator* hyphenator;
|
||||
};
|
||||
|
||||
struct LanguageEntryView {
|
||||
const LanguageEntry* data;
|
||||
size_t size;
|
||||
|
||||
const LanguageEntry* begin() const { return data; }
|
||||
const LanguageEntry* end() const { return data + size; }
|
||||
};
|
||||
|
||||
// Returns the Liang-backed hyphenator for a given primary language tag (e.g., "en", "fr").
|
||||
const LanguageHyphenator* getLanguageHyphenatorForPrimaryTag(const std::string& primaryTag);
|
||||
|
||||
// Exposes the list of supported languages primarily for tooling/tests.
|
||||
LanguageEntryView getLanguageEntries();
|
||||
405
lib/Epub/Epub/hyphenation/LiangHyphenation.cpp
Normal file
@ -0,0 +1,405 @@
|
||||
#include "LiangHyphenation.h"
|
||||
|
||||
#include <algorithm>
|
||||
#include <vector>
|
||||
|
||||
/*
|
||||
* Liang hyphenation pipeline overview (Typst-style binary trie variant)
|
||||
* --------------------------------------------------------------------
|
||||
* 1. Input normalization (buildAugmentedWord)
|
||||
* - Accepts a vector of CodepointInfo structs emitted by the EPUB text
|
||||
* parser. Each codepoint is validated with LiangWordConfig::isLetter so
|
||||
* we abort early on digits, punctuation, etc. If the word is valid we
|
||||
* build an "augmented" byte sequence: leading '.', lowercase UTF-8 bytes
|
||||
* for every letter, then a trailing '.'. While doing this we capture the
|
||||
* UTF-8 byte offset for each character and a reverse lookup table that
|
||||
* maps UTF-8 byte indexes back to codepoint indexes. This lets the rest
|
||||
* of the algorithm stay byte-oriented (matching the serialized automaton)
|
||||
* while still emitting hyphen positions in codepoint space.
|
||||
*
|
||||
* 2. Automaton decoding
|
||||
* - SerializedHyphenationPatterns stores a contiguous blob generated from
|
||||
* Typst's binary tries. The first 4 bytes contain the root offset. Each
|
||||
* node packs transitions, variable-stride relative offsets to child
|
||||
* nodes, and an optional pointer into a shared "levels" list. We parse
|
||||
* that layout lazily via decodeState/transition, keeping everything in
|
||||
* flash memory; no heap allocations besides the stack-local AutomatonState
|
||||
* structs. getAutomaton caches parseAutomaton results per blob pointer so
|
||||
* multiple words hitting the same language only pay the cost once.
|
||||
*
|
||||
* 3. Pattern application
|
||||
* - We walk the augmented bytes left-to-right. For each starting byte we
|
||||
* stream transitions through the trie, terminating when a transition
|
||||
* fails. Whenever a node exposes level data we expand the packed
|
||||
* "dist+level" bytes: `dist` is the delta (in UTF-8 bytes) from the
|
||||
* starting cursor and `level` is the Liang priority digit. Using the
|
||||
* byte→codepoint lookup we mark the corresponding index in `scores`.
|
||||
* Scores are only updated if the new level is higher, mirroring Liang's
|
||||
* "max digit wins" rule.
|
||||
*
|
||||
* 4. Output filtering
|
||||
* - collectBreakIndexes converts odd-valued score entries back to codepoint
|
||||
* break positions while enforcing `minPrefix`/`minSuffix` constraints from
|
||||
* LiangWordConfig. The caller (language-specific hyphenators) can then
|
||||
* translate these indexes into renderer glyph offsets, page layout data,
|
||||
* etc.
|
||||
*
|
||||
* Keeping the entire algorithm small and deterministic is critical on the
|
||||
* ESP32-C3: we avoid recursion, dynamic allocations per node, or copying the
|
||||
* trie. All lookups stay within the generated blob, which lives in flash, and
|
||||
* the working buffers (augmented bytes/scores) scale with the word length rather
|
||||
* than the pattern corpus.
|
||||
*/
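// Worked example of step 1 (illustrative, not part of this diff): for the French word
// "éclair" the augmented byte sequence is '.' 0xC3 0xA9 'c' 'l' 'a' 'i' 'r' '.'
// (9 bytes, since 'é' needs two UTF-8 bytes). charByteOffsets = {0, 1, 3, 4, 5, 6, 7, 8}
// holds one entry per augmented character, and byteToCharIndex[2] == -1 because byte 2
// is the continuation byte of 'é', so a pattern level landing on it is skipped.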
|
||||
|
||||
namespace {
|
||||
|
||||
struct AugmentedWord {
|
||||
std::vector<uint8_t> bytes;
|
||||
std::vector<size_t> charByteOffsets;
|
||||
std::vector<int32_t> byteToCharIndex;
|
||||
|
||||
bool empty() const { return bytes.empty(); }
|
||||
size_t charCount() const { return charByteOffsets.size(); }
|
||||
};
|
||||
|
||||
// Encode a single Unicode codepoint into UTF-8 and append to the provided buffer.
|
||||
size_t encodeUtf8(uint32_t cp, std::vector<uint8_t>& out) {
|
||||
if (cp <= 0x7Fu) {
|
||||
out.push_back(static_cast<uint8_t>(cp));
|
||||
return 1;
|
||||
}
|
||||
if (cp <= 0x7FFu) {
|
||||
out.push_back(static_cast<uint8_t>(0xC0u | ((cp >> 6) & 0x1Fu)));
|
||||
out.push_back(static_cast<uint8_t>(0x80u | (cp & 0x3Fu)));
|
||||
return 2;
|
||||
}
|
||||
if (cp <= 0xFFFFu) {
|
||||
out.push_back(static_cast<uint8_t>(0xE0u | ((cp >> 12) & 0x0Fu)));
|
||||
out.push_back(static_cast<uint8_t>(0x80u | ((cp >> 6) & 0x3Fu)));
|
||||
out.push_back(static_cast<uint8_t>(0x80u | (cp & 0x3Fu)));
|
||||
return 3;
|
||||
}
|
||||
out.push_back(static_cast<uint8_t>(0xF0u | ((cp >> 18) & 0x07u)));
|
||||
out.push_back(static_cast<uint8_t>(0x80u | ((cp >> 12) & 0x3Fu)));
|
||||
out.push_back(static_cast<uint8_t>(0x80u | ((cp >> 6) & 0x3Fu)));
|
||||
out.push_back(static_cast<uint8_t>(0x80u | (cp & 0x3Fu)));
|
||||
return 4;
|
||||
}
|
||||
|
||||
// Build the dotted, lowercase UTF-8 representation plus lookup tables.
|
||||
AugmentedWord buildAugmentedWord(const std::vector<CodepointInfo>& cps, const LiangWordConfig& config) {
|
||||
AugmentedWord word;
|
||||
if (cps.empty()) {
|
||||
return word;
|
||||
}
|
||||
|
||||
word.bytes.reserve(cps.size() * 2 + 2);
|
||||
word.charByteOffsets.reserve(cps.size() + 2);
|
||||
|
||||
word.charByteOffsets.push_back(0);
|
||||
word.bytes.push_back('.');
|
||||
|
||||
for (const auto& info : cps) {
|
||||
if (!config.isLetter(info.value)) {
|
||||
word.bytes.clear();
|
||||
word.charByteOffsets.clear();
|
||||
word.byteToCharIndex.clear();
|
||||
return word;
|
||||
}
|
||||
word.charByteOffsets.push_back(word.bytes.size());
|
||||
encodeUtf8(config.toLower(info.value), word.bytes);
|
||||
}
|
||||
|
||||
word.charByteOffsets.push_back(word.bytes.size());
|
||||
word.bytes.push_back('.');
|
||||
|
||||
word.byteToCharIndex.assign(word.bytes.size(), -1);
|
||||
for (size_t i = 0; i < word.charByteOffsets.size(); ++i) {
|
||||
const size_t offset = word.charByteOffsets[i];
|
||||
if (offset < word.byteToCharIndex.size()) {
|
||||
word.byteToCharIndex[offset] = static_cast<int32_t>(i);
|
||||
}
|
||||
}
|
||||
return word;
|
||||
}
|
||||
|
||||
// Decoded view of a single trie node pulled straight out of the serialized blob.
|
||||
// - transitions: contiguous list of next-byte values
|
||||
// - targets: packed relative offsets (1/2/3 bytes) for each transition
|
||||
// - levels: optional pointer into the global levels list with packed dist/level pairs
|
||||
struct AutomatonState {
|
||||
const uint8_t* data = nullptr;
|
||||
size_t size = 0;
|
||||
size_t addr = 0;
|
||||
uint8_t stride = 1;
|
||||
size_t childCount = 0;
|
||||
const uint8_t* transitions = nullptr;
|
||||
const uint8_t* targets = nullptr;
|
||||
const uint8_t* levels = nullptr;
|
||||
size_t levelsLen = 0;
|
||||
|
||||
bool valid() const { return data != nullptr; }
|
||||
};
|
||||
|
||||
// Lightweight descriptor for the entire embedded automaton.
|
||||
// The blob format is:
|
||||
// [0..3] - big-endian root offset
|
||||
// [4....] - node heap containing variable-sized headers + transition data
|
||||
struct EmbeddedAutomaton {
|
||||
const uint8_t* data = nullptr;
|
||||
size_t size = 0;
|
||||
uint32_t rootOffset = 0;
|
||||
|
||||
bool valid() const { return data != nullptr && size >= 4 && rootOffset < size; }
|
||||
};
|
||||
|
||||
// Decode the serialized automaton header and root offset.
|
||||
EmbeddedAutomaton parseAutomaton(const SerializedHyphenationPatterns& patterns) {
|
||||
EmbeddedAutomaton automaton;
|
||||
if (!patterns.data || patterns.size < 4) {
|
||||
return automaton;
|
||||
}
|
||||
|
||||
automaton.data = patterns.data;
|
||||
automaton.size = patterns.size;
|
||||
automaton.rootOffset = (static_cast<uint32_t>(patterns.data[0]) << 24) |
|
||||
(static_cast<uint32_t>(patterns.data[1]) << 16) |
|
||||
(static_cast<uint32_t>(patterns.data[2]) << 8) | static_cast<uint32_t>(patterns.data[3]);
|
||||
if (automaton.rootOffset >= automaton.size) {
|
||||
automaton.data = nullptr;
|
||||
automaton.size = 0;
|
||||
}
|
||||
return automaton;
|
||||
}
|
||||
|
||||
// Cache parsed automata per blob pointer to avoid reparsing.
|
||||
const EmbeddedAutomaton& getAutomaton(const SerializedHyphenationPatterns& patterns) {
|
||||
struct CacheEntry {
|
||||
const SerializedHyphenationPatterns* key;
|
||||
EmbeddedAutomaton automaton;
|
||||
};
|
||||
static std::vector<CacheEntry> cache;
|
||||
|
||||
for (const auto& entry : cache) {
|
||||
if (entry.key == &patterns) {
|
||||
return entry.automaton;
|
||||
}
|
||||
}
|
||||
|
||||
cache.push_back({&patterns, parseAutomaton(patterns)});
|
||||
return cache.back().automaton;
|
||||
}
|
||||
|
||||
// Interpret the node located at `addr`, returning transition metadata.
|
||||
AutomatonState decodeState(const EmbeddedAutomaton& automaton, size_t addr) {
|
||||
AutomatonState state;
|
||||
if (!automaton.valid() || addr >= automaton.size) {
|
||||
return state;
|
||||
}
|
||||
|
||||
const uint8_t* base = automaton.data + addr;
|
||||
size_t remaining = automaton.size - addr;
|
||||
size_t pos = 0;
|
||||
|
||||
const uint8_t header = base[pos++];
|
||||
// Header layout (bits):
|
||||
// 7 - hasLevels flag
|
||||
// 6..5 - stride selector (0 -> 1 byte, otherwise 1|2|3)
|
||||
// 4..0 - child count (5 bits), 31 == overflow -> extra byte
|
||||
const bool hasLevels = (header >> 7) != 0;
|
||||
uint8_t stride = static_cast<uint8_t>((header >> 5) & 0x03u);
|
||||
if (stride == 0) {
|
||||
stride = 1;
|
||||
}
|
||||
size_t childCount = static_cast<size_t>(header & 0x1Fu);
|
||||
if (childCount == 31u) {
|
||||
if (pos >= remaining) {
|
||||
return AutomatonState{};
|
||||
}
|
||||
childCount = base[pos++];
|
||||
}
|
||||
|
||||
const uint8_t* levelsPtr = nullptr;
|
||||
size_t levelsLen = 0;
|
||||
if (hasLevels) {
|
||||
if (pos + 1 >= remaining) {
|
||||
return AutomatonState{};
|
||||
}
|
||||
const uint8_t offsetHi = base[pos++];
|
||||
const uint8_t offsetLoLen = base[pos++];
|
||||
// The 12-bit offset (hi<<4 | top nibble) points into the blob-level levels list.
|
||||
// The bottom nibble stores how many packed entries belong to this node.
|
||||
const size_t offset = (static_cast<size_t>(offsetHi) << 4) | (offsetLoLen >> 4);
|
||||
levelsLen = offsetLoLen & 0x0Fu;
|
||||
if (offset + levelsLen > automaton.size) {
|
||||
return AutomatonState{};
|
||||
}
|
||||
levelsPtr = automaton.data + offset;
|
||||
}
|
||||
|
||||
if (pos + childCount > remaining) {
|
||||
return AutomatonState{};
|
||||
}
|
||||
const uint8_t* transitions = base + pos;
|
||||
pos += childCount;
|
||||
|
||||
const size_t targetsBytes = childCount * stride;
|
||||
if (pos + targetsBytes > remaining) {
|
||||
return AutomatonState{};
|
||||
}
|
||||
const uint8_t* targets = base + pos;
|
||||
|
||||
state.data = automaton.data;
|
||||
state.size = automaton.size;
|
||||
state.addr = addr;
|
||||
state.stride = stride;
|
||||
state.childCount = childCount;
|
||||
state.transitions = transitions;
|
||||
state.targets = targets;
|
||||
state.levels = levelsPtr;
|
||||
state.levelsLen = levelsLen;
|
||||
return state;
|
||||
}
|
||||
|
||||
// Convert the packed stride-sized delta back into a signed offset.
|
||||
int32_t decodeDelta(const uint8_t* buf, uint8_t stride) {
|
||||
if (stride == 1) {
|
||||
return static_cast<int8_t>(buf[0]);
|
||||
}
|
||||
if (stride == 2) {
|
||||
return static_cast<int16_t>((static_cast<uint16_t>(buf[0]) << 8) | static_cast<uint16_t>(buf[1]));
|
||||
}
|
||||
const int32_t unsignedVal =
|
||||
(static_cast<int32_t>(buf[0]) << 16) | (static_cast<int32_t>(buf[1]) << 8) | static_cast<int32_t>(buf[2]);
|
||||
return unsignedVal - (1 << 23);
|
||||
}
|
||||
|
||||
// Follow a single byte transition from `state`, decoding the child node on success.
|
||||
bool transition(const EmbeddedAutomaton& automaton, const AutomatonState& state, uint8_t letter, AutomatonState& out) {
|
||||
if (!state.valid()) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Children remain sorted by letter in the serialized blob, but the lists are
|
||||
// short enough that a linear scan keeps code size down compared to binary search.
|
||||
for (size_t idx = 0; idx < state.childCount; ++idx) {
|
||||
if (state.transitions[idx] != letter) {
|
||||
continue;
|
||||
}
|
||||
const uint8_t* deltaPtr = state.targets + idx * state.stride;
|
||||
const int32_t delta = decodeDelta(deltaPtr, state.stride);
|
||||
// Deltas are relative to the current node's address, allowing us to keep all
|
||||
// targets within 24 bits while still referencing further nodes in the blob.
|
||||
const int64_t nextAddr = static_cast<int64_t>(state.addr) + delta;
|
||||
if (nextAddr < 0 || static_cast<size_t>(nextAddr) >= automaton.size) {
|
||||
return false;
|
||||
}
|
||||
out = decodeState(automaton, static_cast<size_t>(nextAddr));
|
||||
return out.valid();
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
// Converts odd score positions back into codepoint indexes, honoring min prefix/suffix constraints.
|
||||
// Each break corresponds to scores[breakIndex + 1] because of the leading '.' sentinel.
|
||||
// Convert odd score entries into hyphen positions while honoring prefix/suffix limits.
|
||||
std::vector<size_t> collectBreakIndexes(const std::vector<CodepointInfo>& cps, const std::vector<uint8_t>& scores,
|
||||
const size_t minPrefix, const size_t minSuffix) {
|
||||
std::vector<size_t> indexes;
|
||||
const size_t cpCount = cps.size();
|
||||
if (cpCount < 2) {
|
||||
return indexes;
|
||||
}
|
||||
|
||||
for (size_t breakIndex = 1; breakIndex < cpCount; ++breakIndex) {
|
||||
if (breakIndex < minPrefix) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const size_t suffixCount = cpCount - breakIndex;
|
||||
if (suffixCount < minSuffix) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const size_t scoreIdx = breakIndex + 1;
|
||||
if (scoreIdx >= scores.size()) {
|
||||
break;
|
||||
}
|
||||
if ((scores[scoreIdx] & 1u) == 0) {
|
||||
continue;
|
||||
}
|
||||
indexes.push_back(breakIndex);
|
||||
}
|
||||
|
||||
return indexes;
|
||||
}
|
||||
|
||||
} // namespace
|
||||
|
||||
// Entry point that runs the full Liang pipeline for a single word.
|
||||
std::vector<size_t> liangBreakIndexes(const std::vector<CodepointInfo>& cps,
|
||||
const SerializedHyphenationPatterns& patterns, const LiangWordConfig& config) {
|
||||
const auto augmented = buildAugmentedWord(cps, config);
|
||||
if (augmented.empty()) {
|
||||
return {};
|
||||
}
|
||||
|
||||
const EmbeddedAutomaton& automaton = getAutomaton(patterns);
|
||||
if (!automaton.valid()) {
|
||||
return {};
|
||||
}
|
||||
|
||||
const AutomatonState root = decodeState(automaton, automaton.rootOffset);
|
||||
if (!root.valid()) {
|
||||
return {};
|
||||
}
|
||||
|
||||
// Liang scores: one entry per augmented char (leading/trailing dots included).
|
||||
std::vector<uint8_t> scores(augmented.charCount(), 0);
|
||||
|
||||
// Walk every starting character position and stream bytes through the trie.
|
||||
for (size_t charStart = 0; charStart < augmented.charByteOffsets.size(); ++charStart) {
|
||||
const size_t byteStart = augmented.charByteOffsets[charStart];
|
||||
AutomatonState state = root;
|
||||
|
||||
for (size_t cursor = byteStart; cursor < augmented.bytes.size(); ++cursor) {
|
||||
AutomatonState next;
|
||||
if (!transition(automaton, state, augmented.bytes[cursor], next)) {
|
||||
break; // No more matches for this prefix.
|
||||
}
|
||||
state = next;
|
||||
|
||||
if (state.levels && state.levelsLen > 0) {
|
||||
size_t offset = 0;
|
||||
// Each packed byte stores the byte-distance delta and the Liang level digit.
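// e.g. a packed value of 31 advances the split point by 3 bytes and records Liang level 1.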
|
||||
for (size_t i = 0; i < state.levelsLen; ++i) {
|
||||
const uint8_t packed = state.levels[i];
|
||||
const size_t dist = static_cast<size_t>(packed / 10);
|
||||
const uint8_t level = static_cast<uint8_t>(packed % 10);
|
||||
|
||||
offset += dist;
|
||||
const size_t splitByte = byteStart + offset;
|
||||
if (splitByte >= augmented.byteToCharIndex.size()) {
|
||||
continue;
|
||||
}
|
||||
|
||||
const int32_t boundary = augmented.byteToCharIndex[splitByte];
|
||||
if (boundary < 0) {
|
||||
continue; // Mid-codepoint byte, wait for the next one.
|
||||
}
|
||||
if (boundary < 2 || boundary + 2 > static_cast<int32_t>(augmented.charCount())) {
|
||||
continue; // Skip splits that land in the leading/trailing sentinels.
|
||||
}
|
||||
|
||||
const size_t idx = static_cast<size_t>(boundary);
|
||||
if (idx >= scores.size()) {
|
||||
continue;
|
||||
}
|
||||
scores[idx] = std::max(scores[idx], level);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return collectBreakIndexes(cps, scores, config.minPrefix, config.minSuffix);
|
||||
}
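// Illustrative sketch (editor's note, not part of this commit): one way a caller could turn the
// returned codepoint indexes into literal soft hyphens. insertSoftHyphenAtCodepoint() is a
// hypothetical helper assumed to splice U+00AD into the UTF-8 word at a codepoint index.
static void exampleInsertSoftHyphens(std::string& word, const std::vector<CodepointInfo>& cps,
                                     const SerializedHyphenationPatterns& patterns,
                                     const LiangWordConfig& config) {
  const std::vector<size_t> breaks = liangBreakIndexes(cps, patterns, config);
  // Walk the break positions back to front so earlier indexes remain valid while inserting.
  for (auto it = breaks.rbegin(); it != breaks.rend(); ++it) {
    insertSoftHyphenAtCodepoint(word, *it);  // hypothetical helper (assumption)
  }
}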
|
||||
38
lib/Epub/Epub/hyphenation/LiangHyphenation.h
Normal file
38
lib/Epub/Epub/hyphenation/LiangHyphenation.h
Normal file
@ -0,0 +1,38 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
#include <vector>
|
||||
|
||||
#include "HyphenationCommon.h"
|
||||
#include "SerializedHyphenationTrie.h"
|
||||
|
||||
// Encapsulates every language-specific dial the Liang algorithm needs at runtime. The helpers are
|
||||
// intentionally represented as bare function pointers because we invoke them inside tight loops and
|
||||
// want to avoid the overhead of std::function or functors. The minima default to the TeX-recommended
|
||||
// "2/2" split but individual languages (English, for example) can override them.
|
||||
struct LiangWordConfig {
|
||||
static constexpr size_t kDefaultMinPrefix = 2;
|
||||
static constexpr size_t kDefaultMinSuffix = 2;
|
||||
// Predicate used to reject non-alphabetic characters before pattern lookup. Returning false causes
|
||||
// the entire word to be skipped, matching the behavior of classic TeX hyphenation tables.
|
||||
bool (*isLetter)(uint32_t);
|
||||
// Language-specific case folding that matches how the TeX patterns were authored (usually lowercase
|
||||
// ASCII for Latin and lowercase Cyrillic for Russian). Patterns are stored in UTF-8, so this must
|
||||
// operate on Unicode scalars rather than bytes.
|
||||
uint32_t (*toLower)(uint32_t);
|
||||
// Minimum codepoints required on the left/right of any break. These correspond to TeX's
|
||||
// lefthyphenmin and righthyphenmin knobs.
|
||||
size_t minPrefix;
|
||||
size_t minSuffix;
|
||||
|
||||
// Lightweight convenience constructor so call sites can declare `const LiangWordConfig config(...)`
|
||||
// without verbose member assignment boilerplate.
|
||||
LiangWordConfig(bool (*letterFn)(uint32_t), uint32_t (*lowerFn)(uint32_t), size_t prefix = kDefaultMinPrefix,
|
||||
size_t suffix = kDefaultMinSuffix)
|
||||
: isLetter(letterFn), toLower(lowerFn), minPrefix(prefix), minSuffix(suffix) {}
|
||||
};
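// Illustrative sketch (editor's note, not part of this commit): a hypothetical ASCII-only
// language could wire the struct up as follows; the example names are invented for illustration.
inline bool exampleIsAsciiLetter(uint32_t cp) { return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z'); }
inline uint32_t exampleAsciiToLower(uint32_t cp) { return (cp >= 'A' && cp <= 'Z') ? cp + 0x20 : cp; }
inline const LiangWordConfig exampleAsciiConfig(&exampleIsAsciiLetter, &exampleAsciiToLower,
                                                /*prefix=*/2, /*suffix=*/3);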
|
||||
|
||||
// Shared Liang pattern evaluator used by every language-specific hyphenator.
|
||||
std::vector<size_t> liangBreakIndexes(const std::vector<CodepointInfo>& cps,
|
||||
const SerializedHyphenationPatterns& patterns, const LiangWordConfig& config);
|
||||
10
lib/Epub/Epub/hyphenation/SerializedHyphenationTrie.h
Normal file
10
lib/Epub/Epub/hyphenation/SerializedHyphenationTrie.h
Normal file
@ -0,0 +1,10 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
|
||||
// Lightweight descriptor that points at a serialized Liang hyphenation trie stored in flash.
|
||||
struct SerializedHyphenationPatterns {
|
||||
const std::uint8_t* data;
|
||||
size_t size;
|
||||
};
|
||||
10871
lib/Epub/Epub/hyphenation/generated/hyph-de.trie.h
Normal file
10871
lib/Epub/Epub/hyphenation/generated/hyph-de.trie.h
Normal file
File diff suppressed because it is too large
Load Diff
1434
lib/Epub/Epub/hyphenation/generated/hyph-en.trie.h
Normal file
1434
lib/Epub/Epub/hyphenation/generated/hyph-en.trie.h
Normal file
File diff suppressed because it is too large
Load Diff
383
lib/Epub/Epub/hyphenation/generated/hyph-fr.trie.h
Normal file
383
lib/Epub/Epub/hyphenation/generated/hyph-fr.trie.h
Normal file
@ -0,0 +1,383 @@
|
||||
#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
|
||||
#include "../SerializedHyphenationTrie.h"
|
||||
|
||||
// Auto-generated by generate_hyphenation_trie.py. Do not edit manually.
|
||||
alignas(4) constexpr uint8_t fr_trie_data[] = {
|
||||
0x00, 0x00, 0x1A, 0xF4, 0x02, 0x0C, 0x18, 0x22, 0x16, 0x21, 0x0B, 0x16, 0x21, 0x0E, 0x01, 0x0C, 0x0B, 0x3D, 0x0C,
|
||||
0x2B, 0x0E, 0x0C, 0x0C, 0x33, 0x0C, 0x33, 0x16, 0x34, 0x2A, 0x0D, 0x20, 0x0D, 0x0C, 0x0D, 0x2A, 0x17, 0x04, 0x1F,
|
||||
0x0C, 0x29, 0x0C, 0x20, 0x0B, 0x0C, 0x17, 0x17, 0x0C, 0x3F, 0x35, 0x53, 0x4A, 0x36, 0x34, 0x21, 0x2A, 0x0D, 0x0C,
|
||||
0x2A, 0x0D, 0x16, 0x02, 0x17, 0x15, 0x15, 0x0C, 0x15, 0x16, 0x2C, 0x47, 0x0C, 0x49, 0x2B, 0x0C, 0x0D, 0x34, 0x0D,
|
||||
0x2A, 0x0B, 0x16, 0x2B, 0x0C, 0x17, 0x2A, 0x0B, 0x0C, 0x03, 0x0C, 0x16, 0x0D, 0x01, 0x16, 0x0C, 0x0B, 0x0C, 0x3E,
|
||||
0x48, 0x2C, 0x0B, 0x29, 0x16, 0x37, 0x40, 0x1F, 0x16, 0x20, 0x17, 0x36, 0x0D, 0x52, 0x3D, 0x16, 0x1F, 0x0C, 0x16,
|
||||
0x3E, 0x0D, 0x49, 0x0C, 0x03, 0x16, 0x35, 0x0C, 0x22, 0x0F, 0x02, 0x0D, 0x51, 0x0C, 0x21, 0x0C, 0x20, 0x0B, 0x16,
|
||||
0x21, 0x0C, 0x17, 0x21, 0x0C, 0x0D, 0xA0, 0x00, 0x91, 0x21, 0x61, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21,
|
||||
0x72, 0xFD, 0xA0, 0x00, 0xC2, 0x21, 0x68, 0xFD, 0x21, 0x63, 0xFD, 0x21, 0x73, 0xFD, 0xA0, 0x00, 0x51, 0x21, 0x6C,
|
||||
0xFD, 0x21, 0x6F, 0xFD, 0x21, 0x6F, 0xFD, 0x21, 0x63, 0xFD, 0xA0, 0x01, 0x12, 0x21, 0x63, 0xFD, 0x21, 0x61, 0xFD,
|
||||
0x21, 0x6F, 0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x69, 0xFD, 0xA0, 0x01, 0x32, 0x21, 0x72, 0xFD, 0x21, 0x74, 0xFD, 0x21,
|
||||
0x73, 0xFD, 0xA0, 0x01, 0x52, 0x21, 0x69, 0xFD, 0x21, 0x73, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x68,
|
||||
0xFD, 0x21, 0x74, 0xFD, 0x21, 0x73, 0xFD, 0xA0, 0x01, 0x72, 0xA0, 0x01, 0xB1, 0x21, 0x65, 0xFD, 0x21, 0x6E, 0xFD,
|
||||
0xA1, 0x01, 0x72, 0x6E, 0xFD, 0xA0, 0x01, 0x92, 0x21, 0xA9, 0xFD, 0x24, 0x61, 0x65, 0xC3, 0x73, 0xE9, 0xF5, 0xFD,
|
||||
0xE9, 0x21, 0x69, 0xF7, 0x23, 0x61, 0x65, 0x74, 0xC2, 0xDA, 0xFD, 0xA0, 0x01, 0xC2, 0x21, 0x61, 0xFD, 0x21, 0x74,
|
||||
0xFD, 0x21, 0x73, 0xFD, 0x21, 0x6F, 0xFD, 0xA0, 0x01, 0xE1, 0x21, 0x61, 0xFD, 0x21, 0x74, 0xFD, 0x41, 0x2E, 0xFF,
|
||||
0x5E, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x22, 0x67, 0x70, 0xFD, 0xFD, 0xA0, 0x05, 0x72, 0x21,
|
||||
0x74, 0xFD, 0x21, 0x61, 0xFD, 0x21, 0x6E, 0xFD, 0xC9, 0x00, 0x61, 0x62, 0x65, 0x6C, 0x6D, 0x6E, 0x70, 0x73, 0x72,
|
||||
0x67, 0xFF, 0x4C, 0xFF, 0x58, 0xFF, 0x67, 0xFF, 0x79, 0xFF, 0xC3, 0xFF, 0xD6, 0xFF, 0xDF, 0xFF, 0xEF, 0xFF, 0xFD,
|
||||
0xA0, 0x00, 0x71, 0x27, 0xA2, 0xAA, 0xA9, 0xA8, 0xAE, 0xB4, 0xBB, 0xFD, 0xFD, 0xFD, 0xFD, 0xFD, 0xFD, 0xFD, 0xA0,
|
||||
0x02, 0x52, 0x22, 0x61, 0x6F, 0xFD, 0xFD, 0xA0, 0x02, 0x93, 0x21, 0x61, 0xFD, 0x21, 0x72, 0xFD, 0xA2, 0x00, 0x61,
|
||||
0x6E, 0x75, 0xF2, 0xFD, 0x21, 0xA9, 0xAC, 0x42, 0xC3, 0x69, 0xFF, 0xFD, 0xFF, 0xA9, 0x21, 0x6E, 0xF9, 0x41, 0x74,
|
||||
0xFF, 0x06, 0x21, 0x61, 0xFC, 0x21, 0x6D, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x6F, 0xFD, 0xA0, 0x01, 0xE2, 0x21, 0x74,
|
||||
0xFD, 0x21, 0x69, 0xFD, 0x41, 0x72, 0xFF, 0x6B, 0x21, 0x75, 0xFC, 0x21, 0x67, 0xFD, 0xA2, 0x02, 0x52, 0x6E, 0x75,
|
||||
0xF3, 0xFD, 0x41, 0x62, 0xFF, 0x5A, 0x21, 0x61, 0xFC, 0x21, 0x66, 0xFD, 0x41, 0x74, 0xFF, 0x50, 0x41, 0x72, 0xFF,
|
||||
0x4F, 0x21, 0x6F, 0xFC, 0xC4, 0x02, 0x52, 0x66, 0x70, 0x72, 0x78, 0xFF, 0xF2, 0xFF, 0xF5, 0xFF, 0x45, 0xFF, 0xFD,
|
||||
0xA0, 0x06, 0x82, 0x21, 0x61, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x63, 0xFD, 0x21, 0x75, 0xFD, 0x21, 0x72, 0xF4, 0x21,
|
||||
0x72, 0xFD, 0x21, 0x61, 0xFD, 0xA2, 0x06, 0x62, 0x6C, 0x6E, 0xF4, 0xFD, 0x21, 0xA9, 0xF9, 0x41, 0x69, 0xFF, 0xA0,
|
||||
0x21, 0x74, 0xFC, 0x21, 0x69, 0xFD, 0xC3, 0x02, 0x52, 0x6D, 0x71, 0x74, 0xFF, 0xFD, 0xFF, 0x96, 0xFF, 0x96, 0x41,
|
||||
0x6C, 0xFF, 0x8A, 0x21, 0x75, 0xFC, 0x41, 0x64, 0xFE, 0xF7, 0xA2, 0x02, 0x52, 0x63, 0x6E, 0xF9, 0xFC, 0x41, 0x62,
|
||||
0xFF, 0x43, 0x21, 0x61, 0xFC, 0x21, 0x74, 0xFD, 0xA0, 0x05, 0xF1, 0xA0, 0x06, 0xC1, 0x21, 0xA9, 0xFD, 0xA7, 0x06,
|
||||
0xA2, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x75, 0x73, 0xF7, 0xF7, 0xFD, 0xF7, 0xF7, 0xF7, 0xF7, 0x21, 0x72, 0xEF, 0x21,
|
||||
0x65, 0xFD, 0xC2, 0x02, 0x52, 0x69, 0x6C, 0xFF, 0x72, 0xFF, 0x4E, 0x49, 0x66, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x73,
|
||||
0x74, 0x75, 0xFF, 0x42, 0xFF, 0x58, 0xFF, 0x74, 0xFF, 0xA2, 0xFF, 0xAF, 0xFF, 0xC6, 0xFF, 0xD4, 0xFF, 0xF4, 0xFF,
|
||||
0xF7, 0xC2, 0x00, 0x61, 0x67, 0x6E, 0xFF, 0x16, 0xFF, 0xE4, 0x41, 0x75, 0xFE, 0xA7, 0x21, 0x67, 0xFC, 0x41, 0x65,
|
||||
0xFF, 0x09, 0x21, 0x74, 0xFC, 0xA0, 0x02, 0x71, 0x21, 0x75, 0xFD, 0x21, 0x6F, 0xFD, 0x21, 0x61, 0xFD, 0xA0, 0x02,
|
||||
0x72, 0x21, 0x63, 0xFD, 0x21, 0x73, 0xFD, 0x21, 0x69, 0xFD, 0xA4, 0x00, 0x61, 0x6E, 0x63, 0x75, 0x76, 0xDE, 0xE5,
|
||||
0xF1, 0xFD, 0xA0, 0x00, 0x61, 0xC7, 0x00, 0x42, 0x61, 0xC3, 0x65, 0x69, 0x6F, 0x75, 0x79, 0xFE, 0x87, 0xFE, 0xA8,
|
||||
0xFE, 0xC8, 0xFF, 0xC3, 0xFF, 0xF2, 0xFF, 0xFD, 0xFF, 0xFD, 0x42, 0x61, 0x74, 0xFD, 0xF4, 0xFE, 0x2F, 0x43, 0x64,
|
||||
0x67, 0x70, 0xFE, 0x54, 0xFE, 0x54, 0xFE, 0x54, 0xC8, 0x00, 0x61, 0x62, 0x65, 0x6D, 0x6E, 0x70, 0x73, 0x72, 0x67,
|
||||
0xFD, 0xAA, 0xFD, 0xB6, 0xFD, 0xD7, 0xFF, 0xEF, 0xFE, 0x34, 0xFE, 0x3D, 0xFF, 0xF6, 0xFE, 0x5B, 0xA0, 0x03, 0x01,
|
||||
0x21, 0x2E, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x69, 0xFD, 0xA1,
|
||||
0x00, 0x71, 0x6D, 0xFD, 0x47, 0xA2, 0xAA, 0xA9, 0xA8, 0xAE, 0xB4, 0xBB, 0xFE, 0x47, 0xFE, 0x47, 0xFF, 0xFB, 0xFE,
|
||||
0x47, 0xFE, 0x47, 0xFE, 0x47, 0xFE, 0x47, 0xA0, 0x02, 0x22, 0x21, 0x6E, 0xFD, 0x21, 0x69, 0xFD, 0x21, 0x61, 0xFD,
|
||||
0x21, 0x6D, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x73, 0xFD, 0x21, 0x69, 0xFD, 0xA0, 0x02, 0x51, 0x43, 0x63, 0x74, 0x75,
|
||||
0xFE, 0x28, 0xFE, 0x28, 0xFF, 0xFD, 0x41, 0x61, 0xFF, 0x4D, 0x44, 0x61, 0x6F, 0x73, 0x75, 0xFF, 0xF2, 0xFF, 0xFC,
|
||||
0xFE, 0x25, 0xFE, 0x1A, 0x22, 0x61, 0x69, 0xDF, 0xF3, 0xA0, 0x03, 0x42, 0x21, 0x65, 0xFD, 0x21, 0x6C, 0xFD, 0x21,
|
||||
0x6C, 0xFD, 0x21, 0x69, 0xFD, 0x21, 0x75, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x66, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x72,
|
||||
0xFD, 0x21, 0x76, 0xFD, 0x21, 0xA8, 0xFD, 0xA1, 0x00, 0x71, 0xC3, 0xFD, 0xA0, 0x02, 0x92, 0x21, 0x70, 0xFD, 0x21,
|
||||
0x6C, 0xFD, 0x21, 0x61, 0xFD, 0x21, 0x73, 0xFD, 0xA0, 0x03, 0x31, 0xA0, 0x04, 0x42, 0x21, 0x63, 0xFD, 0xA0, 0x04,
|
||||
0x61, 0x21, 0x65, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0xAE, 0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x61, 0xFD,
|
||||
0x22, 0x73, 0x6D, 0xE8, 0xFD, 0x21, 0x65, 0xFB, 0x21, 0x72, 0xFD, 0xA2, 0x04, 0x31, 0x73, 0x74, 0xD7, 0xFD, 0x41,
|
||||
0x65, 0xFD, 0xD5, 0x21, 0x69, 0xFC, 0xA1, 0x02, 0x52, 0x6C, 0xFD, 0xA0, 0x01, 0x31, 0x21, 0x2E, 0xFD, 0x21, 0x74,
|
||||
0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x6D, 0xFD, 0x23, 0x6E, 0x6F, 0x6D, 0xDB, 0xE9, 0xFD, 0xA0, 0x04,
|
||||
0x31, 0x21, 0x6C, 0xFD, 0x44, 0x68, 0x69, 0x6F, 0x75, 0xFF, 0x91, 0xFF, 0xA2, 0xFF, 0xF3, 0xFF, 0xFD, 0x41, 0x61,
|
||||
0xFF, 0x9B, 0x21, 0x6F, 0xFC, 0x21, 0x79, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x63, 0xFD, 0x41, 0x6F, 0xFE, 0x7B, 0xA0,
|
||||
0x04, 0x73, 0x21, 0x72, 0xFD, 0xA0, 0x04, 0xA2, 0x21, 0x6C, 0xF7, 0x21, 0x6C, 0xFD, 0x21, 0x65, 0xFD, 0xA0, 0x04,
|
||||
0x72, 0x21, 0x72, 0xFD, 0x21, 0x74, 0xFD, 0x24, 0x63, 0x6D, 0x74, 0x73, 0xE8, 0xEB, 0xF4, 0xFD, 0xA0, 0x04, 0xF3,
|
||||
0x21, 0x72, 0xFD, 0xA1, 0x04, 0xC3, 0x67, 0xFD, 0x21, 0xA9, 0xFB, 0x21, 0x62, 0xE0, 0x21, 0x69, 0xFD, 0x21, 0x73,
|
||||
0xFD, 0x21, 0x74, 0xD7, 0x21, 0x75, 0xD4, 0x23, 0x6E, 0x72, 0x78, 0xF7, 0xFA, 0xFD, 0x21, 0x6E, 0xB8, 0x21, 0x69,
|
||||
0xB5, 0x21, 0x6F, 0xC4, 0x22, 0x65, 0x76, 0xF7, 0xFD, 0xC6, 0x05, 0x23, 0x64, 0x67, 0x6C, 0x6E, 0x72, 0x73, 0xFF,
|
||||
0xAA, 0xFF, 0xF2, 0xFF, 0xF5, 0xFF, 0xFB, 0xFF, 0xAA, 0xFF, 0xE5, 0x41, 0xA9, 0xFF, 0x95, 0x21, 0xC3, 0xFC, 0x41,
|
||||
0x69, 0xFF, 0x97, 0x42, 0x6D, 0x70, 0xFF, 0x9C, 0xFF, 0x9C, 0x41, 0x66, 0xFF, 0x98, 0x45, 0x64, 0x6C, 0x70, 0x72,
|
||||
0x75, 0xFF, 0xEE, 0xFF, 0x7F, 0xFF, 0xF1, 0xFF, 0xF5, 0xFF, 0xFC, 0xA0, 0x04, 0xC2, 0x21, 0x93, 0xFD, 0xA0, 0x05,
|
||||
0x23, 0x21, 0x6E, 0xFD, 0xCA, 0x01, 0xC1, 0x61, 0x63, 0xC3, 0x65, 0x69, 0x6F, 0xC5, 0x70, 0x74, 0x75, 0xFF, 0x7E,
|
||||
0xFF, 0x75, 0xFF, 0x92, 0xFF, 0xA4, 0xFF, 0xB9, 0xFF, 0xE4, 0xFF, 0xF7, 0xFF, 0x75, 0xFF, 0x75, 0xFF, 0xFD, 0x44,
|
||||
0x61, 0x69, 0x6F, 0x73, 0xFD, 0xC5, 0xFF, 0x3E, 0xFD, 0xC5, 0xFF, 0xDF, 0x21, 0xA9, 0xF3, 0x41, 0xA9, 0xFC, 0x86,
|
||||
0x41, 0x64, 0xFC, 0x82, 0x22, 0xC3, 0x69, 0xF8, 0xFC, 0x41, 0x64, 0xFE, 0x4E, 0x41, 0x69, 0xFC, 0x75, 0x41, 0x6D,
|
||||
0xFC, 0x71, 0x21, 0x6F, 0xFC, 0x24, 0x63, 0x6C, 0x6D, 0x74, 0xEC, 0xF1, 0xF5, 0xFD, 0x41, 0x6E, 0xFC, 0x61, 0x41,
|
||||
0x68, 0xFC, 0x92, 0x23, 0x61, 0x65, 0x73, 0xEF, 0xF8, 0xFC, 0xC4, 0x01, 0xE2, 0x61, 0x69, 0x6F, 0x75, 0xFC, 0x5A,
|
||||
0xFC, 0x5A, 0xFC, 0x5A, 0xFC, 0x5A, 0x21, 0x73, 0xF1, 0x41, 0x6C, 0xFB, 0xFC, 0x45, 0x61, 0xC3, 0x69, 0x79, 0x6F,
|
||||
0xFE, 0xE1, 0xFF, 0xB3, 0xFF, 0xE3, 0xFF, 0xF9, 0xFF, 0xFC, 0x48, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x73, 0x74, 0x75,
|
||||
0xFC, 0x74, 0xFC, 0x90, 0xFC, 0xBE, 0xFC, 0xCB, 0xFC, 0xE2, 0xFC, 0xF0, 0xFD, 0x10, 0xFD, 0x13, 0xC2, 0x00, 0x61,
|
||||
0x67, 0x6E, 0xFC, 0x35, 0xFF, 0xE7, 0x41, 0x64, 0xFE, 0x6A, 0x21, 0x69, 0xFC, 0x41, 0x61, 0xFC, 0x3B, 0x21, 0x63,
|
||||
0xFC, 0x21, 0x69, 0xFD, 0x22, 0x63, 0x66, 0xF3, 0xFD, 0x41, 0x6D, 0xFC, 0x29, 0x22, 0x69, 0x75, 0xF7, 0xFC, 0x21,
|
||||
0x6E, 0xFB, 0x41, 0x73, 0xFB, 0x25, 0x21, 0x6F, 0xFC, 0x42, 0x6B, 0x72, 0xFC, 0x16, 0xFF, 0xFD, 0x41, 0x73, 0xFB,
|
||||
0xE2, 0x42, 0x65, 0x6F, 0xFF, 0xFC, 0xFB, 0xDE, 0x21, 0x72, 0xF9, 0x41, 0xA9, 0xFD, 0xED, 0x21, 0xC3, 0xFC, 0x21,
|
||||
0x73, 0xFD, 0x44, 0x64, 0x69, 0x70, 0x76, 0xFF, 0xF3, 0xFF, 0xFD, 0xFD, 0xE3, 0xFB, 0xCA, 0x41, 0x6E, 0xFD, 0xD6,
|
||||
0x41, 0x74, 0xFD, 0xD2, 0x21, 0x6E, 0xFC, 0x42, 0x63, 0x64, 0xFD, 0xCB, 0xFB, 0xB2, 0x24, 0x61, 0x65, 0x69, 0x6F,
|
||||
0xE1, 0xEE, 0xF6, 0xF9, 0x41, 0x78, 0xFD, 0xBB, 0x24, 0x67, 0x63, 0x6C, 0x72, 0xAB, 0xB5, 0xF3, 0xFC, 0x41, 0x68,
|
||||
0xFE, 0xCA, 0x21, 0x6F, 0xFC, 0xC1, 0x01, 0xC1, 0x6E, 0xFD, 0xF2, 0x41, 0x73, 0xFE, 0xBD, 0x41, 0x73, 0xFE, 0xBF,
|
||||
0x44, 0x61, 0x65, 0x69, 0x75, 0xFF, 0xF2, 0xFF, 0xF8, 0xFE, 0xB5, 0xFF, 0xFC, 0x41, 0x61, 0xFA, 0xA5, 0x21, 0x74,
|
||||
0xFC, 0x21, 0x73, 0xFD, 0x21, 0x61, 0xFD, 0x23, 0x67, 0x73, 0x74, 0xD5, 0xE6, 0xFD, 0x21, 0xA9, 0xF9, 0xA0, 0x01,
|
||||
0x11, 0x21, 0x6D, 0xFD, 0x21, 0x61, 0xFD, 0x21, 0x69, 0xFD, 0x21, 0x6C, 0xFD, 0x21, 0x6C, 0xFD, 0x41, 0xC3, 0xFA,
|
||||
0xC6, 0x21, 0x64, 0xFC, 0x42, 0xA9, 0xAF, 0xFA, 0xBC, 0xFF, 0xFD, 0x47, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x75, 0x73,
|
||||
0xFA, 0xA4, 0xFA, 0xA4, 0xFF, 0xF9, 0xFA, 0xA4, 0xFA, 0xA4, 0xFA, 0xA4, 0xFA, 0xA4, 0x21, 0x6F, 0xEA, 0x21, 0x6E,
|
||||
0xFD, 0x44, 0x61, 0xC3, 0x69, 0x6F, 0xFF, 0x82, 0xFF, 0xC1, 0xFF, 0xD3, 0xFF, 0xFD, 0x41, 0x68, 0xFA, 0xA5, 0x21,
|
||||
0x74, 0xFC, 0x21, 0x61, 0xFD, 0x21, 0x6E, 0xFD, 0xA0, 0x06, 0x22, 0x21, 0xA9, 0xFD, 0x41, 0xA9, 0xFC, 0x27, 0x21,
|
||||
0xC3, 0xFC, 0x21, 0x63, 0xFD, 0xA0, 0x07, 0x82, 0x21, 0x68, 0xFD, 0x21, 0x64, 0xFD, 0x24, 0x67, 0xC3, 0x73, 0x75,
|
||||
0xE4, 0xEA, 0xF4, 0xFD, 0x41, 0x61, 0xFD, 0x8E, 0xC2, 0x01, 0x72, 0x6C, 0x75, 0xFF, 0xFC, 0xFA, 0x4B, 0x47, 0x61,
|
||||
0xC3, 0x65, 0x69, 0x6F, 0x75, 0x73, 0xFF, 0xF7, 0xFA, 0x53, 0xFA, 0x3F, 0xFA, 0x3F, 0xFA, 0x3F, 0xFA, 0x3F, 0xFA,
|
||||
0x3F, 0x21, 0xA9, 0xEA, 0x22, 0x6F, 0xC3, 0xD1, 0xFD, 0x41, 0xA9, 0xFA, 0xB9, 0x21, 0xC3, 0xFC, 0x43, 0x66, 0x6D,
|
||||
0x72, 0xFA, 0xB2, 0xFF, 0xFD, 0xFA, 0xB5, 0x41, 0x73, 0xFC, 0xC1, 0x42, 0x68, 0x74, 0xFA, 0xA4, 0xFC, 0xBD, 0x21,
|
||||
0x70, 0xF9, 0x23, 0x61, 0x69, 0x6F, 0xE8, 0xF2, 0xFD, 0x41, 0xA8, 0xFA, 0x93, 0x42, 0x65, 0xC3, 0xFA, 0x8F, 0xFF,
|
||||
0xFC, 0x21, 0x68, 0xF9, 0x42, 0x63, 0x73, 0xFF, 0xFD, 0xF9, 0xED, 0x41, 0xA9, 0xFA, 0xAB, 0x21, 0xC3, 0xFC, 0x43,
|
||||
0x61, 0x68, 0x65, 0xFF, 0xF2, 0xFF, 0xFD, 0xFA, 0x28, 0x43, 0x6E, 0x72, 0x74, 0xFF, 0xD3, 0xFF, 0xF6, 0xFA, 0x21,
|
||||
0xA0, 0x01, 0xC1, 0x21, 0x61, 0xFD, 0x21, 0x74, 0xFD, 0xC6, 0x00, 0x71, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x75, 0xFB,
|
||||
0x81, 0xFB, 0x81, 0xFF, 0x57, 0xFB, 0x81, 0xFB, 0x81, 0xFB, 0x81, 0x22, 0x6E, 0x72, 0xE8, 0xEB, 0x41, 0x73, 0xFE,
|
||||
0xE4, 0xA0, 0x07, 0x22, 0x21, 0x61, 0xFD, 0xA2, 0x01, 0x12, 0x73, 0x74, 0xFA, 0xFD, 0x43, 0x6F, 0x73, 0x75, 0xFF,
|
||||
0xEF, 0xFF, 0xF9, 0xF9, 0x61, 0x21, 0x69, 0xF6, 0x21, 0x72, 0xFD, 0x21, 0xA9, 0xFD, 0xA0, 0x07, 0x42, 0x21, 0x74,
|
||||
0xFD, 0x21, 0x73, 0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x61, 0xFD, 0x21, 0x6C, 0xFD, 0xA1, 0x00, 0x71, 0x61, 0xFD, 0x41,
|
||||
0x61, 0xFE, 0xA9, 0x21, 0x69, 0xFC, 0x21, 0x72, 0xFD, 0x21, 0x75, 0xFD, 0x41, 0x74, 0xFF, 0x95, 0x21, 0x65, 0xFC,
|
||||
0x21, 0x74, 0xFD, 0x41, 0x6E, 0xFD, 0x23, 0x45, 0x68, 0x69, 0x6F, 0x72, 0x73, 0xF9, 0x7C, 0xFF, 0xFC, 0xFD, 0x25,
|
||||
0xF9, 0x7C, 0xF9, 0x52, 0x21, 0x74, 0xF0, 0x22, 0x6E, 0x73, 0xE6, 0xFD, 0x41, 0x6E, 0xFB, 0xFD, 0x21, 0x61, 0xFC,
|
||||
0x21, 0x6F, 0xFD, 0x21, 0x68, 0xFD, 0x21, 0x63, 0xFD, 0x21, 0x79, 0xFD, 0x41, 0x6C, 0xFA, 0xE6, 0x21, 0x64, 0xFC,
|
||||
0x21, 0x64, 0xFD, 0x49, 0x72, 0x61, 0x65, 0xC3, 0x68, 0x6C, 0x6F, 0x73, 0x75, 0xFE, 0xF7, 0xFF, 0x48, 0xFF, 0x70,
|
||||
0xFF, 0x96, 0xFF, 0xAB, 0xFF, 0xBA, 0xFF, 0xDE, 0xFF, 0xF3, 0xFF, 0xFD, 0x41, 0x6E, 0xF9, 0x2B, 0x21, 0x67, 0xFC,
|
||||
0x41, 0x6C, 0xFB, 0x17, 0x21, 0x6C, 0xFC, 0x22, 0x61, 0x69, 0xF6, 0xFD, 0x41, 0x67, 0xFE, 0x7D, 0x21, 0x6E, 0xFC,
|
||||
0x41, 0x72, 0xFB, 0xF2, 0x41, 0x65, 0xFF, 0x18, 0x21, 0x6C, 0xFC, 0x42, 0x72, 0x75, 0xFB, 0xE7, 0xFF, 0xFD, 0x41,
|
||||
0x68, 0xFB, 0xEA, 0xA0, 0x08, 0x02, 0x21, 0x74, 0xFD, 0xA1, 0x02, 0x93, 0x6C, 0xFD, 0xA0, 0x08, 0x53, 0xA1, 0x08,
|
||||
0x23, 0x72, 0xFD, 0x21, 0xA9, 0xFB, 0x41, 0x6E, 0xF9, 0x80, 0x21, 0x69, 0xFC, 0x42, 0x6D, 0x6E, 0xFF, 0xFD, 0xF9,
|
||||
0x79, 0x42, 0x69, 0x75, 0xFF, 0xF9, 0xF9, 0x72, 0x41, 0x72, 0xFB, 0x57, 0x45, 0x61, 0xC3, 0x69, 0x6C, 0x75, 0xFF,
|
||||
0xD7, 0xFF, 0xE4, 0xFD, 0x7D, 0xFF, 0xF5, 0xFF, 0xFC, 0xA0, 0x08, 0x83, 0xA1, 0x02, 0x93, 0x74, 0xFD, 0x21, 0x75,
|
||||
0xB9, 0x21, 0x6C, 0xB6, 0xA3, 0x02, 0x93, 0x61, 0x6C, 0x74, 0xFA, 0xFD, 0xB3, 0xA0, 0x08, 0x23, 0x21, 0xA9, 0xFD,
|
||||
0x42, 0x66, 0x74, 0xFB, 0x26, 0xFB, 0x26, 0x42, 0x6D, 0x6E, 0xF9, 0x06, 0xFF, 0xF9, 0x42, 0x66, 0x78, 0xFB, 0x18,
|
||||
0xFB, 0x18, 0x46, 0x61, 0x65, 0xC3, 0x68, 0x69, 0x6F, 0xFF, 0xD1, 0xFF, 0xDC, 0xFF, 0xE8, 0xF9, 0x25, 0xFF, 0xF2,
|
||||
0xFF, 0xF9, 0x22, 0x62, 0x72, 0xAB, 0xED, 0x41, 0x76, 0xFB, 0x50, 0x21, 0x75, 0xFC, 0x48, 0x74, 0x79, 0x61, 0x65,
|
||||
0x63, 0x68, 0x75, 0x6F, 0xFF, 0x4E, 0xFF, 0x57, 0xFF, 0x5A, 0xFF, 0x65, 0xFF, 0x6C, 0xF8, 0xBF, 0xFF, 0xF4, 0xFF,
|
||||
0xFD, 0xC3, 0x00, 0x61, 0x6E, 0x75, 0x76, 0xF9, 0xD1, 0xF9, 0xE4, 0xF9, 0xF0, 0x41, 0x68, 0xF8, 0x9A, 0x43, 0x63,
|
||||
0x6E, 0x74, 0xF9, 0xD7, 0xF9, 0xD7, 0xF9, 0xD7, 0x41, 0x6E, 0xF9, 0xCD, 0x22, 0x61, 0x6F, 0xF2, 0xFC, 0x21, 0x69,
|
||||
0xFB, 0x43, 0x61, 0x68, 0x72, 0xFC, 0x52, 0xF8, 0x80, 0xFF, 0xFD, 0x41, 0x2E, 0xFE, 0x2D, 0x21, 0x74, 0xFC, 0x21,
|
||||
0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x6D, 0xFD, 0x21, 0x6D, 0xFD, 0x21, 0x65, 0xFD, 0x41, 0x62, 0xFD, 0xD2, 0x21,
|
||||
0x6F, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x6F, 0xFD, 0x42, 0x73, 0x74, 0xF7, 0xFF, 0xF7, 0xFF, 0x42, 0x65, 0x69, 0xF7,
|
||||
0xF8, 0xFF, 0xF9, 0x41, 0x78, 0xFD, 0xFC, 0xA2, 0x02, 0x72, 0x6C, 0x75, 0xF5, 0xFC, 0x41, 0x72, 0xFD, 0xF1, 0x42,
|
||||
0xA9, 0xA8, 0xFD, 0x4A, 0xFF, 0xFC, 0xC2, 0x02, 0x72, 0x6C, 0x72, 0xFD, 0xE6, 0xFD, 0xE6, 0x41, 0x69, 0xF7, 0xD2,
|
||||
0xA1, 0x02, 0x72, 0x66, 0xFC, 0x41, 0x73, 0xFD, 0xD4, 0xA1, 0x01, 0xB1, 0x73, 0xFC, 0x41, 0x72, 0xFA, 0xC2, 0x47,
|
||||
0x61, 0xC3, 0x65, 0x69, 0x6F, 0x75, 0x74, 0xFF, 0xCF, 0xFF, 0xDA, 0xFF, 0xE1, 0xFF, 0xEE, 0xF9, 0x51, 0xFF, 0xF7,
|
||||
0xFF, 0xFC, 0x21, 0xA9, 0xEA, 0x41, 0x70, 0xF8, 0x3E, 0x42, 0x69, 0x6F, 0xF8, 0x3A, 0xF8, 0x3A, 0x21, 0x73, 0xF9,
|
||||
0x41, 0x75, 0xF8, 0x30, 0x44, 0x61, 0x69, 0x6F, 0x72, 0xFF, 0xEE, 0xFF, 0xF9, 0xFF, 0xFC, 0xF8, 0x8C, 0x41, 0x63,
|
||||
0xF8, 0x22, 0x41, 0x72, 0xF8, 0x1B, 0x41, 0x64, 0xF8, 0x17, 0x21, 0x6E, 0xFC, 0x21, 0x65, 0xFD, 0x41, 0x73, 0xF8,
|
||||
0x0D, 0x21, 0x6E, 0xFC, 0x24, 0x65, 0x69, 0x6C, 0x6F, 0xE7, 0xEB, 0xF6, 0xFD, 0x41, 0x69, 0xF8, 0x73, 0x21, 0x75,
|
||||
0xFC, 0xC1, 0x01, 0xE2, 0x65, 0xFA, 0x36, 0x41, 0x64, 0xF6, 0xDA, 0x44, 0x62, 0x67, 0x6E, 0x74, 0xF6, 0xD6, 0xF6,
|
||||
0xD6, 0xFF, 0xFC, 0xF6, 0xD6, 0x42, 0x6E, 0x72, 0xF6, 0xC9, 0xF6, 0xC9, 0x21, 0xA9, 0xF9, 0x42, 0x6D, 0x70, 0xF6,
|
||||
0xBF, 0xF6, 0xBF, 0x42, 0x63, 0x70, 0xF6, 0xB8, 0xF6, 0xB8, 0xA0, 0x07, 0xA2, 0x21, 0x6E, 0xFD, 0x21, 0x69, 0xFD,
|
||||
0x21, 0x74, 0xF7, 0x22, 0x63, 0x6E, 0xFD, 0xF4, 0xA2, 0x00, 0xC2, 0x65, 0x69, 0xF5, 0xFB, 0xC7, 0x01, 0xE2, 0x61,
|
||||
0xC3, 0x69, 0x6F, 0x72, 0x75, 0x79, 0xFF, 0xC3, 0xFF, 0xD7, 0xFF, 0xDA, 0xFF, 0xE1, 0xFF, 0xF9, 0xF6, 0x99, 0xF6,
|
||||
0x99, 0xC5, 0x02, 0x52, 0x63, 0x70, 0x71, 0x73, 0x74, 0xFF, 0x6B, 0xFF, 0x91, 0xFF, 0x9E, 0xFF, 0xA1, 0xFF, 0xE8,
|
||||
0x21, 0x73, 0xEE, 0x42, 0xC3, 0x65, 0xFF, 0x41, 0xFF, 0xFD, 0x41, 0x74, 0xF7, 0x02, 0x21, 0x61, 0xFC, 0x53, 0x61,
|
||||
0xC3, 0x62, 0x63, 0x64, 0x65, 0x69, 0x6D, 0x70, 0x73, 0x6F, 0x6B, 0x74, 0x67, 0x6E, 0x72, 0x6C, 0x75, 0x79, 0xF8,
|
||||
0xB1, 0xF8, 0xE6, 0xF9, 0x32, 0xF9, 0xCA, 0xFB, 0x03, 0xF7, 0x50, 0xFB, 0x2C, 0xFC, 0x27, 0xFD, 0x92, 0xFE, 0x6E,
|
||||
0xFE, 0x87, 0xFE, 0x93, 0xFE, 0xAD, 0xFE, 0xCA, 0xFE, 0xD7, 0xFF, 0xF2, 0xFF, 0xFD, 0xF8, 0x85, 0xF8, 0x85, 0xA0,
|
||||
0x00, 0x81, 0x41, 0xAE, 0xFE, 0x87, 0xA0, 0x02, 0x31, 0x21, 0x2E, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x6E, 0xFD, 0x42,
|
||||
0x74, 0x65, 0xF8, 0x91, 0xFF, 0xFD, 0x23, 0x68, 0xC3, 0x73, 0xE6, 0xE9, 0xF9, 0x21, 0x68, 0xDF, 0xA0, 0x00, 0xA2,
|
||||
0x21, 0x65, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x64, 0xFD, 0x21, 0xA8, 0xFD, 0xA0, 0x00, 0xE1, 0x21, 0x6C, 0xFD, 0x21,
|
||||
0x6F, 0xFD, 0x21, 0x6F, 0xFD, 0xA0, 0x00, 0xF2, 0x21, 0x69, 0xFD, 0x21, 0x67, 0xFD, 0x21, 0x6C, 0xFD, 0x22, 0x63,
|
||||
0x61, 0xF1, 0xFD, 0xA0, 0x00, 0xE2, 0x21, 0x69, 0xFD, 0x21, 0x73, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21,
|
||||
0x68, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x73, 0xFD, 0x41, 0x2E, 0xF6, 0x46, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21,
|
||||
0x65, 0xFD, 0x21, 0x6D, 0xFD, 0x41, 0x2E, 0xF8, 0xC6, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21,
|
||||
0x6D, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x66, 0xFD, 0x21, 0x69, 0xFD, 0x23, 0x65, 0x69, 0x74, 0xD1,
|
||||
0xE1, 0xFD, 0x41, 0x74, 0xFE, 0x84, 0x21, 0x73, 0xFC, 0x41, 0x72, 0xF8, 0xDB, 0x21, 0x61, 0xFC, 0x22, 0x6F, 0x70,
|
||||
0xF6, 0xFD, 0x41, 0x73, 0xF5, 0xD8, 0x21, 0x69, 0xFC, 0x21, 0x70, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21,
|
||||
0x69, 0xFD, 0x21, 0x68, 0xFD, 0xA0, 0x06, 0x41, 0x21, 0x6C, 0xFD, 0x21, 0x6C, 0xFD, 0x41, 0x2E, 0xFF, 0x33, 0x21,
|
||||
0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x22, 0x69, 0x65, 0xF3, 0xFD, 0x22, 0x63, 0x6D, 0xE5, 0xFB, 0xA0, 0x02, 0x02, 0x21,
|
||||
0x6F, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x65, 0xEA, 0x22, 0x74, 0x6D, 0xFA, 0xFD, 0x41, 0x65, 0xFF, 0x1E, 0xA0, 0x03,
|
||||
0x21, 0x21, 0x2E, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x63, 0xFD, 0x21, 0x73, 0xFD,
|
||||
0x21, 0x65, 0xFD, 0x21, 0x69, 0xFD, 0x21, 0x75, 0xFD, 0x22, 0x63, 0x71, 0xDE, 0xFD, 0x21, 0x73, 0xC8, 0x21, 0x6F,
|
||||
0xFD, 0x21, 0x6E, 0xFD, 0x41, 0x6C, 0xF8, 0x6B, 0x21, 0x69, 0xFC, 0xA0, 0x05, 0xE1, 0x21, 0x2E, 0xFD, 0x21, 0x74,
|
||||
0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x6D, 0xFD, 0x21, 0x61, 0xFD, 0x21, 0x67, 0xFD, 0x21, 0x6C, 0xFD,
|
||||
0x21, 0x61, 0xFD, 0x41, 0x6D, 0xFF, 0xA3, 0x4E, 0x62, 0x64, 0xC3, 0x6C, 0x6E, 0x70, 0x72, 0x73, 0x63, 0x67, 0x76,
|
||||
0x6D, 0x69, 0x75, 0xFE, 0xCF, 0xFE, 0xD6, 0xFE, 0xE5, 0xFF, 0x00, 0xFF, 0x49, 0xFF, 0x5E, 0xFF, 0x91, 0xFF, 0xA2,
|
||||
0xFF, 0xC9, 0xFF, 0xD4, 0xFF, 0xDB, 0xFF, 0xF9, 0xFF, 0xFC, 0xFF, 0xFC, 0x47, 0xA2, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4,
|
||||
0xBB, 0xFE, 0xBD, 0xFE, 0xBD, 0xFE, 0xBD, 0xFE, 0xBD, 0xFE, 0xBD, 0xFE, 0xBD, 0xFE, 0xBD, 0xA0, 0x02, 0x41, 0x21,
|
||||
0x2E, 0xFD, 0xA0, 0x00, 0x41, 0x21, 0x2E, 0xFD, 0x21, 0x74, 0xFD, 0xA3, 0x00, 0xE1, 0x2E, 0x73, 0x6E, 0xF1, 0xF4,
|
||||
0xFD, 0x23, 0x2E, 0x73, 0x6E, 0xE8, 0xEB, 0xF4, 0xA1, 0x00, 0xE2, 0x65, 0xF9, 0xA0, 0x02, 0xF1, 0x21, 0x6C, 0xFD,
|
||||
0x21, 0x6C, 0xFD, 0x21, 0x69, 0xFD, 0x42, 0x74, 0x6D, 0xFF, 0xFD, 0xFE, 0xB6, 0xA1, 0x00, 0xE1, 0x75, 0xF9, 0xC2,
|
||||
0x00, 0xE2, 0x65, 0x75, 0xFF, 0xDC, 0xFE, 0xAD, 0x49, 0x61, 0xC3, 0x65, 0x69, 0x6C, 0x6F, 0x72, 0x75, 0x79, 0xFE,
|
||||
0x62, 0xFF, 0xA5, 0xFF, 0xCA, 0xFE, 0x62, 0xFF, 0xDA, 0xFF, 0xF2, 0xFF, 0xF7, 0xFE, 0x62, 0xFE, 0x62, 0x43, 0x65,
|
||||
0x69, 0x75, 0xFE, 0x23, 0xFC, 0x9D, 0xFC, 0x9D, 0x41, 0x69, 0xF4, 0xB7, 0xA0, 0x05, 0x92, 0x21, 0x65, 0xFD, 0x21,
|
||||
0x75, 0xFD, 0x22, 0x65, 0x71, 0xF7, 0xFD, 0x21, 0x69, 0xFB, 0x43, 0x65, 0x68, 0x72, 0xFE, 0x04, 0xFF, 0xEB, 0xFF,
|
||||
0xFD, 0x21, 0x72, 0xE5, 0x21, 0x74, 0xFD, 0x21, 0x63, 0xFD, 0x21, 0x74, 0xDC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD,
|
||||
0x21, 0x6D, 0xFD, 0x21, 0xA9, 0xFD, 0x41, 0x75, 0xF7, 0x4F, 0x21, 0x71, 0xFC, 0x44, 0x65, 0xC3, 0x69, 0x6F, 0xFF,
|
||||
0xE7, 0xFF, 0xF6, 0xFC, 0x55, 0xFF, 0xFD, 0x21, 0x67, 0xB9, 0x21, 0x72, 0xFD, 0x41, 0x74, 0xF7, 0x35, 0x22, 0x65,
|
||||
0x69, 0xF9, 0xFC, 0xC1, 0x01, 0xC2, 0x65, 0xF4, 0x00, 0x21, 0x70, 0xFA, 0x21, 0x6F, 0xFD, 0x21, 0x63, 0xFD, 0x21,
|
||||
0x73, 0xFD, 0x21, 0x69, 0xFD, 0x41, 0x6C, 0xF6, 0xCF, 0x21, 0x6C, 0xFC, 0x21, 0x69, 0xFD, 0x41, 0x6C, 0xFE, 0x92,
|
||||
0x21, 0x61, 0xFC, 0x41, 0x74, 0xFE, 0x0B, 0x21, 0x6F, 0xFC, 0x22, 0x76, 0x70, 0xF6, 0xFD, 0x42, 0x69, 0x65, 0xFF,
|
||||
0xFB, 0xFD, 0x8D, 0x21, 0x75, 0xF9, 0x48, 0x63, 0x64, 0x6C, 0x6E, 0x70, 0x6D, 0x71, 0x72, 0xFF, 0x60, 0xFF, 0x7F,
|
||||
0xFF, 0xA8, 0xFF, 0xBF, 0xFF, 0xD6, 0xFF, 0xE0, 0xFF, 0xFD, 0xFE, 0x65, 0x45, 0xA7, 0xA9, 0xA2, 0xA8, 0xB4, 0xFD,
|
||||
0x8D, 0xFF, 0xE7, 0xFE, 0xA1, 0xFE, 0xA1, 0xFE, 0xA1, 0xA0, 0x02, 0xC3, 0x21, 0x74, 0xFD, 0x21, 0x75, 0xFD, 0x41,
|
||||
0x69, 0xFA, 0xC0, 0x41, 0x2E, 0xF3, 0xB5, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x6D, 0xFD,
|
||||
0x21, 0xAA, 0xFD, 0x21, 0xC3, 0xFD, 0xA3, 0x00, 0xE1, 0x6F, 0x70, 0x72, 0xE3, 0xE6, 0xFD, 0xA0, 0x06, 0x51, 0x21,
|
||||
0x6C, 0xFD, 0x21, 0x6C, 0xFD, 0x21, 0x69, 0xFD, 0x44, 0x2E, 0x73, 0x6E, 0x76, 0xFE, 0x9E, 0xFE, 0xA1, 0xFE, 0xAA,
|
||||
0xFF, 0xFD, 0x42, 0x2E, 0x73, 0xFE, 0x91, 0xFE, 0x94, 0xA0, 0x03, 0x63, 0x21, 0x63, 0xFD, 0xA0, 0x03, 0x93, 0x21,
|
||||
0x74, 0xFD, 0x21, 0xA9, 0xFD, 0x22, 0x61, 0xC3, 0xF4, 0xFD, 0x21, 0x72, 0xFB, 0xA2, 0x00, 0x81, 0x65, 0x6F, 0xE2,
|
||||
0xFD, 0xC2, 0x00, 0x81, 0x65, 0x6F, 0xFF, 0xDB, 0xFB, 0x6A, 0x41, 0x64, 0xF5, 0x75, 0x21, 0x6E, 0xFC, 0x21, 0x65,
|
||||
0xFD, 0xCD, 0x00, 0xE2, 0x2E, 0x62, 0x65, 0x67, 0x6C, 0x6D, 0x6E, 0x70, 0x72, 0x73, 0x74, 0x77, 0x69, 0xFE, 0x59,
|
||||
0xFE, 0x5F, 0xFF, 0xBB, 0xFE, 0x5F, 0xFF, 0xE6, 0xFE, 0x5F, 0xFE, 0x5F, 0xFE, 0x5F, 0xFF, 0xED, 0xFE, 0x5F, 0xFE,
|
||||
0x5F, 0xFE, 0x5F, 0xFF, 0xFD, 0x41, 0x6C, 0xF2, 0xB8, 0xA1, 0x00, 0xE1, 0x6C, 0xFC, 0xA0, 0x03, 0xC2, 0xC9, 0x00,
|
||||
0xE2, 0x2E, 0x62, 0x65, 0x66, 0x67, 0x68, 0x70, 0x73, 0x74, 0xFE, 0x23, 0xFE, 0x29, 0xFE, 0x3B, 0xFE, 0x29, 0xFE,
|
||||
0x29, 0xFF, 0xFD, 0xFE, 0x29, 0xFE, 0x29, 0xFE, 0x29, 0xC2, 0x00, 0xE2, 0x65, 0x61, 0xFE, 0x1D, 0xFC, 0xEE, 0xA0,
|
||||
0x03, 0xE1, 0x22, 0x63, 0x71, 0xFD, 0xFD, 0xA0, 0x03, 0xF2, 0x21, 0x63, 0xF5, 0x21, 0x72, 0xF2, 0x22, 0x6F, 0x75,
|
||||
0xFA, 0xFD, 0x21, 0x73, 0xFB, 0x27, 0x63, 0x64, 0x70, 0x72, 0x73, 0x75, 0x78, 0xEA, 0xEF, 0xE7, 0xE7, 0xFD, 0xE7,
|
||||
0xE7, 0xA0, 0x04, 0x12, 0x21, 0xA9, 0xFD, 0x23, 0x66, 0x6E, 0x78, 0xD2, 0xD2, 0xD2, 0x41, 0x62, 0xFC, 0x3B, 0x21,
|
||||
0x72, 0xFC, 0x41, 0x69, 0xFF, 0x5D, 0x41, 0x2E, 0xFD, 0xE0, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD,
|
||||
0x42, 0x67, 0x65, 0xFF, 0xFD, 0xF4, 0xBE, 0x21, 0x6E, 0xF9, 0x21, 0x69, 0xFD, 0x41, 0x76, 0xF4, 0xB4, 0x21, 0x69,
|
||||
0xFC, 0x24, 0x75, 0x66, 0x74, 0x6E, 0xD8, 0xDB, 0xF6, 0xFD, 0x41, 0x69, 0xF2, 0xCF, 0x21, 0x74, 0xFC, 0x21, 0x69,
|
||||
0xFD, 0x21, 0x6E, 0xFD, 0x41, 0x6C, 0xF4, 0x97, 0x21, 0x75, 0xFC, 0x21, 0x70, 0xFD, 0x21, 0x74, 0xC9, 0x21, 0xA9,
|
||||
0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x70, 0xFD, 0xC7, 0x00, 0xE1, 0x61, 0xC3, 0x65, 0x6E, 0x67, 0x72, 0x6D, 0xFF, 0x8C,
|
||||
0xFF, 0x9E, 0xFF, 0xA1, 0xFF, 0xD4, 0xFF, 0xE7, 0xFF, 0xF1, 0xFF, 0xFD, 0x41, 0x93, 0xFB, 0xFE, 0x41, 0x72, 0xF2,
|
||||
0x88, 0xA1, 0x00, 0xE1, 0x72, 0xFC, 0xC1, 0x00, 0xE1, 0x72, 0xFE, 0x7D, 0x41, 0x64, 0xF2, 0x79, 0x21, 0x69, 0xFC,
|
||||
0x4D, 0x61, 0xC3, 0x65, 0x68, 0x69, 0x6B, 0x6C, 0x6F, 0xC5, 0x72, 0x75, 0x79, 0x63, 0xFE, 0x8A, 0xFD, 0x27, 0xFD,
|
||||
0x4C, 0xFE, 0xE4, 0xFF, 0x12, 0xFF, 0x1A, 0xFF, 0x38, 0xFF, 0xCE, 0xFF, 0xE6, 0xFD, 0x5C, 0xFF, 0xEE, 0xFF, 0xF3,
|
||||
0xFF, 0xFD, 0x41, 0x63, 0xFC, 0x7B, 0xC3, 0x00, 0xE1, 0x61, 0x6B, 0x65, 0xFF, 0xFC, 0xFD, 0x17, 0xFD, 0x29, 0x41,
|
||||
0x63, 0xFF, 0x53, 0x21, 0x69, 0xFC, 0x21, 0x66, 0xFD, 0x21, 0x69, 0xFD, 0xA1, 0x00, 0xE1, 0x6E, 0xFD, 0x41, 0x74,
|
||||
0xF2, 0x5A, 0xA1, 0x00, 0x91, 0x65, 0xFC, 0x21, 0x6C, 0xFB, 0xC3, 0x00, 0xE1, 0x6C, 0x6D, 0x74, 0xFF, 0xFD, 0xFC,
|
||||
0x45, 0xFB, 0x1A, 0x41, 0x6C, 0xFF, 0x29, 0x21, 0x61, 0xFC, 0x21, 0x76, 0xFD, 0x41, 0x61, 0xF2, 0xF5, 0x21, 0xA9,
|
||||
0xFC, 0x21, 0xC3, 0xFD, 0x21, 0x72, 0xFD, 0x22, 0x6F, 0x74, 0xF0, 0xFD, 0xA0, 0x04, 0xC3, 0x21, 0x67, 0xFD, 0x21,
|
||||
0xA2, 0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0xA2, 0x00, 0xE1, 0x6E, 0x79, 0xE9, 0xFD, 0x41,
|
||||
0x6E, 0xFF, 0x2B, 0x21, 0x6F, 0xFC, 0xA1, 0x00, 0xE1, 0x63, 0xFD, 0x47, 0xA2, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB,
|
||||
0xFB, 0x41, 0xFF, 0xFB, 0xFB, 0x41, 0xFB, 0x41, 0xFB, 0x41, 0xFB, 0x41, 0xFB, 0x41, 0xC2, 0x00, 0xE1, 0x2E, 0x73,
|
||||
0xFC, 0x84, 0xFC, 0x87, 0x41, 0x6F, 0xFB, 0x3F, 0x42, 0x6D, 0x73, 0xFF, 0xFC, 0xFB, 0x3E, 0x41, 0x73, 0xFB, 0x34,
|
||||
0x22, 0xA9, 0xA8, 0xF5, 0xFC, 0x21, 0xC3, 0xFB, 0xA0, 0x02, 0xA2, 0x4A, 0x75, 0x69, 0x6F, 0x61, 0xC3, 0x65, 0x6E,
|
||||
0xC5, 0x73, 0x79, 0xFF, 0x69, 0xFF, 0x7A, 0xFF, 0xB4, 0xFB, 0x08, 0xFF, 0xC7, 0xFF, 0xDD, 0xFF, 0xFA, 0xFF, 0x0A,
|
||||
0xFF, 0xFD, 0xFB, 0x08, 0x41, 0x63, 0xF3, 0x54, 0x21, 0x69, 0xFC, 0x41, 0x67, 0xFE, 0x89, 0x21, 0x72, 0xFC, 0x21,
|
||||
0x75, 0xFD, 0x41, 0x61, 0xF3, 0x46, 0xC4, 0x00, 0xE1, 0x74, 0x67, 0x73, 0x6D, 0xFF, 0xEF, 0xF1, 0x62, 0xFF, 0xF9,
|
||||
0xFF, 0xFC, 0x47, 0xA9, 0xA2, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xFF, 0xF1, 0xFA, 0xC5, 0xFA, 0xC5, 0xFA, 0xC5, 0xFA,
|
||||
0xC5, 0xFA, 0xC5, 0xFA, 0xC5, 0x41, 0x67, 0xF1, 0x3D, 0xC2, 0x00, 0xE1, 0x6E, 0x6D, 0xFF, 0xFC, 0xFB, 0x62, 0x42,
|
||||
0x65, 0x69, 0xFA, 0x7F, 0xF8, 0xF9, 0xC5, 0x00, 0xE1, 0x6C, 0x70, 0x2E, 0x73, 0x6E, 0xFF, 0xF9, 0xFB, 0x5A, 0xFB,
|
||||
0xF4, 0xFB, 0xF7, 0xFC, 0x00, 0xC1, 0x00, 0xE1, 0x6C, 0xFB, 0x48, 0x41, 0x6D, 0xF1, 0x11, 0x41, 0x61, 0xF0, 0xC1,
|
||||
0x21, 0x6F, 0xFC, 0x21, 0x69, 0xFD, 0xC3, 0x00, 0xE1, 0x6D, 0x69, 0x64, 0xFB, 0x2C, 0xFF, 0xF2, 0xFF, 0xFD, 0x41,
|
||||
0x68, 0xF8, 0xC0, 0xA1, 0x00, 0xE1, 0x74, 0xFC, 0xA0, 0x07, 0xC2, 0x21, 0x72, 0xFD, 0x43, 0x2E, 0x73, 0x75, 0xFB,
|
||||
0xB3, 0xFB, 0xB6, 0xFF, 0xFD, 0x21, 0x64, 0xF3, 0xA2, 0x00, 0xE2, 0x65, 0x79, 0xF3, 0xFD, 0x4A, 0xC3, 0x69, 0x63,
|
||||
0x6D, 0x65, 0x75, 0x61, 0x79, 0x68, 0x6F, 0xFF, 0x81, 0xFF, 0x9B, 0xFB, 0x39, 0xFB, 0x39, 0xFF, 0xAB, 0xFF, 0xBD,
|
||||
0xFF, 0xD1, 0xFF, 0xE1, 0xFF, 0xF9, 0xFA, 0x46, 0xA0, 0x03, 0x11, 0x21, 0x2E, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x6E,
|
||||
0xFD, 0x21, 0x65, 0xFD, 0x22, 0x63, 0x7A, 0xFD, 0xFD, 0x21, 0x6F, 0xFB, 0x21, 0x64, 0xFD, 0x21, 0x74, 0xFD, 0x21,
|
||||
0x61, 0xFD, 0x21, 0x76, 0xFD, 0x21, 0x6E, 0xE9, 0x21, 0x69, 0xFD, 0x21, 0x6D, 0xFD, 0x21, 0xA9, 0xFD, 0x42, 0xC3,
|
||||
0x73, 0xFF, 0xFD, 0xF3, 0x42, 0x21, 0xA9, 0xF9, 0x41, 0x6E, 0xFA, 0x3D, 0x21, 0x69, 0xFC, 0x21, 0x6D, 0xFD, 0x21,
|
||||
0xA9, 0xFD, 0x41, 0x74, 0xF4, 0xB0, 0x22, 0xC3, 0x73, 0xF9, 0xFC, 0xC5, 0x00, 0xE2, 0x69, 0x75, 0xC3, 0x6F, 0x65,
|
||||
0xFF, 0xD1, 0xFD, 0xED, 0xFF, 0xE7, 0xFF, 0xFB, 0xFB, 0x49, 0x41, 0x65, 0xF0, 0x5C, 0x21, 0x6C, 0xFC, 0x42, 0x62,
|
||||
0x63, 0xFF, 0xFD, 0xF0, 0x55, 0x21, 0x61, 0xF9, 0x21, 0x6E, 0xFD, 0xC3, 0x00, 0xE1, 0x67, 0x70, 0x73, 0xFF, 0xFD,
|
||||
0xFC, 0x3E, 0xFC, 0x3E, 0x41, 0x6D, 0xF2, 0x05, 0x44, 0x61, 0x65, 0x69, 0x6F, 0xF2, 0x01, 0xF2, 0x01, 0xF2, 0x01,
|
||||
0xFF, 0xFC, 0x21, 0x6C, 0xF3, 0x21, 0x6C, 0xFD, 0x21, 0x69, 0xFD, 0xA0, 0x06, 0xD2, 0x21, 0xA9, 0xFD, 0x21, 0xC3,
|
||||
0xFD, 0x21, 0x6F, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0xA2, 0x00, 0xE1, 0x70, 0x6C, 0xEB, 0xFD, 0x42, 0xA9,
|
||||
0xA8, 0xF5, 0x47, 0xF5, 0x47, 0x48, 0x76, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x73, 0x75, 0xFD, 0xEE, 0xF1, 0x6D, 0xF1,
|
||||
0x6D, 0xFF, 0xF9, 0xF1, 0x6D, 0xF1, 0x6D, 0xF1, 0x6D, 0xF1, 0x6D, 0x21, 0x79, 0xE7, 0x41, 0x65, 0xFC, 0xAD, 0x21,
|
||||
0x72, 0xFC, 0x21, 0x74, 0xFD, 0x21, 0x73, 0xFD, 0xA2, 0x00, 0xE1, 0x6C, 0x61, 0xF0, 0xFD, 0xC2, 0x00, 0xE2, 0x75,
|
||||
0x65, 0xF9, 0x7E, 0xFA, 0xAD, 0x43, 0x6D, 0x74, 0x68, 0xFE, 0x5B, 0xF1, 0xA4, 0xEF, 0x15, 0xC4, 0x00, 0xE1, 0x72,
|
||||
0x2E, 0x73, 0x6E, 0xFF, 0xF6, 0xFA, 0x82, 0xFA, 0x85, 0xFA, 0x8E, 0x41, 0x6C, 0xEF, 0x95, 0x21, 0x75, 0xFC, 0xA0,
|
||||
0x06, 0xF3, 0x21, 0x71, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0xA2, 0x00, 0xE1, 0x6E, 0x72, 0xF1, 0xFD, 0x47,
|
||||
0xA2, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xF9, 0x00, 0xFF, 0xF9, 0xF9, 0x00, 0xF9, 0x00, 0xF9, 0x00, 0xF9, 0x00,
|
||||
0xF9, 0x00, 0xC1, 0x00, 0x81, 0x65, 0xFB, 0xB2, 0x41, 0x73, 0xEF, 0x26, 0x21, 0x6F, 0xFC, 0x21, 0x74, 0xFD, 0xA0,
|
||||
0x07, 0x62, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x6C, 0xFD, 0x21, 0x73, 0xF4, 0xA2, 0x00, 0x41, 0x61, 0x69,
|
||||
0xFA, 0xFD, 0xC8, 0x00, 0xE2, 0x2E, 0x65, 0x6C, 0x6E, 0x6F, 0x72, 0x73, 0x74, 0xFA, 0x1D, 0xFA, 0x35, 0xFF, 0xDA,
|
||||
0xFA, 0x23, 0xFF, 0xE7, 0xFF, 0xDA, 0xFA, 0x23, 0xFF, 0xF9, 0x41, 0xA9, 0xF8, 0xC6, 0x41, 0x75, 0xF8, 0xC2, 0x22,
|
||||
0xC3, 0x65, 0xF8, 0xFC, 0x41, 0x68, 0xF8, 0xB9, 0x21, 0x63, 0xFC, 0x21, 0x79, 0xFD, 0x41, 0x72, 0xF8, 0xAF, 0x22,
|
||||
0xA8, 0xA9, 0xFC, 0xFC, 0x21, 0xC3, 0xFB, 0x4D, 0x72, 0x75, 0x61, 0x69, 0x6F, 0x6C, 0x65, 0xC3, 0x68, 0x6E, 0x73,
|
||||
0x74, 0x79, 0xFE, 0xAE, 0xFE, 0xD4, 0xFF, 0x0C, 0xFC, 0x95, 0xFF, 0x43, 0xFF, 0x4A, 0xFF, 0x5D, 0xFF, 0x86, 0xFF,
|
||||
0xC2, 0xFF, 0xE5, 0xFF, 0xF1, 0xFF, 0xFD, 0xF8, 0x86, 0x41, 0x63, 0xF1, 0xA8, 0x21, 0x6F, 0xFC, 0x41, 0x64, 0xF1,
|
||||
0xA1, 0x21, 0x69, 0xFC, 0x41, 0x67, 0xF1, 0x9A, 0x41, 0x67, 0xF0, 0xB7, 0x21, 0x6C, 0xFC, 0x41, 0x6C, 0xF1, 0x8F,
|
||||
0x23, 0x69, 0x75, 0x6F, 0xF1, 0xF9, 0xFC, 0x41, 0x67, 0xF8, 0x89, 0x21, 0x69, 0xFC, 0x21, 0x6C, 0xFD, 0x21, 0x6C,
|
||||
0xFD, 0x42, 0x65, 0x69, 0xFF, 0xFD, 0xF6, 0x84, 0x42, 0x74, 0x6F, 0xF9, 0xAC, 0xFF, 0xE1, 0x41, 0x74, 0xF8, 0x1F,
|
||||
0x21, 0x61, 0xFC, 0x21, 0x6D, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x6F, 0xFD, 0x26, 0x6E, 0x63, 0x64, 0x74, 0x73, 0x66,
|
||||
0xB5, 0xBC, 0xCE, 0xE2, 0xE9, 0xFD, 0x41, 0xA9, 0xF8, 0xB0, 0x42, 0x61, 0x6F, 0xF8, 0xAC, 0xF8, 0xAC, 0x22, 0xC3,
|
||||
0x69, 0xF5, 0xF9, 0x42, 0x65, 0x68, 0xF7, 0xCF, 0xFF, 0xFB, 0x41, 0x74, 0xFC, 0xE0, 0x21, 0x61, 0xFC, 0x22, 0x63,
|
||||
0x74, 0xF2, 0xFD, 0x41, 0x2E, 0xF0, 0xE1, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x63, 0xFD,
|
||||
0x42, 0x73, 0x6E, 0xFF, 0xFD, 0xF1, 0x19, 0x41, 0x6E, 0xF1, 0x12, 0x22, 0x69, 0x61, 0xF5, 0xFC, 0x42, 0x75, 0x6F,
|
||||
0xFF, 0x68, 0xF9, 0xD4, 0x22, 0x6D, 0x70, 0xF4, 0xF9, 0xA0, 0x00, 0xA1, 0x21, 0x69, 0xFD, 0x21, 0x67, 0xFD, 0x21,
|
||||
0x72, 0xF7, 0x21, 0x68, 0xFD, 0x21, 0x74, 0xFD, 0x22, 0x6C, 0x72, 0xF4, 0xFD, 0x41, 0x6C, 0xF7, 0x69, 0x41, 0x72,
|
||||
0xFA, 0x24, 0x41, 0x74, 0xFA, 0xF9, 0x21, 0x63, 0xFC, 0x21, 0x79, 0xDA, 0x22, 0x61, 0x78, 0xFA, 0xFD, 0x41, 0x61,
|
||||
0xF2, 0x17, 0x49, 0x6E, 0x73, 0x6D, 0x61, 0xC3, 0x6C, 0x62, 0x6F, 0x76, 0xFF, 0x72, 0xFF, 0x9D, 0xFF, 0xC9, 0xFF,
|
||||
0xE0, 0xF7, 0x7E, 0xFF, 0xE5, 0xFF, 0xE9, 0xFF, 0xF7, 0xFF, 0xFC, 0x41, 0x70, 0xF8, 0x13, 0x43, 0x65, 0x6F, 0x68,
|
||||
0xF7, 0x3E, 0xFF, 0xFC, 0xF8, 0x0F, 0x41, 0x69, 0xF5, 0xAE, 0x22, 0x63, 0x74, 0xF2, 0xFC, 0xA0, 0x05, 0xB3, 0x21,
|
||||
0x72, 0xFD, 0x21, 0x76, 0xFD, 0x41, 0x65, 0xFE, 0xF9, 0x21, 0x72, 0xFC, 0x22, 0x69, 0x74, 0xF6, 0xFD, 0x41, 0x61,
|
||||
0xFF, 0xA5, 0x21, 0x74, 0xFC, 0x21, 0x73, 0xFD, 0xC2, 0x01, 0x71, 0x63, 0x69, 0xED, 0x74, 0xED, 0x74, 0x21, 0x61,
|
||||
0xF7, 0x21, 0x72, 0xFD, 0x21, 0x74, 0xFD, 0x45, 0x73, 0x6E, 0x75, 0x78, 0x72, 0xFF, 0xCA, 0xFF, 0xDF, 0xFF, 0xEB,
|
||||
0xFF, 0xFD, 0xF8, 0x31, 0xC1, 0x00, 0xE1, 0x6D, 0xF7, 0xC4, 0x41, 0x61, 0xF9, 0xFD, 0x41, 0x6D, 0xFA, 0xAA, 0x21,
|
||||
0x69, 0xFC, 0x21, 0x72, 0xFD, 0xA2, 0x00, 0xE1, 0x63, 0x74, 0xF2, 0xFD, 0x47, 0xA2, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4,
|
||||
0xBB, 0xF6, 0xF2, 0xFF, 0xF9, 0xF6, 0xF2, 0xF6, 0xF2, 0xF6, 0xF2, 0xF6, 0xF2, 0xF6, 0xF2, 0x41, 0x68, 0xFB, 0xD1,
|
||||
0x41, 0x70, 0xED, 0x6E, 0x21, 0x6F, 0xFC, 0x43, 0x73, 0x63, 0x74, 0xFA, 0x6A, 0xFF, 0xFD, 0xF8, 0x57, 0x41, 0x69,
|
||||
0xFE, 0x77, 0x41, 0x2E, 0xEE, 0x5F, 0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x21, 0x6D, 0xFD, 0x21,
|
||||
0x67, 0xFD, 0x21, 0x61, 0xFD, 0x21, 0x72, 0xFD, 0x21, 0x68, 0xFD, 0x21, 0x70, 0xFD, 0xA3, 0x00, 0xE1, 0x73, 0x6C,
|
||||
0x61, 0xD3, 0xDD, 0xFD, 0xA0, 0x05, 0x52, 0x21, 0x6C, 0xFD, 0x21, 0x64, 0xFA, 0x21, 0x75, 0xFD, 0x22, 0x61, 0x6F,
|
||||
0xF7, 0xFD, 0x41, 0x6E, 0xF7, 0xEF, 0x21, 0x65, 0xFC, 0x4D, 0x27, 0x61, 0xC3, 0x64, 0x65, 0x69, 0x68, 0x6C, 0x6F,
|
||||
0x72, 0x73, 0x75, 0x79, 0xF6, 0x83, 0xFF, 0x76, 0xFF, 0x91, 0xFF, 0xA7, 0xF7, 0xEB, 0xFF, 0xDF, 0xFF, 0xF4, 0xFF,
|
||||
0xFD, 0xF6, 0x83, 0xF7, 0xFB, 0xFB, 0x78, 0xF6, 0x83, 0xF6, 0x83, 0x41, 0x63, 0xFA, 0x33, 0x41, 0x72, 0xF6, 0xA6,
|
||||
0xA1, 0x01, 0xC2, 0x61, 0xFC, 0x41, 0x73, 0xEF, 0xDE, 0xC2, 0x05, 0x23, 0x63, 0x74, 0xF0, 0x03, 0xFF, 0xFC, 0x45,
|
||||
0x70, 0x61, 0x68, 0x6F, 0x75, 0xFF, 0xEE, 0xFF, 0xF7, 0xEC, 0xAD, 0xF0, 0x56, 0xF0, 0x56, 0x21, 0x73, 0xF0, 0x21,
|
||||
0x6E, 0xFD, 0xC4, 0x00, 0xE2, 0x69, 0x75, 0x61, 0x65, 0xFA, 0x40, 0xFF, 0xD0, 0xFF, 0xFD, 0xF7, 0x9C, 0x41, 0x79,
|
||||
0xFB, 0x9D, 0x21, 0x68, 0xFC, 0xC3, 0x00, 0xE1, 0x6E, 0x6D, 0x63, 0xFB, 0x66, 0xF6, 0xCC, 0xFF, 0xFD, 0x41, 0x6D,
|
||||
0xFB, 0xEE, 0x21, 0x61, 0xFC, 0x21, 0x72, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x70, 0xFD, 0x41, 0x6D,
|
||||
0xEE, 0x61, 0x21, 0x61, 0xFC, 0x42, 0x74, 0x2E, 0xFF, 0xFD, 0xF7, 0x48, 0xC5, 0x00, 0xE1, 0x72, 0x6D, 0x73, 0x2E,
|
||||
0x6E, 0xFB, 0x39, 0xFF, 0xEF, 0xFF, 0xF9, 0xF7, 0x41, 0xF7, 0x4D, 0xC2, 0x00, 0x81, 0x69, 0x65, 0xF3, 0x22, 0xF8,
|
||||
0x9E, 0x41, 0x73, 0xEB, 0xD9, 0x21, 0x6F, 0xFC, 0x21, 0x6D, 0xFD, 0x44, 0x2E, 0x73, 0x72, 0x75, 0xF7, 0x1C, 0xF7,
|
||||
0x1F, 0xFF, 0xFD, 0xFB, 0x66, 0xC7, 0x00, 0xE2, 0x72, 0x2E, 0x65, 0x6C, 0x6D, 0x6E, 0x73, 0xFF, 0xE0, 0xF7, 0x0F,
|
||||
0xFF, 0xF3, 0xF7, 0x15, 0xF7, 0x15, 0xF7, 0x15, 0xF7, 0x15, 0x41, 0x62, 0xF9, 0x76, 0x41, 0x73, 0xEC, 0x06, 0x21,
|
||||
0x67, 0xFC, 0xC3, 0x00, 0xE1, 0x72, 0x6D, 0x6E, 0xFF, 0xF5, 0xF6, 0x4A, 0xFF, 0xFD, 0xC2, 0x00, 0xE1, 0x6D, 0x72,
|
||||
0xF6, 0x3E, 0xF9, 0x8D, 0x42, 0x62, 0x70, 0xEB, 0x8A, 0xEB, 0x8A, 0x44, 0x65, 0x69, 0x6F, 0x73, 0xEB, 0x83, 0xEB,
|
||||
0x83, 0xFF, 0xF9, 0xEB, 0x83, 0x21, 0xA9, 0xF3, 0x21, 0xC3, 0xFD, 0xA1, 0x00, 0xE1, 0x6C, 0xFD, 0x48, 0xA2, 0xA0,
|
||||
0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xF5, 0x5F, 0xF5, 0x5F, 0xFF, 0xFB, 0xF5, 0x5F, 0xF5, 0x5F, 0xF5, 0x5F, 0xF5,
|
||||
0x5F, 0xF5, 0x5F, 0x41, 0x74, 0xF1, 0x2A, 0x21, 0x6E, 0xFC, 0x21, 0x69, 0xFD, 0x21, 0x68, 0xFD, 0x41, 0x6C, 0xFA,
|
||||
0x2E, 0x4B, 0x72, 0x61, 0x65, 0x68, 0x75, 0x6F, 0xC3, 0x63, 0x69, 0x74, 0x79, 0xFF, 0x0A, 0xFF, 0x20, 0xFF, 0x4D,
|
||||
0xFF, 0x7F, 0xFF, 0xA2, 0xFF, 0xAE, 0xFF, 0xD6, 0xFF, 0xF9, 0xF5, 0x35, 0xFF, 0xFC, 0xF5, 0x35, 0xC1, 0x00, 0xE1,
|
||||
0x63, 0xF8, 0xEB, 0x47, 0xA2, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xF5, 0x0D, 0xFF, 0xFA, 0xF5, 0x0D, 0xF5, 0x0D,
|
||||
0xF5, 0x0D, 0xF5, 0x0D, 0xF5, 0x0D, 0x41, 0x75, 0xFF, 0x01, 0x21, 0x68, 0xFC, 0xC2, 0x00, 0xE1, 0x72, 0x63, 0xF5,
|
||||
0x32, 0xFF, 0xFD, 0xC2, 0x00, 0xE2, 0x65, 0x61, 0xF6, 0x58, 0xF3, 0x41, 0x41, 0x74, 0xF6, 0x64, 0xC2, 0x00, 0xE2,
|
||||
0x65, 0x69, 0xF6, 0x4B, 0xFF, 0xFC, 0x4A, 0x61, 0xC3, 0x65, 0x69, 0x6C, 0x6F, 0x72, 0x73, 0x75, 0x79, 0xFD, 0xC4,
|
||||
0xFF, 0xC4, 0xF6, 0x39, 0xFF, 0xE1, 0xFF, 0xEA, 0xF4, 0xD1, 0xFF, 0xF7, 0xF9, 0xC6, 0xFD, 0xC4, 0xF4, 0xD1, 0x45,
|
||||
0x61, 0x65, 0x69, 0x6F, 0x79, 0xF4, 0xCF, 0xF4, 0xCF, 0xF4, 0xCF, 0xF4, 0xCF, 0xF4, 0xCF, 0x41, 0x75, 0xFA, 0x87,
|
||||
0x21, 0x71, 0xFC, 0x21, 0x6F, 0xFD, 0x21, 0x6C, 0xFD, 0x21, 0x69, 0xFD, 0x21, 0x64, 0xFD, 0x42, 0x6D, 0x6E, 0xF2,
|
||||
0xE6, 0xFF, 0xFD, 0xC2, 0x00, 0xE2, 0x65, 0x61, 0xF5, 0xF9, 0xFF, 0xF9, 0xC1, 0x00, 0xE1, 0x65, 0xF5, 0xF0, 0x4C,
|
||||
0x61, 0xC3, 0x65, 0x68, 0x69, 0x6C, 0x6E, 0x6F, 0x72, 0x75, 0x73, 0x79, 0xF4, 0x79, 0xF5, 0xBC, 0xF5, 0xE1, 0xFF,
|
||||
0xC7, 0xF7, 0xA7, 0xF5, 0xF1, 0xF5, 0xF1, 0xF4, 0x79, 0xFF, 0xF1, 0xFF, 0xFA, 0xF9, 0x6E, 0xF4, 0x79, 0x41, 0x69,
|
||||
0xEF, 0xBB, 0x21, 0x75, 0xFC, 0x42, 0x71, 0x2E, 0xFF, 0xFD, 0xF5, 0xA6, 0xC5, 0x00, 0xE1, 0x72, 0x6D, 0x73, 0x2E,
|
||||
0x6E, 0xEA, 0xD7, 0xF6, 0x80, 0xFF, 0xF9, 0xF5, 0x9F, 0xF5, 0xAB, 0x41, 0x69, 0xF6, 0xD1, 0x42, 0x6C, 0x73, 0xFF,
|
||||
0xFC, 0xEB, 0x02, 0xA0, 0x02, 0xD2, 0x21, 0x68, 0xFD, 0x42, 0xC3, 0x61, 0xFA, 0x3F, 0xFF, 0xFD, 0xC2, 0x06, 0x02,
|
||||
0x6F, 0x73, 0xF5, 0x12, 0xF5, 0x12, 0x21, 0x72, 0xF7, 0x21, 0x65, 0xFD, 0xC5, 0x00, 0xE1, 0x63, 0x62, 0x6D, 0x72,
|
||||
0x70, 0xFD, 0xB2, 0xFF, 0xDD, 0xF4, 0xC4, 0xFF, 0xEA, 0xFF, 0xFD, 0x41, 0x6C, 0xFC, 0x26, 0xA1, 0x00, 0xE2, 0x75,
|
||||
0xFC, 0x21, 0x72, 0xFB, 0x41, 0x61, 0xF4, 0x0C, 0x21, 0x69, 0xFC, 0x21, 0x74, 0xFD, 0x41, 0x6D, 0xF4, 0x02, 0x21,
|
||||
0x72, 0xFC, 0x41, 0x6C, 0xF3, 0xFB, 0x41, 0x6F, 0xF8, 0xC3, 0x22, 0x65, 0x72, 0xF8, 0xFC, 0x45, 0x6F, 0x61, 0x65,
|
||||
0x68, 0x69, 0xFF, 0xDF, 0xFF, 0xE9, 0xFF, 0xF0, 0xFB, 0x48, 0xFF, 0xFB, 0x41, 0x6F, 0xF6, 0x5E, 0x42, 0x6C, 0x76,
|
||||
0xFF, 0xFC, 0xF3, 0xDA, 0x41, 0x76, 0xF3, 0xD3, 0x22, 0x61, 0x6F, 0xF5, 0xFC, 0x41, 0x70, 0xFB, 0x11, 0x41, 0xA9,
|
||||
0xFB, 0x17, 0x21, 0xC3, 0xFC, 0x41, 0x70, 0xF3, 0xBF, 0xC3, 0x00, 0xE2, 0x2E, 0x65, 0x73, 0xF4, 0xF7, 0xF6, 0x66,
|
||||
0xF4, 0xFD, 0x24, 0x61, 0x6C, 0x6F, 0x68, 0xE5, 0xED, 0xF0, 0xF4, 0x41, 0x6D, 0xF9, 0x29, 0xC6, 0x00, 0xE2, 0x2E,
|
||||
0x65, 0x6D, 0x6F, 0x72, 0x73, 0xF4, 0xDE, 0xF4, 0xF6, 0xF4, 0xE4, 0xFF, 0xFC, 0xF4, 0xE4, 0xF4, 0xE4, 0x41, 0x64,
|
||||
0xF3, 0x8D, 0x21, 0x72, 0xFC, 0x21, 0x61, 0xFD, 0x21, 0x64, 0xFD, 0x21, 0x6E, 0xFD, 0x41, 0x6E, 0xF3, 0x7D, 0x21,
|
||||
0x69, 0xFC, 0xA0, 0x07, 0xE2, 0x21, 0x73, 0xFD, 0x21, 0x6F, 0xFD, 0x21, 0xA9, 0xFD, 0x21, 0xC3, 0xFD, 0x21, 0x72,
|
||||
0xFD, 0x21, 0xA9, 0xFD, 0x41, 0x67, 0xFF, 0x5F, 0x41, 0x6B, 0xF3, 0x5D, 0x42, 0x63, 0x6D, 0xFF, 0xFC, 0xFF, 0x62,
|
||||
0x41, 0x74, 0xFA, 0x90, 0x21, 0x63, 0xFC, 0x42, 0x6F, 0x75, 0xFF, 0x81, 0xFF, 0xFD, 0x41, 0x65, 0xF3, 0x44, 0x21,
|
||||
0x6C, 0xFC, 0x27, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x72, 0x79, 0xBD, 0xC4, 0xD9, 0xDC, 0xE4, 0xF2, 0xFD, 0x4D, 0x65,
|
||||
0x75, 0x70, 0x6C, 0x61, 0xC3, 0x63, 0x68, 0x69, 0x6F, 0xC5, 0x74, 0x79, 0xFE, 0xCB, 0xFF, 0x04, 0xFF, 0x40, 0xFF,
|
||||
0x5F, 0xF3, 0x11, 0xF4, 0x54, 0xFF, 0x7F, 0xFF, 0x8C, 0xF3, 0x11, 0xF3, 0x11, 0xF7, 0x13, 0xFF, 0xF1, 0xF3, 0x11,
|
||||
0x41, 0x69, 0xF3, 0x97, 0x21, 0x6E, 0xFC, 0x21, 0x6F, 0xFD, 0x22, 0x6D, 0x73, 0xFD, 0xF6, 0x21, 0x6F, 0xFB, 0x21,
|
||||
0x6E, 0xFD, 0x41, 0x75, 0xED, 0x66, 0x41, 0x73, 0xEC, 0x54, 0x21, 0x64, 0xFC, 0x21, 0x75, 0xFD, 0x41, 0x6F, 0xF6,
|
||||
0xA4, 0x42, 0x73, 0x70, 0xEA, 0xC3, 0xFF, 0xFC, 0x21, 0x69, 0xF9, 0x43, 0x6D, 0x62, 0x6E, 0xF3, 0x6F, 0xFF, 0xEF,
|
||||
0xFF, 0xFD, 0x41, 0x67, 0xF3, 0x5C, 0x21, 0x6E, 0xFC, 0x21, 0x6F, 0xFD, 0x21, 0x6C, 0xFD, 0x41, 0x65, 0xFA, 0x82,
|
||||
0x21, 0x74, 0xFC, 0x41, 0x6E, 0xFA, 0xEA, 0x21, 0x6F, 0xFC, 0x42, 0x73, 0x74, 0xF7, 0x88, 0xF7, 0x88, 0x41, 0x6F,
|
||||
0xF7, 0x81, 0x21, 0x72, 0xFC, 0x21, 0xA9, 0xFD, 0x41, 0x6D, 0xF7, 0x77, 0x41, 0x75, 0xF7, 0x73, 0x42, 0x64, 0x74,
|
||||
0xF7, 0x6F, 0xFF, 0xFC, 0x41, 0x6E, 0xF7, 0x68, 0x21, 0x6F, 0xFC, 0x21, 0x69, 0xFD, 0x21, 0x74, 0xFD, 0x21, 0x63,
|
||||
0xFD, 0x22, 0x61, 0x69, 0xE9, 0xFD, 0x25, 0x61, 0xC3, 0x69, 0x6F, 0x72, 0xCB, 0xD9, 0xDC, 0xDC, 0xFB, 0x21, 0x74,
|
||||
0xF5, 0x41, 0x61, 0xE9, 0x22, 0x21, 0x79, 0xFC, 0x4B, 0x67, 0x70, 0x6D, 0x72, 0x62, 0x63, 0x64, 0xC3, 0x69, 0x73,
|
||||
0x78, 0xFF, 0x72, 0xFF, 0x75, 0xFF, 0x91, 0xF3, 0x5D, 0xFF, 0xA5, 0xFF, 0xAC, 0xFD, 0x10, 0xF2, 0x46, 0xFF, 0xB3,
|
||||
0xFF, 0xF6, 0xFF, 0xFD, 0x41, 0x6E, 0xE8, 0xBD, 0xA1, 0x00, 0xE1, 0x67, 0xFC, 0x46, 0x61, 0x65, 0x69, 0x6F, 0x75,
|
||||
0x72, 0xFF, 0xFB, 0xF3, 0x86, 0xF2, 0x1E, 0xF2, 0x1E, 0xF2, 0x1E, 0xF2, 0x3B, 0xA0, 0x01, 0x71, 0x21, 0xA9, 0xFD,
|
||||
0x21, 0xC3, 0xFD, 0x41, 0x74, 0xE8, 0x44, 0x21, 0x70, 0xFC, 0x22, 0x69, 0x6F, 0xF6, 0xFD, 0xA1, 0x00, 0xE1, 0x6D,
|
||||
0xFB, 0x47, 0xA2, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xF1, 0xF1, 0xFF, 0xFB, 0xF1, 0xF1, 0xF1, 0xF1, 0xF1, 0xF1,
|
||||
0xF1, 0xF1, 0xF1, 0xF1, 0x41, 0xA9, 0xE9, 0x74, 0xC7, 0x06, 0x02, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x73, 0x75, 0xF2,
|
||||
0xCD, 0xF2, 0xCD, 0xFF, 0xFC, 0xF2, 0xCD, 0xF2, 0xCD, 0xF2, 0xCD, 0xF2, 0xCD, 0x21, 0x72, 0xE8, 0x47, 0x61, 0x65,
|
||||
0xC3, 0x69, 0x6F, 0x73, 0x75, 0xE9, 0xBD, 0xE9, 0xBD, 0xED, 0x93, 0xE9, 0xBD, 0xE9, 0xBD, 0xE9, 0xBD, 0xE9, 0xBD,
|
||||
0x22, 0x65, 0x6F, 0xE7, 0xEA, 0xA1, 0x00, 0xE1, 0x70, 0xFB, 0x47, 0x61, 0xC3, 0x65, 0x69, 0x6F, 0x75, 0x79, 0xF1,
|
||||
0x9C, 0xFF, 0xAB, 0xF6, 0x71, 0xF4, 0xCA, 0xF1, 0x9C, 0xFA, 0x8F, 0xFF, 0xFB, 0x41, 0x76, 0xF3, 0xC0, 0x41, 0x76,
|
||||
0xE8, 0x54, 0x41, 0x78, 0xE8, 0x50, 0x22, 0x6F, 0x61, 0xF8, 0xFC, 0x21, 0x69, 0xFB, 0x41, 0x72, 0xF2, 0x20, 0x21,
|
||||
0x74, 0xFC, 0x45, 0x63, 0x65, 0x76, 0x6E, 0x73, 0xF2, 0x5E, 0xFF, 0xE5, 0xF2, 0x5E, 0xFF, 0xF6, 0xFF, 0xFD, 0x42,
|
||||
0x6E, 0x73, 0xE9, 0xBA, 0xE9, 0xBA, 0x21, 0x69, 0xF9, 0x21, 0x6C, 0xFD, 0x21, 0x6C, 0xFD, 0x21, 0x69, 0xFD, 0xC2,
|
||||
0x00, 0xE1, 0x63, 0x6E, 0xF3, 0x82, 0xFF, 0xFD, 0xC2, 0x00, 0xE1, 0x6C, 0x64, 0xF4, 0x69, 0xF9, 0xE8, 0x41, 0x74,
|
||||
0xF7, 0x1B, 0x21, 0x6F, 0xFC, 0x21, 0x70, 0xFD, 0x21, 0x69, 0xFD, 0x42, 0x72, 0x2E, 0xFF, 0xFD, 0xF2, 0x88, 0x42,
|
||||
0x69, 0x74, 0xEF, 0x79, 0xFF, 0xF9, 0xC3, 0x00, 0xE1, 0x6E, 0x2E, 0x73, 0xFF, 0xF9, 0xF2, 0x74, 0xF2, 0x77, 0x41,
|
||||
0x69, 0xE7, 0x51, 0x21, 0x6B, 0xFC, 0x21, 0x73, 0xFD, 0x21, 0x6F, 0xFD, 0xA1, 0x00, 0xE1, 0x6C, 0xFD, 0x47, 0xA2,
|
||||
0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xF0, 0xFD, 0xFF, 0xFB, 0xF0, 0xFD, 0xF0, 0xFD, 0xF0, 0xFD, 0xF0, 0xFD, 0xF0,
|
||||
0xFD, 0x41, 0x6D, 0xE9, 0xDD, 0x21, 0x61, 0xFC, 0x21, 0x74, 0xFD, 0xA1, 0x00, 0xE1, 0x6C, 0xFD, 0x48, 0x61, 0x69,
|
||||
0x65, 0xC3, 0x6F, 0x72, 0x75, 0x79, 0xFF, 0x90, 0xFF, 0x99, 0xFF, 0xBD, 0xFF, 0xDB, 0xFF, 0xFB, 0xF2, 0x50, 0xF0,
|
||||
0xD8, 0xF0, 0xD8, 0xA0, 0x01, 0xD1, 0x21, 0x6E, 0xFD, 0x21, 0x6F, 0xFD, 0x42, 0x69, 0x75, 0xFF, 0xFD, 0xF0, 0xF8,
|
||||
0x41, 0x72, 0xF6, 0xE9, 0xA1, 0x00, 0xE1, 0x77, 0xFC, 0x48, 0xA2, 0xA0, 0xA9, 0xA8, 0xAA, 0xAE, 0xB4, 0xBB, 0xF0,
|
||||
0xA6, 0xF0, 0xA6, 0xF0, 0xA6, 0xF0, 0xA6, 0xF0, 0xA6, 0xF0, 0xA6, 0xF0, 0xA6, 0xF0, 0xA6, 0x41, 0x2E, 0xE6, 0x8A,
|
||||
0x21, 0x74, 0xFC, 0x21, 0x6E, 0xFD, 0x21, 0x65, 0xFD, 0x4A, 0x69, 0x6C, 0x61, 0xC3, 0x65, 0x6F, 0x73, 0x75, 0x79,
|
||||
0x6D, 0xF3, 0xAE, 0xFF, 0xCA, 0xFF, 0xD5, 0xFF, 0xDA, 0xF1, 0xE8, 0xF0, 0x80, 0xF8, 0x95, 0xF0, 0x80, 0xF0, 0x80,
|
||||
0xFF, 0xFD, 0x41, 0x6C, 0xF3, 0x8B, 0x42, 0x69, 0x65, 0xFF, 0xFC, 0xF9, 0xD3, 0xC1, 0x00, 0xE2, 0x2E, 0xF1, 0xAF,
|
||||
0x49, 0x61, 0xC3, 0x65, 0x68, 0x69, 0x6F, 0x72, 0x75, 0x79, 0xF0, 0x50, 0xF1, 0x93, 0xF1, 0xB8, 0xFF, 0xFA, 0xF0,
|
||||
0x50, 0xF0, 0x50, 0xF0, 0x6D, 0xF0, 0x50, 0xF0, 0x50, 0x42, 0x61, 0x65, 0xF0, 0x76, 0xF1, 0xA5, 0xA1, 0x00, 0xE1,
|
||||
0x75, 0xF9, 0x41, 0x69, 0xFA, 0x32, 0x21, 0x72, 0xFC, 0xA1, 0x00, 0xE1, 0x74, 0xFD, 0xA0, 0x01, 0xF2, 0x21, 0x2E,
|
||||
0xFD, 0x22, 0x2E, 0x73, 0xFA, 0xFD, 0x21, 0x74, 0xFB, 0x21, 0x61, 0xFD, 0x4A, 0x75, 0x61, 0xC3, 0x65, 0x69, 0x6F,
|
||||
0xC5, 0x73, 0x78, 0x79, 0xFF, 0xEA, 0xF0, 0x0B, 0xF1, 0x4E, 0xF1, 0x73, 0xF0, 0x0B, 0xF0, 0x0B, 0xF4, 0x0D, 0xFF,
|
||||
0xFD, 0xF8, 0x58, 0xF0, 0x0B, 0x41, 0x68, 0xF8, 0x39, 0x21, 0x74, 0xFC, 0x42, 0x73, 0x6C, 0xFF, 0xFD, 0xF8, 0x38,
|
||||
0x41, 0x6F, 0xFD, 0x5C, 0x21, 0x74, 0xFC, 0x22, 0x61, 0x73, 0xF2, 0xFD, 0x42, 0xA9, 0xA8, 0xEF, 0xD2, 0xEF, 0xD2,
|
||||
0x47, 0x61, 0x65, 0xC3, 0x69, 0x6F, 0x75, 0x79, 0xEF, 0xCB, 0xF1, 0x33, 0xFF, 0xF9, 0xEF, 0xCB, 0xEF, 0xCB, 0xEF,
|
||||
0xCB, 0xEF, 0xCB, 0x5D, 0x27, 0x2E, 0x61, 0x62, 0xC3, 0x63, 0x6A, 0x6D, 0x72, 0x70, 0x69, 0x65, 0x64, 0x74, 0x66,
|
||||
0x67, 0x73, 0x6F, 0x77, 0x68, 0x75, 0x76, 0x6C, 0x78, 0x6B, 0x71, 0x6E, 0x79, 0x7A, 0xE7, 0xD0, 0xEF, 0x48, 0xF0,
|
||||
0xCD, 0xF1, 0x53, 0xF2, 0x28, 0xF3, 0xD1, 0xF3, 0xFD, 0xF4, 0xAD, 0xF5, 0x6F, 0xF7, 0x2F, 0xF8, 0x34, 0xF8, 0x98,
|
||||
0xF9, 0x32, 0xFA, 0x80, 0xFA, 0xE4, 0xFB, 0x3C, 0xFC, 0xA4, 0xFD, 0x6C, 0xFD, 0x97, 0xFE, 0x19, 0xFE, 0x4A, 0xFE,
|
||||
0xDD, 0xFF, 0x35, 0xFF, 0x58, 0xFF, 0x65, 0xFF, 0x88, 0xFF, 0xAA, 0xFF, 0xDE, 0xFF, 0xEA,
|
||||
};
|
||||
|
||||
constexpr SerializedHyphenationPatterns fr_patterns = {
|
||||
fr_trie_data,
|
||||
sizeof(fr_trie_data),
|
||||
};
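// Editor's note (not part of this commit): a French hyphenator is expected to hand this
// descriptor to the shared evaluator, roughly
//   liangBreakIndexes(codepoints, fr_patterns, frenchConfig);
// where `codepoints` and `frenchConfig` are assumed to be supplied by the caller.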
|
||||
1770
lib/Epub/Epub/hyphenation/generated/hyph-ru.trie.h
Normal file
1770
lib/Epub/Epub/hyphenation/generated/hyph-ru.trie.h
Normal file
File diff suppressed because it is too large
Load Diff
@ -51,7 +51,7 @@ void ChapterHtmlSlimParser::startNewTextBlock(const TextBlock::Style style) {
|
||||
|
||||
makePages();
|
||||
}
|
||||
currentTextBlock.reset(new ParsedText(style, extraParagraphSpacing));
|
||||
currentTextBlock.reset(new ParsedText(style, extraParagraphSpacing, hyphenationEnabled));
|
||||
}
|
||||
|
||||
void XMLCALL ChapterHtmlSlimParser::startElement(void* userData, const XML_Char* name, const XML_Char** atts) {
|
||||
@ -170,21 +170,6 @@ void XMLCALL ChapterHtmlSlimParser::characterData(void* userData, const XML_Char
|
||||
continue;
|
||||
}
|
||||
|
||||
// Skip soft-hyphen with UTF-8 representation (U+00AD) = 0xC2 0xAD
|
||||
const XML_Char SHY_BYTE_1 = static_cast<XML_Char>(0xC2);
|
||||
const XML_Char SHY_BYTE_2 = static_cast<XML_Char>(0xAD);
|
||||
// 1. Check for the start of the 2-byte Soft Hyphen sequence
|
||||
if (s[i] == SHY_BYTE_1) {
|
||||
// 2. Check if the next byte exists AND if it completes the sequence
|
||||
// We must check i + 1 < len to prevent reading past the end of the buffer.
|
||||
if ((i + 1 < len) && (s[i + 1] == SHY_BYTE_2)) {
|
||||
// Sequence 0xC2 0xAD found!
|
||||
// Skip the current byte (0xC2) and the next byte (0xAD)
|
||||
i++; // Increment 'i' one more time to skip the 0xAD byte
|
||||
continue; // Skip the rest of the loop and move to the next iteration
|
||||
}
|
||||
}
|
||||
|
||||
// Skip Zero Width No-Break Space / BOM (U+FEFF) = 0xEF 0xBB 0xBF
|
||||
const XML_Char FEFF_BYTE_1 = static_cast<XML_Char>(0xEF);
|
||||
const XML_Char FEFF_BYTE_2 = static_cast<XML_Char>(0xBB);
|
||||
|
||||
@ -36,6 +36,7 @@ class ChapterHtmlSlimParser {
|
||||
uint8_t paragraphAlignment;
|
||||
uint16_t viewportWidth;
|
||||
uint16_t viewportHeight;
|
||||
bool hyphenationEnabled;
|
||||
|
||||
void startNewTextBlock(TextBlock::Style style);
|
||||
void makePages();
|
||||
@ -48,7 +49,7 @@ class ChapterHtmlSlimParser {
|
||||
explicit ChapterHtmlSlimParser(const std::string& filepath, GfxRenderer& renderer, const int fontId,
|
||||
const float lineCompression, const bool extraParagraphSpacing,
|
||||
const uint8_t paragraphAlignment, const uint16_t viewportWidth,
|
||||
const uint16_t viewportHeight,
|
||||
const uint16_t viewportHeight, const bool hyphenationEnabled,
|
||||
const std::function<void(std::unique_ptr<Page>)>& completePageFn,
|
||||
const std::function<void(int)>& progressFn = nullptr)
|
||||
: filepath(filepath),
|
||||
@ -59,6 +60,7 @@ class ChapterHtmlSlimParser {
|
||||
paragraphAlignment(paragraphAlignment),
|
||||
viewportWidth(viewportWidth),
|
||||
viewportHeight(viewportHeight),
|
||||
hyphenationEnabled(hyphenationEnabled),
|
||||
completePageFn(completePageFn),
|
||||
progressFn(progressFn) {}
|
||||
~ChapterHtmlSlimParser() = default;
|
||||
|
||||
@ -107,6 +107,11 @@ void XMLCALL ContentOpfParser::startElement(void* userData, const XML_Char* name
|
||||
return;
|
||||
}
|
||||
|
||||
if (self->state == IN_METADATA && strcmp(name, "dc:language") == 0) {
|
||||
self->state = IN_BOOK_LANGUAGE;
|
||||
return;
|
||||
}
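// The check above matches elements such as <dc:language>fr</dc:language> in the OPF metadata;
// the captured code is presumably what lets the reader pick the matching hyphenation dictionary.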
|
||||
|
||||
if (self->state == IN_PACKAGE && (strcmp(name, "manifest") == 0 || strcmp(name, "opf:manifest") == 0)) {
|
||||
self->state = IN_MANIFEST;
|
||||
if (!SdMan.openFileForWrite("COF", self->cachePath + itemCacheFile, self->tempItemStore)) {
|
||||
@ -266,6 +271,11 @@ void XMLCALL ContentOpfParser::characterData(void* userData, const XML_Char* s,
|
||||
self->author.append(s, len);
|
||||
return;
|
||||
}
|
||||
|
||||
if (self->state == IN_BOOK_LANGUAGE) {
|
||||
self->language.append(s, len);
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
void XMLCALL ContentOpfParser::endElement(void* userData, const XML_Char* name) {
|
||||
@ -300,6 +310,11 @@ void XMLCALL ContentOpfParser::endElement(void* userData, const XML_Char* name)
|
||||
return;
|
||||
}
|
||||
|
||||
if (self->state == IN_BOOK_LANGUAGE && strcmp(name, "dc:language") == 0) {
|
||||
self->state = IN_METADATA;
|
||||
return;
|
||||
}
|
||||
|
||||
if (self->state == IN_METADATA && (strcmp(name, "metadata") == 0 || strcmp(name, "opf:metadata") == 0)) {
|
||||
self->state = IN_PACKAGE;
|
||||
return;
|
||||
|
||||
@ -13,6 +13,7 @@ class ContentOpfParser final : public Print {
|
||||
IN_METADATA,
|
||||
IN_BOOK_TITLE,
|
||||
IN_BOOK_AUTHOR,
|
||||
IN_BOOK_LANGUAGE,
|
||||
IN_MANIFEST,
|
||||
IN_SPINE,
|
||||
IN_GUIDE,
|
||||
@ -34,6 +35,7 @@ class ContentOpfParser final : public Print {
|
||||
public:
|
||||
std::string title;
|
||||
std::string author;
|
||||
std::string language;
|
||||
std::string tocNcxPath;
|
||||
std::string tocNavPath; // EPUB 3 nav document path
|
||||
std::string coverItemHref;
|
||||
|
||||
82
scripts/generate_hyphenation_trie.py
Executable file
82
scripts/generate_hyphenation_trie.py
Executable file
@ -0,0 +1,82 @@
|
||||
#!/usr/bin/env python3
|
||||
"""Embed hypher-generated `.bin` tries into constexpr headers."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import pathlib
|
||||
|
||||
|
||||
def _format_bytes(blob: bytes, per_line: int = 16) -> str:
|
||||
# Render the blob as a comma-separated list of hex literals with consistent wrapping.
|
||||
lines = []
|
||||
for i in range(0, len(blob), per_line):
|
||||
chunk = ', '.join(f"0x{b:02X}" for b in blob[i : i + per_line])
|
||||
lines.append(f" {chunk},")
|
||||
if not lines:
|
||||
lines.append(" 0x00,")
|
||||
return '\n'.join(lines)
|
||||
|
||||
|
||||
def _symbol_from_output(path: pathlib.Path) -> str:
|
||||
# Derive a stable C identifier from the destination header name (e.g., hyph-en.trie.h -> en).
|
||||
name = path.name
|
||||
if name.endswith('.trie.h'):
|
||||
name = name[:-7]
|
||||
if name.startswith('hyph-'):
|
||||
name = name[5:]
|
||||
name = name.replace('-', '_')
|
||||
if name.endswith('.trie'):
|
||||
name = name[:-5]
|
||||
return name
|
||||
|
||||
|
||||
def write_header(path: pathlib.Path, blob: bytes, symbol: str) -> None:
|
||||
# Emit a constexpr header containing the raw bytes plus a SerializedHyphenationPatterns descriptor.
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
data_symbol = f"{symbol}_trie_data"
|
||||
patterns_symbol = f"{symbol}_patterns"
|
||||
bytes_literal = _format_bytes(blob)
|
||||
content = f"""#pragma once
|
||||
|
||||
#include <cstddef>
|
||||
#include <cstdint>
|
||||
|
||||
#include "../SerializedHyphenationTrie.h"
|
||||
|
||||
// Auto-generated by generate_hyphenation_trie.py. Do not edit manually.
|
||||
alignas(4) constexpr uint8_t {data_symbol}[] = {{
|
||||
{bytes_literal}
|
||||
}};
|
||||
|
||||
constexpr SerializedHyphenationPatterns {patterns_symbol} = {{
|
||||
{data_symbol},
|
||||
sizeof({data_symbol}),
|
||||
}};
|
||||
"""
|
||||
path.write_text(content)
|
||||
|
||||
|
||||
def main() -> None:
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument('--input', dest='inputs', action='append', required=True,
|
||||
help='Path to a hypher-generated .bin trie')
|
||||
parser.add_argument('--output', dest='outputs', action='append', required=True,
|
||||
help='Destination header path (hyph-*.trie.h)')
|
||||
args = parser.parse_args()
|
||||
|
||||
if len(args.inputs) != len(args.outputs):
|
||||
raise SystemExit('input/output counts must match')
|
||||
|
||||
for src, dst in zip(args.inputs, args.outputs):
|
||||
# Process each input/output pair independently so mixed-language refreshes work in one invocation.
|
||||
src_path = pathlib.Path(src)
|
||||
blob = src_path.read_bytes()
|
||||
out_path = pathlib.Path(dst)
|
||||
symbol = _symbol_from_output(out_path)
|
||||
write_header(out_path, blob, symbol)
|
||||
print(f'wrote {dst} ({len(blob)} bytes payload)')
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
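For reference, the header this script emits is thin boilerplate around the raw bytes. For a hypothetical four-byte `en.bin` written to `hyph-en.trie.h`, the generated file would look roughly like this (byte values are placeholders):

```cpp
#pragma once

#include <cstddef>
#include <cstdint>

#include "../SerializedHyphenationTrie.h"

// Auto-generated by generate_hyphenation_trie.py. Do not edit manually.
alignas(4) constexpr uint8_t en_trie_data[] = {
    0x00, 0x00, 0x00, 0x04,
};

constexpr SerializedHyphenationPatterns en_patterns = {
    en_trie_data,
    sizeof(en_trie_data),
};
```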
@ -14,7 +14,7 @@ CrossPointSettings CrossPointSettings::instance;
namespace {
constexpr uint8_t SETTINGS_FILE_VERSION = 1;
// Increment this when adding new persisted settings fields
constexpr uint8_t SETTINGS_COUNT = 18;
constexpr uint8_t SETTINGS_COUNT = 20;
constexpr char SETTINGS_FILE[] = "/.crosspoint/settings.bin";
} // namespace

@ -48,6 +48,7 @@ bool CrossPointSettings::saveToFile() const {
  serialization::writePod(outputFile, textAntiAliasing);
  serialization::writePod(outputFile, hideBatteryPercentage);
  serialization::writePod(outputFile, longPressChapterSkip);
  serialization::writePod(outputFile, hyphenationEnabled);
  outputFile.close();

  Serial.printf("[%lu] [CPS] Settings saved to file\n", millis());
@ -110,12 +111,15 @@ bool CrossPointSettings::loadFromFile() {
      strncpy(opdsServerUrl, urlStr.c_str(), sizeof(opdsServerUrl) - 1);
      opdsServerUrl[sizeof(opdsServerUrl) - 1] = '\0';
    }
    if (++settingsRead >= fileSettingsCount) break;
    serialization::readPod(inputFile, textAntiAliasing);
    if (++settingsRead >= fileSettingsCount) break;
    serialization::readPod(inputFile, hideBatteryPercentage);
    if (++settingsRead >= fileSettingsCount) break;
    serialization::readPod(inputFile, longPressChapterSkip);
    if (++settingsRead >= fileSettingsCount) break;
    serialization::readPod(inputFile, hyphenationEnabled);
    if (++settingsRead >= fileSettingsCount) break;
  } while (false);

  inputFile.close();
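The load path stays compatible with settings files written before this change because every read is gated on the field count stored in the file. A minimal sketch of that pattern follows; the surrounding field name is a placeholder, while `serialization::readPod`, `inputFile`, `fileSettingsCount`, and `hyphenationEnabled` are taken from the hunk above.

```cpp
// Sketch of the append-only load pattern: the file records how many fields it
// holds, so an older file breaks out of the chain early and hyphenationEnabled
// keeps its compiled-in default of 0.
uint8_t settingsRead = 0;
do {
  serialization::readPod(inputFile, someOlderField);      // placeholder for an existing field
  if (++settingsRead >= fileSettingsCount) break;         // file predates the new field: stop here
  serialization::readPod(inputFile, hyphenationEnabled);  // field appended by this change
  if (++settingsRead >= fileSettingsCount) break;
} while (false);
```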
@ -84,6 +84,8 @@ class CrossPointSettings {
  uint8_t sleepTimeout = SLEEP_10_MIN;
  // E-ink refresh frequency (default 15 pages)
  uint8_t refreshFrequency = REFRESH_15;
  uint8_t hyphenationEnabled = 0;

  // Reader screen margin settings
  uint8_t screenMargin = 5;
  // OPDS browser settings
@ -280,7 +280,7 @@ void EpubReaderActivity::renderScreen() {

  if (!section->loadSectionFile(SETTINGS.getReaderFontId(), SETTINGS.getReaderLineCompression(),
                                SETTINGS.extraParagraphSpacing, SETTINGS.paragraphAlignment, viewportWidth,
                                viewportHeight)) {
                                viewportHeight, SETTINGS.hyphenationEnabled)) {
    Serial.printf("[%lu] [ERS] Cache not found, building...\n", millis());

    // Progress bar dimensions
@ -325,7 +325,7 @@ void EpubReaderActivity::renderScreen() {

  if (!section->createSectionFile(SETTINGS.getReaderFontId(), SETTINGS.getReaderLineCompression(),
                                  SETTINGS.extraParagraphSpacing, SETTINGS.paragraphAlignment, viewportWidth,
                                  viewportHeight, progressSetup, progressCallback)) {
                                  viewportHeight, SETTINGS.hyphenationEnabled, progressSetup, progressCallback)) {
    Serial.printf("[%lu] [ERS] Failed to persist page data to SD\n", millis());
    section.reset();
    return;
@ -14,7 +14,7 @@

// Define the static settings list
namespace {
constexpr int settingsCount = 21;
constexpr int settingsCount = 22;
const SettingInfo settingsList[settingsCount] = {
    // Should match with SLEEP_SCREEN_MODE
    SettingInfo::Enum("Sleep Screen", &CrossPointSettings::sleepScreen, {"Dark", "Light", "Custom", "Cover", "None"}),
@ -38,6 +38,7 @@ const SettingInfo settingsList[settingsCount] = {
    SettingInfo::Value("Reader Screen Margin", &CrossPointSettings::screenMargin, {5, 40, 5}),
    SettingInfo::Enum("Reader Paragraph Alignment", &CrossPointSettings::paragraphAlignment,
                      {"Justify", "Left", "Center", "Right"}),
    SettingInfo::Toggle("Hyphenation", &CrossPointSettings::hyphenationEnabled),
    SettingInfo::Enum("Time to Sleep", &CrossPointSettings::sleepTimeout,
                      {"1 min", "5 min", "10 min", "15 min", "30 min"}),
    SettingInfo::Enum("Refresh Frequency", &CrossPointSettings::refreshFrequency,
388
test/hyphenation_eval/HyphenationEvaluationTest.cpp
Normal file
@ -0,0 +1,388 @@
#include <Utf8.h>

#include <algorithm>
#include <cctype>
#include <cmath>
#include <fstream>
#include <functional>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

#include "lib/Epub/Epub/hyphenation/HyphenationCommon.h"
#include "lib/Epub/Epub/hyphenation/LanguageHyphenator.h"
#include "lib/Epub/Epub/hyphenation/LanguageRegistry.h"

struct TestCase {
  std::string word;
  std::string hyphenated;
  std::vector<size_t> expectedPositions;
  int frequency;
};

struct EvaluationResult {
  int truePositives = 0;
  int falsePositives = 0;
  int falseNegatives = 0;
  double precision = 0.0;
  double recall = 0.0;
  double f1Score = 0.0;
  double weightedScore = 0.0;
};

struct LanguageConfig {
  std::string cliName;
  std::string testDataFile;
  const char* primaryTag;
};

const std::vector<LanguageConfig> kSupportedLanguages = {
    {"english", "test/hyphenation_eval/resources/english_hyphenation_tests.txt", "en"},
    {"french", "test/hyphenation_eval/resources/french_hyphenation_tests.txt", "fr"},
    {"german", "test/hyphenation_eval/resources/german_hyphenation_tests.txt", "de"},
    {"russian", "test/hyphenation_eval/resources/russian_hyphenation_tests.txt", "ru"},
};

std::vector<size_t> expectedPositionsFromAnnotatedWord(const std::string& annotated) {
  std::vector<size_t> positions;
  const unsigned char* ptr = reinterpret_cast<const unsigned char*>(annotated.c_str());
  size_t codepointIndex = 0;

  while (*ptr != 0) {
    if (*ptr == '=') {
      positions.push_back(codepointIndex);
      ++ptr;
      continue;
    }

    utf8NextCodepoint(&ptr);
    ++codepointIndex;
  }

  return positions;
}

std::vector<TestCase> loadTestData(const std::string& filename) {
  std::vector<TestCase> testCases;
  std::ifstream file(filename);

  if (!file.is_open()) {
    std::cerr << "Error: Could not open file " << filename << std::endl;
    return testCases;
  }

  std::string line;
  while (std::getline(file, line)) {
    if (line.empty() || line[0] == '#') {
      continue;
    }

    std::istringstream iss(line);
    std::string word, hyphenated, freqStr;

    if (std::getline(iss, word, '|') && std::getline(iss, hyphenated, '|') && std::getline(iss, freqStr, '|')) {
      TestCase testCase;
      testCase.word = word;
      testCase.hyphenated = hyphenated;
      testCase.frequency = std::stoi(freqStr);

      testCase.expectedPositions = expectedPositionsFromAnnotatedWord(hyphenated);

      testCases.push_back(testCase);
    }
  }

  file.close();
  return testCases;
}

std::string positionsToHyphenated(const std::string& word, const std::vector<size_t>& positions) {
  std::string result;
  std::vector<size_t> sortedPositions = positions;
  std::sort(sortedPositions.begin(), sortedPositions.end());

  const unsigned char* ptr = reinterpret_cast<const unsigned char*>(word.c_str());
  size_t codepointIndex = 0;
  size_t posIdx = 0;

  while (*ptr != 0) {
    while (posIdx < sortedPositions.size() && sortedPositions[posIdx] == codepointIndex) {
      result.push_back('=');
      ++posIdx;
    }

    const unsigned char* current = ptr;
    utf8NextCodepoint(&ptr);
    result.append(reinterpret_cast<const char*>(current), reinterpret_cast<const char*>(ptr));
    ++codepointIndex;
  }

  while (posIdx < sortedPositions.size() && sortedPositions[posIdx] == codepointIndex) {
    result.push_back('=');
    ++posIdx;
  }

  return result;
}

std::vector<size_t> hyphenateWordWithHyphenator(const std::string& word, const LanguageHyphenator& hyphenator) {
  auto cps = collectCodepoints(word);
  trimSurroundingPunctuationAndFootnote(cps);

  return hyphenator.breakIndexes(cps);
}

std::vector<LanguageConfig> resolveLanguages(const std::string& selection) {
  if (selection == "all") {
    return kSupportedLanguages;
  }

  for (const auto& config : kSupportedLanguages) {
    if (config.cliName == selection) {
      return {config};
    }
  }

  return {};
}

EvaluationResult evaluateWord(const TestCase& testCase,
                              std::function<std::vector<size_t>(const std::string&)> hyphenateFunc) {
  EvaluationResult result;

  std::vector<size_t> actualPositions = hyphenateFunc(testCase.word);

  std::vector<size_t> expected = testCase.expectedPositions;
  std::vector<size_t> actual = actualPositions;

  std::sort(expected.begin(), expected.end());
  std::sort(actual.begin(), actual.end());

  for (size_t pos : actual) {
    if (std::find(expected.begin(), expected.end(), pos) != expected.end()) {
      result.truePositives++;
    } else {
      result.falsePositives++;
    }
  }

  for (size_t pos : expected) {
    if (std::find(actual.begin(), actual.end(), pos) == actual.end()) {
      result.falseNegatives++;
    }
  }

  if (result.truePositives + result.falsePositives > 0) {
    result.precision = static_cast<double>(result.truePositives) / (result.truePositives + result.falsePositives);
  }

  if (result.truePositives + result.falseNegatives > 0) {
    result.recall = static_cast<double>(result.truePositives) / (result.truePositives + result.falseNegatives);
  }

  if (result.precision + result.recall > 0) {
    result.f1Score = 2 * result.precision * result.recall / (result.precision + result.recall);
  }

  // Treat words that contain no hyphenation marks in both the expected data and the
  // algorithmic output as perfect matches so they don't drag down the per-word averages.
  if (expected.empty() && actual.empty()) {
    result.precision = 1.0;
    result.recall = 1.0;
    result.f1Score = 1.0;
  }

  double fpPenalty = 2.0;
  double fnPenalty = 1.0;

  int totalErrors = result.falsePositives * fpPenalty + result.falseNegatives * fnPenalty;
  int totalPossible = expected.size() * fpPenalty;

  if (totalPossible > 0) {
    result.weightedScore = 1.0 - (static_cast<double>(totalErrors) / totalPossible);
    result.weightedScore = std::max(0.0, result.weightedScore);
  } else if (result.falsePositives == 0) {
    result.weightedScore = 1.0;
  }

  return result;
}

void printResults(const std::string& language, const std::vector<TestCase>& testCases,
                  const std::vector<std::pair<TestCase, EvaluationResult>>& worstCases, int perfectMatches,
                  int partialMatches, int completeMisses, double totalPrecision, double totalRecall, double totalF1,
                  double totalWeighted, int totalTP, int totalFP, int totalFN,
                  std::function<std::vector<size_t>(const std::string&)> hyphenateFunc) {
  std::string lang_upper = language;
  if (!lang_upper.empty()) {
    lang_upper[0] = std::toupper(lang_upper[0]);
  }

  std::cout << "================================================================================" << std::endl;
  std::cout << lang_upper << " HYPHENATION EVALUATION RESULTS" << std::endl;
  std::cout << "================================================================================" << std::endl;
  std::cout << std::endl;

  std::cout << "Total test cases: " << testCases.size() << std::endl;
  std::cout << "Perfect matches: " << perfectMatches << " (" << (perfectMatches * 100.0 / testCases.size()) << "%)"
            << std::endl;
  std::cout << "Partial matches: " << partialMatches << std::endl;
  std::cout << "Complete misses: " << completeMisses << std::endl;
  std::cout << std::endl;

  std::cout << "--- Overall Metrics (averaged per word) ---" << std::endl;
  std::cout << "Average Precision: " << (totalPrecision / testCases.size() * 100.0) << "%" << std::endl;
  std::cout << "Average Recall: " << (totalRecall / testCases.size() * 100.0) << "%" << std::endl;
  std::cout << "Average F1 Score: " << (totalF1 / testCases.size() * 100.0) << "%" << std::endl;
  std::cout << "Average Weighted Score: " << (totalWeighted / testCases.size() * 100.0) << "% (FP penalty: 2x)"
            << std::endl;
  std::cout << std::endl;

  std::cout << "--- Overall Metrics (total counts) ---" << std::endl;
  std::cout << "True Positives: " << totalTP << std::endl;
  std::cout << "False Positives: " << totalFP << " (incorrect hyphenation points)" << std::endl;
  std::cout << "False Negatives: " << totalFN << " (missed hyphenation points)" << std::endl;

  double overallPrecision = totalTP + totalFP > 0 ? static_cast<double>(totalTP) / (totalTP + totalFP) : 0.0;
  double overallRecall = totalTP + totalFN > 0 ? static_cast<double>(totalTP) / (totalTP + totalFN) : 0.0;
  double overallF1 = overallPrecision + overallRecall > 0
                         ? 2 * overallPrecision * overallRecall / (overallPrecision + overallRecall)
                         : 0.0;

  std::cout << "Overall Precision: " << (overallPrecision * 100.0) << "%" << std::endl;
  std::cout << "Overall Recall: " << (overallRecall * 100.0) << "%" << std::endl;
  std::cout << "Overall F1 Score: " << (overallF1 * 100.0) << "%" << std::endl;
  std::cout << std::endl;

  // Filter out perfect matches from the "worst cases" section so that only actionable failures appear.
  auto hasImperfection = [](const EvaluationResult& r) { return r.weightedScore < 0.999999; };
  std::vector<std::pair<TestCase, EvaluationResult>> imperfectCases;
  imperfectCases.reserve(worstCases.size());
  for (const auto& entry : worstCases) {
    if (hasImperfection(entry.second)) {
      imperfectCases.push_back(entry);
    }
  }

  std::cout << "--- Worst Cases (lowest weighted scores) ---" << std::endl;
  int showCount = std::min(10, static_cast<int>(imperfectCases.size()));
  for (int i = 0; i < showCount; i++) {
    const auto& testCase = imperfectCases[i].first;
    const auto& result = imperfectCases[i].second;

    std::vector<size_t> actualPositions = hyphenateFunc(testCase.word);
    std::string actualHyphenated = positionsToHyphenated(testCase.word, actualPositions);

    std::cout << "Word: " << testCase.word << " (freq: " << testCase.frequency << ")" << std::endl;
    std::cout << " Expected: " << testCase.hyphenated << std::endl;
    std::cout << " Got: " << actualHyphenated << std::endl;
    std::cout << " Precision: " << (result.precision * 100.0) << "%"
              << " Recall: " << (result.recall * 100.0) << "%"
              << " F1: " << (result.f1Score * 100.0) << "%"
              << " Weighted: " << (result.weightedScore * 100.0) << "%" << std::endl;
    std::cout << " TP: " << result.truePositives << " FP: " << result.falsePositives
              << " FN: " << result.falseNegatives << std::endl;
    std::cout << std::endl;
  }

  // Additional compact list of the worst ~100 words to aid iteration
  int compactCount = std::min(100, static_cast<int>(imperfectCases.size()));
  if (compactCount > 0) {
    std::cout << "--- Compact Worst Cases (" << compactCount << ") ---" << std::endl;
    for (int i = 0; i < compactCount; i++) {
      const auto& testCase = imperfectCases[i].first;
      std::vector<size_t> actualPositions = hyphenateFunc(testCase.word);
      std::string actualHyphenated = positionsToHyphenated(testCase.word, actualPositions);
      std::cout << testCase.word << " | exp:" << testCase.hyphenated << " | got:" << actualHyphenated << std::endl;
    }
    std::cout << std::endl;
  }
}

int main(int argc, char* argv[]) {
  const bool summaryMode = argc <= 1;
  const std::string languageSelection = summaryMode ? "all" : argv[1];

  std::vector<LanguageConfig> languages = resolveLanguages(languageSelection);
  if (languages.empty()) {
    std::cerr << "Unknown language: " << languageSelection << std::endl;
    return 1;
  }

  for (const auto& lang : languages) {
    const auto* hyphenator = getLanguageHyphenatorForPrimaryTag(lang.primaryTag);
    if (!hyphenator) {
      std::cerr << "No hyphenator registered for tag: " << lang.primaryTag << std::endl;
      continue;
    }
    const auto hyphenateFunc = [hyphenator](const std::string& word) {
      return hyphenateWordWithHyphenator(word, *hyphenator);
    };

    if (!summaryMode) {
      std::cout << "Loading test data from: " << lang.testDataFile << std::endl;
    }
    std::vector<TestCase> testCases = loadTestData(lang.testDataFile);

    if (testCases.empty()) {
      std::cerr << "No test cases loaded for " << lang.cliName << ". Skipping." << std::endl;
      continue;
    }

    if (!summaryMode) {
      std::cout << "Loaded " << testCases.size() << " test cases for " << lang.cliName << std::endl;
      std::cout << std::endl;
    }

    int perfectMatches = 0;
    int partialMatches = 0;
    int completeMisses = 0;

    double totalPrecision = 0.0;
    double totalRecall = 0.0;
    double totalF1 = 0.0;
    double totalWeighted = 0.0;

    int totalTP = 0, totalFP = 0, totalFN = 0;

    std::vector<std::pair<TestCase, EvaluationResult>> worstCases;

    for (const auto& testCase : testCases) {
      EvaluationResult result = evaluateWord(testCase, hyphenateFunc);

      totalTP += result.truePositives;
      totalFP += result.falsePositives;
      totalFN += result.falseNegatives;

      totalPrecision += result.precision;
      totalRecall += result.recall;
      totalF1 += result.f1Score;
      totalWeighted += result.weightedScore;

      if (result.f1Score == 1.0) {
        perfectMatches++;
      } else if (result.f1Score > 0.0) {
        partialMatches++;
      } else {
        completeMisses++;
      }

      worstCases.push_back({testCase, result});
    }

    if (summaryMode) {
      const double averageF1Percent = testCases.empty() ? 0.0 : (totalF1 / testCases.size() * 100.0);
      std::cout << lang.cliName << ": " << averageF1Percent << "%" << std::endl;
      continue;
    }

    std::sort(worstCases.begin(), worstCases.end(),
              [](const auto& a, const auto& b) { return a.second.weightedScore < b.second.weightedScore; });

    printResults(lang.cliName, testCases, worstCases, perfectMatches, partialMatches, completeMisses, totalPrecision,
                 totalRecall, totalF1, totalWeighted, totalTP, totalFP, totalFN, hyphenateFunc);
  }

  return 0;
}
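As a worked example of the scoring above: if the reference hyphenation of a word has break points {3, 6} and the algorithm proposes {3, 8}, then TP = 1, FP = 1, FN = 1, so precision = recall = F1 = 0.5, and the weighted score is 1 − (2·1 + 1·1) / (2·2) = 0.25, reflecting the double penalty on the spurious break.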
5012
test/hyphenation_eval/resources/english_hyphenation_tests.txt
Normal file
File diff suppressed because it is too large
5012
test/hyphenation_eval/resources/french_hyphenation_tests.txt
Normal file
File diff suppressed because it is too large
@ -0,0 +1,232 @@
"""
Generate hyphenation test data from a text file.

This script extracts unique words from a book and generates ground truth
hyphenations using the pyphen library, which can be used to test and validate
the hyphenation implementations (e.g., German, English, Russian).

Usage:
    python generate_hyphenation_test_data.py <input_file> <output_file>
        [--language de_DE] [--max-words 5000] [--min-prefix 2] [--min-suffix 2]

Requirements:
    pip install pyphen
"""

import argparse
import re
from collections import Counter
import pyphen
from pathlib import Path
import zipfile


def extract_text_from_epub(epub_path):
    """Extract textual content from an .epub archive by concatenating HTML/XHTML files."""
    texts = []
    with zipfile.ZipFile(epub_path, "r") as z:
        for name in z.namelist():
            lower = name.lower()
            if (
                lower.endswith(".xhtml")
                or lower.endswith(".html")
                or lower.endswith(".htm")
            ):
                try:
                    data = z.read(name).decode("utf-8", errors="ignore")
                except Exception:
                    continue
                # Remove tags
                text = re.sub(r"<[^>]+>", " ", data)
                texts.append(text)
    return "\n".join(texts)


def extract_words(text):
    """Extract all words from text, preserving original case."""
    # Match runs of Unicode letters (any script) while excluding digits/underscores
    return re.findall(r"[^\W\d_]+", text, flags=re.UNICODE)


def clean_word(word):
    """Normalize word for hyphenation testing."""
    # Keep the original case; only surrounding whitespace is stripped here.
    return word.strip()


def generate_hyphenation_data(
    input_file,
    output_file,
    language="de_DE",
    min_length=6,
    max_words=5000,
    min_prefix=2,
    min_suffix=2,
):
    """
    Generate hyphenation test data from a text file.

    Args:
        input_file: Path to input text file
        output_file: Path to output file with hyphenation data
        language: Language code for pyphen (e.g., 'de_DE', 'en_US')
        min_length: Minimum word length to include
        max_words: Maximum number of words to include (default: 5000)
        min_prefix: Minimum characters allowed before the first hyphen (default: 2)
        min_suffix: Minimum characters allowed after the last hyphen (default: 2)
    """
    print(f"Reading from: {input_file}")

    # Read the input file
    if str(input_file).lower().endswith(".epub"):
        print("Detected .epub input; extracting HTML content")
        text = extract_text_from_epub(input_file)
    else:
        with open(input_file, "r", encoding="utf-8") as f:
            text = f.read()

    # Extract words
    print("Extracting words...")
    words = extract_words(text)
    print(f"Found {len(words)} total words")

    # Count word frequencies
    word_counts = Counter(words)
    print(f"Found {len(word_counts)} unique words")

    # Initialize pyphen hyphenator
    print(
        f"Initializing hyphenator for language: {language} (min_prefix={min_prefix}, min_suffix={min_suffix})"
    )
    try:
        hyphenator = pyphen.Pyphen(lang=language, left=min_prefix, right=min_suffix)
    except KeyError:
        print(f"Error: Language '{language}' not found in pyphen.")
        print("Available languages include: de_DE, en_US, en_GB, fr_FR, etc.")
        return

    # Generate hyphenations
    print("Generating hyphenations...")
    hyphenation_data = []

    # Sort by frequency (most common first) then alphabetically
    sorted_words = sorted(word_counts.items(), key=lambda x: (-x[1], x[0].lower()))

    for word, count in sorted_words:
        # Filter by minimum length
        if len(word) < min_length:
            continue

        # Get hyphenation (may produce no '=' characters)
        hyphenated = hyphenator.inserted(word, hyphen="=")

        # Include all words (so we can take the top N most common words even if
        # they don't have hyphenation points). This replaces the previous filter
        # which dropped words without '='.
        hyphenation_data.append(
            {"word": word, "hyphenated": hyphenated, "count": count}
        )

        # Stop if we've reached max_words
        if max_words and len(hyphenation_data) >= max_words:
            break

    print(f"Generated {len(hyphenation_data)} hyphenated words")

    # Write output file
    print(f"Writing to: {output_file}")
    with open(output_file, "w", encoding="utf-8") as f:
        # Write header with metadata
        f.write("# Hyphenation Test Data\n")
        f.write(f"# Source: {Path(input_file).name}\n")
        f.write(f"# Language: {language}\n")
        f.write(f"# Min prefix: {min_prefix}\n")
        f.write(f"# Min suffix: {min_suffix}\n")
        f.write(f"# Total words: {len(hyphenation_data)}\n")
        f.write("# Format: word | hyphenated_form | frequency_in_source\n")
        f.write("#\n")
        f.write("# Hyphenation points are marked with '='\n")
        f.write("# Example: Silbentrennung -> Sil=ben=tren=nung\n")
        f.write("#\n\n")

        # Write data
        for item in hyphenation_data:
            f.write(f"{item['word']}|{item['hyphenated']}|{item['count']}\n")

    print("Done!")

    # Print some statistics
    print("\n=== Statistics ===")
    print(f"Total unique words extracted: {len(word_counts)}")
    print(f"Words written to output: {len(hyphenation_data)}")
    print(
        f"Average hyphenation points per word: {sum(h['hyphenated'].count('=') for h in hyphenation_data) / len(hyphenation_data):.2f}"
    )

    # Print some examples
    print("\n=== Examples (first 10) ===")
    for item in hyphenation_data[:10]:
        print(
            f"  {item['word']:20} -> {item['hyphenated']:30} (appears {item['count']}x)"
        )


def main():
    parser = argparse.ArgumentParser(
        description="Generate hyphenation test data from a text file",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  # Generate test data from a German book
  python generate_hyphenation_test_data.py ../data/books/bobiverse_1.txt hyphenation_test_data.txt

  # Limit to 500 most common words
  python generate_hyphenation_test_data.py ../data/books/bobiverse_1.txt hyphenation_test_data.txt --max-words 500

  # Use English hyphenation (when available)
  python generate_hyphenation_test_data.py book.txt test_en.txt --language en_US
""",
    )

    parser.add_argument("input_file", help="Input text file to extract words from")
    parser.add_argument("output_file", help="Output file for hyphenation test data")
    parser.add_argument(
        "--language", default="de_DE", help="Language code (default: de_DE)"
    )
    parser.add_argument(
        "--min-length", type=int, default=6, help="Minimum word length (default: 6)"
    )
    parser.add_argument(
        "--max-words",
        type=int,
        default=5000,
        help="Maximum number of words to include (default: 5000)",
    )
    parser.add_argument(
        "--min-prefix",
        type=int,
        default=2,
        help="Minimum characters permitted before the first hyphen (default: 2)",
    )
    parser.add_argument(
        "--min-suffix",
        type=int,
        default=2,
        help="Minimum characters permitted after the last hyphen (default: 2)",
    )

    args = parser.parse_args()

    generate_hyphenation_data(
        args.input_file,
        args.output_file,
        language=args.language,
        min_length=args.min_length,
        max_words=args.max_words,
        min_prefix=args.min_prefix,
        min_suffix=args.min_suffix,
    )


if __name__ == "__main__":
    main()
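The generator above writes the same `word|hyphenated|frequency` layout that the evaluation harness parses, e.g. a line such as `Silbentrennung|Sil=ben=tren=nung|42` (illustrative values); the bundled `*_hyphenation_tests.txt` fixtures below were presumably produced this way, with pyphen serving as the ground truth.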
5012
test/hyphenation_eval/resources/german_hyphenation_tests.txt
Normal file
File diff suppressed because it is too large
5012
test/hyphenation_eval/resources/russian_hyphenation_tests.txt
Normal file
File diff suppressed because it is too large
32
test/run_hyphenation_eval.sh
Executable file
@ -0,0 +1,32 @@
#!/usr/bin/env bash
set -euo pipefail

ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
BUILD_DIR="$ROOT_DIR/build/hyphenation_eval"
BINARY="$BUILD_DIR/HyphenationEvaluationTest"

mkdir -p "$BUILD_DIR"

SOURCES=(
  "$ROOT_DIR/test/hyphenation_eval/HyphenationEvaluationTest.cpp"
  "$ROOT_DIR/lib/Epub/Epub/hyphenation/Hyphenator.cpp"
  "$ROOT_DIR/lib/Epub/Epub/hyphenation/LanguageRegistry.cpp"
  "$ROOT_DIR/lib/Epub/Epub/hyphenation/LiangHyphenation.cpp"
  "$ROOT_DIR/lib/Epub/Epub/hyphenation/HyphenationCommon.cpp"
  "$ROOT_DIR/lib/Utf8/Utf8.cpp"
)

CXXFLAGS=(
  -std=c++20
  -O2
  -Wall
  -Wextra
  -pedantic
  -I"$ROOT_DIR"
  -I"$ROOT_DIR/lib"
  -I"$ROOT_DIR/lib/Utf8"
)

c++ "${CXXFLAGS[@]}" "${SOURCES[@]}" -o "$BINARY"

"$BINARY" "$@"
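Running `./test/run_hyphenation_eval.sh` with no arguments compiles the harness and prints the per-language average F1 summary, while passing a language name, e.g. `./test/run_hyphenation_eval.sh german`, produces the detailed report with the worst-case listings.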