## Summary

This PR applies a micro-optimization to `SerializedHyphenationPatterns` that allows `rootOffset` to be read directly instead of being parsed and then cached. It should not affect storage space, since no new bytes are added. It also removes the linear cache search that ran whenever `liangBreakIndexes` was called.

In theory, performance should improve slightly, although the gain may be too small to notice in practice.

## Testing

master branch:

```
english: 99.1023%
french: 100%
german: 97.7289%
russian: 97.2167%
spanish: 99.0236%
```

This PR:

```
english: 99.1023%
french: 100%
german: 97.7289%
russian: 97.2167%
spanish: 99.0236%
```

The results are identical on both branches.

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers.

Did you use AI tools to help write this code? PARTIALLY - mostly IDE tab-autocompletions
#!/usr/bin/env bash
set -euo pipefail

# Resolve the parent of this script's directory and run from there.
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"

cd "$ROOT_DIR"

# Download one language's trie from typst/hypher and generate its header.
process() {
  local lang="$1"

  mkdir -p "build"
  wget -O "build/$lang.bin" "https://github.com/typst/hypher/raw/refs/heads/main/tries/$lang.bin"

  python scripts/generate_hyphenation_trie.py \
    --input "build/$lang.bin" \
    --output "lib/Epub/Epub/hyphenation/generated/hyph-${lang}.trie.h"
}

process en
process fr
process de
process es
process ru
process it
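
The summary describes reading `rootOffset` directly instead of parsing it at runtime. As a rough sketch of what that parse step could look like when done ahead of time, the helper below prints the root offset of a downloaded trie blob. It assumes the blob begins with a 4-byte big-endian root address; that layout is an assumption about the hypher format and is not confirmed by this PR. `read_root_offset` is a hypothetical name and is not part of the repository.

```bash
# Hypothetical helper, not part of this repository.
# ASSUMPTION: the first 4 bytes of the blob are a big-endian u32 root offset.
read_root_offset() {
  local file="$1"

  # od prints the first 4 bytes as unsigned decimals, e.g. "   0   0   1  44".
  # Word splitting is intentional: each byte becomes one positional parameter.
  set -- $(od -An -v -tu1 -N4 "$file")

  # Anything other than 4 values means no complete header could be read.
  if [ "$#" -ne 4 ]; then
    echo "read_root_offset: could not read a 4-byte header from $file" >&2
    return 1
  fi

  # Combine the bytes, most significant first.
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}
```

After the script above has run, `read_root_offset build/en.bin` would print the decoded value under that assumption. A generator could then emit the number as a constant so that no parsing or caching is needed at runtime.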