Arthur Tazhitdinov 8824c87490
feat: dict based Hyphenation (#305)
## Summary

* Adds (optional) Hyphenation for English, French, German, Russian
languages

## Additional Context

* Included hyphenation dictionaries add approximately 280kb to the flash
usage (German alone takes 200kb)
* Trie encoded dictionaries are adopted from hypher project
(https://github.com/typst/hypher)
* Soft hyphens (and other explicit hyphens) take precedence over
dict-based hyphenation. Overall, the hyphenation rules are quite
aggressive, as I believe it makes more sense on our smaller screen.

---------

Co-authored-by: Dave Allie <dave@daveallie.com>
2026-01-19 12:56:26 +00:00

24 lines
748 B
C++

#pragma once
#include <cstddef>
#include <string>
#include <vector>
class LanguageHyphenator;
class Hyphenator {
public:
struct BreakInfo {
size_t byteOffset;
bool requiresInsertedHyphen;
};
// Returns byte offsets where the word may be hyphenated. When includeFallback is true, all positions obeying the
// minimum prefix/suffix constraints are returned even if no language-specific rule matches.
static std::vector<BreakInfo> breakOffsets(const std::string& word, bool includeFallback);
// Provide a publication-level language hint (e.g. "en", "en-US", "ru") used to select hyphenation rules.
static void setPreferredLanguage(const std::string& lang);
private:
static const LanguageHyphenator* cachedHyphenator_;
};