Files
crosspoint-reader-mod/lib/KOReaderSync/KOReaderDocumentId.h
cottongin 3628d8eb37 feat: port upstream KOReader sync PRs (#1185, #1217, #1090)
Port three unmerged upstream PRs with adaptations for the fork's
callback-based ActivityWithSubactivity architecture:

- PR #1185: Cache KOReader document hash using mtime fingerprint +
  file size validation to avoid repeated MD5 computation on sync.
- PR #1217: Proper KOReader XPath synchronisation via new
  ChapterXPathIndexer (Expat-based on-demand XHTML parsing) with
  XPath-first mapping and percentage fallback in ProgressMapper.
- PR #1090: Push Progress & Sleep menu option with PUSH_ONLY sync
  mode. Adapted to fork's callback pattern with deferFinish() for
  thread-safe completion. Modified to sleep silently on any failure
  (hash, upload, no credentials) rather than returning to reader.

Made-with: Cursor
2026-03-02 05:19:14 -05:00

73 lines
3.0 KiB
C++
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
#pragma once
#include <string>
/**
* Calculate KOReader document ID (partial MD5 hash).
*
* KOReader identifies documents using a partial MD5 hash of the file content.
* The algorithm reads 1024 bytes at specific offsets and computes the MD5 hash
* of the concatenated data.
*
* Offsets are calculated as: 1024 << (2*i) for i = -1 to 10
* Producing: 256, 1024, 4096, 16384, 65536, 262144, 1048576, 4194304,
* 16777216, 67108864, 268435456, 1073741824 bytes
*
* If an offset is beyond the file size, it is skipped.
*/
class KOReaderDocumentId {
public:
/**
* Calculate the KOReader document hash for a file (binary/content-based).
*
* @param filePath Path to the file (typically an EPUB)
* @return 32-character lowercase hex string, or empty string on failure
*/
static std::string calculate(const std::string& filePath);
/**
* Calculate document hash from filename only (filename-based sync mode).
* This is simpler and works when files have the same name across devices.
*
* @param filePath Path to the file (only the filename portion is used)
* @return 32-character lowercase hex MD5 of the filename
*/
static std::string calculateFromFilename(const std::string& filePath);
private:
// Size of each chunk to read at each offset
static constexpr size_t CHUNK_SIZE = 1024;
// Number of offsets to try (i = -1 to 10, so 12 offsets)
static constexpr int OFFSET_COUNT = 12;
// Calculate offset for index i: 1024 << (2*i)
static size_t getOffset(int i);
// Hash cache helpers
// Returns the path to the per-book cache file that stores the precomputed hash.
// Uses the same directory convention as the Epub cache (/.crosspoint/epub_<hash>/).
static std::string getCacheFilePath(const std::string& filePath);
// Returns the cached hash if the file size and fingerprint match, or empty
// string on miss/invalidation.
//
// The fingerprint is derived from the file's modification timestamp. We
// call `FsFile::getModifyDateTime` to retrieve two 16bit packed values
// supplied by the filesystem: one for the date and one for the time. These
// are concatenated and represented as eight hexadecimal digits in the form
// <date><time> (high 16 bits = packed date, low 16 bits = packed time).
//
// The resulting string serves as a lightweight change signal; any modification
// to the file's mtime will alter the packed date/time combo and invalidate
// the cache entry. Since the full document hash is expensive to compute,
// using the packed timestamp gives us a quick way to detect modifications
// without reading file contents.
static std::string loadCachedHash(const std::string& cacheFilePath, size_t fileSize,
const std::string& currentFingerprint);
// Persists the computed hash alongside the file size and fingerprint (the
// modification-timestamp token) used to generate it.
static void saveCachedHash(const std::string& cacheFilePath, size_t fileSize, const std::string& fingerprint,
const std::string& hash);
};