Files
crosspoint-reader-mod/scripts/build_html.py
Xuan-Son Nguyen 5816ab2a47
Some checks failed
CI (build) / clang-format (push) Has been cancelled
CI (build) / cppcheck (push) Has been cancelled
CI (build) / build (push) Has been cancelled
CI (build) / Test Status (push) Has been cancelled
feat: use pre-compressed HTML pages (#861)
## Summary

Pre-compress the HTML file to save flash space. I'm using `gzip` because
it's supported everywhere (indeed, we are using the same optimization on
[llama.cpp server](https://github.com/ggml-org/llama.cpp), our HTML page
is huge 😅 ).

This free up ~40KB flash space.

Some users suggested using `brotli` which is known to further reduce 20%
in size, but it doesn't supported by firefox (only supports if served
via HTTPS), and some reverse proxy like nginx doesn't support it out of
the box (unrelated in this context, but just mention for completeness)

```
PR:
RAM:   [===       ]  31.0% (used 101700 bytes from 327680 bytes)
Flash: [==========]  95.5% (used 6259244 bytes from 6553600 bytes)

master:
RAM:   [===       ]  31.0% (used 101700 bytes from 327680 bytes)
Flash: [==========]  96.2% (used 6302416 bytes from 6553600 bytes)
```

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? **PARTIALLY**, only the
python part
2026-02-14 18:49:39 +03:00

75 lines
2.9 KiB
Python

import os
import re
import gzip
SRC_DIR = "src"
def minify_html(html: str) -> str:
# Tags where whitespace should be preserved
preserve_tags = ['pre', 'code', 'textarea', 'script', 'style']
preserve_regex = '|'.join(preserve_tags)
# Protect preserve blocks with placeholders
preserve_blocks = []
def preserve(match):
preserve_blocks.append(match.group(0))
return f"__PRESERVE_BLOCK_{len(preserve_blocks)-1}__"
html = re.sub(rf'<({preserve_regex})[\s\S]*?</\1>', preserve, html, flags=re.IGNORECASE)
# Remove HTML comments
html = re.sub(r'<!--.*?-->', '', html, flags=re.DOTALL)
# Collapse all whitespace between tags
html = re.sub(r'>\s+<', '><', html)
# Collapse multiple spaces inside tags
html = re.sub(r'\s+', ' ', html)
# Restore preserved blocks
for i, block in enumerate(preserve_blocks):
html = html.replace(f"__PRESERVE_BLOCK_{i}__", block)
return html.strip()
for root, _, files in os.walk(SRC_DIR):
for file in files:
if file.endswith(".html"):
html_path = os.path.join(root, file)
with open(html_path, "r", encoding="utf-8") as f:
html_content = f.read()
# minified = regex.sub("\g<1>", html_content)
minified = minify_html(html_content)
# Compress with gzip (compresslevel 9 is maximum compression)
# IMPORTANT: we don't use brotli because Firefox doesn't support brotli with insecured context (only supported on HTTPS)
compressed = gzip.compress(minified.encode('utf-8'), compresslevel=9)
base_name = f"{os.path.splitext(file)[0]}Html"
header_path = os.path.join(root, f"{base_name}.generated.h")
with open(header_path, "w", encoding="utf-8") as h:
h.write(f"// THIS FILE IS AUTOGENERATED, DO NOT EDIT MANUALLY\n\n")
h.write(f"#pragma once\n")
h.write(f"#include <cstddef>\n\n")
# Write the compressed data as a byte array
h.write(f"constexpr char {base_name}[] PROGMEM = {{\n")
# Write bytes in rows of 16
for i in range(0, len(compressed), 16):
chunk = compressed[i:i+16]
hex_values = ', '.join(f'0x{b:02x}' for b in chunk)
h.write(f" {hex_values},\n")
h.write(f"}};\n\n")
h.write(f"constexpr size_t {base_name}CompressedSize = {len(compressed)};\n")
h.write(f"constexpr size_t {base_name}OriginalSize = {len(minified)};\n")
print(f"Generated: {header_path}")
print(f" Original: {len(html_content)} bytes")
print(f" Minified: {len(minified)} bytes ({100*len(minified)/len(html_content):.1f}%)")
print(f" Compressed: {len(compressed)} bytes ({100*len(compressed)/len(html_content):.1f}%)")