Mirror of https://github.com/CloudNebulaProject/wayray.git (synced 2026-04-10 13:10:41 +00:00)
ADR-003: Frame Encoding Strategy
Status
Accepted
Context
WayRay must transmit rendered Wayland compositor output from server to client efficiently. The encoding must balance bandwidth, latency, and visual quality. Different regions of the display have different characteristics (static UI vs video vs text) that benefit from different encoding strategies.
Options Considered
1. Full-frame video encoding (H.264/AV1 everything)
- Simple pipeline: render -> encode entire frame -> transmit -> decode
- Good for video-heavy workloads
- Introduces compression artifacts on text and UI elements
- Consistent latency characteristics
- Hardware encoding available (VAAPI, NVENC)
2. Differential lossless (like waypipe)
- XOR diff against previous frame, compress with zstd/lz4
- Perfect visual quality
- Very efficient for mostly-static displays
- Bandwidth spikes when large areas change
- No hardware acceleration
3. Content-adaptive hybrid (like SPICE)
- Heuristic classification of regions: text, UI, video, image
- Each region encoded with optimal codec
- Best quality-bandwidth tradeoff
- Most complex to implement
- SPICE proved this approach works at scale
4. Tile-based encoding
- Divide frame into fixed-size tiles (e.g., 64x64)
- Only encode tiles that changed (via damage tracking)
- Each tile encoded independently (parallel-friendly)
- Mix lossless and lossy per-tile based on content heuristics
- Good balance of complexity and effectiveness
Decision
Tile-based encoding with content-adaptive per-tile codec selection, implemented progressively:
Stage 1 (MVP): Differential lossless
- Use OutputDamageTracker damage rectangles
- XOR diff against previous frame for damaged regions
- Compress with zstd
- Good enough for development and LAN use
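The Stage 1 lossless path can be sketched as follows. This is a minimal illustration, not WayRay's actual implementation: the `Rect` type and the tightly packed RGBA layout are assumptions, and the zstd compression step is omitted (the diff buffer would simply be handed to a zstd encoder).

```rust
/// A damage rectangle, in pixels. Illustrative type; Smithay's damage
/// tracker exposes its own rectangle representation.
#[derive(Clone, Copy)]
struct Rect { x: usize, y: usize, w: usize, h: usize }

/// XOR-diff one damage rectangle of `current` against `previous`.
/// Both buffers are tightly packed RGBA (4 bytes/pixel, `stride` pixels wide).
/// Unchanged pixels XOR to zero, so mostly-static regions produce long
/// zero runs that zstd compresses extremely well.
fn xor_diff(current: &[u8], previous: &[u8], stride: usize, r: Rect) -> Vec<u8> {
    let mut out = Vec::with_capacity(r.w * r.h * 4);
    for row in r.y..r.y + r.h {
        let start = (row * stride + r.x) * 4;
        let end = start + r.w * 4;
        for (c, p) in current[start..end].iter().zip(&previous[start..end]) {
            out.push(c ^ p);
        }
    }
    out
}

fn main() {
    // 4x4 RGBA frames; only the red channel of pixel (1,1) differs.
    let prev = vec![0u8; 4 * 4 * 4];
    let mut curr = prev.clone();
    curr[(1 * 4 + 1) * 4] = 0xFF;
    let diff = xor_diff(&curr, &prev, 4, Rect { x: 0, y: 0, w: 4, h: 4 });
    // Every byte is zero except the one changed channel.
    assert_eq!(diff.iter().filter(|&&b| b != 0).count(), 1);
}
```

The client applies the inverse operation: decompress, then XOR the diff back onto its copy of the previous frame.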
Stage 2: Tile-based with lossy option
- Divide into 64x64 tiles, only process damaged tiles
- Lossless path: zstd-compressed diff (for text, UI)
- Lossy path: JPEG/WebP for photographic content
- Tile-level quality selection based on content entropy
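The entropy heuristic for the per-tile codec choice could start as simply as the sketch below: estimate the Shannon entropy of a tile's bytes and route low-entropy tiles (flat UI, text backgrounds) to the lossless path and high-entropy tiles (photographic content) to the lossy path. The 5.0-bit threshold is a placeholder, not a tuned value.

```rust
/// Shannon entropy (bits per byte) over a tile's raw bytes.
fn byte_entropy(tile: &[u8]) -> f64 {
    let mut hist = [0u64; 256];
    for &b in tile { hist[b as usize] += 1; }
    let n = tile.len() as f64;
    hist.iter().filter(|&&c| c > 0).map(|&c| {
        let p = c as f64 / n;
        -p * p.log2()
    }).sum()
}

#[derive(Debug, PartialEq)]
enum TilePath { Lossless, Lossy }

/// Placeholder threshold: flat/synthetic content sits well below 5 bits,
/// photographic content approaches 8 bits per byte.
fn choose_path(tile: &[u8]) -> TilePath {
    if byte_entropy(tile) < 5.0 { TilePath::Lossless } else { TilePath::Lossy }
}

fn main() {
    // Uniform tile (e.g. a blank editor background): entropy is 0.
    let flat = vec![0x20u8; 64 * 64 * 4];
    assert_eq!(choose_path(&flat), TilePath::Lossless);

    // Pseudo-random "photographic" tile via a tiny LCG (no external crates).
    let mut x: u32 = 12345;
    let noisy: Vec<u8> = (0..64 * 64 * 4).map(|_| {
        x = x.wrapping_mul(1664525).wrapping_add(1013904223);
        (x >> 24) as u8
    }).collect();
    assert_eq!(choose_path(&noisy), TilePath::Lossy);
}
```

Byte-level entropy deliberately ignores spatial structure; a later refinement could also consider gradient statistics or color counts per tile.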
Stage 3: Hardware video encoding
- Detect video regions (rapidly changing tiles)
- Route video regions to H.264 encoder (VAAPI/NVENC)
- Keep text/UI regions lossless
- AV1 as alternative for better quality at same bitrate
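The "rapidly changing tiles" detection in Stage 3 could be tracked with a per-tile sliding window of damage history, as in this sketch. The 32-frame window and 24-frame threshold are illustrative numbers, not tuned values.

```rust
/// Tracks, per tile, which of the last 32 frames damaged it
/// (one bit per frame in a u32 mask).
struct TileActivity {
    masks: Vec<u32>,
}

impl TileActivity {
    fn new(tiles: usize) -> Self { Self { masks: vec![0; tiles] } }

    /// Advance one frame; `damaged` lists the tile indices that changed.
    fn record(&mut self, damaged: &[usize]) {
        for m in &mut self.masks { *m <<= 1; }
        for &t in damaged { self.masks[t] |= 1; }
    }

    /// A tile damaged in 24+ of the last 32 frames is treated as video
    /// and routed to the hardware encoder.
    fn is_video(&self, tile: usize) -> bool {
        self.masks[tile].count_ones() >= 24
    }
}

fn main() {
    let mut act = TileActivity::new(2);
    for frame in 0..32 {
        // Tile 0 changes every frame (video); tile 1 changes once (UI click).
        let damaged = if frame == 5 { vec![0, 1] } else { vec![0] };
        act.record(&damaged);
    }
    assert!(act.is_video(0));
    assert!(!act.is_video(1));
}
```

A hysteresis band (separate promote/demote thresholds) would avoid tiles flapping between the video and lossless paths at the window boundary.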
Frame Update Message Structure
FrameUpdate {
    sequence: u64,
    timestamp: u64,
    full_width: u32,
    full_height: u32,
    regions: Vec<EncodedRegion>,
}

EncodedRegion {
    x: u32,
    y: u32,
    width: u32,
    height: u32,
    encoding: Encoding,  // Zstd, Jpeg, H264, Raw
    data: Vec<u8>,
}
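Since a single FrameUpdate can mix encodings, the client decode loop dispatches per region. The sketch below fleshes out only the Raw path; in a real client the other match arms would call the respective decoders (zstd, a JPEG library, VAAPI/hardware H.264), and the error type would be richer than a String.

```rust
#[derive(Debug)]
enum Encoding { Raw, Zstd, Jpeg, H264 }

struct EncodedRegion {
    x: u32, y: u32, width: u32, height: u32,
    encoding: Encoding,
    data: Vec<u8>,
}

/// Decode one region and blit it into the client's RGBA framebuffer
/// (`fb_width` pixels wide, tightly packed).
fn apply_region(fb: &mut [u8], fb_width: u32, r: &EncodedRegion) -> Result<(), String> {
    let pixels: Vec<u8> = match r.encoding {
        Encoding::Raw => r.data.clone(),
        // Placeholder arms: a real client decompresses / decodes here.
        Encoding::Zstd => return Err("zstd decode not shown in sketch".into()),
        Encoding::Jpeg => return Err("jpeg decode not shown in sketch".into()),
        Encoding::H264 => return Err("h264 decode not shown in sketch".into()),
    };
    for row in 0..r.height {
        let src = (row * r.width * 4) as usize;
        let dst = (((r.y + row) * fb_width + r.x) * 4) as usize;
        fb[dst..dst + (r.width * 4) as usize]
            .copy_from_slice(&pixels[src..src + (r.width * 4) as usize]);
    }
    Ok(())
}

fn main() {
    let mut fb = vec![0u8; 8 * 8 * 4]; // 8x8 RGBA client framebuffer
    let region = EncodedRegion {
        x: 2, y: 2, width: 2, height: 2,
        encoding: Encoding::Raw,
        data: vec![0xFF; 2 * 2 * 4],
    };
    apply_region(&mut fb, 8, &region).unwrap();
    assert_eq!(fb[((2 * 8 + 2) * 4) as usize], 0xFF); // pixel (2,2) written
    assert_eq!(fb[0], 0); // pixel (0,0) untouched
}
```

Because regions are independent, the client can decode them in parallel and blit in any order, as long as all regions of a FrameUpdate are applied before presenting.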
Rationale
- Progressive approach: ship MVP quickly, optimize later
- Damage tracking from Smithay gives us changed regions for free
- Tile-based approach is parallelizable (encode tiles on multiple cores)
- Content-adaptive encoding avoids the "text looks blurry" problem of pure video encoding
- Hardware encoding path available when needed for video content
- Lossless path ensures perfect text rendering (critical for terminal/code editors)
Consequences
- Stage 1 is bandwidth-hungry on large screen changes (acceptable for LAN)
- Need heuristics for tile content classification (can start simple: entropy-based)
- Hardware encoding adds dependency on system GPU/driver support
- Client must handle mixed encodings within a single frame update