wayray/docs/ai/plans/001-implementation-roadmap.md
Till Wegmueller a373ea1c41
Add greeter/session-launch architecture, clarify scope boundary
WayRay is a compositor, not a DE or login system. GNOME/KDE cannot
run on WayRay (they ARE compositors). The desktop is composed from
independent Wayland clients (pluggable WM + panel + launcher + apps).

- ADR-010: Greeter as Wayland client, external session launcher
  handles PAM/user env (like greetd for Sway)
- Clarify scope: WayRay owns compositor session + token binding,
  not user auth, home dirs, or environment setup
- Update roadmap with greeter phase and session.toml config
- Update architecture overview with scope boundary section
2026-03-28 21:35:18 +01:00


# Implementation Roadmap
## Phase 0: Foundation (Weeks 1-2)
### 0.1 Project Structure
- Set up Cargo workspace with four crates: `wayray-server`, `wayray-client`, `wayray-protocol`, `wayray-ctl`
- Configure shared dependencies, feature flags, CI (Linux + illumos)
- Set up tracing/logging infrastructure with miette error handling
- Depend on Smithay with `default-features = false`, enabling only portable features in the core crates
- Gate platform-specific backends behind `cfg(target_os)` and feature flags
### 0.2 Minimal Compositor (Server)
- Implement a minimal Smithay compositor based on Smallvil patterns
- Support: wl_compositor, xdg_shell, wl_shm, wl_seat, wl_output
- Use Winit backend for development/testing (works on both Linux and illumos)
- Use `shm_open` with `memfd_create` fallback for shared memory portability
- Verify basic Wayland clients (foot terminal, weston-info) can connect and render
### 0.3 Framebuffer Capture
- Implement ExportMem-based framebuffer capture after each render pass
- Integrate OutputDamageTracker for efficient dirty region tracking
- Benchmark: capture latency, memory bandwidth
## Phase 1: Network Protocol (Weeks 3-5)
### 1.1 Protocol Definition
- Define the WayRay wire protocol in the `wayray-protocol` crate
- Message types: FrameUpdate, InputEvent, SessionControl, AudioChunk, USBData
- Serialization: serde + bincode or postcard for low overhead
- Version negotiation and capability exchange
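
One way version negotiation and capability exchange could work is sketched below. The `Hello` and `Agreed` types and the "highest common version" rule are assumptions made for illustration, not a settled part of the WayRay protocol; only the message kind names come from the list above.

```rust
/// Message kinds named in the roadmap; payloads are omitted in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MessageKind {
    FrameUpdate,
    InputEvent,
    SessionControl,
    AudioChunk,
    UsbData,
}

/// Hypothetical handshake payload: what one peer supports.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Hello {
    pub versions: Vec<u16>,
    pub capabilities: Vec<MessageKind>,
}

/// Result of a successful handshake.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Agreed {
    pub version: u16,
    pub capabilities: Vec<MessageKind>,
}

/// Pick the highest version both peers list and keep only the
/// capabilities both advertise. Returns `None` without a common version.
pub fn negotiate(server: &Hello, client: &Hello) -> Option<Agreed> {
    let version = server
        .versions
        .iter()
        .copied()
        .filter(|v| client.versions.contains(v))
        .max()?;
    let capabilities = server
        .capabilities
        .iter()
        .copied()
        .filter(|c| client.capabilities.contains(c))
        .collect();
    Some(Agreed { version, capabilities })
}
```

With this rule, a client that lacks audio support still connects; the agreed capability set simply omits `AudioChunk`.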
### 1.2 QUIC Transport Layer
- Implement QUIC server (quinn) in `wayray-server`
- Implement QUIC client (quinn) in `wayray-client`
- Stream mapping:
  - Stream 0: Control channel (session mgmt, capabilities)
  - Stream 1: Display channel (frame updates, damage regions)
  - Stream 2: Input channel (keyboard, mouse, touch)
  - Stream 3: Audio channel (Opus frames)
  - Stream 4+: USB device channels (one per device)
- Connection handling: TLS certificates, authentication
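
The stream mapping can be captured as a small enum. This is a sketch that treats the numbers above as logical channel indices; the `Channel` type and its method names are hypothetical. Raw QUIC stream IDs encode the initiator and directionality in their low two bits (RFC 9000), so the logical index would need to be carried separately, for example in a stream header, rather than assumed to equal the QUIC stream ID.

```rust
/// Logical channels from the stream mapping.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Channel {
    Control,
    Display,
    Input,
    Audio,
    /// Zero-based index of the forwarded USB device.
    Usb(u64),
}

impl Channel {
    /// Map a logical stream index to its channel.
    pub fn from_index(index: u64) -> Channel {
        match index {
            0 => Channel::Control,
            1 => Channel::Display,
            2 => Channel::Input,
            3 => Channel::Audio,
            n => Channel::Usb(n - 4),
        }
    }

    /// Inverse of `from_index`.
    pub fn index(self) -> u64 {
        match self {
            Channel::Control => 0,
            Channel::Display => 1,
            Channel::Input => 2,
            Channel::Audio => 3,
            Channel::Usb(device) => 4 + device,
        }
    }
}
```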
### 1.3 Frame Encoding Pipeline
- Implement frame differencing (XOR diff against previous frame)
- Implement region-based compression (zstd for lossless regions)
- Implement content-adaptive encoding:
  - Static regions: lossless zstd diff
  - Video regions: H.264 via ffmpeg-next or VAAPI
  - Text regions: lossless PNG-style encoding
- Damage rectangle merging and optimization
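
The XOR differencing and damage rectangle merging steps can be sketched with the standard library alone. The `Rect` type and function names are assumptions, the merge strategy (bounding box of rectangles that overlap or touch) is one simple option among several, and compression of the diff (e.g. with zstd) is left out.

```rust
/// XOR the current frame against the previous one. Unchanged bytes become 0,
/// which a general-purpose compressor can then shrink well.
pub fn xor_diff(prev: &[u8], cur: &[u8]) -> Vec<u8> {
    assert_eq!(prev.len(), cur.len(), "frames must have identical size");
    prev.iter().zip(cur).map(|(p, c)| p ^ c).collect()
}

/// Rebuild the current frame on the receiving side: prev XOR diff == cur.
pub fn apply_diff(prev: &[u8], diff: &[u8]) -> Vec<u8> {
    assert_eq!(prev.len(), diff.len(), "diff must match frame size");
    prev.iter().zip(diff).map(|(p, d)| p ^ d).collect()
}

/// Axis-aligned damage rectangle in pixels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rect {
    pub x: i32,
    pub y: i32,
    pub w: i32,
    pub h: i32,
}

impl Rect {
    fn overlaps_or_touches(&self, o: &Rect) -> bool {
        self.x <= o.x + o.w
            && o.x <= self.x + self.w
            && self.y <= o.y + o.h
            && o.y <= self.y + self.h
    }

    fn union(&self, o: &Rect) -> Rect {
        let x = self.x.min(o.x);
        let y = self.y.min(o.y);
        let right = (self.x + self.w).max(o.x + o.w);
        let bottom = (self.y + self.h).max(o.y + o.h);
        Rect { x, y, w: right - x, h: bottom - y }
    }
}

/// Merge rectangles that overlap or touch into their bounding boxes,
/// repeating until no two remaining rectangles can be merged.
pub fn merge_damage(mut rects: Vec<Rect>) -> Vec<Rect> {
    let mut merged = true;
    while merged {
        merged = false;
        'outer: for i in 0..rects.len() {
            for j in (i + 1)..rects.len() {
                if rects[i].overlaps_or_touches(&rects[j]) {
                    let u = rects[i].union(&rects[j]);
                    rects[i] = u;
                    rects.swap_remove(j);
                    merged = true;
                    break 'outer;
                }
            }
        }
    }
    rects
}
```

Bounding-box merging trades some extra pixels for fewer regions to encode; a production version would likely cap how much undamaged area a merge may add.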
## Phase 2: Client Viewer (Weeks 5-7)
### 2.1 Display Client
- Implement `wayray-client` as a standalone application
- Use winit + wgpu for cross-platform display
- Frame decoding pipeline: receive -> decompress -> decode -> upload to GPU -> display
- Double-buffered rendering with VSync
### 2.2 Input Capture & Forwarding
- Capture keyboard events (with proper keymap forwarding via xkb)
- Capture mouse events (motion, buttons, scroll)
- Capture touch events
- Serialize and send over QUIC input stream
- Handle keyboard grab/release for compositor key passthrough
### 2.3 Cursor Handling
- Server-side cursor rendering (simplest)
- Client-side cursor rendering with cursor image forwarding (lower latency)
- Cursor shape protocol support
## Phase 2.5: Pluggable Window Management (Weeks 7-8)
### 2.5.1 WM Protocol Definition
- Define `wayray_wm_manager_v1` Wayland protocol XML
- Define `wayray_wm_window_v1`, `wayray_wm_seat_v1`, `wayray_wm_workspace_v1`
- Implement two-phase transaction model (manage + render sequences)
- Generate Rust bindings via wayland-scanner
### 2.5.2 WM Protocol Server (in compositor)
- Implement WM global in `wayray-server`
- Window lifecycle events (new, closed, properties)
- Manage phase: receive policy decisions, send configures
- Render phase: apply positions/z-order atomically
- Keybinding registration and dispatch via seat interface
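
The manage/render split can be illustrated with a staged-then-committed layout. This is a minimal sketch of the idea only; `Layout`, `Placement`, and the method names are hypothetical and do not reflect the actual `wayray_wm_*` protocol objects.

```rust
use std::collections::HashMap;

/// Position, size, and stacking order decided by the window manager.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Placement {
    pub x: i32,
    pub y: i32,
    pub width: u32,
    pub height: u32,
    pub z: u32,
}

/// Compositor-side layout state, keyed by a window id.
#[derive(Default)]
pub struct Layout {
    committed: HashMap<u32, Placement>,
    pending: HashMap<u32, Placement>,
}

impl Layout {
    /// Manage phase: record a policy decision without changing what is shown.
    pub fn stage(&mut self, window: u32, placement: Placement) {
        self.pending.insert(window, placement);
    }

    /// Render phase: apply every staged placement in one step, so no frame
    /// is rendered with a half-updated layout. Returns how many were applied.
    pub fn commit(&mut self) -> usize {
        let applied = self.pending.len();
        self.committed.extend(self.pending.drain());
        applied
    }

    /// Placement currently used for rendering, if any.
    pub fn visible(&self, window: u32) -> Option<Placement> {
        self.committed.get(&window).copied()
    }
}
```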
### 2.5.3 Built-in Floating WM
- Default WM active when no external WM is connected
- Basic floating behavior: centered new windows, focus-follows-click
- Keyboard shortcuts: Alt+F4 close, Alt+Tab cycle, Super+Arrow snap
- Yields to external WM on connect
### 2.5.4 Example Tiling WM
- Ship a reference tiling WM as a separate binary (`wayray-wm-tiling`)
- Demonstrates the protocol for third-party WM developers
- Basic BSP tiling with keyboard-driven focus
## Phase 3: Session Management (Weeks 8-11)
### 3.1 Session Persistence
- Session state machine: Created -> Active -> Suspended -> Resumed -> Destroyed
- Session storage: in-memory with optional persistence (SeaORM + SQLite)
- Session timeout and cleanup policies
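
The session state machine can be written as a pure transition function. The state names come from the list above; the events (`ClientAttached`, `ClientDetached`, `Destroy`) and the exact transition table are assumptions chosen for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Created,
    Active,
    Suspended,
    Resumed,
    Destroyed,
}

/// Hypothetical events driving the state machine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    ClientAttached,
    ClientDetached,
    /// Explicit logout or timeout-based cleanup.
    Destroy,
}

/// Return the next state, or `None` if the event is invalid in this state.
pub fn next(state: SessionState, event: SessionEvent) -> Option<SessionState> {
    use SessionEvent::*;
    use SessionState::*;
    match (state, event) {
        // Destroyed is terminal.
        (Destroyed, _) => None,
        (_, Destroy) => Some(Destroyed),
        (Created, ClientAttached) => Some(Active),
        (Active, ClientDetached) | (Resumed, ClientDetached) => Some(Suspended),
        (Suspended, ClientAttached) => Some(Resumed),
        _ => None,
    }
}
```

Keeping the transition table in one function makes invalid transitions, such as attaching a second client to an `Active` session, an explicit `None` rather than an implicit bug.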
### 3.2 Greeter and Session Launch
- Define session launcher interface (events over Unix socket: session_requested, session_authenticated, session_logout)
- Implement reference session launcher (`wayray-session-launcher`) that:
  - Receives "new session needed" events from WayRay
  - Creates user environment (delegates to PAM, system tools)
  - Starts WayRay compositor session for the user
  - Launches greeter as first Wayland client
- Implement reference greeter (`wayray-greeter`) as a Wayland client:
  - Login form (username + password)
  - Authenticates via PAM through session launcher
  - On success, session launcher starts user's configured session (WM, panel, apps)
  - Greeter exits
- User session config: `~/.config/wayray/session.toml` (WM, panel, launcher, autostart apps)
- Support `wlr-layer-shell` protocol for panels, launchers, notification daemons
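
The session launcher events could be modelled as below. The three event names come from the interface description above; the newline-delimited, whitespace-separated wire format and the `token` and `user` fields are purely hypothetical, since the roadmap does not define the socket encoding.

```rust
/// Events exchanged with the session launcher over the Unix socket.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LauncherEvent {
    SessionRequested { token: String },
    SessionAuthenticated { token: String, user: String },
    SessionLogout { token: String },
}

/// Parse one line of the assumed text format:
/// `<event_name> <token> [user]`. Returns `None` for malformed input.
pub fn parse_event(line: &str) -> Option<LauncherEvent> {
    let mut parts = line.split_whitespace();
    let name = parts.next()?;
    let token = parts.next()?.to_string();
    let event = match name {
        "session_requested" => LauncherEvent::SessionRequested { token },
        "session_authenticated" => LauncherEvent::SessionAuthenticated {
            token,
            user: parts.next()?.to_string(),
        },
        "session_logout" => LauncherEvent::SessionLogout { token },
        _ => return None,
    };
    // Reject unexpected trailing fields.
    if parts.next().is_some() {
        return None;
    }
    Some(event)
}
```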
### 3.3 Token-Based Session Identity
- Token-based session identification (smart card ID, badge, or software token)
- Session-token binding in session store
- WayRay does NOT own authentication -- delegates to session launcher / PAM
### 3.4 Hot-Desking (Session Mobility)
- Token insertion triggers session lookup across server pool
- Session reconnection: rebind existing session to new client endpoint
- Session disconnect: unbind from client, keep session running
- Reconnection latency target: under 500 ms
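
Token binding and hot-desking on a single server can be sketched as a map from token to session. `SessionStore`, `Session`, and the string client endpoint are illustrative assumptions; cross-server lookup and authentication are out of scope here.

```rust
use std::collections::HashMap;

/// A running session and the client endpoint currently showing it, if any.
pub struct Session {
    pub id: u64,
    pub client: Option<String>,
}

/// In-memory session-token binding.
#[derive(Default)]
pub struct SessionStore {
    by_token: HashMap<String, Session>,
    next_id: u64,
}

impl SessionStore {
    /// Token insertion: rebind an existing session to the new client, or
    /// create a session. Returns the session id and whether it was resumed.
    pub fn attach(&mut self, token: &str, client: &str) -> (u64, bool) {
        if let Some(session) = self.by_token.get_mut(token) {
            session.client = Some(client.to_string());
            return (session.id, true);
        }
        let id = self.next_id;
        self.next_id += 1;
        self.by_token.insert(
            token.to_string(),
            Session { id, client: Some(client.to_string()) },
        );
        (id, false)
    }

    /// Token removal: unbind the client but keep the session running.
    pub fn detach(&mut self, token: &str) -> bool {
        match self.by_token.get_mut(token) {
            Some(session) => {
                session.client = None;
                true
            }
            None => false,
        }
    }

    /// Client endpoint currently bound to the token's session.
    pub fn client_of(&self, token: &str) -> Option<&str> {
        self.by_token.get(token)?.client.as_deref()
    }
}
```

Because reconnection is only a map lookup plus a rebind, the latency budget is dominated by the network handshake and the first full frame rather than by session lookup.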
### 3.5 Multi-Server Support
- Server discovery protocol (mDNS or custom)
- Session registry: which sessions live on which servers
- Cross-server session redirect
- Load balancing for new session placement
## Phase 4: Audio & Peripherals (Weeks 10-13)
### 4.1 Audio Forwarding
- Trait-based audio backend: PipeWire (Linux), PulseAudio (illumos/Linux fallback)
- PipeWire integration on server side for audio capture (Linux)
- Opus encoding for low-latency audio streaming
- Audio stream over dedicated QUIC stream
- Playback synchronization with display frames
- Microphone input forwarding (bidirectional audio)
### 4.2 USB Device Forwarding
- Userspace USB forwarding protocol over QUIC (not kernel USB/IP, for illumos portability)
- Evaluate usbredir as the wire format, or design a custom one
- Device hotplug detection on client (udev on Linux, sysevent on illumos)
- Device attach/detach over QUIC channels
- Security: device class filtering (allow/deny policies)
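
Device class filtering could be as simple as a rule list with a default verdict. The `UsbPolicy` shape is an assumption; the class codes in the usage note (0x03 HID, 0x08 Mass Storage) are the standard USB-IF values. Many devices declare their class per interface rather than per device, so a real filter would likely need to inspect interface descriptors as well.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Allow,
    Deny,
}

/// One rule: a USB class code and what to do with it.
pub struct ClassRule {
    pub class: u8,
    pub verdict: Verdict,
}

/// Allow/deny policy evaluated before a device is attached.
pub struct UsbPolicy {
    pub rules: Vec<ClassRule>,
    /// Verdict for classes no rule mentions.
    pub default: Verdict,
}

impl UsbPolicy {
    /// First matching rule wins; otherwise fall back to the default.
    pub fn check(&self, class: u8) -> Verdict {
        self.rules
            .iter()
            .find(|rule| rule.class == class)
            .map(|rule| rule.verdict)
            .unwrap_or(self.default)
    }
}
```

A deny-by-default policy that allows only HID (0x03) would then reject a mass storage device (0x08) without needing an explicit rule for it.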
### 4.3 Clipboard Synchronization
- Intercept wl_data_device on server
- Forward clipboard content types and data over control channel
- Handle large clipboard entries (images) efficiently
- Security: optional clipboard direction restrictions
## Phase 5: Production Hardening (Weeks 14-17)
### 5.1 Platform-Specific Backends
- Linux: DRM/KMS backend for running `wayray-server` on hardware (optional, feature-gated)
- Linux: Multi-GPU support via MultiRenderer
- Linux: Session management via logind/libseat
- illumos: Custom input backend for `/dev/kbd` + `/dev/mouse` (local console use)
- illumos: Zones integration for session isolation
- CI: Test matrix for both Linux and illumos
### 5.2 XWayland Support
- Integrate Smithay's XWayland module for X11 application compatibility
- Handle X11 clipboard integration
### 5.3 Performance Optimization
- Adaptive bitrate based on network conditions
- Hardware encoding path (VAAPI, NVENC)
- Zero-copy frame capture via DMA-BUF export to encoder
- Client-side frame interpolation for network jitter compensation
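
Adaptive bitrate can be approximated with an additive-increase, multiplicative-decrease loop. This is a sketch of one common control strategy, not the planned algorithm; the 2% loss threshold, the 250 kbit/s step, and the `BitrateController` type are all illustrative assumptions.

```rust
/// Encoder bitrate target adjusted from periodic network reports.
pub struct BitrateController {
    pub current_kbps: u32,
    pub min_kbps: u32,
    pub max_kbps: u32,
}

impl BitrateController {
    /// Assumed loss threshold above which the link counts as congested.
    const LOSS_THRESHOLD: f32 = 0.02;
    /// Assumed additive step when the link looks healthy.
    const STEP_KBPS: u32 = 250;

    /// Feed one report (fraction of packets lost, 0.0..=1.0) and
    /// return the new target bitrate.
    pub fn on_report(&mut self, loss_fraction: f32) -> u32 {
        if loss_fraction > Self::LOSS_THRESHOLD {
            // Multiplicative decrease: back off quickly under congestion.
            self.current_kbps = (self.current_kbps / 2).max(self.min_kbps);
        } else {
            // Additive increase: probe for more bandwidth slowly.
            self.current_kbps = self
                .current_kbps
                .saturating_add(Self::STEP_KBPS)
                .min(self.max_kbps);
        }
        self.current_kbps
    }
}
```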
### 5.4 Security
- TLS 1.3 for all QUIC connections (mandatory)
- Certificate-based mutual authentication
- Session encryption at rest
- Audit logging for session lifecycle events
- AppArmor/seccomp profiles for server process
## Phase 6: Management & Operations (Weeks 16-20)
### 6.1 Administration
- CLI tool: `wayray-ctl` for server/session management
- REST API for external integration
- Session monitoring: active sessions, resource usage, network stats
### 6.2 Multi-Tenancy
- User session isolation (namespaces, cgroups)
- Resource quotas (CPU, memory, GPU per session)
- Fair scheduling across sessions
### 6.3 High Availability
- Server failover groups
- Session state replication for seamless failover
- Health checking and automatic server removal
## Milestones
| Milestone | Phase | Description |
|-----------|-------|-------------|
| M0 | 0 | Wayland clients render in local compositor |
| M1 | 1 | Remote viewer sees compositor output over network |
| M2 | 2 | Interactive remote session (display + input) |
| M2.5 | 2.5 | External WM can control window layout via protocol |
| M3 | 3 | Session persists across client disconnects, hot-desking works |
| M4 | 4 | Audio and USB forwarding functional |
| M5 | 5 | Production-ready with platform backends, XWayland, illumos CI |
| M6 | 6 | Multi-server deployment with HA and management tools |