Chapter 20: User I/O Subsystems

TTY/PTY, input (evdev), audio (ALSA), display/graphics (DRM/KMS)

20.1 TTY and PTY Subsystem

20.1.1 The Problem

Linux's TTY layer is a historical artifact designed for 300-baud hardware teletypes. It features monolithic locks (tty_mutex, termios_rwsem), synchronous line discipline processing (handling backspace and signals in the critical path), and a complex buffer management system that scales poorly to thousands of concurrent terminal sessions.

In modern systems, the TTY layer is primarily used for Pseudo-Terminals (PTYs) — the backends for SSH sessions, terminal emulators (GNOME Terminal, Alacritty), and container multiplexers (Docker, Kubernetes). The Linux PTY implementation requires every byte of terminal output to traverse the kernel data path, acquiring locks and waking sleeping processes, making it a significant bottleneck for high-density container logging and high-throughput terminal applications.

20.1.2 UmkaOS's Lock-Free Ring Architecture

UmkaOS completely rearchitects the TTY/PTY subsystem around lock-free, single-producer/single-consumer (SPSC) ring buffers, modeled on the KABI ring buffers used for storage and networking (Section 10.6) but with a more compact header.

The PTY Data Path: A PTY consists of a master side (/dev/ptmx, held by SSHd or Docker) and a slave side (/dev/pts/N, held by the shell or containerized application).

In UmkaOS, a PTY pair shares a pair of mapped memory pages (8 KB total) containing two SPSC ring buffers (master-to-slave and slave-to-master). Each ring buffer occupies one 4 KB page, providing adequate buffer space for interactive terminal sessions and container logging.

/// PTY ring buffer header. 16 bytes, designed for minimal overhead.
///
/// This is a simplified SPSC ring buffer format (not the full DomainRingBuffer
/// from Section 10.7.2, which has 128 bytes of header for MPSC/broadcast support).
/// PTYs are always single-producer/single-consumer, so the compact header suffices.
///
/// Layout:
///   - bytes [0..8]:   head index (write position, AtomicU64)
///   - bytes [8..16]:  tail index (read position, AtomicU64)
///   - bytes [16..4095]: data buffer (4079 bytes usable; 1 byte sentinel for ring-full detection)
#[repr(C)]
pub struct PtyRingHeader {
    /// Write position (producer advances). Counts bytes written.
    /// Applied modulo data capacity (4079) to get buffer offset.
    pub head: AtomicU64,
    /// Read position (consumer advances). Counts bytes read.
    /// Applied modulo data capacity (4079) to get buffer offset.
    pub tail: AtomicU64,
}

/// PTY ring buffer page. 4 KB total, 4079 bytes usable data.
/// Aligned to page boundary for direct mmap() into userspace.
///
/// The producer writes at (head % 4079 + 16), advancing head.
/// The consumer reads at (tail % 4079 + 16), advancing tail.
///
/// **Full/empty detection**: Since head and tail are monotonically increasing
/// u64 counters (never reset), full/empty is detected by simple subtraction:
/// - Empty when `head == tail`
/// - Full when `head - tail >= 4079` (one byte sacrificed for disambiguation)
/// - Available for write: `4079 - (head - tail)`
/// - Available for read: `head - tail`
/// Buffer offsets are derived by `head % 4079 + 16` and `tail % 4079 + 16`.
/// The u64 counters will not wrap in practice (2^64 bytes ≈ 18 exabytes).
#[repr(C, align(4096))]
pub struct PtyRingPage {
    /// Ring buffer header (16 bytes).
    pub header: PtyRingHeader,
    /// 4079 bytes data; 1 byte sentinel for ring-full detection (standard ring buffer practice).
    pub data: [u8; 4079],
}

/// The reverse-direction ring (slave→master) is a separate page allocation.
/// Same layout as PtyRingPage. This design allows each direction to be
/// mapped independently if needed, and avoids the 8 KB allocation exceeding
/// the page granularity.
#[repr(C, align(4096))]
pub struct PtyRingPageReverse {
    /// Ring buffer header (16 bytes).
    pub header: PtyRingHeader,
    /// 4079 bytes data; 1 byte sentinel for ring-full detection (standard ring buffer practice).
    pub data: [u8; 4079],
}

/// Terminal state shared between master and slave.
/// Stored in a separate small allocation (not a full page) within a per-master
/// state arena. Multiple PTYs from the same master share a single arena,
/// amortizing the page allocation overhead.
///
/// Total size: 32 bytes (cache-line friendly when padded).
/// Layout: termios_flags(4) + winsize_seq(4) + winsize_data(8) + flow_control(1)
///         + zero_copy_enabled(1) + _pad(14) = 32.
#[repr(C, align(8))]
pub struct AtomicTtyState {
    /// Terminal flags (ICANON, ECHO, ISIG, etc.) as bit positions.
    /// Modified atomically via compare-and-swap.
    pub termios_flags: AtomicU32,
    /// Window size (rows, columns). Modified via seqlock protocol.
    /// Layout: [seq_counter: AtomicU32 (4 bytes), winsize: Winsize (8 bytes)]
    pub winsize_seq: AtomicU32,
    pub winsize_data: UnsafeCell<Winsize>,
    /// Flow control state (stopped/running).
    pub flow_control: AtomicBool,
    /// Zero-copy mode enabled flag. Set by mutual consent handshake.
    pub zero_copy_enabled: AtomicBool,
    /// Padding to 32 bytes for cache alignment.
    _pad: [u8; 14],
}

/// Window size structure (matches POSIX struct winsize from <sys/ioctl.h>).
/// Used by TIOCGWINSZ/TIOCSWINSZ ioctls.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Winsize {
    pub ws_row: u16,
    pub ws_col: u16,
    pub ws_xpixel: u16,
    pub ws_ypixel: u16,
}

Memory layout: A PTY pair consists of three shared memory regions:

1. Master→slave ring (1 page, 4 KB): Written by master, read by slave.
2. Slave→master ring (1 page, 4 KB): Written by slave, read by master.
3. State arena (shared across PTYs from same master): Contains multiple AtomicTtyState structs (32 bytes each). A 4 KB arena supports up to 128 PTYs.
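The head/tail arithmetic documented on PtyRingPage can be modeled in safe Rust. The following is a minimal sketch, not the kernel implementation: SpscByteRing, push, and pop are hypothetical names, and the data area is modeled as a Vec of AtomicU8 instead of a mapped page so that the example needs no unsafe code.

```rust
use std::sync::atomic::{AtomicU64, AtomicU8, Ordering};

/// Usable data bytes per ring page, as stated in the PtyRingPage comments.
pub const PTY_RING_DATA_SIZE: u64 = 4079;

/// Illustrative SPSC byte ring (hypothetical type, not a kernel structure).
pub struct SpscByteRing {
    head: AtomicU64, // total bytes ever written (producer-owned)
    tail: AtomicU64, // total bytes ever read (consumer-owned)
    data: Vec<AtomicU8>,
}

impl SpscByteRing {
    pub fn new() -> Self {
        Self {
            head: AtomicU64::new(0),
            tail: AtomicU64::new(0),
            data: (0..PTY_RING_DATA_SIZE).map(|_| AtomicU8::new(0)).collect(),
        }
    }

    /// Producer side. Returns the number of bytes accepted (may be short).
    pub fn push(&self, src: &[u8]) -> usize {
        let head = self.head.load(Ordering::Relaxed);
        let tail = self.tail.load(Ordering::Acquire);
        let free = PTY_RING_DATA_SIZE - (head - tail);
        let n = (src.len() as u64).min(free) as usize;
        for (i, &b) in src[..n].iter().enumerate() {
            let off = ((head + i as u64) % PTY_RING_DATA_SIZE) as usize;
            self.data[off].store(b, Ordering::Relaxed);
        }
        // Release publishes the bytes written above to the consumer.
        self.head.store(head + n as u64, Ordering::Release);
        n
    }

    /// Consumer side. Returns the number of bytes copied into `dst`.
    pub fn pop(&self, dst: &mut [u8]) -> usize {
        let tail = self.tail.load(Ordering::Relaxed);
        let head = self.head.load(Ordering::Acquire);
        let n = (dst.len() as u64).min(head - tail) as usize;
        for (i, slot) in dst[..n].iter_mut().enumerate() {
            let off = ((tail + i as u64) % PTY_RING_DATA_SIZE) as usize;
            *slot = self.data[off].load(Ordering::Relaxed);
        }
        // Release hands the consumed space back to the producer.
        self.tail.store(tail + n as u64, Ordering::Release);
        n
    }
}
```

Because each index has exactly one writer, neither side needs a compare-and-swap; the Acquire/Release pairs on head and tail are the only synchronization in the sketch.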

Seqlock protocol for window size: Reads use the standard seqlock pattern:

1. Read winsize_seq. If odd, retry (writer in progress).
2. Read winsize_data.
3. Read winsize_seq again. If changed, retry.

Writes increment winsize_seq to odd, update winsize_data, then increment to even.

Writer serialization: Concurrent TIOCSWINSZ callers must acquire the TTY write mutex before entering the seqlock write section (incrementing winsize_seq to odd). Without this, two concurrent writers can interleave their begin/end increments, leaving winsize_seq in an odd (permanently-locked) state and corrupting winsize_data. The reader path (TIOCGWINSZ) requires no mutex — pure seqlock retry is sufficient.
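The read and write sides of this protocol can be sketched as follows. This is an illustration under stated assumptions: SeqWinsize is a hypothetical type, the writer mutex stands in for the TTY write mutex, and the 8-byte Winsize is packed into a single AtomicU64 so that the sketch stays in safe Rust (the struct described above uses UnsafeCell instead).

```rust
use std::sync::atomic::{fence, AtomicU32, AtomicU64, Ordering};
use std::sync::Mutex;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Winsize {
    pub ws_row: u16,
    pub ws_col: u16,
    pub ws_xpixel: u16,
    pub ws_ypixel: u16,
}

impl Winsize {
    fn pack(self) -> u64 {
        (self.ws_row as u64)
            | ((self.ws_col as u64) << 16)
            | ((self.ws_xpixel as u64) << 32)
            | ((self.ws_ypixel as u64) << 48)
    }

    fn unpack(v: u64) -> Self {
        Self {
            ws_row: v as u16,
            ws_col: (v >> 16) as u16,
            ws_xpixel: (v >> 32) as u16,
            ws_ypixel: (v >> 48) as u16,
        }
    }
}

/// Hypothetical seqlock wrapper around a window size.
pub struct SeqWinsize {
    seq: AtomicU32,
    data: AtomicU64,
    writers: Mutex<()>, // stands in for the TTY write mutex
}

impl SeqWinsize {
    pub fn new(ws: Winsize) -> Self {
        Self {
            seq: AtomicU32::new(0),
            data: AtomicU64::new(ws.pack()),
            writers: Mutex::new(()),
        }
    }

    /// Lock-free read: retry while a writer is active or the counter moved.
    pub fn read(&self) -> Winsize {
        loop {
            let s1 = self.seq.load(Ordering::Acquire);
            if s1 & 1 == 1 {
                std::hint::spin_loop();
                continue;
            }
            let raw = self.data.load(Ordering::Relaxed);
            fence(Ordering::Acquire);
            let s2 = self.seq.load(Ordering::Relaxed);
            if s1 == s2 {
                return Winsize::unpack(raw);
            }
        }
    }

    /// Serialized write: odd counter marks the update as in progress.
    pub fn write(&self, ws: Winsize) {
        let _guard = self.writers.lock().unwrap();
        let s = self.seq.load(Ordering::Relaxed);
        self.seq.store(s.wrapping_add(1), Ordering::Relaxed);
        fence(Ordering::Release);
        self.data.store(ws.pack(), Ordering::Relaxed);
        self.seq.store(s.wrapping_add(2), Ordering::Release);
    }
}
```

Taking the mutex before the first increment is what keeps two writers from interleaving their begin/end increments; readers never touch it.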

When the slave application calls write() to stdout, the UmkaOS syscall interface (umka-compat) writes the data directly into the slave_tx ring buffer. If the master application is polling via epoll() or io_uring, the kernel signals the eventfd associated with the ring.

Zero-Copy PTYs for Containers: For high-density container environments, UmkaOS supports a zero-copy PTY mode. If both the master and slave processes explicitly request it via an UmkaOS-specific ioctl(PTY_REQ_DIRECT), the kernel maps the PtyRingPage directly into the address spaces of both processes. The master and slave can then exchange terminal data entirely in userspace, bypassing the kernel data path completely. The kernel is only invoked to handle buffer full/empty wakeups (via futex). This allows a single node to stream gigabytes of container logs per second with near-zero CPU overhead.

Zero-copy mode restrictions:

- Raw mode only: Zero-copy mode requires the PTY to be in raw mode (ICANON flag clear in termios). The kernel's asynchronous TTY worker thread (Section 20.1.3) is bypassed, so no inline line discipline processing occurs. Applications receive raw bytes without backspace handling or line buffering. Signal generation is handled out-of-band via the control ring (see next bullet).
- Signal generation via control ring: Because the kernel data path is bypassed, inline byte-stream interception cannot detect control characters. Instead, zero-copy PTY uses a dedicated control ring for out-of-band signal delivery (see Signal Generation in Zero-Copy Mode below). POSIX semantics (Ctrl+C → SIGINT, Ctrl+\ → SIGQUIT, Ctrl+Z → SIGTSTP) are preserved.
- No echo processing: Local echo (ECHO flag) is disabled automatically when zero-copy mode is activated. The master must implement echo if required.
- Termios changes require renegotiation: If either side calls tcsetattr() to change terminal settings in a way that is incompatible with zero-copy mode (for example, re-enabling ICANON), the kernel automatically disables zero-copy mode and falls back to kernel-mediated mode. Changes limited to the IXON/IXOFF/IXANY bits are applied in place (see Termios change interaction below). To re-enable zero-copy mode after a fallback, both sides must repeat the consent handshake.

Security Model for Zero-Copy PTYs:

Trust boundary note: Zero-copy mode creates a shared-memory channel between master and slave. The master process can directly read all slave terminal output without kernel mediation. Zero-copy mode requires mutual trust between master and slave and is not suitable for security-isolation boundaries (e.g., between different security domains, privilege levels, or container trust zones).

Zero-copy PTY mode requires explicit security checks before enabling direct memory sharing:

Capability requirement: The master side (the process requesting zero-copy mode) must hold CAP_TTY_DIRECT (defined in Section 8.1.3). This capability grants permission to bypass the kernel's TTY data path security checks. Container runtimes (Docker, containerd) typically hold this capability; unprivileged processes do not.
Mutual consent: Both master and slave must explicitly agree to zero-copy mode. The PtyDirectParams structure passed to PTY_REQ_DIRECT is:

```rust
/// Parameters for PTY_REQ_DIRECT ioctl.
/// Layout: C-compatible, 64-byte fixed size (padding ensures ABI stability).
#[repr(C)]
pub struct PtyDirectParams {
    /// Random 64-bit nonce generated by the master. The slave must echo
    /// this value in its PTY_ACK_DIRECT ioctl to prove consent.
    /// The kernel verifies nonce equality. Generated via `getrandom(2)`.
    pub nonce: u64,

    /// Timeout for slave acknowledgement in milliseconds.
    /// If the slave does not call PTY_ACK_DIRECT within this window,
    /// PTY_REQ_DIRECT returns -ETIMEDOUT. Range: 100-30000 ms.
    /// Default (0): kernel uses 5000 ms.
    pub timeout_ms: u32,

    /// Requested ring buffer size for the shared data ring (bytes).
    /// Must be a power of two in [4096, 4194304] (4 KB to 4 MB).
    /// Default (0): kernel uses 65536 bytes (64 KB, matching pipe default).
    pub ring_size_bytes: u32,

    /// Flags. Currently reserved, must be 0.
    pub flags: u64,

    /// On success, filled by the kernel with the file descriptor for
    /// the shared ring mmap. The caller maps this fd to access the ring.
    /// Negative value on failure.
    pub ring_fd: i32,

    /// Padding to 64 bytes for ABI stability.
    /// The fields above occupy 8 + 4 + 4 + 8 + 4 = 28 bytes, so 36 bytes
    /// of padding are needed to reach the documented 64-byte size.
    pub _pad: [u8; 36],
}
```

Error codes for PTY_REQ_DIRECT:

- -EPERM: caller lacks CAP_TTY_DIRECT
- -EINVAL: timeout_ms or ring_size_bytes out of range, or flags != 0
- -ETIMEDOUT: slave did not acknowledge within timeout_ms
- -EBUSY: zero-copy mode already active on this PTY
- -ENOMEM: ring buffer allocation failed

Error codes for PTY_ACK_DIRECT:

- -ENOENT: no pending PTY_REQ_DIRECT request on this slave fd
- -EINVAL: nonce mismatch (wrong value supplied)
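The parameter ranges and error codes above can be expressed as a pair of small checks. This is a sketch under assumptions: DirectRequest, validate_req, and check_ack are hypothetical names, the order in which the checks run is a guess, and the errno constants use the usual Linux numeric values.

```rust
pub const EPERM: i32 = 1;
pub const ENOENT: i32 = 2;
pub const EBUSY: i32 = 16;
pub const EINVAL: i32 = 22;

/// Subset of PtyDirectParams that needs validation (hypothetical helper type).
pub struct DirectRequest {
    pub timeout_ms: u32,
    pub ring_size_bytes: u32,
    pub flags: u64,
}

/// Returns the effective (timeout_ms, ring_size_bytes) after applying
/// defaults, or a negative errno. Check order is an assumption.
pub fn validate_req(
    req: &DirectRequest,
    has_cap_tty_direct: bool,
    already_active: bool,
) -> Result<(u32, u32), i32> {
    if !has_cap_tty_direct {
        return Err(-EPERM);
    }
    if req.flags != 0 {
        return Err(-EINVAL);
    }
    let timeout = match req.timeout_ms {
        0 => 5000, // documented default
        t @ 100..=30_000 => t,
        _ => return Err(-EINVAL),
    };
    let size = match req.ring_size_bytes {
        0 => 65_536, // documented default
        s if s.is_power_of_two() && (4096..=4_194_304).contains(&s) => s,
        _ => return Err(-EINVAL),
    };
    if already_active {
        return Err(-EBUSY);
    }
    Ok((timeout, size))
}

/// Slave-side acknowledgement: the supplied nonce must match the pending one.
pub fn check_ack(pending_nonce: Option<u64>, supplied: u64) -> Result<(), i32> {
    match pending_nonce {
        None => Err(-ENOENT),
        Some(n) if n != supplied => Err(-EINVAL),
        Some(_) => Ok(()),
    }
}
```

-ETIMEDOUT and -ENOMEM are omitted because they depend on waiting and allocation, which the sketch does not model.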

- Master requests via ioctl(fd, PTY_REQ_DIRECT, &params), where params includes a nonce.
- Slave acknowledges via ioctl(slave_fd, PTY_ACK_DIRECT, nonce) within the timeout window.
- If the slave never acknowledges, the request fails with -ETIMEDOUT.
- This prevents a malicious master from forcing zero-copy mode on an unsuspecting slave.
Same mount namespace constraint: Both processes must share the same mount namespace as the PTY owner. The check is:

```rust
// Zero-copy PTY access requires same mount namespace as the PTY owner.
// Mount namespace is stable for a process's lifetime (cannot be changed
// after unshare(CLONE_NEWNS)), unlike cgroup membership which can be
// changed after the zero-copy channel is established (TOCTOU bypass).
if current_task().mnt_ns_id == pty.owner_mnt_ns_id {
    enable_zero_copy_for_pair(master_fd, slave_fd)
} else {
    Err(Error::PermissionDenied)
}
```

Mount namespace is the correct isolation boundary for PTY zero-copy: it scopes correctly to a container boundary and cannot be changed from outside the process. Using cgroup membership would be vulnerable to cgroup migration attacks: a process can be moved between cgroups by any holder of CAP_SYS_ADMIN, creating a TOCTOU bypass where the check passes but the process is subsequently migrated out of the container's cgroup scope before the zero-copy channel is used. Mount namespace membership, by contrast, cannot be changed by any external actor; only the process itself can switch namespaces after the initial unshare(CLONE_NEWNS), and only if it holds CAP_SYS_ADMIN.

The PtyPair struct stores owner_mnt_ns_id: MntNsId (not owner_cgroup) for this check. The MntNsId is recorded when the PTY master fd is opened (at posix_openpt() time) and never updated. Processes without CAP_SYS_ADMIN cannot change their mount namespace after creation.

Memory isolation guarantee: The PtyRingPage is mapped with PROT_READ | PROT_WRITE into both processes, but the kernel retains a back-reference to the physical pages. If either process exits or execs a binary with elevated capability grants (Section 8.1.6), the kernel immediately revokes the direct mapping and falls back to standard ring-buffer mode. This prevents privilege escalation via persistent shared memory.
Audit logging: Successful zero-copy mode activation generates an audit event (Section 19.2.9) with both PIDs and the PTY device identifier, enabling post-incident forensics.

The fallback path (when zero-copy is not requested or denied) uses the standard kernel-mediated ring buffer with full security checks on every data transfer.

Signal Generation in Zero-Copy Mode

Problem: In standard PTY mode, the kernel's line discipline (N_TTY) reads every byte written to the PTY master, detects control characters (INTR=0x03 → SIGINT, QUIT=0x1C → SIGQUIT, SUSP=0x1A → SIGTSTP, EOF=0x04), and delivers signals to the foreground process group. In zero-copy mode (PTY_REQ_DIRECT), the terminal emulator writes directly to the shared ring buffer without a kernel read path — so the kernel cannot intercept control characters inline.

Solution — Sentinel ring for control characters:

Zero-copy PTY uses a dual-ring design:

- Data ring (shared mmap, zero-copy): carries printable characters. The terminal emulator writes here at full speed.
- Control ring (small kernel-visible ring, 64 entries): carries out-of-band events. The terminal emulator writes here when it detects a control character.

/// Out-of-band control event sent from the terminal emulator to the kernel
/// via the control ring. Each variant corresponds to a POSIX signal or
/// terminal state change that the kernel must process.
#[repr(C, u8)]
pub enum PtyControlEvent {
    /// Terminal emulator detected INTR character (default: Ctrl+C = 0x03).
    /// Kernel delivers SIGINT to foreground process group.
    SignalIntr = 1,
    /// Terminal emulator detected QUIT character (default: Ctrl+\ = 0x1C).
    /// Kernel delivers SIGQUIT to foreground process group.
    SignalQuit = 2,
    /// Terminal emulator detected SUSP character (default: Ctrl+Z = 0x1A).
    /// Kernel delivers SIGTSTP to foreground process group.
    SignalSusp = 3,
    /// Terminal window resized. Kernel delivers SIGWINCH and updates winsize.
    WindowResize { cols: u16, rows: u16, xpixel: u16, ypixel: u16 } = 4,
    /// Terminal emulator detected EOF (default: Ctrl+D = 0x04).
    /// Kernel sets hangup condition on PTY slave.
    Eof = 5,
    /// Flush the data ring up to this byte offset (for atomic command delivery).
    FlushTo { offset: u64 } = 6,
}

Control ring layout:

/// Written to the control ring page (mapped read-write by terminal emulator).
/// The control ring occupies a single 4 KB page, separate from the data ring
/// pages. The terminal emulator writes events; the kernel drains them.
#[repr(C)]
pub struct PtyControlRing {
    /// Write index (terminal emulator advances).
    pub write_idx: AtomicU32,
    /// Padding to separate from kernel's read_idx (avoid false sharing).
    _pad: [u8; 60],
    /// Read index (kernel advances).
    pub read_idx: AtomicU32,
    /// Ring entries.
    pub entries: [PtyControlEvent; 64],
}

Terminal emulator protocol: When the terminal emulator detects a control character in the input stream (from the physical keyboard), it:

1. Writes the control character's PtyControlEvent to the control ring at index write_idx % 64.
2. Increments write_idx with Release ordering.
3. Triggers the kernel via write(ctl_fd, &SIG_NOTIFY, 1): a 1-byte write to a dedicated control file descriptor that does not carry data and only wakes the kernel.

The kernel, on receiving the ctl_fd write:

1. Drains the control ring (reads from read_idx to write_idx).
2. For each SignalIntr/SignalQuit/SignalSusp: calls kill_pgrp(slave_pgrp, sig, 1) to deliver the signal to the foreground process group.
3. For WindowResize: updates PtyState.winsize and delivers SIGWINCH to the foreground process group.
4. For Eof: sets the hangup condition on the PTY slave, waking any blocked readers with zero-length reads.
5. Advances read_idx with Release ordering.
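The post/drain handshake in the two lists above can be modeled with a pair of indices and Acquire/Release ordering. The sketch below is illustrative only: CtlRing, CtlSignal, post, and drain are hypothetical names, only the three signal events are modeled, and each slot is stored as a raw byte that the draining side validates before use, since the ring lives in a user-writable page.

```rust
use std::sync::atomic::{AtomicU32, AtomicU8, Ordering};

const CTL_RING_ENTRIES: u32 = 64;

/// Signal-only subset of the control events (hypothetical simplification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum CtlSignal {
    Intr = 1,
    Quit = 2,
    Susp = 3,
}

impl CtlSignal {
    /// Untrusted input: reject any byte that is not a known discriminant.
    fn from_raw(raw: u8) -> Option<Self> {
        match raw {
            1 => Some(Self::Intr),
            2 => Some(Self::Quit),
            3 => Some(Self::Susp),
            _ => None,
        }
    }
}

pub struct CtlRing {
    write_idx: AtomicU32,
    read_idx: AtomicU32,
    slots: [AtomicU8; CTL_RING_ENTRIES as usize],
}

impl CtlRing {
    pub fn new() -> Self {
        Self {
            write_idx: AtomicU32::new(0),
            read_idx: AtomicU32::new(0),
            slots: std::array::from_fn(|_| AtomicU8::new(0)),
        }
    }

    /// Terminal emulator side: returns false if the ring is full.
    pub fn post(&self, sig: CtlSignal) -> bool {
        let w = self.write_idx.load(Ordering::Relaxed);
        let r = self.read_idx.load(Ordering::Acquire);
        if w.wrapping_sub(r) >= CTL_RING_ENTRIES {
            return false;
        }
        self.slots[(w % CTL_RING_ENTRIES) as usize].store(sig as u8, Ordering::Relaxed);
        self.write_idx.store(w.wrapping_add(1), Ordering::Release);
        true
    }

    /// Kernel side: consume everything between read_idx and write_idx.
    pub fn drain(&self) -> Vec<CtlSignal> {
        let mut r = self.read_idx.load(Ordering::Relaxed);
        let w = self.write_idx.load(Ordering::Acquire);
        let mut out = Vec::new();
        while r != w {
            let raw = self.slots[(r % CTL_RING_ENTRIES) as usize].load(Ordering::Relaxed);
            if let Some(sig) = CtlSignal::from_raw(raw) {
                out.push(sig); // invalid bytes are skipped, not trusted
            }
            r = r.wrapping_add(1);
        }
        self.read_idx.store(r, Ordering::Release);
        out
    }
}
```

The 32-bit indices may wrap safely here because 2^32 is a multiple of 64, so the slot position stays continuous across the wrap.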

Security: The control ring is in a user-mapped page. A malicious terminal emulator could spam SIGINT events, but: (1) signals can only be delivered to processes in the session the PTY controls — cross-session delivery is impossible; (2) rate limiting: at most 64 control events per ctl_fd write (ring size); (3) the mapping is per-PTY, allocated only in zero-copy mode. A compromised terminal emulator already has full control over the PTY master side (it can close the fd, inject arbitrary bytes, resize the window), so the control ring does not expand the attack surface.

SIGINT rate limiting: PTY SIGINT rate limiting is applied per PTY slave device, not per master FD. Multiple master FDs opened to the same slave share one token bucket. This prevents a misbehaving terminal emulator from bypassing the rate limit by opening N master FDs (each with its own bucket) and interleaving SIGINT injections across them.

- Token bucket: capacity = 100, refill_rate = 1000 tokens/second.
- Each injected SIGINT, SIGQUIT, SIGTSTP, or SIGHUP consumes 1 token.
- When the bucket is empty, excess signals are dropped silently.
The terminal emulator is expected to coalesce input events; the rate limit prevents a malicious or buggy terminal emulator from flooding the foreground process group. 1000 signals/second is far above any legitimate interactive use.

Rate limiting state is stored in PtySlaveState (not in per-fd structures):

pub struct PtySlaveState {
    // ... existing fields ...
    /// Shared SIGINT rate limiter for this slave device.
    /// All master FDs to this slave share this bucket.
    /// Capacity: 100 signals; refill rate: 1000 signals/second.
    pub signal_token_bucket: TokenBucket,
}

When any master FD injects a signal to this slave, one token is deducted from PtySlaveState::signal_token_bucket. If the bucket is empty, the injection is rate-limited and the signal is dropped silently.

Token bucket lifetime: same as PTY slave device lifetime — NOT tied to any specific master FD's lifetime.
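A token bucket with the stated parameters (capacity 100, refill 1000 tokens/second) might look like the sketch below. The field names and the try_consume method are assumptions, since the source does not define the TokenBucket type; time is passed in explicitly as nanoseconds so the behavior is deterministic.

```rust
/// Hypothetical token bucket; the real TokenBucket layout is not specified.
pub struct TokenBucket {
    capacity: u64,
    ns_per_token: u64,
    tokens: u64,
    last_ns: u64,
}

impl TokenBucket {
    /// Starts full. `refill_per_sec` must be in 1..=1_000_000_000.
    pub fn new(capacity: u64, refill_per_sec: u64, now_ns: u64) -> Self {
        assert!(refill_per_sec > 0 && refill_per_sec <= 1_000_000_000);
        Self {
            capacity,
            ns_per_token: 1_000_000_000 / refill_per_sec,
            tokens: capacity,
            last_ns: now_ns,
        }
    }

    fn refill(&mut self, now_ns: u64) {
        let elapsed = now_ns.saturating_sub(self.last_ns);
        let earned = elapsed / self.ns_per_token;
        if earned > 0 {
            self.tokens = self.tokens.saturating_add(earned).min(self.capacity);
            // Advance by whole tokens only, so fractional time is not lost.
            self.last_ns += earned * self.ns_per_token;
        }
    }

    /// Returns true if the signal may be delivered, false if it is dropped.
    pub fn try_consume(&mut self, now_ns: u64) -> bool {
        self.refill(now_ns);
        if self.tokens > 0 {
            self.tokens -= 1;
            true
        } else {
            false
        }
    }
}
```

With these parameters a burst of up to 100 signals is accepted at once, after which one further signal is admitted per millisecond.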

Compatibility: Applications using standard read()/write() on the PTY master continue to work unchanged — signal generation is handled by the kernel's line discipline (Section 20.1.3). The control ring is only allocated when zero-copy mode is activated via PTY_REQ_DIRECT ioctl. Falling back from zero-copy mode (due to tcsetattr() or process exit) automatically returns to kernel-mediated signal generation.

XON/XOFF Flow Control in Zero-Copy Mode

Problem: Classical TTY processes XON/XOFF software flow control by scanning each byte as it passes through the line discipline — when XOFF (Ctrl-S, 0x13) is seen, output is paused; when XON (Ctrl-Q, 0x11) is seen, output resumes. This is fundamentally incompatible with zero-copy: you cannot scan a buffer you are not copying. The solution is a two-layer architecture that preserves the zero-copy property for bulk data while enforcing POSIX flow control semantics.

Layer 1 — Data path (zero-copy): In zero-copy mode, the slave writes directly to the master's ring buffer without scanning for XON/XOFF characters. This preserves the zero-copy property for bulk data (container logs, remote shell output, etc.).

Layer 2 — Control path (XON/XOFF scanning): XON/XOFF scanning is performed only when IXON or IXOFF is set in termios.c_iflag. The scan happens at the ring buffer consumer (master read) side — bytes are examined as the master application reads them, not as the slave writes them. The flow control state is communicated back to the slave writer via an atomic flag in AtomicTtyState.

/// Flow control state for one side of a PTY (master or slave).
///
/// Manages both XON/XOFF (software flow control, IXON/IXOFF termios flags) and
/// the hardware-signal equivalents (simulated via PTY ioctls TIOCMGET/TIOCMSET).
///
/// XON character: `termios.c_cc[VSTART]` (default Ctrl-Q = 0x11).
/// XOFF character: `termios.c_cc[VSTOP]`  (default Ctrl-S = 0x13).
///
/// All fields are atomic so the master consumer and slave writer can read/write
/// without holding a lock on the hot data path.
pub struct PtyFlowControlState {
    /// True if this side is currently in XOFF state (transmission suspended).
    /// Set when the read buffer crosses `rx_high_watermark`; cleared when it
    /// drops below `rx_low_watermark`. The slave write path checks this before
    /// writing to the ring.
    pub tx_stopped: AtomicBool,

    /// True if the remote side (peer) is in XOFF state (we must stop sending).
    /// Set when we receive an XOFF character or when the peer's `tx_stopped` is true.
    pub peer_stopped: AtomicBool,

    /// Number of bytes currently in the receive buffer for this side.
    pub rx_bytes: AtomicU32,

    /// High watermark: when `rx_bytes` exceeds this, send XOFF to the peer.
    /// Default: 3/4 of the receive buffer capacity.
    pub rx_high_watermark: u32,

    /// Low watermark: when `rx_bytes` drops below this (after XOFF was sent),
    /// send XON to the peer to resume transmission.
    /// Default: 1/4 of the receive buffer capacity.
    pub rx_low_watermark: u32,

    /// Total receive buffer capacity in bytes. Set at PTY creation.
    pub rx_capacity: u32,

    /// Simulated modem control signals (subset of TIOCM_* flags).
    /// Bit 0 (TIOCM_RTS): Request To Send — set when we are ready to receive.
    /// Bit 1 (TIOCM_CTS): Clear To Send — set when the peer is ready to send.
    /// Bit 2 (TIOCM_DTR): Data Terminal Ready.
    /// Bit 3 (TIOCM_DSR): Data Set Ready.
    /// Bit 4 (TIOCM_CAR/CD): Carrier Detect — simulated as always-set for PTY.
    pub modem_signals: AtomicU8,

    /// Number of XON characters sent to the peer (telemetry).
    pub xon_sent: AtomicU32,

    /// Number of XOFF characters sent to the peer (telemetry).
    pub xoff_sent: AtomicU32,

    /// If true, software flow control (XON/XOFF) is enabled for this side.
    /// Matches the IXON/IXOFF termios flags.
    pub sw_flow_enabled: AtomicBool,

    /// Whether `IXON` is currently active (derived from `termios.c_iflag`).
    /// When false, the master consumer skips XON/XOFF scanning entirely.
    pub ixon_enabled: AtomicBool,

    /// Whether `IXOFF` is currently active.
    /// When true, the kernel sends XOFF/XON to the slave based on ring fill level.
    pub ixoff_enabled: AtomicBool,

    /// Whether `IXANY` is set: any character from master resumes output.
    pub ixany_enabled: AtomicBool,

    /// Tracks whether an XOFF has been injected into the slave for IXOFF
    /// threshold enforcement. True from the moment XOFF is injected until XON
    /// is injected (when the ring drains below `rx_low_watermark`).
    pub ixoff_sent: AtomicBool,

    /// XOFF character value (default 0x13 = Ctrl-S). From termios.c_cc[VSTOP].
    pub xoff_char: u8,

    /// XON character value (default 0x11 = Ctrl-Q). From termios.c_cc[VSTART].
    pub xon_char: u8,

    pub _pad: [u8; 2],
}

impl PtyFlowControlState {
    /// Default watermarks: high = 3/4 capacity, low = 1/4 capacity.
    pub fn new(capacity: u32) -> Self {
        Self {
            tx_stopped: AtomicBool::new(false),
            peer_stopped: AtomicBool::new(false),
            rx_bytes: AtomicU32::new(0),
            rx_high_watermark: capacity * 3 / 4,
            rx_low_watermark: capacity / 4,
            rx_capacity: capacity,
            modem_signals: AtomicU8::new(0b00010001), // RTS + CD always set
            xon_sent: AtomicU32::new(0),
            xoff_sent: AtomicU32::new(0),
            sw_flow_enabled: AtomicBool::new(false),
            ixon_enabled: AtomicBool::new(false),
            ixoff_enabled: AtomicBool::new(false),
            ixany_enabled: AtomicBool::new(false),
            ixoff_sent: AtomicBool::new(false),
            xoff_char: 0x13, // Ctrl-S
            xon_char: 0x11,  // Ctrl-Q
            _pad: [0; 2],
        }
    }

    /// Called when `rx_bytes` increases. Returns true if XOFF should be sent to peer.
    pub fn on_rx(&self, added: u32) -> bool {
        let new = self.rx_bytes.fetch_add(added, Ordering::Relaxed) + added;
        if self.sw_flow_enabled.load(Ordering::Relaxed)
            && new > self.rx_high_watermark
            && !self.tx_stopped.swap(true, Ordering::Release)
        {
            self.xoff_sent.fetch_add(1, Ordering::Relaxed);
            return true; // caller should inject XOFF into the peer's write path
        }
        false
    }

    /// Called when `rx_bytes` decreases. Returns true if XON should be sent to peer.
    pub fn on_tx(&self, consumed: u32) -> bool {
        // Use a single atomic read-modify-write: a separate load + store
        // pair could overwrite a concurrent `fetch_add` from `on_rx`.
        let prev = self
            .rx_bytes
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| {
                Some(v.saturating_sub(consumed))
            })
            .unwrap_or_else(|v| v); // closure always returns Some
        let new = prev.saturating_sub(consumed);
        if self.sw_flow_enabled.load(Ordering::Relaxed)
            && new < self.rx_low_watermark
            && self.tx_stopped.swap(false, Ordering::Release)
        {
            self.xon_sent.fetch_add(1, Ordering::Relaxed);
            return true; // caller should inject XON into the peer's write path
        }
        false
    }
}

Slave write path (when ixon_enabled is set):

fn pty_slave_write(ring: &PtyRingPage, flow: &PtyFlowControlState, data: &[u8]):
  1. if flow.tx_stopped.load(Acquire):
       // Block until master sends XON (or zero-copy mode is exited).
       wait_event(&flow.write_waitq, !flow.tx_stopped.load(Relaxed))
  2. Write `data` to ring buffer (zero-copy; no character scanning).
  3. Signal master via eventfd (data available).

The slave never scans bytes — it only checks the tx_stopped flag before each write(). The wait is on a standard wait queue; wakeup is delivered by the master consumer path when XON is detected.

Master read path (consumer side, when ixon_enabled):

fn pty_master_read(ring: &PtyRingPage, flow: &PtyFlowControlState, buf: &mut [u8]):
  let xon  = flow.xon_char;   // plain u8, set by tcsetattr
  let xoff = flow.xoff_char;  // plain u8, set by tcsetattr
  let ixany = flow.ixany_enabled.load(Relaxed);

  for each byte `b` consumed from the ring:
    if b == xoff && flow.ixon_enabled.load(Relaxed):
      flow.tx_stopped.store(true, Release)
      // Wake slave to re-check stopped state on next write attempt.
      wake_up(&flow.write_waitq)
      // XON/XOFF bytes are NOT delivered to the master application (POSIX).
      continue
    elif b == xon && flow.ixon_enabled.load(Relaxed) && !ixany:
      flow.tx_stopped.store(false, Release)
      wake_up(&flow.write_waitq)  // Unblock paused slave writers.
      continue
    elif flow.tx_stopped.load(Relaxed) && ixany:
      // IXANY: any character from master resumes paused output.
      flow.tx_stopped.store(false, Release)
      wake_up(&flow.write_waitq)
      // The character itself IS delivered to master (unlike plain XON).
      buf.push(b)
    else:
      buf.push(b)

POSIX character-stripping rules:

- When IXON is set and IXANY is not set: XON (VSTART) and XOFF (VSTOP) bytes are consumed by the flow control layer and not delivered to the master application. This matches POSIX termios(3) semantics.
- When IXANY is set: any character received from the master resumes paused output; only VSTOP pauses. The character that resumed output IS passed to the master application (it is not a dedicated control byte in this mode).
- When IXOFF is set: the kernel automatically injects XOFF (VSTOP) into the slave's input stream when the slave-to-master ring reaches 75% capacity, and injects XON (VSTART) when the ring drains below 25% capacity. This back-pressures the slave from the kernel side without application involvement.
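The master read pseudocode and the stripping rules above can be condensed into a pure function for experimentation. This is a sketch, not the kernel path: FlowConfig and filter_master_read are hypothetical names, the atomics are replaced by plain values, and wakeups are omitted.

```rust
/// Snapshot of the termios-derived settings used by the scan (hypothetical).
pub struct FlowConfig {
    pub ixon: bool,
    pub ixany: bool,
    pub xon_char: u8,
    pub xoff_char: u8,
}

/// Scans consumed bytes in the same branch order as the pseudocode.
/// Returns the bytes delivered to the master and updates `tx_stopped`.
pub fn filter_master_read(input: &[u8], cfg: &FlowConfig, tx_stopped: &mut bool) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    for &b in input {
        if cfg.ixon && b == cfg.xoff_char {
            // XOFF: pause the slave writer; the byte is stripped.
            *tx_stopped = true;
        } else if cfg.ixon && !cfg.ixany && b == cfg.xon_char {
            // XON: resume the slave writer; the byte is stripped.
            *tx_stopped = false;
        } else if cfg.ixany && *tx_stopped {
            // IXANY: any byte resumes output and is still delivered.
            *tx_stopped = false;
            out.push(b);
        } else {
            out.push(b);
        }
    }
    out
}
```

For example, with IXON set and IXANY clear, the input `a, 0x13, b, 0x11, c` is delivered as `abc` and leaves the writer running.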

IXOFF kernel-side injection:

fn pty_check_ixoff_thresholds(ring: &PtyRingPage, flow: &PtyFlowControlState):
  let used = ring.header.head.load(Relaxed) - ring.header.tail.load(Relaxed);
  let capacity = PTY_RING_DATA_SIZE as u64;  // 4079 bytes
  if flow.ixoff_enabled.load(Relaxed):
    if used >= (capacity * 3 / 4) && !flow.ixoff_sent.load(Relaxed):
      inject_byte_to_slave(ring, flow.xoff_char)  // plain u8
      flow.ixoff_sent.store(true, Release)
    elif used <= (capacity / 4) && flow.ixoff_sent.load(Relaxed):
      inject_byte_to_slave(ring, flow.xon_char)   // plain u8
      flow.ixoff_sent.store(false, Release)

This check runs on the master consumer path after each read batch; it does not require a background timer or dedicated thread.
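The threshold hysteresis in the pseudocode above can be written as a small pure function. The names Inject and ixoff_action are hypothetical, and the actual byte injection is left to the caller; with a capacity of 4079 bytes the integer thresholds work out to 3059 (high) and 1019 (low).

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Inject {
    Xoff,
    Xon,
}

/// Decides whether an XOFF or XON byte should be injected into the slave's
/// input, given the current ring fill level. `ixoff_sent` carries the
/// hysteresis state between calls.
pub fn ixoff_action(
    used: u64,
    capacity: u64,
    ixoff_enabled: bool,
    ixoff_sent: &mut bool,
) -> Option<Inject> {
    if !ixoff_enabled {
        return None;
    }
    if used >= capacity * 3 / 4 && !*ixoff_sent {
        *ixoff_sent = true;
        Some(Inject::Xoff)
    } else if used <= capacity / 4 && *ixoff_sent {
        *ixoff_sent = false;
        Some(Inject::Xon)
    } else {
        None
    }
}
```

Between the two thresholds nothing is injected, which prevents XOFF/XON flapping when the fill level hovers near a single limit.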

Termios change interaction: XON/XOFF mode is part of termios.c_iflag. When tcsetattr() is called while zero-copy mode is active:

- If only IXON/IXOFF/IXANY bits change, zero-copy mode remains active. AtomicTtyState is updated in place; the consumer and producer paths pick up the new values on their next iteration.
- If ICANON is re-enabled or any flag incompatible with zero-copy is set, zero-copy mode falls back to kernel-mediated mode (as documented in the zero-copy restrictions above). The tx_stopped flag is cleared during the transition to prevent the slave from blocking indefinitely after the fallback.

OPOST interaction: Zero-copy mode is active only when OPOST is clear in termios.c_oflag. When OPOST is enabled, output processing (newline translation ONLCR, tab expansion, etc.) is required on each byte — this is fundamentally incompatible with zero-copy. Setting OPOST forces the copy path for output processing; zero-copy mode is automatically suspended until OPOST is cleared again.

Overhead: The XON/XOFF consumer-side check adds approximately 2 ns per byte on x86-64 (one atomic byte load per byte consumed, branch predicted not-taken for bulk data where flow control is inactive). For bulk container logging, where IXON is typically not set, there is zero overhead (the ixon_enabled atomic check short-circuits the entire scanning path). For interactive terminals where XON/XOFF flow control is active, the per-byte overhead is acceptable and consistent with the terminal's interactive (non-bulk) nature.

20.1.3 Asynchronous Line Disciplines (N_TTY)

The line discipline (N_TTY) translates raw characters into canonical input (handling backspace, line buffering) and generates signals (translating Ctrl+C into SIGINT).

In Linux, this processing happens synchronously during the write() or read() syscall, while holding per-TTY locks such as termios_rwsem.

In UmkaOS, line discipline processing is asynchronous and decoupled from the data path:

1. When the user types Ctrl+C, the raw byte (0x03) is placed into the master_tx ring buffer.
2. The kernel's asynchronous TTY worker thread (running in UmkaOS Core) consumes the raw ring, processes the line discipline rules based on the termios state, and pushes the processed output to the canonical ring (or generates the SIGINT signal to the foreground process group).
3. The foreground application reads from the canonical ring.

Because the TTY worker thread is the sole consumer of the raw ring and the sole producer of the canonical ring, it operates entirely lock-free.
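The transformation the worker applies between the raw ring and the canonical ring can be illustrated with a toy line discipline. This sketch assumes ICANON and ISIG are set with VINTR = 0x03 and VERASE = 0x7F, and that INTR discards the pending line; LdiscOutput and process_raw are hypothetical names, and echo, other control characters, and the rings themselves are omitted.

```rust
/// What the worker emits after consuming raw bytes (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum LdiscOutput {
    /// A completed line, pushed to the canonical ring.
    Line(Vec<u8>),
    /// A signal to deliver to the foreground process group.
    Signal(&'static str),
}

/// Consumes a batch of raw bytes. `pending` holds the partially edited
/// line and persists across batches.
pub fn process_raw(raw: &[u8], pending: &mut Vec<u8>) -> Vec<LdiscOutput> {
    let mut out = Vec::new();
    for &b in raw {
        match b {
            0x03 => {
                // INTR: drop the partial line and raise SIGINT.
                pending.clear();
                out.push(LdiscOutput::Signal("SIGINT"));
            }
            0x7f => {
                // ERASE: remove the last buffered byte, if any.
                pending.pop();
            }
            b'\n' => {
                // End of line: hand the whole line to the reader.
                pending.push(b'\n');
                out.push(LdiscOutput::Line(std::mem::take(pending)));
            }
            _ => pending.push(b),
        }
    }
    out
}
```

Since a single worker owns `pending` and is the only consumer of the raw bytes, no locking is needed around the edit buffer in this model.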

Async TTY Worker Thread Configuration:

Count: One worker thread per physical CPU socket (NUMA node), not per CPU. Named tty_worker/{socket_id}. Rationale: TTY throughput is not CPU-intensive (character processing + application wakeup); socket-scoped workers provide NUMA locality without per-CPU overhead.
Priority: SCHED_OTHER (normal timesharing) at nice -5. This gives TTY processing a small priority boost over typical user tasks (nice 0) without impacting RT workloads. Interactive terminal responsiveness is maintained because terminal input wakeup latency is dominated by the nice-level scheduling latency (~0.5–2ms), not TTY processing time.
Starvation handling: The worker thread checks tty_queue.len() at each wakeup. If the queue has grown to >80% of its capacity (TTY_ASYNC_QUEUE_SIZE = 4096 entries), the worker temporarily raises its scheduling priority to SCHED_OTHER nice=-15 until the queue drains below 50%. This prevents input drop under heavy load without permanently occupying a high-priority slot.
Queue overflow: If the async queue reaches 100% capacity (4096 unprocessed TTY events), new input is dropped and tty_drop_count is incremented. A warning is logged to the kernel ring buffer and exposed via umkafs at /System/Kernel/tty/drop_count. Drop recovery: the worker thread processes the queue as fast as possible, then resets drop_count to 0 when the queue clears.
Shutdown: The worker thread is a kthread; it joins cleanly via kthread_stop() during system shutdown after all TTY devices have been closed.
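The starvation rule is a two-threshold hysteresis. The sketch below shows one way to express it, using the thresholds and nice values quoted above; `WorkerPriority` and `on_wakeup` are hypothetical names, and the returned value would be handed to whatever scheduler call the worker uses.

```rust
pub const TTY_ASYNC_QUEUE_SIZE: usize = 4096;
pub const NICE_NORMAL: i8 = -5;
pub const NICE_BOOSTED: i8 = -15;

/// Hypothetical per-worker priority state.
pub struct WorkerPriority {
    boosted: bool,
}

impl WorkerPriority {
    pub fn new() -> Self {
        Self { boosted: false }
    }

    /// Called at each wakeup with the current queue depth.
    /// Returns the nice level the worker should run at.
    pub fn on_wakeup(&mut self, queue_len: usize) -> i8 {
        let high = TTY_ASYNC_QUEUE_SIZE * 80 / 100; // 3276 entries
        let low = TTY_ASYNC_QUEUE_SIZE * 50 / 100; // 2048 entries
        if queue_len > high {
            self.boosted = true;
        } else if queue_len < low {
            self.boosted = false;
        }
        // Between the two thresholds the previous state is kept.
        if self.boosted { NICE_BOOSTED } else { NICE_NORMAL }
    }
}
```

The gap between the two thresholds keeps the worker from flapping between priorities when the queue hovers near 80%.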

20.1.4 Serial TTY — Full POSIX termios and Modem Control

This section answers: "can minicom run on UmkaOS?"

The POSIX termios interface controls the serial line discipline: character size, baud rate, parity, flow control, canonical vs raw mode, and modem control signals. It applies to both serial UART ports (/dev/ttyS0, /dev/ttyUSB0) and to PTYs (via the PTY slave). §20.1.1-3 cover PTY; this section covers the serial-specific parts needed for programs like minicom, picocom, and screen.

20.1.4.1 struct termios

The full POSIX struct termios as exposed to userspace (Linux asm-generic/termbits.h layout, required for binary compat):

/// POSIX struct termios — character device terminal settings.
/// Layout matches Linux's `struct termios2` for TCGETS2/TCSETS2 ioctls.
/// The kernel-internal representation is `KernelTermios`; this is the
/// userspace-visible layout placed at ioctl argument pointers.
#[repr(C)]
pub struct Termios {
    /// Input mode flags.
    pub c_iflag: u32,
    /// Output mode flags.
    pub c_oflag: u32,
    /// Control mode flags.
    pub c_cflag: u32,
    /// Local mode flags.
    pub c_lflag: u32,
    /// Line discipline index (N_TTY = 0).
    pub c_line:  u8,
    /// Special character array (NCCS = 19 for Linux/POSIX).
    pub c_cc:    [u8; 19],
    /// Input baud rate (encoded as Bxxx constant OR an actual numeric rate
    /// when using TCSETS2/BOTHER — see §20.1.4.2).
    pub c_ispeed: u32,
    /// Output baud rate.
    pub c_ospeed: u32,
}

// c_iflag bits
pub const IGNBRK:  u32 = 0o000001; // Ignore BREAK condition
pub const BRKINT:  u32 = 0o000002; // BREAK → SIGINT to foreground process group
pub const IGNPAR:  u32 = 0o000004; // Ignore framing and parity errors
pub const PARMRK:  u32 = 0o000010; // Mark parity and framing errors with 0xFF 0x00
pub const INPCK:   u32 = 0o000020; // Enable input parity checking
pub const ISTRIP:  u32 = 0o000040; // Strip 8th bit from input characters
pub const INLCR:   u32 = 0o000100; // Translate NL to CR on input
pub const IGNCR:   u32 = 0o000200; // Ignore CR on input
pub const ICRNL:   u32 = 0o000400; // Translate CR to NL on input (unless IGNCR)
pub const IUCLC:   u32 = 0o001000; // Map uppercase to lowercase (obsolete, not POSIX)
pub const IXON:    u32 = 0o002000; // Enable XON/XOFF flow control on output
pub const IXANY:   u32 = 0o004000; // Any character restarts output stopped by XOFF
pub const IXOFF:   u32 = 0o010000; // Enable XON/XOFF flow control on input
pub const IMAXBEL: u32 = 0o020000; // Ring bell when input queue is full
pub const IUTF8:   u32 = 0o040000; // Input is UTF-8; affects erase in canonical mode

// c_oflag bits
pub const OPOST:   u32 = 0o000001; // Enable output processing
pub const OLCUC:   u32 = 0o000002; // Map lowercase to uppercase (obsolete)
pub const ONLCR:   u32 = 0o000004; // Map NL to CR-NL on output
pub const OCRNL:   u32 = 0o000010; // Map CR to NL on output
pub const ONOCR:   u32 = 0o000020; // No CR output at column 0
pub const ONLRET:  u32 = 0o000040; // NL performs CR function
pub const OFILL:   u32 = 0o000100; // Use fill characters for delay
pub const OFDEL:   u32 = 0o000200; // Fill char is DEL (otherwise NUL)

// c_cflag bits
pub const CBAUD:   u32 = 0o010017; // Baud rate mask (use BOTHER for non-standard rates)
pub const BOTHER:  u32 = 0o010000; // Non-standard baud rate (rate in c_ispeed/c_ospeed)
pub const CS5:     u32 = 0o000000; // 5-bit characters
pub const CS6:     u32 = 0o000020; // 6-bit characters
pub const CS7:     u32 = 0o000040; // 7-bit characters
pub const CS8:     u32 = 0o000060; // 8-bit characters
pub const CSIZE:   u32 = 0o000060; // Character size mask
pub const CSTOPB:  u32 = 0o000100; // 2 stop bits (1 if not set)
pub const CREAD:   u32 = 0o000200; // Enable receiver
pub const PARENB:  u32 = 0o000400; // Enable parity generation on output and checking on input
pub const PARODD:  u32 = 0o001000; // Odd parity (even if not set)
pub const HUPCL:   u32 = 0o002000; // Hang up on last close (de-assert DTR/RTS)
pub const CLOCAL:  u32 = 0o004000; // Ignore modem status lines
pub const CRTSCTS: u32 = 0o020000000000; // Enable RTS/CTS hardware flow control

// c_lflag bits
pub const ISIG:    u32 = 0o000001; // Generate signal when INTR/QUIT/SUSP received
pub const ICANON:  u32 = 0o000002; // Canonical mode (line-by-line)
pub const XCASE:   u32 = 0o000004; // Fold uppercase (obsolete)
pub const ECHO:    u32 = 0o000010; // Echo input characters
pub const ECHOE:   u32 = 0o000020; // ERASE erases preceding character
pub const ECHOK:   u32 = 0o000040; // KILL erases current line
pub const ECHONL:  u32 = 0o000100; // Echo NL even if ECHO is not set
pub const NOFLSH:  u32 = 0o000200; // No flush on INTR, QUIT, or SUSP
pub const TOSTOP:  u32 = 0o000400; // Send SIGTTOU for background write attempts
pub const ECHOCTL: u32 = 0o001000; // Echo control chars as ^X
pub const ECHOPRT: u32 = 0o002000; // Echo erased chars (hardcopy terminal style)
pub const ECHOKE:  u32 = 0o004000; // KILL erases by echoing spaces
pub const FLUSHO:  u32 = 0o010000; // Output is being flushed
pub const PENDIN:  u32 = 0o040000; // Re-print pending input at next read/newline
pub const IEXTEN:  u32 = 0o100000; // Enable implementation-defined input processing

// c_cc indices (NCCS = 19)
pub const VINTR:    usize = 0;  // Interrupt (default ^C = 0x03)
pub const VQUIT:    usize = 1;  // Quit (default ^\ = 0x1C)
pub const VERASE:   usize = 2;  // Erase (default ^H/DEL)
pub const VKILL:    usize = 3;  // Kill line (default ^U)
pub const VEOF:     usize = 4;  // End-of-file (canonical, default ^D)
pub const VTIME:    usize = 5;  // Timeout for non-canonical read (tenths of second)
pub const VMIN:     usize = 6;  // Min chars for non-canonical read
pub const VSWTC:    usize = 7;  // Switch (not POSIX; 0 in Linux)
pub const VSTART:   usize = 8;  // Resume output (XON, default ^Q)
pub const VSTOP:    usize = 9;  // Pause output (XOFF, default ^S)
pub const VSUSP:    usize = 10; // Suspend (default ^Z)
pub const VEOL:     usize = 11; // Additional end-of-line (canonical)
pub const VREPRINT: usize = 12; // Reprint pending input (default ^R)
pub const VDISCARD: usize = 13; // Toggle discard output (default ^O)
pub const VWERASE:  usize = 14; // Word erase (default ^W)
pub const VLNEXT:   usize = 15; // Literal next (default ^V)
pub const VEOL2:    usize = 16; // Second end-of-line (default NUL = disabled)
// indices 17, 18 are padding (unused)

20.1.4.2 Baud Rate Setting

Standard baud rates are encoded as Bxxx constants in c_cflag & CBAUD. Non-standard rates use BOTHER + numeric value in c_ispeed/c_ospeed, via the TCSETS2/TCGETS2 ioctls (Linux 2.6.20+, struct termios2):

/// Standard baud rate constants (in c_cflag bits 0-3 plus the extension bit 12, masked by CBAUD).
pub const B0:      u32 = 0o000000; // Hang up (de-assert DTR)
pub const B50:     u32 = 0o000001;
pub const B75:     u32 = 0o000002;
pub const B110:    u32 = 0o000003;
pub const B134:    u32 = 0o000004;
pub const B150:    u32 = 0o000005;
pub const B200:    u32 = 0o000006;
pub const B300:    u32 = 0o000007;
pub const B600:    u32 = 0o000010;
pub const B1200:   u32 = 0o000011;
pub const B1800:   u32 = 0o000012;
pub const B2400:   u32 = 0o000013;
pub const B4800:   u32 = 0o000014;
pub const B9600:   u32 = 0o000015;
pub const B19200:  u32 = 0o000016;
pub const B38400:  u32 = 0o000017;
pub const B57600:  u32 = 0o010001;
pub const B115200: u32 = 0o010002;
pub const B230400: u32 = 0o010003;
pub const B460800: u32 = 0o010004;
pub const B500000: u32 = 0o010005;
pub const B576000: u32 = 0o010006;
pub const B921600: u32 = 0o010007;
pub const B1000000:u32 = 0o010010;
pub const B1152000:u32 = 0o010011;
pub const B1500000:u32 = 0o010012;
pub const B2000000:u32 = 0o010013;
pub const B2500000:u32 = 0o010014;
pub const B3000000:u32 = 0o010015;
pub const B3500000:u32 = 0o010016;
pub const B4000000:u32 = 0o010017;

UmkaOS ioctls for terminal settings:
- TCGETS (0x5401): get struct termios (legacy layout: 19 c_cc entries, no c_ispeed/c_ospeed)
- TCSETS (0x5402): set immediately
- TCSETSW (0x5403): set after drain (wait for output to flush)
- TCSETSF (0x5404): set after flush (drain output + flush input)
- TCGETS2 (0x802C542A): get struct termios2 (19 c_cc, supports BOTHER)
- TCSETS2 (0x402C542B): set via termios2 (supports non-standard baud)
- TCSETSW2 / TCSETSF2: drain/flush variants of TCSETS2
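As an illustration of the two encodings, the sketch below picks a Bxxx code when one exists and falls back to BOTHER otherwise. `set_baud` is a hypothetical helper and handles only two standard rates; a real table would cover every Bxxx constant listed above.

```rust
pub const CBAUD: u32 = 0o010017;
pub const BOTHER: u32 = 0o010000;
pub const B9600: u32 = 0o000015;
pub const B115200: u32 = 0o010002;

/// Returns the updated c_cflag and the value to store in c_ospeed.
/// Only the CBAUD bits of `c_cflag` are modified; framing and flow-control
/// bits are preserved.
pub fn set_baud(c_cflag: u32, rate: u32) -> (u32, u32) {
    let code = match rate {
        9600 => B9600,
        115_200 => B115200,
        // Illustrative subset: any other rate is treated as non-standard.
        _ => BOTHER,
    };
    ((c_cflag & !CBAUD) | code, rate)
}
```

A caller would pass the result to TCSETS2; with BOTHER set, the driver reads the numeric rate from c_ospeed rather than decoding the CBAUD bits.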

20.1.4.3 Modem Control Lines

/// Modem control line bits (TIOCMGET/TIOCMSET/TIOCMBIS/TIOCMBIC).
pub const TIOCM_LE:  u32 = 0x001; // Line Enable (DSR in LE role)
pub const TIOCM_DTR: u32 = 0x002; // Data Terminal Ready (output)
pub const TIOCM_RTS: u32 = 0x004; // Request To Send (output)
pub const TIOCM_ST:  u32 = 0x008; // Secondary Transmit (rare)
pub const TIOCM_SR:  u32 = 0x010; // Secondary Receive (rare)
pub const TIOCM_CTS: u32 = 0x020; // Clear To Send (input)
pub const TIOCM_CAR: u32 = 0x040; // Carrier Detect (input, alias DCD)
pub const TIOCM_RNG: u32 = 0x080; // Ring Indicator (input)
pub const TIOCM_DSR: u32 = 0x100; // Data Set Ready (input)
pub const TIOCM_CD:  u32 = TIOCM_CAR;
pub const TIOCM_RI:  u32 = TIOCM_RNG;
pub const TIOCM_OUT1:u32 = 0x2000;
pub const TIOCM_OUT2:u32 = 0x4000;
pub const TIOCM_LOOP:u32 = 0x8000;

/// Modem control ioctls.
/// TIOCMGET: read current modem line state → *argp = u32 bitmask
/// TIOCMSET: set modem lines → *argp = u32 bitmask (replaces all writable bits)
/// TIOCMBIS: set individual bits → *argp = u32 bitmask (OR into current)
/// TIOCMBIC: clear individual bits → *argp = u32 bitmask (AND NOT into current)
pub const TIOCMGET:  u32 = 0x5415;
pub const TIOCMSET:  u32 = 0x5418;
pub const TIOCMBIS:  u32 = 0x5416;
pub const TIOCMBIC:  u32 = 0x5417;

/// TIOCMIWAIT: wait for modem line state change.
/// *argp = bitmask of lines to wait on (TIOCM_CAR|TIOCM_DSR|TIOCM_RI|TIOCM_CTS).
/// Blocks until any of the specified lines changes. Returns 0 on change, EINTR on signal.
pub const TIOCMIWAIT: u32 = 0x545C;

/// TIOCGICOUNT: get modem line interrupt counters (cumulative transition counts; callers compare successive reads).
/// *argp = struct serial_icounter_struct { cts, dsr, rng, dcd, rx, tx, frame, overrun, parity, brk, ... }
pub const TIOCGICOUNT: u32 = 0x545D;
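The difference between TIOCMSET, TIOCMBIS, and TIOCMBIC is easiest to see as bit arithmetic. The sketch below is a simplified model: `MctrlOp` and `apply_mctrl` are hypothetical names, and treating only DTR and RTS as host-writable is an assumption made for brevity.

```rust
pub const TIOCM_DTR: u32 = 0x002;
pub const TIOCM_RTS: u32 = 0x004;
pub const TIOCM_CTS: u32 = 0x020;
pub const TIOCM_CAR: u32 = 0x040;

/// Assumption: only DTR and RTS are writable by the host in this sketch.
pub const WRITABLE: u32 = TIOCM_DTR | TIOCM_RTS;

pub enum MctrlOp {
    Set(u32), // TIOCMSET: replace all writable bits
    Bis(u32), // TIOCMBIS: OR into current
    Bic(u32), // TIOCMBIC: AND NOT into current
}

/// Compute the new modem line state. Input lines (CTS, CAR, ...) come from
/// hardware and are never changed by these ioctls.
pub fn apply_mctrl(current: u32, op: MctrlOp) -> u32 {
    let outputs = current & WRITABLE;
    let inputs = current & !WRITABLE;
    let new_outputs = match op {
        MctrlOp::Set(mask) => mask & WRITABLE,
        MctrlOp::Bis(mask) => outputs | (mask & WRITABLE),
        MctrlOp::Bic(mask) => outputs & !mask,
    };
    inputs | new_outputs
}
```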

20.1.4.4 Serial-Specific ioctls

/// TIOCEXCL: put tty into exclusive mode.
/// Subsequent open() calls on the device fail with EBUSY.
/// Required by minicom for exclusive serial port access.
pub const TIOCEXCL:  u32 = 0x540C;
/// TIOCNXCL: clear exclusive mode.
pub const TIOCNXCL:  u32 = 0x540D;
/// TIOCGEXCL: check if in exclusive mode (Linux 3.8+). *argp = int (1 = exclusive).
pub const TIOCGEXCL: u32 = 0x80045440;

/// TIOCGSERIAL: get serial port info (struct serial_struct, Linux ABI compat).
pub const TIOCGSERIAL: u32 = 0x541E;
/// TIOCSSERIAL: set serial port info.
pub const TIOCSSERIAL: u32 = 0x541F;

/// struct serial_struct (Linux ABI — must match exactly for compat).
/// minicom uses TIOCGSERIAL to detect and set ASYNC_LOW_LATENCY.
#[repr(C)]
pub struct SerialStruct {
    pub type_:         i32,   // PORT_16550A etc.
    pub line:          i32,   // tty line number
    pub port:          u32,   // I/O port address
    pub irq:           i32,
    pub flags:         i32,   // ASYNC_LOW_LATENCY = 0x2000, ASYNC_SKIP_TEST = 0x0040
    pub xmit_fifo_size:i32,
    pub custom_divisor:i32,
    pub baud_base:     i32,   // base baud rate (usually 115200 or clock/16)
    pub close_delay:   u16,   // delay before fully closed (jiffies/100)
    pub io_type:       u8,
    pub reserved_char: [u8; 1],
    pub hub6:          i32,
    pub closing_wait:  u16,   // delay before close (jiffies/100; ASYNC_CLOSING_WAIT_NONE=0xFFFF)
    pub closing_wait2: u16,
    pub iomem_base:    u64,   // MMIO base (pointer, 64-bit)
    pub iomem_reg_shift: u16,
    pub port_high:     u32,
    pub iomap_base:    u64,
}
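Because the layout must match the Linux ABI byte for byte, a compile-time size check is a cheap guard against accidental drift. The sketch below repeats the struct so that it is self-contained; the constant name is hypothetical, and the expected size of 72 bytes applies to LP64 targets such as x86-64 and AArch64.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct SerialStruct {
    pub type_: i32,
    pub line: i32,
    pub port: u32,
    pub irq: i32,
    pub flags: i32,
    pub xmit_fifo_size: i32,
    pub custom_divisor: i32,
    pub baud_base: i32,
    pub close_delay: u16,
    pub io_type: u8,
    pub reserved_char: [u8; 1],
    pub hub6: i32,
    pub closing_wait: u16,
    pub closing_wait2: u16,
    pub iomem_base: u64, // preceded by 4 bytes of padding on LP64
    pub iomem_reg_shift: u16,
    pub port_high: u32,
    pub iomap_base: u64,
}

/// Expected size on LP64 targets.
pub const SERIAL_STRUCT_SIZE_LP64: usize = 72;

// Compile-time guard: the build fails if a field is added, removed, or resized.
#[cfg(target_pointer_width = "64")]
const _: () = assert!(size_of::<SerialStruct>() == SERIAL_STRUCT_SIZE_LP64);
```

A 32-bit compat path would need a separate struct, since the pointer-sized fields shrink there.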

20.1.4.5 Line Discipline Switching

/// TIOCSETD: set line discipline. *argp = int (discipline number).
pub const TIOCSETD: u32 = 0x5423;
/// TIOCGETD: get current line discipline. *argp = int.
pub const TIOCGETD: u32 = 0x5424;

/// Registered line disciplines.
pub const N_TTY:   i32 = 0;  // Default: terminal line discipline
pub const N_SLIP:  i32 = 1;  // SLIP (Serial Line Internet Protocol)
pub const N_MOUSE: i32 = 2;  // Mouse driver (obsolete)
pub const N_PPP:   i32 = 3;  // PPP (Point-to-Point Protocol) — used by pppd
pub const N_STRIP: i32 = 4;  // STRIP (Starmode Radio IP, Metricom); obsolete
pub const N_AX25:  i32 = 5;  // AX.25 packet radio — unused on UmkaOS
pub const N_X25:   i32 = 6;  // X.25 async — unused on UmkaOS
pub const N_6PACK: i32 = 7;  // 6PACK packet radio — unused
pub const N_MASC:  i32 = 8;  // Reserved
pub const N_R3964: i32 = 9;  // Simatic R3964
pub const N_PROFIBUS_FDL: i32 = 10; // Profibus — industrial
pub const N_IRDA:  i32 = 11; // IrDA — legacy
pub const N_SMSBLOCK: i32 = 12; // SMS block protocol
pub const N_HDLC:  i32 = 13; // HDLC sync — used by isdn/WAN drivers
pub const N_SYNC_PPP: i32 = 14; // Sync PPP
pub const N_HCI:   i32 = 15; // Bluetooth HCI via UART (H4 protocol)

/// LineDisciplineOps trait — implemented by each line discipline.
pub trait LineDisciplineOps: Send + Sync {
    /// Called when characters arrive from the driver.
    fn receive_buf(&self, tty: &TtyPort, buf: &[u8], flags: &[u8]);
    /// Called when the application reads from the tty.
    fn read(&self, tty: &TtyPort, buf: &mut [u8]) -> Result<usize, KernelError>;
    /// Called when the application writes to the tty.
    fn write(&self, tty: &TtyPort, buf: &[u8]) -> Result<usize, KernelError>;
    /// Handle ioctl (discipline-specific, e.g., PPPIOCGUNIT for N_PPP).
    fn ioctl(&self, tty: &TtyPort, cmd: u32, arg: usize) -> Result<i32, KernelError>;
    /// Called when line discipline is opened.
    fn open(&self, tty: &TtyPort) -> Result<(), KernelError>;
    /// Called when line discipline is closed.
    fn close(&self, tty: &TtyPort);
}

Note — TIOCSETD behavior (D24): Line disciplines are not stacked. TIOCSETD replaces the current discipline with a new one; a TTY has exactly one active line discipline at any time (no STREAMS-style stacking, matching Linux behavior).

ioctl(fd, TIOCSETD, &ldisc_id): calls the old discipline's close(), then the new discipline's open(). Returns EINVAL if ldisc_id >= N_LDISC_MAX (30) or the discipline is not registered.

ioctl(fd, TIOCGETD, &ldisc_id): returns the ID of the current line discipline.

N_TTY (ID 0) is always registered and is the fallback if a custom discipline's open() fails.

N_LDISC_MAX = 30: system-wide limit on the number of distinct registered discipline types (not on simultaneous TTY instances), matching Linux.
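The switching rules above can be modelled in a few lines. This is a sketch under stated assumptions: `Tty`, `SetdError`, and `set_ldisc` are hypothetical, the `open_ok` parameter stands in for the result of the new discipline's open(), and the error returned when open() fails is not specified in this section.

```rust
pub const N_TTY: i32 = 0;
pub const N_LDISC_MAX: i32 = 30;

#[derive(Debug, PartialEq)]
pub enum SetdError {
    /// Out-of-range or unregistered discipline number.
    Einval,
    /// The new discipline's open() failed; the TTY fell back to N_TTY.
    OpenFailed,
}

pub struct Tty {
    /// The single active discipline (no stacking).
    pub ldisc: i32,
    /// registered[i] is true when discipline i has been registered.
    pub registered: [bool; N_LDISC_MAX as usize],
}

impl Tty {
    pub fn set_ldisc(&mut self, id: i32, open_ok: bool) -> Result<(), SetdError> {
        if id < 0 || id >= N_LDISC_MAX || !self.registered[id as usize] {
            return Err(SetdError::Einval); // current discipline is untouched
        }
        // The old discipline's close() would run here.
        if open_ok {
            self.ldisc = id;
            Ok(())
        } else {
            self.ldisc = N_TTY; // N_TTY is always registered
            Err(SetdError::OpenFailed)
        }
    }
}
```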

20.1.4.6 SerialTtyOps KABI

Hardware serial UART drivers implement SerialTtyOps:

/// KABI vtable for a serial UART driver.
/// Transport: T1 (ring buffer + MPK domain switch).
#[repr(C)]
pub struct SerialTtyOps {
    pub vtable_size: usize,
    /// Apply new termios settings to hardware (baud rate, framing, flow control).
    pub set_termios: unsafe extern "C" fn(
        ctx:     *mut c_void,
        new:     *const Termios,
        old:     *const Termios,
    ),
    /// Get current modem control line state (returns TIOCM_* bitmask).
    pub get_mctrl: unsafe extern "C" fn(ctx: *mut c_void) -> u32,
    /// Set modem control output lines (DTR, RTS).
    pub set_mctrl: unsafe extern "C" fn(ctx: *mut c_void, mctrl: u32),
    /// Send a BREAK condition for `duration_ms` milliseconds.
    pub send_break: unsafe extern "C" fn(ctx: *mut c_void, duration_ms: u32),
    /// Start transmitting (driver was stopped by throttle/stop_tx, now resume).
    pub start_tx: unsafe extern "C" fn(ctx: *mut c_void),
    /// Stop transmitting (XOFF received or output buffer full).
    pub stop_tx: unsafe extern "C" fn(ctx: *mut c_void),
    /// Enable/disable receiver (CREAD flag).
    pub set_rx_enabled: unsafe extern "C" fn(ctx: *mut c_void, enabled: bool),
    /// Wait for modem line changes (blocking; interruptible).
    pub wait_mctrl_change: unsafe extern "C" fn(
        ctx:          *mut c_void,
        wait_mask:    u32,
        timeout_ms:   u32,
    ) -> u32,
    /// Get serial port static info (for TIOCGSERIAL).
    pub get_serial: unsafe extern "C" fn(ctx: *mut c_void, out: *mut SerialStruct),
    /// Set serial port parameters (for TIOCSSERIAL).
    pub set_serial: unsafe extern "C" fn(ctx: *mut c_void, new: *const SerialStruct) -> i32,
}

20.1.4.7 Break Handling ioctls

/// TCSBRK: send a BREAK. If arg==0, send 0.25s break; if arg!=0, drain output.
pub const TCSBRK:  u32 = 0x5409;
/// TCSBRKP: send break of arg*0.1s (POSIX break).
pub const TCSBRKP: u32 = 0x5425;
/// TIOCSBRK: start sending BREAK (until TIOCCBRK or TCSBRK arg=0).
pub const TIOCSBRK: u32 = 0x5427;
/// TIOCCBRK: stop sending BREAK.
pub const TIOCCBRK: u32 = 0x5428;

20.1.4.8 minicom Compatibility

minicom requires the following kernel features to operate correctly:

Feature	UmkaOS mechanism
Open serial port exclusively	`TIOCEXCL` → sets `TtyPort::exclusive` flag
Set baud rate (e.g., 115200)	`TCSETS2` with `BOTHER` or `TCSETS` with `B115200`
Hardware flow control	`CRTSCTS` flag → `set_mctrl(TIOCM_RTS)` + hardware CTS monitoring
Software flow control	`IXON`/`IXOFF` handled in N_TTY line discipline
Raw mode (no echo, no canon)	`c_lflag &= ~(ICANON|ECHO|ECHOE|ISIG)`
Non-blocking I/O with timeout	`VMIN=0, VTIME=10` (1-second timeout per read)
Modem control (dial)	`TIOCMBIS(TIOCM_DTR|TIOCM_RTS)` to assert DTR/RTS
Wait for DCD (carrier detect)	`TIOCMIWAIT(TIOCM_CAR)`
TIOCGSERIAL (low-latency mode)	`TIOCSSERIAL` with `ASYNC_LOW_LATENCY` flag
Z-modem (HDLC linedisc)	`TIOCSETD(N_HDLC)` for HDLC-based protocols

All of these are implemented in UmkaOS. minicom, picocom, screen, and cu all work correctly.
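To make the raw-mode and timeout rows of the table concrete, the sketch below applies both to a trimmed termios. `TermiosLite` and `make_raw` are hypothetical; a real caller would operate on the full Termios from Section 20.1.4.1 and pass it to TCSETS or TCSETS2.

```rust
pub const ISIG: u32 = 0o000001;
pub const ICANON: u32 = 0o000002;
pub const ECHO: u32 = 0o000010;
pub const ECHOE: u32 = 0o000020;
pub const VTIME: usize = 5;
pub const VMIN: usize = 6;

/// Hypothetical trimmed view of Termios: only the fields raw mode touches.
pub struct TermiosLite {
    pub c_lflag: u32,
    pub c_cc: [u8; 19],
}

/// Disable canonical processing, echo, and signal generation, and configure
/// reads to return after at most one second even if no byte arrives.
pub fn make_raw(t: &mut TermiosLite) {
    t.c_lflag &= !(ICANON | ECHO | ECHOE | ISIG);
    t.c_cc[VMIN] = 0; // do not wait for a minimum byte count
    t.c_cc[VTIME] = 10; // tenths of a second
}
```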

20.2 Input Subsystem (evdev)

Linux's evdev interface (/dev/input/eventX) is the standard for delivering keyboard, mouse, touch, and joystick events to userspace (Wayland compositors, X11).

20.2.1 Tier 2 Input Drivers

In UmkaOS, modern input drivers (USB HID, Bluetooth HID, I2C touchscreens) run in Tier 2 (Ring 3, process-isolated) (Section 10.4). An input driver's only responsibility is to parse hardware-specific reports and translate them into standardized input_event structs.

The driver communicates with umka-core via a shared memory ring established during driver registration (umka_driver_register, Section 11.1.3).

/// Internal kernel input event representation.
/// Uses 64-bit time fields for y2038 safety across all architectures.
///
/// **32-bit compatibility**: The userspace-visible `struct input_event` exposed
/// via `/dev/input/eventX` uses Linux-compatible layout that varies by architecture:
/// - 64-bit platforms: time_sec (u64), time_usec (u64), type (u16), code (u16), value (i32) = 24 bytes
/// - 32-bit platforms: time_sec (u32), time_usec (u32), type (u16), code (u16), value (i32) = 16 bytes
///
/// The `umka-compat` layer translates from this internal format to the
/// architecture-specific Linux input_event layout when copying to userspace.
/// This translation is zero-cost on 64-bit platforms (direct copy) and
/// requires field truncation/conversion on 32-bit platforms.
///
/// **Y2038 on 32-bit**: The 32-bit compat path preserves the Linux ABI
/// (32-bit timestamps), which would wrap in 2038 if read as signed. Linux solved y2038 for input
/// events by redefining `struct input_event` timestamp fields as
/// `__kernel_ulong_t` (unsigned 32-bit) in v5.0 (commit 152194fe9c3f),
/// extending the wrap date to 2106. UmkaOS follows the same approach: the
/// 32-bit compat layer uses unsigned timestamp fields (u32 sec, u32 usec),
/// matching Linux v5.0+ ABI. No separate ioctl is needed for input events.
#[repr(C)]
pub struct InputEvent {
    /// Event timestamp in seconds since boot (CLOCK_MONOTONIC).
    /// 64-bit for y2038 safety. Truncated to u32 at `copy_to_user` time during `read(2)`
    /// on `/dev/input/eventX` for 32-bit processes (via the `umka-compat` read path);
    /// the truncation point is the kernel→userspace copy, not ioctl registration or
    /// ring-buffer insertion.
    pub time_sec: u64,
    /// Event timestamp microseconds component.
    /// 64-bit for consistency with time_sec. Truncated to u32 at `copy_to_user` on the
    /// 32-bit compat read path (same truncation point as time_sec).
    pub time_usec: u64,
    /// Event type (EV_KEY, EV_REL, EV_ABS, etc.).
    pub type_: u16,
    /// Event code (key code, relative axis, absolute axis, etc.).
    pub code: u16,
    /// Event value (key state, relative delta, absolute position, etc.).
    pub value: i32,
}
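The 32-bit translation described in the comments can be sketched as follows. `InputEventCompat32` and `to_compat32` are hypothetical names for the umka-compat side; the conversion is a plain truncating cast, so the seconds field wraps modulo 2^32.

```rust
#[repr(C)]
#[derive(Clone, Copy)]
pub struct InputEvent {
    pub time_sec: u64,
    pub time_usec: u64,
    pub type_: u16,
    pub code: u16,
    pub value: i32,
}

/// Hypothetical userspace-visible layout for 32-bit processes (16 bytes).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct InputEventCompat32 {
    pub time_sec: u32,
    pub time_usec: u32,
    pub type_: u16,
    pub code: u16,
    pub value: i32,
}

/// Applied at the kernel-to-userspace copy, not at ring insertion.
pub fn to_compat32(ev: &InputEvent) -> InputEventCompat32 {
    InputEventCompat32 {
        time_sec: ev.time_sec as u32, // truncates: wraps modulo 2^32
        time_usec: ev.time_usec as u32, // always < 1_000_000, so lossless
        type_: ev.type_,
        code: ev.code,
        value: ev.value,
    }
}
```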

When a user presses a key, the Tier 2 USB HID driver pushes an InputEvent into the shared ring and calls umka_driver_complete (Section 11.1.5). The UmkaOS Core's input multiplexer (umka-input) wakes up, reads the event, and copies it to all open file descriptors for the corresponding /dev/input/eventX node.

Because the input driver is a standard Tier 2 process, a crash in the complex USB HID parsing logic simply restarts the driver process (~10ms recovery) without dropping subsequent keystrokes.

20.2.2 Secure VT Switching and Panic Console

The Virtual Terminal (VT) subsystem provides the emergency text console and the mechanism for switching between graphical sessions (Ctrl+Alt+F1-F6).

In Linux, the VT subsystem is deeply entangled with the console driver, input layer, and DRM.

In UmkaOS, the VT subsystem is a minimal state machine inside umka-input:
1. Normal Operation: umka-input routes all input_event structs to the active Wayland compositor (the process holding the DRM master node).
2. VT Switch Detected: When umka-input detects a VT switch chord (e.g., Ctrl+Alt+F1), it immediately revokes the DRM master capability from the current compositor and pauses input event delivery to that process.
3. Panic Console Handoff: If the system panics, UmkaOS Core forcefully reclaims the display hardware from the Tier 1 DRM driver. It resets the display controller to a known-safe text mode (or simple framebuffer mode) using a minimal, statically linked Tier 0 VGA/EFI driver, and dumps the panic log. The complex Tier 1 DRM driver is completely bypassed during a panic to ensure the log is always visible, even if the GPU state machine is deadlocked.

Panic console handoff procedure:

On kernel panic, the console subsystem performs an orderly handoff to the Tier 0 emergency console before printing the panic message:

1. IRQs already disabled by the panic path before reaching this code.
2. Try Tier 1 console drivers: for each registered Tier 1 driver, attempt console->emergency_write(msg). Tier 1 drivers may have crashed (that may be the reason for the panic), so failures are silently ignored.
3. Fall through to Tier 0 emergency console: the Tier 0 console is initialized at boot time and its MMIO mapping is in Core memory (never revoked). Architecture-specific:
   - x86-64: COM1 serial (UART 16550, I/O port 0x3F8)
   - AArch64/ARMv7: PL011 UART (MMIO, base address from DTB)
   - RISC-V: SBI console extension (sbi_console_putchar)
   - PPC32/PPC64LE: OpenFirmware/OPAL console (opal_write)
4. Write panic output via Tier 0 console. No locks, no allocation.
5. umka-pstore fallback: if Tier 0 console is unavailable (headless system), write the panic message to the persistent crash buffer (mapped at boot, survives warm reboot, readable via /sys/umka/pstore/ after recovery).

The Tier 0 console path is entirely lock-free and allocation-free. It MUST work unconditionally at panic time, including when the panic was caused by a Tier 1 driver crash, memory corruption, or scheduler deadlock.
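The handoff order can be summarized as a fallback chain. The sketch below is a model rather than kernel code: `EmergencyConsole`, `PanicSink`, `panic_write`, and `MockConsole` are hypothetical, and the pstore region is represented by a caller-supplied byte slice so that the write path itself performs no allocation.

```rust
pub trait EmergencyConsole {
    fn emergency_write(&mut self, msg: &[u8]) -> Result<(), ()>;
}

#[derive(Debug, PartialEq)]
pub enum PanicSink {
    Tier0,
    Pstore,
}

/// Write the panic message, returning the sink that is guaranteed to hold it.
pub fn panic_write(
    tier1: &mut [&mut dyn EmergencyConsole],
    tier0: Option<&mut dyn EmergencyConsole>,
    pstore: &mut [u8],
    msg: &[u8],
) -> PanicSink {
    // Best effort only: a crashed Tier 1 driver must not stop the handoff.
    for console in tier1.iter_mut() {
        let _ = console.emergency_write(msg);
    }
    if let Some(console) = tier0 {
        if console.emergency_write(msg).is_ok() {
            return PanicSink::Tier0;
        }
    }
    // Headless fallback: copy as much as fits into the persistent buffer.
    let n = msg.len().min(pstore.len());
    pstore[..n].copy_from_slice(&msg[..n]);
    PanicSink::Pstore
}

/// Test double that records output and can be told to fail.
pub struct MockConsole {
    pub fail: bool,
    pub out: Vec<u8>,
}

impl EmergencyConsole for MockConsole {
    fn emergency_write(&mut self, msg: &[u8]) -> Result<(), ()> {
        if self.fail {
            return Err(());
        }
        self.out.extend_from_slice(msg);
        Ok(())
    }
}
```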

20.3 Audio Architecture (ALSA Compatibility)

Linux's Advanced Linux Sound Architecture (ALSA) provides the /dev/snd/pcmC0D0p interfaces for audio playback and capture.

20.3.1 ALSA PCM as DMA Rings

Audio devices are uniquely suited for UmkaOS's architecture because audio playback is fundamentally a ring buffer problem. Modern audio interfaces (Intel HDA, USB Audio Class 2.0) operate by reading PCM audio samples from a host memory ring buffer via DMA.

In UmkaOS, audio drivers (Tier 1 by default) do not implement complex ALSA state machines. Instead, an UmkaOS audio driver simply allocates an IOMMU-fenced DMA buffer (Section 11.1.6, umka_driver_dma_alloc) and programs the hardware to consume it.

When a userspace audio server (PipeWire or PulseAudio) opens the ALSA PCM node, umka-compat directly maps the hardware's DMA ring buffer into the PipeWire process's address space.

The Audio Data Path:
1. PipeWire writes PCM audio samples directly into the mapped DMA buffer in userspace.
2. PipeWire updates the ring buffer's "appl_ptr" (application pointer) in the shared memory control page.
3. The audio hardware consumes the samples via DMA and generates a period interrupt.
4. The kernel handles the interrupt, updates the "hw_ptr" (hardware pointer) in the shared control page, and wakes PipeWire via a futex.

Zero-Copy Routing: This architecture is purely zero-copy. The audio samples never pass through kernel memory, and the kernel never executes a copy_from_user(). The kernel's only role in the audio data path is routing the hardware interrupt to the PipeWire futex.
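The pointer arithmetic behind this data path is small enough to sketch. The model below assumes both pointers are free-running frame counters (the ring offset would be the counter modulo buffer_frames); `PcmControl` and its methods are hypothetical stand-ins for the shared control page.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical view of the shared control page for a playback stream.
pub struct PcmControl {
    /// Frames consumed by hardware so far (written by the kernel).
    pub hw_ptr: AtomicU64,
    /// Frames written by the application so far (written by userspace).
    pub appl_ptr: AtomicU64,
    pub buffer_frames: u64,
}

impl PcmControl {
    /// Free space, in frames, that the application may fill right now.
    pub fn playback_avail(&self) -> u64 {
        let hw = self.hw_ptr.load(Ordering::Acquire);
        let appl = self.appl_ptr.load(Ordering::Relaxed);
        let queued = appl.wrapping_sub(hw);
        self.buffer_frames.saturating_sub(queued)
    }

    /// Userspace side: publish up to `frames` newly written frames.
    pub fn commit_write(&self, frames: u64) -> u64 {
        let granted = frames.min(self.playback_avail());
        self.appl_ptr.fetch_add(granted, Ordering::Release);
        granted
    }

    /// Kernel side: the period interrupt advances the hardware pointer.
    pub fn on_period_interrupt(&self, period_frames: u64) {
        self.hw_ptr.fetch_add(period_frames, Ordering::Release);
    }
}
```

Each pointer has exactly one writer, so in this model the two sides never contend on the same word.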

Xrun Handling (D25)

An xrun is a buffer underrun (playback) or overrun (capture) — the application failed to keep up with the real-time audio stream.

Underrun (playback: application fails to refill the DMA ring before the hardware consumes it):
- The hardware continues running; the DMA ring outputs silence (zero samples) for the duration of the underrun. No explicit silence padding by the kernel is required; the hardware or DMA zeroes the consumed region.
- The PCM state transitions to SNDRV_PCM_STATE_XRUN.
- The next write() / snd_pcm_writei() call from the application returns -EPIPE.
- The application must call snd_pcm_recover() or snd_pcm_prepare() to restart playback.

Overrun (capture: application fails to drain the DMA ring before it fills):
- Incoming samples overwrite the oldest samples in the circular buffer; the oldest samples are silently dropped.
- The PCM state transitions to SNDRV_PCM_STATE_XRUN.
- The next read() / snd_pcm_readi() call returns -EPIPE.
- The application must call snd_pcm_recover() or snd_pcm_prepare() to restart capture.

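Recovery: snd_pcm_recover(pcm, -EPIPE, silent) calls snd_pcm_prepare() internally, returning the stream to the prepared state; the stream restarts once the application has written enough frames to reach the start threshold, or calls snd_pcm_start() explicitly. The silent parameter suppresses error logging for expected xruns (e.g., during transient CPU load spikes).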

No automatic recovery: UmkaOS does not silently recover from xruns on behalf of the application. The application is responsible for detecting -EPIPE and calling recover. This matches Linux ALSA behavior.
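The playback side of this contract can be modelled as a three-state machine. The sketch is illustrative: `Playback` and `PcmState` are hypothetical, frame accounting is reduced to a single counter, and the stream is assumed to start on the first write after prepare.

```rust
pub const EPIPE: i32 = 32;

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum PcmState {
    Prepared,
    Running,
    Xrun,
}

pub struct Playback {
    pub state: PcmState,
    /// Frames written by the application and not yet consumed.
    pub queued: u64,
}

impl Playback {
    /// Period interrupt: the hardware wants `period` more frames.
    pub fn on_period(&mut self, period: u64) {
        if self.state != PcmState::Running {
            return;
        }
        if self.queued < period {
            self.queued = 0;
            self.state = PcmState::Xrun; // underrun: nothing left to play
        } else {
            self.queued -= period;
        }
    }

    /// Application write. Fails with -EPIPE until the stream is re-prepared.
    pub fn write(&mut self, frames: u64) -> Result<u64, i32> {
        if self.state == PcmState::Xrun {
            return Err(-EPIPE);
        }
        self.queued += frames;
        self.state = PcmState::Running;
        Ok(frames)
    }

    /// Equivalent of snd_pcm_prepare(): the only way out of Xrun.
    pub fn prepare(&mut self) {
        self.queued = 0;
        self.state = PcmState::Prepared;
    }
}
```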

20.3.2 Audio Driver Tier Policy and Resilience

Audio drivers run in Tier 1 by default, as required for professional audio workloads with <5ms latency budgets where period interrupts fire every 1.3–42.7ms. This is consistent with the authoritative tier assignment in Section 12.1.3.
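The 1.3–42.7ms range quoted above follows directly from the period size and sample rate. A small helper (hypothetical, for illustration only) makes the conversion explicit:

```rust
/// Interval between period interrupts, in milliseconds.
pub fn period_ms(period_frames: u32, rate_hz: u32) -> f64 {
    period_frames as f64 * 1000.0 / rate_hz as f64
}

/// Fraction of each period spent on a fixed per-interrupt overhead.
pub fn overhead_fraction(period_frames: u32, rate_hz: u32, overhead_us: f64) -> f64 {
    overhead_us / (period_ms(period_frames, rate_hz) * 1000.0)
}
```

At 48 kHz, 64 frames gives roughly 1.33ms and 2048 frames gives roughly 42.67ms. A fixed 50μs per-interrupt cost is 0.5% of a 10ms period (480 frames) but 3.75% of a 64-frame period.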

For consumer/desktop configurations where crash resilience is prioritized over latency, audio drivers may be optionally demoted to Tier 2. The demotion adds ~20–50μs syscall overhead per interrupt, which is acceptable at ≥10ms buffer periods but unacceptable for professional RT audio.

Audio drivers (especially USB Audio and complex DSPs) are prone to state machine bugs. Regardless of tier, an audio driver crash is seamlessly contained via the standard driver crash recovery mechanism (Section 10.8).

When an audio driver process crashes, the kernel's device registry (Section 10.5) revokes its MMIO mappings, leaving the DMA ring buffer intact. The registry restarts the driver process. The new driver instance re-initializes the hardware and binds back to the existing DMA ring buffer. PipeWire experiences a brief audio glitch but does not need to close and reopen the ALSA device, as the memory mapping remains valid throughout the recovery process.

Recovery time breakdown: The ~10-20ms total glitch comprises: (a) crash detection via page fault on revoked MMIO mapping (~0, synchronous), (b) driver process restart including ELF load and re-initialization (~2-5ms), (c) hardware re-initialization including codec probe and DMA ring rebind (~5-15ms depending on hardware; USB Audio Class devices are at the high end due to USB control transfer latency). The glitch duration corresponds to 1-2 audio periods at typical desktop buffer sizes (≥10ms periods). Professional RT configurations with ≤2ms periods may lose 5-10 or more periods, since a 10-20ms glitch spans that many 2ms periods.

The 5–15 ms hardware re-init figure applies when the device supports soft reset (firmware reload without a USB port cycle). Full USB port reset requires T_RSTRCY ≥ 10 ms per USB 2.0 §9.2.6.2 (and ≥100 ms for USB 1.1 Full Speed devices), making full reset recovery 10–300 ms depending on USB version and device speed. Whether a device supports soft reset is detected at driver load time via AudioDevice::probe_soft_reset() and recorded in AudioDeviceCaps. Devices not supporting soft reset incur the full port-reset recovery time on crash reload.

20.3.3 Audio Device Trait

Interface contract: Section 12.1.3 (AudioDriver trait, audio_device_v1 KABI). This section specifies the Intel HDA, USB Audio Class, and HDMI/DP audio endpoint implementations of that contract. Tier decision and ALSA compat approach are authoritative in Section 12.1.3.

Architecture: Native UmkaOS audio driver framework with ALSA compatibility in umka-compat. The kernel provides a clean, low-latency PCM interface via the AudioDriver trait (Section 12.1.3). umka-compat translates snd_pcm_*/snd_ctl_* ioctls to native calls, enabling existing applications (PipeWire, PulseAudio, JACK) to work unmodified.

// umka-core/src/audio/mod.rs

/// Audio device handle.
#[repr(C)]
pub struct AudioDeviceId(u64);

/// PCM stream direction.
#[repr(u32)]
pub enum PcmDirection {
    /// Playback (host → device).
    Playback = 0,
    /// Capture (device → host).
    Capture = 1,
}

/// PCM sample format.
#[repr(u32)]
pub enum PcmFormat {
    /// Signed 16-bit little-endian (most common).
    S16Le = 0,
    /// Signed 24-bit little-endian (3 bytes per sample).
    S24Le = 1,
    /// Signed 32-bit little-endian.
    S32Le = 2,
    /// IEEE 754 32-bit float (pro audio, VST plugins).
    F32Le = 3,
}

/// PCM stream parameters.
#[repr(C)]
pub struct PcmParams {
    /// Direction (playback or capture).
    pub direction: PcmDirection,
    /// Sample format.
    pub format: PcmFormat,
    /// Sample rate in Hz (44100, 48000, 96000, 192000).
    pub rate: u32,
    /// Number of channels (2 = stereo, 6 = 5.1 surround, 8 = 7.1).
    pub channels: u8,
    /// Period size in frames (power of 2, typically 64-2048).
    /// A "period" is the granularity of interrupts: hardware fires an interrupt
    /// every `period_frames` frames. Smaller = lower latency, more CPU overhead.
    pub period_frames: u32,
    /// Buffer size in frames (multiple of period_frames, typically 4-8 periods).
    /// Total buffer duration = buffer_frames / rate. Example: 2048 frames @ 48kHz = 42.7ms.
    pub buffer_frames: u32,
}

/// PCM stream handle (opaque to userspace).
#[repr(C)]
pub struct PcmStreamHandle(u64);

// Note: The authoritative audio driver trait is `AudioDriver` defined in Section 12.1.3
// (10-drivers.md). Audio drivers implement that KABI trait, not a separate trait
// here. The methods are: open_pcm(), mixer_controls(), mixer_get(), mixer_set(),
// jack_ring(), suspend(), resume(). All use caller-supplied buffers (no Vec<T>
// returns) per KABI rules. See Section 12.1.3 for the full trait definition and rationale.

/// PCM stream (active playback or capture).
pub struct PcmStream {
    /// Stream handle (opaque to the kernel; used as a key by the driver).
    pub handle: PcmStreamHandle,
    /// Handle to the registered AudioDriver vtable (from device registry).
    /// Used to dispatch start_stream/stop_stream back to the owning driver.
    pub driver_handle: KabiDriverHandle<AudioDriverVTable>,
    /// Parameters.
    pub params: PcmParams,
    /// DMA buffer (ring buffer, mapped into userspace via umka-compat).
    pub dma_buffer: DmaBufferHandle,
    /// Hardware pointer (read position for playback, write position for capture).
    /// Updated by hardware via DMA or interrupt. Atomic for lock-free read from userspace.
    pub hw_ptr: Arc<AtomicU64>,
    /// Application pointer (write position for playback, read position for capture).
    /// Updated by userspace (PipeWire, ALSA lib).
    pub appl_ptr: Arc<AtomicU64>,
}

impl PcmStream {
    /// Start the stream (begin DMA).
    ///
    /// Programs the hardware DMA engine to transfer audio data between the
    /// ring buffer and the codec. For playback, the hardware reads from
    /// `dma_buffer[hw_ptr..appl_ptr]`. For capture, the hardware writes to
    /// `dma_buffer[hw_ptr..]`.
    ///
    /// The caller (ALSA compat layer or PipeWire bridge) must ensure sufficient
    /// data is buffered before calling start (playback) or that the buffer has
    /// space (capture). Returns `AudioError::Underrun` or `AudioError::Overrun`
    /// if preconditions are not met.
    ///
    /// This method delegates to the driver via the `AudioDriver` KABI trait
    /// (Section 12.1.3 in 10-drivers.md). The driver configures DMA scatter-gather from
    /// `self.dma_buffer` and sets the RUN bit in the stream descriptor register.
    pub fn start(&self) -> Result<(), AudioError> {
        // The actual hardware programming is performed by the AudioDriver
        // implementation behind the KABI vtable. The PcmStream is a handle
        // that the driver created in open_pcm(); the driver retains the
        // hardware references needed to configure DMA and stream registers.
        // The handle is passed back to the driver via the KABI start_stream()
        // call, which matches self.handle to the driver's internal state.
        //
        // Dispatch via the device registry: look up the AudioDriver vtable for
        // the device this stream belongs to (stored in self.driver_handle on open),
        // then call start_stream via the vtable pointer.
        let vtable = self.driver_handle.vtable();
        // SAFETY: vtable pointer is valid for the lifetime of the registered driver.
        unsafe { (vtable.start_stream)(self.handle) }
    }

    /// Stop the stream (pause DMA).
    ///
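    /// Clears the RUN bit, waits for the in-flight DMA transfer to finish
    /// (up to 1 period), and resets the hardware pointer. This is an immediate
    /// stop: frames still queued between `hw_ptr` and `appl_ptr` are not
    /// played out. The DMA buffer remains mapped, so the stream can be
    /// restarted without re-opening.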
    pub fn stop(&self) -> Result<(), AudioError> {
        let vtable = self.driver_handle.vtable();
        // SAFETY: vtable pointer is valid for the lifetime of the registered driver.
        unsafe { (vtable.stop_stream)(self.handle, false) } // immediate stop, no drain
    }
}

/// Mixer control (volume slider, mute toggle, input source selector).
#[repr(C)]
pub struct MixerControl {
    /// Control ID (passed to mixer_get() / mixer_set()).
    pub id: u32,
    /// Control type.
    pub control_type: MixerControlType,
    /// Name (e.g., "Master Playback Volume").
    pub name: [u8; 64],
    /// Min value (for volume controls).
    pub min: i32,
    /// Max value (for volume controls).
    pub max: i32,
    /// Current value.
    pub value: AtomicI32,
}

/// Mixer control type.
#[repr(u32)]
pub enum MixerControlType {
    /// Volume (integer range, min..max).
    Volume = 0,
    /// Mute (boolean, 0=unmuted, 1=muted).
    Mute = 1,
    /// Enumeration (e.g., input source: "Mic", "Line In", "CD").
    Enum = 2,
}
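
The constraints stated in the PcmParams field comments (power-of-two period, buffer a whole multiple of the period) can be checked before a stream is opened. The following is a minimal sketch, not part of the audio_device_v1 KABI: `ParamError`, `validate`, `bytes_per_frame`, and `buffer_bytes` are hypothetical names, and the local `PcmFormat` only mirrors the enum above so the example is self-contained.

```rust
/// Local mirror of the PcmFormat enum above (illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PcmFormat {
    S16Le,
    S24Le,
    S32Le,
    F32Le,
}

/// Hypothetical validation error; not part of the KABI.
#[derive(Debug, PartialEq)]
pub enum ParamError {
    ZeroField,
    PeriodNotPowerOfTwo,
    BufferNotMultipleOfPeriod,
}

/// Bytes occupied by one sample of one channel.
pub fn bytes_per_sample(format: PcmFormat) -> u32 {
    match format {
        PcmFormat::S16Le => 2,
        PcmFormat::S24Le => 3, // packed: 3 bytes per sample, per the enum comment
        PcmFormat::S32Le | PcmFormat::F32Le => 4,
    }
}

/// Bytes in one frame (one sample for every channel).
pub fn bytes_per_frame(format: PcmFormat, channels: u8) -> u32 {
    bytes_per_sample(format) * u32::from(channels)
}

/// Total DMA ring size in bytes, or None on arithmetic overflow.
pub fn buffer_bytes(format: PcmFormat, channels: u8, buffer_frames: u32) -> Option<u32> {
    buffer_frames.checked_mul(bytes_per_frame(format, channels))
}

/// Checks the relationships described in the PcmParams field comments.
pub fn validate(
    rate: u32,
    channels: u8,
    period_frames: u32,
    buffer_frames: u32,
) -> Result<(), ParamError> {
    if rate == 0 || channels == 0 || period_frames == 0 || buffer_frames == 0 {
        return Err(ParamError::ZeroField);
    }
    if !period_frames.is_power_of_two() {
        return Err(ParamError::PeriodNotPowerOfTwo);
    }
    if buffer_frames % period_frames != 0 {
        return Err(ParamError::BufferNotMultipleOfPeriod);
    }
    Ok(())
}
```

For a stereo S16Le stream with a 2048-frame buffer, `buffer_bytes` yields an 8192-byte DMA ring.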

20.3.4 Intel HDA Driver Model

Intel High Definition Audio (HDA) is the dominant audio controller on Intel and AMD x86 platforms. The HDA spec defines:

- HDA controller: PCI device (class 0x0403) that exposes MMIO registers for command/response, DMA buffer descriptors, and interrupt status.
- Codecs: Audio chips connected via the HDA link (typically 1-2 codecs: one for analog audio, one for HDMI/DP audio). Each codec has a tree of widgets (nodes: DAC, ADC, mixer, pin, amplifier).

// umka-hda-driver/src/lib.rs (Tier 1 driver, optionally Tier 2)

/// Maximum number of codecs on a single HDA link (codec addresses 0-14, i.e. up to 15 codecs).
pub const MAX_HDA_CODECS: usize = 15;

/// Maximum concurrent PCM streams per controller (limited by HDA stream
/// descriptor count; typical controllers support 4-16 bidirectional streams).
pub const MAX_HDA_STREAMS: usize = 16;

/// HDA controller state.
/// Uses fixed-capacity arrays to avoid heap allocation during audio playback.
/// Stream open/close modifies the array in-place without reallocation.
pub struct HdaController {
    /// PCI device.
    pub pci_dev: PciDevice,
    /// MMIO base address (from BAR0).
    pub mmio: *mut HdaRegisters,
    /// Codecs discovered on the HDA link.
    pub codecs: ArrayVec<HdaCodec, MAX_HDA_CODECS>,
    /// Active PCM streams.
    pub streams: ArrayVec<Arc<HdaPcmStream>, MAX_HDA_STREAMS>,
}

/// HDA codec (represents one audio chip on the HDA link).
pub struct HdaCodec {
    /// Codec address (0-14).
    pub addr: u8,
    /// Vendor ID (from root node).
    pub vendor_id: u32,
    /// Function groups discovered via GET_SUBORDINATE_NODE_COUNT on root node.
    /// Bounded by HDA spec: max 1 Audio Function Group + 1 Modem Function Group per codec.
    pub function_groups: ArrayVec<HdaFunctionGroup, 4>,
}

/// HDA function group (container for related widgets within a codec).
pub struct HdaFunctionGroup {
    /// Node ID (NID) of this function group.
    pub nid: u8,
    /// Widgets within this function group.
    /// Bounded by HDA spec: max 255 widgets per function group (NID range 8-bit).
    pub widgets: ArrayVec<HdaWidget, 256>,
}

/// HDA widget (node in codec's audio routing graph).
pub struct HdaWidget {
    /// Node ID (NID).
    pub nid: u8,
    /// Capabilities (from GET_PARAMETER verb).
    pub capabilities: u32,
}

/// HDA widget type.
#[repr(u8)]
pub enum HdaWidgetType {
    /// Audio output (DAC - Digital-to-Analog Converter).
    AudioOut = 0,
    /// Audio input (ADC - Analog-to-Digital Converter).
    AudioIn = 1,
    /// Mixer (combines multiple inputs).
    Mixer = 2,
    /// Selector (mux: selects one of multiple inputs).
    Selector = 3,
    /// Pin (physical connector: headphone jack, speaker, mic).
    Pin = 4,
    /// Power widget.
    Power = 5,
    /// Volume knob.
    VolumeKnob = 6,
    /// Vendor-specific.
    VendorDefined = 15,
}

impl HdaController {
    /// Send a verb (command) to a codec. Returns the response.
    /// HDA verbs use CORB (Command Outbound Ring Buffer) and RIRB (Response Inbound Ring Buffer).
    pub fn send_verb(&self, codec_addr: u8, nid: u8, verb: u32) -> Result<u32, HdaError> {
        // Encode the command word: bits [31:28] = codec_addr, [27:20] = nid,
        // [19:0] = verb payload.
        let command = ((codec_addr as u32) << 28) | ((nid as u32) << 20) | (verb & 0xF_FFFF);
        // Write the command into the next CORB slot. corb_advance_wp() returns the
        // slot index; the hardware write pointer must not expose that slot to the
        // controller before the command word has been stored.
        let wp = self.corb_advance_wp();
        // SAFETY: corb_base points to the driver-owned CORB ring and wp is a valid
        // slot index within it.
        unsafe { self.corb_base.add(wp).write_volatile(command) };
        // Poll RIRB read pointer until response arrives (timeout: 1ms).
        let response = self.rirb_poll_response(core::time::Duration::from_millis(1))?;
        Ok(response)
    }

    /// Probe codecs on the HDA link.
    pub fn probe_codecs(&mut self) -> Result<(), HdaError> {
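        // Read STATESTS register to discover codec addresses (bit set = codec present).
        // SAFETY: self.mmio points to the mapped BAR0 register block. MMIO registers
        // must be read with a volatile access so the compiler cannot elide or cache it.
        let statests = unsafe { core::ptr::addr_of!((*self.mmio).statests).read_volatile() };
        for addr in 0..MAX_HDA_CODECS as u8 {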
            if (statests & (1 << addr)) != 0 {
                // Codec present: read vendor ID, build widget tree.
                let vendor_id = self.send_verb(addr, 0, VERB_GET_VENDOR_ID)?;
                let codec = self.build_codec(addr, vendor_id)?;
                self.codecs.push(codec);
            }
        }
        Ok(())
    }

    /// Build widget tree for a codec (enumerate all nodes, parse capabilities).
    fn build_codec(&self, addr: u8, vendor_id: u32) -> Result<HdaCodec, HdaError> {
        // Root node (NID 0): GET_SUBORDINATE_NODE_COUNT returns the starting NID in
        // bits [23:16] and the node count in bits [7:0]. For the root node, the
        // subordinate nodes are the function groups.
        let sub = self.send_verb(addr, 0, VERB_GET_SUBORDINATE_NODE_COUNT)?;
        let fg_start = ((sub >> 16) & 0xFF) as u8;
        let fg_count = (sub & 0xFF) as u8;
        let mut codec = HdaCodec { addr, vendor_id, function_groups: ArrayVec::new() };
        // saturating_add() and take() keep a malformed codec response from
        // overflowing the u8 NID range or the fixed-capacity array (push() panics
        // when the ArrayVec is full).
        let fg_end = fg_start.saturating_add(fg_count);
        for fg_nid in (fg_start..fg_end).take(codec.function_groups.capacity()) {
            // Each function group: the same verb enumerates its child widgets.
            let fg_sub = self.send_verb(addr, fg_nid, VERB_GET_SUBORDINATE_NODE_COUNT)?;
            let w_start = ((fg_sub >> 16) & 0xFF) as u8;
            let w_count = (fg_sub & 0xFF) as u8;
            let mut widgets = ArrayVec::new();
            for w_nid in w_start..w_start.saturating_add(w_count) {
                // Audio Widget Capabilities: bits [23:20] hold the widget type
                // (see HdaWidgetType).
                let caps = self.send_verb(addr, w_nid, VERB_GET_PARAMETER(PARAM_AUDIO_WIDGET_CAP))?;
                widgets.push(HdaWidget { nid: w_nid, capabilities: caps });
            }
            codec.function_groups.push(HdaFunctionGroup { nid: fg_nid, widgets });
        }
        Ok(codec)
    }
}
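
The bit manipulation used by send_verb() and build_codec() can be exercised in isolation. The sketch below assumes the field layout given in the comments above (codec address in bits [31:28], NID in [27:20], payload in [19:0]) and the HDA convention that the Audio Widget Capabilities response carries the widget type in bits [23:20]. The function names are hypothetical helpers, not part of the driver.

```rust
/// Packs a CORB command word using the layout from send_verb():
/// bits [31:28] = codec address, [27:20] = NID, [19:0] = verb payload.
pub fn encode_verb(codec_addr: u8, nid: u8, payload: u32) -> u32 {
    (u32::from(codec_addr & 0xF) << 28) | (u32::from(nid) << 20) | (payload & 0xF_FFFF)
}

/// Splits a GET_SUBORDINATE_NODE_COUNT response into (starting NID, node count).
pub fn parse_subordinate_node_count(response: u32) -> (u8, u8) {
    (((response >> 16) & 0xFF) as u8, (response & 0xFF) as u8)
}

/// Extracts the 4-bit widget type from an Audio Widget Capabilities response.
/// The value corresponds to the HdaWidgetType discriminants (0 = AudioOut,
/// 4 = Pin, 15 = VendorDefined, ...).
pub fn widget_type(capabilities: u32) -> u8 {
    ((capabilities >> 20) & 0xF) as u8
}
```

For example, `encode_verb(2, 0x14, 0xF0009)` addresses NID 0x14 on codec 2 with a GET_PARAMETER request for parameter 0x09 and produces the command word 0x214F0009.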

DMA buffer descriptor list (BDLIST): HDA uses a scatter-gather DMA model. Each PCM stream has a BDLIST (Buffer Descriptor List) in host memory, containing entries like:

/// HDA Buffer Descriptor List Entry (BDL entry).
#[repr(C)]
pub struct HdaBdlEntry {
    /// Physical address of buffer segment.
    pub addr: u64,
    /// Length of buffer segment in bytes.
    pub length: u32,
    /// IOC (Interrupt On Completion) flag. Bit 0 only; upper 31 bits reserved per HDA
    /// spec §4.4.3 and must be written as zero. Set bit 0 to 1 to generate an interrupt
    /// when this segment completes; set to 0 for no interrupt on this entry.
    pub ioc: u32,
}

The HDA controller DMA engine walks the BDLIST, fetching audio data from the buffers, and generates an interrupt when ioc=1 entries complete (every period).
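
As an illustration of the one-interrupt-per-period arrangement, the sketch below fills a BDL for a physically contiguous ring buffer with one entry per period and IOC set on every entry. It is a simplification under stated assumptions: `build_bdl` is a hypothetical helper, it returns a `Vec` for brevity where the driver would use a fixed-capacity list, and the limits it checks (2 to 256 entries, 128-byte buffer alignment) reflect this sketch's reading of the HDA BDL rules and should be verified against the specification.

```rust
/// Mirror of the HdaBdlEntry layout above (16 bytes).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct HdaBdlEntry {
    pub addr: u64,
    pub length: u32,
    pub ioc: u32,
}

/// Builds one BDL entry per period for a contiguous buffer starting at
/// `base_phys`. Returns None if the arguments violate the assumed limits.
pub fn build_bdl(base_phys: u64, period_bytes: u32, periods: usize) -> Option<Vec<HdaBdlEntry>> {
    // Assumed limits: at least 2 and at most 256 entries, 128-byte alignment.
    if !(2..=256).contains(&periods) || period_bytes == 0 {
        return None;
    }
    if base_phys % 128 != 0 || period_bytes % 128 != 0 {
        return None;
    }
    let mut bdl = Vec::with_capacity(periods);
    for i in 0..periods {
        bdl.push(HdaBdlEntry {
            // Each segment starts where the previous one ended.
            addr: base_phys + (i as u64) * u64::from(period_bytes),
            length: period_bytes,
            // Bit 0 = IOC: interrupt when this segment completes.
            ioc: 1,
        });
    }
    Some(bdl)
}
```

With four 1024-byte periods, the controller raises four interrupts per pass over the 4096-byte ring, one at each period boundary.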

20.3.5 PipeWire Integration

Section 20.3 defines PipeWire ring buffers for audio routing in userspace. The integration:

1. Kernel provides raw PCM streams (Section 20.3.3 PcmStream): a DMA ring buffer that hardware directly reads/writes.
2. PipeWire runs in userspace (Tier 2): implements the audio graph (mixing, routing, resampling, effects).
3. Zero-copy path: PipeWire's "audio device" node directly mmaps the kernel PCM DMA buffer. PipeWire writes mixed samples at appl_ptr and advances the pointer; the kernel driver sees the update and programs the hardware to consume up to appl_ptr.
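
The pointer arithmetic behind the zero-copy path can be sketched as follows. This assumes that hw_ptr and appl_ptr are free-running frame counters that are reduced modulo the buffer size only when indexing the ring; the text above does not state this, so treat it as an assumption. The function names are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Frames the producer may still write for playback without overwriting
/// data the hardware has not consumed yet.
pub fn playback_avail(hw_ptr: &AtomicU64, appl_ptr: &AtomicU64, buffer_frames: u64) -> u64 {
    // Acquire pairs with the release store that publishes hardware progress.
    let hw = hw_ptr.load(Ordering::Acquire);
    let appl = appl_ptr.load(Ordering::Relaxed);
    // Frames written by the application but not yet played.
    let queued = appl.wrapping_sub(hw);
    buffer_frames.saturating_sub(queued)
}

/// Offset (in frames) inside the ring for a free-running pointer value.
/// `buffer_frames` must be nonzero.
pub fn ring_offset(ptr: u64, buffer_frames: u64) -> u64 {
    ptr % buffer_frames
}

/// Publishes `frames` newly written frames to the consumer side.
pub fn commit(appl_ptr: &AtomicU64, frames: u64) {
    // Release ensures the sample data is visible before the pointer moves.
    appl_ptr.fetch_add(frames, Ordering::Release);
}
```

A writer would call `playback_avail`, copy at most that many frames starting at `ring_offset(appl_ptr, buffer_frames)`, and then call `commit`.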

Low-latency timer: PipeWire needs a periodic callback to refill the buffer every period. The kernel provides a timer (HPET or TSC-deadline APIC timer) configured to fire every period_frames / rate seconds, e.g., about 1.33 ms for 64-frame periods at 48 kHz. The timer interrupt wakes PipeWire, which renders the next period's samples.
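
The timer interval follows directly from the stream parameters. A small worked example, using hypothetical helper names and integer nanoseconds (rounded down):

```rust
/// Interval between period interrupts, in nanoseconds: period_frames / rate.
/// Returns None for a zero sample rate.
pub fn period_ns(period_frames: u32, rate: u32) -> Option<u64> {
    if rate == 0 {
        return None;
    }
    Some(u64::from(period_frames) * 1_000_000_000 / u64::from(rate))
}

/// Time to play out a completely full buffer, in nanoseconds. This bounds
/// the latency added by the ring buffer itself.
pub fn buffer_latency_ns(buffer_frames: u32, rate: u32) -> Option<u64> {
    period_ns(buffer_frames, rate)
}
```

At 48 kHz, a 64-frame period gives a 1,333,333 ns timer interval, and a 2048-frame buffer holds about 42.7 ms of audio.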

20.3.6 Jack Detection

HDA codecs support unsolicited responses (jack detection events): when a headphone is plugged/unplugged, the codec sends an event to the controller.

impl HdaController {
    /// Enable unsolicited response for a pin widget (jack detection).
    pub fn enable_jack_detect(&self, codec_addr: u8, pin_nid: u8) -> Result<(), HdaError> {
        // Send SET_UNSOLICITED_ENABLE (verb 0x708) to the pin widget.
        // Payload: bit 7 = enable, bits [5:0] = tag (bit 6 is reserved). The codec
        // returns the tag in bits [31:26] of each unsolicited response.
        // NID is encoded by send_verb() into CORB bits [27:20]; do NOT embed it in the verb payload.
        let verb = VERB_SET_UNSOLICITED_ENABLE | (1 << 7); // enable=1, tag=0
        self.send_verb(codec_addr, pin_nid, verb)?;
        Ok(())
    }

    /// Handle unsolicited response interrupt (jack detection event).
    pub fn handle_unsolicited_response(&self, codec_addr: u8, response: u32) {
        // Parse response: extract pin NID, jack state (connected/disconnected).
        let pin_nid = (response >> 4) & 0xFF;
        let connected = (response & 0x1) != 0;

        // Post event to userspace via event ring buffer.
        umka_event::post_event(Event::AudioJackChanged {
            device_id: self.device_id(),
            codec_addr,
            pin_nid: pin_nid as u8,
            connected,
        });
    }
}
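
The handler above assumes a response layout that carries the pin NID and a presence bit. The HDA specification itself only fixes a 6-bit tag, programmed in bits [5:0] of the SET_UNSOLICITED_ENABLE payload and returned in bits [31:26] of the response; the remaining response bits are codec-defined. One alternative, sketched here under that reading of the spec, is to give each pin its own tag and map the tag back to a NID, then query the pin's presence state separately (for example with GET_PIN_SENSE). `JackTagTable` and the function names are hypothetical.

```rust
/// Enable bit in the SET_UNSOLICITED_ENABLE payload.
pub const UNSOL_ENABLE: u32 = 1 << 7;
/// Largest tag value that fits the 6-bit tag field.
pub const UNSOL_TAG_MAX: u8 = 0x3F;

/// Builds the 8-bit payload (enable + tag); None if the tag does not fit.
pub fn unsol_enable_payload(tag: u8) -> Option<u32> {
    if tag > UNSOL_TAG_MAX {
        return None;
    }
    Some(UNSOL_ENABLE | u32::from(tag))
}

/// Extracts the tag from bits [31:26] of an unsolicited response.
pub fn unsol_tag(response: u32) -> u8 {
    (response >> 26) as u8
}

/// Maps tags to pin NIDs. Tag 0 is left unused so that an all-zero
/// response never matches a registered pin.
pub struct JackTagTable {
    pins: [Option<u8>; 64],
}

impl JackTagTable {
    pub fn new() -> Self {
        Self { pins: [None; 64] }
    }

    /// Assigns the lowest free tag (starting at 1) to `pin_nid`.
    pub fn assign(&mut self, pin_nid: u8) -> Option<u8> {
        let idx = self.pins.iter().skip(1).position(|p| p.is_none())? + 1;
        self.pins[idx] = Some(pin_nid);
        Some(idx as u8)
    }

    /// Looks up the pin that registered the tag carried by `response`.
    pub fn pin_for(&self, response: u32) -> Option<u8> {
        self.pins[usize::from(unsol_tag(response))]
    }
}
```

With this scheme, enable_jack_detect() would OR the assigned tag into the verb payload instead of always using tag 0.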

Audio routing policy: Audio routing policy (default device selection, per-app routing, volume control) is handled by PipeWire in userspace. Kernel provides DMA ring buffers and jack detection events.

20.3.7 Architectural Decision

Audio: Native UmkaOS framework + ALSA compat

Kernel provides native PCM interface with clean ABI. umka-compat translates ALSA ioctls to native calls, enabling existing applications (PipeWire, PulseAudio, JACK) to work unmodified. Best of both worlds: clean kernel API, full userspace compatibility.

20.3.8 ALSA MIDI Sequencer

The ALSA sequencer provides a kernel-internal MIDI event bus. Applications connect ports and route MIDI events between synthesizers, hardware MIDI interfaces, and software instruments. It is distinct from raw MIDI device I/O (which goes through /dev/midiC0D0 raw devices).

Architecture

┌────────────────────────────────────────────────────────┐
│                   snd_seq Core                         │
│                                                        │
│  Clients:  [app A]  [app B]  [snd_seq_dummy]  [hw]    │
│               │        │           │            │      │
│  Ports:    [128:0]  [129:0]     [14:0]      [20:0]    │
│               │        │           │            │      │
│  Subscriptions (routing graph — many-to-many)          │
│               └────────┴───────────┘────────────┘      │
│  Queues:   [Q0: real-time]  [Q1: MIDI tick-based]      │
│               │                    │                   │
│  Timer:    snd_hrtimer (CLOCK_MONOTONIC)                │
└────────────────────────────────────────────────────────┘
        ↕ /dev/snd/seq

Data Structures

/// Maximum MIDI ports per sequencer client. Port IDs are 8-bit, so the port
/// table has 256 slots. The user/kernel split applies to client IDs, not ports.
pub const SEQ_MAX_PORTS_PER_CLIENT: usize = 256;

/// MIDI event FIFO depth per sequencer client output queue.
/// 256 events × ~28 bytes each ≈ 7 KB per client — fixed, no heap allocation.
pub const SEQ_CLIENT_FIFO_DEPTH: usize = 256;

/// ALSA sequencer client (one per application or hardware source).
///
/// The `RingBuf<SeqEvent, SEQ_CLIENT_FIFO_DEPTH>` type is defined in Section 10.6
/// (umka-driver-sdk ring buffer). If not yet imported in this context, add
/// `use umka_driver_sdk::ring::RingBuf;`.
pub struct SeqClient {
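    /// Client number, following the Linux allocation shown in the diagram above:
    /// 0-15 = system clients (e.g., 14 = "Midi Through"), 16-127 = kernel driver
    /// clients, 128-191 = user-space clients.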
    pub client_id:   u8,
    /// Client type.
    pub type_:       SeqClientType,
    /// Client name (for display in aconnect etc.).
    pub name:        [u8; 64],
    /// Port table indexed directly by port ID (0-255). O(1) access by port_id.
    /// Option<Arc> allows sparse allocation — clients need not use all 256 ports.
    pub ports:       [Option<Arc<SeqPort>>; SEQ_MAX_PORTS_PER_CLIENT],
    /// Output event ring buffer (kernel→client direction). Fixed-size, no heap allocation.
    /// When full, new events are dropped and `lost` is incremented.
    pub fifo:        Mutex<RingBuf<SeqEvent, SEQ_CLIENT_FIFO_DEPTH>>,
    /// Count of dropped events due to full FIFO. Monotonically increasing.
    pub lost:        AtomicU64,
}

pub enum SeqClientType {
    /// Kernel client (e.g., hardware MIDI driver, snd_seq_dummy).
    Kernel,
    /// Userspace application connected via /dev/snd/seq.
    User,
}

/// ALSA sequencer port.
pub struct SeqPort {
    pub port_id:     u8,
    pub client_id:   u8,
    pub name:        [u8; 64],
    /// Port capability flags.
    pub capability:  SeqPortCapability,
    /// Port type flags.
    pub type_:       SeqPortType,
    /// Subscriber list: ports that send TO this port (WRITE direction).
    pub write_subs:  RwLock<Vec<SeqSubscription>>,
    /// Subscriber list: ports this port sends TO (READ direction).
    pub read_subs:   RwLock<Vec<SeqSubscription>>,
    /// Per-port kernel client callback (for kernel clients).
    pub kernel_fn:   Option<fn(port: &SeqPort, event: &SeqEvent)>,
}

bitflags! {
    pub struct SeqPortCapability: u32 {
        const READ        = 1 << 0; // Other ports may receive from this port
        const WRITE       = 1 << 1; // Other ports may send to this port
        const SYNC_READ   = 1 << 2; // Obsolete
        const SYNC_WRITE  = 1 << 3; // Obsolete
        const DUPLEX      = 1 << 4; // Full-duplex port
        const SUBS_READ   = 1 << 5; // Other clients may subscribe to read from this port
        const SUBS_WRITE  = 1 << 6; // Other clients may subscribe to write to this port
        const NO_EXPORT   = 1 << 7; // Routing by third-party clients is not allowed
    }
}

bitflags! {
    pub struct SeqPortType: u32 {
        const SPECIFIC    = 1 << 0; // Hardware-specific (not a standard MIDI port)
        const MIDI_GENERIC = 1 << 1; // Standard MIDI port
        const MIDI_GM     = 1 << 2; // General MIDI compatible
        const MIDI_GS     = 1 << 3; // Roland GS compatible
        const MIDI_XG     = 1 << 4; // Yamaha XG compatible
        const MIDI_MT32   = 1 << 5; // Roland MT-32 compatible
        const MIDI_GM2    = 1 << 6; // General MIDI 2 compatible
        const SYNTH       = 1 << 10; // Software synthesizer
        const DIRECT_SAMPLE = 1 << 11; // Sampling synthesizer
        const SAMPLE      = 1 << 12; // Sample player
        const HARDWARE    = 1 << 16; // Hardware port (MIDI interface)
        const SOFTWARE    = 1 << 17; // Software port (application)
        const SYNTHESIZER = 1 << 18; // Synthesizer
        const PORT        = 1 << 19; // Port connector (MIDI port on a hardware device)
        const APPLICATION = 1 << 20; // Application (sequencer, arpeggiator, etc.)
    }
}

MIDI Event

/// ALSA sequencer event (matches struct snd_seq_event, 28 bytes).
#[repr(C)]
pub struct SeqEvent {
    /// Event type (see SeqEventType enum).
    pub type_:   u8,
    /// Flags: timestamp format, data format.
    pub flags:   u8,
    /// Tag (for application use).
    pub tag:     u8,
    /// Queue ID (for scheduled events; SNDRV_SEQ_QUEUE_DIRECT = 253 for immediate).
    pub queue:   u8,
    /// Timestamp (union: tick or real-time depending on flags).
    pub time:    SeqTimestamp,
    /// Source port (client_id, port_id).
    pub source:  SeqAddr,
    /// Destination port (client_id, port_id; client SNDRV_SEQ_ADDRESS_SUBSCRIBERS = 254
    /// delivers to all subscribers, SNDRV_SEQ_ADDRESS_BROADCAST = 255 broadcasts).
    pub dest:    SeqAddr,
    /// Event data (union of MIDI event types).
    pub data:    SeqEventData,
}

/// Timestamp union (8 bytes).
#[repr(C)]
pub union SeqTimestamp {
    /// MIDI tick timestamp (SNDRV_SEQ_TIME_STAMP_TICK flag).
    pub tick: u32,
    /// Real-time timestamp (SNDRV_SEQ_TIME_STAMP_REAL flag).
    pub time: SeqRealTime,
}

#[repr(C)]
pub struct SeqRealTime {
    pub tv_sec:  u32,
    pub tv_nsec: u32,
}

#[repr(C)]
pub union SeqEventData {
    pub note:    SeqEvNote,    // NOTE_ON, NOTE_OFF, KEY_PRESSURE
    pub control: SeqEvCtrl,   // CONTROLLER, PGMCHANGE, PITCHBEND, etc.
    pub raw8:    [u8; 12],    // Raw 12-byte payload (fixed-length events)
    pub ext:     SeqEvExt,    // Variable-length events such as SYSEX (length + data pointer)
    pub queue:   SeqEvQueue,  // Queue start/stop/tempo
    pub addr:    SeqAddr,     // Announce events
    pub connect: SeqConnect,  // Port subscribe/unsubscribe
    pub result:  SeqEvResult, // Echo/result
}

#[repr(C)]
pub struct SeqEvNote {
    pub channel:  u8,
    pub note:     u8,   // 0-127
    pub velocity: u8,   // 0-127 (NOTE_OFF: release velocity)
    pub off_velocity: u8, // NOTE_OFF velocity (for NOTE event)
    pub duration: u32,  // Duration in ticks (for NOTE event; ignored for NOTE_ON/OFF)
}

#[repr(C)]
pub struct SeqEvCtrl {
    pub channel: u8,
    pub _pad:    [u8; 3],
    pub param:   u32,   // Controller number (for CONTROLLER), program (PGMCHANGE), etc.
    pub value:   i32,   // Controller value; pitch bend range is -8192..=8191
}
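
The 28-byte layout claim can be checked mechanically. The sketch below redeclares a reduced version of the structures (only the `note` and `raw8` union members, and a hypothetical two-byte `SeqAddr` with `client` and `port` fields) and builds a direct, subscriber-addressed Note On event. The constant names are local to the example; their values are the Linux sequencer values.

```rust
#[repr(C)]
#[derive(Clone, Copy)]
pub struct SeqAddr { pub client: u8, pub port: u8 }

#[repr(C)]
#[derive(Clone, Copy)]
pub struct SeqRealTime { pub tv_sec: u32, pub tv_nsec: u32 }

#[repr(C)]
#[derive(Clone, Copy)]
pub union SeqTimestamp { pub tick: u32, pub time: SeqRealTime }

#[repr(C)]
#[derive(Clone, Copy)]
pub struct SeqEvNote {
    pub channel: u8,
    pub note: u8,
    pub velocity: u8,
    pub off_velocity: u8,
    pub duration: u32,
}

#[repr(C)]
#[derive(Clone, Copy)]
pub union SeqEventData { pub note: SeqEvNote, pub raw8: [u8; 12] }

#[repr(C)]
#[derive(Clone, Copy)]
pub struct SeqEvent {
    pub type_: u8,
    pub flags: u8,
    pub tag: u8,
    pub queue: u8,
    pub time: SeqTimestamp,
    pub source: SeqAddr,
    pub dest: SeqAddr,
    pub data: SeqEventData,
}

pub const EVENT_NOTEON: u8 = 6;
pub const QUEUE_DIRECT: u8 = 253;
pub const ADDRESS_UNKNOWN: u8 = 253;
pub const ADDRESS_SUBSCRIBERS: u8 = 254;

/// Builds an unscheduled Note On addressed to all subscribers of `source`.
pub fn note_on(source: SeqAddr, channel: u8, note: u8, velocity: u8) -> SeqEvent {
    // Zero the whole payload first so no union byte is left uninitialized.
    let mut data = SeqEventData { raw8: [0; 12] };
    data.note = SeqEvNote { channel, note, velocity, off_velocity: 0, duration: 0 };
    SeqEvent {
        type_: EVENT_NOTEON,
        flags: 0,
        tag: 0,
        queue: QUEUE_DIRECT,
        time: SeqTimestamp { time: SeqRealTime { tv_sec: 0, tv_nsec: 0 } },
        source,
        dest: SeqAddr { client: ADDRESS_SUBSCRIBERS, port: ADDRESS_UNKNOWN },
        data,
    }
}
```

A compile-time guard such as `const _: () = assert!(core::mem::size_of::<SeqEvent>() == 28);` would catch accidental layout changes.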

Event Types

Key event types (SNDRV_SEQ_EVENT_*):

Type	Value	Description
NOTEON	6	Note On (channel, note, velocity)
NOTEOFF	7	Note Off (channel, note, velocity)
KEYPRESS	8	Key Pressure / Aftertouch
CONTROLLER	10	Control Change (CC# 0-127)
PGMCHANGE	11	Program Change
CHANPRESS	12	Channel Pressure
PITCHBEND	13	Pitch Bend (-8192 to 8191)
SONGPOS	20	Song Position Pointer
SONGSEL	21	Song Select
QFRAME	22	MIDI Quarter Frame (MTC)
START	30	MIDI Start
CONTINUE	31	MIDI Continue
STOP	32	MIDI Stop
CLOCK	36	MIDI Clock
RESET	41	Reset to power-on state
SENSING	42	Active Sensing
ECHO	50	Echo back to sender (for timing measurement)
PORT_SUBSCRIBED	66	Port subscription created
PORT_UNSUBSCRIBED	67	Port subscription deleted
SYSEX	130	System Exclusive (extended data format)

Queues and Timers

/// Sequencer queue (schedules events for future delivery).
pub struct SeqQueue {
    pub queue_id:  u8,
    /// Queue owner client (only owner can start/stop/set tempo).
    pub owner:     u8,
    /// Running state.
    pub running:   AtomicBool,
    /// Tempo in microseconds per quarter note (default 500000 = 120 BPM).
    pub tempo_us:  AtomicU32,
    /// Timebase resolution in pulses per quarter note (PPQ; default 96).
    pub ppq:       u32,
    /// Current position in ticks.
    pub tick:      AtomicU64,
    /// Current real-time position.
    pub real_time: AtomicU64, // nanoseconds
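    /// Scheduled event heap, earliest timestamp first. `BinaryHeap` is a max-heap,
    /// so `ScheduledEvent` must order by reversed timestamp to act as a min-heap.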
    pub events:    Mutex<BinaryHeap<ScheduledEvent>>,
    /// hrtimer for next scheduled event.
    pub timer:     HrTimer,
}

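Queue control follows Linux: a client starts, stops, or continues a queue by sending a SNDRV_SEQ_EVENT_START, SNDRV_SEQ_EVENT_STOP, or SNDRV_SEQ_EVENT_CONTINUE event to the system timer port (client 0, port 0) with the target queue ID in the event data. Linux's sound/asequencer.h defines no dedicated start/stop/continue ioctls. Tempo is changed via SNDRV_SEQ_IOCTL_SET_QUEUE_TEMPO.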
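
The relationship between tempo, PPQ, and tick length, together with earliest-first scheduling, can be sketched as follows. `TickQueue` is a hypothetical stand-in for the `events` heap in SeqQueue; it stores `(due_tick, sequence)` pairs wrapped in `Reverse` so that Rust's max-heap `BinaryHeap` pops the smallest tick first.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Duration of one tick in nanoseconds: tempo (us per quarter note) / PPQ.
pub fn tick_ns(tempo_us: u32, ppq: u32) -> Option<u64> {
    if ppq == 0 {
        return None;
    }
    Some(u64::from(tempo_us) * 1_000 / u64::from(ppq))
}

/// Beats per minute for a tempo given in microseconds per quarter note.
pub fn bpm(tempo_us: u32) -> Option<u32> {
    if tempo_us == 0 { None } else { Some(60_000_000 / tempo_us) }
}

/// Earliest-first queue of (due_tick, sequence number) pairs.
pub struct TickQueue {
    heap: BinaryHeap<Reverse<(u64, u64)>>,
    next_seq: u64,
}

impl TickQueue {
    pub fn new() -> Self {
        Self { heap: BinaryHeap::new(), next_seq: 0 }
    }

    /// Schedules an event and returns its sequence number. The sequence
    /// number keeps events with equal ticks in insertion order.
    pub fn push(&mut self, due_tick: u64) -> u64 {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.heap.push(Reverse((due_tick, seq)));
        seq
    }

    /// Pops the earliest event if it is due at or before `now_tick`.
    pub fn pop_due(&mut self, now_tick: u64) -> Option<(u64, u64)> {
        let due = self.heap.peek().map(|Reverse((due, _))| *due)?;
        if due <= now_tick {
            self.heap.pop().map(|Reverse(entry)| entry)
        } else {
            None
        }
    }
}
```

At the default tempo of 500000 us per quarter note and 96 PPQ, one tick lasts about 5.2 ms, which is the granularity at which the queue timer would need to fire for tick-scheduled events.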

/dev/snd/seq Interface

ioctls on /dev/snd/seq (one fd per client):

ioctl	Description
`SNDRV_SEQ_IOCTL_PVERSION`	Get sequencer version
`SNDRV_SEQ_IOCTL_CLIENT_ID`	Get caller's client ID
`SNDRV_SEQ_IOCTL_SYSTEM_INFO`	Get max_queues, max_clients, max_ports, max_channels
`SNDRV_SEQ_IOCTL_CREATE_PORT`	Create a new port
`SNDRV_SEQ_IOCTL_DELETE_PORT`	Delete a port
`SNDRV_SEQ_IOCTL_GET_PORT_INFO`	Get port info (name, capability, type)
`SNDRV_SEQ_IOCTL_SET_PORT_INFO`	Set port info
`SNDRV_SEQ_IOCTL_SUBSCRIBE_PORT`	Create subscription (routing)
`SNDRV_SEQ_IOCTL_UNSUBSCRIBE_PORT`	Remove subscription
`SNDRV_SEQ_IOCTL_CREATE_QUEUE`	Create event queue
`SNDRV_SEQ_IOCTL_DELETE_QUEUE`	Delete queue
`SNDRV_SEQ_IOCTL_GET_QUEUE_STATUS`	Get queue running state
`SNDRV_SEQ_IOCTL_GET_QUEUE_TEMPO`	Get BPM/PPQ
`SNDRV_SEQ_IOCTL_SET_QUEUE_TEMPO`	Set BPM/PPQ
`SNDRV_SEQ_IOCTL_RUNNING_MODE`	Report client data format (endianness, 32/64-bit mode)
`SNDRV_SEQ_IOCTL_GET_CLIENT_INFO`	Get client metadata
`SNDRV_SEQ_IOCTL_SET_CLIENT_INFO`	Set client name etc.

Read/write on the fd: each read() returns one or more SeqEvent structs; write() sends events to destination ports immediately (queue=SNDRV_SEQ_QUEUE_DIRECT) or schedules them (queue=Q0/Q1 with timestamp). O_NONBLOCK supported.

snd_seq_dummy — Loopback Client

snd_seq_dummy creates one kernel client (client 14, "Midi Through") with two ports: port 0 (writable by apps, readable by output devices) and port 1 (reverse). All events written to port 0 are echoed back to all subscribers of port 0. This provides a software MIDI loopback for virtual instruments.

Linux Compatibility

/dev/snd/seq character device (major 116, minor 1): same as Linux ALSA
ioctl codes identical to Linux ALSA sound/asequencer.h
struct snd_seq_event binary layout identical
aconnect(1), aplaymidi(1), aseqdump(1) work without modification
JACK and PipeWire MIDI ports connect via snd_seq (JACK uses seq_midi_event translation)
Timidity++, FluidSynth, and other software synthesizers use /dev/snd/seq directly

20.4 Display and Graphics (DRM/KMS)

The Direct Rendering Manager (DRM) and Kernel Mode Setting (KMS) subsystems manage GPUs, display outputs, and hardware-accelerated rendering.

20.4.1 DRM as a Tier 1 Subsystem

GPUs are complex, high-bandwidth devices that require aggressive memory management (GART/TTM) and rapid command submission. Therefore, UmkaOS GPU drivers (e.g., umka-amdgpu, umka-i915) operate in Tier 1 (Ring 0, MPK-isolated) (Section 10.4). Full implementation details covering display device models, atomic modesetting, framebuffer objects, and scanout planes are specified in Sections 20.4.3–20.4.14.

The GPU driver runs in a dedicated hardware memory domain. It receives command buffers from userspace (Mesa/Vulkan) via shared memory rings. The driver validates the command buffers (ensuring they don't contain malicious GPU memory writes) and submits them to the hardware command rings.

Because the driver is MPK-isolated, a bug in the complex command validation logic (a frequent source of Linux CVEs) cannot corrupt UmkaOS Core memory or the page cache. If the GPU driver faults, it is reloaded (~50-150ms). Userspace rendering contexts are lost (triggering a VK_ERROR_DEVICE_LOST in Vulkan applications), but the system remains stable.

20.4.2 DMA-BUF and Secure File Descriptor Passing

Modern Linux graphics rely entirely on DMA-BUF: a mechanism for sharing hardware-backed memory buffers between different devices and processes (e.g., sharing a rendered frame from the GPU to the Wayland compositor, or from a V4L2 webcam to the GPU).

In Linux, a DMA-BUF is represented as a standard file descriptor. Passing the file descriptor over a UNIX domain socket grants access to the underlying memory.

UmkaOS's DMA-BUF Implementation: UmkaOS implements DMA-BUF using the core Capability System (Section 8.1).

1. When the GPU driver allocates a framebuffer, it creates an UmkaOS Memory Object and mints a Capability Token granting MEM_READ | MEM_WRITE access.
2. umka-compat wraps this Capability Token in a synthetic file descriptor.
3. When the Wayland client passes the file descriptor to the compositor over AF_UNIX (using SCM_RIGHTS), the kernel securely delegates the Capability Token to the compositor's capability space.
4. The compositor uses the Capability Token to map the framebuffer into its own address space, or passes it back to the GPU driver to queue a page flip (KMS).
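
The four steps can be modeled with a toy capability table. This is purely illustrative: `CapToken`, `CapSpace`, and `delegate` are hypothetical names, the real Capability System is specified in Section 8.1, and the cryptographic protection of tokens is not modeled at all.

```rust
use std::collections::HashMap;

pub const MEM_READ: u32 = 1 << 0;
pub const MEM_WRITE: u32 = 1 << 1;

/// Toy capability: which memory object, and which rights on it.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct CapToken {
    pub object_id: u64,
    pub rights: u32,
}

/// Per-process table mapping synthetic file descriptors to capabilities.
#[derive(Default)]
pub struct CapSpace {
    next_fd: i32,
    table: HashMap<i32, CapToken>,
}

impl CapSpace {
    /// Step 2: wrap a capability in a synthetic file descriptor.
    pub fn install(&mut self, cap: CapToken) -> i32 {
        let fd = self.next_fd;
        self.next_fd += 1;
        self.table.insert(fd, cap);
        fd
    }

    pub fn lookup(&self, fd: i32) -> Option<CapToken> {
        self.table.get(&fd).copied()
    }

    /// Step 4: a mapping request succeeds only if the capability behind
    /// `fd` carries every requested right.
    pub fn can_map(&self, fd: i32, wanted: u32) -> bool {
        self.lookup(fd).map_or(false, |cap| cap.rights & wanted == wanted)
    }
}

/// Step 3: models SCM_RIGHTS passing. The capability behind the sender's
/// `fd` is copied into the receiver's space under a new descriptor.
pub fn delegate(sender: &CapSpace, fd: i32, receiver: &mut CapSpace) -> Option<i32> {
    let cap = sender.lookup(fd)?;
    Some(receiver.install(cap))
}
```

In this model a descriptor number is only meaningful inside the capability space that issued it, which is why the receiver gets a fresh descriptor for the same underlying object.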

By backing DMA-BUF file descriptors with cryptographic Capability Tokens, UmkaOS guarantees that memory access rights cannot be forged or leaked, and seamlessly supports distributed graphics rendering (Section 5.1) where the compositor and the rendering client exist on different physical nodes in the cluster.

20.4.3 Display Device Model

Interface contract: Section 12.1.2 (DisplayDriver trait, display_device_v1 KABI). This section specifies the Intel i915, AMD DCN, and embedded display pipeline implementations of that contract. Tier decision and atomic modesetting requirement are authoritative in Section 12.1.2.

Tier: Tier 1 for integrated GPUs (Intel i915, AMD amdgpu iGPU). Tier 2 only for fully offloaded display (USB DisplayLink, network display servers).

// umka-core/src/display/mod.rs

/// Display device handle.
#[repr(C)]
pub struct DisplayDeviceId(u64);

/// Display connector type.
#[repr(u32)]
pub enum ConnectorType {
    /// Unknown or internal.
    Unknown = 0,
    /// HDMI.
    Hdmi = 1,
    /// DisplayPort.
    DisplayPort = 2,
    /// Embedded DisplayPort (eDP; laptop internal screen).
    EmbeddedDp = 3,
    /// USB-C with DP Alt Mode.
    UsbTypeC = 4,
    /// DVI.
    Dvi = 5,
    /// VGA (legacy).
    Vga = 6,
    /// Virtual (for VNC, RDP).
    Virtual = 7,
}

/// Display connector state.
#[repr(u32)]
pub enum ConnectorState {
    /// No display attached.
    Disconnected = 0,
    /// Display attached, EDID read successfully.
    Connected = 1,
    /// Display may be attached, but no EDID (fallback to safe mode).
    ConnectedNoEdid = 2,
}

/// Display connector.
///
/// Mutable connector properties (EDID, modes, active mode) are grouped into
/// a single `ConnectorProps` snapshot, swapped atomically via RCU during
/// hotplug or modeset. This eliminates per-field RwLock overhead and ensures
/// readers always see a consistent snapshot (no half-updated EDID + stale
/// mode list). Connector state and DPMS are independent atomic fields
/// because they change on different paths (hotplug IRQ vs userspace ioctl).
pub struct DisplayConnector {
    /// Connector ID (unique per display device).
    pub id: u32,
    /// Connector type.
    pub connector_type: ConnectorType,
    /// Current state (connected, disconnected). Updated atomically by
    /// hotplug IRQ handler — no lock needed.
    pub state: AtomicU32, // ConnectorState
    /// DPMS (Display Power Management Signaling) state.
    pub dpms: AtomicU32, // DpmsState
    /// Mutable connector properties. Updated during hotplug (EDID read,
    /// mode list rebuild) and modeset (active_mode change). RCU-protected:
    /// readers (userspace mode queries, compositor enumeration) are lock-free;
    /// writers (hotplug handler, atomic commit) clone-and-swap.
    pub props: RcuPtr<Arc<ConnectorProps>>,
}

/// Immutable snapshot of connector properties. Created during hotplug
/// (EDID parse → mode list → props swap) or atomic commit (active_mode
/// change). Freed after RCU grace period when superseded.
pub struct ConnectorProps {
    /// EDID data. Fixed-size buffer avoids heap allocation during hotplug.
    /// EDID is organized in 128-byte blocks: one base block plus optional
    /// extension blocks (CTA-861, DisplayID). 512 bytes holds the base block
    /// and three extension blocks.
    pub edid: Option<ArrayVec<u8, 512>>,
    /// Supported display modes (parsed from EDID or driver-provided fallbacks).
    /// Typical displays advertise 10-40 modes; 64 is sufficient for 8K panels
    /// with multiple refresh rates.
    pub modes: ArrayVec<DisplayMode, 64>,
    /// Currently active mode (if connected and enabled).
    pub active_mode: Option<DisplayMode>,
}

/// Display mode (resolution, refresh rate).
#[repr(C)]
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct DisplayMode {
    /// Horizontal resolution in pixels.
    pub hdisplay: u16,
    /// Vertical resolution in pixels.
    pub vdisplay: u16,
    /// Refresh rate in millihertz (60000 = 60.000 Hz).
    pub vrefresh_mhz: u32,
    /// Flags (interlaced, VRR capable, preferred mode).
    pub flags: u32,
    /// Pixel clock in kHz (for driver use, validates mode is achievable).
    pub clock_khz: u32,
    /// Horizontal timings (front porch, sync, back porch).
    /// **Range note**: u16 is sufficient for resolutions up to 8K@60Hz (htotal
    /// typically <12000). At 8K@120Hz with extended VRR blanking, htotal can
    /// approach or exceed 65535; such modes require u32. Linux's UAPI
    /// `struct drm_mode_modeinfo` also uses 16-bit timing fields, so it cannot
    /// express such modes either. If UmkaOS adds support for 8K@120Hz+ with wide
    /// VRR blanking, these fields must be widened to u32.
    pub hsync_start: u16,
    pub hsync_end: u16,
    pub htotal: u16,
    /// Vertical timings (front porch, sync, back porch).
    pub vsync_start: u16,
    pub vsync_end: u16,
    pub vtotal: u16,
}

/// Display mode flags.
pub mod mode_flags {
    /// Interlaced mode.
    pub const INTERLACED: u32 = 1 << 0;
    /// Variable Refresh Rate (VRR) capable (FreeSync, G-Sync, HDMI VRR).
    pub const VRR: u32 = 1 << 1;
    /// Preferred mode (from EDID).
    pub const PREFERRED: u32 = 1 << 2;
}

/// DPMS (Display Power Management Signaling) state.
#[repr(u32)]
pub enum DpmsState {
    /// Display on, normal operation.
    On = 0,
    /// Display standby (monitor sleeps, can wake instantly).
    Standby = 1,
    /// Display suspend (lower power than standby).
    Suspend = 2,
    /// Display off (lowest power, may take 1-2 seconds to wake).
    Off = 3,
}
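The timing fields of `DisplayMode` are not independent: the refresh rate follows from the pixel clock divided by the total pixels per frame (`htotal × vtotal`). The following is a minimal sketch of that arithmetic, useful as a sanity check on parsed or driver-provided modes. The helper name `vrefresh_mhz_from_timings` is hypothetical and not part of the UmkaOS API; interlaced and doublescan adjustments are deliberately left out.

```rust
/// Hypothetical helper (illustration only): derive the refresh rate in
/// millihertz from the pixel clock and the total horizontal/vertical timings.
/// Returns `None` for degenerate timings (zero totals) or if the result does
/// not fit in a `u32`.
pub fn vrefresh_mhz_from_timings(clock_khz: u32, htotal: u16, vtotal: u16) -> Option<u32> {
    let pixels_per_frame = u64::from(htotal) * u64::from(vtotal);
    if pixels_per_frame == 0 {
        return None;
    }
    // clock_khz * 1_000 is pixels per second; another factor of 1_000
    // converts hertz to millihertz.
    let numerator = u64::from(clock_khz) * 1_000_000;
    // Round to the nearest millihertz instead of truncating.
    let mhz = (numerator + pixels_per_frame / 2) / pixels_per_frame;
    u32::try_from(mhz).ok()
}
```

For the common 1920x1080 timing with a 148500 kHz pixel clock, `htotal = 2200`, and `vtotal = 1125`, this yields 60000 (60.000 Hz), matching the `vrefresh_mhz` convention used by `DisplayMode`.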

20.4.4 Atomic Modesetting Protocol

UmkaOS uses an atomic modesetting model (same as Linux DRM atomic). Changes to the display configuration (resolution, framebuffer, connector enable/disable) are batched into a single atomic transaction. Either all changes apply or none do. This eliminates tearing and half-configured states.

// umka-core/src/display/atomic.rs

/// Atomic modesetting request.
///
/// Uses fixed-capacity `ArrayVec` instead of `Vec` to avoid heap allocation
/// on every display frame. The bounds are hardware-limited: no display
/// controller has more than `MAX_CONNECTORS` (8) connectors or `MAX_PLANES`
/// (32) planes. At 60fps+ with multiple displays, eliminating per-frame
/// heap allocation avoids allocator contention on the latency-sensitive
/// commit path.
pub struct AtomicModeset {
    /// Connector changes (enable, disable, mode change).
    pub connectors: ArrayVec<ConnectorUpdate, MAX_CONNECTORS>,
    /// Plane changes (scanout buffer, position, scaling).
    pub planes: ArrayVec<PlaneUpdate, MAX_PLANES>,
    /// Flags (bitmask of `atomic_flags` constants: test-only, allow modeset, async).
    pub flags: AtomicFlags,
}

/// Connector update (part of atomic transaction).
pub struct ConnectorUpdate {
    /// Connector ID.
    pub connector_id: u32,
    /// New mode (None = disable connector).
    pub mode: Option<DisplayMode>,
    /// CRTC to attach this connector to (if enabling).
    pub crtc_id: Option<u32>,
}

/// Plane update (part of atomic transaction).
pub struct PlaneUpdate {
    /// Plane ID.
    pub plane_id: u32,
    /// Framebuffer handle (None = disable plane).
    pub fb: Option<FramebufferHandle>,
    /// Source rectangle in framebuffer (for scaling/cropping).
    pub src: Rectangle,
    /// Destination rectangle on screen.
    pub dst: Rectangle,
}

/// Atomic modesetting flags.
pub mod atomic_flags {
    /// Test-only (validate but don't apply; used by compositors to check if mode is possible).
    pub const TEST_ONLY: u32 = 1 << 0;
    /// Allow modeset (may cause visible glitch; only allow during VT switch or initial setup).
    pub const ALLOW_MODESET: u32 = 1 << 1;
    /// Async flip (apply immediately without waiting for vblank; lower latency, may tear).
    pub const ASYNC: u32 = 1 << 2;
}

/// Rectangle (for plane src/dst).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Rectangle {
    pub x: u32,
    pub y: u32,
    pub width: u32,
    pub height: u32,
}

Atomic commit flow:

1. Wayland compositor builds an AtomicModeset transaction: "attach framebuffer FB123 to primary plane, set mode to 1920x1080@60Hz on connector 0, disable connector 1".
2. Compositor calls ioctl(dri_fd, UMKA_DRM_ATOMIC_COMMIT, &atomic_modeset) (via umka-compat DRM emulation).
3. Kernel validates the transaction:
   - Mode is supported by the connector (in the modes list from EDID).
   - Framebuffer format is supported by the plane (RGB888, XRGB8888, NV12, etc.).
   - Bandwidth is achievable (pixel clock within limits, memory bandwidth sufficient).
4. If valid, kernel programs the display controller hardware (Intel i915 writes to plane registers, GGT, pipe config; AMD writes to DCN registers).
5. Hardware scans out the new framebuffer on the next vblank (tear-free).
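The all-or-nothing property of the flow above comes from validating every part of the transaction before touching any hardware state. The sketch below illustrates that ordering with deliberately simplified stand-in types (`Mode`, `Format`, `Request`, `Hardware`, and `commit` are all hypothetical and much smaller than the real `AtomicModeset` machinery); it is an illustration of the validate-then-apply pattern, not the UmkaOS implementation.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Mode { pub hdisplay: u16, pub vdisplay: u16, pub vrefresh_mhz: u32 }

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Format { Xrgb8888, Argb8888, Nv12 }

#[derive(Debug, PartialEq, Eq)]
pub enum CommitError { UnsupportedMode, UnsupportedFormat }

/// Same bit position as `atomic_flags::TEST_ONLY` in the listing above.
pub const TEST_ONLY: u32 = 1 << 0;

/// Simplified stand-in for one connector + one plane update.
pub struct Request { pub mode: Mode, pub format: Format, pub flags: u32 }

/// Simplified stand-in for the display controller state.
pub struct Hardware {
    pub connector_modes: Vec<Mode>,
    pub plane_formats: Vec<Format>,
    pub programmed: Option<(Mode, Format)>,
}

/// Validate everything first; mutate `hw` only if every check passed and
/// the request is not a TEST_ONLY probe.
pub fn commit(hw: &mut Hardware, req: &Request) -> Result<(), CommitError> {
    if !hw.connector_modes.contains(&req.mode) {
        return Err(CommitError::UnsupportedMode);
    }
    if !hw.plane_formats.contains(&req.format) {
        return Err(CommitError::UnsupportedFormat);
    }
    if req.flags & TEST_ONLY != 0 {
        // Probe succeeded; hardware state is intentionally left untouched.
        return Ok(());
    }
    hw.programmed = Some((req.mode, req.format));
    Ok(())
}
```

A compositor would typically issue the same request twice: once with `TEST_ONLY` set to probe feasibility, then again without it to apply the configuration.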

20.4.5 Framebuffer Objects

A framebuffer is a region of GPU memory containing pixel data. The display controller's scanout engine reads from the framebuffer via DMA and sends pixels to the monitor.

// umka-core/src/display/framebuffer.rs

/// Framebuffer handle (opaque to userspace).
#[repr(C)]
pub struct FramebufferHandle(u64);

/// Framebuffer format (pixel layout).
#[repr(u32)]
pub enum FramebufferFormat {
    /// 32bpp XRGB (X=unused, R=red, G=green, B=blue; 8 bits each).
    Xrgb8888 = 0x34325258,
    /// 32bpp ARGB (with alpha channel).
    Argb8888 = 0x34325241,
    /// 24bpp RGB (no alpha, no padding).
    Rgb888 = 0x34324752,
    /// 16bpp RGB565.
    Rgb565 = 0x36314752,
    /// YUV 4:2:0 semi-planar (NV12: Y plane plus interleaved CbCr plane; for video).
    Nv12 = 0x3231564e,
}

/// Framebuffer descriptor.
pub struct Framebuffer {
    /// Handle.
    pub handle: FramebufferHandle,
    /// Width in pixels.
    pub width: u32,
    /// Height in pixels.
    pub height: u32,
    /// Pixel format.
    pub format: FramebufferFormat,
    /// Pitch (bytes per row; may be larger than width * bpp if aligned).
    pub pitch: u32,
    /// GPU memory object backing this framebuffer (for DMA-BUF export, Section 6.3/Section 20.4).
    pub mem_obj: MemoryObjectHandle,
}

Framebuffer allocation: Compositor allocates GPU memory (via the GPU driver, Section 8.1/Section 21.1), renders the desktop into it (via Vulkan/OpenGL), then creates a framebuffer object pointing to that memory and passes it to the display subsystem for scanout.
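Two details of the framebuffer descriptor lend themselves to a short worked example: the `FramebufferFormat` discriminants are little-endian FourCC codes built from four ASCII characters, and `pitch` is the row size rounded up to a hardware alignment. The sketch below shows both computations. The function names and the 64-byte alignment used in the example are assumptions for illustration; the real alignment requirement is hardware-specific.

```rust
/// Pack four ASCII characters into a little-endian FourCC code.
pub const fn fourcc(a: u8, b: u8, c: u8, d: u8) -> u32 {
    (a as u32) | ((b as u32) << 8) | ((c as u32) << 16) | ((d as u32) << 24)
}

/// Hypothetical helper: bytes per row, rounded up to `align` bytes.
/// `align` must be a non-zero power of two. Returns `None` on invalid
/// alignment or arithmetic overflow.
pub fn aligned_pitch(width: u32, bytes_per_pixel: u32, align: u32) -> Option<u32> {
    if !align.is_power_of_two() {
        return None; // also rejects align == 0
    }
    let row_bytes = width.checked_mul(bytes_per_pixel)?;
    let padded = row_bytes.checked_add(align - 1)?;
    Some(padded & !(align - 1))
}
```

For example, `fourcc(b'X', b'R', b'2', b'4')` evaluates to `0x34325258`, the `Xrgb8888` discriminant, and a 1366-pixel-wide XRGB8888 buffer with an assumed 64-byte alignment has a pitch of 5504 bytes rather than the unpadded 5464.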

20.4.6 Scanout Planes

Modern display controllers have multiple planes (hardware overlays) that can scan out independent framebuffers simultaneously:

- Primary plane: The desktop/window contents (always present).
- Cursor plane: The mouse cursor (small, can be moved with no desktop re-render).
- Overlay planes: Video playback windows (compositor passes video framebuffer directly to hardware, zero-copy).

// umka-core/src/display/plane.rs

/// Display plane (hardware overlay).
///
/// Plane state (framebuffer, position, scaling) is grouped into a single
/// `PlaneState` snapshot, swapped atomically via RCU during atomic commit.
/// This eliminates the need to acquire three separate RwLocks (fb, src, dst)
/// and guarantees readers see a consistent plane configuration.
pub struct DisplayPlane {
    /// Plane ID (unique per display device).
    pub id: u32,
    /// Plane type.
    pub plane_type: PlaneType,
    /// Supported framebuffer formats (immutable after probe).
    // ArrayVec<_, 16>: KABI-stable (no heap pointer, no runtime allocation).
    // 16 format slots cover all known display hardware (typical: 4–12 formats per plane).
    // Cannot use Vec across KABI boundary (Rust Vec layout not guaranteed stable).
    pub formats: ArrayVec<FramebufferFormat, 16>,
    /// Current plane state. Replaced atomically during modeset commit.
    /// Readers (vblank handlers, userspace queries) get a consistent
    /// snapshot via `rcu_read_lock()` — no per-field locking.
    pub state: RcuPtr<Arc<PlaneState>>,
}

/// Immutable snapshot of plane state. Created during atomic commit and
/// swapped via RCU. Freed after grace period when superseded.
pub struct PlaneState {
    /// Current framebuffer attached (None = plane disabled).
    pub fb: Option<FramebufferHandle>,
    /// Source rectangle in framebuffer (for scaling/cropping).
    pub src: Rectangle,
    /// Destination rectangle on screen.
    pub dst: Rectangle,
}

/// Plane type.
#[repr(u32)]
pub enum PlaneType {
    /// Primary plane (desktop contents).
    Primary = 0,
    /// Cursor plane (mouse cursor, small, high-priority).
    Cursor = 1,
    /// Overlay plane (video, additional window).
    Overlay = 2,
}

Cursor plane optimization: Moving the cursor only requires updating the cursor plane's dst rectangle. The compositor does NOT need to re-render the desktop or flip the primary plane. This is why cursor motion can stay smooth at the display's full refresh rate (e.g., 144Hz) even when the compositor re-renders the desktop at only 60 fps.

20.4.7 Hotplug Detection

When a display is connected (USB-C DP Alt Mode, HDMI, etc.), the display controller raises an interrupt. The driver handles hotplug in the interrupt handler:

// umka-core/src/display/hotplug.rs

/// Per-display-controller device state.
/// One instance per display controller (e.g., one per i915 GPU, one per
/// DisplayPort MST hub). Created by the display driver during probe.
pub struct DisplayDevice {
    /// Device registry handle for this display controller.
    pub device_id: DisplayDeviceId,
    /// All connectors attached to this controller (HDMI, DP, eDP, etc.).
    pub connectors: ArrayVec<DisplayConnector, MAX_CONNECTORS>,
    /// Hardware display planes available for composition.
    pub planes: ArrayVec<DisplayPlane, MAX_PLANES>,
    /// MMIO base address for display controller registers.
    pub mmio_base: u64,
    /// IRQ number for hotplug/vblank interrupts.
    pub irq: u32,
}

impl DisplayDevice {
    /// Hotplug interrupt handler (runs in Tier 1 driver domain).
    pub fn handle_hotplug_interrupt(&self) {
        // Scan all connectors for state changes.
        for connector in &self.connectors {
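            // `read_connector_state` yields a `ConnectorState`; the atomic field
            // stores its `u32` discriminant, so compare and store as `u32`.
            let new_state = self.read_connector_state(connector.id) as u32;
            let old_state = connector.state.load(Ordering::Acquire);

            if new_state != old_state {
                connector.state.store(new_state, Ordering::Release);

                if new_state == ConnectorState::Connected as u32 {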
                    // Display connected: read EDID, parse modes.
                    // Build a new ConnectorProps snapshot and swap it into
                    // the RCU pointer atomically; readers are always lock-free.
                    if let Ok(edid) = self.read_edid(connector.id) {
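                        // `parse_edid` returns a `Result`; a blob that fails
                        // validation yields an empty mode list rather than
                        // aborting hotplug handling.
                        let modes = parse_edid(&edid).unwrap_or_else(|_| ArrayVec::new());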
                        let new_props = Arc::new(ConnectorProps {
                            edid: Some(edid),
                            modes,
                            active_mode: None,
                        });
                        connector.props.swap(new_props);
                    }
                    // Post hotplug event to userspace.
                    self.post_hotplug_event(connector.id, HotplugEventType::Connected);
                } else {
                    // Display disconnected: replace props with an empty snapshot.
                    let new_props = Arc::new(ConnectorProps {
                        edid: None,
                        modes: ArrayVec::new(),
                        active_mode: None,
                    });
                    connector.props.swap(new_props);
                    self.post_hotplug_event(connector.id, HotplugEventType::Disconnected);
                }
            }
        }
    }
}

Compositor response: When the compositor receives a hotplug event (via the event ring buffer), it:

1. Re-enumerates connectors and modes (ioctl(DRM_IOCTL_MODE_GETRESOURCES)).
2. Decides how to configure the new display (extended desktop, mirror, ignore).
3. Allocates new framebuffers (if needed) for the new resolution.
4. Submits an atomic modesetting request to enable the new connector.

parse_edid() — EDID Parsing Specification

parse_edid() is called during hotplug handling (see handle_hotplug_interrupt above) to convert raw EDID bytes read from the monitor's I2C DDC bus into a list of display modes.

/// Parse an EDID (Extended Display Identification Data) blob into display modes.
///
/// Supports EDID 1.0–1.4 (128 bytes) and E-EDID (DisplayID, CTA-861 extensions,
/// up to 512 bytes). Input is the raw bytes read from the monitor's I2C DDC bus.
///
/// # Algorithm
///
/// 1. **Header validation**: First 8 bytes must be `[0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00]`.
///    Return `Err(EdidError::InvalidHeader)` if not.
///
/// 2. **Checksum**: Sum all 128 bytes; result must be 0 (mod 256).
///    Return `Err(EdidError::BadChecksum)` if not.
///
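/// 3. **Established timings** (bytes 35–37, a 24-bit bitmap):
///    Bytes 35–36 and bit 7 of byte 37 map to 17 well-known modes; bits 6–0 of
///    byte 37 are manufacturer-specific and have no standard meaning.
///    Bit map to modes: bit 7 of byte 35 = 720×400@70Hz, bit 6 = 720×400@88Hz, ...
///    (See VESA EDID standard Table 3.20 for full mapping.)
///    Add each set bit as a `DisplayMode` to the output list.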
///
/// 4. **Standard timing descriptors** (bytes 38–53, 8 entries × 2 bytes):
///    Each entry encodes horizontal active pixels and aspect ratio + refresh rate.
///    Skip entries equal to `0x0101` (unused).
///    Formula: `h_active = (byte0 + 31) * 8; v_active = h_active / aspect_ratio;
///               refresh = (byte1 & 0x3F) + 60`
///
/// 5. **Detailed timing descriptors** (bytes 54–125, 4 × 18-byte blocks):
///    Each 18-byte block is either a display descriptor (first two bytes, the
///    pixel-clock field, equal to 0x0000) or a detailed timing (pixel clock
///    non-zero). Testing only the first byte is not sufficient, because a valid
///    pixel clock can have a zero low byte. Detailed timings encode pixel clock,
///    h/v active, h/v blanking, sync polarity, and flags for interlaced/stereo.
///    Parse each detailed timing as a `DisplayMode`.
///
/// 6. **CEA/CTA extensions** (each extension block is 128 bytes, same checksum rule):
///    Tag byte 0x02 = CEA-861 extension. Parse Video Data Block (tag=2), short
///    video descriptors (SVDs), and native mode indicator. VIC (Video Identification
///    Code) → mode lookup table (CEA-861-F Table 1).
///
/// # Output
///
/// On success, returns `ArrayVec<DisplayMode, 64>`: up to 64 modes. Modes are sorted by
/// descending priority: detailed timings first (native mode = the SVD flagged native
/// by bit 7 of its byte in the CEA Video Data Block, or else the first detailed
/// timing), then established timings, then standard timings.
/// Duplicate modes (same h×v×refresh) are deduplicated; the one from the highest-
/// priority source is kept.
///
/// # Error handling
/// Returns `Err(EdidError)` only when the buffer is shorter than 128 bytes or the
/// base 128-byte block fails header/checksum validation. Invalid or unrecognized
/// descriptor blocks are skipped silently (a partial mode list is better than no
/// modes at all).
pub fn parse_edid(raw: &[u8]) -> Result<ArrayVec<DisplayMode, 64>, EdidError>;

/// Errors returned by `parse_edid()`.
#[derive(Debug)]
pub enum EdidError {
    /// Buffer too short (< 128 bytes).
    TooShort,
    /// Magic header bytes wrong (first 8 bytes are not the EDID header pattern).
    InvalidHeader,
    /// Checksum over 128 bytes != 0 mod 256.
    BadChecksum,
}
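Steps 1, 2, and 4 of the algorithm above are mechanical enough to sketch directly. The following is a minimal, self-contained illustration of base-block validation and standard-timing decoding; it is not the UmkaOS `parse_edid()` implementation. The function names `validate_base_block` and `decode_standard_timing` are hypothetical, `EdidError` is redeclared locally so the block stands alone, and the aspect-ratio table assumes EDID 1.3 or later (where bit pattern 00 means 16:10).

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum EdidError { TooShort, InvalidHeader, BadChecksum }

const EDID_HEADER: [u8; 8] = [0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00];

/// Length, header, and checksum validation of the 128-byte base block.
pub fn validate_base_block(raw: &[u8]) -> Result<(), EdidError> {
    if raw.len() < 128 {
        return Err(EdidError::TooShort);
    }
    if raw[..8] != EDID_HEADER[..] {
        return Err(EdidError::InvalidHeader);
    }
    // All 128 bytes, including the checksum byte, must sum to 0 mod 256.
    let sum = raw[..128].iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    if sum != 0 {
        return Err(EdidError::BadChecksum);
    }
    Ok(())
}

/// Decode one 2-byte standard timing into (h_active, v_active, refresh_hz).
/// Returns `None` for the unused marker 0x0101.
pub fn decode_standard_timing(byte0: u8, byte1: u8) -> Option<(u32, u32, u32)> {
    if byte0 == 0x01 && byte1 == 0x01 {
        return None;
    }
    let h_active = (u32::from(byte0) + 31) * 8;
    // Bits 7..6 of byte1 select the aspect ratio (EDID 1.3+ meaning).
    let (num, den) = match byte1 >> 6 {
        0 => (16, 10),
        1 => (4, 3),
        2 => (5, 4),
        _ => (16, 9),
    };
    let v_active = h_active * den / num;
    let refresh_hz = u32::from(byte1 & 0x3F) + 60;
    Some((h_active, v_active, refresh_hz))
}
```

As a usage example, the byte pair `0xD1, 0xC0` decodes to 1920x1080 at 60 Hz: `(0xD1 + 31) * 8 = 1920`, the top bits `11` select 16:9, and the low six bits are zero.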

20.4.8 Panel Self-Refresh (PSR)

When the compositor has not updated the framebuffer (static desktop), the display controller can enter Panel Self-Refresh mode:

- The monitor's internal controller (eDP panel, DP monitor with PSR support) caches the last frame.
- The GPU's scanout engine stops reading from VRAM (memory bandwidth saved).
- The GPU's memory controller enters a low-power state (watts saved).

When the compositor updates the framebuffer (user moves the mouse, window animates), the display driver detects the change (via atomic commit) and exits PSR mode, resuming scanout.

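Power savings: PSR saves roughly 1-2W on a laptop while the screen content is static (reading a document, idling on the desktop). Video playback does not count as static, because every new video frame is a framebuffer update. The saving can extend battery life by ~10-15% for typical office workloads.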

20.4.9 Variable Refresh Rate (VRR)

Modern monitors support VRR (FreeSync, G-Sync, HDMI VRR): the display refreshes at variable intervals (e.g., 40-144Hz) synchronized with the compositor's render rate. This eliminates tearing without vsync's fixed-cadence latency.

// umka-core/src/display/vrr.rs

/// VRR mode.
#[repr(u32)]
pub enum VrrMode {
    /// VRR disabled (fixed refresh rate).
    Disabled = 0,
    /// VRR enabled (variable refresh within supported range).
    Enabled = 1,
}

impl DisplayConnector {
    /// Enable or disable VRR. Enabling requires an active mode that advertises VRR.
    pub fn set_vrr(&self, enabled: bool) -> Result<(), DisplayError> {
        // Read the RCU-protected ConnectorProps snapshot (lock-free; safe in interrupt context).
        let props = self.props.read();
        let mode = props.active_mode.ok_or(DisplayError::NoActiveMode)?;
        // Only enabling needs the capability check; disabling must always be allowed.
        if enabled && (mode.flags & mode_flags::VRR) == 0 {
            return Err(DisplayError::VrrNotSupported);
        }
        // Program display controller to enable VRR (DP Adaptive-Sync, HDMI VRR, or FreeSync).
        // NOTE: `self.driver` is the connector's handle to its display driver; the
        // field is not shown in the `DisplayConnector` listing above.
        let vrr_mode = if enabled { VrrMode::Enabled } else { VrrMode::Disabled };
        self.driver.set_vrr_mode(self.id, vrr_mode)
    }
}

Compositor use: Wayland compositors (KWin, Mutter, wlroots) detect VRR capability from the display mode flags, enable it, and schedule presentation to match the compositor's render loop (unlocked framerate, no vsync wait).

20.4.10 VBlank Handling and Synchronization

VBlank (vertical blanking interval) is the fundamental display timing primitive. The display controller generates a VBlank interrupt at the start of each blanking interval (between the last scanline of one frame and the first scanline of the next). All page flips, cursor moves, and mode changes are synchronized to VBlank to avoid tearing.

// umka-core/src/display/vblank.rs

/// VBlank event delivered to userspace via the event ring buffer.
#[repr(C)]
pub struct VblankEvent {
    /// Event type discriminant (for the generic event ring).
    pub event_type: u32,            // = EVENT_TYPE_VBLANK (0x02)
    /// Size of this event struct in bytes.
    pub length: u32,                // = 32
    /// Monotonic timestamp (ns) at which VBlank occurred (from CLOCK_MONOTONIC).
    pub timestamp_ns: u64,
    /// VBlank sequence counter (monotonically increasing, wraps at u64::MAX).
    pub sequence: u64,
    /// CRTC ID that generated this VBlank.
    pub crtc_id: u32,
    /// Padding.
    pub _pad: u32,
}

/// Per-CRTC VBlank tracking state (kernel-internal).
pub struct VblankState {
    /// Monotonically increasing VBlank counter.
    pub count: AtomicU64,
    /// Timestamp of the most recent VBlank (ns, CLOCK_MONOTONIC).
    pub last_timestamp_ns: AtomicU64,
    /// Wait queue for threads blocked on VBlank (epoll_wait, ioctl WAITVBLANK).
    pub waiters: WaitQueue,
    /// Number of userspace clients requesting VBlank events on this CRTC.
    /// When zero, the kernel masks the VBlank interrupt to save power.
    pub event_refcount: AtomicU32,
    /// Whether this CRTC is actively scanning out (false during DPMS off/suspend).
    pub enabled: AtomicBool,
}

VBlank interrupt handler (runs in Tier 1 driver domain):

VBlank IRQ fires:
  1. Read hardware VBlank status register, acknowledge interrupt.
  2. Increment vblank_state.count (atomic).
  3. Store current timestamp in vblank_state.last_timestamp_ns (atomic).
  4. If a page flip was pending (committed via atomic modesetting with !ASYNC flag):
     a. The hardware has latched the new framebuffer address — the flip is complete.
     b. Post a PAGE_FLIP_COMPLETE event to the compositor's event ring buffer.
     c. Release the old framebuffer's reference (it is no longer being scanned out).
  5. If vblank_state.event_refcount > 0:
     a. Post a VblankEvent to each subscribed client's event ring buffer.
  6. Wake all threads on vblank_state.waiters.

VBlank event delivery to userspace: Compositors subscribe to VBlank events via ioctl(dri_fd, UMKA_DRM_CRTC_ENABLE_VBLANK, crtc_id). Events are delivered through the DRM file descriptor's event ring buffer (readable via read(2) or epoll). When the last subscriber unsubscribes, the kernel masks the VBlank interrupt to avoid unnecessary IRQ overhead on idle displays.

VBlank-synchronized page flips: When the compositor submits an atomic commit without the ASYNC flag, the kernel programs the new framebuffer address into a shadow register. The hardware latches the shadow register on the next VBlank, atomically switching scanout to the new framebuffer. The compositor blocks (or polls) until the PAGE_FLIP_COMPLETE event confirms the flip.

VBlank Event Ring Specification:

/// Per-CRTC VBlank event ring. Written by the display interrupt handler;
/// read by compositors and frame-synchronization tools.
pub struct VblankEventRing {
    /// Ring buffer entries. Fixed size: 64 events × 32 bytes = 2 KiB per CRTC.
    /// 64 entries provide ~1 second of headroom at 60Hz (64 refresh periods ≈
    /// 1.07 s) before a stalled compositor starts losing events.
    pub ring: RingBuffer<VblankEvent, 64>,
    /// Subscribers waiting for the next VBlank (list of tasks blocked on poll/select
    /// or io_uring POLL_ADD for the CRTC's event fd).
    pub waiters: WaitQueue,
    /// Monotonic VBlank counter. Wraps at u64::MAX (~4.9 billion years at 120Hz).
    pub sequence: AtomicU64,
    /// Timestamp of the last VBlank (CLOCK_MONOTONIC nanoseconds).
    pub last_vblank_ns: AtomicU64,
}

Subscription mechanism:

/// Request a VBlank notification for a specific CRTC.
/// Returns a subscription that becomes readable when the next VBlank fires.
/// Equivalent to DRM_IOCTL_WAIT_VBLANK with DRM_VBLANK_EVENT flag.
pub fn drm_vblank_subscribe(crtc_id: u32) -> Result<VblankSubscription, DrmError>;

pub struct VblankSubscription {
    /// Event fd: readable when VBlank fires (or immediately if missed_vblanks > 0).
    pub event_fd: EventFd,
    /// If > 0, this subscription was registered late and missed this many VBlanks.
    /// The first read from event_fd will return immediately to signal the miss.
    pub missed_vblanks: u32,
}

Overflow behavior: When the 64-entry ring overflows (compositor not reading fast enough):

- The oldest event is overwritten (ring buffer semantics; latest events take priority).
- The missed_count field on the subscription's next event is set to the number of overwritten events.
- The compositor can detect overflow by checking event.sequence != last_sequence + 1.
- The compositor cannot block the interrupt handler: the ring is lock-free (SPSC; interrupt handler is the single producer, compositor is the single consumer per subscription). Overflow silently drops old events and never stalls the producer.

Interrupt path: The display interrupt handler directly writes to VblankEventRing.ring using the interrupt-safe SPSC write path (no allocation, O(1)). Then calls WaitQueue::wake_all(&ring.waiters) to wake subscribed compositors.
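The overwrite-oldest policy and the sequence-gap check can be illustrated with a small single-threaded model. This sketch is not the lock-free `VblankEventRing` (it uses plain `&mut self` access and stores only sequence numbers); `OverwriteRing` and `missed_between` are hypothetical names chosen for the example. It shows only how a consumer recovers the number of dropped events from the `sequence` field.

```rust
/// Single-threaded model of an overwrite-oldest ring holding VBlank
/// sequence numbers. `N` must be greater than zero.
pub struct OverwriteRing<const N: usize> {
    slots: [u64; N],
    head: u64, // total number of events ever written
    tail: u64, // index of the next event to read
}

impl<const N: usize> OverwriteRing<N> {
    pub fn new() -> Self {
        assert!(N > 0, "ring capacity must be non-zero");
        Self { slots: [0; N], head: 0, tail: 0 }
    }

    /// Producer side: never fails; on overflow the oldest entry is dropped.
    pub fn push(&mut self, sequence: u64) {
        let cap = N as u64;
        self.slots[(self.head % cap) as usize] = sequence;
        self.head += 1;
        if self.head - self.tail > cap {
            // The slot at the old tail was just overwritten; skip past it.
            self.tail = self.head - cap;
        }
    }

    /// Consumer side: returns the oldest surviving event, if any.
    pub fn pop(&mut self) -> Option<u64> {
        if self.tail == self.head {
            return None;
        }
        let cap = N as u64;
        let sequence = self.slots[(self.tail % cap) as usize];
        self.tail += 1;
        Some(sequence)
    }
}

/// Number of events lost between two consecutively observed sequence numbers.
pub fn missed_between(last_sequence: u64, event_sequence: u64) -> u64 {
    event_sequence.wrapping_sub(last_sequence).wrapping_sub(1)
}
```

With a capacity of 4 and six pushed events (sequences 1 through 6), the first event the consumer sees is sequence 3; comparing it with the last sequence it processed (0) reveals that two events were overwritten.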

20.4.11 Multi-Monitor Coordination

A display controller typically has multiple CRTCs (CRT Controllers — the name is historical; they drive flat panels too). Each CRTC is an independent timing generator that scans out one framebuffer to one or more connectors. The mapping is:

Display Pipeline:
  Planes → CRTC → Encoder → Connector → Monitor
             │
             ├── Each CRTC has independent timing (mode, refresh rate, VBlank)
             ├── Each CRTC owns a set of planes (primary + optional cursor/overlay)
             └── Multiple connectors can share a CRTC (clone/mirror mode)

Color Management: GammaLut and CrtcColorProperties

Before the CRTC structs, this section defines the color management types used by CrtcState. These types represent the CRTC-level display pipeline color correction stages: de-gamma (linearization), CTM (color space conversion), and gamma (re-gamma for display encoding). They match the Linux DRM color management ABI so that Wayland compositors and color management tools (colord, icc-profiles) work unmodified.

Display color pipeline (stages applied in hardware order):

Plane pixels (encoded, e.g., sRGB)
  → per-plane tone mapping (optional, plane-level property)
  → alpha compositing / blending
  → CRTC degamma LUT  (encoded → linear light, using degamma_lut)
  → CTM               (color space conversion, e.g., sRGB → display native)
  → CRTC gamma LUT    (linear light → display encoding, using gamma_lut)
  → scanout to panel

// umka-core/src/display/color.rs

/// Single entry in a hardware gamma lookup table.
///
/// Maps one input intensity level to per-channel output intensities.
/// The hardware applies the LUT per channel: `R_out = lut.entries[R_in >> shift].red`,
/// where `shift` accounts for the difference between input bit depth (e.g., 10-bit
/// pipe) and the LUT size (e.g., 256 entries → shift = 2 for a 10-bit pipe).
///
/// Layout matches Linux's `struct drm_color_lut` (see `include/uapi/drm/drm_mode.h`)
/// for binary ABI compatibility with Wayland compositors, Xorg, and color
/// management daemons that set gamma via `DRM_IOCTL_MODE_SETGAMMA` or the
/// atomic `DRM_IOCTL_MODE_ATOMIC` with the `GAMMA_LUT` CRTC property blob.
#[repr(C)]
pub struct GammaLutEntry {
    /// Red channel output value (16-bit, linear, range 0..=65535).
    pub red: u16,
    /// Green channel output value (16-bit, linear, range 0..=65535).
    pub green: u16,
    /// Blue channel output value (16-bit, linear, range 0..=65535).
    pub blue: u16,
    /// Reserved for alignment; must be zero.
    /// (Matches the padding in `struct drm_color_lut` for ABI compatibility.)
    pub _reserved: u16,
}

/// Color correction gamma LUT (Look-Up Table).
/// Pre-allocated at display device initialization to the hardware's LUT capacity.
/// `Box<[GammaLutEntry]>` over `Vec<GammaLutEntry>`: allocated once at init,
/// never resized. Atomic modeset context writes into the pre-allocated slice
/// without risk of allocation failure.
///
/// Used for both the gamma LUT (post-blend, re-encodes into display gamma) and
/// the de-gamma LUT (applied after blending and ahead of the CTM; linearizes sRGB input). The number of entries
/// is hardware-dependent — query via `CrtcProperties::gamma_lut_size` (the
/// read-only `GAMMA_LUT_SIZE` DRM CRTC property).
///
/// The kernel validates that `count <= entries.len()` on every atomic
/// commit that includes a `GAMMA_LUT` or `DEGAMMA_LUT` property update.
/// Mismatched sizes are rejected with `-EINVAL`.
///
/// **Default (linear) LUT**: When `CrtcColorProperties::gamma_lut` is `None`,
/// the hardware applies a linear identity mapping: entry `i` maps to
/// `i * 65535 / (size - 1)` per channel. This is the power-on default and the
/// behavior when color management is not requested.
///
/// **Allocation**: The `Box<[GammaLutEntry]>` is allocated once in
/// `drm_device_init()` when the driver reads the hardware's `gamma_size` from
/// the display controller (the same value userspace sees through the read-only
/// `GAMMA_LUT_SIZE` property).
/// After initialization the slice is never reallocated; atomic modesetting
/// paths only update entries within the already-allocated slice, so there is
/// no allocation failure path in the atomic commit code.
pub struct GammaLut {
    /// Pre-allocated LUT entries. Capacity = hardware LUT size (typically 256
    /// or 1024 per channel, read from the display hardware by the driver at init).
    /// Entries in order from darkest (index 0, input = black) to
    /// brightest (index count-1, input = full intensity).
    pub entries: Box<[GammaLutEntry]>,
    /// Number of valid entries (≤ entries.len()). Must match the hardware's
    /// `GAMMA_LUT_SIZE` property for the target CRTC on every atomic commit.
    pub count: u32,
}

/// 3×3 color transform matrix (CTM) in S31.32 fixed-point format.
///
/// Applied between the de-gamma and gamma stages to convert between color spaces
/// (e.g., sRGB → DCI-P3, BT.709 → BT.2020, or ICC profile adjustments).
///
/// Entry `matrix[i][j]` is the contribution of input channel `j` to output
/// channel `i`, where channels are ordered R=0, G=1, B=2. The fixed-point
/// format is S31.32 in sign-magnitude form (not two's complement): bit 63 is
/// the sign, bits 62..32 are the integer part, bits 31..0 are the fractional
/// part. A negative coefficient therefore cannot be produced by negating the
/// raw `i64`. This matches the Linux DRM CTM property blob
/// layout (`struct drm_color_ctm`, `include/uapi/drm/drm_mode.h`).
///
/// Identity matrix (no color conversion):
/// ```
/// matrix = [[1<<32, 0, 0],
///            [0, 1<<32, 0],
///            [0, 0, 1<<32]]
/// ```
#[repr(C)]
pub struct ColorTransformMatrix {
    /// Row-major 3×3 matrix. `matrix[output_channel][input_channel]`.
    pub matrix: [[i64; 3]; 3],
}

/// CRTC color management properties.
///
/// Grouped as a sub-struct within `CrtcState` so that all color properties
/// are updated atomically as part of an RCU-swapped state snapshot. This
/// prevents a race where gamma is updated but the CTM is not yet applied,
/// which would briefly produce incorrect colors on a live display.
pub struct CrtcColorProperties {
    /// Gamma LUT for post-blending correction (CRTC-level re-encoding).
    /// Applied after the CTM, converts linear light to display-encoded values.
    /// `None` means linear (no gamma correction — hardware applies identity LUT).
    /// Set via the atomic `GAMMA_LUT` CRTC property blob.
    pub gamma_lut: Option<GammaLut>,
    /// De-gamma LUT for linearization (CRTC-level).
    /// Applied after plane blending and before the CTM (see the pipeline diagram
    /// above); converts sRGB-encoded pixels to linear light so that the CTM
    /// operates on linear values.
    /// `None` means input is treated as linear (no de-gamma applied).
    /// Set via the atomic `DEGAMMA_LUT` CRTC property blob.
    pub degamma_lut: Option<GammaLut>,
    /// Color transform matrix (CTM) for color space conversion.
    /// Applied between de-gamma and gamma stages.
    /// `None` means identity (no color space conversion).
    /// Set via the atomic `CTM` CRTC property blob.
    pub ctm: Option<ColorTransformMatrix>,
}
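The default identity mapping described in the `GammaLut` documentation (entry `i` maps to `i * 65535 / (size - 1)`) can be generated as follows. This is a minimal sketch: `identity_lut` is a hypothetical helper, `GammaLutEntry` is redeclared locally so the block stands alone, and a `Vec` is used for brevity where the kernel would fill its pre-allocated slice.

```rust
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct GammaLutEntry {
    pub red: u16,
    pub green: u16,
    pub blue: u16,
    pub _reserved: u16,
}

/// Build a linear identity LUT with `size` entries.
/// Returns an empty table for `size < 2`, where the formula is undefined.
pub fn identity_lut(size: usize) -> Vec<GammaLutEntry> {
    if size < 2 {
        return Vec::new();
    }
    let last = (size - 1) as u64;
    (0..size as u64)
        .map(|i| {
            // i * 65535 / (size - 1): 0 at the darkest entry, 65535 at the brightest.
            let v = (i * 65535 / last) as u16;
            GammaLutEntry { red: v, green: v, blue: v, _reserved: 0 }
        })
        .collect()
}
```

For a 256-entry table the step between adjacent entries is exactly 257 (65535 / 255), so entry 128 holds 32896 in each channel.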

Linux DRM ABI compatibility:

- GammaLutEntry layout matches struct drm_color_lut exactly (field order and sizes are identical, including the 16-bit reserved padding field).
- ColorTransformMatrix layout matches struct drm_color_ctm (nine S31.32 values in row-major order).
- Gamma and de-gamma LUTs are set as blob properties via DRM_IOCTL_MODE_ATOMIC with the CRTC property names GAMMA_LUT and DEGAMMA_LUT.
- The read-only CRTC property GAMMA_LUT_SIZE (and DEGAMMA_LUT_SIZE if the hardware has a separate de-gamma LUT) reports the hardware LUT size in entries.
- The legacy DRM_IOCTL_MODE_SETGAMMA gamma interface (which passes a simple 256-entry RGB table) is translated internally to a GammaLut with count = 256 (written into a pre-allocated slice of at least 256 entries) and applied as the gamma_lut property with degamma_lut = None, matching Linux behavior.
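One ABI detail is easy to get wrong when filling a CTM blob: Linux's `drm_mode.h` documents the `struct drm_color_ctm` values as S31.32 sign-magnitude, not two's complement. The sketch below converts between `f64` coefficients and that raw encoding. The helper names `to_s31_32` and `from_s31_32` are hypothetical, and the raw value is carried as `u64` here to make the bit layout explicit.

```rust
const SIGN_BIT: u64 = 1 << 63;
const MAGNITUDE_MASK: u64 = !SIGN_BIT;
const ONE: f64 = 4_294_967_296.0; // 2^32, i.e. 1.0 in S31.32

/// Encode a coefficient as S31.32 sign-magnitude.
/// Out-of-range magnitudes saturate; NaN encodes as zero.
pub fn to_s31_32(value: f64) -> u64 {
    // Float-to-integer `as` casts saturate in Rust and map NaN to 0.
    let magnitude = ((value.abs() * ONE).round() as u64) & MAGNITUDE_MASK;
    if value < 0.0 && magnitude != 0 {
        magnitude | SIGN_BIT
    } else {
        magnitude
    }
}

/// Decode an S31.32 sign-magnitude value back to `f64`.
pub fn from_s31_32(raw: u64) -> f64 {
    let magnitude = (raw & MAGNITUDE_MASK) as f64 / ONE;
    if raw & SIGN_BIT != 0 { -magnitude } else { magnitude }
}

/// Row-major identity matrix in raw S31.32 form.
pub fn identity_ctm() -> [[u64; 3]; 3] {
    let one = to_s31_32(1.0);
    [[one, 0, 0], [0, one, 0], [0, 0, one]]
}
```

Note that `-1.0` encodes as `(1 << 63) | (1 << 32)`, whereas negating `1 << 32` as a two's-complement `i64` would produce a different bit pattern that hardware expecting sign-magnitude would misread.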

// umka-core/src/display/crtc.rs

/// CRTC (display timing generator).
///
/// All mutable CRTC properties (mode, plane assignments, connectors, gamma)
/// are grouped into a single `CrtcState` snapshot, swapped atomically via
/// RCU during modeset commit. This eliminates four separate RwLocks and
/// guarantees readers see a fully consistent CRTC configuration — no
/// half-applied modeset where `active_mode` is updated but `planes` is stale.
pub struct Crtc {
    /// CRTC index (0..num_crtcs-1, unique per display device).
    pub id: u32,
    /// VBlank tracking for this CRTC (independent lifecycle, not part
    /// of modeset state — vblank counters increment continuously).
    pub vblank: VblankState,
    /// Current CRTC state. Replaced atomically during modeset commit.
    /// VBlank handlers and userspace queries read lock-free via RCU.
    pub state: RcuPtr<Arc<CrtcState>>,
}

/// Immutable snapshot of CRTC configuration. Created during atomic commit
/// and swapped via RCU. Freed after grace period when superseded.
pub struct CrtcState {
    /// Current display mode (None = CRTC disabled).
    pub active_mode: Option<DisplayMode>,
    /// Planes assigned to this CRTC.
    pub planes: ArrayVec<u32, MAX_PLANES_PER_CRTC>,
    /// Connectors currently routed to this CRTC.
    pub connectors: ArrayVec<u32, MAX_CONNECTORS_PER_CRTC>,
    /// Color management properties (degamma LUT, CTM, gamma LUT).
    /// All three stages are updated atomically as part of this state snapshot.
    /// See `CrtcColorProperties` and the color pipeline diagram above.
    pub color: CrtcColorProperties,
}

/// Maximum CRTCs per display device (i915 = 4, AMD = 6, typical).
pub const MAX_CRTCS: usize = 8;
/// Maximum planes per CRTC (primary + cursor + overlays).
pub const MAX_PLANES_PER_CRTC: usize = 8;
/// Maximum connectors per CRTC (for clone/mirror).
pub const MAX_CONNECTORS_PER_CRTC: usize = 4;
/// Maximum connectors per display device.
pub const MAX_CONNECTORS: usize = 8;
/// Maximum planes per display device.
pub const MAX_PLANES: usize = 32;
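
The snapshot-swap pattern described for Crtc and CrtcState can be illustrated in user-space Rust. This is a minimal sketch, not the kernel implementation: std::sync::RwLock<Arc<...>> stands in for RcuPtr (real RCU readers take no lock at all), and CrtcSketch, StateSketch, snapshot, and commit are hypothetical names. It shows the property the design relies on: a reader holding an old snapshot never observes a half-applied update.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for CrtcState (assumed fields, for illustration only).
#[derive(Clone, Debug, PartialEq)]
pub struct StateSketch {
    /// (hdisplay, vdisplay, refresh_hz); None = CRTC disabled.
    pub active_mode: Option<(u32, u32, u32)>,
    pub planes: Vec<u32>,
}

pub struct CrtcSketch {
    /// RwLock<Arc<_>> approximates RcuPtr<Arc<_>> in user space.
    state: RwLock<Arc<StateSketch>>,
}

impl CrtcSketch {
    pub fn new(initial: StateSketch) -> Self {
        Self { state: RwLock::new(Arc::new(initial)) }
    }

    /// Reader side: take a reference to the current immutable snapshot.
    pub fn snapshot(&self) -> Arc<StateSketch> {
        let guard = self.state.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Writer side: copy the current state, apply every change to the copy,
    /// then publish the copy in one step. Returns the superseded snapshot,
    /// which is freed once the last reader drops it.
    pub fn commit(&self, update: impl FnOnce(&mut StateSketch)) -> Arc<StateSketch> {
        let mut guard = self.state.write().unwrap();
        let mut next = StateSketch::clone(&**guard);
        update(&mut next);
        std::mem::replace(&mut *guard, Arc::new(next))
    }
}
```

A caller that changes both the mode and the plane list inside one commit closure guarantees that no reader sees the new mode paired with the old planes; the Arc reference count plays the role that the RCU grace period plays in the kernel.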

Plane-to-CRTC assignment: Not all planes can drive all CRTCs. Each plane has a possible_crtcs bitmask (set by the driver during probe) indicating which CRTCs it can be attached to. The atomic commit validator checks this constraint. Example: on an Intel Gen12 GPU, the cursor plane for pipe A cannot be assigned to pipe B.
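
As a rough illustration of the possible_crtcs check, the sketch below assumes that bit i of the mask corresponds to CRTC index i. PlaneAssignment, plane_can_drive, and validate_assignments are hypothetical names chosen for this example.

```rust
/// One requested plane-to-CRTC attachment in a commit (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub struct PlaneAssignment {
    pub plane_id: u32,
    /// Bitmask set by the driver at probe time; bit i = CRTC index i (assumed).
    pub possible_crtcs: u32,
    /// CRTC index the commit wants to attach this plane to.
    pub crtc_index: u32,
}

/// True if the plane's mask allows the given CRTC index.
pub fn plane_can_drive(possible_crtcs: u32, crtc_index: u32) -> bool {
    crtc_index < u32::BITS && (possible_crtcs & (1u32 << crtc_index)) != 0
}

/// Reject the whole commit if any plane targets a CRTC outside its mask.
/// On failure, returns the id of the first offending plane.
pub fn validate_assignments(assignments: &[PlaneAssignment]) -> Result<(), u32> {
    match assignments
        .iter()
        .find(|a| !plane_can_drive(a.possible_crtcs, a.crtc_index))
    {
        Some(bad) => Err(bad.plane_id),
        None => Ok(()),
    }
}
```

In the Gen12 example, a cursor plane tied to pipe A would carry a mask of 0b001, so a request to attach it to CRTC index 1 fails validation.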

Bandwidth validation: When an atomic commit enables multiple CRTCs at high resolutions, the kernel validates that the total scanout bandwidth does not exceed the display controller's memory bandwidth limit:

bandwidth_check(commit):
  total_bw = 0
  for each active CRTC in commit:
    mode = crtc.active_mode
    // Format of the framebuffer scanned out on this CRTC
    bytes_per_pixel = crtc.framebuffer.format.bytes_per_pixel()
    // Pixel clock (Hz) x bytes per pixel = bytes/sec
    total_bw += mode.clock_khz * 1000 * bytes_per_pixel
  if total_bw > display_device.max_scanout_bandwidth:
    return Err(DisplayError::InsufficientBandwidth)
  return Ok

This prevents configurations like 4x 4K@120Hz on a controller that can only sustain 2x 4K@120Hz, which would cause visual corruption or FIFO underruns.
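
The same check can be sketched in Rust with overflow-safe arithmetic. ScanoutSketch and BandwidthError are hypothetical types for this example, and the model is deliberately simple (one framebuffer per CRTC, pixel clock times bytes per pixel). As a worked figure, assuming the CTA-861 4K@60 pixel clock of 594 MHz and a 4-byte pixel format, one CRTC needs 594,000 kHz x 1000 x 4 = 2,376,000,000 bytes/sec.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum BandwidthError {
    InsufficientBandwidth,
    /// The sum does not fit in u64 (malformed mode data).
    Overflow,
}

/// Per-CRTC scanout parameters (illustrative).
pub struct ScanoutSketch {
    /// Pixel clock in kHz, as in DisplayMode.clock_khz.
    pub clock_khz: u32,
    /// Bytes per pixel of the scanned-out framebuffer format.
    pub bytes_per_pixel: u32,
}

/// Sum bytes/sec over all active CRTCs and compare against the limit.
/// Returns the total on success so callers can log the headroom.
pub fn bandwidth_check(
    active: &[ScanoutSketch],
    max_bytes_per_sec: u64,
) -> Result<u64, BandwidthError> {
    let mut total: u64 = 0;
    for s in active {
        let bw = u64::from(s.clock_khz)
            .checked_mul(1000)
            .and_then(|hz| hz.checked_mul(u64::from(s.bytes_per_pixel)))
            .ok_or(BandwidthError::Overflow)?;
        total = total.checked_add(bw).ok_or(BandwidthError::Overflow)?;
    }
    if total > max_bytes_per_sec {
        Err(BandwidthError::InsufficientBandwidth)
    } else {
        Ok(total)
    }
}
```

Because the pixel clock includes blanking intervals, this estimate is an upper bound on the actual memory fetch rate, which makes the check conservative.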

Independent timing: Each CRTC runs at its own refresh rate. A laptop with a 120Hz internal panel (eDP) and a 60Hz external monitor (HDMI) has two CRTCs with independent VBlank timing. The compositor receives separate VBlank events for each and renders at independent cadences.

20.4.12 Display Register Abstraction

Display drivers access hardware via MMIO-mapped registers. To maintain the tier isolation model and support multiple display controller families, register access is abstracted behind a per-driver operations table:

// umka-core/src/display/hw.rs

/// Display hardware operations — implemented by each display driver
/// (i915, amdgpu, nouveau, etc.). Passed to the display core during probe.
#[repr(C)]
pub struct DisplayHwOps {
    /// Write a 32-bit value to a display register (MMIO offset from base).
    pub reg_write32: unsafe extern "C" fn(ctx: *mut c_void, offset: u32, value: u32),
    /// Read a 32-bit value from a display register.
    pub reg_read32: unsafe extern "C" fn(ctx: *mut c_void, offset: u32) -> u32,
    /// Program a CRTC's timing generator with the given mode.
    /// The driver translates DisplayMode into hardware-specific register values
    /// (PLL dividers, pipe timings, sync polarities).
    pub crtc_set_mode: unsafe extern "C" fn(
        ctx: *mut c_void,
        crtc_id: u32,
        mode: *const DisplayMode,
    ) -> IoResultCode,
    /// Enable/disable a CRTC's timing generator.
    pub crtc_enable: unsafe extern "C" fn(
        ctx: *mut c_void,
        crtc_id: u32,
        enable: bool,
    ) -> IoResultCode,
    /// Program a plane's scanout address and position.
    pub plane_update: unsafe extern "C" fn(
        ctx: *mut c_void,
        plane_id: u32,
        fb: *const Framebuffer,
        src: *const Rectangle,
        dst: *const Rectangle,
    ) -> IoResultCode,
    /// Commit all pending register writes atomically (latch on next VBlank).
    /// Called after crtc_set_mode/plane_update to apply changes together.
    pub commit_flush: unsafe extern "C" fn(ctx: *mut c_void) -> IoResultCode,
    /// Read EDID from a connector over its DDC I2C bus.
    pub read_edid: unsafe extern "C" fn(
        ctx: *mut c_void,
        connector_id: u32,
        out_edid: *mut u8,
        edid_buf_size: u32,
        out_edid_len: *mut u32,
    ) -> IoResultCode,
    /// Read connector hotplug state (connected/disconnected).
    pub read_connector_state: unsafe extern "C" fn(
        ctx: *mut c_void,
        connector_id: u32,
    ) -> ConnectorState,
    /// Acknowledge VBlank interrupt. Returns the CRTC ID that generated it.
    pub ack_vblank: unsafe extern "C" fn(
        ctx: *mut c_void,
        out_crtc_id: *mut u32,
    ) -> IoResultCode,
    /// Set DPMS power state on a connector.
    pub set_dpms: unsafe extern "C" fn(
        ctx: *mut c_void,
        connector_id: u32,
        state: DpmsState,
    ) -> IoResultCode,
}

The display core (generic, hardware-independent code) calls DisplayHwOps methods to program the hardware. Each driver (i915, amdgpu, etc.) provides its own DisplayHwOps implementation that translates generic operations into hardware-specific register writes. This is the same VTable pattern used by all UmkaOS KABI interfaces (Section 11.1).
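
To make the VTable pattern concrete, here is a trimmed sketch with only the two register accessors. RegOpsSketch is a hypothetical subset of DisplayHwOps, and FakeMmio is an in-memory register file standing in for real MMIO so the example runs in user space. A real driver would use volatile MMIO accesses and would be reached through a domain switch rather than a direct call.

```rust
use std::ffi::c_void;

/// Hypothetical two-entry subset of DisplayHwOps.
#[repr(C)]
pub struct RegOpsSketch {
    pub reg_write32: unsafe extern "C" fn(ctx: *mut c_void, offset: u32, value: u32),
    pub reg_read32: unsafe extern "C" fn(ctx: *mut c_void, offset: u32) -> u32,
}

/// Fake register file: 16 dword registers addressed by byte offset.
pub struct FakeMmio {
    pub regs: [u32; 16],
}

unsafe extern "C" fn fake_write32(ctx: *mut c_void, offset: u32, value: u32) {
    // SAFETY: the caller passes a valid, exclusive pointer to a FakeMmio.
    let dev = unsafe { &mut *(ctx as *mut FakeMmio) };
    // Out-of-range writes are dropped.
    if let Some(slot) = dev.regs.get_mut((offset / 4) as usize) {
        *slot = value;
    }
}

unsafe extern "C" fn fake_read32(ctx: *mut c_void, offset: u32) -> u32 {
    // SAFETY: the caller passes a valid pointer to a FakeMmio.
    let dev = unsafe { &*(ctx as *const FakeMmio) };
    // Out-of-range reads return all-ones, as unmapped bus reads often do.
    dev.regs.get((offset / 4) as usize).copied().unwrap_or(0xFFFF_FFFF)
}

/// The table a driver would hand to the display core during probe.
pub static FAKE_OPS: RegOpsSketch = RegOpsSketch {
    reg_write32: fake_write32,
    reg_read32: fake_read32,
};

/// Core-side helper: it only sees the ops table and an opaque context.
pub fn write_then_read(ops: &RegOpsSketch, dev: &mut FakeMmio, offset: u32, value: u32) -> u32 {
    let ctx = (dev as *mut FakeMmio).cast::<c_void>();
    // SAFETY: ctx points to a live FakeMmio for the duration of both calls.
    unsafe {
        (ops.reg_write32)(ctx, offset, value);
        (ops.reg_read32)(ctx, offset)
    }
}
```

The core-side helper never names the driver's concrete type, which is what lets one display core drive several controller families through the same table layout.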

Register access isolation: Display drivers run in Tier 1. Their MMIO regions are mapped into the driver's isolation domain. The reg_write32/reg_read32 functions access MMIO directly (no syscall overhead). The display core, running in umka-core's domain, calls the driver's DisplayHwOps via a domain switch (~23-80 cycles).

20.4.13 DRM/KMS Compatibility Interface

Userspace compositors (Wayland compositors, Xwayland, mpv) interact with the display subsystem via Linux DRM/KMS ioctl() calls on /dev/dri/card* device nodes. UmkaOS's umka-compat layer (Section 18.1) translates these ioctls into UmkaOS-native display operations.

Supported DRM ioctls (minimum viable set for Wayland compositors):

ioctl	Linux cmd nr	UmkaOS handler	Description
`DRM_IOCTL_MODE_GETRESOURCES`	0xA0	`display_get_resources()`	Enumerate CRTCs, connectors, encoders
`DRM_IOCTL_MODE_GETCONNECTOR`	0xA7	`display_get_connector()`	Get connector properties and supported modes
`DRM_IOCTL_MODE_GETENCODER`	0xA6	`display_get_encoder()`	Get encoder↔CRTC mapping
`DRM_IOCTL_MODE_GETCRTC`	0xA1	`display_get_crtc()`	Get current CRTC mode and framebuffer
`DRM_IOCTL_MODE_SETCRTC`	0xA2	`display_legacy_set_crtc()`	Legacy mode setting (translated to atomic internally)
`DRM_IOCTL_MODE_ADDFB2`	0xB8	`display_add_framebuffer()`	Create framebuffer object from DMA-BUF / GEM handle
`DRM_IOCTL_MODE_RMFB`	0xAF	`display_remove_framebuffer()`	Destroy framebuffer object
`DRM_IOCTL_MODE_PAGE_FLIP`	0xB0	`display_page_flip()`	Flip primary plane (translated to atomic commit)
`DRM_IOCTL_MODE_ATOMIC`	0xBC	`display_atomic_commit()`	Full atomic modesetting
`DRM_IOCTL_MODE_CREATEPROPBLOB`	0xBD	`display_create_blob()`	Create property blob (for gamma LUTs, HDR metadata)
`DRM_IOCTL_MODE_DESTROYPROPBLOB`	0xBE	`display_destroy_blob()`	Destroy property blob
`DRM_IOCTL_PRIME_HANDLE_TO_FD`	0x2D	`dma_buf_export()`	Export GEM handle as DMA-BUF fd
`DRM_IOCTL_PRIME_FD_TO_HANDLE`	0x2E	`dma_buf_import()`	Import DMA-BUF fd as GEM handle
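
A dispatcher over this table might look like the following sketch. It assumes the common Linux ioctl encoding (command number in bits 0-7, type character in bits 8-15, with 'd' as the DRM type) and returns handler names as strings purely for illustration. handler_for, ioctl_type, and ioctl_nr are hypothetical names, and the real umka-compat dispatch mechanism is not specified here.

```rust
/// DRM ioctls use the type character 'd' (0x64).
pub const DRM_IOCTL_TYPE: u32 = b'd' as u32;

/// Type field of an encoded ioctl request (bits 8-15).
pub fn ioctl_type(cmd: u32) -> u32 {
    (cmd >> 8) & 0xFF
}

/// Command number of an encoded ioctl request (bits 0-7).
pub fn ioctl_nr(cmd: u32) -> u32 {
    cmd & 0xFF
}

/// Map an encoded request to the handler named in the table above.
/// Returns None for non-DRM requests and for numbers outside the table.
pub fn handler_for(cmd: u32) -> Option<&'static str> {
    if ioctl_type(cmd) != DRM_IOCTL_TYPE {
        return None;
    }
    let name = match ioctl_nr(cmd) {
        0xA0 => "display_get_resources",
        0xA1 => "display_get_crtc",
        0xA2 => "display_legacy_set_crtc",
        0xA6 => "display_get_encoder",
        0xA7 => "display_get_connector",
        0xAF => "display_remove_framebuffer",
        0xB0 => "display_page_flip",
        0xB8 => "display_add_framebuffer",
        0xBC => "display_atomic_commit",
        0xBD => "display_create_blob",
        0xBE => "display_destroy_blob",
        0x2D => "dma_buf_export",
        0x2E => "dma_buf_import",
        _ => return None,
    };
    Some(name)
}
```

For example, the x86-64 request value 0xC04064A0 (DRM_IOCTL_MODE_GETRESOURCES) decodes to type 0x64 and number 0xA0. A production dispatcher would also validate the direction and size fields before copying the argument struct from userspace.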

Error mapping: UmkaOS DisplayError variants map to Linux errno values:

/// Display subsystem error codes. Each variant maps to a distinct Linux errno
/// (a `#[repr(i32)]` enum cannot have duplicate discriminants). Internal code
/// dispatches on the variant's identity; the errno value is only used at the
/// userspace ABI boundary (DRM ioctl return).
///
/// **Convention**: Discriminant values use the **negative** of the Linux errno,
/// following the standard kernel-internal convention where functions return
/// `-EFOO` on failure. The syscall return path (`umka-compat`) passes the
/// negative value directly to userspace via the register ABI; glibc then
/// negates it, stores it in `errno`, and returns -1. This matches Linux
/// kernel behavior (`return -EINVAL;` in C kernel code).
#[repr(i32)]
pub enum DisplayError {
    /// Permission denied (DRM_MASTER required for modesetting).
    PermissionDenied     = -1,   // -EPERM
    /// Connector not found.
    ConnectorNotFound    = -2,   // -ENOENT
    /// CRTC not found.
    CrtcNotFound         = -6,   // -ENXIO
    /// Mode not supported by connector.
    ModeNotSupported     = -22,  // -EINVAL
    /// Bandwidth exceeded for display controller.
    InsufficientBandwidth = -28, // -ENOSPC
    /// Framebuffer format not supported by plane.
    FormatNotSupported   = -61,  // -ENODATA
    /// No active mode on connector (VRR without mode set).
    NoActiveMode         = -71,  // -EPROTO
    /// VRR not supported by connector/mode.
    VrrNotSupported      = -95,  // -EOPNOTSUPP
    /// Atomic test failed (TEST_ONLY flag).
    AtomicTestFailed     = -125, // -ECANCELED
}
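
The negative-errno convention can be traced end to end with a short sketch. DisplayErrorSketch is a three-variant excerpt of the enum above, and ioctl_return and libc_view are hypothetical helpers: the first models what the compat layer places in the return register, the second models what a libc syscall wrapper does with it (Linux treats raw return values from -4095 to -1 as errors).

```rust
/// Excerpt of DisplayError with the same discriminant convention.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DisplayErrorSketch {
    PermissionDenied = -1,       // -EPERM
    ModeNotSupported = -22,      // -EINVAL
    InsufficientBandwidth = -28, // -ENOSPC
}

impl DisplayErrorSketch {
    /// Positive errno value, as userspace sees it in `errno`.
    pub fn errno(self) -> i32 {
        -(self as i32)
    }
}

/// Kernel side: the raw value placed in the syscall return register.
pub fn ioctl_return(result: Result<u32, DisplayErrorSketch>) -> i64 {
    match result {
        Ok(value) => i64::from(value),
        // The discriminant already is -errno, so it is passed through as-is.
        Err(e) => i64::from(e as i32),
    }
}

/// Userspace side: returns (function result, errno if an error occurred).
pub fn libc_view(raw: i64) -> (i64, Option<i32>) {
    if (-4095..0).contains(&raw) {
        (-1, Some((-raw) as i32))
    } else {
        (raw, None)
    }
}
```

Storing the negated errno directly in the discriminant means the ABI boundary needs no lookup table: the cast to i32 is the whole conversion.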

Legacy compatibility: Older applications use DRM_IOCTL_MODE_SETCRTC and DRM_IOCTL_MODE_PAGE_FLIP (non-atomic). UmkaOS translates these into atomic commits internally: SETCRTC becomes an atomic commit with ALLOW_MODESET, and PAGE_FLIP becomes an atomic commit that updates only the primary plane. This matches the approach used by modern Linux DRM drivers (i915, amdgpu), which internally implement legacy ioctls as wrappers around atomic commits.
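
The translation can be pictured as building a property list for an atomic commit. The sketch below is a simplification under stated assumptions: AtomicRequestSketch, Prop, translate_page_flip, and translate_set_crtc are hypothetical names, the property kinds correspond to the standard DRM properties FB_ID, CRTC_ID, MODE_ID, and ACTIVE, and connector routing and source/destination rectangles are omitted. The flag value is the one Linux defines for DRM_MODE_ATOMIC_ALLOW_MODESET.

```rust
/// Linux UAPI value for the ALLOW_MODESET atomic flag.
pub const DRM_MODE_ATOMIC_ALLOW_MODESET: u32 = 0x0400;

/// Property kinds used in this sketch (subset of the standard DRM set).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Prop {
    FbId,
    CrtcId,
    ModeId,
    Active,
}

/// A simplified atomic commit: flags plus (object id, property, value).
#[derive(Debug, PartialEq, Eq)]
pub struct AtomicRequestSketch {
    pub flags: u32,
    pub props: Vec<(u32, Prop, u64)>,
}

/// PAGE_FLIP: only the primary plane's framebuffer changes, no modeset.
pub fn translate_page_flip(primary_plane: u32, fb_id: u32) -> AtomicRequestSketch {
    AtomicRequestSketch {
        flags: 0,
        props: vec![(primary_plane, Prop::FbId, u64::from(fb_id))],
    }
}

/// SETCRTC: mode, enable state, and primary plane in one commit that is
/// allowed to perform a full modeset.
pub fn translate_set_crtc(
    crtc: u32,
    primary_plane: u32,
    fb_id: u32,
    mode_blob: u32,
) -> AtomicRequestSketch {
    AtomicRequestSketch {
        flags: DRM_MODE_ATOMIC_ALLOW_MODESET,
        props: vec![
            (crtc, Prop::ModeId, u64::from(mode_blob)),
            (crtc, Prop::Active, 1),
            (primary_plane, Prop::CrtcId, u64::from(crtc)),
            (primary_plane, Prop::FbId, u64::from(fb_id)),
        ],
    }
}
```

Routing both legacy paths through the same commit structure means the plane-assignment and bandwidth validators run once, in one place, regardless of which ioctl the application used.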

20.4.14 Architectural Decision

Display: Wayland-only + Xwayland

UmkaOS's KMS interface (Section 20.4) is Wayland-native (DRM atomic modesetting, DMA-BUF via capabilities). X11 applications are supported via Xwayland (same as Fedora, Ubuntu 22.04+). There is no native X11 server support: the X11 protocol is a 40-year-old security liability (MIT-MAGIC-COOKIE-1, unrestricted window snooping). Xwayland provides compatibility for legacy apps without compromising security.