GPU Texture Pipeline

Until now, Lux has been a vector graphics engine. Shapes, transforms, layers — all described as draw commands, all rasterised by Vello at the end of the frame. Beautiful for 2D geometry. Completely useless for pixel-level image processing.

If you want to blur an image, colour-correct a video feed, or run a custom fragment shader, you need textures. GPU-resident, handle-allocated, pipeline-cached textures. You need a texture engine.

This is the foundation for everything that follows.

The architecture

The texture pipeline has five components, split across two crates:

In lux-core (where plugins can see it):

TextureHandle — an opaque u64 handle. Plugins never touch a wgpu::Texture directly.
TextureOp — a declarative enum describing what you want the GPU to do.

In lux-render (where the GPU lives):

TexturePool — handle-based allocation with free-list reuse and frame-based GC.
ShaderCache — WGSL source hash → compiled pipeline cache.
TextureEngine — the orchestrator that executes TextureOps after graph evaluation.

The split is deliberate. Plugins depend only on lux-core. They describe what they want — “allocate a 512x512 texture”, “run this shader with these inputs” — and the render layer figures out how. A plugin author never imports wgpu, never creates a bind group, never thinks about buffer alignment. They push a TextureOp and move on.

Handles, not resources

Every texture in the system is referenced by a TextureHandle(u64). It’s Copy, it’s Serialize, it can cross wires, live in pin values, and get saved to disk. Handle 0 is reserved as INVALID — the texture equivalent of null.

#[derive(Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct TextureHandle(pub u64);

impl TextureHandle {
    pub const INVALID: Self = Self(0);
    pub fn is_invalid(&self) -> bool { self.0 == 0 }
}

The pool assigns handles from an atomic counter starting at 1. No mutex, no lock, just fetch_add(1, Relaxed). The handle is the key into a HashMap of actual GPU resources — wgpu::Texture, wgpu::TextureView, and the TextureDesc that created them.
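To make the lock-free counter concrete, here is a minimal sketch of how handle IDs could be issued. HandleAllocator and next_handle are hypothetical names chosen for illustration, not the pool's real API, and the serde derives are left out so the snippet needs only the standard library.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for the article's TextureHandle (serde derives omitted).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TextureHandle(pub u64);

impl TextureHandle {
    pub const INVALID: Self = Self(0);
}

/// Hypothetical allocator: hands out unique, never-zero handle IDs.
pub struct HandleAllocator {
    next: AtomicU64,
}

impl HandleAllocator {
    pub const fn new() -> Self {
        // Start at 1 so that 0 stays reserved for INVALID.
        Self { next: AtomicU64::new(1) }
    }

    /// Relaxed ordering is enough here: each caller only needs a distinct
    /// value, and no other memory access is ordered around the counter.
    pub fn next_handle(&self) -> TextureHandle {
        TextureHandle(self.next.fetch_add(1, Ordering::Relaxed))
    }
}
```

Because fetch_add returns the previous value, the first handle issued is 1 and INVALID can never be handed out, short of the counter wrapping around u64.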

Why handles instead of Arc<wgpu::Texture>? Three reasons:

Serialisation. A handle is a number. It can be saved to a .lux file, sent across a wire, stored in node state. A raw GPU resource can’t.
GC. The pool can reclaim textures that nobody references anymore. With shared ownership you’d need weak refs and cycle detection.
Reuse. When a texture is freed, its GPU memory goes back to the free list. The next allocation of the same size reuses it. Zero GPU allocation overhead for steady-state patches.

The free list

TexturePool maintains two data structures:

active: HashMap<u64, PoolEntry> — textures currently in use, keyed by handle ID.
free_list: HashMap<(u32, u32, TextureFormat), Vec<FreeEntry>> — released textures, keyed by dimensions and format.

When a node calls ctx.alloc_texture(512, 512, Rgba8), the pool first checks the free list for a matching (512, 512, Rgba8) entry. If one exists, it pops it and returns the handle. If not, it creates a new wgpu::Texture with the right usage flags (TEXTURE_BINDING | RENDER_ATTACHMENT | COPY_SRC | COPY_DST) and wraps it.
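The lookup-then-create path can be sketched without a GPU. Everything below is a simplified stand-in: FakeGpuTexture replaces the wgpu texture and view, PoolSketch is a hypothetical name, and the sketch assumes a recycled texture keeps the handle it was freed with.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Format {
    Rgba8,
}

/// Stand-in for wgpu::Texture plus its view.
pub struct FakeGpuTexture {
    pub width: u32,
    pub height: u32,
    pub format: Format,
}

/// A released texture, remembered together with its old handle ID.
struct FreeEntry {
    handle: u64,
    texture: FakeGpuTexture,
}

#[derive(Default)]
pub struct PoolSketch {
    next_id: u64,
    active: HashMap<u64, FakeGpuTexture>,
    free_list: HashMap<(u32, u32, Format), Vec<FreeEntry>>,
    /// Counts how often a "real" GPU allocation would have happened.
    pub gpu_allocations: usize,
}

impl PoolSketch {
    pub fn alloc(&mut self, width: u32, height: u32, format: Format) -> u64 {
        let key = (width, height, format);
        // Fast path: reuse a released texture with identical size and format.
        if let Some(entry) = self.free_list.get_mut(&key).and_then(|v| v.pop()) {
            self.active.insert(entry.handle, entry.texture);
            return entry.handle;
        }
        // Slow path: nothing to reuse, so create a new texture.
        self.gpu_allocations += 1;
        self.next_id += 1;
        self.active.insert(self.next_id, FakeGpuTexture { width, height, format });
        self.next_id
    }

    pub fn free(&mut self, handle: u64) {
        if let Some(texture) = self.active.remove(&handle) {
            let key = (texture.width, texture.height, texture.format);
            self.free_list.entry(key).or_default().push(FreeEntry { handle, texture });
        }
    }
}
```

Allocating, freeing, and allocating again with the same dimensions leaves gpu_allocations at 1 in this sketch, which is the steady-state behaviour the pool is after.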

Garbage collection runs every 300 frames, about 5 seconds at 60 fps. Any active texture that hasn’t been touched in 180 frames (3 seconds) is moved to the free list. Free-list entries that push the pool past its memory budget are dropped entirely. The budget is 512 MB; exceeding it logs a single warning.
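A rough sketch of that GC pass, using the numbers from the paragraph above. The type and field names are hypothetical, and two details are assumptions: the budget is measured over active plus free textures, and free entries are dropped newest-first. The real pool may do either differently.

```rust
use std::collections::HashMap;

const GC_INTERVAL_FRAMES: u64 = 300;
const STALE_AFTER_FRAMES: u64 = 180;
const BUDGET_BYTES: u64 = 512 * 1024 * 1024;

pub struct Entry {
    pub last_used_frame: u64,
    pub bytes: u64,
}

#[derive(Default)]
pub struct GcPool {
    pub active: HashMap<u64, Entry>,
    pub free: Vec<Entry>,
}

impl GcPool {
    /// Bytes held by active and free textures together (assumed budget scope).
    pub fn total_bytes(&self) -> u64 {
        self.active.values().chain(self.free.iter()).map(|e| e.bytes).sum()
    }

    pub fn maybe_gc(&mut self, frame: u64) {
        // Only do work on every 300th frame.
        if frame % GC_INTERVAL_FRAMES != 0 {
            return;
        }
        // Collect IDs first so the map is not mutated while iterating it.
        let stale: Vec<u64> = self
            .active
            .iter()
            .filter(|(_, e)| frame.saturating_sub(e.last_used_frame) >= STALE_AFTER_FRAMES)
            .map(|(id, _)| *id)
            .collect();
        for id in stale {
            if let Some(entry) = self.active.remove(&id) {
                self.free.push(entry);
            }
        }
        // Drop free entries until the pool fits the budget again.
        while self.total_bytes() > BUDGET_BYTES {
            if self.free.pop().is_none() {
                break; // Only active textures remain; nothing left to drop.
            }
        }
    }
}
```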

For a typical patch running at steady state, the pool allocates textures on the first frame and then just recycles handles forever. Zero GPU allocations per frame.

Shader caching

ShaderCache compiles WGSL source into wgpu::RenderPipeline (for fragment shaders) or wgpu::ComputePipeline (for compute shaders), keyed by the FNV-1a hash of the source string.

The first time a node runs a shader, it pays the compilation cost. Every subsequent frame reuses the cached pipeline. Since most filter nodes use a static WGSL source embedded at compile time, the hash is the same every frame, and the pipeline is effectively compiled once for the lifetime of the application.
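The compile-once behaviour is easy to model. The sketch below assumes the 64-bit variant of FNV-1a (the bit width is a guess) and swaps the compiled pipeline for a FakePipeline stand-in. ShaderCacheSketch and get_or_compile are illustrative names, not the real API.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    bytes
        .iter()
        .fold(OFFSET_BASIS, |hash, &b| (hash ^ u64::from(b)).wrapping_mul(PRIME))
}

/// Stand-in for a compiled render or compute pipeline.
pub struct FakePipeline {
    pub source_len: usize,
}

#[derive(Default)]
pub struct ShaderCacheSketch {
    pipelines: HashMap<u64, FakePipeline>,
    /// How many times the (pretend) compiler actually ran.
    pub compile_count: usize,
}

impl ShaderCacheSketch {
    pub fn get_or_compile(&mut self, wgsl: &str) -> &FakePipeline {
        let key = fnv1a_64(wgsl.as_bytes());
        // Borrow the counter separately so the closure does not capture `self`.
        let compile_count = &mut self.compile_count;
        self.pipelines.entry(key).or_insert_with(|| {
            *compile_count += 1;
            FakePipeline { source_len: wgsl.len() }
        })
    }
}
```

Keying on the hash alone means two different sources that collide would share a pipeline. That is extremely unlikely with 64 bits, though a stricter cache could also compare the stored source.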

Every fragment shader shares a single fullscreen vertex shader — a clever trick that generates a screen-covering triangle from the vertex ID alone:

@vertex fn vs_main(@builtin(vertex_index) vertex_index: u32) -> VSOut {
    let x = f32(i32(vertex_index & 1u) * 4 - 1);
    let y = f32(i32(vertex_index >> 1u) * 4 - 1);
    // ...
}

No vertex buffer. No index buffer. Just a three-vertex draw call, draw(0..3, 0..1) in wgpu, and the GPU fills the screen. This is the foundation that every texture filter node will build on.
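If the bit arithmetic looks opaque, it helps to run it on the CPU. The sketch below mirrors the two WGSL lines in Rust; fullscreen_vertex and covers are helper names invented for this illustration.

```rust
/// CPU mirror of the WGSL arithmetic: maps a vertex index to clip-space (x, y).
pub fn fullscreen_vertex(vertex_index: u32) -> (f32, f32) {
    let x = ((vertex_index & 1) as i32 * 4 - 1) as f32;
    let y = ((vertex_index >> 1) as i32 * 4 - 1) as f32;
    (x, y)
}

/// True if the point lies inside the triangle (-1,-1), (3,-1), (-1,3),
/// which is the region x >= -1, y >= -1, x + y <= 2.
pub fn covers(x: f32, y: f32) -> bool {
    x >= -1.0 && y >= -1.0 && x + y <= 2.0
}
```

Indices 0, 1, 2 map to (-1, -1), (3, -1), and (-1, 3). The hypotenuse runs along x + y = 2, which passes exactly through the clip-space corner (1, 1), so the whole [-1, 1] square sits inside the triangle and the overhang is clipped away.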

Declarative ops

The core of the plugin API is TextureOp — an enum with eight variants:

// Field types omitted.
pub enum TextureOp {
    Alloc { handle, desc },
    Free { handle },
    Upload { handle, desc, data },
    RenderLayer { handle, desc, commands, clear_color },
    RunShader { output, desc, wgsl_source, inputs },
    RunCompute { output, desc, wgsl_source, inputs, dispatch },
    ReadBack { handle },
    MarkInUse { handle },
}

A node pushes ops into ProcessContext during process(). After the graph finishes evaluating, TextureEngine::execute() processes them all in order — allocating, uploading, compiling, dispatching. The node never blocks on GPU work. It describes intent; the engine handles execution.

This separation means the entire texture pipeline is testable without nodes. You can construct a vec of TextureOps, hand them to the engine, and verify the GPU did the right thing. The node graph is just one source of ops.
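As a sketch of what such a test could look like, the snippet below drives a mock engine with a hand-built list of ops. It uses a reduced op set with assumed field types, and MockEngine stores bytes in memory instead of touching a GPU; none of these signatures come from the real crates.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TextureHandle(pub u64);

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct TextureDesc {
    pub width: u32,
    pub height: u32,
}

/// Reduced op set; field types are assumptions made for this sketch.
pub enum TextureOp {
    Alloc { handle: TextureHandle, desc: TextureDesc },
    Upload { handle: TextureHandle, data: Vec<u8> },
    Free { handle: TextureHandle },
}

/// In-memory stand-in for the engine: textures are plain RGBA8 byte vectors.
#[derive(Default)]
pub struct MockEngine {
    pub textures: HashMap<TextureHandle, (TextureDesc, Vec<u8>)>,
}

impl MockEngine {
    /// Executes ops in order and stops at the first invalid one.
    pub fn execute(&mut self, ops: Vec<TextureOp>) -> Result<(), String> {
        for op in ops {
            match op {
                TextureOp::Alloc { handle, desc } => {
                    let bytes = (desc.width * desc.height * 4) as usize;
                    self.textures.insert(handle, (desc, vec![0; bytes]));
                }
                TextureOp::Upload { handle, data } => {
                    let (_, pixels) = self
                        .textures
                        .get_mut(&handle)
                        .ok_or_else(|| format!("upload to unknown handle {}", handle.0))?;
                    if data.len() != pixels.len() {
                        return Err(format!("size mismatch for handle {}", handle.0));
                    }
                    *pixels = data;
                }
                TextureOp::Free { handle } => {
                    self.textures.remove(&handle);
                }
            }
        }
        Ok(())
    }
}
```

A test then reduces to building a Vec of ops, calling execute, and asserting on the resulting state, with no node graph involved.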

The frame loop

The three-phase pipeline ties into the existing frame loop:

1. begin_frame: TextureEngine updates its frame counter and returns a snapshot of all texture descriptors to the evaluator.
2. evaluate: nodes run and push TextureOps into their ProcessContext.
3. execute: TextureEngine processes all ops, runs GPU work, and updates the pool.

The descriptor snapshot in phase 1 is important — it lets nodes call ctx.texture_size(handle) during evaluation without touching the GPU. The pool provides the metadata; the engine provides the execution.
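The three phases can be sketched end to end in a few dozen lines. This is a simplified model with hypothetical signatures: handles are bare u64 values, the only op is Alloc, and run_frame is an invented helper that plays the role of the frame loop.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct TextureDesc {
    pub width: u32,
    pub height: u32,
}

pub enum TextureOp {
    Alloc { handle: u64, desc: TextureDesc },
}

/// What a node sees during evaluation: read-only metadata plus an op queue.
pub struct ProcessContext<'a> {
    snapshot: &'a HashMap<u64, TextureDesc>,
    pub ops: Vec<TextureOp>,
}

impl ProcessContext<'_> {
    /// Answered from the snapshot alone; no GPU access is needed.
    pub fn texture_size(&self, handle: u64) -> Option<(u32, u32)> {
        self.snapshot.get(&handle).map(|d| (d.width, d.height))
    }
}

#[derive(Default)]
pub struct Engine {
    pub frame: u64,
    descs: HashMap<u64, TextureDesc>,
}

impl Engine {
    /// Phase 1: bump the frame counter and hand out a descriptor snapshot.
    pub fn begin_frame(&mut self) -> HashMap<u64, TextureDesc> {
        self.frame += 1;
        self.descs.clone()
    }

    /// Phase 3: apply the queued ops (GPU work would happen here).
    pub fn execute(&mut self, ops: Vec<TextureOp>) {
        for op in ops {
            match op {
                TextureOp::Alloc { handle, desc } => {
                    self.descs.insert(handle, desc);
                }
            }
        }
    }
}

/// One frame: snapshot, evaluate a single "node" closure, then execute.
pub fn run_frame(engine: &mut Engine, node: impl Fn(&mut ProcessContext<'_>)) {
    let snapshot = engine.begin_frame();
    let mut ctx = ProcessContext { snapshot: &snapshot, ops: Vec::new() };
    node(&mut ctx); // Phase 2: the node only records intent.
    engine.execute(ctx.ops);
}
```

One consequence of this particular model is that a texture allocated during frame N only shows up in texture_size from frame N+1, because the snapshot is taken before evaluation. Whether the real engine behaves the same way is not something this sketch can confirm.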

What this enables

This is 1,531 lines of infrastructure. No new nodes. No visible features. The app does exactly what it did yesterday.

But now any plugin can allocate a GPU texture, upload pixel data, run a WGSL shader, and read back the result — all through four method calls on ProcessContext. The texture pipeline is the foundation for image loading, colour correction, blur, bloom, chroma key, video playback, feedback loops, and custom shaders.

The plumbing is done. Time to turn on the water — starting with texture sources.