Pixels on Screen: Building the GPU Pipeline
A node graph that evaluates is great. A node graph that evaluates and you can’t see anything? Less great.
Phase 2 was about one thing: get pixels on screen using the GPU.
The stack
I’m using three layers:
- wgpu, the GPU abstraction. Talks to Vulkan on Linux, Metal on macOS, DX12 on Windows. One API, every platform.
- Vello, a 2D vector renderer that does its tessellation on the GPU. Paths, shapes, text, all GPU-accelerated.
- A blit pipeline, my own tiny wgpu pipeline that copies Vello’s output to the window surface.
Why not just render directly to the window? Because I need the intermediate texture. The editor canvas, the output preview, the separate output window: they all read from that texture. Render once, display many.
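The render-once, display-many idea is easy to sketch. The types below (Compositor, Texture) are illustrative stand-ins, not the real Lux API: the point is that the scene renders into the intermediate texture exactly once per frame, and every view merely samples it.

```rust
// Hypothetical sketch of render-once, display-many.
// `Texture` stands in for the wgpu intermediate texture.
struct Texture {
    renders: u32,
}

struct Compositor {
    target: Texture,
}

impl Compositor {
    fn new() -> Self {
        Self { target: Texture { renders: 0 } }
    }

    // Render the node graph into the intermediate texture, once per frame.
    fn render_frame(&mut self) {
        self.target.renders += 1;
    }

    // Each view (editor canvas, preview, output window) only samples the
    // texture; presenting a view never triggers another scene render.
    fn blit_to_view(&self, _view: &str) -> u32 {
        self.target.renders
    }
}

fn main() {
    let mut c = Compositor::new();
    c.render_frame();
    for view in ["canvas", "preview", "output"] {
        assert_eq!(c.blit_to_view(view), 1); // still one scene render
    }
    println!("scene renders this frame: {}", c.target.renders);
}
```

Three views, one render: the cost of evaluating the graph is paid once no matter how many windows display the result.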
The version alignment problem
Here’s a fun one. I need wgpu 29 because egui-wgpu 0.34 requires it. But Vello 0.8 on crates.io uses wgpu 28. You can’t have two wgpu versions in one process: their types are incompatible, and each would create its own GPU device.
Solution: I’m using a git branch of Vello that supports wgpu 29. It works perfectly. It’s also a dependency that points at someone’s GitHub fork, which is… not ideal for production. But it’s what I’ve got until Vello publishes a wgpu 29 release. I check regularly.
```toml
# Temporary, switch back when vello ships wgpu 29
vello = { git = "https://github.com/nicoburns/vello", branch = "wgpu29" }
```
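An alternative worth noting is Cargo’s `[patch]` table: instead of pointing the dependency itself at the fork, you keep the normal crates.io requirement and redirect where it resolves from, which also covers any transitive `vello` dependencies. A sketch, using the same fork URL (the patch only applies if the fork’s version still satisfies the `0.8` requirement):

```toml
[dependencies]
vello = "0.8"

# Redirect vello for this crate and everything else in the
# dependency graph that depends on it.
[patch.crates-io]
vello = { git = "https://github.com/nicoburns/vello", branch = "wgpu29" }
```

Either way it’s a temporary measure; the patch form just makes the eventual cleanup a one-line deletion.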
VelloBackend
The VelloBackend translates Lux’s DrawCommand types into Vello scene calls:
- FillCircle → vello::Scene::fill() with a circle path
- FillRect → rounded rect with configurable corner radius
- StrokePath → bezier path with stroke width
- DrawText → font rendering (I embed a font at startup)
- PushTransform / PopTransform → affine transform stack
The whole thing is a state machine: begin_frame() clears the scene, you push draw commands, then render_to_texture() kicks off Vello’s GPU pipeline.
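That state machine can be sketched in plain Rust. This is not the real backend: `vello::Scene` is mocked as a `Vec<String>` so the flow runs without a GPU, and only the method names (begin_frame, render_to_texture) and command names come from the post; the bodies are illustrative.

```rust
// Sketch of the VelloBackend state machine with the scene mocked out.
#[derive(Debug)]
enum DrawCommand {
    FillCircle { cx: f64, cy: f64, r: f64 },
    FillRect { x: f64, y: f64, w: f64, h: f64, radius: f64 },
    PushTransform { dx: f64, dy: f64 },
    PopTransform,
}

#[derive(Default)]
struct VelloBackend {
    scene: Vec<String>,          // stand-in for vello::Scene
    transforms: Vec<(f64, f64)>, // affine stack (translation-only here)
}

impl VelloBackend {
    // begin_frame() clears the scene so each frame starts empty.
    fn begin_frame(&mut self) {
        self.scene.clear();
        self.transforms.clear();
    }

    fn push(&mut self, cmd: DrawCommand) {
        match cmd {
            DrawCommand::FillCircle { cx, cy, r } => {
                let (dx, dy) = self.transforms.last().copied().unwrap_or((0.0, 0.0));
                self.scene.push(format!("circle {} {} {}", cx + dx, cy + dy, r));
            }
            DrawCommand::FillRect { x, y, w, h, radius } => {
                self.scene.push(format!("rect {} {} {} {} r{}", x, y, w, h, radius));
            }
            DrawCommand::PushTransform { dx, dy } => self.transforms.push((dx, dy)),
            DrawCommand::PopTransform => {
                self.transforms.pop();
            }
        }
    }

    // Real backend: hand the scene to Vello's GPU renderer here.
    fn render_to_texture(&self) -> usize {
        self.scene.len()
    }
}

fn main() {
    let mut b = VelloBackend::default();
    b.begin_frame();
    b.push(DrawCommand::PushTransform { dx: 10.0, dy: 0.0 });
    b.push(DrawCommand::FillCircle { cx: 0.0, cy: 0.0, r: 5.0 });
    b.push(DrawCommand::PopTransform);
    assert_eq!(b.render_to_texture(), 1);
    println!("{:?}", b.scene);
}
```

The shape is the important part: commands accumulate between begin_frame() and render_to_texture(), and the transform stack only affects commands pushed while it is non-empty.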
BlitPipeline
This is the simplest wgpu pipeline you can write: one fullscreen triangle, one texture sampler, one fragment shader that reads the source texture. It handles format conversion too; some surfaces want RGBA, others want BGRA. The blit pipeline doesn’t care.
Vello renders → Rgba8Unorm texture → BlitPipeline → window surface
About 80 lines of Rust, 20 lines of WGSL. It’ll never change. It’s perfect.
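The fullscreen-triangle trick deserves a note: instead of a two-triangle quad, the vertex shader derives clip-space positions from the vertex index alone, so the pipeline needs no vertex buffer at all. Here’s that index math in Rust (the same arithmetic the WGSL typically does with `vertex_index`; this is the standard idiom, not necessarily my shader verbatim):

```rust
// Derive clip-space positions for a fullscreen triangle from the
// vertex index, mirroring the common WGSL pattern:
//   let uv  = vec2(f32((i << 1u) & 2u), f32(i & 2u));
//   let pos = vec4(uv * 2.0 - 1.0, 0.0, 1.0);
fn fullscreen_vertex(index: u32) -> (f32, f32) {
    let u = ((index << 1) & 2) as f32;
    let v = (index & 2) as f32;
    (u * 2.0 - 1.0, v * 2.0 - 1.0)
}

fn main() {
    // One oversized triangle whose bounds exceed the [-1, 1] clip
    // square; after clipping, it covers the whole screen.
    assert_eq!(fullscreen_vertex(0), (-1.0, -1.0));
    assert_eq!(fullscreen_vertex(1), (3.0, -1.0));
    assert_eq!(fullscreen_vertex(2), (-1.0, 3.0));
    println!("triangle covers the clip square");
}
```

One triangle also avoids the diagonal seam a quad has, where fragments along the shared edge can be shaded twice.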
GpuContext
I split the GPU resources into GpuContext (device, queue, renderers, textures) and RenderState (window + surface). Why? Because GpuContext can be created without a window: new_headless() gives you a software-rasterized GPU context for testing.
This means my GPU tests run everywhere. No display server needed. No Xvfb. Just cargo test and the GPU pipeline runs on Mesa’s llvmpipe software adapter. Same shaders, same code paths, just software-rasterized.
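The shape of that split can be sketched without any GPU code. The struct and function names below are hypothetical stand-ins, not the real Lux types: the idea is that the two constructors differ only in which adapter they ask for (wgpu’s real `force_fallback_adapter` flag selects a software adapter like llvmpipe) and whether a surface is ever involved.

```rust
// Hypothetical sketch of the GpuContext / RenderState split.
#[derive(Debug, PartialEq)]
struct AdapterRequest {
    force_fallback: bool, // maps to wgpu's force_fallback_adapter flag
    needs_surface: bool,  // headless contexts never touch a window surface
}

// Windowed path: prefer real hardware, surface required.
fn windowed() -> AdapterRequest {
    AdapterRequest { force_fallback: false, needs_surface: true }
}

// Test path: force the software adapter, no window anywhere.
fn new_headless() -> AdapterRequest {
    AdapterRequest { force_fallback: true, needs_surface: false }
}

fn main() {
    let test_ctx = new_headless();
    assert!(test_ctx.force_fallback);
    assert!(!test_ctx.needs_surface);
    println!("headless config: {:?}", test_ctx);
}
```

Everything downstream of adapter selection — shaders, pipelines, textures — is identical in both paths, which is what makes the headless tests meaningful.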
First render
The moment a white circle appeared on a dark background, I knew the stack was right. wgpu handles the platform differences, Vello handles the tessellation, my blit handles the output. Three layers, each doing one thing.
Now I need an editor so you don’t have to write Rust to place that circle.