Particles, Finally

Every visual programming environment worth the name has a particle system. The fountain, the snow field, the sparks from a collision, the smoke curling up from a fire. None of it works without particles.

Lux has not had particles. Ten months in. Every blog post, someone (usually me, in my head) has asked “where are the particles.” The answer was always “next phase.” It is this phase. It is this post.

This is the 2D CPU particle system. Not GPU yet. The GPU version is the very next post after this one, and it rides on top of the compute buffer cache from two weeks ago. Today is the CPU version, because I wanted the semantics right before I moved them onto the compute path, and I wanted this post to be about the shape of a particle system rather than the shape of a storage buffer.

The data

A particle in Lux is a flat 44-byte struct. Pinned in lux-core::particle:

#[repr(C)]
pub struct Particle {
    pub pos: [f32; 2],      // 8
    pub vel: [f32; 2],      // 8
    pub age: f32,           // 4
    pub lifetime: f32,      // 4
    pub size: f32,          // 4
    pub color: [f32; 4],    // 16
    // total: 44 bytes
}

Four bytes wider than I wanted at the start of the session (I’d been aiming for 40), and I decided not to fight it. Every field earns its place. All the fields are f32, so with #[repr(C)] there’s no interior padding, and the struct packs to exactly 44 bytes. On a 64-byte cache line the extra four bytes change nothing: one particle fits and two do not, at 40 bytes or at 44. When this moves onto the GPU in the next post the layout will shift anyway, because std430 has its own opinions about what a packed particle looks like.
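The 44-byte claim is easy to pin down at compile time. What follows is a minimal sketch rather than the actual lux-core source: the derives and the byte_offset helper are assumptions added for illustration, while the field list is copied from the struct above.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct Particle {
    pub pos: [f32; 2],
    pub vel: [f32; 2],
    pub age: f32,
    pub lifetime: f32,
    pub size: f32,
    pub color: [f32; 4],
}

// Compile-time guards: the build fails if the layout ever drifts.
const _: () = assert!(size_of::<Particle>() == 44);
const _: () = assert!(align_of::<Particle>() == 4);

/// Hypothetical helper: byte offset of the particle at `index`
/// inside a tightly packed pool.
pub const fn byte_offset(index: usize) -> usize {
    index * size_of::<Particle>()
}
```

With the assertions in a const context, adding a field or changing a type breaks the build instead of silently changing the stride.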

A ParticleSystem is a Vec<Particle> plus some metadata (max count, current count, spawn accumulator). Two helper methods on Particle: life_fraction() returns age / lifetime clamped to [0, 1], and is_dead() returns age >= lifetime. Used by the integrator and the renderer respectively.
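The two helpers are small enough to sketch in full. This is an illustration under assumptions, not the real implementation: in particular, treating a non-positive lifetime as fully aged is a choice made here to avoid a 0/0 division, and the post does not say how Lux handles that case.

```rust
#[repr(C)]
#[derive(Clone, Copy, Debug, Default)]
pub struct Particle {
    pub pos: [f32; 2],
    pub vel: [f32; 2],
    pub age: f32,
    pub lifetime: f32,
    pub size: f32,
    pub color: [f32; 4],
}

impl Particle {
    /// Fraction of the lifetime already used, clamped to [0, 1].
    pub fn life_fraction(&self) -> f32 {
        if self.lifetime <= 0.0 {
            // Assumption: a non-positive lifetime counts as fully aged.
            return 1.0;
        }
        (self.age / self.lifetime).clamp(0.0, 1.0)
    }

    /// True once the particle has reached or passed its lifetime.
    pub fn is_dead(&self) -> bool {
        self.age >= self.lifetime
    }
}
```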

A new pin type

PinType::Particles with PinValue::Particles(Arc<ParticleSystem>). The Arc is the same trick as the Arc-wrapped layers from a few months back: wire transfer is a refcount bump, not a deep copy of the whole pool. When the emitter needs exclusive access to mutate the pool (which is every frame), it calls Arc::make_mut. If the refcount is 1, which it is whenever the consumer has already run its process() and released its clone, make_mut hands back a mutable reference to the existing allocation and the emitter mutates in place. If anything else still holds the Arc, make_mut clones the pool first. In the common case, zero-copy frame-to-frame.
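The Arc::make_mut behaviour can be observed directly by comparing the allocation address before and after the call. A minimal sketch: the ParticleSystem here is a stripped-down stand-in, and push_reports_in_place is a hypothetical helper that exists only to make the copy visible.

```rust
use std::sync::Arc;

/// Stand-in for the real pool; plain [f32; 2] entries keep the sketch short.
#[derive(Clone, Debug, Default)]
pub struct ParticleSystem {
    pub particles: Vec<[f32; 2]>,
    pub max_count: usize,
}

/// Pushes one entry through Arc::make_mut and reports whether the pool was
/// mutated in place (true) or had to be cloned first (false).
pub fn push_reports_in_place(pool: &mut Arc<ParticleSystem>, p: [f32; 2]) -> bool {
    let before = Arc::as_ptr(pool);
    // make_mut clones the inner value only if another Arc still points at it.
    Arc::make_mut(pool).particles.push(p);
    std::ptr::eq(before, Arc::as_ptr(pool))
}
```

With a single owner the helper returns true. Hold a second Arc::clone of the pool across the call and it returns false: the other holder keeps the old contents and the emitter continues with a fresh copy.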

The 40-byte PinValue invariant holds because we’re carrying a single Arc pointer, which is 8 bytes on a 64-bit target. Well within budget.

The emitter

lux-particle-2d::Emitter2dNode is a monolith. Fourteen input pins, which is more than any other node in Lux so far, and I thought about splitting it into ParticleSource + ParticlePhysics + ParticleColor for ten seconds before deciding that the split would make every simple patch three times more annoying to build. Sometimes the right answer is a fat node.

The pins:

position (Vec2). Centre of the spawn volume.
position_jitter (Vec2). Random perturbation added to the spawn position, uniform in [-jitter, +jitter] per component. Without it, every particle spawns at exactly position and you get a beam, not a fountain.
spawn_rate (Number). Particles per second. Fractional values are fine, see below.
lifetime (Number). Seconds before a particle is removed.
initial_velocity (Vec2). Base velocity for every spawned particle.
velocity_jitter (Vec2). Random perturbation applied to the initial velocity, uniform in [-jitter, +jitter] per component.
gravity (Vec2). Constant acceleration applied every frame.
drag (Number). Per-second velocity damping in [0, 1]. 0 means no drag; 0.99 means very heavy air resistance.
start_size / end_size (Number, Number). Size lerps linearly from start to end over the particle’s lifetime.
start_color / end_color (Color, Color). Same idea for colour, which is what makes a fire look like a fire.
max_particles (Int). Hard cap on the pool. Spawns that would exceed the cap are dropped.
seed (Int). RNG seed for the jitter, for deterministic playback.

Output: a single Particles pin carrying the Arc-wrapped pool.

The integrator

Every frame, the emitter does four things in order:

Kill dead particles. pool.retain(|p| !p.is_dead()). Vec::retain does an in-place sweep and preserves the existing capacity, so the backing allocation never shrinks and grows again frame to frame, which keeps the allocator out of the hot path.
Spawn new particles. This is where the spawn_rate of “particles per second” gets tricky. At 60fps, a spawn rate of 400 means 400/60 = 6.667 particles per frame, which isn’t an integer. If you floor it every frame, you get 6 per frame = 360 per second, which is wrong. The fix is a floating-point spawn accumulator stored on the node across frames. Every frame, accumulator += rate * dt, then spawn_count = accumulator.floor(), then accumulator -= spawn_count. The fractional leftover carries into the next frame. Average over time matches the requested rate exactly, and the rate is framerate-independent.
Integrate physics. For each live particle: velocity += gravity * dt, then velocity *= (1.0 - drag).powf(dt) (per-second exponential damping, raised to the current dt so the decay is framerate-independent), then position += velocity * dt, then age += dt. Euler integration. Not the world’s most accurate but perfectly adequate for visual effects, and cheap. dt is clamped to [0, 1/30] seconds so a dropped frame doesn’t teleport every particle across the screen.
Interpolate appearance. For each live particle, compute t = age / lifetime, then size = lerp(start_size, end_size, t) and color = lerp(start_color, end_color, t). Stored back into the particle struct so the renderer doesn’t need to recompute it.
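The four steps above can be sketched as one function. This is an illustration under assumptions, not the Emitter2dNode source: EmitterParams, step, and lerp are hypothetical names, jitter is left out, and the pool is a bare Vec rather than the Arc-wrapped ParticleSystem.

```rust
#[derive(Clone, Copy, Debug, Default)]
pub struct Particle {
    pub pos: [f32; 2],
    pub vel: [f32; 2],
    pub age: f32,
    pub lifetime: f32,
    pub size: f32,
    pub color: [f32; 4],
}

/// Hypothetical bundle of the emitter inputs the integrator needs.
#[derive(Clone, Copy, Debug)]
pub struct EmitterParams {
    pub position: [f32; 2],
    pub spawn_rate: f32,
    pub lifetime: f32,
    pub initial_velocity: [f32; 2],
    pub gravity: [f32; 2],
    pub drag: f32,
    pub start_size: f32,
    pub end_size: f32,
    pub start_color: [f32; 4],
    pub end_color: [f32; 4],
    pub max_particles: usize,
}

fn lerp(a: f32, b: f32, t: f32) -> f32 {
    a + (b - a) * t
}

/// One emitter tick: kill, spawn, integrate, interpolate.
pub fn step(pool: &mut Vec<Particle>, accumulator: &mut f32, p: &EmitterParams, dt: f32) {
    // Clamp dt so a dropped frame cannot fling particles across the screen.
    let dt = dt.clamp(0.0, 1.0 / 30.0);

    // 1. Kill dead particles in place; capacity is preserved.
    pool.retain(|q| q.age < q.lifetime);

    // 2. Spawn: the fractional remainder carries over to the next tick.
    *accumulator += p.spawn_rate.max(0.0) * dt;
    let spawn_count = accumulator.floor();
    *accumulator -= spawn_count;
    for _ in 0..spawn_count as usize {
        if pool.len() >= p.max_particles {
            break; // spawns beyond the cap are dropped
        }
        pool.push(Particle {
            pos: p.position,
            vel: p.initial_velocity,
            age: 0.0,
            lifetime: p.lifetime,
            size: p.start_size,
            color: p.start_color,
        });
    }

    // 3. Integrate (Euler) and 4. interpolate appearance.
    let damping = (1.0 - p.drag).powf(dt); // per-second damping scaled to dt
    for q in pool.iter_mut() {
        for i in 0..2 {
            q.vel[i] += p.gravity[i] * dt;
            q.vel[i] *= damping;
            q.pos[i] += q.vel[i] * dt;
        }
        q.age += dt;
        let t = (q.age / q.lifetime).clamp(0.0, 1.0);
        q.size = lerp(p.start_size, p.end_size, t);
        for c in 0..4 {
            q.color[c] = lerp(p.start_color[c], p.end_color[c], t);
        }
    }
}
```

The accumulator lives outside the function because it has to survive between ticks. In the real node it would presumably sit next to the pool as part of the node state.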

The RNG for jitter is xorshift. Seeded per-node, advanced once per spawned particle, deterministic across runs with the same seed. Good enough for a particle system. If I ever need cryptographically interesting randomness in a particle system, something has gone very wrong in my life.
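For reference, a seeded xorshift generator with a jitter helper fits in a few lines. This sketch uses Marsaglia's 32-bit variant with the 13/17/5 shift triple; the post does not say which xorshift width Lux uses, and XorShift32, next_unit, jitter, and jitter2 are hypothetical names. It also draws one value per component, which may differ from how Lux advances its generator.

```rust
/// Minimal xorshift32 generator (shift triple 13, 17, 5).
#[derive(Clone, Debug)]
pub struct XorShift32 {
    state: u32,
}

impl XorShift32 {
    pub fn new(seed: u32) -> Self {
        // Xorshift gets stuck at zero forever, so remap a zero seed
        // to an arbitrary non-zero constant (an assumption of this sketch).
        Self { state: if seed == 0 { 0x9E37_79B9 } else { seed } }
    }

    pub fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }

    /// Uniform f32 in [0, 1), built from the top 24 bits.
    pub fn next_unit(&mut self) -> f32 {
        (self.next_u32() >> 8) as f32 / (1u32 << 24) as f32
    }

    /// Uniform in [-jitter, +jitter) for one component.
    pub fn jitter(&mut self, jitter: f32) -> f32 {
        (self.next_unit() * 2.0 - 1.0) * jitter
    }

    /// Per-component jitter for a Vec2-style input.
    pub fn jitter2(&mut self, jitter: [f32; 2]) -> [f32; 2] {
        [self.jitter(jitter[0]), self.jitter(jitter[1])]
    }
}
```

Two generators built from the same seed produce the same sequence, which is all the deterministic playback needs.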

The renderer

RenderParticles2dNode takes a Particles input and outputs a Layer. Its process() walks the pool and emits one FillCircle draw command per live particle, using the particle’s current position, size, and color. Zero-size and zero-alpha particles are skipped entirely; why issue a draw call for nothing.
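The walk over the pool might look roughly like this. The DrawCommand enum and render_particles function are hypothetical stand-ins, since the post does not show Lux's real draw-command or Layer types, and treating size as the circle radius is an assumption.

```rust
#[derive(Clone, Copy, Debug, Default)]
pub struct Particle {
    pub pos: [f32; 2],
    pub vel: [f32; 2],
    pub age: f32,
    pub lifetime: f32,
    pub size: f32,
    pub color: [f32; 4],
}

/// Hypothetical draw command; only the one variant this renderer emits.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DrawCommand {
    FillCircle { center: [f32; 2], radius: f32, color: [f32; 4] },
}

/// Emits one FillCircle per live, visible particle.
pub fn render_particles(pool: &[Particle]) -> Vec<DrawCommand> {
    pool.iter()
        // Skip particles that have already expired.
        .filter(|p| p.age < p.lifetime)
        // Skip anything that would draw nothing: zero size or zero alpha.
        .filter(|p| p.size > 0.0 && p.color[3] > 0.0)
        .map(|p| DrawCommand::FillCircle {
            center: p.pos,
            radius: p.size, // assumption: size is used as the radius
            color: p.color,
        })
        .collect()
}
```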

This is deliberately boring. I want every particle to be a filled circle. “Render particles as textured quads” / “render particles as triangle strips” / “render particles as SDFs” are all valid and all live in future nodes. RenderParticles2D is the smallest useful renderer, which means it composes with everything else that produces and consumes Layer values (group, translate, colour adjust, blend) without any special cases.

A fountain of 600 particles produces 600 FillCircle commands per frame, which the Vello backend tessellates in one draw call anyway. CPU side this is maybe 50 microseconds of work. GPU side it is free. I measured, then I got bored of measuring.

The warmup problem

Here is the unexpected rabbit hole.

I wanted a reference PNG for the fountain patch, the same way every other test patch in Lux has a reference PNG generated via --dump-frame. I wrote the patch. I ran lux --dump-frame phase9_p1_fountain.lux out.png. The output was black with a single particle at the emitter position.

That makes sense and also makes the feature untestable. --dump-frame runs exactly one evaluation tick and captures the frame. A particle system that spawns 400 particles per second needs time to build up a population. At tick 1, there is one partially-spawned particle and nothing else. At tick 90, about 1.5 seconds of simulated time, the fountain has filled out to roughly 600 live particles.

This is the first time I’ve needed to dump a frame from a stateful node whose single-frame behaviour isn’t meaningful. Every previous test patch was either stateless (shapes, filters, SDFs) or had a stable first frame (textures, feedback loops that converge quickly).

The fix is a new CLI flag: --warmup N. It runs N extra evaluation ticks before the capture tick. N is clamped to [0, 600] so a typo can’t lock up CI for an hour. Every existing test harness passes 0, so no existing reference PNG changes. The fountain patch passes 90, which gives me ~1.5 seconds of simulated time at 60fps and a reference image with 606 live particles.
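The clamp and the tick loop are simple enough to sketch. parse_warmup and dump_frame are hypothetical names, not the lux CLI's real internals, and reporting unparsable input as an error (rather than silently using 0) is an assumption.

```rust
pub const MAX_WARMUP_TICKS: u32 = 600;

/// Parses the value passed to --warmup and clamps it to [0, 600].
/// Assumption: input that is not an integer is an error, not a silent 0.
pub fn parse_warmup(arg: &str) -> Result<u32, String> {
    let n = arg
        .trim()
        .parse::<i64>()
        .map_err(|e| format!("invalid --warmup value {arg:?}: {e}"))?;
    Ok(n.clamp(0, MAX_WARMUP_TICKS as i64) as u32)
}

/// Runs `warmup` throwaway ticks, then one capture tick, and returns
/// whatever the capture tick produced.
pub fn dump_frame<S, F, T>(state: &mut S, warmup: u32, mut tick: T) -> F
where
    T: FnMut(&mut S) -> F,
{
    for _ in 0..warmup {
        let _ = tick(state); // warmup frames are evaluated and discarded
    }
    tick(state)
}
```

With warmup 0 the loop body never runs, so the behaviour is a single evaluation tick, which is why existing reference images would stay untouched.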

This is going to come up again for every stateful node I add from here on: oscillators, LFOs, smoothers, state-tracking logic. --warmup unblocks all of them. Small feature, big payoff. Should have been there from the beginning; glad it’s there now.

The patch

tests/patches/phase9_p1_fountain is now a first-class regression test. The patch:

Emitter2D at position (256, 400), spawn_rate 400, lifetime 2.0, initial_velocity (0, -200), velocity_jitter (30, 20), gravity (0, 200), drag 0.3, start_size 6, end_size 1, start_color yellow, end_color dark red, max_particles 2000.
RenderParticles2D consuming the Emitter2D output.
Captured at --warmup 90.

Particles spawn at the bottom of the frame, launch upward with horizontal jitter, arc under gravity, fade from yellow to red as they age, and die at 2 seconds. At the capture tick the frame holds 606 live particles. The reference PNG is deterministic because the xorshift RNG is seeded, the dt is fixed at 1/60, and the warmup tick count is fixed.
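One reading that reproduces the 606 figure: 90 warmup ticks plus the capture tick is 91 ticks, and at 400 particles per second with dt = 1/60 the spawn accumulator has released 606 particles by then. The sketch below only counts spawns. It assumes the capture tick spawns like any other tick, and spawned_after is a hypothetical helper.

```rust
/// Counts how many particles a fractional spawn accumulator releases
/// over `ticks` fixed-size ticks.
pub fn spawned_after(rate: f64, dt: f64, ticks: u32) -> u64 {
    let mut accumulator = 0.0_f64;
    let mut total = 0_u64;
    for _ in 0..ticks {
        accumulator += rate * dt;
        let n = accumulator.floor();
        accumulator -= n; // keep only the fractional remainder
        total += n as u64;
    }
    total
}
```

spawned_after(400.0, 1.0 / 60.0, 91) evaluates to 606. The oldest particle is then about 91/60 ≈ 1.52 seconds old, short of the 2-second lifetime, so every particle spawned so far is still alive.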

All 53 lux-app tests pass, including the see_also_references_resolve and all_nodes_have_summary tests that catch missing docs on new nodes.

What it feels like

I wired up the patch interactively, dropped a Mouse node onto the emitter’s position pin, and watched particles follow my cursor in a yellow-to-red trail across the canvas. Flicked the mouse and the trail curled. Stopped moving and the particles rained down under gravity. Added an Oscillator on the spawn rate and the fountain pulsed.

[Figure: a fountain of 606 particles fading from yellow to red]

Ten months. Every blog post, in the back of my head, “where are the particles.” Today, they are here. The fountain is on. I made a short recording for myself and watched it loop twenty times.

The GPU particle post is next. Same emitter semantics, compute-buffer-backed pool, 100,000 particles instead of 600, simulation on the GPU via the cache from two posts ago. Everything in Phase 9 has been building toward it, and it is about to be very easy to write.