Real-time Smoothed Particle Hydrodynamics (SPH) fluid simulation using OpenGL compute shaders

GitHub: Project Link

At a Glance

  • Real-time 3D SPH fluid simulation fully accelerated on the GPU

  • Handles up to 1M particles with stable performance

  • GPU-based neighbour search, sorting, and surface reconstruction

  • Marching Cubes for real-time fluid surface generation

  • Designed with memory efficiency, warp coherence, and scalability in mind

Tech Stack: C++, GLSL, OpenGL Compute Shaders
Core Algorithms: SPH, Morton Order, Bitonic Sort, Marching Cubes

Project Summary

This project implements a GPU-driven 3D fluid simulation using Smoothed Particle Hydrodynamics (SPH).
The focus was on scaling particle-based fluids to large particle counts while maintaining numerical stability, real-time performance, and high visual quality.

All major simulation stages — neighbour search, force evaluation, integration, sorting, and surface reconstruction — are executed entirely on the GPU using compute shaders.

Key Engineering Challenges (TL;DR)

  • O(n²) neighbour search bottleneck in SPH

  • GPU data flow and SSBO synchronization across shader passes

  • Bitonic sort's restriction to power-of-two element counts

  • Warp divergence due to branch-heavy shaders

  • Real-time surface reconstruction from particle data

Core Technical Solutions (Summary)

  • Uniform grid + Morton order: O(n) neighbour lookups (at bounded particle density) after a per-frame sort

  • GPU bitonic sort with support for arbitrary particle counts

  • Explicit SSBO memory barriers for safe cross-shader data usage

  • Branch-minimized, warp-coherent compute shaders

  • GPU-based Marching Cubes surface reconstruction

Performance Highlights

  • Stable real-time simulation with up to 1,000,000 particles

  • Linear memory scaling with particle count

  • Branch minimization saved ~1–2% frame time

  • GPU-only pipeline with zero CPU↔GPU readbacks

📊 Graphs: FPS vs Particle Count, Surface Resolution vs FPS, Memory Usage

Technical Deep Dive

GPU-Based SPH Simulation

  • Density, pressure, viscosity, and force computation implemented in GLSL compute shaders

  • Each SPH stage executed as a separate GPU pass

  • Particle data stored in Shader Storage Buffer Objects (SSBOs)

Why compute shaders?
Massive parallelism, zero CPU stalls, and scalable performance.
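
As a rough CPU-side illustration of what the density and pressure passes compute, here is a minimal C++ sketch. It assumes the standard poly6 smoothing kernel and a linear equation of state, neither of which is confirmed by this write-up, and it uses an all-pairs loop purely for brevity. The names poly6, computeDensities and pressureFromDensity are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Poly6 smoothing kernel (assumed here; the project's actual kernel is not stated).
// Takes the squared distance r2 and the smoothing radius h; zero for r >= h.
inline float poly6(float r2, float h) {
    const float kPi = 3.14159265358979f;
    const float h2 = h * h;
    if (r2 >= h2) return 0.0f;
    const float d = h2 - r2;
    return 315.0f / (64.0f * kPi * std::pow(h, 9.0f)) * d * d * d;
}

// Density pass: rho_i = sum_j mass * W(|x_i - x_j|, h).
// All-pairs loop for clarity only; a real pass would visit grid neighbours.
inline std::vector<float> computeDensities(const std::vector<Vec3>& pos,
                                           float mass, float h) {
    std::vector<float> rho(pos.size(), 0.0f);
    for (std::size_t i = 0; i < pos.size(); ++i) {
        for (std::size_t j = 0; j < pos.size(); ++j) {
            const float dx = pos[i].x - pos[j].x;
            const float dy = pos[i].y - pos[j].y;
            const float dz = pos[i].z - pos[j].z;
            rho[i] += mass * poly6(dx * dx + dy * dy + dz * dz, h);
        }
    }
    return rho;
}

// Pressure pass: linear equation of state p = k * (rho - rho0) (an assumption).
inline float pressureFromDensity(float rho, float restDensity, float stiffness) {
    return stiffness * (rho - restDensity);
}
```

On the GPU, each outer-loop iteration would correspond to one compute-shader invocation, with the positions and densities living in SSBOs.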

Neighbour Search (Uniform Grid + Morton Order)

  • Simulation space partitioned into a uniform grid of cells

  • Particles assigned Morton codes (positions along a Z-order curve)

  • Particles sorted by Morton code every frame using GPU bitonic sort

  • Grid hash buffer stores each cell's start index into the sorted particle array for neighbour lookup
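
The Morton code step can be sketched in C++ as below. This is an illustration under assumptions, not the project's shader code: it assumes 10 bits per axis (a grid of at most 1024 cells per side) packed into a 30-bit key, and the names expandBits10, morton3D and cellCoord are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Spread the low 10 bits of v so that two zero bits separate each input bit.
inline std::uint32_t expandBits10(std::uint32_t v) {
    v &= 0x000003FFu;
    v = (v | (v << 16)) & 0x030000FFu;
    v = (v | (v << 8))  & 0x0300F00Fu;
    v = (v | (v << 4))  & 0x030C30C3u;
    v = (v | (v << 2))  & 0x09249249u;
    return v;
}

// Interleave three 10-bit cell coordinates into one 30-bit Z-order key.
inline std::uint32_t morton3D(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return expandBits10(x) | (expandBits10(y) << 1) | (expandBits10(z) << 2);
}

// Map a world-space coordinate to a cell index along one axis, clamped to
// the 10-bit range. 'origin' and 'cellSize' are assumed grid parameters.
inline std::uint32_t cellCoord(float p, float origin, float cellSize) {
    const float c = std::floor((p - origin) / cellSize);
    return static_cast<std::uint32_t>(std::min(std::max(c, 0.0f), 1023.0f));
}
```

Cells that are close in 3D tend to receive nearby keys, which is where the cache-locality benefit of sorting by Morton code comes from.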

Benefits

  • O(n) grid construction once particles are sorted (the bitonic sort itself costs O(n log² n) compare–exchanges)

  • Improved cache locality

  • Fast, predictable neighbour queries
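
To make the start-index lookup concrete, here is a minimal C++ sketch of building and querying a cell-start table from already sorted keys. It assumes the table is indexed directly by cell key (every key must be smaller than numCells); the project's "grid hash buffer" may instead hash the key into a smaller table. The names kEmptyCell, buildCellStarts and countInCell are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel marking a cell that contains no particles.
constexpr std::uint32_t kEmptyCell = 0xFFFFFFFFu;

// One O(n) pass over the sorted keys: a particle starts a cell run when its
// key differs from the previous particle's key. Assumes every key < numCells.
inline std::vector<std::uint32_t> buildCellStarts(
        const std::vector<std::uint32_t>& sortedKeys, std::size_t numCells) {
    std::vector<std::uint32_t> start(numCells, kEmptyCell);
    for (std::size_t i = 0; i < sortedKeys.size(); ++i) {
        if (i == 0 || sortedKeys[i] != sortedKeys[i - 1]) {
            start[sortedKeys[i]] = static_cast<std::uint32_t>(i);
        }
    }
    return start;
}

// Neighbour query for one cell: walk from the start index until the key changes.
inline std::size_t countInCell(const std::vector<std::uint32_t>& sortedKeys,
                               const std::vector<std::uint32_t>& start,
                               std::uint32_t key) {
    if (key >= start.size() || start[key] == kEmptyCell) return 0;
    std::size_t n = 0;
    for (std::size_t i = start[key];
         i < sortedKeys.size() && sortedKeys[i] == key; ++i) {
        ++n;
    }
    return n;
}
```

A full neighbour search would repeat the per-cell walk for the particle's own cell and its 26 adjacent cells, so the cost per particle depends on local density rather than on the total particle count.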

SSBO Synchronization Across Shader Passes

Problem
Multiple compute shaders consume data generated by earlier passes, risking race conditions and stale reads.

Solution

  • Carefully ordered shader dispatches

  • Explicit OpenGL memory barriers (glMemoryBarrier) between passes

  • No CPU-side synchronization or frame delays

Result

  • Deterministic data flow

  • Stable simulation within a single frame
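
The dispatch-then-barrier pattern can be sketched as follows. To keep the example runnable without an OpenGL context, the GPU is abstracted behind a template parameter and a recording stand-in. The comments name the real OpenGL calls each step would map to, but Pass, RecordingGpu, groupCount and runSimulationPasses are hypothetical names, and the actual pass list in the project may differ.

```cpp
#include <string>
#include <vector>

// One compute pass: a label plus the work size it covers (illustrative only).
struct Pass {
    std::string name;
    unsigned itemCount;
    unsigned localSize;  // threads per work group, e.g. local_size_x in GLSL
};

// Work groups needed to cover itemCount items (ceiling division).
inline unsigned groupCount(unsigned itemCount, unsigned localSize) {
    return (itemCount + localSize - 1) / localSize;
}

// Stand-in for the GPU that records the call order instead of issuing GL calls.
struct RecordingGpu {
    std::vector<std::string> log;
    void dispatch(const Pass& p) {
        // Real code: glUseProgram(program);
        //            glDispatchCompute(groupCount(p.itemCount, p.localSize), 1, 1);
        log.push_back("dispatch " + p.name);
    }
    void storageBarrier() {
        // Real code: glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
        log.push_back("barrier");
    }
};

// Issue passes in dependency order, with a barrier after each one so the next
// pass observes the SSBO writes of the previous pass.
template <typename Gpu>
void runSimulationPasses(Gpu& gpu, const std::vector<Pass>& passes) {
    for (const Pass& p : passes) {
        gpu.dispatch(p);
        gpu.storageBarrier();
    }
}
```

One detail worth checking in a real pipeline: if a buffer written by a compute pass is later consumed as a vertex attribute rather than read as an SSBO, the barrier before the draw needs GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT instead.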

Bitonic Sort with Non–Power-of-Two Element Counts

Problem
The classic bitonic sorting network is defined only for power-of-two element counts, but particle counts are arbitrary.

Solution

  • Added bounds checks during compare–exchange stages

  • Skipped operations when indices exceeded valid particle count

Trade-off

  • Minor SIMD divergence

Result

  • Sorting supports arbitrary particle counts

  • No buffer padding required

  • Negligible real-world performance impact

Branch Minimization & Warp-Coherent Shaders

  • Reduced conditional branching in compute shaders

  • Used mathematical masking and structured execution paths

  • Minimized warp divergence

Impact

  • ~1–2% frame time improvement

  • More predictable GPU performance
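
As an illustration of mathematical masking, the sketch below writes the same kernel term three ways: with a branch, with a clamp, and with a 0/1 mask. It is a minimal C++ analogue of what a shader might do with max() or step(); the function names are hypothetical and the specific expressions used in the project's shaders are not given in this write-up.

```cpp
#include <algorithm>

// Branchy form: threads in the same warp may take different paths.
inline float kernelTermBranchy(float r2, float h2) {
    if (r2 < h2) {
        const float d = h2 - r2;
        return d * d * d;
    }
    return 0.0f;
}

// Clamped form: every thread runs the same instructions; out-of-range
// neighbours contribute zero because the difference is clamped at zero.
inline float kernelTermClamped(float r2, float h2) {
    const float d = std::max(h2 - r2, 0.0f);
    return d * d * d;
}

// Masked form: compute unconditionally, then multiply by a 0/1 mask
// (comparable to multiplying by step() in GLSL).
inline float kernelTermMasked(float r2, float h2) {
    const float d = h2 - r2;
    const float inRange = static_cast<float>(r2 < h2);
    return inRange * d * d * d;
}
```

The masked versions do some wasted arithmetic for out-of-range neighbours, so the gain depends on how divergent the branch would have been, which is consistent with the modest improvement reported here.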

Surface Reconstruction (Marching Cubes)

  • Density field generated from particle data

  • Marching Cubes executed fully on the GPU

  • Generated meshes stored in SSBOs and rendered directly

Optimizations

  • Parallel cube evaluation

  • Early exits in low-density regions

  • GPU-only surface pipeline
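
The per-cube classification and early exit can be sketched in C++ as below. The lookup tables that turn a cube index into triangles are omitted. The bit convention (a bit is set when the corner density is at or above the iso level) is an assumption and has to match whichever tables are used; cubeIndex, cubeHasNoSurface and edgeT are hypothetical names.

```cpp
#include <array>
#include <cstdint>

// Build the 8-bit case index: bit c is set when corner c is inside the fluid.
inline std::uint8_t cubeIndex(const std::array<float, 8>& density, float iso) {
    std::uint8_t idx = 0;
    for (int c = 0; c < 8; ++c) {
        const unsigned inside = density[c] >= iso ? 1u : 0u;
        idx = static_cast<std::uint8_t>(idx | (inside << c));
    }
    return idx;
}

// Early exit: cubes entirely outside (0) or entirely inside (255) the fluid
// produce no triangles, so the thread can return immediately.
inline bool cubeHasNoSurface(std::uint8_t idx) {
    return idx == 0 || idx == 255;
}

// Interpolation factor along an edge whose endpoints straddle the iso level.
// Only valid for crossing edges, where d0 != d1 is guaranteed.
inline float edgeT(float d0, float d1, float iso) {
    return (iso - d0) / (d1 - d0);
}
```

In a sparse scene most cubes fall into the all-outside case, so this check removes the bulk of the per-cube work before any triangle data is written.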

Rendering

Particle Rendering

  • GPU instanced rendering

  • A single draw call for all particles (up to 1,000,000)

  • Used primarily for debugging and analysis

Surface Rendering

  • Real-time fluid surface mesh

  • Updated every frame

  • High visual fidelity

Technologies Used

  • Languages: C++, GLSL

  • Graphics & Compute: OpenGL, Compute Shaders

  • Algorithms: SPH, Morton Order, Bitonic Sort, Marching Cubes

What This Project Demonstrates

  • GPU compute and parallel algorithm design

  • Memory-efficient, data-oriented systems

  • Real-time physics simulation

  • Performance profiling and optimization

  • Rendering + simulation integration

