Grass Simulation in OpenGL
| Mysterious | Good Day |
|---|---|
| ![]() | ![]() |
Grass blades are constructed from two Bezier curves. Each blade has a base form, which includes characteristics like lean and height. The base blade is then instanced on the GPU, where some noise is calculated in the vertex shader to alter the appearance of each instanced blade. In addition, I use Perlin noise on the CPU side (Tile.cpp) to generate rotation noise that makes each blade rotate around its root, giving the grass a smooth, windy appearance. Instanced transforms and rotations are sent to the GPU as Vec3s, and the Mat4 is derived from them on the GPU to save memory: one base grass model is sent to the GPU, while the Perlin-noised transforms and rotations are per-instance data. Uniforms like height and lightColor are used to paint a more realistic color. The fragment shader then does the per-fragment shading that makes the grass seem believable.
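As a rough sketch of the curve part (not the code in this repo), here is how a Bezier curve can be evaluated with de Casteljau's algorithm and sampled into blade points. The `Vec3` struct, the function names, and the choice of control points are all assumptions for illustration; the degree of the curves used for the actual blades isn't stated above, so this version accepts any number of control points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Linear interpolation between two points.
inline Vec3 lerpVec3(const Vec3& a, const Vec3& b, float t) {
    return { a.x + (b.x - a.x) * t,
             a.y + (b.y - a.y) * t,
             a.z + (b.z - a.z) * t };
}

// Evaluate a Bezier curve of any degree at t in [0, 1] using de Casteljau's
// algorithm: repeatedly interpolate neighbouring control points until one is left.
inline Vec3 evalBezier(std::vector<Vec3> control, float t) {
    assert(!control.empty());
    for (std::size_t n = control.size(); n > 1; --n)
        for (std::size_t i = 0; i + 1 < n; ++i)
            control[i] = lerpVec3(control[i], control[i + 1], t);
    return control[0];
}

// Sample `count` evenly spaced points along the curve, endpoints included.
inline std::vector<Vec3> sampleBezier(const std::vector<Vec3>& control, std::size_t count) {
    assert(count >= 2);
    std::vector<Vec3> points;
    points.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        points.push_back(evalBezier(control, float(i) / float(count - 1)));
    return points;
}
```

With control points like a root, a point straight above it, and a tip offset sideways, `sampleBezier` gives a leaning spine that the two edges of a blade could be built around.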
I did some fairly basic LOD-ing. According to several talks on grass simulation (more specifically the one on Ghost of Tsushima), I think they LOD by decreasing the number of vertices per blade. Instead, I simplified it to reducing the number of blades per tile with distance. I first decide which blades I want to LOD out, and then animate only the blades I keep in view (Perlin noise for rotation).
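A minimal sketch of the "fewer blades per tile with distance" idea, assuming a simple linear falloff between a near and a far distance. The `LodSettings` struct, its fields, and `bladesToKeep` are hypothetical names, and the linear falloff is my assumption here, not necessarily what Tile.cpp does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical LOD parameters; names and values are illustrative only.
struct LodSettings {
    float nearDistance;  // at or below this distance, keep every blade
    float farDistance;   // at or beyond this distance, keep only minFraction
    float minFraction;   // fraction of blades kept at farDistance, in [0, 1]
};

// Number of blades to keep for a tile at distance `d` from the camera.
inline std::size_t bladesToKeep(std::size_t bladesInTile, float d, const LodSettings& lod) {
    assert(lod.farDistance > lod.nearDistance);
    assert(lod.minFraction >= 0.0f && lod.minFraction <= 1.0f);
    // Normalise the distance to [0, 1] between the near and far thresholds.
    float t = (d - lod.nearDistance) / (lod.farDistance - lod.nearDistance);
    t = std::max(0.0f, std::min(1.0f, t));
    // Blend from "keep everything" down to minFraction.
    const float fraction = 1.0f + (lod.minFraction - 1.0f) * t;
    return static_cast<std::size_t>(fraction * static_cast<float>(bladesInTile));
}
```

If the blades in a tile are stored in a shuffled order, keeping the first `bladesToKeep(...)` of them thins the tile evenly instead of clearing out one corner.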
Culling turned out to be very important, as it prevents a lot of blades from being rendered. My frustum is built from six planes (near, far, top, bottom, right, and left), each stored as a point plus a normal. After LOD-ing, I check whether each blade that passed the LOD test should be culled. If it isn't culled, I proceed to Perlin noise generation. For the bounding volume, I chose not to use a 3D bounding box or sphere, but rather a 2D quad extending from the root of the blade to the 3rd- and 4th-to-last points along the Bezier curve. My reasoning was that a 3D AABB would contain too much empty space, since blades are quite thin, and that a quad has fewer points to check.
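Here is a scalar sketch of that quad-vs-frustum check, assuming each plane is a point plus a normal pointing out of the frustum, so a positive dot product means "outside". All names (`Plane`, `Frustum`, `isQuadCulled`, `makeBoxFrustum`) are made up for this example, and the box-shaped "frustum" is only a stand-in that makes the logic easy to test.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

inline float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }

// A plane stored as a point on the plane plus a normal.
// Assumption: normals point out of the frustum.
struct Plane { Vec3 point; Vec3 normal; };

using Frustum = std::array<Plane, 6>;  // near, far, top, bottom, right, left

// Positive signed distance along the outward normal means the point is outside.
inline bool isOutside(const Plane& p, const Vec3& v) {
    return dot(p.normal, sub(v, p.point)) > 0.0f;
}

// The quad is culled only if all four corners are outside the same plane.
inline bool isQuadCulled(const Frustum& f, const std::array<Vec3, 4>& quad) {
    for (const Plane& plane : f) {
        bool allOutside = true;
        for (const Vec3& corner : quad) {
            if (!isOutside(plane, corner)) { allOutside = false; break; }
        }
        if (allOutside) return true;
    }
    return false;
}

// Axis-aligned box [-h, h]^3 standing in for a real frustum (testing only).
inline Frustum makeBoxFrustum(float h) {
    assert(h > 0.0f);
    return Frustum{{
        Plane{Vec3{0.0f, 0.0f,  h}, Vec3{0.0f, 0.0f,  1.0f}},
        Plane{Vec3{0.0f, 0.0f, -h}, Vec3{0.0f, 0.0f, -1.0f}},
        Plane{Vec3{0.0f,  h, 0.0f}, Vec3{0.0f,  1.0f, 0.0f}},
        Plane{Vec3{0.0f, -h, 0.0f}, Vec3{0.0f, -1.0f, 0.0f}},
        Plane{Vec3{ h, 0.0f, 0.0f}, Vec3{ 1.0f, 0.0f, 0.0f}},
        Plane{Vec3{-h, 0.0f, 0.0f}, Vec3{-1.0f, 0.0f, 0.0f}},
    }};
}
```

Requiring all four corners to be outside the same plane keeps quads that straddle a plane, so blades poking into view at the screen edge are still drawn.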
To speed up the 2D quad check during culling, I used SIMD to parallelize testing whether a point is outside the frustum. To test a point, we take the dot product of each frustum plane's normal with the vector from that plane's point (the base of the normal) to the point being tested. In my test, if this is positive (same direction as the normal), then the point is outside the frustum. Because these operations can be expressed as a lot of fmadds, subs, and broadcasts, I chose to use SIMD to increase throughput.
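To give an idea of what the broadcast / sub / multiply-add pattern looks like, here is a small sketch that tests eight points at a time against one plane with AVX intrinsics. It is not the repo's code: the structure-of-arrays layout, the `Plane` field names, and `outsideFlags` are assumptions for this example. It uses separate multiplies and adds where `_mm256_fmadd_ps` could be used, so it only needs AVX and not FMA, and it falls back to a plain scalar loop when the compiler isn't targeting AVX.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Structure-of-arrays storage so eight x (or y, or z) values load contiguously.
struct PointsSoA { std::vector<float> x, y, z; };

// Plane as a point (px, py, pz) plus a normal (nx, ny, nz).
// Assumption: the normal points out of the frustum.
struct Plane { float px, py, pz, nx, ny, nz; };

// Returns one flag per point: 1 if the point is outside the plane, else 0.
inline std::vector<std::uint8_t> outsideFlags(const Plane& p, const PointsSoA& pts) {
    const std::size_t n = pts.x.size();
    assert(pts.y.size() == n && pts.z.size() == n);
    std::vector<std::uint8_t> flags(n, 0);
    std::size_t i = 0;
#if defined(__AVX__)
    // Broadcast the plane data once into all eight lanes.
    const __m256 px = _mm256_set1_ps(p.px), py = _mm256_set1_ps(p.py), pz = _mm256_set1_ps(p.pz);
    const __m256 nx = _mm256_set1_ps(p.nx), ny = _mm256_set1_ps(p.ny), nz = _mm256_set1_ps(p.nz);
    const __m256 zero = _mm256_setzero_ps();
    for (; i + 8 <= n; i += 8) {
        // Vector from the plane point to each of the eight test points.
        const __m256 dx = _mm256_sub_ps(_mm256_loadu_ps(&pts.x[i]), px);
        const __m256 dy = _mm256_sub_ps(_mm256_loadu_ps(&pts.y[i]), py);
        const __m256 dz = _mm256_sub_ps(_mm256_loadu_ps(&pts.z[i]), pz);
        // Dot product with the normal, eight points at once.
        __m256 d = _mm256_mul_ps(dx, nx);
        d = _mm256_add_ps(d, _mm256_mul_ps(dy, ny));
        d = _mm256_add_ps(d, _mm256_mul_ps(dz, nz));
        // Bit k of the mask is set when lane k has a positive distance.
        const int mask = _mm256_movemask_ps(_mm256_cmp_ps(d, zero, _CMP_GT_OQ));
        for (int lane = 0; lane < 8; ++lane)
            flags[i + lane] = static_cast<std::uint8_t>((mask >> lane) & 1);
    }
#endif
    // Scalar tail (and full fallback when AVX is not enabled).
    for (; i < n; ++i) {
        const float d = (pts.x[i] - p.px) * p.nx + (pts.y[i] - p.py) * p.ny + (pts.z[i] - p.pz) * p.nz;
        flags[i] = d > 0.0f ? 1 : 0;
    }
    return flags;
}
```

The AVX path is only compiled when AVX is enabled (for example `-mavx` on GCC/Clang or `/arch:AVX` on MSVC). Storing x, y, and z in separate arrays is what lets a single load fill all eight lanes without shuffling.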
- MORE parallelizable, MORE vectorizable!
- 🤯 Learned to use Intel AVX intrinsics.
- At ~80k blades, SIMD and scalar both sit at 60 FPS.
- At ~180k blades, SIMD and scalar both sit at ~60 FPS.
- At ~500k blades, SIMD sits consistently in the mid-to-high 40s FPS; scalar sits in the high 30s to low 40s. ▶️ Very good.
- Multithread the frustum culling, then draw only the blades that are not culled out.
- ❓ Apparently not great: it doesn't do as well as the non-threaded version.
- Frustum culling is currently done per blade. It should probably be changed to cull per tile.
- Aesthetically, it isn't there yet: blades don't rotate. They flap in the wind on the broad side.
Blade generation should be done on the GPU as there are more cores and threads.
We HAVE TO cull tiles.
We generate the blades of grass up front so each Grass object knows its Bezier vertices. Each grass blade gets its own AABB, as usual, for frustum culling. I think frustum culling is typically done in a compute shader, but I just do it on the CPU because I already have it working.
Cull based on blade position.
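For the tile-level culling mentioned above, one common approach is to test each tile's AABB against the six frustum planes before touching any of its blades. This is a sketch under the same assumption as my point test (planes stored as point + outward normal, positive means outside); `Aabb`, `boxOutsidePlane`, and `isTileCulled` are hypothetical names and this isn't implemented in the repo yet.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };
struct Plane { Vec3 point; Vec3 normal; };  // assumption: normal points out of the frustum
struct Aabb { Vec3 lo; Vec3 hi; };          // minimum and maximum corners of a tile

// True when the whole box lies on the outside of the plane.
inline bool boxOutsidePlane(const Plane& p, const Aabb& b) {
    // Pick the corner that reaches furthest back toward the inside of the plane.
    const Vec3 c { p.normal.x > 0.0f ? b.lo.x : b.hi.x,
                   p.normal.y > 0.0f ? b.lo.y : b.hi.y,
                   p.normal.z > 0.0f ? b.lo.z : b.hi.z };
    const float d = (c.x - p.point.x) * p.normal.x
                  + (c.y - p.point.y) * p.normal.y
                  + (c.z - p.point.z) * p.normal.z;
    // If even that corner is outside, every other corner is too.
    return d > 0.0f;
}

// A tile is culled when its box is entirely outside at least one plane.
inline bool isTileCulled(const std::array<Plane, 6>& frustum, const Aabb& tile) {
    assert(tile.lo.x <= tile.hi.x && tile.lo.y <= tile.hi.y && tile.lo.z <= tile.hi.z);
    for (const Plane& plane : frustum)
        if (boxOutsidePlane(plane, tile)) return true;
    return false;
}
```

This test is conservative: a box near a frustum corner can pass even though it is fully out of view, which only costs a few wasted per-blade checks. Tiles that pass would then go through the per-blade culling as before.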
- Random lengths: the longer a blade is, the more it bends.
- Random directions.
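A small sketch of what the random lengths and directions could look like on the CPU side. Everything here is an assumption for illustration: the `BladeParams` fields, the `makeBlades` name, and especially the choice to make lean grow linearly with height, which is just the simplest way to get "longer blades bend more".

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-blade parameters.
struct BladeParams {
    float height;  // blade length
    float lean;    // bend amount; here assumed proportional to height
    float facing;  // rotation about the vertical axis, in radians
};

// Generate `count` blades with random heights and facing directions.
inline std::vector<BladeParams> makeBlades(std::size_t count, unsigned seed,
                                           float minHeight, float maxHeight,
                                           float leanPerUnitHeight) {
    assert(minHeight > 0.0f && minHeight < maxHeight);
    std::mt19937 rng(seed);  // fixed seed gives a repeatable field
    std::uniform_real_distribution<float> heightDist(minHeight, maxHeight);
    std::uniform_real_distribution<float> facingDist(0.0f, 6.2831853f);  // 0 to 2*pi
    std::vector<BladeParams> blades;
    blades.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const float h = heightDist(rng);
        const float facing = facingDist(rng);
        blades.push_back(BladeParams{h, h * leanPerUnitHeight, facing});
    }
    return blades;
}
```

Using a fixed seed per tile would keep the same blades in the same place between runs, which makes recordings easier to compare.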
Lighting
Recording.2025-08-18.213312.mp4
Absolutely squeezed perf out of SIMD; Info from VS Profiler
0817.1.mp4
Enabled Instanced Grasslets
Recording.2025-08-17.171151.mp4
Very slow, but mobile little grasslets
Screen.Recording.2025-08-16.193339.mp4
Below is 125k blades of grass + SIMD Frustum Culling
0814.2.mp4
With more and more grass, even with LODing it gets pretty slow, so I implemented frustum culling. With good LODing and frustum culling, we get 60 FPS.
Screen.Recording.2025-08-09.013648.mp4
LODing helps to increase FPS, but of course at the cost of aesthetics.
Look at that FPS!
Beautiful Gaggle of Grass
Colorful Gaggle of Grass
Gaggle of Grass
Basic grass flaco
Basic grass phat

