
  SmallBurger

Sunday, March 7, 2021

How to optimize a massive number of GPU instance transforms?

Recently, I've become obsessed with the vegetation system in Ghost of Tsushima, primarily because of its lightning-fast loading times. When I re-enter the game in the open world, it loads quickly, and there is rarely any noticeable lag while moving. This has sparked my journey into optimizing GPU instances...


Below is the process I used to make 1,000 instances stand in for 1,000,000 pieces of vegetation.

  1. Using the same template page with random transforms to simulate vegetation across an enormous world:




  2. Using a quadtree to gather the currently visible pages and passing them to the culling compute shader, which computes the visible instance IDs and their count.
  3. Passing in a weight map so the culling pass can filter which instances are collected:


  4. Using ComputeBuffer.CopyCount to copy the number of instances that survived culling into the argument compute buffer. The copy happens entirely on the GPU, with no CPU readback.
  5. Finally, calling Graphics.DrawMeshInstancedIndirect with the argument compute buffer to render the results.
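
The steps above can be sketched on the CPU to make the data flow easier to follow. This is only a rough Python illustration, not the actual Unity or compute shader code. The 32 x 32 page grid, the 2D view rectangle standing in for the camera frustum, the procedural `weight_map` standing in for a weight texture, and every function name here are assumptions made for the example. The last line mirrors the five-uint argument layout that DrawMeshInstancedIndirect expects (index count per instance, instance count, start index, base vertex, start instance); on the GPU, ComputeBuffer.CopyCount would write the second value.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float

    def intersects(self, o):
        return (self.x < o.x + o.w and o.x < self.x + self.w and
                self.y < o.y + o.h and o.y < self.y + self.h)

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def make_template(n, page_size, seed=0):
    """Step 1: one page of random local offsets, reused by every page in the world."""
    rng = random.Random(seed)
    return [(rng.uniform(0, page_size), rng.uniform(0, page_size)) for _ in range(n)]


def gather_pages(node, view, page_size, out):
    """Step 2a: quadtree descent that skips whole subtrees outside the view."""
    if not node.intersects(view):
        return
    if node.w <= page_size:          # leaf: exactly one page
        out.append(node)
        return
    hw, hh = node.w / 2, node.h / 2
    for dx in (0, hw):
        for dy in (0, hh):
            gather_pages(Rect(node.x + dx, node.y + dy, hw, hh), view, page_size, out)


def cull(pages, template, view, weight, threshold, pages_per_side, page_size):
    """Steps 2b and 3: stand-in for the culling compute shader.

    Appends the global ID of every instance that is inside the view and whose
    weight passes the threshold. Assumes the world origin is at (0, 0).
    """
    visible = []
    for page in pages:
        page_id = int(page.y / page_size) * pages_per_side + int(page.x / page_size)
        for local_id, (lx, ly) in enumerate(template):
            wx, wy = page.x + lx, page.y + ly   # template offset + page origin
            if view.contains(wx, wy) and weight(wx, wy) >= threshold:
                visible.append(page_id * len(template) + local_id)
    return visible


def weight_map(x, y):
    """Hypothetical density mask: a gradient that repeats once per world unit."""
    return x % 1.0


PAGES_PER_SIDE = 32    # assumed: 32 x 32 pages x 1,000 instances, about one million
PAGE_SIZE = 1.0
template = make_template(1000, PAGE_SIZE)
world = Rect(0.0, 0.0, PAGES_PER_SIDE * PAGE_SIZE, PAGES_PER_SIDE * PAGE_SIZE)
view = Rect(4.5, 4.5, 2.0, 2.0)

pages = []
gather_pages(world, view, PAGE_SIZE, pages)
visible = cull(pages, template, view, weight_map, 0.5, PAGES_PER_SIDE, PAGE_SIZE)

# Steps 4 and 5: only the instance count changes per frame; 6 is a placeholder index count.
args = [6, len(visible), 0, 0, 0]
print(len(pages), "pages gathered,", args[1], "instances to draw")
```

Only the 1,000 template offsets are ever stored. A global instance ID encodes the page and the slot inside the template, so the transform can be rebuilt from the page origin when drawing.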

Here's a video demo showing the result. Thank you for taking the time to watch and share.




