Ker-SPLAT

World Interactions in Cosmonious High

And now for something completely different! Starting this month we’ll take a deep dive into the tech that made Cosmonious High what it is.

Early in development we knew we wanted the player to be able to coat the world around them with water, paint, and the various chemicals found in the colorful world of Cosmonious High. Then we realized: if we have the tech, why not expand it so non-fluids feel just as responsive? Fire can leave scorch marks, ice can make things frosty. Thus, SplatTech was born.

Our initial prototype was a great success, but shipping on all of our target platforms required a highly optimized version, which proved to be a far greater technical task. We’ll cover highlights from both versions.

“This article is quite technical, so grab yourself a cup of coffee and get comfortable!”

-Ben Hopkins, Expert Graphics Engineer, who definitely wrote 95% of this because we don’t even know what SDF is

Initial Implementation

Early in development we opted to use baked lighting to get Cosmonious High’s signature, delicious softness. Luckily this meant that all static objects would have an additional set of unique texture coordinates already set up that we could also use to paint any static surface at runtime.

Faking High Resolution with Low Resolution Textures

The world of Cosmonious High has lots of surfaces and almost all of them are splatable. Using textures of a high enough resolution to represent the splats at the visual quality we required was simply not realistic. Luckily there’s a well-known alternative: signed distance fields (SDF). Instead of using a channel of the texture to store color information, we use it to store the closest distance from that point in space to some other surface.
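As a rough illustration of the idea (the names here are ours, not from the SplatTech code), the signed distance from a point to a spherical splat is just the distance to the sphere’s center minus its radius: negative inside, zero on the surface, positive outside.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance from a 3D point to the surface of a sphere.

    Negative inside the sphere, zero exactly on the surface,
    positive outside. Illustrative helper, not actual SplatTech code.
    """
    return math.dist(point, center) - radius
```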

Early SplatTech implementation example

When adding a new splat to the world we treat it as a sphere, storing the closest distance from every point in the texture (which itself is a point on a surface in the world) to the surface of this sphere. If that point in the texture already has a value, we check if the new value is smaller before writing it to the texture. When rendering a splatable surface in game we can now simply check if the appropriate texel’s distance value is less than zero to determine if we’re “inside” a splat or not!
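In Python-flavored pseudocode, that min-combine update and the inside test might look like this (a sketch under assumed names; the real version runs in a shader):

```python
import math

def add_splat(sdf_texture, texel_world_positions, center, radius):
    """Fold a new spherical splat into a flat SDF texture: each texel keeps
    whichever distance is closer, so overlapping splats merge naturally."""
    for i, world_pos in enumerate(texel_world_positions):
        d = math.dist(world_pos, center) - radius  # signed distance to splat
        if d < sdf_texture[i]:  # only write if the new splat is closer
            sdf_texture[i] = d

def is_inside_splat(sdf_texture, texel_index):
    # A distance below zero means this surface point sits inside a splat.
    return sdf_texture[texel_index] < 0.0
```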

Using a low resolution texture to store SDF data like this trades pixelation for smooth but low-detail shapes. Using various tiled noise textures to offset the SDF itself before checking the distance in the shader adds detail back to the shape!
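The offset-before-threshold trick fits in one line; the amplitude and the assumption that noise is centered around 0.5 are ours:

```python
def inside_detailed_splat(sdf_value, noise_value, noise_amplitude=0.1):
    """Nudge the smooth SDF by tiled noise (assumed in 0..1) before the
    zero test, breaking a perfectly round edge into a detailed one."""
    return sdf_value + (noise_value - 0.5) * noise_amplitude < 0.0
```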

Splat’s SDF without and with noise offsetting

Storing Every Substance in the Cosmos in 2 Textures

With an ever growing list of potential substances in Cosmonious High, we needed a flexible way of representing them. Not wanting to go over two textures per SplatMap, we ended up with the following data per texel of a splatable surface:

  • Persistent SDF (substances that should persist until cleaned by the player)

  • Dynamic SDF (substances that should evaporate and clean away persistent substances)

  • Substance thickness/viscosity 

  • Substance wetness/shininess

  • Substance color and opacity

Every splat in the game, from fire scorches to metalligem splatters to water, is made using only those five channels.
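One way to picture the layout is eight values split across the two RGBA textures; the channel assignment below is our assumption, not the shipped layout:

```python
def splatmap_texel(persistent_sdf, dynamic_sdf, thickness, wetness, color):
    """Split one texel's worth of splat data across two hypothetical RGBA
    SplatMap textures. `color` is an (r, g, b, a) tuple."""
    texture_a = (persistent_sdf, dynamic_sdf, thickness, wetness)
    texture_b = color  # r, g, b, a
    return texture_a, texture_b
```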

Visualization of various data channels in a SplatMap

Making an Object Splatable

At this stage, the only requirements for an object to be splatable are lightmap UVs and our special static shader that reads the SplatMap textures and shades the splats directly inline. Adding a SplatPointer component to the object creates and manages the 2 SplatMap textures for that object. During gameplay, if we detect a collision that should result in a splat we can easily look up the appropriate SplatPointer based on its Collider and submit a new splat at that worldspace position, with things such as radius, color and substance values all configurable per-splat. Every frame we update all SplatMaps that have had new splats added.
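The collision-to-splat flow might look like the following sketch. `splat_pointers`, `on_splat_collision`, and all the parameter values are hypothetical names; the real components live on the Unity side.

```python
class SplatPointer:
    """Minimal stand-in for the component described above: it owns an
    object's SplatMap textures and queues newly submitted splats."""

    def __init__(self):
        self.pending_splats = []

    def submit_splat(self, position, radius, color, substance):
        self.pending_splats.append((position, radius, color, substance))


# Hypothetical registry mapping a physics collider to its SplatPointer.
splat_pointers = {}


def on_splat_collision(collider, position):
    """Look up the collider's SplatPointer and queue a splat there."""
    pointer = splat_pointers.get(collider)
    if pointer is not None:
        pointer.submit_splat(position, radius=0.2,
                             color=(0.2, 0.5, 1.0), substance="water")
```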

A Frame in the Life of a SplatMap

When a SplatMap needs to be updated we render the object’s mesh with a special update shader, using its lightmap UVs in place of its 3D vertex positions. New splats are submitted to this shader as vector arrays for things such as position, radius, color and substance values. Iterating over these arrays, we calculate the new SDFs, color, and substance values for each point on the object and write them back to the 2 SplatMap textures at once using MRT (multiple render targets).

Once the SplatMap textures have been updated, we run several iterations of a dilation shader to eliminate visible seams at the edges of UV islands. As a final step we also run several iterations of a blur shader on the texture representing splat colors to give the impression of diffusion between substances and hide the low resolution of this texture.
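A single dilation iteration is conceptually simple: any texel not covered by a UV island copies the value of a covered 4-neighbour, pushing valid data just past the island edge. A grid-based sketch (not the shader itself):

```python
def dilate(grid, covered):
    """One dilation pass over a 2D grid. `covered[y][x]` marks texels that
    belong to a UV island; uncovered texels inherit a covered neighbour's
    value so filtering at island edges doesn't pick up garbage."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    out_covered = [row[:] for row in covered]
    for y in range(h):
        for x in range(w):
            if covered[y][x]:
                continue  # already valid, nothing to do
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and covered[ny][nx]:
                    out[y][x] = grid[ny][nx]
                    out_covered[y][x] = True
                    break
    return out, out_covered
```

Running several iterations, as the article describes, extends the border one texel at a time.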

Meta Quest 2

Up until this point, development had been solely focused on Desktop PC. With our plans to simultaneously launch on Meta Quest 2, it was time to see how SplatTech ran on the standalone headset’s mobile GPU.

So... How Bad of a State Are We In?

When we first ran Cosmonious High on Quest, we were seeing framerates in the single digits.

Not Good.

With everything disabled in a scene other than SplatTech, our GPU frametime metrics were capping out at 64ms. Actual GPU frametime was significantly worse. 

Definitely. Not. Good!

The initial desktop implementation of SplatTech reads like a laundry list of things you should probably avoid on Quest. Memory bandwidth is a huge issue; every surface has to read and write 2 textures when updating its SplatMap and then has to read both of those textures and a 3D noise texture when rendering the surface, not to mention any textures used by the surface’s material. Since each splatable object has its own unique SplatMap, drawcall batching is also broken. Then, due to the TBDR (tile-based deferred renderer) nature of the Quest’s GPU, any time you render to a texture you pay a hefty “resolve” cost before you can read that texture somewhere else. Each of those dilation and blur iterations suffers this cost, as does the SplatMap update shader itself. With each object having its own unique SplatMap, we’re also doing each of these steps many times per frame.

Determining the Correct Path

With such incredibly tight constraints, it was important to break down all the moving parts and find where savings, no matter how large or small, could be made. Using a custom benchmarking framework, we were able to quickly iterate on optimizations and new techniques with accurate GPU frametimes output to easily parsable graphs.

Benchmark output example for early dilation optimization tests

For SplatTech to ship on Quest 2 it was clear that we needed to: 

  • limit updates to a single SplatMap per frame

  • figure out a way of implementing dilation as close to free as possible

  • decouple splat shading from surface shading (ideally with a distance limit from the player beyond which splats don’t incur full-cost shading)

Final Implementation

The version of SplatTech that players experience in Cosmonious High today is an almost complete rewrite of the original implementation, with the same code used for both Meta Quest and Desktop PC. The final implementation can be broken down into 3 parts: offline mesh processing, runtime mesh generation and texture storage on player teleport, and runtime splat updating and shading. 

Offline

As an offline process, we collect every splatable object in a scene and generate an asset that contains the object’s mesh data broken up into a custom data structure that allows for UV-centric queries to be performed very quickly. While this step is technically not needed, it drastically speeds up the mesh generation step that occurs on player teleport. At this stage we also calculate dilation offsets for the scene’s SplatCache.

On Player Teleport

When the player teleports to a new location, we first write all splat data from the local SplatMap back into the SplatCache and then update the baked splat texture for this region. At this point we generate a brand new mesh containing the triangles of all splatable objects within a 10m radius of the player’s new location. We then dynamically generate a UV layout for this new mesh, ensuring consistent texel density within a 1k texture (the local SplatMap). This mesh generation process runs off the main thread, and, thanks to the data structure we generate offline and some other optimizations, only takes a few frames on Quest (mesh generation is guaranteed to complete before we start to fade in from black). Finally, we copy relevant splat data for the new mesh from the SplatCache into the local SplatMap, keeping splats for the entire scene persistent between teleports.

SplatMesh and UV generation as player teleports through scene

Runtime

The 8 channels of data that were previously stored in 2 separate RGBA textures are bit-packed into a single 32bit unsigned texture (the bit depth of each channel varies based on usage, the highest being 6bits for the dynamic SDF and the lowest being just 1bit for alpha).
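Bit-packing like this is plain shifting and masking; the sketch below uses illustrative bit depths (only the 6-bit dynamic SDF and 1-bit alpha are stated above):

```python
def pack_channels(values_and_bits):
    """Pack (value, bit_depth) pairs, lowest bits first, into one 32-bit
    unsigned integer."""
    packed, shift = 0, 0
    for value, bits in values_and_bits:
        assert 0 <= value < (1 << bits), "value must fit its bit depth"
        packed |= value << shift
        shift += bits
    assert shift <= 32, "channels must fit in 32 bits total"
    return packed


def unpack_channels(packed, bit_depths):
    """Recover the channel values given the same bit-depth layout."""
    values, shift = [], 0
    for bits in bit_depths:
        values.append((packed >> shift) & ((1 << bits) - 1))
        shift += bits
    return values
```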

During gameplay we update the local SplatMap at 30Hz using a single compute shader. The compute shader first unpacks the data for the current texel, the worldspace position, and encoded dilation offset, then iterates over all newly-submitted splats, updating both SDF channels and blending substance data and color accordingly. For splats on non-static surfaces (anything that can move/rotate/scale during gameplay), we inverse-transform the splat’s position into the object’s local space and then use an encoded object ID to only allow that splat to affect certain texels in the SplatMap. The dynamic SDF is then subtracted from the persistent SDF and also decremented so that substances like water can clean paint and evaporate over time. Finally, the updated data is packed back into a single 32bit value and pre-calculated dilation is performed. Phew!
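The clean-and-evaporate step can be sketched per texel; the sign convention (negative = inside a splat) and the evaporation rate are our assumptions:

```python
def clean_and_evaporate(persistent_sdf, dynamic_sdf, evaporation_step=0.01):
    """One hypothetical SplatMap tick: where a dynamic substance (water)
    covers a persistent one (paint), the persistent SDF is pushed back
    toward 'outside'; the dynamic splat then shrinks a little each tick."""
    if dynamic_sdf < 0.0:  # this texel is inside a dynamic splat
        persistent_sdf -= dynamic_sdf  # subtracting a negative cleans it
    dynamic_sdf = min(dynamic_sdf + evaporation_step, 1.0)  # evaporate
    return persistent_sdf, dynamic_sdf
```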

Visualizing SplatCache (left) and local SplatMap (right) IO

While the packed 32bit unsigned texture is optimal for storage and updating, when it comes time to actually render the splats it has several limitations. The bit-packed data means we can’t rely on hardware texture filtering or mip maps. Initially I implemented custom filtering in the shader by using GatherRed() to retrieve 4 separate texels and manually unpack and filter the data inline, however this adds overhead to every frame (and eye) we render, and we can do better. On the frame following SplatMap update, we “resolve” the packed 32bit texture into two 16bit 565 format textures. The resolve process combines the 2 SDF channels into one (as splat shading doesn’t need to distinguish between persistent and dynamic splats) and stores this value with substance values in texture 1. The RGB color channels are stored in texture 2, with alpha discarded as we can infer it at shading time from the substance values! Post-resolve we generate mip maps as well.
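Packing into a 16-bit 565 texture is the classic RGB565 encoding; a minimal sketch (the channel order here is an assumption):

```python
def pack_rgb565(r, g, b):
    """Quantize three 0..1 floats into one 16-bit value: 5 bits red,
    6 bits green, 5 bits blue."""
    r5 = int(r * 31 + 0.5)  # round to 5 bits
    g6 = int(g * 63 + 0.5)  # round to 6 bits
    b5 = int(b * 31 + 0.5)  # round to 5 bits
    return (r5 << 11) | (g6 << 5) | b5
```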

Visualizing real time splats around the player & cheap baked splats in the distance

Splats themselves are shaded at 2 different levels. For everything within the 10m radius of the player, the mesh generated on teleport is blended over the top of the world, with splats built up in a single shader using the 2 resolved textures. Substance values control SDF and noise thresholding to create various shapes and shading effects. The low resolution of the color channel is hidden by offsetting the coordinate read from using noise attenuated by distance from the splat’s edge. For all surfaces outside the 10m radius, baked splats are used with a single texture read in our main static shader. The boundary between realtime and baked splats is hidden using a simple crossfade.
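The crossfade at the boundary is a simple distance-based lerp; the radius matches the 10m figure above, but the band width is our assumption:

```python
def crossfade(realtime_color, baked_color, distance, radius=10.0, band=1.0):
    """Blend from full real-time splat shading to cheap baked splats as
    `distance` approaches the radius, hiding the seam."""
    t = min(max((distance - (radius - band)) / band, 0.0), 1.0)
    return tuple(a * (1.0 - t) + b * t
                 for a, b in zip(realtime_color, baked_color))
```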

Ship It!

It may have taken many months and a ground-up overhaul, but Splats are running at framerate on Quest 2 and we’re excited to have our most responsive world yet.