As Gaussian Splatting is blasting forward in public awareness, one question I’m frequently asked in my master classes is: “How do I separate and animate individual objects in a Gaussian Splatting (or NeRF) scene?”
While quite a lot still has to happen before we can simply pull parts apart in 3D space and infill the scene behind them (especially at runtime), some interesting developments are going on…
The Power of Mask3D
The latest development is Mask3D: Mask Transformer for 3D Instance Segmentation. The idea is beautifully simple - you provide the model with a .ply point cloud file and it segments the cloud for you, grouping the points into labeled segments that you can then do anything with. For example, you could move them apart in 3D space.
The cool thing is that Gaussian Splatting scenes are commonly stored as .ply files too, with each Gaussian’s center saved as a point. So there’s that immediate connection.
From where I look, it shouldn’t take too much engineering to figure out how to animate objects, as it’s just point data. They have a demo page that looks pretty straightforward, so it could be an interesting experiment to document and share. In the coming days we might look into it at Manyone.
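To make the "move them apart" idea concrete, here is a minimal sketch of what could happen once every point has an instance label. It assumes you already have the point positions as an (N, 3) array and one integer label per point. The label array is a stand-in for whatever a segmentation model such as Mask3D would output, not its actual output format. The function name explode_segments and the factor parameter are made up for illustration.

```python
import numpy as np

def explode_segments(positions, labels, factor=0.5):
    """Push each labeled segment away from the scene centroid.

    positions: (N, 3) point positions (e.g. Gaussian centers).
    labels:    (N,) integer instance label per point (assumed input).
    factor:    how far to push, relative to each segment's distance
               from the scene centroid (hypothetical parameter).
    """
    positions = np.asarray(positions, dtype=float)
    labels = np.asarray(labels)
    scene_center = positions.mean(axis=0)

    moved = positions.copy()
    for label in np.unique(labels):
        mask = labels == label
        segment_center = positions[mask].mean(axis=0)
        # Every point in a segment gets the same offset,
        # so the segment moves rigidly and keeps its shape.
        moved[mask] += (segment_center - scene_center) * factor
    return moved

# Toy scene: two small "objects" along the x axis.
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [10.0, 0.0, 0.0], [11.0, 0.0, 0.0]])
labels = np.array([0, 0, 1, 1])
moved = explode_segments(positions, labels, factor=0.5)
```

Animating would then just mean ramping factor from 0 upwards over time. For an actual Gaussian Splatting scene, a pure translation like this should only need the centers to change. Rotating a segment would presumably also require updating each Gaussian’s orientation, which this sketch does not attempt.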
Different Approaches to Object Separation
Previously on this topic I looked into something called Distilled Feature Fields (DFF). That’s still super relevant, as the two techniques work in different spatial domains: DFF is field-based, while Mask3D is point-based, and their potential differs along several dimensions. It’s much like how NeRFs are only temporarily taking a break and letting Gaussian Splatting shine while our hardware catches up to the potential of runtime NeRFs.
Links: