Fugue Devlog 16: Motion Capture Tests

· 08.12.2022 · projects/fugue

An example rendering from the CMU MoCap database.

After a few busy weeks I've found a bit of time to continue work on Fugue. One of the last major questions in content production, after writing, textures, and music, is character animation (this still leaves object and character modeling as the other two problem areas). While I believe I can get away with lower-poly models and crunchier photo-textures, I don't think that's the case with low-quality animation; it's too jarring. So I want to figure out a way to produce fairly good, realistic motion on the cheap.

There are a number of deep learning-based projects available for motion capture without a sophisticated tracking setup. Some are even monocular, requiring only one camera. There are some commercial offerings (such as deepmotion.com) but I want to see how far I can get with open source options. I'll be able to modify those as I need, and they'll be easier to integrate into a more automated process than commercial options.

The open source projects are usually research projects, so they aren't polished, are somewhat janky, and probably don't generalize very well. And their licenses often restrict usage to non-commercial purposes. For example, MocapNET, EasyMocap, and FrankMocap are all restricted to non-commercial use. I did find MotioNet, which does allow commercial usage (under its BSD-2 license) and requires only one camera, so that was promising.

One alternative to the deep learning approach is to just use existing motion capture data and hope that covers all the animations I'd need. A great resource is the CMU Graphics Lab Motion Capture Database, which has generously been converted to .bvh by Bruce Hahne for easy usage in Blender. The collection encompasses 2,500 motions and is "free for all uses". The range of motions is expansive enough (including things like pantomiming a dragon) that it's possible it will have everything I need.
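
The converted clips load straight into Blender through its bundled BVH importer, so previewing a motion is a one-liner from the Python console. A minimal sketch (the file path is a placeholder, and the options shown are just the ones I'd reach for first):

```python
import bpy

# Import a converted CMU clip onto a new armature (path is a placeholder).
bpy.ops.import_anim.bvh(
    filepath="/path/to/cmu/02_01.bvh",
    global_scale=1.0,     # adjust if the clip comes in at the wrong size
    use_fps_scale=True,   # remap the clip's frame rate to the scene's
    rotate_mode="NATIVE",
)
```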

Still, I wanted to try out the deep learning approaches, in part because I was curious.

One note here is that these models typically output motions as .bvh files. These contain motion instructions addressed to a particular skeleton (where, for example, the left leg bone might be named LeftLeg). I used Mixamo's auto-rigger to rig my character, and the resulting skeleton has a different naming system. Fortunately there is a Blender addon, "BVH Retargeter", that remaps a .bvh to a differently-named skeleton. It doesn't include a mapping for Mixamo by default, but I set one up myself (available here; it goes into the known_rigs directory).
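
The renaming part of this is simple, since a .bvh is plain text and each bone is introduced by a ROOT or JOINT line. Here's a rough sketch of rewriting bone names to Mixamo's convention; the mapping is an illustrative subset rather than my full table, and renaming alone isn't a substitute for the addon, which does an actual retarget on top of the name mapping:

```python
import re

# Illustrative subset of a source-skeleton -> Mixamo bone name mapping.
# The real table needs an entry for every bone in the source .bvh.
BONE_MAP = {
    "Hips": "mixamorig:Hips",
    "Spine": "mixamorig:Spine",
    "LeftUpLeg": "mixamorig:LeftUpLeg",
    "LeftLeg": "mixamorig:LeftLeg",
    "LeftFoot": "mixamorig:LeftFoot",
    "RightUpLeg": "mixamorig:RightUpLeg",
    "RightLeg": "mixamorig:RightLeg",
    "RightFoot": "mixamorig:RightFoot",
}

def rename_bvh_bones(src_path: str, dst_path: str) -> None:
    """Rewrite ROOT/JOINT bone names in a .bvh file using BONE_MAP."""
    pattern = re.compile(r"^(\s*)(ROOT|JOINT)\s+(\S+)\s*$")
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            m = pattern.match(line)
            if m:
                indent, kind, name = m.groups()
                line = f"{indent}{kind} {BONE_MAP.get(name, name)}\n"
            dst.write(line)

rename_bvh_bones("clip.bvh", "clip_mixamo.bvh")
```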

On this note, there is also this Deep-motion-editing project which has a much more sophisticated retargeter:

Deep-motion-editing retargeter

I don't know yet if I'll have a need for this, but good to know it's there!

On to the tests:

I'm using a Kiros Seagill model (from FF8) for these tests.

Even though the MocapNET license is not what I need, I decided to try it anyways:

MocapNET test

It looks ok, a little janky and all over the place though. And the hands aren't animated.

MotioNet

MotioNet looked promising but unfortunately did not have very good output. The output pose is upside-down for some reason (this is a known issue), which seems like an easy enough fix, but the joint movement is stiff and incorrect.
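
If the orientation were the only issue, it would be a quick fix in Blender's Python console. A sketch (the object name and flip axis are placeholders to check against your own import):

```python
import math
import bpy

# Flip an upside-down imported armature 180 degrees around the X axis.
# "MotioNet_armature" is a placeholder; use whatever name the import produced.
arm = bpy.data.objects["MotioNet_armature"]
arm.rotation_euler[0] += math.radians(180)
```

The real blocker is the stiff, incorrect joint movement, which isn't something a transform can fix.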

CMU MoCap

The CMU motion looks great of course, as it's actually captured properly. Again, the only concern here is whether its range of motions is wide enough to cover everything I need.

The last software I tried is FreeMoCap, which is still in very early stages of development, but there's enough to try it out. It was quite a bit more complicated to set up, as it works best with multiple cameras (they can still be fairly cheap webcams, e.g. $20-30 each), and requires a charuco board for calibration, which I printed off at Kinko's. That cost me about $30 to get it on poster board, but you can probably make something cheaper by printing on large-format paper and mounting it on an old cardboard box. In total I spent ~$100 on equipment.
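
If you'd rather skip the print shop, you can also generate a board image yourself and print it at whatever size works. A sketch using OpenCV's aruco module (this assumes the OpenCV 4.7+ API, and the square counts, sizes, and dictionary here are placeholders; match them to whatever board FreeMoCap's calibration expects):

```python
import cv2

# Generate a charuco board image to print (OpenCV >= 4.7 aruco API assumed).
aruco_dict = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
board = cv2.aruco.CharucoBoard(
    (7, 5),   # squares along x, y -- placeholder counts
    0.12,     # square side length in meters
    0.09,     # marker side length in meters
    aruco_dict,
)
img = board.generateImage((2800, 2000))  # output size in pixels; scale up for large prints
cv2.imwrite("charuco_board.png", img)
```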

The most important thing is to get webcams that work for the size of your recording room, so that your full body is in frame for all of them (which may require wide-angle cameras). Then you need to make sure that your charuco board is large enough that its patterns are clear to the webcams: the further away you position the cameras, and the lower the resolution you record at, the larger the board needs to be. Note that there's also a resolution/frame-rate trade-off: when running 3 cameras at 1080p I get about 14-15fps, but I needed that resolution for my charuco board to render clearly. The FPS can also be bottlenecked if you run your cameras through a USB hub (some of them may not even work in that case); I ended up plugging each camera into its own port for best performance.
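
Before recording for real, it's worth probing what each camera actually delivers at a given resolution, since the driver won't necessarily give you what you ask for. A quick sketch with OpenCV (the camera indices are placeholders for however your webcams enumerate):

```python
import time
import cv2

def measure_fps(camera_index: int, width: int, height: int, frames: int = 60) -> float:
    """Open a camera at the requested resolution and measure the FPS it actually delivers."""
    cap = cv2.VideoCapture(camera_index)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
    start = time.time()
    for _ in range(frames):
        cap.read()
    cap.release()
    return frames / (time.time() - start)

for idx in range(3):  # placeholder indices
    print(f"camera {idx}: {measure_fps(idx, 1920, 1080):.1f} fps at 1080p")
```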

Getting the program to work was tricky, which wasn't a surprise given the project is in an alpha state. I had to make a few changes to get it to run properly: mainly switching from multithreading to multiprocessing, since the threads were blocked on my system, and manually setting the FPS for my cameras, which would otherwise be limited to 5FPS for some reason.
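
Neither change was anything exotic. This isn't FreeMoCap's actual code, just a sketch of the shape of it: one process per camera instead of one thread, with the FPS requested explicitly rather than left to the driver's default:

```python
import multiprocessing as mp
import cv2

def record(camera_index: int, fps: int, out_path: str, seconds: int = 10) -> None:
    """Capture one camera in its own process, with the FPS set explicitly."""
    cap = cv2.VideoCapture(camera_index)
    cap.set(cv2.CAP_PROP_FPS, fps)  # without this, my cameras fell back to ~5 FPS
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    for _ in range(fps * seconds):
        ok, frame = cap.read()
        if ok:
            writer.write(frame)
    writer.release()
    cap.release()

if __name__ == "__main__":
    # One process per camera; indices and output paths are placeholders.
    procs = [mp.Process(target=record, args=(i, 30, f"cam{i}.mp4")) for i in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```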

Below is an example of the program's output. My recording environment is less than ideal (my camera setup was super janky, attached to books or shelves in a haphazard way), but the output looks decent. It's jittery, and you'll notice the pose skeleton and camera footage are swapped in the first and last videos. I'm not sure if that's just a bug with this visualization or if it's happening deeper in the program, in which case it may be why there's that jitteriness and the skeleton angle is off.

FreeMoCap output

The program can also output Blender files:

FreeMoCap Blender output

Here the issues are more apparent: the hands are especially all over the place. But even the limbs are too finicky. The demo video (above) has good limb motion, so maybe my setup is lacking (though the hands are also jittery).

FreeMoCap is a really promising project, but unfortunately it's at too early a stage to be consistent and reliable. For now I'll probably develop the game using the CMU motion data, and then later, when I'm ready and FreeMoCap is likely much more mature, I can go through and replace or refine animations with custom motions. Though at the rate development is going, there's a good chance that FreeMoCap will be much further along by the time I'm ready to start really working on character animations!