MocoCompositor: GPU Accelerated Compositing

March 24th, 2010

I’m a little giddy and excited about this new feature. Last semester I took a class at the main campus titled “Technical Animation” where we learned about all sorts of Computer Graphics and Animation techniques/algorithms used in the game and movie industries. It was a pretty cool class that focused on projects. My final project (teamed up with Federico Perazzi and Grace Lin (both from ETC)) was to create a target-driven smoke-simulation accelerated on the GPU. I knew absolutely nothing about smoke simulations or GPUs for that matter. Long story short, we taught ourselves how GPU accelerated computation worked and how to write shaders in GLSL… and eventually wrote a regular smoke simulation in GLSL (we ran out of time for the target-driven part). Turns out it doesn’t matter if you do the software in Python as far as speed is concerned since you’re passing all the heavy computation over to the GPU on the video card to do anyway. So we ended up using pyglet (the OpenGL interface in Python) and a tiny shader class to string together several custom shaders to do our smoke simulation… it worked in real-time pretty well.

Skip to the present: we at Mocotila talked a lot about compositing image plates together, because that is one of the biggest uses for motion-controlled cameras: compositing live action or a physical model with a matte painting or 3D model. But we also realized that compositing is usually done in post-production and takes a lot of time to set up, render and review. Then if any shots are screwed up in framing/etc. you’d either have to reshoot or try to fudge the effects until they were acceptable.

But here we are with an awesome camera with its viewfinder being streamed to anything capable of reading http-streamed images… and not just the expensive camera either, our bioloid has a little webcam that is also being streamed in the same way and when we get Maya/Blender integration we might have live 3D renderings being streamed as well. Wouldn’t it be grand if the cameraman/director could see a live end-result composite preview so he could direct actors or reframe things appropriately? And what if we could just kinda composite several of these streams into a new viewfinder? This is where the MocoCompositor comes in… it runs on any computer on the network with a good video card capable of shaders. It pulls in (subscribes to) images from multiple image streams coming from the server and it publishes a new composited image stream to the server that anyone else can subscribe to.

The actual compositing is accomplished using the GPU via GLSL (the OpenGL Shading Language). This is where my Technical Animation story comes in… I went back and looked through my GPU smoke simulator and implemented it again taking out all the simulation stuff and adding layers of images to be processed like Photoshop Layers. Your view is from the top of the layer stack looking down (the very bottom being the background image plate). The algorithm runs from the bottom plate up applying an associated shader to each plate and saving the result into an output plate. The output plate is what is packaged as the jpeg image and published out to the server again.
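The bottom-up pass over the layer stack can be sketched on the CPU in plain Python. This is only an illustration of the control flow: the real compositor runs GLSL shaders on textures, and the names here (`composite_stack`, `passthrough`, `over_where_nonblack`) as well as the list-of-tuples image format are hypothetical stand-ins.

```python
def composite_stack(plates):
    """Run each plate's shader from the bottom of the stack upward.

    `plates` is a list of (image, shader) pairs, bottom plate first.
    Each shader is a callable taking (plate_image, output_so_far) and
    returning the new output image. Images are lists of (r, g, b) tuples.
    """
    output = None
    for image, shader in plates:
        output = shader(image, output)
    return output


def passthrough(image, below):
    # Background shader: ignore whatever is below and emit the plate as-is.
    return list(image)


def over_where_nonblack(image, below):
    # Toy foreground shader: show the plate unless the pixel is pure black.
    return [fg if fg != (0, 0, 0) else bg for fg, bg in zip(image, below)]
```

Calling `composite_stack([(background, passthrough), (foreground, over_where_nonblack)])` returns the final output plate, which is the image that would then be packaged as a jpeg.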

So far we’ve implemented a green-screen shader that replaces all the green in the foreground plate with the pixels from the background plate… this means live green-screen replacement compositing. We experimented with background subtraction (taking a reference shot of the background and subtracting it from the live shot so we wouldn’t need a green-screen), but it just wasn’t reliable or clean. We’re hoping to add more effects to this system, all implemented as GLSL shaders… especially some gradient blur filters (if you know what I’m getting at :).
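For readers who want the gist of the green-screen replacement without reading GLSL, here is a rough per-pixel sketch in Python. It is not our shader: the "green dominates red and blue" test and the `dominance` and `min_green` thresholds are assumptions picked purely for illustration.

```python
def is_green(pixel, dominance=1.3, min_green=80):
    """Treat a pixel as key-green when green clearly dominates red and blue."""
    r, g, b = pixel
    return g >= min_green and g > dominance * max(r, b)


def green_screen(foreground, background):
    """Replace key-green foreground pixels with the matching background pixel.

    Both images are equal-length lists of (r, g, b) tuples.
    """
    return [bg if is_green(fg) else fg for fg, bg in zip(foreground, background)]
```

A shader does the same comparison once per fragment, in parallel, which is why it keeps up with a live stream.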

Under the Hood

The main thread is a pyglet app running its event loop. Word of advice: NEVER mix OpenGL calls (or anything that touches hardware directly without locks and state preservation) across threads… bad things happen (one of these days I’ll get around to putting locks on the camera class too). Anyway, in this main thread we handle pyglet’s on_draw event, where we run through each image plate and execute the appropriate shader on the input image (and the working/output image). After we’ve gone through all the plates we package the output image texture back into a jpeg and send it out to the server via an HTTP stream publisher.

Now since pyglet controls our main thread and we want to be able to pull images from at least 2 streams concurrently from the server, we need to do it in threads. So for each image stream coming in (subscribed to) we have a thread which grabs the image and updates a mutex-protected data structure (our image plates) with the raw data. The next time through the main pyglet loop it’ll reload the image from the raw data… we can’t do it out in the thread because lord knows what pyglet is doing behind the scenes (or if the libraries it uses are thread-safe) to load these images.
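The hand-off between the subscriber threads and the main loop looks roughly like the sketch below. The class and function names (`ImagePlate`, `subscriber`, `take_if_new`) are made up for this example, and the `frames` iterable stands in for the actual network reader.

```python
import threading


class ImagePlate:
    """Holds the latest raw bytes for one stream behind a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._raw = None
        self._dirty = False

    def update(self, raw_bytes):
        # Called from a subscriber thread: store bytes only, no decoding.
        with self._lock:
            self._raw = raw_bytes
            self._dirty = True

    def take_if_new(self):
        # Called from the main (pyglet) thread: returns new bytes or None.
        with self._lock:
            if not self._dirty:
                return None
            self._dirty = False
            return self._raw


def subscriber(plate, frames):
    # Stand-in for a network reader; `frames` is any iterable of bytes.
    for raw in frames:
        plate.update(raw)
```

The main loop would call `take_if_new()` once per draw and only decode the image when it gets bytes back, so all the decoding and OpenGL work stays on the one thread.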

Speaking of the libraries used by pyglet… we were having this nasty SegFault in the MocoCompositor after a few seconds of working perfectly. Just out of nowhere it would SegFault (and never after the same amount of time). After digging through pyglet and stepping through the execution using a Python debugger, I tracked the problem down to the codec being used for JPEG decompression. Pyglet uses 3rd party libraries to decompress jpeg images (not sure if it has to do with patents or whatnot), and on the Ubuntu system I’d been using it was defaulting to gdkpixbuf… I assume it’s the fastest implementation available on the system (probably uses the C libjpeg or something). But I noticed in the pyglet documentation that you can specify which decoder to use, and I noticed that the Python Imaging Library (PIL) was installed… so I forced pyglet to use that decoder instead (it seems a teensy bit slower) and it has worked without any SegFaults [so far]… huzzah!

Also note that textures used in OpenGL really have to be made in powers of 2… otherwise crazy things start happening (things start striping and staggering diagonally). This was an ongoing struggle for quite a while, until on a whim I remembered that old rule and tried it as a power of 2 and it worked. So now I have the code looking at the input image size and rounding it up to the next power of 2 to allocate the texture. This leaves a big black unused area but it works… when we’re done calculating the output image, we simply blit the original size of the image from the big texture to send to the output stream as the jpeg image.
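The rounding itself is a one-liner. A minimal sketch (the function names are mine, not the ones in the compositor):

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("size must be at least 1")
    return 1 << (n - 1).bit_length()


def texture_size(width, height):
    """Texture dimensions big enough to hold a width x height image."""
    return next_power_of_two(width), next_power_of_two(height)
```

For example, a 640x480 input image would get a 1024x512 texture, and only the 640x480 region is blitted back out at the end.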

Software: Moco MotionBuilder GUI

March 15th, 2010

Before we began developing our GUI we did a lot of research into existing professional-end software packages used by our target demographic: semi-professional to professional videographers/cinematographers. We learned/played with/used non-linear video editing software (Final Cut, Adobe Premiere), compositing software (Shake, Motion, AfterEffects) and 3D animation packages (Blender, Maya). Additionally we looked into the FLAIR robot-control software package that’s used by the Milo big-rig motion-controlled robot. FLAIR turned out to be little more than a spreadsheet with a ton of buttons (designed by an engineer, obviously).

We took a lot of the ideas in these packages and implemented them in our own GUI. Our GUI is a web-based interface connecting to the multithreaded Python-based custom web server on the backend. The GUI makes use of HTML5 technologies (namely <canvas> tags for 2d rendering), jQuery UI for the theming/sliders-widgets, AJAX (via jQuery) and JSON (Javascript Object Notation) for transferring data between the GUI/Server/Robot/etc.

MotionBuilder GUI

Our MotionBuilder GUI is split up into 2 halves. The top half has all the outputs from the robot (the viewfinder, the skeletal model of the side profile of the robot and the skeletal model of the top profile of the robot).

The bottom half of the GUI represents the curves editor. The editor consists of a canvas and vertical-slider for each actuator/servo on our robot as well as actuators for the camera (I’ve disabled them for the time being). The canvas for each actuator is where the motion curves are drawn (remember in a previous post I mentioned that we had implemented several curves including discrete, step, linear and catmull-rom). The vertical-slider allows the user to fine-tune the placement of the keypoint at wherever the play-head currently sits.

The play head is controlled by the frame-scrubber horizontal-slider that runs the width of the GUI and lies above the actuator curves. Scrubbing across the slider moves the play head to the specific frame and publishes the actuator positions for that frame back to the server. This in turn causes the subscribers of the actuator publication (i.e. the skeletal models in the top half of the GUI as well as the robot itself) to move to the new actuator positions.

Above the frame-scrubber is another horizontal-slider with the ability to set a frame-range by dragging the minimum and maximum handles. This allows the user to select a particular sub-range of the animation to playback.

The playback controls are above the scrubber sliders and include “go-to-beginning-of-range”, “fast-rewind-play”, “rewind-play”, “stop”, “forward-play”, “fast-forward-play” and “go-to-end-of-range” buttons. We use a Javascript setInterval timer to step through the frames during playback, so the playback speed is NOT representative of the final timing of the animation. It’s mainly used for previewing the shot.

To the left of the playback controls we have the curve-set buttons that allow the user to clear the editor and start with a new curve set, load an existing curve set (from a file on the server) or save the current curve set (to a file on the server).

Finally we have some keyframe buttons above each actuator curve canvas. They include a button to move the playhead to the previous keyframe on that curve, one to move it to the next keyframe on that curve and one to toggle the existence of a keyframe at the playhead (this is how you delete keyframes).

Under the Hood

Our Web-server is a custom-written (raw socket based) multi-threaded Python server. We created a paradigm of “subscriber/publisher” streaming where any given web client can connect to the server and offer to “publish” data (via a URI) and any number of clients can connect to the server and “subscribe” to the data (via the URI). Subscribers are added to a queue for the URI and a thread is fired up for each publisher of a URI. Whenever a publisher pushes data it is broadcast to each client on their queue as fast as possible (very little buffering). This technique allows us to broadcast streams of data/images to any number of clients regardless of their intentions of the data and it affords us a great amount of flexibility and scalability for adding on more subsystems to our software. You’ll see an example of this with our compositor later.
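Stripped of all the socket handling, the subscriber/publisher idea boils down to something like the sketch below. It is a simplification, not our server code: the `Hub` class, its method names and the example URI are invented for illustration, and a real server would also have to drop subscribers that disconnect.

```python
import queue
import threading


class Hub:
    """Maps a URI to the queues of its current subscribers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = {}

    def subscribe(self, uri):
        # Each subscriber gets its own queue to read published data from.
        q = queue.Queue()
        with self._lock:
            self._subscribers.setdefault(uri, []).append(q)
        return q

    def publish(self, uri, data):
        # Copy the list under the lock, then fan out without holding it.
        with self._lock:
            targets = list(self._subscribers.get(uri, []))
        for q in targets:
            q.put(data)
        return len(targets)
```

A publisher thread would call `hub.publish("/stream/viewfinder", frame)` for every frame it receives, while each subscriber connection drains its own queue and writes to its socket.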

Image streaming from the server is accomplished via the old-school Netscape “multipart/x-mixed-replace” content/mime type… it’s what those web-enabled streaming spycam/monitoring-cameras use. Most modern browsers support this mime-type (and we don’t bother to support/test IE at all).
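On the wire, a “multipart/x-mixed-replace” stream is one HTTP response whose body is a never-ending series of parts, each new part replacing the previous image in the browser. A rough sketch of the framing follows; the boundary token and the function names are arbitrary choices for this example, not what our server uses.

```python
BOUNDARY = b"frame"  # arbitrary boundary token chosen for this sketch


def stream_headers():
    """HTTP response head announcing a multipart/x-mixed-replace stream."""
    return (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: multipart/x-mixed-replace; boundary=" + BOUNDARY + b"\r\n"
        b"\r\n"
    )


def frame_part(jpeg_bytes):
    """Wrap one JPEG so it replaces the previous frame in the client."""
    return (
        b"--" + BOUNDARY + b"\r\n"
        b"Content-Type: image/jpeg\r\n"
        b"Content-Length: " + str(len(jpeg_bytes)).encode("ascii") + b"\r\n"
        b"\r\n" + jpeg_bytes + b"\r\n"
    )
```

The server sends `stream_headers()` once, then one `frame_part(...)` per image for as long as the socket stays open.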

Data streaming from the server uses the old long-polling script tag block technique (I think it’s referred to as Comet nowadays)… basically the server keeps the connection open and sends the data as JSON strings, each wrapped in a javascript callback function call and enveloped in <script> tags. Most browsers execute the javascript as soon as the ending script tag is found (it’s a throwback to old-day compatibility). So as long as the server keeps the socket open it can send all the data it likes and the client will process it one block after the other.
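Server-side, building one of these script blocks is just string wrapping. A small sketch, assuming a client-side callback called `handleUpdate` (a made-up name) is already defined in the page:

```python
import json


def script_chunk(payload, callback="handleUpdate"):
    """Wrap a JSON payload in a <script> block that calls `callback`."""
    # Escape "</" so data containing "</script>" cannot end the block early.
    body = json.dumps(payload).replace("</", "<\\/")
    return "<script>%s(%s);</script>\n" % (callback, body)
```

Each call produces one self-contained chunk such as `<script>handleUpdate({"frame": 12});</script>`, which the server writes to the open socket whenever new data is published.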

Communication to the server (for stream-like functionality like scrubbing the sliders) is just AJAX sending JSON via HTTP GET requests repeatedly. Surprisingly the TCP handshake and HTTP request overhead aren’t too bad when hammering away at the server… especially since the biggest bottleneck would be the real world servos (the user subconsciously moves those sliders slowly when they notice the robot sluggishly getting into position).

I’m sure there’s more I’m forgetting to mention… I’ll save those for a later post.

Technical Update

February 28th, 2010

We’ve been making some good progress since quarters:

Parts ordered/received:

  • We ordered and received some of the servos we’ll be using (pan/tilt assembly, turret)
  • We’ve got the Lynxmotion SSC-32 servo controller board (and a 9v wall-wart for it)
  • We’ve got the Canon EOS 7D DSLR camera we’ll be using for our project (this thing is a beast)
  • We’re waiting on a final design before purchasing any raw materials


  • We’ve got a Mechanical Engineering intern! Kyle Gee seems excited about the project and knowledgeable about his area of expertise, and he has started off strong already with some Solidworks models. We’d started him on designing the tripod head adapter for the pan/tilt assembly so we have a temporary holder for it; he emailed us a few hours later with a total breakdown, design and render in 3D. We’ll be getting it spec’ed, designed and machined… kind of like a gold-spike in our Mechanical pipeline once we get the materials and machinist.
  • Mike created a mockup that more closely resembles our final design using the erector set we got. This lets us work out some obvious kinks in the design as well as better describe our end product to faculty and visitors.
  • The armature is pretty much figured out, but we’re under heavy discussions about the dolly/track-transport mechanisms. We need cheap reliable (read accurate) movement on the track with very small steps… we’re talking with some Professional Engineers about this as well.


  • Reworked the code from the Girl-Tech Lynxy arm to use the newer heavy duty servos we got. Already ran into a small problem: the servos take a long time to get to their destination point even after the Lynxmotion controller has set the pulse-width to the final value… this means that even though the controller and computer think the servo has finished moving, it hasn’t… this is very bad. We’ve got 2 solutions: use a soft-wait (time.sleep() in python with a guesstimated timeout while the servos move) or, as Mike suggested, try to tap into the potentiometer readings that are attached to the servo rigs we bought and run them into the computer (via a phidget) to read the true value. We’ll need to research the hardware a bit more, but until then a soft-wait it is.
  • Got the camera controlling via Linux! This is all thanks to the libgphoto2 open-source library… actually had to get the latest version 2.4.8 (it wasn’t in the Ubuntu repository yet) and compile it. Started off using the gphoto2 program to execute commands over the shell but that just wasn’t practical and the libgphoto2 library was written in C while our code is all Python… enter the awesome ctypes library! It took me all of a night to port their preview/capture C gphoto example code over to Python and linking into the shared library using ctypes as the glue. ctypes is freaking awesome! Cleaned up the code and wrapped it up in a camera class capable of producing actual high quality images as well as pulling the viewfinder/preview images.
  • Developed a curves class (in Python and Javascript) capable of interpolating curves of type: discrete, step, linear and Catmull-Rom… this is our fundamental building block for all motions so it’s very important that it works right.
  • Started designing the layout of the GUI and the system-framework on paper.
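To give a flavor of the curve interpolation mentioned above, here is a sketch of two of the curve types for a single actuator value. It uses the standard uniform Catmull-Rom formula; the function names and signatures are illustrative only and are not taken from our curves class.

```python
def linear(p1, p2, t):
    """Straight-line blend from keyframe value p1 to p2, with t in [0, 1]."""
    return p1 + (p2 - p1) * t


def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom spline between p1 and p2, with t in [0, 1].

    p0 and p3 are the neighboring keyframe values that shape the tangents.
    """
    t2 = t * t
    t3 = t2 * t
    return 0.5 * (
        2.0 * p1
        + (p2 - p0) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2
        + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t3
    )
```

The nice property for motion curves is that the spline passes exactly through every keyframe (it returns p1 at t = 0 and p2 at t = 1) while staying smooth in between.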

There’s much more but I’ll leave it at this for now.

HDRI Test 1

February 23rd, 2010

Quarter Presentations

February 13th, 2010

Thanks to the Blizzard of 2010 our Quarter Presentations were reduced to a 15min walk-around by the faculty on the Friday of 1/4s week. We quickly reworked our original presentation and slides to work with the new format (gutting it out for the most part). We had to repeatedly present the same thing to every small group of faculty that came in the door. Vastly different than the normal 1/4 presentation routine… We started off a bit rough but got into a rhythm quite quickly. BTW, Tom created an awesome looking theme for the slides and we’re hoping to integrate that over to the website soon.

The Quarter Presentation/Walkaround went rather smoothly, with very minimal questions from the faculty. Either we explained ourselves extremely well or we totally lost them 🙂 . Brenda did make a good point that we were being pretty technical but weren’t expressing the “why” [it would make her interested in it]. I remember Jim Burke from Lockheed Martin telling us in Bat-teK to always ask and answer the “why” question at every step of the way… ok so I’m a slow learner. We’ll need to expand on the “why” of our product for next time. During the presentation we showed Mike’s “Fiber One” camera intervalometer (it keeps your camera interval regular :P), the Bioloid robot and did a quick demo (python script to pan the camera taking pictures and compile it into a quick movie), showed them a potential GUI layout and showed them some test videos we’d made both with the Bioloid as well as the timelapse video Mike and Mark made of their trip to Canada.

So now that the Quarter Presentations are over we’ve got to take all the things we’d learned in our Research phase and start designing and implementing our robot. We’ll be making heavy use of the people we’ve met so far outside of this project. In fact, Prof. Messner (from the CMU Mechanical Engineering department) was kind enough to allow us to pitch our project to his undergrad class to recruit an intern for our project. More on this later. We’re also starting to talk to actual suppliers and builders to see what resources we have available and what we’d need to buy etc.

Now the fun begins…

Protected: Meeting with Jerry Andrews

February 8th, 2010

Intervalometer Test 02

February 6th, 2010

Same rig as before, only this time it’s strapped into the back seat of my car. The images were taken at 6 second intervals with a 2″ exposure time (meaning 4″ between the end of one photo and the beginning of the next).

Intervalometer Test 01

February 5th, 2010

Here we are testing our recently completed intervalometer. It is constructed from an Arduino, a reed relay, a potentiometer, and a 2.5mm jack.

The photos were taken at 1/2 second intervals on a Canon Rebel XT.

The pan at the end was done manually.

Bioloid Tests

February 2nd, 2010

Here, we have more time lapse tests we ran with the Bioloid rig:

MoCoTiLa Bioloid Tests from Michael Hill on Vimeo.

Good week

January 28th, 2010

Parts have been ordered, tests have been done. More meetings scheduled. Even a trip to Toronto to check out the top of the line motion control rigs. More shooting scheduled for Friday. And more planned for next week. Watch this space!