Movies and scifi books inspire roboticists to push the envelope, but they've also skewed the public's perception of robot capabilities. This problem is being exacerbated by researchers. In the last three months, I've had to shatter a few dreams: "Your $300 AR.Drone or $150 Ladybird will not be able to perform insane autonomous aerial maneuvers (yet). The UPenn quadrotors rely on $20k-$50k camera-based (Vicon) motion capture systems, which provide global pose estimation of each UAV at millimeter accuracy at up to 1kHz (and the setup often uses an external, centralized motion-planning computer too)." When this crucial aspect of the videos fails to register even with intelligent people, researchers are being disingenuous and violating their duty to the public -- which sucks, because their projects and research are awesome! And this is just the example that happens to be most salient to me at the moment. In this post I'd like to explore some "best practices" for robot videos so that we can quit misleading one another.
Every roboticist I know has encountered someone who insists, "Robots can already do that. I've seen it in the movies." Real robots are not as capable as Rosie (The Jetsons), Sonny (I, Robot), or the mechs in Avatar / District 9 / The Matrix.
I forgive Hollywood (and scifi authors). Their job is to make entertaining fiction. Besides, their work is inspirational. I also forgive robots built as "art" and robot performances, like the PR2 robot dancing at its launch party and the recent quadrotor light show. No one attending these events cared that the robots were scripted and used external localization, respectively.
But research videos (naturally) need to be held to a higher standard. They must represent the work honestly.
I'm going to explore four factors "by example" that could be addressed through simple watermarks on the videos. I deeply respect all the research in these videos and the researchers who made them. I'm using you guys out of respect -- I love you all. ;-)
Let me start by saying that excuses like "we mention it in the audio" or "it's in the video's title" are insufficient: the video needs to stand alone.
But what you may have missed... the inconspicuous infrared cameras that form the backbone of a "20-camera Vicon motion capture system" that costs between $20k and $50k:
Clearly the UPenn researchers aren't omitting this maliciously... In the "aggressive maneuvers" video, they explicitly mention the Vicon in the audio. But in the formation-flying swarm video, there is no mention of camera localization. This is problematic. Some of the early maneuvers (e.g., single flips) do not require the Vicon, whereas flying in formation does. I am aware of these subtleties... but many (most?) people watching the video will not be!
"External Camera Localization" watermarked in the bottom right corner of the video when appropriate would really help.
Here's another pet peeve. Back before the days of the PR2, researchers at Stanford built the PR1:
The PR1 was an amazing piece of hardware that ultimately led to the PR2 robot by Willow Garage. But unbeknownst to the viewer: this video was 100% teleoperated and appears to have been sped up many times over. While the PR1 video demonstrates the hardware's capability, we are still many years away from robots operating in such unstructured environments with this level of proficiency. And yet any sane person watching this video is apt to think we've already achieved those capabilities. The same can be said for robots performing scripted actions.
"Under Teleoperation" or "Scripted Motions" or "Autonomous" (for bragging rights) watermarked in the bottom right corner of the video when appropriate would really help.
We kinda touched on this already, but this example is just too apt. Pieter Abbeel (and crew) from UC Berkeley enabled a PR2 robot to fold towels. This is awesome:
Just watching that video, it's tough to tell that it is 50x realtime. It's mentioned in the video's title (especially if you click through to YouTube), but it's not readily apparent from the video itself. And here's the thing: who cares that the robot can only fold one towel every 20-25 minutes! The robot could fold towels all day while I'm away at work.
The problem is public perception. Here's what CNET has to say:
"It can fold up to 25 towels per minute." Oops. No, it really can't. But now the public won't understand the monumental progress required to make the PR2 fold a towel in just 5 minutes (aka, ongoing research). That's bad.
"50 x Realtime" or "Sped-Up 5000%" watermarked in the bottom right corner of the video when appropriate would really help.
This one is a little dicier (for reasons I'll get to later). But look at PETMAN:
That's one of the most compelling videos from 2011. And yet, one of the big challenges with walking robots right now is finding lightweight, high-density power supplies (and quiet ones, in the case of internal combustion engines). There are ongoing, multi-million-dollar grants to address this issue. In some cases, it probably makes sense to include an additional disclosure that the robot is tethered. For other domains, communication tethers are a crucial distinguishing characteristic (e.g., for latency).
"Tethered" watermarked in the bottom right corner of the video when appropriate would really help.
Disclosing caveats via watermarks would go a long way toward informing the public (and other researchers). This helps keep the work honest and paves the way for truly groundbreaking improvements: the KMel robots operating without Vicon, the PR2 tidying up a room autonomously, towel folding at 25 towels per minute, and an untethered PETMAN. But for those giant leaps to be recognized as such, we need to communicate the key limitations of existing robots -- especially via the videos.
But how much is too much? I'm not sure. Clearly, not all caveats should be reported via watermarks on the video. We understand that cameras don't usually function too well in bright sunlight. We know that these small UAVs don't work outdoors. Ultimately, it falls on the research community to establish sound practices, and I hope that this blog post will (at least!) get the conversation started.
I'd like to thank my advisor (Dr. Charlie Kemp at Georgia Tech's Healthcare Robotics Lab) for instilling in me a sense of "robot video ethics." I didn't always share his perspective, and I probably even rebelled against it a bit in grad school. But I've come to appreciate his wisdom!