This is a guest essay by Fred Nikgohar. Fred is the CEO of RoboDynamics, makers of the TiLR robot, which was recently featured in a New York Times overview of telepresence robots. Fred argues that we've reached a watershed moment in robotics, facilitated by cheap 3D sensors like Microsoft's Kinect (i.e., the PrimeSense RGB-D camera), and that the Kinect provides a roadmap where "the best solution to complex, low-cost sensing (or actuation for that matter) is to take advantage of affordable, mass-produced components, complementing them with the innovative use of software solutions that benefit from constantly declining prices of computation."
The holidays are upon us and once again the world goes shopping.
Santa Monica Place is a newly remodeled shopping mall directly across from our offices here in beautiful downtown Santa Monica. Every holiday season we witness the crowds that visit the mall and the adjacent 3rd Street Promenade to do their holiday shopping. Given my distaste for crowds with holiday sales on their minds, I do my best to avoid shopping malls and stores at all costs during the holiday season - until this year.
There is one good reason for my excitement this year.
No doubt by now you've heard a great deal about the recently released Microsoft Kinect. Originally designed as a game controller for the Xbox gaming platform, Kinect is a fantastic 3D sensor that retails for $150 in the U.S. What is far more interesting about the device, however, is that it enables endless applications beyond just gaming. For example, if you've ever seen the movie Minority Report, you are already familiar with gesture recognition: the ability to interact with digital content by simply gesturing to your computer. These types of interfaces are now a reality with the Kinect device.
Immediately following the $3,000 bounty by Adafruit for the first open source driver to read the data stream from Kinect, we have witnessed a plethora of home-brewed projects from every corner of the world that use the device in ways never imagined before. From gesture interfaces for controlling your computer to SLAM implementations on mobile robots... and everything in between (KinectHacks.net), Kinect has proven to be one heck of a sensor, and it is inspiring imagination and creativity all over the world.
But you know what's funny?
Kinect is hardly the first sensor, or even the most ideal one, to enable these innovations. For at least a decade, we’ve had Time of Flight cameras, RGB-D cameras, LIDARs, stereo vision, and a number of other sensors that could have been used to create such applications. Yet somehow it was Kinect that kick-started such a flurry of innovation. Which naturally invites the question: what is it about Kinect that is inspiring such a flurry of activity on a global scale?
In a single word: Affordability!
You see, all the sensors predating the Microsoft Kinect have been - and still remain - unaffordable to the average consumer. Ranging from several hundred to several thousand dollars, they are inaccessible to consumers who clearly have the will, the skill, and the desire to create such amazing applications. The availability of Kinect at $150 retail changes the equation dramatically. It gives the masses inexpensive access to a powerful sensor with a simple interface that can be used in hundreds of applications - some as yet unimagined.
The speed at which these Kinect-based home-brewed innovations are proliferating around the Internet surprised even Microsoft, which originally threatened “legal action” against anyone who would attempt to hack it. In just under three weeks, Microsoft completely reversed its attitude, announcing publicly that it had “left Kinect open by design” and thereby encouraging the innovations we are now witnessing. Regardless of Microsoft’s intentions, there is no denying the variety and quality of applications that are popping up every day.
I’ve written before about the dire need for low-cost sensors in robotics. Sensing is a huge challenge in commercializing consumer robotic platforms and applications. The inexpensive sensors are incapable of providing useful information about the robot’s environment, whereas the capable sensors are cost-prohibitive. Though certain challenges like obstacle avoidance are manageable with inexpensive sensors, challenges like autonomous navigation remain woefully unresolved given the strict budgetary requirements of a consumer electronics product (such as a robot). And without capabilities like autonomous navigation, we find ourselves severely limited as to the type and usefulness of the robots we can produce at consumer-affordable prices.
The very existence of a complex yet (relatively) inexpensive sensor such as Kinect makes it possible for the first time to build (robotic) applications that are simultaneously useful and affordable. But how do we make complex sensors that are simultaneously capable and affordable? We can look to Kinect itself for the answer. In developing Kinect, Microsoft could have used its scale to drive down the price of exotic components. Instead, they chose to use mass-produced commodity components readily available in the market and achieve complexity by off-loading intelligence to a commodity processor. The trade-off is a higher upfront development cost, offset over time as the product scales through market acceptance, in return for a far cheaper piece of hardware made up of commodity components. Inside a Kinect you will find an RGB camera, an infrared depth sensor (an IR projector paired with an IR camera), a microphone array, and a processor to do the heavy lifting. These are all components produced at huge scale and readily affordable. And therein lies an important lesson for the robotics industry:
The best solution to complex, low-cost sensing (or actuation for that matter) is to take advantage of affordable, mass-produced components, complementing them with the innovative use of software solutions that benefit from constantly declining prices of computation.
There is always a paradox in disruptive technologies: the new and exotic innovations of today become tomorrow’s commodity components, readily available in large quantity and high quality. The challenge is to drive market demand that leads to increased production, greater competition, and ultimately higher quality products and lower prices for the masses. CPUs, GPS, Wi-Fi, LCD touchscreens, and countless other technologies have all followed this trajectory.
3D sensing generally, and Kinect specifically, will follow this same price/quality trajectory. By using inexpensive components and advanced software algorithms, Microsoft has succeeded in introducing to the market an entirely new user interface paradigm. More importantly, they will succeed in creating the demand necessary for low-cost 3D sensing, which benefits not only them as a company, but the entire digital technology space. It won’t be long before a new breed of 3D sensors hits the market, drops drastically in price over time, and renders gesture-based interaction commonplace in every home.
If we are to have a robot in every home, we must have low-cost complex sensing. Kinect is showing us the roadmap. It is up to us to learn and apply the lessons... and as we do, we will come closer to the day when man and machine live together in harmony. My hope is that products such as Kinect, and the ingenuity of the home-brew community, compel us to act. I look forward to your comments and questions.
Happy Holidays and Happy New Year.
CEO - RoboDynamics Corp.
I would like to thank Fred for heeding Hizook's request for assistance and for submitting such a quality essay! In the past, such essays were commonplace in sci-fi magazines (e.g., Marvin Minsky's "Telepresence: A Manifesto" in Omni Magazine). I'm not sure that a common forum exists any longer, but with your discussion and submissions perhaps we can turn Hizook into a place for robotics thought leadership.