Interesting thing of the day: Moravec's paradox

Easy for humans, hard for robots; easy for robots, hard for humans

Sep 15, 2024

Things that are easy for humans are often hard for computers, and vice versa. For example, computers play chess really well, but we cannot get them to crack an egg or carry a table upstairs.

As Hans Moravec put it:

“it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility” (Moravec, p.15)

Moravec wrote this in 1988, and it is remarkably prescient. We have LLMs in everyday life, but I have yet to see a robot do the grocery shopping. On the 80,000 Hours podcast, Ken Goldberg (robotics, UC Berkeley) put it like this the day before yesterday:

Perception is quite difficult with cameras: even if you have a stereo camera, you still can’t really build a map of where everything is in space. It’s just very difficult. And I know that sounds surprising, because humans are very good at this. In fact, even with one eye, we can navigate and we can clear the dinner table.
But it seems that we’re building in a lot of understanding and intuition about what’s happening in the world and where objects are and how they behave. For robots, it’s very difficult to get a perfectly accurate model of the world and where things are. So if you’re going to go manipulate or grasp an object, a small error in that position will maybe have your robot crash into the object, a delicate wine glass, and probably break it. So the perception and the control are both problems. (Goldberg, 2024)

An interesting explanation for Moravec's paradox is, surprise, surprise, evolutionary: tasks like perception and motor control are very old and thus have had a long time to evolve. High-level cognition is relatively new and thus not as highly evolved. Because perception and movement are so heavily optimized, they feel effortless to us, while high-level cognition, which has had far less time to evolve, feels hard precisely because it is less optimized.

So what does this mean for automated driving?
Fully automated driving would save many lives a year and improve the lives of many people (stress, commuting, housing choice, long-distance relationships, caring for relatives and friends who live further away). It seems to me that automated driving is an easier problem than some of the robotics challenges. A car can basically only accelerate, brake, and steer left and right. Cracking an egg is far more delicate. So maybe Moravec's paradox helps explain why we don't have fully automated driving yet, while the comparative simplicity of driving relative to other perception/motor tasks helps explain why we seem to be getting somewhere. In San Francisco there are already lots of self-driving taxis, and riding in one seems to be like being driven by a student driver. Will the last 20% be more difficult than the first 80%?

Schonger Substack
