
Wednesday, 25 April 2007



I believe machine vision in its present form is trying to tackle one of the Hard problems: these learners have to deal with input alone. They are not allowed to interact with the world they are learning about, they are not allowed to ask "what would this scene look like if I moved my camera just a little bit?", and they are definitely not allowed to touch anything.
Only adult humans can really get something meaningful out of such a passive experience, and only after a lifetime of training in interacting with the world and acquiring strong priors about what 2D images might mean. Trying to get a machine to tie that sort of experience to the symbolic domain of words just seems impossible to me.

There is still a lot one could do with clever hacks in particular domains, but such a general-purpose interpreter seems almost as unrealistic as a perpetuum mobile.
