Researchers at MIT have developed the first integrated-circuit vocal tract, which could eventually make its way into high-end PDAs. It's biologically inspired and combined with a bionic-ear processor in a feedback loop. This means it can be used not only for producing speech but also for recognizing it: the vocal tract can help to model what the ear thinks it is hearing to verify whether it's likely to have it right or not.
Essentially, the system is a biological model of our own method of producing speech that has been implemented in silicon. It is good not just at synthesizing speech but (in conjunction with the ear/feedback) at interpreting it, because it can literally figure out what muscles etc. would have to be used to produce a particular sound and so can 'reverse engineer' what sound was actually intended. F and S may sound similar, but the way the sounds are produced in terms of muscles is quite different. So if you're able to get into the physiology from small differences in what you hear then you can make much better guesses at what's being said.
The chips used (one for the vocal tract and one for the ear) are both based on human physiology and have been implemented in custom analog circuitry. Digital computers can be used, but the computational complexity of the problem means that the analog solutions are drastically smaller, faster, and less power-hungry.
There are a number of really interesting commercial applications. The most obvious is robust speech/speaker/language recognition in noisy environments. The researchers are also building a glove that can drive the chip, a brain-machine interface that can be implanted in the brain for speech-impaired subjects, and muscle interfaces that would allow silent phone calls (you talk silently on the train, the system figures out what sounds you are trying to make and makes them for you down the phone).
I was at a conference about humanoid robotics, and particularly the iCub, yesterday and Mark Lee from the University of Aberystwyth was talking about the difficulties of 'raising' truly developmental robots: robots that learn about their bodies and environments through experience the way we do. This got me thinking.
Being an analog girl at heart, and given that robots have a lot of essentially analog components (even if they are driven with digital controllers), I'd always assumed that truly intelligent humanoids would have to be raised developmentally. Each individual would have to learn about its unique set of motors and sensors and processors and what they could do and how they could interact with the world before it would be able to go out and do things. Now I wonder if it has to be as drastic as that.
My thinking now is that the question turns on just how different each robot will be to its 'siblings': robots turned out in the same batch and that are (at least intended to be) identical. There are bound to be subtle differences because of manufacturing tolerances, but perhaps these don't matter so much for digitally-driven machines. After all, a robot has to be able to cope if some of its components fail or change their performance due to (for instance) changes in temperature. So perhaps we could clone the 'brains' of one robot and successfully transplant them into another, which would just wake up feeling a bit out of sorts and have to re-optimize.
If that's true, it brings up a lot of interesting questions. Is the best idea to focus energy on the development of a single machine to clone, or to raise a whole bunch of robots to a certain point and then, in a natural-selection-type way, clone the best brains and ditch the rest? Perhaps the latter would be the best way to learn the best teaching methods for robots? And at what point in their development should they be cloned? Presumably you don't want to have to send a 'baby' robot to each new workplace so that it can learn about its new environment and tasks at the same time as it's learning how to see and control its actuators from scratch. On the other hand you want it to have plenty of scope for adapting to its new environment...
I wonder.
Photo: Yan Wu working with the Imperial College London iCub.
I've been taking a break from writing to work on another project this spring and summer but managed to find the time to finish off a story about the iCub. This open-source robot is designed to allow academics to concentrate on implementing their theories about learning and interaction without having to focus on designing and building hardware, and is part of the general trend towards open source in the field. You can find out more by reading the full piece in EE Times.
Photo: The iCub is an artificial toddler with senses, 53 degrees of freedom, and a modular software structure designed to allow the work of different research teams to be combined.
One of the many things that have kept me from my blog in the last month or two has been working on a new version of a newsletter I edit called The Neuromorphic Engineer. The new format is more accessible, searchable, and generally usable, plus it allows for different kinds of content including blog posts. I've put a few of my own posts up as well as all the old newsletter archives, and there will be new content every 2-4 weeks.
If you're interested in how people are trying to build technology that emulates the neural systems of various animals (particularly, but not exclusively, using analog technology) then check this out. In general the articles are more technical than my blog, but less technical than journal papers. I'd really like to hear what you think and any suggestions you may have.
I'm currently reading Ray Kurzweil's book, The Singularity is Near, which I'll properly review later. Among other things, the book talks about the supposed imminence of our being able to simulate the brain. I'm afraid I'm not convinced by his arguments. Don't get me wrong: it's not that I think it's not going to happen. It's just I really think he minimizes the engineering challenges that will have to be overcome to make it happen. I've been interested for many years in the challenge of building brain-like hardware and it's not (to say the least) a trivial problem.
One thing in particular that he mentions but glosses over is how we will tackle the problem of connectivity. We know that the brain has on the order of a trillion neural processors (neurons), linked by as many as a thousand trillion synapses: so that's an average of a thousand connections each. According to California Institute of Technology professor Yaser Abu-Mostafa, this connectivity is crucial for learning in biologically-plausible neural networks, and cannot be traded off easily: in other words, you can't just have fewer local connections and a larger (or faster) network and expect to be able to compensate for the deficit.
When you consider that a handful of connections is about as much as you can expect between elements on a chip, and that getting information from one chip to another is even harder (the number of connections around the edge goes as the √2N, where N is the number of elements in the chip), you can see that this communications bottleneck could be a major obstacle.
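To get a feel for why the chip edge is such a bottleneck, here is a rough sketch that counts how many elements sit on the boundary of a square chip as the total grows. It assumes a simple square grid with at most one off-chip connection per boundary element, which is my own simplification (so the prefactor will not match the expression above exactly); `edge_elements` is just an illustrative name.

```python
def edge_elements(n_side: int) -> int:
    """Count the elements on the boundary of an n_side x n_side grid."""
    if n_side == 1:
        return 1
    # Four sides of n_side elements each, minus the four corners counted twice.
    return 4 * n_side - 4


for n_side in (10, 100, 1000):
    total = n_side * n_side          # N, the number of elements on the chip
    edge = edge_elements(n_side)     # elements that can reach the chip edge
    print(f"N = {total:>9,d}   edge = {edge:>5,d}   fraction = {edge / total:.4f}")
```

The number of edge elements grows only with the square root of N, so the fraction of elements that can talk to the outside world shrinks as the chip gets bigger.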
I've been interested in clever ways to get around this problem for some time. A lot of them are optoelectronic, using light to communicate between chips and boards of neurons. There are also ways of having lots of different neurons share connections. However, one of the most interesting possibilities from my perspective has been true 3D interconnection of stacked chips: chips that have had their backs thinned and are then electronically connected to each other not just via edge connectors (as normal) but also across their surface.
I first heard about a project that Irvine Sensors was doing in this area back in the late 1990s (John Carson, Chief Technical Officer at that time, was particularly interested in building brainlike systems). Lots of companies seem to be into this now, but most seem to be doing it as a way of trying to combine different materials systems (like gallium and silicon) and so different functionalities, rather than for pure connectivity reasons. Also, success has been limited: most teams have stacked just two or three chips rather than the dozens that Irvine Sensors had hoped for when they started working on the problem ten years ago.
So our current best hope is not yet a done deal. Calculations on the back of an envelope are easy. Engineering is hard...
Figure: Irvine Sensors scheme for 3D interconnection back in 1998.
Although I've no doubt that digital computing will be crucial to the development of intelligent robotics, one of my interests is in the other—often neglected—technologies that will also be vital to making it happen. One of these is mechanics. The video shown here (top) is Domo, one of the latest robots being developed at the MIT Computer Science and Artificial Intelligence Laboratory. As well as incorporating many new ideas, this robot builds on one concept that was developed at the AI lab a decade ago: the idea of compliant limbs.
Matthew Williamson, an earlier PhD student of Rodney Brooks, worked on the arms for Cog in the mid to late 1990s. As you can see in the animated image (middle) this robot saws in what looks like a natural way. This is clever because, unlike the robot arms you see in car manufacturing plants (or on the Honda robot, ASIMO, for that matter), the limbs do not work by calculating exactly where they are supposed to be at every moment. Instead, they have some give (compliance), provided by springs in the arm structure. This not only makes the robots safer because they push back, but also means that the robot need not 'micromanage' its arm to the degree that it would have to otherwise. It can count on the push/pull of the springs to do some of the work for it.
An even more obvious example of how physics can help is with locomotion. In the late 1990s, Andy Ruina developed a walking robot with his team at Cornell University. This may not seem very impressive: it's certainly not the first. However, it is completely passive: it has no 'brain' of any kind, it simply walks due to its structure and the laws of physics as you can see in the video (bottom). Essentially, the legs below the knee act as pendula, with the knee itself stopping them from swinging forward. A little potential energy (the slope) is all it needs for a fairly convincing walking gait.
Though the programming side of robotics is often thought to be the more glamorous, getting your hands dirty in the machine shop can make even more of a difference to the efficiency of the machine.
A few weeks ago, the British press was abuzz with stories about robot rights. A collection of roboticists and philosophers got together to debate the issue at the Dana Center in London as a result of a quasi-governmental report published towards the end of last year. This sparked a discussion on the influential BBC radio program Today: a show that is thought to have the ears of the political class running the UK.
Members of the US military have also been thinking about robot welfare. At a meeting of the British Computer Society last week, I heard Rodney Brooks, founder of iRobot and outgoing head of the MIT Computer Science and AI Laboratory, describe the relationship between one robot and its operator. After several successful missions, the packbot (an example of which is pictured) was destroyed. The operator brought it back to the makers and asked for it to be rebuilt. He didn't want a new one, he wanted it fixed. It was a good robot and they'd been through a lot together...
Mark Tilden has a similar story to tell: after watching a successful demonstration of one of his landmine-clearing robots, a colonel insisted that the test be stopped because it was inhumane.
I like machines. I spend more waking hours with computers than I do with people, on top of worrying about how to make them more intelligent. However, I have to say that I don't lose sleep about their well-being. I believe we could one day have conscious, self-aware robots: different from humans but feeling their own kind of pain or hunger. I do. And when we get remotely close to that state of technology, if it's in my lifetime, I expect I'll have an early opportunity to see (meet?) these machines and will choose to champion their rights.
However, in the meantime, my view is that this concern is absurd verging on obscene. For one thing, I'd rather we spend the time worrying about human rights, or even animal rights, at a time when they seem so underprotected: at least we know for sure animals feel pain. For another, if the civil liberties of humans (never mind animals) continue to be violated as flagrantly and often as they seem to be now, what chance is there that robot rights, whether legislated for or not, will be in any way respected?
Photo: The packbot, one of iRobot's products for the military.
Saw an interesting talk yesterday. The Visual Geometry Group from Oxford University claimed to have developed a technique for automatically tagging video content (both arbitrary objects and faces) so that it was easily searchable in real time. The faces could even be labelled as particular actors or characters using online resources. The goal was to have a video version of Google where you can search by names, faces, logos, object images, anything. Although searching is not an area I particularly follow, the work seemed to have important implications for machine vision and I was amazed at how effective it was as described in the presentation.
The system works by breaking images down into small features ('words') that combine, in their hundreds or thousands, to describe objects and scenes. The description is designed to work regardless of the size of the object in the frame, its position, or even its orientation. This indexing is done off-line and is currently quite slow, though they're hoping to speed it up to real time eventually.
Once the video (they've been working mainly with movies) has been indexed, we were told, then you can select any object on any frame and use that to search the rest of the movie. So, for instance you can select an object (they show an example where they choose a tie from the movie Groundhog Day) and then it comes up with other instances of that or other ties. The same thing should happen with other kinds of objects.
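As I understood the talk, the search side resembles a text search engine run over those visual 'words'. The sketch below is my own toy illustration of that idea, not the Oxford group's code: the frame IDs, the word IDs, and the functions `build_index` and `search` are all hypothetical, and a real system would presumably weight the words and check that they are arranged consistently in the image.

```python
from collections import Counter, defaultdict


def build_index(frames):
    """Map each visual-word ID to the set of frames that contain it."""
    index = defaultdict(set)
    for frame_id, words in frames.items():
        for word in words:
            index[word].add(frame_id)
    return index


def search(index, query_words):
    """Rank frames by how many distinct query words they share."""
    votes = Counter()
    for word in set(query_words):
        for frame_id in index.get(word, ()):
            votes[frame_id] += 1
    return votes.most_common()


# Hypothetical toy data: frame ID -> visual-word IDs found in that frame.
frames = {
    "frame_001": [3, 17, 42, 42, 88],
    "frame_002": [5, 17, 19],
    "frame_003": [42, 88, 91],
}
index = build_index(frames)

# The "selected object" is represented by the words inside the selection.
print(search(index, [42, 88, 17]))
```

Selecting an object amounts to collecting the words inside the selected region and looking up which other frames contain many of the same ones. That also hints at how such a search could latch onto a texture rather than an object, since the words are just small local patches.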
They also have a system for identifying and tracking actors in frames, and then adding the character/actor's name to the index using fansite transcripts of movies and TV shows.
I was quite excited to write about this after watching the presentation, and had intended to read up their papers to really understand the work better before writing this blog. But fortunately or unfortunately, I chose to use their demo first, and decided not to bother: the technology does not work nearly as well as they claimed... at least, not as far as I can see. I encourage you to go and try out the system for yourself and let me know if you have better success. I tried searching their demo movie Charade for many objects: a pillar, a car, a phone, letters and numbers, a passport, a bottle... and more. I really wanted it to work because I had been impressed with the talk. I even tried searching for a tie like the one in Groundhog Day.
What I found was that instead of being able to identify objects, the system seemed to be picking up minor geometrical features of objects like textures and general shapes (like long and thin) and found those. In the particular case of the Groundhog Day tie, I suspect that the system was finding the large cross-hatch pattern, not the tie itself. Maybe I was looking for the wrong things, I don't know. And maybe the face recognition system (which they have no online demo of) really does work better.
But it felt to me like the searches that work well are the exception rather than the rule, which goes to the credibility of the research... and suggests that machine vision really is in as poor a state as we all suspected.
Mark Tilden changed my life. In about 1998 I started to become interested in analog computing for intelligence and came across a paper called Living Machines that Mark wrote with Brosl Hasslacher a few years earlier. In it they talked about analog electronic creatures that were very different to any other robots I had seen before. The 'nervous networks' that drove them were made of very few transistors, capacitors, and resistors (dozens rather than hundreds) and yet, together, they performed a rich, natural, and robust set of behaviors. The sun-seeking robots were even being used for interesting applications like satellite guidance and mine-clearing. It was a great story and it helped me understand what was important about intelligence in a way I hadn't before.
What was more important, however, was that researching this piece made me more sensitive to other analog stories and introduced me to the community of neuromorphic engineers: people who try to build silicon circuits that emulate biological neurons. A case in point was Reid Harrison's work on fly vision at Caltech. Reid, I discovered, was part of a movement started by Carver Mead, who showed in his groundbreaking book that analog circuits could perform the same kind of functions that biological neurons did, and at the same time would consume many orders of magnitude less power than digital circuits. I started to get sucked into the field.
About two years later, in 2001, I got the opportunity to attend the Telluride Neuromorphic Engineering Workshop in Colorado: a critical experience in my life. There I learned about floating gates and address-event representation (which I'll get to in later posts), that I could climb from 8000 to 12000 feet in a couple of hours without dying, and that scientists and engineers are not the same. I also got to see the construction of the very first prototype of Mark Tilden's Robosapiens (pictured). I was so inspired that I offered to start a newsletter covering the field, which is produced on behalf of the Institute of Neuromorphic Engineering.
Although I, of course, have written about the movement (and will be writing a lot more about it in this blog), I'm amazed at how few people who claim to be interested in artificial intelligence are aware of it. This community has built robots that respond to the calls of the opposite sex, that can see using tiny vision chips and optical flow, that have bat-like hearing, and that walk using the same kind of internal oscillations that keep a chicken running after its head's been chopped off.
My theory is that neuromorphic engineering is still such a minority pursuit because it requires people to actually build things (rather than just program them). You have to solder and send chips off to be fabricated and all sorts of expensive and time-consuming things that really make getting your PhD finished difficult: you don't just type and run. Plus you have to do the maths related to analog electronics, which is not for the faint-hearted. The 'basic' analog VLSI course I had to do as part of my time at the workshop was well beyond me and my physics degree. In fact, companies are having difficulty recruiting analog engineers because so few bother to train.
Anyway, though small in numbers they're doing some very interesting work. To show some of it off without rambling on for much longer here, below is a little video of the last Telluride event I attended (2005), and the test of some of their robots at the end of the three-week workshop. You'll see robots navigating around a maze. Some are very intelligent, some are not (just dumb toys designed to show that the maze can't be navigated by accident). They use vision, sonar, feelers, and even smell as triggers to move around. And then there's Audio Sapiana (I think that's her name), irresistibly drawn to the voice of her mate...
Illustrations
Top: This robot is driven by a circuit based on the nervous system of a lobster.
Middle: The first prototype for Robosapiens, built at the Telluride Neuromorphic Engineering Workshop in 2001.
Bottom: Telluride 2005 video by Stuart Arnott of Red Planet.
When I discuss analog computation with most people, their instant reaction is to explain to me that there is no such thing as analog. For instance, one of the comments on an earlier post I wrote on analog vision pointed out that using a film-based camera rather than a digital camera doesn't give you infinite resolution, just a higher, finite resolution. In this case the limit is grains of silver on the film rather than the number of pixels on a camera array. This is true, but it also misses the point.
One of the main differences between physical computation (the slightly extreme form of analog I discussed in my PhD thesis) and digital computation is that the former is good at extracting meaning out of tiny fluctuations. Digital computing, for very good reasons, is designed so that small 'glitches' that could lead a calculation astray are completely ignored. This is great when doing maths, but not when processing signals.
Let me give a concrete example from a piece of research done a few years ago by Dana Anderson and his colleagues at the University of Colorado: a classic example, in my view, of how analog feedback can be exploited to pull signal from noise. Essentially the system (shown) performs a winner-take-all operation to solve the so-called 'cocktail party' problem where lots of noise (literally sounds in this case) overlaps making it impossible to hear what is going on. In his system, one of the components (one of the sounds) will dominate, extinguishing the rest.
Note: this system is really complicated optically, so don't feel bad if you don't understand my explanation on the first pass.
First, the sound signals are processed (you can see the details in one of their publications on the subject or on their website) and then used to vary the output of an array of laser beams (so the light carries the sound). These beams create a holographic pattern in the photorefractive medium, a crystal that changes the way it bends light when it is exposed to photons of a particular color. This pattern is then read out by another beam (called the 'loop' beam because of the shape of the circuit) that, as well as picking up information, helps strengthen the hologram where the patterns are similar, and weaken it where the patterns are not.
The loop beam is then made to interfere with a second beam with a similar but not identical frequency. This produces beats (where two high-frequency signals, mixed, produce a frequency that is the difference between the two). This new oscillation is low enough to be picked up by a detector. After being filtered and amplified, this signal is used to modulate the phase of the loop beam, altering the elements of the hologram that are strengthened and weakened by it.
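If the beat effect is unfamiliar, here is a small numerical sketch of it. It is only an illustration under my own assumptions: the two frequencies are audio-scale stand-ins (light oscillates enormously faster), the 'detector' simply squares the summed signal and averages over a short window, and the names `intensity` and `smoothed_intensity` are made up for this example.

```python
import math

F1 = 1000.0     # Hz, hypothetical stand-in for the loop beam
F2 = 1010.0     # Hz, hypothetical stand-in for the second beam
RATE = 100_000  # samples per second used for the averaging


def intensity(t):
    """Detector response: the square of the two summed fields."""
    field = math.cos(2 * math.pi * F1 * t) + math.cos(2 * math.pi * F2 * t)
    return field * field


def smoothed_intensity(t_center, window=0.002):
    """Average the intensity over a short window to remove the fast terms."""
    n = round(window * RATE)
    start = t_center - window / 2
    return sum(intensity(start + i / RATE) for i in range(n)) / n


# Sample one full beat period (1 / (F2 - F1) = 0.1 s) at five points.
for t in (0.0, 0.025, 0.05, 0.075, 0.1):
    print(f"t = {t:.3f} s   smoothed intensity = {smoothed_intensity(t):.2f}")
```

Once the fast oscillations are averaged away, what remains swings between roughly 2 and roughly 0 at the 10 Hz difference frequency, following 1 + cos(2π(F2 − F1)t). That slow swing is the part a detector can follow.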
In English, the loop beam initially reads out a hologram containing all of the signals. Since one is the strongest (even if only by a tiny amount) it has slightly more impact on the beam, and so also on the output of the detector. The feedback loop is then tweaked to enhance this effect. This feedback continues until the strongest signal becomes the only signal (winner takes all). You can actually hear this happening in their system if you use the player below the diagram.
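The winner-take-all idea itself can be sketched in a few lines, leaving out all of the optics. What follows is a toy model of my own, not a simulation of the Colorado system: each signal gets a boost proportional to its current share of the total, the shares are renormalized, and the loop repeats. The function name, the `gain` value, and the starting strengths are all assumptions chosen for illustration.

```python
def winner_take_all(strengths, gain=0.5, steps=100):
    """Boost each signal in proportion to its own share, renormalize, repeat."""
    total = sum(strengths)
    shares = [s / total for s in strengths]
    for _ in range(steps):
        # Feedback: a larger share earns a larger multiplicative boost.
        boosted = [x * (1.0 + gain * x) for x in shares]
        total = sum(boosted)
        shares = [b / total for b in boosted]
    return shares


# Three signals, one of them stronger by only a tiny amount.
final = winner_take_all([0.34, 0.33, 0.33])
print([round(share, 4) for share in final])
```

The first signal starts out only marginally ahead, but because the boost depends on the share, the lead compounds on every pass until the other two are effectively extinguished.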
We know we have winner-take-all circuits in the eye, so this kind of feedback is not just clever engineering, but biologically sound too. Even the tiniest hint of what's important can be leveraged into fuller knowledge. Of course, you can do feedback with digital too, but that would be silly: essentially you would be asking it to do something it was designed not to be good at. And, as I will discuss in a future post, you will pay a huge power penalty for your trouble.
Figure: This holographic system pulls out individual sounds from noise using feedback. Click the image to hear this happen.