Last year, I approached a very different pattern recognition problem: recognizing simple shapes. For this task, I used a neural network. A neural network is a collection of layers of neurons, loosely modeled on the structure of the human brain. Each layer of neurons is connected to the next, and each connection has a certain weight. During training, every time the network processes an input, it adjusts these weights to bring its output closer to the desired output for that input.
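The weight adjustment described above can be sketched in C++ for a single neuron. This is only an illustration under stated assumptions, not the code from either project: the `Neuron` type, the sigmoid activation, the learning rate, and the delta-rule update are all choices made for this example, since the description does not say which activation or training rule was used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic activation: squashes any real number into the range (0, 1).
// (Assumed for this sketch; the original activation is not specified.)
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// A single neuron with one weight per incoming connection plus a bias.
// (Hypothetical type, introduced for illustration only.)
struct Neuron {
    std::vector<double> weights;
    double bias = 0.0;

    explicit Neuron(std::size_t n_inputs) : weights(n_inputs, 0.0) {}

    // Weighted sum of the inputs, passed through the activation.
    double output(const std::vector<double>& in) const {
        double sum = bias;
        for (std::size_t i = 0; i < weights.size(); ++i) {
            sum += weights[i] * in[i];
        }
        return sigmoid(sum);
    }

    // One training step: nudge every weight so the output moves toward
    // the desired value. Returns the error measured before the update.
    double train(const std::vector<double>& in, double target, double rate) {
        const double out = output(in);
        const double error = target - out;
        // Slope of the squared error through the sigmoid (delta rule).
        const double delta = error * out * (1.0 - out);
        for (std::size_t i = 0; i < weights.size(); ++i) {
            weights[i] += rate * delta * in[i];
        }
        bias += rate * delta;
        return error;
    }
};
```

Calling `train` repeatedly with the same input and desired value shrinks the gap between the output and that value. A full network repeats this idea across several layers, passing the error backward from the output layer to the earlier ones.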
This year, I decided to apply the same technique to a different problem: speech recognition. As far as I knew, no one had used neural networks as the core of a speech recognizer before, but I thought that they were up to the task.
I designed and constructed a voice recognition system in C++, using a flexible neural network and several mathematical techniques to abstract the sounds. Then, I tested the system with several short, monosyllabic words. First, I trained the network on two samples of each word, and then I tested it on those same two samples. Within these training sets, the recognizer worked quite well. With other samples of the words, the recognizer achieved moderate results. With larger vocabularies, the recognizer was only able to identify about half of the words. However, considering the great variation in the waveforms of words, I think that this system can be deemed a success.
The practical applications of this program concern the future of transportation in America. Programs such as the one we plan to write will guide cars to their destinations using the Global Positioning System (GPS), without the driver having to be aware of conditions outside the vehicle. GPS will tell the car where it is, and our program will tell the car how to get to its destination. Our idea is that this will reduce vehicular traffic and benefit the environment by cutting down on the wasteful driving that most people do. With less excess driving, people will also find that driving becomes more cost-effective.
The programs we have used during our project are Netscape 2.0 (a World Wide Web browser), Spyglass Transform and Plot (a Macintosh graphics package), and NCSA Telnet 2.6. In addition to these programs, we have used a Power Macintosh 5300, a Sun 4c workstation at Colorado State University (CSU), and a CRAY Y-MP supercomputer at the National Center for Atmospheric Research (NCAR).