SpeechWare Work

Project details

Description

In light of the move toward distributed networks, small-footprint synthesis was investigated as a potential direction for a new version of the DST system. The following possibilities were covered:

- reduction of processing load, relevant if a future system is to run on a small device such as an iPAQ;
- reduction of executable image size;
- reduction of program memory used;
- reduction of database size.

All but the last point were judged to involve considerable effort for little gain.

There are many alternatives for shrinking the database, and some of them can be combined. Depending on the application's quality requirements (for example, in telephony applications), some reduction measures may cause little or no perceived quality loss, though in all cases this should be checked with naive listeners. The most dramatic reductions in storage come if the current platform is abandoned. However, such a move nearly guarantees some loss in quality, requires a substantial investment of effort, and means supporting two systems. Any decision to undertake such modifications needs to weigh the capacity of the target platform, how much that capacity will increase in the near future, and user response to a loss of naturalness and intelligibility. Judgements about these factors should drive the decision whether or not to invest the needed resources.

The quality of both the male and female databases was improved. The female database was reanalyzed and its concatenation improved. The male database was extended to include context-sensitive diphones. Changing the quality of the male voice was also investigated, with the intent of creating additional voices for the TTS system without recording new audio.

The SGM code was completely redesigned and rewritten. The result is thread-safe, slightly smaller than the first release, and runs just as fast. Most importantly, the amount of program memory used is now bounded instead of growing with the size of the text input. (Ove Andersen, Charles Hoequist)
Status: Finished
Effective start/end date: 31/12/2003 → 31/12/2003