Tell it like it really is

3rd October 1997, 1:00am

https://www.tes.com/magazine/archive/tell-it-it-really-0
If you were starting afresh to design new ways in which to use a computer, you would not invent a keyboard. The most natural way to communicate is to talk, and voice recognition systems - software that lets you dictate direct to the screen - have become both powerful and affordable.

“Voice recognition will open up computing to the 85 per cent of people who are non-participants today,” claims Gordon Moore of Intel, the firm that makes the processors for most of today’s PCs.

Current prices range from £42 to £459, depending on product, type of computer and version. However, performance is constantly improving while prices spiral downward. Within a few years, both hardware and software may be bundled within the price of every new computer.

Peter Kelway, of Aptech Ltd, near Newcastle, says: “By the year 2000, the keyboard will be redundant as the main input device, and many new computers will not have keyboards as standard.” In 1989, his company became the first in the world to market large-vocabulary voice recognition systems, and Aptech continues to offer demonstrations of all major systems with independent advice and support.

Given the speed of recent advances, these claims do not seem exaggerated.

In recent years, the competition has been among “discrete” products - software that allows you to declaim only one word at a time, enforcing a slow, stilted style of dictation. Such software is unable to cope with the way we elide our words, so that “recognise speech” may come out as “wreck a nice beach”.

Today, the main contest in the Windows PC field is between two sources: Dragon and IBM. DragonDictate costs £139 (for the new Classic version), and allows voice control of all aspects of Windows as well as dictation into a variety of word processors. IBM’s Simply Speaking Gold costs £89 (the basic version at £42 has restricted abilities).

The costs of buying a computer sound card and headset microphone currently outweigh price differences between products. Cutting corners on the hardware is unwise, as the software is already facing formidable technical problems: giving it the extra handicap of coping with a degraded sound signal is a false economy.

The two companies have a history of leapfrogging each other’s releases, but nevertheless the two products maintain different personalities. In general, DragonDictate is more versatile. While both products can be used for dictation into Microsoft Word, if you prefer another word processor you are strongly advised to use DragonDictate. This can also dictate into spreadsheets and databases, and offers easy control of mouse movement, so you can direct all aspects of Windows with your voice. This is particularly important for disabled users. DragonDictate is a natural choice for educational use.

By contrast, the IBM system is designed for wordsmiths, and offers a range of specialist dictionaries tailored to the needs of adult writers, businesses and professions. Some people claim higher productivity from Simply Speaking than from DragonDictate.

Among Aptech staff, opinions differ about which system is faster or more accurate. Performance depends on individual voices and training expertise, and users, once committed to a system, will invest long hours in helping it to improve. The 20-40 minute “enrolment” period is just the beginning of a long relationship, and the system should continue to improve over weeks and months. During this period, the user’s friends can expect to become victims of proud demonstrations.

Both systems work well in experienced hands. Despite some errors, they were uncannily accurate and fast compared with keyboarding. They seem to interpret context sensibly, coping perfectly, for example, with “write a letter to Mr Wright right now”.

Unlike keyboards, which generally either work well or not at all, voice input is a tender plant. Speed and accuracy depend not only on processor speed and available memory, but also on the optimisation of a range of software settings, the presence of the user’s previously corrected documents and voice files, and systematic updating of dictionaries. In practice, therefore, dictation is much less portable between machines than keyboarding.

On Macintosh computers, voice recognition is available through PowerSecretary, a product based on Dragon technology which is functionally similar to DragonDictate. Minimum memory is 24 megabytes, and you need a PowerPC with 16-bit sound input.

PowerSecretary, as demonstrated, seems as effective as DragonDictate, and the ways in which it can be used are more elegant. This comes at a price: the Personal Edition (for dictation into one of four software choices) costs £259, while the Power Edition (dictation into most software, with larger active dictionary) costs £459. In both cases a high-quality headset microphone is included.

For years, talk among developers has been of whether and when the Holy Grail of “continuous” dictation might arrive; predictions of early next century have been dismissed as optimistic. In June, Dragon Systems created a sensation by releasing NaturallySpeaking, the world’s first truly continuous voice recognition. IBM followed shortly after with ViaVoice, the continuous relative of Simply Speaking. Both products work well, and are a joy to use.

There are strong advantages in joining in at this stage. Those who are practised with the discrete systems first have to unlearn their halting, Dalek-like delivery before they can get the best out of the continuous products, whereas novices such as myself can rejoice in rapid-fire delivery of meaningful phrases.

Naturally, continuous dictation makes greater demands on PC hardware, typically requiring at least a 133 or 150 MHz processor with 32 to 48 megabytes of memory. Alas for Macintosh users, there is no immediate prospect of a continuous product, although discrete dictation should work well under System 8.

A further dimension to all this is the role of voice output, in two distinct ways. The first is voice playback - in which each dictated document is stored and can be played back. The second is text-reading, where the system converts dictated text into synthesised speech. For mainstream users, this benefits proof-reading and checking punctuation, and it can be helpful for the dyslexic.

At present, neither IBM product offers text-reading, although both provide voice playback. DragonDictate (discrete) version 3 has text-reading, but NaturallySpeaking (continuous) does not yet. However, Dragon has announced NaturallySpeaking Deluxe, which offers text-reading as well as full command and control. By the time it ships, ViaVoice Gold from IBM will offer fresh competition.

You may be wondering by now if these words were dictated straight to screen, bypassing weary fingers altogether? Alas, no: even if the Notino notebook package kindly loaned by Hi-Tech had featured the more powerful software, dictating it on the Newcastle train would have been downright anti-social.

Aptech Ltd, Aptech House, Meadowfield, Ponteland, Newcastle upon Tyne, NE20 9SD, tel 01661 860 999, provides independent advice as well as supplying software and hardware.

Hi-grade Ltd is on 0800 074 0401 and kindly loaned a Notino Listener V7 for this review; this is a high-range Pentium multimedia laptop package with DragonDictate and other software that sells for £1,429 plus VAT.
