Historical document or fabulous fraud? Rob Churchill examines a mystery that has fascinated researchers for a century
In 1912, Wilfred Voynich, a Lithuanian refugee and antiquarian book-dealer, was travelling through Italy in search of rare books. In the Villa Mondragone in Frascati, in a chest of old manuscripts, one volume caught his eye. Covering more than 200 vellum pages were illustrations of unknown plants, astrological diagrams and bizarre tableaux of women bathing in green pools. More mystifying still was the unrecognisable script, which used strange characters to form an unknown language or cipher and which has subsequently earned Voynich's find the title "the world's most mysterious manuscript".
He believed he had discovered an important medieval document, and by the time of his arrival in New York in 1914 he had attached a price tag of $160,000 to it. But he was never to find a buyer, and the "Voynich manuscript" is now the property of the Beinecke Library at Yale University.
It has, however, remained a subject of fascination for professional and amateur cryptologists alike, largely due to the fact that not one word of the text has ever been successfully deciphered.
On first viewing, the manuscript is a somewhat unprepossessing and rather dog-eared volume, approximately 15cm by 22cm. But once inside, the vellum leaves are decorated with brightly coloured illustrations, the subjects of which have been used to divide the book into thematic sections. The largest section, which opens the manuscript, contains well over 100 drawings of fabulous plants, none of which has been identified. The remainder is divided into 25 circular astral diagrams, a "biological" section containing the drawings of nude females, and a pharmacopoeia of herbs and roots with, apparently, accompanying recipes. Most bizarrely of all, there is a large multi-folded page with nine inexplicable "rosettes", cell-like structures decorated with what appear to be stars, flames, petals and even organ pipes.
The plant illustrations have led some investigators to suggest that the work is a "medieval herbal", an encyclopedic work on the medicinal properties of plants. This explanation, however, fails to account for the other, even more puzzling pictures that fill the volume, which have been interpreted variously as alchemical secrets, a guide to herbal contraception, or even evidence of mental illness in the illustrator.
Most investigative effort has been concentrated on achieving a decipherment of the text, written in its own flowing script. While certain characters appear to have a passing resemblance to Latin letters and Arabic numerals, the majority are wholly original. The creation of cipher alphabets using new and abstract symbols was not unknown in the Middle Ages and the Renaissance, but to understand what makes this manuscript so cryptographically unique, it is necessary to delve into the murky world of ciphers and cipher-breaking.
Ciphers fall into two broad categories: transposition and substitution. A transposition cipher, as the name suggests, transposes or rearranges the letters of the original text, thus creating an anagram. In a substitution cipher, the letters of the message remain in their correct order, but are substituted or replaced by other letters, numbers or symbols. One of the simplest and oldest forms of substitution cipher, recorded by the Roman historian Suetonius, is the Caesar shift cipher (see box on page 10). To encrypt a message, one replaces each letter in the original message with a letter that is an agreed number of places further along the alphabet. In the example given here we have a "shift" of five, as the two alphabets have been shifted five places relative to each other. To encipher, take each letter of your message, find it on the top row and replace it with the corresponding letter in the cipher alphabet below. For example, the message "the Voynich manuscript" would read: "YMJ ATDSNHM RFSZXHWNUY".
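The shift-of-five example can be sketched in a few lines of Python. This is only an illustration of the idea described above; the function name caesar_encrypt and its shift parameter are invented for the sketch and do not come from any historical source.

```python
import string

def caesar_encrypt(message: str, shift: int = 5) -> str:
    """Replace each letter with the one `shift` places further along the alphabet."""
    plain = string.ascii_lowercase
    # Build the cipher alphabet by rotating the plain one, e.g. "fghij...abcde" for shift 5.
    cipher = plain[shift:] + plain[:shift]
    # Map lower-case message letters to upper-case cipher letters; spaces pass through.
    table = str.maketrans(plain, cipher.upper())
    return message.lower().translate(table)

print(caesar_encrypt("the Voynich manuscript"))  # YMJ ATDSNHM RFSZXHWNUY
```

The key is nothing more than the number of places shifted, which is why the scheme is so easy to remember and, as it turns out, so easy to break.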
Unfortunately, the Caesar shift cipher does not afford the sender of the message much security. There are only 25 positionings available when shifting an alphabet like this; and so if the message were to be intercepted, it would not take long for a cipher-breaker to try out all the possible shifts and discover the message. Much greater security is provided if the cipher alphabet is a random rearrangement of the plain alphabet rather than just a shift.
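To see how little work an interceptor faces, the sketch below simply tries every shift in turn. The helper names caesar_decrypt and all_shifts are hypothetical, chosen for this illustration; the final line counts the possible keys of a randomly rearranged alphabet for comparison.

```python
import math
import string

ALPHABET = string.ascii_uppercase

def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Undo a Caesar shift by moving each letter `shift` places back."""
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    return ciphertext.upper().translate(str.maketrans(shifted, ALPHABET))

def all_shifts(ciphertext: str) -> dict:
    """Return the candidate plaintext for each of the 25 possible shifts."""
    return {shift: caesar_decrypt(ciphertext, shift) for shift in range(1, 26)}

candidates = all_shifts("YMJ ATDSNHM RFSZXHWNUY")
for shift, text in candidates.items():
    print(f"{shift:2d}  {text}")  # only one of the 25 lines reads as English

# A random rearrangement of the alphabet allows 26! orderings, roughly 4 x 10^26.
print(math.factorial(26))
```

A reader scanning the 25 lines will spot the message at shift five; no comparable shortcut exists for listing every random rearrangement.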
Throughout Europe in the Middle Ages, simple substitution ciphers were the only available secure way of encrypting a message - and in fact the only means necessary, as no one had devised a way of breaking them. However, by the beginning of the 15th century a new type of cipher began to appear in response to the use of frequency analysis as a cipher-breaking tool. The power of frequency analysis comes from one simple fact: not all letters in a given language appear as often as others. In English, for example, the most commonly occurring letter is E, followed by T, A, O, I, N and so on until Q and Z, the least frequent. If the cryptanalyst has an idea of the language of the original message, they can easily establish the frequency of letters by sampling just a few pages of a text written in that language.
By comparing the frequency with which different characters appear in the cipher text with their known frequency in the language, one can crack the cipher.
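A rough sketch of the counting step is given below. The sample ciphertext is a made-up sentence enciphered with a Caesar shift, and the names letter_frequencies and guessed_shift are hypothetical. The guess assumes the commonest cipher letter stands for E, which is only likely to hold when there is enough text to work with.

```python
from collections import Counter

def letter_frequencies(text: str) -> list:
    """Return (letter, relative frequency) pairs, most common first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, count / total) for letter, count in counts.most_common()]

# Hypothetical intercepted message (an English sentence under a Caesar shift).
sample = "YMJ WJXJFWHMJW XJJX YMJ XJHWJY QJYYJWX"
ranking = letter_frequencies(sample)
print(ranking[:3])

# Assume the commonest cipher letter represents E, the commonest letter in English.
most_common_cipher_letter = ranking[0][0]
guessed_shift = (ord(most_common_cipher_letter) - ord("E")) % 26
print(most_common_cipher_letter, guessed_shift)
```

With a random cipher alphabet the same counts are used, but each cipher letter has to be matched to its plain letter one at a time, working down the frequency table and testing the guesses against likely words.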
The great advance of Renaissance cryptologists was to introduce a number of cipher alphabets into the encryption process, thus creating what have become known as poly- or multi-alphabetic substitution ciphers. The first person to adopt this method in the 15th century was Leon Battista Alberti, who became known as the father of modern cryptology. Alberti at first used only two cipher alphabets, but his ideas were developed by others until up to 26 different cipher alphabets could be employed to encrypt a single message. Such multi-alphabetic ciphers were to remain unbroken until the 19th century.
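The principle can be sketched with a keyword that selects a different shifted alphabet for each letter of the message. This is the later Vigenère-style arrangement, used here as an assumption for illustration; it is not a reconstruction of Alberti's own device, and the keyword "KEY" and the function name are invented for the example.

```python
import string

ALPHABET = string.ascii_uppercase

def polyalphabetic_encrypt(message: str, key: str) -> str:
    """Encrypt each letter with the shifted alphabet chosen by the next key letter."""
    out = []
    i = 0  # counts letters only, so spaces do not use up key letters
    for ch in message.upper():
        if ch in ALPHABET:
            shift = ALPHABET.index(key[i % len(key)].upper())
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
            i += 1
        else:
            out.append(ch)
    return "".join(out)

print(polyalphabetic_encrypt("seen book", "KEY"))  # CICX FMYO
```

Because the two Es in "seen" are enciphered with different alphabets, they emerge as I and C, and the same plain letter no longer maps to a single cipher letter. That is what blunts a simple count of letter frequencies.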
If the Voynich manuscript is the product of the late Middle Ages, then a simple substitution cipher would be the most likely encryption method used.
This form of cipher should be relatively easy to break, especially given the huge amount of text to work with, yet for 90 years attempted decipherments via this route have failed miserably (see "The wrong key", right): no one yet knows in which language the original text was written, the single most important piece of information required for frequency analysis. Most scholars of the Voynich manuscript have assumed that because it looks like a product of the late medieval or early modern era, the language is most likely to be Latin, yet frequency analysis based on this assumption has failed, as have analyses based on Greek, English, German, Italian and most other major European tongues. Furthermore, there is the recurring difficulty of determining exactly how many characters there are in the "Voynichese" alphabet, as some of the more complex symbols appear to be compounds of two or more simpler characters.
The use of a multi-alphabetic cipher also seems unlikely, because of the level of entropy (a measure of disorder or randomness) present in the Voynich text. Most languages have a relatively low entropy because they possess a discernible structure imparted by grammatical and syntactical rules. Multi-alphabetic ciphers were developed to increase the level of entropy, creating apparent disorder and concealing any recognisable patterns such as differing letter frequency and repeated letters (eg "ee" in seen or "oo" in book), which could be exploited by cryptanalysts to break a cipher. But the Voynich manuscript has very low entropy, with a vast amount of apparent linguistic structure.
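For readers who want to see what such a measurement involves, here is a minimal sketch of the simplest version: entropy computed from single-letter frequencies, in bits per character. It is an assumption that this first-order measure is a fair stand-in; studies of the manuscript may use more refined statistics that also take letter sequences into account. The function name is hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """First-order entropy in bits per letter, ignoring anything that is not a letter."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    total = len(letters)
    counts = Counter(letters)
    # Sum p * log2(1/p) over the observed letters; an empty text gives 0.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

print(shannon_entropy("aaaaaaaa"))  # a single repeated symbol: no disorder at all
print(shannon_entropy("abcdabcd"))  # four equally likely symbols: 2 bits per letter
```

A text in which a few letters dominate scores low, while one in which all 26 letters appear equally often approaches the maximum of about 4.7 bits per letter.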
An altogether different possibility, and one championed by the leading US cipher-breaker of the Second World War, William Friedman, is that the Voynich manuscript is written in code. A code differs from a cipher in that it replaces entire words in the original message with other words, numbers or symbols. To operate such a system requires a code-book, somewhat like an English to French dictionary, in which one could look up the words of the original message and find the code equivalent. Friedman spent much of his spare time after the war pondering the Voynich conundrum, and instigated two study groups, eventually concluding that the manuscript represented an example of an a priori code. Such codes operate on slightly different principles, and are essentially synthetic languages in which human experience is divided into sets of logical categories, in much the same way as Dr Roget classified words when creating his famous thesaurus. Numbers, letters or even musical notes could then be used to represent the various logical sets and sub-sets to create the coded language.
A number of a priori-style languages were devised during the 17th century as Latin declined as the lingua franca of scholars. If Friedman's hypothesis is correct, then unless the original code-book is found or the classification system of the a priori language is unravelled, there seems little chance of ever reading the secrets of the manuscript.
Yet another theory suggested by some researchers is that the manuscript is entirely without meaning, and the unreadable text and unfathomable illustrations are a deliberate fabrication to hoodwink a gullible buyer.
Some have even named a candidate for the historical hoaxer.
Since 1912, concurrent with their attempts to decipher the manuscript, researchers have gone to considerable lengths to establish its provenance and previous owners. The first to undertake this work was Voynich himself, aided by a letter he discovered with the manuscript. Dated 1665 or 1666 (the handwriting is not clear), the letter is from the former Rector of the University of Prague, Johannes Marcus Marci, to the renowned scholar Athanasius Kircher in Rome, entrusting him with the manuscript in the hope that he might uncover its encrypted secrets.
The letter also states that the volume was once the property of the Holy Roman Emperor Rudolf II, who apparently spent the considerable sum of 600 ducats to procure it, and suggests that the original author was the 13th-century English Franciscan friar and proto-scientist Roger Bacon. But if this were the case, how had the manuscript found its way to Bohemia? The courier suggested by a number of scholars is John Dee, the Elizabethan alchemist, mathematician, occultist and probable inspiration for Shakespeare's Prospero. Dee is known to have been an avid collector of Bacon's works, and certainly travelled to the court of Emperor Rudolf in Prague.
More intriguing still is an entry in Dee's still extant diaries, recording the receipt of 560 ducats during his stay in Europe. A payment for the mysterious manuscript perhaps? But those who suggest the volume is a forgery often point to Edward Kelley, an accomplished trickster and con man who accompanied Dee on his continental journeys, posing as a scryer or crystal ball gazer, and regularly duped his gullible partner with bogus angelic messages supposedly received via his "scrying stone". Might Kelley have fabricated the manuscript to gain the patronage of the Emperor Rudolf?
Voynich continued to believe his find was the work of Bacon and in 1921 the manuscript enjoyed a brief period of fame when Voynich and William Newbold, professor of philosophy at the University of Pennsylvania, announced they had cracked the cipher. Professor Newbold claimed to have broken a complicated, multi-stage encryption process (which included elements of both transposition and substitution ciphers as well as a form of microscopic shorthand), to uncover Bacon's original text. Most astonishing of all were the scientific secrets Newbold had apparently discovered. It seemed from his decipherment that Bacon had invented both the microscope and telescope hundreds of years earlier than had previously been thought.
Unfortunately for Voynich's hopes of a lucrative sale, and even more so for Newbold's academic reputation, the decryption process and resultant deciphered text were later shown to be the products of the professor's rather fevered imagination. The multi-stage encryption process, although based on some of the cryptographic principles Bacon was known to have employed (see Security codes, left), was shown to be unworkable, while the supposedly ground-breaking decipherments were pure fiction, though most likely a product of Newbold's wishful thinking rather than any malicious intent.
Now the net has been brought into play in the struggle to unlock the secrets of the manuscript. Type "Voynich manuscript" into any search engine and you will find hundreds of sites and postings dedicated to the mysterious volume. There is also an email list where amateur cryptologists share their research and insights. Yet, despite all their efforts, the Voynich manuscript remains stubbornly impervious to every cryptological attack. The only certain fact is that it continues to ensnare and entrance successive generations of would-be solvers, and remains true to its epithet of "an elegant enigma".
Rob Churchill is the author, with Gerry Kennedy, of Voynich Manuscript: the Unsolved Riddle of an Extraordinary 16th-century Book which even today defies interpretation (Orion Publishing, £16.99). Tel: 020 7240 3444
Message alphabet:     a b c d e f g h i j k l m n o p q r s t u v w x y z
Caesar shift cipher:  f g h i j k l m n o p q r s t u v w x y z a b c d e
Random cipher:        J Z O N D H Q L B W S M G A E V U K Y I C X R P F T