COLUMN

290,500 Oxford words to live by

Compiling a dictionary is like painting the Brooklyn Bridge: it takes years to complete and, once finished, one must begin again

DIANE FRANCIS February 20 1989

A modernistic building on the campus of Ontario’s University of Waterloo was named the William G. Davis Computer Research Centre last November after Ontario’s former premier and, before that, education minister. It houses not only the school’s large, entrepreneurial computer science department but also, in a spartan cluster of rooms, a project called “New OED,” which stands for the New Oxford English Dictionary Project. And over the past two years, a team of 10 Canadians, led by professors Frank Tompa and Gaston Gonnet, has created state-of-the-art computer software, which is sure to maintain the OED’s pre-eminence. The innovative software also promises to yield substantial royalties for the University of Waterloo.

Like many writers, I have had a lifelong love affair with words. That is why it was pure pleasure to sit down for an afternoon with Tompa as he took the system through its paces. The text of the OED is owned by Oxford University Press, an offshoot of the English university, and the software belongs to the University of Waterloo. The software could have important commercial value when the dictionary is published electronically. It may also benefit other publishers or database owners because it is capable of retrieving a wealth of complicated information from the voluminous dictionary database. In addition, the system will permit the dictionary to be produced and updated more quickly than before.

The latest edition of the OED will go on sale this March at a cost of $3,125 for a set of 20 volumes. Oxford University Press also produces many other spin-off dictionaries, including the Pocket Oxford Dictionary, the Shorter Oxford English Dictionary and the Concise Oxford Dictionary of Current English, but this monster is the granddaddy of them all. That is because the OED is more than just a dictionary. It contains more words than any other dictionary in any other language, as well as a history of each word. The latest version took 31 years to update and will include 290,500 main words, their etymologies and about 2.4 million quotations illustrating shades of meaning and differing usage. Oxford University Press’s lexicographers always include at least one quotation per century of usage.

Shakespeare is by far the most quoted author. Tompa typed in the file name “shaks” and received 32,798 matches on the video screen. Sir Walter Scott is the second-most quoted with 16,169, followed by John Milton at 12,282, religious reformer—and one of the first prolific writers in English—John Wycliffe with 11,916 and Geoffrey Chaucer, who comes in fifth with 11,692.

The joint venture between the two universities has been a considerable success. The University of Waterloo’s research and development costs were paid for by two federal government grants totalling $1.7 million. Oxford University received free equipment and expert help from IBM United Kingdom Ltd., as well as a $760,000 grant from the British government, to defray the cost of putting the data onto electronic disks. Now, with one year of the software research left to go, the University of Waterloo’s project leaders are turning their attention to marketing their revolutionary software to encyclopedia publishers and the owners of commercial databases.

The system’s designers have tried to anticipate the needs of all users. On request, it will provide an extensive list of adverbs that do not end with “ly” (including “well,” “outright” and “highest”) or a small dictionary of the English language in the 17th century. “Want to know how many words are derived from Hebrew?” asked project manager Timothy Bray. He typed a coded request onto his screen, and up popped the information that there are 269 Hebrew-derivative words, including “behemoth” and “sapphire,” within the dictionary. Seconds later, the system regurgitated a list of 602 words derived from Arabic, including “alcohol,” “allah” and “admiral.”

Despite the speed of the new software, compiling a dictionary is equivalent to painting the Brooklyn Bridge: it not only takes years to complete but once finished one must begin again. The last complete edition of the OED was finished in 1928 and published in 13 volumes in 1933, including a supplement. It was updated between 1972 and 1986 by a four-volume supplement, which superseded the old supplement. The new, second edition will merge the 1928 version, the supplement and 5,000 new words or meanings.

Words enter the OED if they are read by its lexicographers or arrive from outside contributors. References are double-checked for accuracy, then the OED’s staff adjudicates. The OED always quotes the earliest use known.

I asked Tompa how many quoted references there are to Maclean’s magazine, and the system replied with 199, by far the highest among Canadian magazines. Within seconds, he determined that, of these, Maclean’s is credited with the first use of 14 words or new meanings, among them “love-bird,” in the sense of “lover,” in 1911, “toaster” in 1913, “stone-age,” in the sense of “hopelessly outmoded,” in 1927, “sasquatch” in 1929 and “toot” in 1977, meaning a snort of cocaine.

The system can also search out references to individual authors. Peter Gzowski is quoted 11 times from 1974’s The Book About This Country in the Morning to illustrate usage of “nudie,” “pogey,” “rebroadcast” and “shtik,” among others. Robertson Davies has several quotes to his credit, although Margaret Atwood has none. References are spotty, dependent upon what is submitted or read.

Sometimes the software yields surprises. During last fall’s christening of the University of Waterloo’s new computer science building, Tompa asked the system to produce any references to the guest of honor, William Davis. Out popped two, including a quote used to define the word “premier.”

But the second credited him with introducing to the language the word “ridership” in a speech. When asked, the equally surprised Davis recovered quickly. And the ever-political Davis said, “Of course, I made it up.” According to the OED, that is.