ELT NewsWeb  

Interview

Della Summers


You restrict definitions in LDOCE to the DV, the defining vocabulary of 2,000 words.
Yes, if lexicographers go outside the DV, the word is ringed in orange so they know they've done something wrong! It's the basis of the Longman ELT dictionary approach. It was invented in 1935, when Longman published the first ELT dictionary. A definition only uses words that the student is assumed to know to explain the word that they're looking up. This improves the definition so much.
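(The check described above, flagging any word in a definition that falls outside the defining vocabulary, can be sketched in a few lines of Python. This is only an illustration and not Longman's actual tooling: the tiny word set, the function name `words_outside_dv`, and the simple tokenizer are all assumptions, and a real checker would also need to handle inflected forms.)

```python
import re

# Hypothetical miniature defining vocabulary for illustration only;
# the real list runs to about 2,000 words.
DEFINING_VOCABULARY = {
    "a", "machine", "that", "can", "fly", "in", "the", "air",
    "plane", "with", "wings",
}

def words_outside_dv(definition, dv=DEFINING_VOCABULARY):
    """Return words in `definition` that are not in `dv`, in order of first appearance."""
    # Lowercase the text and pull out alphabetic tokens (allowing an apostrophe).
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", definition.lower())
    flagged = []
    for token in tokens:
        # No stemming here: inflected forms would be flagged unless listed.
        if token not in dv and token not in flagged:
            flagged.append(token)
    return flagged

# A definition that stays inside the vocabulary produces no flags;
# one that uses "vehicle" gets that word flagged.
print(words_outside_dv("A machine that can fly in the air"))
print(words_outside_dv("A vehicle that can fly"))
```

(Returning the flagged words, rather than a simple pass or fail, mirrors the "ringed in orange" idea: the writer sees exactly which words to replace.)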

An example I saw was that the word "aircraft" would be replaced with "flying machine."
Well, that must be an old example because we wouldn't do that now. It would almost certainly be "plane" or we'd actually use "aircraft." We revise the defining vocabulary, which is listed in alphabetical order at the back of the book. When we revise it, we use two things - the Native Speaker Corpus, which shows the frequency of usage of words in natural English, and we compare that to the frequency of words in our Learner's Corpus. Longman's is the first learner's corpus and is made up of scripts from some 50,000 students all round the world - the biggest part is from Japan. It's put into the computer, preserving the mistakes, which we're interested in too. And the bits that the students get right, we are justified in adding them to the DV.

What is the current situation and the future for machine translation?
I sometimes get e-mails from Japan from people who use machine translation, and I spot them straight away. They really are unbelievable. I think we're still some way off, aren't we? I suppose that at some point it will happen.

I think there's something like 50-60% satisfactory resolution of translation on the Internet with things like AltaVista's Babel Fish. But it's still nothing like true translation. I have to be honest and say that I don't know what the problem is. But my assumption is that there isn't enough linguistic information in the system. I think parallel corpora are the way forward, but it would require a huge amount of data and money.

What unique challenges does the Japanese language present as far as translation is concerned?
Well, the total lack of congruence between Japanese sentence structure and English is a major hurdle for students. There are other problems, cultural problems, issues of social hierarchy that often come into play. As David Crystal says, language is culture. So if you take a sentence like "Would you please close the window?" there are several different ways of rendering that in Japanese, so the person translating it will need to put in that there's a social hierarchy which is not usually implied in the English. English is more neutral in that sense, or less dependent on the context, on whether one speaker is older or younger, a man or a woman and so on.

Can you tell us a bit about the whole area of corpora? Was the British National Corpus the first major project of its kind?
Well, I suppose that depends on what you mean by major. The person who started the whole corpus revolution was Randolph Quirk at the Survey of English Usage at University College London, which involved surreptitiously recording people talking. Professor Crystal was a student of Quirk and they are both still advisors of ours. There was also the Brown Corpus of American English. The next major development was the COBUILD work by John Sinclair at Birmingham University, where they started off with a small corpus based mainly on newspapers.

The way that we developed the British National Corpus was in a way a kind of response to that. I've always been extremely committed to the idea of a representative corpus. So if you just use newspapers and you look at the word "interest," it's quite likely that one of the most common usages will be about interest on a loan or financial interest, but in a representative corpus, that's never going to be the first meaning.

So at Longman, we put the most frequent use of a word first, even if that meaning is a phrase. A classic example of this is "on the lookout" - "lookout" meaning someone up in a treetop or something is a pretty rare usage. So we have built balanced corpora. The BNC is made up of 100 million words, about 90% written and 10% spoken. The written content is made up of a certain percentage from newspapers, a certain percentage from fiction and so on.
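(The "most frequent use first" principle can be illustrated with a short Python sketch. It assumes each corpus occurrence of a word has already been labelled with a sense; the labels, the counts, and the function name `order_senses_by_frequency` are invented for the example and are not taken from any real corpus.)

```python
from collections import Counter

def order_senses_by_frequency(tagged_occurrences):
    """Order sense labels from most to least frequent in the tagged sample."""
    counts = Counter(tagged_occurrences)
    # most_common() sorts by descending count.
    return [sense for sense, _ in counts.most_common()]

# Invented sample: each item is the sense assigned to one corpus occurrence
# of "lookout". The phrase sense is the more frequent one here.
sample = ["on the lookout"] * 5 + ["person keeping watch"] * 2
print(order_senses_by_frequency(sample))
```

(With a newspaper-only sample the counts, and so the order, could come out differently, which is the argument for a balanced corpus.)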

We worked mostly on the spoken part, using a market research company to select people demographically, a thousand volunteers originally, from all over the UK. They carried a Walkman around with them all the time, recording what they said or what people said to them. David Crystal made a joke that we didn't have anything about sex in the spoken corpora. I'm afraid it's not true - we had lots of sex! Some purists say that the language recorded that way is not really spontaneous and natural, but when you hear some of it, you realize that it is, because the people who are speaking to the person with the Walkman don't know they're being recorded.

So all that is then keyboarded and used by the lexicographers to review on the screen, along with the written usage for the word. An example from the corpus for LDOCE is a couple of American "valley girls" using the word "like" as in "He was like, no way!" So "like" is used as an adverb, and we think it's important that this authentic language be incorporated. Of course, there are many teachers who say that that's unacceptable usage and shouldn't be included in a dictionary. But our users are people who passionately want to improve their English; that's why they bought an English-English dictionary, because it has so much more information. So with an example such as "like," it's marked "spoken" and they have to realize that they may not be able to use it in every context, such as in an essay.

It's a huge topic for teachers too, isn't it? Knowing where to draw the line between "correct" English and authentic English. Most students are very curious about swear words but not many teachers will teach them.
Our policy is to include swear words in the dictionary, if it's for advanced level students. Swear words, after all, are quite an important language phenomenon. We have loads of stuff (a full half page on the "f-word", for example - ed.), because it's obviously very common. It's a very productive use of language and is becoming so much more common that it would be linguistically indefensible to leave it out.

© 2008 eigoTown.com Ltd.