Project 712

Ogden's Basic English as a Lexical Database In Natural Language Processing
by Scott R. Hawkins

Chapter III
Basic English

    Early efforts in the area of associative networks illustrated the difficulty of coming up with a representative cross section of the English language. The tendency is to explore in great detail the areas of the language with which one's system deals well, while ignoring other areas almost completely. In partial explanation of such practices, let me say that this is not necessarily pure laziness on the part of the designer, but often a side effect of the designer's working in his or her native language. A speaker's vocabulary is often as fundamental to the speaker as water is to a fish; sorting out the essential from the superfluous can be a monumental task in itself. Fortunately, it is not necessary to do so, as C. K. Ogden has already collected in his System of Basic English (Ogden, 1934) a vocabulary well suited to such tasks.

    The System of Basic English was first published in 1934 (see Appendix A). Its stated purpose was to extract from the approximately 20,000 English words in common use a subset which would provide enough vocabulary to get by in day-to-day operations, yet still be simple enough for an intelligent foreign student to absorb in a month or so of study.


    The first and most obvious advantage of the system of Basic English is that it is well thought out and fairly representative. In associative network design it is tempting to concentrate on the concepts which one's network handles well. With 20,000 words or more to choose from, it is possible to successfully encode a fairly large vocabulary using an approach that is not generalizable. Using Basic English as an input set closed off many blind alleys of that sort. The system contains vocabulary adequate to discuss a great many concepts, but few are explored in detail.

    As a side effect, the vocabulary's purpose tended to focus the subjects of the words chosen even as it limited their number. Words dealing with the basic areas of personal finance, interpersonal relations, food, water, medicine and sleep are treated heavily,[3] while words dealing with, for example, the sciences are left almost untouched. I hasten to point out that this is not necessarily a blessing. Systems with a working knowledge of the 'common sense' subjects dealt with in Basic English still remain largely the province of science fiction.

    As the previous paragraph would suggest, Basic English leans heavily toward nouns.[4] The verb vocabulary has been confined to the absolute essentials, though many words in the General Things category can be treated as verbs as well as nouns. There are 150 Qualities, more commonly referred to as adjectives and adverbs. The Operations category is a grab bag of necessary words that are not nouns, adjectives or adverbs, mostly consisting of the so-called 'function words.'
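
    The category scheme described above can be pictured as a small tagged lexicon. What follows is only a minimal sketch under stated assumptions, not the representation used in this project: the names Category, SAMPLE_LEXICON and group_by_category are hypothetical, and the handful of sample words and their category assignments are illustrative and should be checked against the word list in Appendix A.

```python
from collections import defaultdict
from enum import Enum


class Category(Enum):
    """The three Basic English categories named in the text."""
    OPERATIONS = "Operations"
    GENERAL_THINGS = "General Things"
    QUALITIES = "Qualities"


# Illustrative sample only (an assumption, not the full vocabulary):
# each word is tagged with the category it is assumed to fall under.
SAMPLE_LEXICON = {
    "go": Category.OPERATIONS,
    "up": Category.OPERATIONS,
    "of": Category.OPERATIONS,
    "answer": Category.GENERAL_THINGS,
    "change": Category.GENERAL_THINGS,
    "able": Category.QUALITIES,
}


def group_by_category(lexicon):
    """Return a dict mapping each category to its words, sorted alphabetically."""
    groups = defaultdict(list)
    for word, category in lexicon.items():
        groups[category].append(word)
    return {category: sorted(words) for category, words in groups.items()}
```

    Grouping the sample by category makes it easy to see at a glance how unevenly a vocabulary is distributed across the categories, which is the imbalance the paragraph above describes.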

    Though Basic English was originally selected mostly as a matter of convenience, it had many other advantages which slowly emerged during the categorization process. The most abstract and hence most difficult words (from a categorical standpoint) were omitted. For example, Ogden did not include the word 'ascend' because the same meaning could be communicated using the construct 'go up.' With few exceptions, all of Ogden's selected verbs were primitive and irreducible.
    In retrospect, this fact may have saved me from yet another trap. If I had planned to encode 'ascend' in my network, the obvious approach would be to reduce it to 'go up,' or some variation thereof. This is certainly possible, and might have left the impression that I had accomplished something which I had not. In reality, such a reduction simply pushes back the fundamental issues one step further: the computer would be no closer to an understanding of the concepts of 'go' and 'up,' much less 'ascend.'
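
    The point about reduction can be made concrete with a small sketch. It assumes a hypothetical, acyclic reduction table (REDUCTIONS) and a hypothetical set of primitives (PRIMITIVES); neither is taken from the network described in this project. The function only rewrites one symbol into others, which is exactly why the reduction carries no understanding of its own.

```python
# Hypothetical reduction table (an assumption for illustration):
# each derived word maps to a sequence of simpler words.
REDUCTIONS = {"ascend": ["go", "up"]}

# Words treated as primitives. The sketch holds no definition for these;
# they are simply the symbols at which rewriting stops.
PRIMITIVES = {"go", "up"}


def expand(word, reductions=REDUCTIONS, primitives=PRIMITIVES):
    """Rewrite a word into primitives by following the reduction table.

    Assumes the table is acyclic; a cyclic table would recurse without end.
    """
    if word in primitives:
        return [word]
    if word not in reductions:
        raise KeyError(f"no reduction or primitive entry for {word!r}")
    expanded = []
    for part in reductions[word]:
        expanded.extend(expand(part, reductions, primitives))
    return expanded
```

    Calling expand("ascend") returns ['go', 'up']. The result is still a list of uninterpreted symbols: nothing in the table says what 'go' or 'up' means, so the hard part of the problem is untouched.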


    For the three reasons mentioned above:
  1. Representative nature of the vocabulary
  2. Relative simplicity of the terms of the vocabulary
  3. Irreducible nature of the elements of the vocabulary

I feel that Basic English is an appropriate collection of words from which to construct a general purpose, expandable lexicon.


