WORD LISTS and SOFTWARE
for SPELL CHECKING and TRANSLATION
from the Basic English Dictionaries
48,000 common word senses with Basic translations
Ogden's "The Basic Dictionary" covers 7,500 common words with 18,000 senses
that everyone knows. The intent here is to provide developers with a translation
list from standard English into Basic English. A related project is
a spell-check word list for the Basic English words and proper nouns.
I . Spell Checking for Basic English
-- two forms are now available.
The spell-checking wordlist contains the Basic words and all their
derivative forms; international words; compound words; and common
time, money, and mathematical words. Proper nouns are accepted
in Basic.
1.
OPEN OFFICE.
This is a Basic English spell checker specific to the OpenOffice.org Suite (OOo), which is derived from StarOffice.
OpenOffice is a complete office suite that is now offered as free, open-source software.
It is a free 96MB download, or a CD can be ordered for about $5.50 from
one of their partners.
- Todd first adapted our general wordlist to work with OpenOffice.org.
Just substitute his new file for the default. This is one simple
but long list of all Basic English word forms. Because every form is
spelled out in the list, the default "affix" file is irrelevant.
- Jim took a different approach. Once shown
how to do it, he simply deleted the non-Basic words from the
OOo default wordlist. This keeps the proper nouns provided by the
OpenOffice.org software.
OOo uses an "affix" file to supply prefixes and suffixes. Standard
English has 22 forms; Basic reduces this to 10 (-ed, -er, etc.).
This list has been tested against Ogden's
writings in Basic English and is available for your use as a download.
AFFIX - Prefix and Suffix in Basic English, OOo, and Standard English
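To make the affix idea concrete, here is a minimal sketch in Python of how suffix rules can expand a stem into its derivative forms. It is loosely modeled on the strip / add / condition fields used by affix entries, but it does not reproduce the actual OOo file syntax. The `SuffixRule` type, the `ED_RULES` table, and `apply_rules` are hypothetical names, and the rules shown are deliberately incomplete (no consonant doubling, for example).

```python
import re
from typing import NamedTuple


class SuffixRule(NamedTuple):
    strip: str      # characters removed from the end of the stem ("" for none)
    add: str        # characters appended after stripping
    condition: str  # regular expression the stem must match for the rule to apply


# Hypothetical, incomplete rules for the -ED ending (assumption, not the OOo file).
ED_RULES = [
    SuffixRule("", "d", r"e$"),              # move -> moved
    SuffixRule("y", "ied", r"[^aeiou]y$"),   # cry  -> cried
    SuffixRule("", "ed", r"[^ey]$"),         # walk -> walked
    SuffixRule("", "ed", r"[aeiou]y$"),      # play -> played
]


def apply_rules(stem, rules):
    """Return every form produced by the rules whose condition matches the stem."""
    forms = []
    for rule in rules:
        if re.search(rule.condition, stem):
            # Remove the "strip" characters (if any) before adding the suffix.
            base = stem[: len(stem) - len(rule.strip)] if rule.strip else stem
            forms.append(base + rule.add)
    return forms
```

With one such rule table per ending, a short stem list plus the rules can stand in for a long list of fully spelled-out forms, which is the trade-off between the two approaches described above.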
2 .
A Basic English spell-checking wordlist for developers is available.
Version 0.0
This is an Excel spreadsheet, a general format that developers can convert into any format they need. If you need a format other than Excel, contact the Institute.
- The Basic Words of the 850 list have been expanded with the derivatives
-ED, -ER, -ING, -LY, and -S or -ES. Spell checking requires all forms of the Basic English words.
- Basic compound words, international words, and specialty words are added.
- Non-Basic words are left out because they should not appear in Basic text.
- In addition to the translated Basic English words, a spell check
requires many other words that are valid in Basic.
- Measurement, numerals, currency, calendar, and international terms
in English form are included.
- Technical expressions required and customary for the immediate task are
included in the locally used form.
- Almanac items: words that Basic English accepts in their known form.
These are proper names, in this case in their standard English forms.
They include proper names of mountains and rivers, cities and
countries, people and places. The list can be as long as desired.
Examples might be:
- Geographic areas such as continents, mountains, areas,
oceans, seas, bays, lakes, islands, etc.
- 100 Political entities: countries, cities, states, ...
- Local entities of English speaking countries: counties, towns;
- Top 100 most famous people: leaders, scientists, literary,
entertainment, at your pleasure.
- Top 100 first/given names and top 100 last/surnames.
- Terms: legal, medical
- Lists can be extracted from common references.
- Whichever lists are used can be merged with the Basic
spell-check wordlist
- Standard English vocabulary words that are not on the Basic word list
are excluded.
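The merging step described in the list above can be sketched as follows. This is only an illustration: `merge_wordlists` is a hypothetical helper, and the sample entries in the usage note are placeholders, not taken from the Institute's files.

```python
def merge_wordlists(*wordlists):
    """Merge several word lists into one sorted list without duplicates.

    Blank entries are dropped. Case is preserved, so proper nouns keep
    their capital letters, but sorting ignores case.
    """
    seen = set()
    merged = []
    for wordlist in wordlists:
        for word in wordlist:
            word = word.strip()
            if word and word not in seen:
                seen.add(word)
                merged.append(word)
    # Sort case-insensitively, using the original spelling as a tie-breaker.
    return sorted(merged, key=lambda w: (w.lower(), w))
```

For example, `merge_wordlists(["act", "acts"], ["Paris", "act"], ["metre", ""])` keeps one copy of "act" and drops the empty entry.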
A Basic English translator needs to go only one way: from English
to Basic. There is no Basic-to-English requirement because Basic is already good English.
II . Translation Wordlist and Definitions.
- Basic Words translate as themselves, including their -ED, -ER, -ING, -LY, and -S forms and combinations of these. Non-Basic words and their derivatives are translated into good Basic.
- The non-Basic words are taken from "The Basic Dictionary" and are
expanded to include the suggested derivatives and such other derivative forms as
come to mind, such as -able, -ful, and -ment.
- Those idioms (expressions) that Ogden mentions can be included, but are not at this time. Other software may not recognize idioms because they consist of multiple words that are each good English. They might require grammar-checking software.
- Operator and verb forms conjugate in full: to BE = am, be, being, is, was, etc.
- Compounds are included when the Basic words in them are clearly understandable; non-Basic compounds are translated. Basic compounds are formed from two nouns (milkman)
or from a noun and a directive (sundown).
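The rules above amount to a simple lookup: a Basic word is returned as itself, and a non-Basic word is replaced by its Basic translation. The sketch below assumes a tiny in-memory table; `BASIC_WORDS`, `TRANSLATIONS`, and both function names are hypothetical, and the sample translation is for illustration only, not quoted from Ogden's dictionary.

```python
# Tiny illustrative data; a real table would be loaded from the wordlists.
BASIC_WORDS = {"the", "is", "great", "amount", "of", "food"}
TRANSLATIONS = {"abundance": "great amount"}  # hypothetical entry


def translate_word(word):
    """Return the Basic form of a word, or None when there is no entry."""
    key = word.lower()
    if key in BASIC_WORDS:
        return key                # Basic words translate as themselves
    return TRANSLATIONS.get(key)  # non-Basic words come from the table


def translate_text(text):
    """Translate word by word, marking words without an entry in brackets."""
    out = []
    for word in text.split():
        basic = translate_word(word)
        out.append(basic if basic is not None else f"[{word}]")
    return " ".join(out)
```

A word-by-word lookup like this cannot handle idioms, which is why the list above notes that they may need separate treatment.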
III . Software for Translator / Thesaurus / Synonyms
- The IDP Companion (Internet Dictionary Project), a stand-alone translation program,
is available with all Basic Words and tens of thousands of translation words and senses. See ReadIDP on the download page.
- Thesaurus for OpenOffice (Beta Test). About 48,000 senses can
be translated from English to Basic. See ReadThes on the download page.
IV . To USE.
Spell checking.
1 . Download the OpenOffice suite.
2 . Swap our spelling list for theirs.
3 . Read more about this in our Read OOo.
Translation.
1 . Download IDP Companion.
2 . Swap our translation list for theirs.
3 . Read more about this in our Read IDP.
Translation with OpenOffice.
1 . If you don't already have OOo, download the OpenOffice suite.
2 . Swap our thesaurus list for theirs.
3 . Read more about this in our Read Thesaurus.
Translation with Browser.
1 . See Read Browser for a simple display and find.
V . For Developers.
The translation wordlist and the spell-check wordlist are technically
mutually exclusive. For application and development purposes, the entire vocabulary
can be kept in a single list. From that list, any questionable items
can be classified and the two lists separated by function when required.
One is the list of Basic Words used in spell checking; the rest of
the vocabulary contains the words to be translated. When the lists are combined, a Basic word
translates as itself.
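As a minimal sketch of that separation, assume the combined vocabulary is held as a mapping from each word to its translation, with Basic words mapped to themselves. The function name `split_vocabulary` and the sample data in the usage note are hypothetical.

```python
def split_vocabulary(combined):
    """Split a combined word -> translation mapping into the two lists.

    Words that translate as themselves go to the spell-check list;
    every other entry goes to the translation list.
    """
    spell_list = sorted(word for word, basic in combined.items() if basic == word)
    translation_list = {word: basic for word, basic in combined.items() if basic != word}
    return spell_list, translation_list
```

For example, splitting `{"food": "food", "act": "act", "abundance": "great amount"}` yields the spell-check list `["act", "food"]` and a translation list containing only "abundance".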
What we provide is the words and their translations.
Each translator will have its own requirements for the wordlist format.
Developers will have to provide the customizations for their translation
software.
A delimiter will be required to separate each word from its
translation expression, and each entry from the next word in the list.
Many translators will require that multi-word Basic translations
be enclosed in quotes. These are easy to add with most word
processing and spreadsheet software.
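A minimal sketch of such formatting follows. The `|` delimiter and the double-quote convention are assumptions chosen for illustration; each translator will define its own, and `format_entry` is a hypothetical helper.

```python
def format_entry(word, senses, delimiter="|"):
    """Format one wordlist entry as delimited text.

    Multi-word translations are wrapped in double quotes; single words
    are left bare. Both conventions are assumptions, not a fixed format.
    """
    fields = [word]
    for sense in senses:
        fields.append(f'"{sense}"' if " " in sense else sense)
    return delimiter.join(fields)
```

Writing one formatted entry per line lets the line break serve as the separator between one entry and the next word in the list.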
Alternate meanings must be identified in some manner.
Software package internals make reference to "hashing" algorithms.
The Basic translations may be too long for some
translators to handle. You may have to condense the wording.
What appears here is more or less what Ogden provided.
Acceptable condensations will take place over time, not necessarily
in time for your need.
Methodology.
- The complete dictionary vocabulary (Basic and non-Basic) is recorded in a spreadsheet.
- Create all derivatives.
- Basic words are translated as themselves.
- Apply Basic translations from "The Basic Dictionary" or from "The General Basic English Dictionary."
Columns.
- English Word
- Basic Translation #1
- Alternative sense #2
- Alternative sense #3
- Alternative sense #4
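One way to work with this column layout is to export the spreadsheet to CSV and load it into a dictionary. The sketch below assumes a CSV export whose header row matches the column names listed above; the export step and the function name `load_translations` are assumptions.

```python
import csv
import io

COLUMNS = [
    "English Word",
    "Basic Translation #1",
    "Alternative sense #2",
    "Alternative sense #3",
    "Alternative sense #4",
]


def load_translations(csv_text):
    """Map each English word to its list of non-empty Basic senses."""
    table = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        word = (row.get(COLUMNS[0]) or "").strip()
        if not word:
            continue  # skip blank rows
        senses = [(row.get(column) or "").strip() for column in COLUMNS[1:]]
        table[word] = [sense for sense in senses if sense]
    return table
```

Keeping the alternative senses as an ordered list preserves the distinction between the first translation and the later alternatives.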
Hints:
- The larger "The General Basic English Dictionary",
with 20,000 words and 40,000 senses, was created later and has better definitions than the earlier, pocket-sized work, which was intended to provide translations (substitutions)
rather than complete descriptions.
The larger dictionary needs to be translated at some point;
do you want to help?
- A developer's test that seems reasonable: change the entire test text to uppercase.
The translation will be in lower case, so missing translations will appear as upper case.
- Set word processors to automatically capitalize the first word of a sentence.
- OOo calls a spell-check wordlist a dictionary.
It calls a translation dictionary a thesaurus or synonym list.
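The uppercase test from the hints above can be sketched as follows, assuming the translation table is a dictionary with lowercase keys and lowercase translations. The names `translate_for_test` and `missing_words`, and the sample table in the usage note, are hypothetical.

```python
import re

WORD = re.compile(r"[A-Z]+(?:'[A-Z]+)?")


def translate_for_test(text, table):
    """Uppercase the text, then replace every word found in the table.

    Translations are lowercase, so any word left in uppercase has no entry.
    """
    def swap(match):
        word = match.group(0)
        return table.get(word.lower(), word)

    return WORD.sub(swap, text.upper())


def missing_words(translated):
    """List the words that stayed in uppercase, i.e. were not translated."""
    return sorted(set(WORD.findall(translated)))
```

With a table such as `{"the": "the", "is": "is", "abundance": "great amount"}`, the text "The harvest is abundance" comes back as "the HARVEST is great amount", and `missing_words` reports "HARVEST".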
Samples
- The examples are in HTML in the simple <PRE> form.
- Developers can "save as" a sample wordlist
or download when available.
- Early samples:
A | B | C
X | Y | Z
VI . To Help:
Word Processing Technical
We need people who know Unix and C++ to:
- Put translation word lists into translation packages such as OpenOffice.org , AbiSoft , MS-Word, etc.
- Specify and convert word lists into formats
acceptable to various spell-check, thesaurus, and translation software.
- Activate and test the translations as they are prepared.
- Address idioms.
- Integrate Basic English as a language into the software.
Compiling Translations
We need people to provide the translation lists. Sign up for
a letter of the alphabet and complete the list(s).
One approach to creating the translation dictionary list is:
- Download the Basic English and derivatives wordlist in spreadsheet format.
- Acquire The General Basic English Dictionary (20,000 words).
(book sources)
- Transcribe the new words and definitions.
- Create all the derivative forms.
- Another category is idioms. These are multiple words that have
a specific meaning when used together.
Compiling Spell Check Auxiliary Word Lists
The Basic English spell check wordlist has been created. The
"Auxiliary word lists" include the special and detail word lists for
Basic English. The words are available on this web site.
The derivative spellings need to be added.
- Science notation and measurement:
The system of numbers.
The metric system.
The measurement of latitude and longitude.
Mathematical symbols.
Money systems.
Chemical formulas.
Time and the calendar.
Notation in music.
Let's limit these to what is typically taught in high schools.
- Proper names. Is a comprehensive list of these available from
an open-source spell-check program or any other source? This topic
might be called "almanac lists" of the top 100 in many
categories, such as geographical features and famous people.
- Note: advanced English words cannot be included in the
spell check wordlist -- such words need to be put into the
translation list.
Downloads are available for users of OpenOffice, for translation,
and to assist developers.
References for developers.
About this Page : translate.html --
discuss electronic aids for Basic English
Last updated on August 20, 2006. rewrite I-IV.
URL: http://www.basic-english.org/eoffice/translate.html