Basic English Spell-check for OpenOffice.org Word Processor
en_BE.zip readooo.html version 0.1.1.x January 14, 2004
This is a spell-check wordlist for the OpenOffice.org (OOo) word processor suite. It is
in proper form because it was made by deleting the non-Basic words from
the regular OOo dictionary (spell-check wordlist). The list was then
tested on most of the writings that Ogden says are written in Basic English. (They are.)
Files contained: The ZIP file will download automatically, but you will need
software to unzip it.
If you do not want to use the ZIP form, click on any of the filenames below and use SAVE AS to save it to your disk.
This is a spell checking word list for use with OpenOffice.org software
as a Basic English spell checker.
This list is the default OpenOffice.org spell checking file
with the non-Basic words and derivatives removed. Capitalized words
are retained as proper nouns, except that many personal names were
removed to make the file more manageable. Many of these will be added
back as miscellaneous corrections are made.
See "toddlist" for a complete Basic English wordlist, without proper nouns,
that works with OpenOffice.org software.
OpenOffice.org uses a spell-check file named en_US.dic (or en_GB.dic in Great Britain). We will substitute our file, called
en_BE.dic, and rename it to en_US.dic so that it will be recognized by OpenOffice.org.
Go to folder: (your path)/openoffice/share/dict/ooo/
First, make a backup copy of the original en_US.dic. For example:
Change the original filename from : en_US.dic to : en_US.dicX
Copy the new file : en_BE.dic into : (your path)/openoffice/share/dict/ooo/
Rename the BE file : en_BE.dic to : en_US.dic
That is all there is.
If you get mixed up, the file sizes are: en_BE.dic is about 129 KB; en_US.dic is about 608 KB; and en_GB.dic is about 514 KB.
To go back to the original dictionary:
Rename : en_US.dic back to : en_BE.dic
Rename : en_US.dicX back to : en_US.dic
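The renaming steps above can also be scripted. The following is a minimal Python sketch, not part of the original package. The function names are made up for illustration, and you must supply the dictionary folder for your own installation. It assumes the folder holds en_US.dic and that no en_US.dicX backup exists yet.

```python
import shutil
from pathlib import Path


def install_basic_dictionary(dict_dir, new_file):
    """Back up en_US.dic as en_US.dicX, then copy new_file in as en_US.dic."""
    dict_dir = Path(dict_dir)
    original = dict_dir / "en_US.dic"
    backup = dict_dir / "en_US.dicX"
    if backup.exists():
        # Refuse to overwrite an earlier backup of the real dictionary.
        raise FileExistsError(f"{backup} already exists; restore it first")
    original.rename(backup)              # en_US.dic -> en_US.dicX
    shutil.copyfile(new_file, original)  # en_BE.dic takes the name en_US.dic


def restore_original_dictionary(dict_dir):
    """Undo the swap: en_US.dic -> en_BE.dic, en_US.dicX -> en_US.dic."""
    dict_dir = Path(dict_dir)
    original = dict_dir / "en_US.dic"
    backup = dict_dir / "en_US.dicX"
    if not backup.exists():
        raise FileNotFoundError(f"{backup} not found; nothing to restore")
    original.replace(dict_dir / "en_BE.dic")
    backup.replace(original)
```

Call install_basic_dictionary with your dictionary folder and the path to the downloaded en_BE.dic, then restart OpenOffice.org. Call restore_original_dictionary with the same folder to put things back.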
Details: - More than you need to know:
The spell checking software is MYSPELL, an enhancement of ISPELL,
by Kevin B. Hendricks.
In the same file folder are :
The file "dictionary.lst" tells OOo which Dictionary, Hyphenation instructions, Thesaurus,
and Affix to use. It should look like this, or its British variation. The first column is the
type of file, the 2nd is the language, the 3rd is the variation, and the 4th appears to be the file name.
DICT en US en_US
HYPH en US hyph_en
THES en US th_en_US
(If you have trouble with hyphens, try changing the HYPH line to:
HYPH en US hyph_en_US
The wording shown here is as the software author wrote the instructions; this one just seems odd to me. I will test it later.)
These files will be in the file folder and/or their various language counterparts.
There is no need to touch the hyphenation and thesaurus files and the affix file is optional.
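To make the four-column layout of dictionary.lst concrete, here is a small Python sketch that reads such lines into named fields. The names DictionaryEntry and parse_dictionary_lst are made up for this example, and treating "#" as the start of a comment is an assumption, not something stated above.

```python
from typing import NamedTuple


class DictionaryEntry(NamedTuple):
    kind: str       # type of file: DICT, HYPH or THES
    language: str   # e.g. "en"
    country: str    # variation, e.g. "US"
    basename: str   # file name as listed, e.g. "en_US"


def parse_dictionary_lst(text):
    """Return one DictionaryEntry per non-empty line of dictionary.lst text."""
    entries = []
    for line in text.splitlines():
        # Assumption: anything after "#" is a comment and can be dropped.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            raise ValueError(f"expected 4 columns, got: {line!r}")
        entries.append(DictionaryEntry(*fields))
    return entries
```

Feeding it the three sample lines shown above gives three entries, the first with kind "DICT" and basename "en_US".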
Affix means prefixes and suffixes.
Spell checking software uses an
algorithm to build prefix and suffix forms from the root word. The OpenOffice.org
affix file currently has 22 affixes defined. Ogden's Basic English
will make use of 9 of these. Ogden allows the use of -est for single-syllable
words, which might entail a 10th. Affix files have some
idiosyncrasies; for example, re- is one of seven prefix options and
has a code value of /A, so the word "read" is coded automatically as ad/A
(the root "ad" plus the prefix re-).
This means that manually prepared files will contain human errors.
The programs "munch" and "unmunch," Unix utilities in the OpenOffice.org project,
create these coded lists and expand them again.
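As a rough sketch of what the affix codes described above do, the Python below expands a coded entry into its word forms. Only the /A code for the prefix re- comes from the text above. The suffix code "S" and the name expand_entry are made up for illustration, and a real affix file also has conditions on each rule that this sketch ignores.

```python
# Simplified, partly hypothetical rule tables (see the note above).
PREFIXES = {"A": "re"}   # from the text: re- has the code value /A
SUFFIXES = {"S": "s"}    # hypothetical suffix code, for illustration only


def expand_entry(entry):
    """Return the root and every affixed form for a line such as 'ad/A'."""
    root, _, flags = entry.partition("/")
    forms = [root]
    for flag in flags:
        if flag in PREFIXES:
            forms.append(PREFIXES[flag] + root)
        elif flag in SUFFIXES:
            forms.append(root + SUFFIXES[flag])
        # Unknown codes are skipped in this sketch.
    return forms
```

With these tables, expand_entry("ad/A") gives the two forms "ad" and "read", which shows why a word like "read" can end up stored under an unexpected root.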
The OpenOffice.org affix file has been reduced from the 22 options available for OOo to
Basic English's simpler options. There is no specific need to replace the original en_US.aff file,
except for some minor efficiency with the smaller file. The affix file of standard prefixes and suffixes
allows the word list to contain a base word coded with its legitimate prefixes and suffixes,
which allows the word list to be smaller. The original affix file works fine with Basic English because illegal suffixes are not coded in the Basic English wordlist. If you want to substitute the Basic English affix file, follow the same instructions as for substituting the wordlist.
Inside the main dictionary file:
The number at the top is the word count. This saves the system
from having to do two passes through the file. Therefore, when you add
your name and town to the list, you will want to increase the
word count to match.
"TRY esianrtolcdugmphbyfvkw" in the affix file is
the alphabetic search frequency list. It tells the affix software which letters to look for first.
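Keeping the word count at the top of the dictionary file in step with the words you add is easy to get wrong by hand. Here is a minimal Python sketch that does both at once. The function name add_words is made up for this example, and it assumes the first line of the file holds only the count and each following line holds one entry.

```python
def add_words(dic_text, new_words):
    """Append new words to .dic text and raise the count on the first line."""
    lines = dic_text.splitlines()
    count = int(lines[0].strip())          # the number at the top of the file
    entries = lines[1:]
    # Compare on the root only, so "ad/A" counts as already holding "ad".
    known = {entry.split("/")[0] for entry in entries}
    for word in new_words:
        if word not in known:
            entries.append(word)
            known.add(word)
            count += 1
    return "\n".join([str(count)] + entries) + "\n"
```

Words already in the list are skipped, so the count only goes up for entries that are really new.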
When first working with the OOo wordlist (dictionary), several hundred
Basic English derivatives were found and added. This is to point out that these are
preliminary files. However, testing the spell checker against Ogden's
works found only a few dozen corrections. (So there are now even fewer.)
OpenOffice.org allows use of multiple languages. In the future we will create
and test the Basic Specialty Lists (Science, Business, Verse) and Basic Detail Lists (Geology, Economics, Bible, etc) with OpenOffice.org. Our expectation is that you can select the Basic 850, one or more Specialty lists (100) and any detail wordlists (50) as desired.
The standard language code is: 2 lower-case letters
to indicate the language ("en" means English); an underscore; and
2 capital letters indicating the language as spoken in a country.
US, GB and CA are USA, United Kingdom, and Canada. Thus en_US
is American English and en_GB is British.
The "BE" country is Belgium.
As Brussels is the capital of the common market, the use of
that symbol for "Basic English: International Second Language" might
not be inappropriate, which is why we use that name here.
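The language-code layout described above (two lower-case letters, an underscore, two capital letters) can be checked with a short pattern. This is only an illustrative Python sketch, and the name split_locale is made up for it.

```python
import re

# Two lower-case letters, an underscore, two capital letters.
LOCALE_RE = re.compile(r"([a-z]{2})_([A-Z]{2})")


def split_locale(code):
    """Split a code such as 'en_US' into (language, country)."""
    match = LOCALE_RE.fullmatch(code)
    if match is None:
        raise ValueError(f"not a language_COUNTRY code: {code!r}")
    return match.group(1), match.group(2)
```

So split_locale("en_BE") gives the language "en" and the country "BE", while a code such as "EN_be" is rejected.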
Other files :
See toddlist.txt for a text file.
Todd showed us the way to use Basic English in the OOo software.
Or see the Institute's spellist.xls in Excel form. The Todd and Excel lists are each without proper nouns and make a stricter use of derivatives.
See Read OOo 1.0 for an earlier version of OpenOffice.
Notes about OOo.
Somebody owns the word "OpenOffice" so the software must be called
OpenOffice.org or OOo. Wonder what the story is there?
OOo calls a spell-checking wordlist a dictionary.
They call a translation table a thesaurus or synonym list.
An OOo dictionary (.dic) file is a simple text file that can be created or modified with any text processor. After you have confirmed that things are working properly, you can play with the file, adding some local or personal words. For efficiency, using the OpenOffice.org
word processor, save as "text encoded", with LF, without CR. (It saves one
carriage return per word, which is about 10%.) Saving as a regular
text file with any text processor will work fine. Just be sure to rename it to .dic before using.
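If you would rather script the "LF, without CR" save described above, a minimal Python sketch follows. The name save_as_dic is made up for this example, and the ISO-8859-1 default encoding is an assumption; use whatever encoding your dictionary and affix files actually declare.

```python
from pathlib import Path


def save_as_dic(text, path, encoding="iso-8859-1"):
    """Write text with LF-only line endings to a .dic file; return its path."""
    path = Path(path).with_suffix(".dic")   # make sure the name ends in .dic
    # Turn CR+LF (and any stray CR) into a plain LF.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    # Write bytes so the operating system cannot add the CR back.
    path.write_bytes(normalized.encode(encoding))
    return path
```

Writing bytes rather than text keeps Windows from putting the carriage returns back in on save.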
OpenOffice.org must be restarted to recognize any changes.
Thus OpenOffice QuickStart must be "off" for new dictionaries
or affix files to be recognized. Otherwise they are not recognized until the next system restart.
This list is not yet approved by the OpenOffice.org project.
After we get the words stabilized and in the proper form, the
Basic English language will be submitted to the OOo Project for consideration.
Sample of the data:
The affix options used after a slash for Basic English are: (See OOo for complete instructions.)
About this Page: readooo11.html -- Readme file of a Basic
English version of OpenOffice.org version 1.1.x spellcheck wordlist (dictionary).
Last updated August 14, 2004.