Basic English Spell-check for OpenOffice.org Word Processor    Use OpenOffice.org

    This page gives discussion about the spelling check project and software.

    en_BZ.zip   readooo.html   For OOo versions 1.1 through 2.4   April 8, 2008.
          This is a spell-check wordlist for the OpenOffice.org (OOo) word processor suite. It is in proper form because it was made by taking the regular OOo dictionary (spell-check wordlist) and removing the non-Basic words. The list was then tested on most of the writings that Ogden said are written in Basic English (they are) and has now been in use for over three years.
    Files contained:   The ZIP file will automatically download, but you will need software to UNZIP it.
        If you don't want to use the ZIP form, click on any of the filenames below and SAVE AS to your disk.
      readooo.html      7KB
      readoomore.html     23KB - for techy types
      readooword.html   11KB - about the wordlist
      en_BE.dic   181KB
      en_BE.aff   3KB
      dictionary.lst   1KB
      These files are also available to be downloaded singly if needed.
    Purpose:
        This is a spell checking word list for use with OpenOffice.org software as a Basic English spell checker. It should also work with any other software that uses HunSpell or MySpell for spell checking.
        This word list is the Basic English wordlist with derivatives, complex words, and contractions. The OpenOffice.org spell checking file provides the capitalized words (proper nouns and names; see note 1), with the non-Basic words removed. These two files are merged into a file called "en_BE.dic". See "toddlist" for an early Basic English wordlist that is without proper nouns. The file named "en_BE.dic" has more compound words.
    SETUP for Windows -- To Use:
        OpenOffice.org offers a selection of languages. Several come bundled with the OOo download: US, GB, DE, IT, RU (partial) [American, British, German, Italian, Russian]. You can download many others. Select a country dialect of minimal interest to you. In this example, English (Belize), en_BZ, is selected (see note 2).
    A .
      The OOo dictionary root directory -- our working directory -- is probably named
          C:\Program Files\OpenOffice.org.(version number)\share\dict\ooo
      It will contain dictionary files for several languages. The spell checking files for each language include a dictionary, its affix file, a hyphenation file, and maybe a thesaurus file. You will make one of the languages into Basic English and do your Basic English work using that country name, but it will really be Basic English.
      1 . Copy en_BE.dic and en_BE.aff into the directory :
          Program Files\OpenOffice.org.(version number)\share\dict\ooo\ (see note 3)
      2 . Make a backup copy of the original dictionary files.
      Make a backup directory under this directory, call it "originals", and put a copy of the entire ../ooo subdirectory into it, giving ../ooo/originals/ .
      3 . Copy the three unzipped files into the first directory (en_BE.aff and en_BE.dic may already be there from step 1) :
          Program Files\OpenOffice.org.(version number)\share\dict\ooo\
        en_BE.aff
        en_BE.dic
        dictionary.lst - this takes the place of the original file of this name, which you will delete.
      4 . Commonwealth users (only) will want to use the GB thesaurus and hyphenation. Open Program Files\OpenOffice.org 2.0\share\dict\ooo\dictionary.lst with any text editor.
      Change these lines from :
        HYPH en BZ hyph_en_US    to   HYPH en BZ hyph_en_GB
        THES en BZ th_en_US_v2     to   THES en BZ thes_en_GB
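
    The backup and copy steps in section A can also be done by a short script. The following is only a sketch in Python, not part of the download: the function name install_basic_english and the two directory arguments are made up for this example, and you must give the real paths for your own system. It makes the backup before it copies anything, so the "originals" directory never holds the new files.

```python
import shutil
from pathlib import Path

# The three files from the unzipped download (names as listed on this page).
FILES = ("en_BE.aff", "en_BE.dic", "dictionary.lst")


def install_basic_english(dict_dir, unzipped_dir):
    """Back up the files in dict_dir, then copy the Basic English files in.

    dict_dir     -- the OOo dictionary directory (.../share/dict/ooo)
    unzipped_dir -- where the downloaded ZIP file was unzipped
    Both arguments are assumptions of this sketch; nothing is guessed.
    """
    dict_dir = Path(dict_dir)
    unzipped_dir = Path(unzipped_dir)

    backup = dict_dir / "originals"
    backup.mkdir(exist_ok=True)

    # Back up every regular file. A file already present in the backup is
    # left alone, so a second run does not overwrite the true originals.
    for item in dict_dir.iterdir():
        if item.is_file() and not (backup / item.name).exists():
            shutil.copy2(item, backup / item.name)

    # Copy the new files; dictionary.lst replaces the original of that name.
    for name in FILES:
        shutil.copy2(unzipped_dir / name, dict_dir / name)

    return backup
```

    To undo, copy the files in "originals" back over the working directory.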
    B . Start OpenOffice.org
      On the menu bar, select Tools |
        then Options |
          then Language Settings
            | then Languages
              Under "Default languages for documents" | Western  
                then select English (Belize)
                (keep "Language of" the software as "Default". This will be your language set during installation of OOo.)
      Again, on the menu bar, select Tools
        then Options
        Language Settings | Writing Aids | Options and be sure that these are NOT marked.
          [ ] "Check in all languages" and
          [ ] "Do not mark errors".
        Click OK.

    That is all that is needed ;
    Exit OpenOffice.org

        Restart OOo (see note 4). Any new writing in OOo will be spell checked for Basic English, unless or until you change the language back to your native language. You are able to change this for any or every document.
    -------------
      1. We find it simple to accept all capitalized words as legitimate Basic English, but provide the capitalized word list to make editing more clear with less underlining. We found that the capitals file is excessively large, 169KB. For example, there are 17 spellings of the name Angelique. However, OpenOffice.org is fast enough that we provide the capitals in the default file.
      2 . OOo is capable of identifying a language for each memo or all future memos and this can be changed at will. You are able to maintain the flexibility of using your native language in OOo; this is why we recommend making a seldom used language into Basic English.
      3 . The hyphenation feature has not been tested, but try it. Commonwealth members should change these instructions to use   hyph_en_GB.dic
      4 . The first opening of OOo (swriter.exe) will take a few seconds. Subsequent openings that same day will be quicker. OOo has a Quick Start icon that speeds up the first opening by loading all the files as part of turning on the computer. If you have Quick Start on today, you will have to close the Quick Start icon so that OOo can have a full start to recognize your new Basic English files. Future uses will start normally.
    Thesaurus. The thesaurus is reached from the menu Tools | Language | Thesaurus. Test the thesaurus feature with the word "fractional". Multiple choices should appear. The thesaurus provided is not Basic English. However, we have a Basic English Thesaurus in Beta Test that works okay and that you can install after you have OOo working. We may include this in future issues of the en_BE "Zip file" download.
    To Undo :
    Change Tools | Options | Language Settings | Languages   to your native language.
    There are no registry changes. You can delete files you don't want.
    SETUP for UNIX and APPLE
    Adjust the instructions for a Windows setup to your system using the Basic English word list provided.
    Apple users, please tell us what wording is better known to you, so we are able to make instructions friendly to other readers.
    Unix users, please tell us what wording is better for you, so we are able to make instructions friendly to other readers.
    Advanced Unix users might use MySpell or its enhancement, HunSpell, directly and not need OpenOffice.org. If you do use it directly, please tell us how to do it so we can make instructions available to other readers. If you know how to make MySpell/HunSpell usable by Windows users, please tell us.

    TWEAKS
    AutoCorrect is a useful feature. For example, the word "can" is not Basic. You can have it changed automatically: on the menu bar, click Tools | AutoCorrect..., select the language English (Belize), and enter "can" under Replace and "may" or, perhaps, "are able to" under With. Or, to be more general, "(be) able to". Hint : look under English (US) to see how AutoCorrect works; there are likely a few spelling corrections or features already there, such as replacing "(c)" with "©".

    Simplifying the working folder :
    These files, and/or their counterparts for other languages, will be in the working folder.
      dictionary.lst - - key file , see below
      DicOOo.sxw - - get more languages
      FontOOo.sxw - - get more fonts (for OOo only)
      WordNet_License.txt - -
      en_US.aff - - US affixes
      en_US.dic - - US spelling wordlist
      hyph_en_US.dic - - US hyphen rules
      th_en_US_v2.dat - - US thesaurus data
      th_en_US_v2.idx - - US thesaurus index
      en_GB.aff
      en_GB.dic
      hyph_en_GB.dic
      en_BE.aff -- Basic English affix file
      en_BE.dic - - Basic English dictionary file
      hyph_en_US.dic - - copy of hyph_en_US.dic or your native language

        The file "dictionary.lst" tells OOo which dictionary, affixes, hyphenation rules, and thesaurus to use. The first column is the type of file, the 2nd is the language, the 3rd is the country variation, and the 4th is the file name.
    Here is what the Institute version of dictionary.lst file looks like.
      DICT en GB en_GB
      HYPH en GB hyph_en_GB
      DICT en US en_US
      HYPH en US hyph_en_US
      THES en US th_en_US_v2
      # These are added to activate Basic English
      DICT en BZ en_BE
      HYPH en BZ hyph_en_US
      THES en BZ th_en_BE.dat
      THES en BZ th_en_BE.idx
        The dictionary.lst file gives the names of the languages and features of those languages available to OOo. Other languages can be added, and those that will never be used may be deleted. Because these are the only file names recognized, the files of other languages can be deleted.
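
    The four-column layout of dictionary.lst is simple enough to read with a few lines of code. The sketch below, in Python, is an illustration only and is not how OOo itself reads the file; the names Entry, parse_dictionary_lst and entries_for are made up here. It skips blank lines and "#" comment lines and uses the Institute's file shown above as sample text.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    kind: str      # type of file: DICT, HYPH or THES
    language: str  # for example "en"
    country: str   # country variation, for example "BZ"
    name: str      # file name exactly as written in the fourth column


def parse_dictionary_lst(text):
    """Return one Entry per usable line of a dictionary.lst file."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split()
        if len(parts) != 4:
            continue  # not the four-column form described above
        entries.append(Entry(*parts))
    return entries


def entries_for(entries, language, country):
    """Pick out the entries for one language and country."""
    return [e for e in entries if (e.language, e.country) == (language, country)]


SAMPLE = """\
DICT en GB en_GB
HYPH en GB hyph_en_GB
DICT en US en_US
HYPH en US hyph_en_US
THES en US th_en_US_v2
# These are added to activate Basic English
DICT en BZ en_BE
HYPH en BZ hyph_en_US
THES en BZ th_en_BE.dat
THES en BZ th_en_BE.idx
"""
```

    For example, entries_for(parse_dictionary_lst(SAMPLE), "en", "BZ") gives the four lines that make English (Belize) into Basic English.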

        Affix means prefixes and suffixes. Spell checking software uses an algorithm to add prefix and suffix forms to the root word. The OpenOffice.org affix file currently has 22 affixes defined. Ogden's Basic English will make use of 10 of these. Unix-created affix files have some idiosyncrasies; for example, re- is one of seven prefix options and has a code value of option /A. The word "read" is coded automatically as ad/A. This means that manual preparation of files will contain human errors. There are Unix programs in the OpenOffice.org project, called "munch" and "unmunch", that build these lists. This is not a problem with our Basic English wordlist, which is created in Excel on Windows. Every Basic English word is listed without affixes, although some may have been manually added.
        The OpenOffice.org affix file en_US.aff is duplicated as en_BE.aff for use with capitalized words. Any manually added affixes for Basic words used only 10 of the available affixes.
        The Basic prefix and suffixes are
          Prefix : un- .  Suffix : -ed , -est , -er , -ers , -ing , -ings , -ly , -s , -'s , plus restricted use of -able , -en and -th.
        Details (more than you need to know):
        The spell checking software is HunSpell, a Hungarian university enhancement of MySpell by Kevin B. Hendricks, which was an enhancement of ISPELL.
      The extra non-Basic prefixes include :
          re- , in- , de- , dis- , con- , pro-
      And suffixes of :
          -ive , -ion , -en , -ication , -ions , -ens , -ness , -th , -able , -ment .
      These might be allowed with an advanced Basic situation and with a little work these can be selectively included in the en_BE.dic file. An instructor of advanced Basic might selectively introduce additional affixes.
        Affix Codes
        Prefix      Suffixes                Restricted
        U  un-      D  -ed      T  -est     B  -able
                    G  -ing     S  -s       N  -en
                    J  -ings    Y  -ly      H  -th
                    M  -'s      Z  -ers
                    R  -er
    Inside the main dictionary file.
    The number at the top is the word count. This saves the system from having to do two passes thru the file. Therefore when you add your name and town to the list, you will want to increase the word count.
        "TRY esianrtolcdugmphbyfvkw"   in the affix file is the alphabetic search frequency list. It tells the spell checking software which letters to try first when it looks for a correction.

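    Keeping the word count in step with the list is easy to forget when words are added by hand. The Python sketch below shows one way to do both together. It is an illustration under stated assumptions: the function name add_words is made up, new words are plain words without affix codes, and they are put at the end of the list (this page does not say if the order of the list matters).

```python
def add_words(dic_text, new_words):
    """Return the text of a .dic file with new_words added and the count raised.

    dic_text  -- full text of the file; the first line is the word count
    new_words -- plain words to add, for example your name and town
    """
    lines = dic_text.splitlines()
    count = int(lines[0].strip())
    entries = lines[1:]

    # Compare on the part before any "/" so that "Abbey/M" counts as "Abbey".
    known = {entry.split("/", 1)[0] for entry in entries}

    added = []
    for word in new_words:
        if word not in known:
            added.append(word)
            known.add(word)

    # The count in the file is trusted and only raised by the number added.
    return "\n".join([str(count + len(added))] + entries + added) + "\n"
```

    Read the file, pass its text through add_words, and write the result back under the same name, then restart OOo.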
    TROUBLE SHOOTING. Very little trouble has been reported in using OpenOffice for Basic English Spellchecking.
  • Does not find non-Basic words as misspelled.
        Select "Tools" from the menu bar. Then "Options" at the bottom of the drop-down list. Then open "Language Settings" in the middle of the next list. Then open "Writing Aids". In the "Options" window near the bottom, UN-mark "Check in all languages".
    Be sure that Quick Start is not on and restart OOo after any changes to options.

    Status and Future
    The Basic English wordlist has been used for six years and is stable. More compound words are added from time to time.
      1. The big list that is provided today, which contains all Basic words (spellings) and all Proper Nouns.
      2. The basic Basic wordlist only. This is much more than 850 words. Derivatives (affixes: -ed, -er, -ing, -ly, -s, un-), international words and compounds extend this to over 5000 different "words".
      3. The Proper Noun wordlist from OOo. All of these are allowed in Basic and prevent thousands of false signs of misspelling. Proper nouns are names of people, titles, and places and are capitalized. Alice, Betty, Charles , Ambassador, Bishop, Captain , Asia, Brazil, China , etc. We let OOo provide these words and affixes.
    There is a need for several versions to satisfy different needs and levels of readers, but they are NOT coming soon.
    1 . Very basic Basic for education. Ogden identified the derivatives allowed for each root word. Ogden suggests a limited number of complex words.
    2 . Basic English as we use it today allows all six derivative affixes to be put with all Basic root words where the new word is existent in English. We make addition of many complex words from basic Basic and include some modern words as international (computer, internet).
    3 . Level 2, with the addition of the general-level special wordlists.
    4 . Intermediate Basic, with Ogden's "next step" words towards full English and with the addition of complex words from all the Basic words in lists 2 and 3. The Basic words Able and Full will be allowed as suffixes where the meaning is in a good Basic sense. This is the level at which Basic will be used in public media.
    5 . Advanced Basic will make use of a few more common affixes (-ment, -th, -tion, non-) that the learner will have found in daily usage. Technically, this simply allows use of the default OOo affix list.
    6 . Simplified English. This is advanced Basic with the addition of the 1,000 most frequent English words and with the addition of complex words made from them. This tends to be used in Instruction Manuals and at Simple Wikipedia.
    7 . Make separate American and British spellings. They are joined today and allow mixed usage in text that may not be good.
    8 . Basic words only -- where the big capital words are not needed.

    Future.
        OpenOffice.org and HunSpell/MySPELL are supposed to allow User Dictionaries. We had not been able to get them to work in OOo version 1. In the future we will test OOo ver 2 with the Basic Specialty Lists (Science, Business, Verse) and Basic Detail Lists (Geology, Economics, Bible, etc). We hope this will work because it will allow your selection of one or more additional lists of allowed words for spell checking. Our expectation is that you can select the Basic 850, one or more Specialty lists (100) and any detail word lists (50) as desired.
      An instructor in two levels of Basic might use the technique described here to make another language into his Basic2 variation that might be either more strict or more advanced.
        The standard language code is made of two lower-case letters to indicate the language ("en" means English), an underscore, and two capital letters indicating the country where the language is spoken. US, GB and CA are USA, Great Britain and Canada. Thus en_US is American English and en_GB is British.
        The country dialect code BZ is used only because it is recognized by OOo. We may ask the standards committee or OOo to recognize a dialect of BE for "Basic English: International Second Language" at some time.
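
    The form of the language code can be checked with a short pattern. This is a small Python sketch, with a made-up function name split_code; it tests only the form described here (two lower-case letters, an underscore, two capital letters) and does not know which codes OOo in fact recognizes.

```python
import re

# Two lower-case letters, an underscore, two capital letters: en_US, en_BZ.
_CODE = re.compile(r"([a-z]{2})_([A-Z]{2})")


def split_code(code):
    """Split a code such as "en_BZ" into its language and country parts."""
    match = _CODE.fullmatch(code)
    if match is None:
        raise ValueError("not in the form xx_YY: " + repr(code))
    return match.group(1), match.group(2)
```

    So split_code("en_BZ") gives the language "en" and the country "BZ", while a name such as "en_BE.dic" is refused because of the file ending.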
    Other files :
        See  toddlist.txt for a text file of only basic Basic words. Todd showed us the way to use Basic English in the OOo software.
    Or, the Institute's spellist.xls in Excel form. The Todd and Excel lists are each without proper nouns and are a more strict use of derivatives.
    Notes about OOo.
        Somebody owns the word "OpenOffice" so the software must be called OpenOffice.org or OOo . Wonder what the story there is?
        OOo calls a spell-checking wordlist, a dictionary -- it is really a vocabulary.
        They call a synonym list, a thesaurus -- which is really a translation table. OOo thesaurus is more capable than we make use of in Basic.
        An OOo dictionary, .dic, file is a simple text file that can be created or modified with any text processor. After you have confirmed that things are working properly, you can play with (customize) the file adding some local or personal words.
        For efficiency using the OpenOffice.org word processor , if you have limited computer power, save as "text encoded", with LF, without CR. (It saves one carriage return per word, which is about 10% of file size. CR alone will not work.) But, saving as a regular text file with any text processor will work fine. Just be sure to rename to .dic before using. And remember the .dic files have a word-count at the top.
        OpenOffice.org must be restarted to recognize any changes. Thus OpenOffice QuickStart must be "off" to recognize new dictionaries or affix files. Otherwise it is not recognized until the next system restart (reboot).
        This list is not yet approved by the OpenOffice.org project. After we get the words stabilized and in the proper form, the Basic English language will be submitted to the OOo Project for consideration.
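
    The note above about saving with LF and without CR can be done outside the word processor. The Python sketch below works on the raw bytes of the file, so it does not have to know the text encoding; the function name to_lf is made up for this example.

```python
def to_lf(data: bytes) -> bytes:
    """Make every line end in LF only.

    CR LF pairs are changed first, then any CR that is by itself,
    because the note above says that CR alone will not work.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

    To use it, open the .dic file in binary mode ("rb"), pass the bytes through to_lf, and write them back in binary mode ("wb"). The saving is one byte for every line that ended in CR LF.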
    Sample of the data:
      12108
      A
      a
      AA
      AAA
      Aachen/M
      Aaren/M
      Aarhus/M
      Aarika/M
      Aaron/M
      AB
      Abagael/M
      Abagail/M
      Abbey/M
      Abbie/M
      Abbi/M
      Abbot/M
      Abbott/M
      Abbye/M
      Abby/M
      ABC/M
      Abdel/M
      Abdul/M
      Abelard/M
      Abel/M
      Abelson/M
      Abe/M
      Aberdeen/M
      Abernathy/M
      Abeu/M
      Abey/M
      ablest
      able/UDRZG
    The affix options used after a slash for Basic English are : (See OOo for complete instructions.)
      U     un-
      D     -ed
      T     -est
      R     -er
      Z     -ers
      G     -ing
      J     -ings
      S     -s
      M     -'s
      Y     -ly
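
    The codes after the slash can be read off with the table above. The Python sketch below only names the affixes allowed for an entry; it does not build the spelled forms (such as dropping the "e" before -ing), because those rules are in the .aff file. The names AFFIX_CODES and describe_entry are made up for this example, and any code outside the ten Basic ones is shown as "?".

```python
# The ten affix codes used for Basic English, as listed above.
AFFIX_CODES = {
    "U": "un-",
    "D": "-ed",
    "T": "-est",
    "R": "-er",
    "Z": "-ers",
    "G": "-ing",
    "J": "-ings",
    "S": "-s",
    "M": "-'s",
    "Y": "-ly",
}


def describe_entry(line):
    """Split a .dic line into the word and the affixes its codes allow."""
    word, _, flags = line.strip().partition("/")
    return word, [AFFIX_CODES.get(flag, "?") for flag in flags]
```

    With the sample data, describe_entry("able/UDRZG") gives the word "able" with un-, -ed, -er, -ers and -ing, and describe_entry("Aachen/M") gives "Aachen" with -'s only.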

      About this Page: readoomore.html --Discussion of the spell checking project for a Basic English version of OpenOffice.org versions 1.1 through 2.4 spellcheck wordlist (dictionary).
      Last updated April 8, 2008 -- this page is created to allow simplification of readooo.html installation instructions.
      History : Jan 13, 2006 -- Simplifications
        Dec 26, 2005 -- creation of this page.
        Mar 18, 2005 -- OOo ver 2; Replace another language name, rather than English.
        Feb 2, 2005 -- OOo 1.1.4
        Aug 14, 2004 -- simplified
        Feb 13, 2003 -- for OOo 1.0
        Apr 28, 2002 -- initial release
      URL:   http://www.basic-english.org/down/readoomore.html