SIMPLE (ENGLISH) WIKIPEDIA Wordlist for Spell Checking
Draft warning -- this note is in transition between a prior Wiki
readme for developers and a simplified version with only one main file for use by writers. This Wiki note will eventually look similar to the one for Simple English. The difference is that the Simple English version keeps VOA words separate, whereas
VOA is included in this Wiki version. A couple of file names will change, but the note is otherwise the same. That readme may be more helpful until planting season is over and this memo can be made clearer.
Simple English has no firm definition, but it is generally regarded as Basic English (to be able to say anything), plus the 1000 (pick a number) most frequent English words not already in Basic (to give fluidity). Some, including Simple.Wikipedia.Org, add VOA Special English (because of its wide access for beginners). These have been provided as lists at http://www.basic-english.org/down/readsimple.html for selection, experimentation, and
use in the OpenOffice.org word processing (and more) suite, wherein the non-Simple words are
highlighted for simplification. Comment within SimpleWiki is desired, and perhaps a standard definition of the SimpleWiki vocabulary may be developed. Until such time as
some standards are established, we offer a Simple English writer's spell check and a translation of non-Simple words into Basic English (not into the wider Simple English). Please note that all proper nouns (capitalized words) are allowed in Simple English, even though most capitalized words will show as misspelled (that is, as not Wiki). A full range of Basic, frequency, VOA, and capitalized word files may be provided for a Simple English developer. If you as a Wiki writer need proper nouns, then download
Simple.zip, which will have Capitals built in and has a voase.dic add-on.
This page is being separated into its own page, rewritten for writers
of Simple.Wikipedia English rather than developers of Simple English. Please pardon the unfinished nature of this page.
I. FOR SIMPLE WIKI WRITERS
Files included:
a. readwiki.html 12B (this note)
b. en_WIKI.dic -- 4760 root words. 50 KB
Consists of Basic 1500, plus Frequent 1000 (adds 680 words) plus VOA-SE (adds 450 words).
Does not include proper nouns (capital letter words, 162KB)
c. en_WIKI.aff 2KB An "affix" includes both prefixes and suffixes. This is a feature of the software that adds some efficiency. It is the same as en_US.aff. You need not be concerned with it.
d. dictionary.lst (this lists the dictionaries that are available to Open Office; it replaces the one that comes with OOo and includes Simple English/Wiki).
e. Read Wiki More - 14KB
Follow the instructions for substituting your selected vocabulary in Open Office.org.
Note : Many people new to simple languages are surprised that a few root words multiply into many times their number of spellings and senses. Learning the Basic 850 results in over 5000 simple derivatives and compound words.
Example : "equal" becomes equaled, equaler, equaling, equally, equals, unequal, unequaled, unequally.
Common words making complex words are : -able, -full ; any-, out-, over-, short-, side-, some-, under-, up/upper-, work-.
Complex words have not been added for Frequent and VOA words
to this trial dictionary. Some affix derivatives have been added; more will follow.
VOA is silent on derivatives -- we have used Basic English rules.
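As a rough illustration of how one root multiplies into many spellings, here is a small Python sketch. The prefix and suffix lists are our own assumptions, picked to resemble the "equal" example; they are not the rules found in en_WIKI.aff.

```python
# Hypothetical affix lists for illustration only; the real rules live in the .aff file.
SUFFIXES = ["ed", "er", "ing", "ly", "s"]
PREFIXES = ["un"]

def expand(root, suffixes=SUFFIXES, prefixes=PREFIXES):
    """Return the root plus every simple prefix/suffix combination, sorted."""
    forms = {root}
    forms.update(root + s for s in suffixes)
    for p in prefixes:
        forms.add(p + root)
        forms.update(p + root + s for s in suffixes)
    return sorted(forms)

print(len(expand("equal")))  # 12 forms from one root
```

This naive version makes every combination, so it also produces forms such as "unequaler" that are not in the example above. A real affix file limits which endings each root may take.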
Note : .aff, .dic, and .txt are simple text files that can be read/edited with any simple text editor.
Purpose:
Provide a spell checking filter for use in writing Simple English (for example, Simple Wikipedia) with the HunSpell (MySpell) software
that is most notably used by the free office suite, OpenOffice.org. The recommended vocabulary of
Simple English is Basic English plus
the most Frequent words in English. Wiki-Simple English adds VOA Special English.
Basic 1500
Every learner of Basic English is expected to know the 850 words,
the international words, six affixes and complex words, plus
one area of General interest with 100 words, such as Science, Business,
or Verse; and one Specialty detail within that general topic with an additional 50 words
such as Biology, Economics, or Bible which are not included.
Basic English is a full language for general living
and work as an auxiliary international language. It is good English. The limited vocabulary allows
quick learning -- weeks, not years. Obviously it is an excellent first step in learning
full English because it allows almost immediate immersion into daily English-speaking life.
Note, Basic English is a subset of Standard English with simple rules of grammar -- there is
NO unlearning required to progress to full English. The originators of Basic also provide
a learning path beyond basic Basic of 150 Next Step words and 350 Subsequent words, at
which point the learner should be able to continue at his own pace. Because Simple English comes after Basic English, the expanded, "next step" Basic is included.
For general Simple Wiki use we have included the general subject words but NOT
the Specialty lists of Basic. We included the 350 Subsequent addenda words as the next step
from Basic towards full English. This combined list is sometimes referred to as the Basic 1500.
The First Supplement of 150 words for common
foods, plants, and animals has been lost. If found, they will be added; otherwise a good guess will be provided
sometime.
There is much overlap between the three sources. For example,
98 of the most frequent 100 words are already in Basic,
and half of the VOA-SE words are also Basic words.
Note that the Simple English versions here have attempted to remove duplicates.
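One way to picture the duplicate removal is the Python sketch below. It is only a guess at the idea, not the method actually used to build the lists, and the sample words are invented placeholders that do not reflect real list membership.

```python
def added_words(base, extra):
    """Words from `extra` that are not already in `base` (case-insensitive, order kept)."""
    seen = {w.lower() for w in base}
    added = []
    for w in extra:
        key = w.lower()
        if key not in seen:
            seen.add(key)
            added.append(w)
    return added

# Placeholder data, not the real Basic or Frequent lists.
base_list = ["alpha", "beta", "gamma"]
second_list = ["delta", "Alpha", "beta", "epsilon", "delta"]
print(added_words(base_list, second_list))  # ['delta', 'epsilon']
```

Counting the result of each pass is how figures such as "adds 680 words" can be arrived at: only the words that survive the comparison are counted.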
More than you may want to know.
See page Read Wiki More
To INSTALL
(to copy from readsimple.html)
To Use:
Using OpenOffice.org Writer, open en_WIKI.dic.
While the file is open
Add your name, town, street, etc., one word per line.
Add a code word to confirm which wordlist is active : aawiki.
Sort alphabetically.
Delete duplicates. (note : affix codes may be in different order)
Change the word count in the first line to the new number.
Save as en_WIKI.dic
That's all.
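For writers who would rather not do the steps above by hand, the following Python sketch does the same work on the text of a .dic file. It assumes that the first line is the word count, that each entry is a word with optional one-letter affix codes after a slash, and that there are no other fields on the line. The function name is our own.

```python
def update_dic(text, new_words):
    """Add words to .dic text, sort, drop duplicates, and correct the count line."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    entries = lines[1:]              # line 1 is the old word count
    entries.extend(new_words)
    merged = {}                      # word -> set of one-letter affix codes
    for entry in entries:
        word, _, flags = entry.partition("/")
        merged.setdefault(word, set()).update(flags)
    body = []
    for word in sorted(merged, key=lambda w: (w.lower(), w)):
        flags = "".join(sorted(merged[word]))
        body.append(f"{word}/{flags}" if flags else word)
    return "\n".join([str(len(body))] + body) + "\n"

sample = "3\nequal/DS\nable\nequal/SD\n"
print(update_dic(sample, ["aawiki", "Smith"]), end="")
```

With the sample shown, "equal/DS" and "equal/SD" are treated as the same entry because their affix codes differ only in order, and the new first line reads 4. Keep a copy of the original file before saving the result as en_WIKI.dic.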
You are ready to continue on to adding another language in Open Office. This is done in the dictionary.lst file.
It may have a path something like this :
C:\Program Files\OpenOffice.org 2.0\share\dict\ooo\dictionary.lst
Add or confirm this line :
English (Zimbabwe) will now be recognized as a language with spell checking capabilities.
Configure the OOo text processor to recognize the language "English (Zimbabwe)" as either "Default"
or "For the current document only."
Exit OOo QuickStart and re-start OOo.
OpenOffice QuickStarter must be "off" only once, after changes to recognize new dictionaries or affix files. QuickStarter is no longer useful for OOo 2.4 and higher and may be permanently turned off.
To Undo:
There are no registry entries. Simply delete or don't use any features
that are no longer wanted.
Dictionary Details:
The number at the top is the word count. This saves the system
from having to do two passes through the file. Therefore when you add
your name, town, etc. to the list, you will want to increase the word count.
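A quick way to check that the count still matches after hand editing is sketched below in Python. It assumes one entry per non-blank line after the first line; the function name is our own.

```python
def check_count(text):
    """Return (declared count, actual number of entries) for .dic text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    declared = int(lines[0].strip())
    actual = len(lines) - 1          # every non-blank line after the first
    return declared, actual

declared, actual = check_count("2\nable\nequal/DS\nSmith\n")
if declared != actual:
    print(f"First line says {declared}, but the file has {actual} entries.")
```

If the two numbers differ, change the first line to the actual number.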
Affix file.
Spell checking software often makes use of "affix" files and an
algorithm to add prefix and suffix forms to the root word. You do not
need to be concerned about the affix file.
The dictionary name is pre-set in the dictionary.lst file as a country dialect of
English.
It is preset as: en ZW en_SIMPLE
You will want to change this to : en ZW en_WIKI
en ZW indicates that a document whose language is English with the country dialect of Zimbabwe will use en_WIKI as the name of the spellcheck dictionary and affix file.
If you create multiple spelling lists, then another less-used dialect acknowledged by OOo is English (Trinidad) = en TT. Note that English (Belize) is currently used for pure Basic, and English (Jamaica) for Basic 1500. Basic will be used by the most skilled Simple Wiki writers. ;^)
dictionary.lst file.
A file suitable for Basic English and Simple English usage is provided to replace the file that came with OOo. Wiki users will have to make one change in the dictionary.lst file, changing
en ZW en_SIMPLE to
en ZW en_WIKI
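The one change can also be made with a short Python sketch such as the one below. It works on the text of dictionary.lst and matches on the last three fields of each line, so any leading keyword on the line is left alone. The function name is our own, and the sample lines are only an assumed layout.

```python
def point_zw_at_wiki(lst_text, old="en_SIMPLE", new="en_WIKI"):
    """Change the English (Zimbabwe) entry to use the Wiki dictionary name."""
    out = []
    for line in lst_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # Match entries whose last three fields are: en, ZW, and the old name.
        if not is_comment and fields[-3:] == ["en", "ZW", old]:
            line = line.replace(old, new)
        out.append(line)
    return "\n".join(out) + "\n"

sample = "en US en_US\nen ZW en_SIMPLE\n"
print(point_zw_at_wiki(sample), end="")
```

Other lines, including those for Hyphenation and Thesaurus, pass through unchanged.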
The additional features of Hyphenation and Thesaurus are pre-set
to use full English. Commonwealth users may prefer to change these from US to GB.
Note : Spelling in all files is American first, with the most useful Great
Britain spellings included. Wikipedia writers can use either, but try to be consistent within
any one page.
Notes about OpenOffice.org
Download OpenOffice.org;
it's freeware and a large file (96MB), or order it as a CD from one of their
partners.
We paid $5.50 for a copy.
Somebody owns the word "OpenOffice" so the software must be called
OpenOffice.org. Wonder what the story there is?
OOo calls spell-checking word lists -- a dictionary.
They call a translation chart -- a thesaurus or synonym list.
An OOo dictionary, .dic, file is a simple text file saved
with OpenOffice Writer as text, but with the name-end of .dic . A "techy type" might choose to save as "text encoded", with LF, without CR. (It saves one
carriage return per word, which is a lot.) Saving as a regular text
file will work fine and is easier to work with for additions and changes.
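For the "techy type", the LF-only saving can be sketched in Python as below. It works on raw bytes, so the text encoding of the file does not matter; the function name is our own.

```python
def to_lf(data: bytes) -> bytes:
    """Convert CR+LF (and any lone CR) line endings to LF only."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

sample = b"2\r\nable\r\nequal/DS\r\n"
print(len(sample) - len(to_lf(sample)))  # 3 bytes saved, one per line
```

To apply it to a file, read the file in binary mode, pass the bytes through to_lf, and write them back in binary mode.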
About this Page: readwiki.html -- discussion of writing aids, spellchecking wordlist for Simple Wiki using HunSpell (MySpell) software, specifically for use with OpenOffice.org.
Created : January 14, 2005. Plan of aids for Simple English / Wikipedia
Last updated : May 8, 2008. Replace Basic850 with Basic1500.
Last updated : May 3, 2008. Simplify to one main word list and without Capitals.
July 26, 2007. Add two Wiki Option Lists
January 21, 2005.
URL: http://www.basic-english.org/down/readwiki.html
LINKS : Simple English Wiki