| OLD | NEW |
| 1 Version 7.1-0 | 1 en_US Hunspell Dictionary |
| 2 2011-01-06 | 2 Version 2014.08.11 |
| | 3 Mon Aug 11 18:23:56 2014 +0200 [be45e88] |
| | 4 http://wordlist.sourceforge.net |
| 3 | 5 |
| 4 README file for en_US and en_CA Hunspell dictionaries | 6 README file for English Hunspell dictionaries derived from SCOWL. |
| 5 | 7 |
| 6 These dictionaries are created using the speller/make-hunspell-dict | 8 These dictionaries are created using the speller/make-hunspell-dict |
| 7 script in SCOWL, version 7.1 released on January 6, 2011. | 9 script in SCOWL. |
| | 10 |
| | 11 The following dictionaries are available: |
| | 12 |
| | 13 en_US (American) |
| | 14 en_CA (Canadian) |
| | 15 en_GB-ise (British with "ise" spelling) |
| | 16 en_GB-ize (British with "ize" spelling) |
| | 17 |
| | 18 en_US-large |
| | 19 en_CA-large |
| | 20 en_GB-large (with both "ize" and "ise" spelling) |
| | 21 |
| | 22 The normal (non-large) dictionaries correspond to SCOWL size 60 and, |
| | 23 to encourage consistent spelling, generally only include one spelling |
| | 24 variant for a word. The large dictionaries correspond to SCOWL size |
| | 25 70 and may include multiple spellings for a word when both variants are |
| | 26 considered almost equal. Also, the general quality of the larger |
| | 27 dictionaries may be lower, as they are not as carefully checked for |
| | 28 errors as the normal dictionaries. |
| | 29 |
| | 30 To get an idea of the difference in size, here are 25 random words |
| | 31 only found in the large dictionary for American English: |
| | 32 |
| | 33 Bermejo Freyr's Guenevere Hatshepsut Nottinghamshire arrestment |
| | 34 crassitudes crural dogwatches errorless fetial flaxseeds godroon |
| | 35 incretion jalapeño's kelpie kishkes neuroglias pietisms pullulation |
| | 36 stemwinder stenoses syce thalassic zees |
| | 37 |
| | 38 The en_US and en_CA are the official dictionaries for Hunspell. The |
| | 39 en_GB and large dictionaries are made available on an experimental |
| | 40 basis. If you find them useful please send me a quick email at |
| | 41 kevina@gnu.org. |
| | 42 |
| | 43 If none of these dictionaries suit you (for example, maybe you want |
| | 44 the larger dictionary but only one spelling of a word) additional |
| | 45 dictionaries can be generated at http://app.aspell.net/create or by |
| | 46 modifying speller/make-hunspell-dict in SCOWL. Please do let me know |
| | 47 if you end up publishing a customized dictionary. |
| | 48 |
| | 49 If a word is not found in the dictionary or a word is there that you think |
| | 50 shouldn't be, you can look the word up at http://app.aspell.net/lookup |
| | 51 to help determine why that is. |
| | 52 |
| | 53 General comments on these lists can be sent directly to me at |
| | 54 kevina@gnu.org or to the wordlist-devel mailing list |
| | 55 (https://lists.sourceforge.net/lists/listinfo/wordlist-devel). If you |
| | 56 have specific issues with any of these dictionaries please file a bug |
| | 57 report at https://github.com/kevina/wordlist/issues. |
| | 58 |
| | 59 ADDITIONAL NOTES: |
| 8 | 60 |
| 9 The NOSUGGEST flag was added to certain taboo words. While I made an | 61 The NOSUGGEST flag was added to certain taboo words. While I made an |
| 10 honest attempt to flag the strongest taboo words with the NOSUGGEST | 62 honest attempt to flag the strongest taboo words with the NOSUGGEST |
| 11 flag, I MAKE NO GUARANTEE THAT I FLAGGED EVERY POSSIBLE TABOO WORD. | 63 flag, I MAKE NO GUARANTEE THAT I FLAGGED EVERY POSSIBLE TABOO WORD. |
| 12 The list was originally derived from Németh László, however I removed | 64 The list was originally derived from Németh László, however I removed |
| 13 some words which, while being considered taboo by some dictionaries, | 65 some words which, while being considered taboo by some dictionaries, |
| 14 are not really considered swear words in today's society. | 66 are not really considered swear words in today's society. |
| 15 | 67 |
| 16 You can find SCOWL and friend at http://wordlist.sourceforge.net/. | |
| 17 Bug reports should go to the Issue Tracker found on the previously | |
| 18 mentioned web site. General discussion should go to the | |
| 19 wordlist-devel at sourceforge net mailing list. | |
| 20 | |
| 21 COPYRIGHT, SOURCES, and CREDITS: | 68 COPYRIGHT, SOURCES, and CREDITS: |
| 22 | 69 |
| 23 The en_US and en_CA dictionaries come directly from SCOWL (up to level | 70 The English dictionaries come directly from SCOWL |
| 24 60) and is thus under the same copyright of SCOWL. The affix file is | 71 and is thus under the same copyright of SCOWL. The affix file is |
| 25 a heavily modified version of the original english.aff file which was | 72 a heavily modified version of the original english.aff file which was |
| 26 released as part of Geoff Kuenning's Ispell and as such is covered by | 73 released as part of Geoff Kuenning's Ispell and as such is covered by |
| 27 his BSD license. Part of SCOWL is also based on Ispell thus the | 74 his BSD license. Part of SCOWL is also based on Ispell thus the |
| 28 Ispell copyright is included with the SCOWL copyright. | 75 Ispell copyright is included with the SCOWL copyright. |
| 29 | 76 |
| 30 The collective work is Copyright 2000-2011 by Kevin Atkinson as well | 77 The collective work is Copyright 2000-2014 by Kevin Atkinson as well |
| 31 as any of the copyrights mentioned below: | 78 as any of the copyrights mentioned below: |
| 32 | 79 |
| 33 Copyright 2000-2011 by Kevin Atkinson | 80 Copyright 2000-2014 by Kevin Atkinson |
| 34 | 81 |
| 35 Permission to use, copy, modify, distribute and sell these word | 82 Permission to use, copy, modify, distribute and sell these word |
| 36 lists, the associated scripts, the output created from the scripts, | 83 lists, the associated scripts, the output created from the scripts, |
| 37 and its documentation for any purpose is hereby granted without fee, | 84 and its documentation for any purpose is hereby granted without fee, |
| 38 provided that the above copyright notice appears in all copies and | 85 provided that the above copyright notice appears in all copies and |
| 39 that both that copyright notice and this permission notice appear in | 86 that both that copyright notice and this permission notice appear in |
| 40 supporting documentation. Kevin Atkinson makes no representations | 87 supporting documentation. Kevin Atkinson makes no representations |
| 41 about the suitability of this array for any purpose. It is provided | 88 about the suitability of this array for any purpose. It is provided |
| 42 "as is" without express or implied warranty. | 89 "as is" without express or implied warranty. |
| 43 | 90 |
| (...skipping 90 matching lines...) | |
| 134 The name of Princeton University or Princeton may not be used in | 181 The name of Princeton University or Princeton may not be used in |
| 135 advertising or publicity pertaining to distribution of the software | 182 advertising or publicity pertaining to distribution of the software |
| 136 and/or database. Title to copyright in this software, database and | 183 and/or database. Title to copyright in this software, database and |
| 137 any associated documentation shall at all times remain with | 184 any associated documentation shall at all times remain with |
| 138 Princeton University and LICENSEE agrees to preserve same. | 185 Princeton University and LICENSEE agrees to preserve same. |
| 139 | 186 |
| 140 The 40 level includes words from Alan's 3esl list found in version 4.0 | 187 The 40 level includes words from Alan's 3esl list found in version 4.0 |
| 141 of his 12dicts package. Like his other stuff the 3esl list is also in the | 188 of his 12dicts package. Like his other stuff the 3esl list is also in the |
| 142 public domain. | 189 public domain. |
| 143 | 190 |
| 144 The 50 level includes Brian's frequency class 1, words words appearing | 191 The 50 level includes Brian's frequency class 1, words appearing |
| 145 in at least 5 of 12 of the dictionaries as indicated in the 12Dicts | 192 in at least 5 of 12 of the dictionaries as indicated in the 12Dicts |
| 146 package, and uppercase words in at least 4 of the previous 12 | 193 package, and uppercase words in at least 4 of the previous 12 |
| 147 dictionaries. A decent number of proper names is also included: The | 194 dictionaries. A decent number of proper names is also included: The |
| 148 top 1000 male, female, and Last names from the 1990 Census report; a | 195 top 1000 male, female, and Last names from the 1990 Census report; a |
| 149 list of names sent to me by Alan Beale; and a few names that I added | 196 list of names sent to me by Alan Beale; and a few names that I added |
| 150 myself. Finally a small list of abbreviations not commonly found in | 197 myself. Finally a small list of abbreviations not commonly found in |
| 151 other word lists is included. | 198 other word lists is included. |
| 152 | 199 |
| 153 The name files from the Census report is a government document which I | 200 The name files from the Census report is a government document which I |
| 154 don't think can be copyrighted. | 201 don't think can be copyrighted. |
| 155 | 202 |
| 156 The file special-jargon.50 uses common.lst and word.lst from the | 203 The file special-jargon.50 uses common.lst and word.lst from the |
| 157 "Unofficial Jargon File Word Lists" which is derived from "The Jargon | 204 "Unofficial Jargon File Word Lists" which is derived from "The Jargon |
| 158 File". All of which is in the Public Domain. This file also contains | 205 File". All of which is in the Public Domain. This file also contains |
| 159 a few extra UNIX terms which are found in the file "unix-terms" in the | 206 a few extra UNIX terms which are found in the file "unix-terms" in the |
| 160 special/ directory. | 207 special/ directory. |
| 161 | 208 |
| 162 The 55 level includes words from Alan's 2of4brif list found in version | 209 The 55 level includes words from Alan's 2of4brif list found in version |
| 163 4.0 of his 12dicts package. Like his other stuff the 2of4brif is also | 210 4.0 of his 12dicts package. Like his other stuff the 2of4brif is also |
| 164 in the public domain. | 211 in the public domain. |
| 165 | 212 |
| 166 The 60 level includes all words appearing in at least 2 of the 12 | 213 The 60 level includes all words appearing in at least 2 of the 12 |
| 167 dictionaries as indicated by the 12Dicts package. | 214 dictionaries as indicated by the 12Dicts package. |
| 168 | 215 |
| 169 The 70 level includes Brian's frequency class 0 and the 74,550 common | 216 The 70 level includes Brian's frequency class 0 and the 74,550 common |
| 170 dictionary words from the MWords package. The common dictionary words, | 217 dictionary words from the MWords package. The common dictionary words, |
| 171 like those from the 12Dicts package, have had all likely inflections | 218 like those from the 12Dicts package, have had all likely inflections |
| 172 added. The 70 level also included the 5desk list from version 4.0 of | 219 added. The 70 level also included the 5desk list from version 4.0 of |
| 173 the 12Dics package which is the public domain. | 220 the 12Dics package which is in the public domain. |
| 174 | 221 |
| 175 The 80 level includes the ENABLE word list, all the lists in the | 222 The 80 level includes the ENABLE word list, all the lists in the |
| 176 ENABLE supplement package (except for ABLE), the "UK Advanced Cryptics | 223 ENABLE supplement package (except for ABLE), the "UK Advanced Cryptics |
| 177 Dictionary" (UKACD), the list of signature words in from YAWL package, | 224 Dictionary" (UKACD), the list of signature words from the YAWL package, |
| 178 and the 10,196 places list from the MWords package. | 225 and the 10,196 places list from the MWords package. |
| 179 | 226 |
| 180 The ENABLE package, maintained by M\Cooper <thegrendel@theriver.com>, | 227 The ENABLE package, maintained by M\Cooper <thegrendel@theriver.com>, |
| 181 is in the Public Domain: | 228 is in the Public Domain: |
| 182 | 229 |
| 183 The ENABLE master word list, WORD.LST, is herewith formally released | 230 The ENABLE master word list, WORD.LST, is herewith formally released |
| 184 into the Public Domain. Anyone is free to use it or distribute it in | 231 into the Public Domain. Anyone is free to use it or distribute it in |
| 185 any manner they see fit. No fee or registration is required for its | 232 any manner they see fit. No fee or registration is required for its |
| 186 use nor are "contributions" solicited (if you feel you absolutely | 233 use nor are "contributions" solicited (if you feel you absolutely |
| 187 must contribute something for your own peace of mind, the authors of | 234 must contribute something for your own peace of mind, the authors of |
| (...skipping 26 matching lines...) | |
| 214 words, 4,946 female names and the 3,897 male names, and 21,986 names | 261 words, 4,946 female names and the 3,897 male names, and 21,986 names |
| 215 from the MWords package, ABLE.LST from the ENABLE Supplement, and some | 262 from the MWords package, ABLE.LST from the ENABLE Supplement, and some |
| 216 additional words found in my part-of-speech database that were not | 263 additional words found in my part-of-speech database that were not |
| 217 found anywhere else. | 264 found anywhere else. |
| 218 | 265 |
| 219 Accent information was taken from UKACD. | 266 Accent information was taken from UKACD. |
| 220 | 267 |
| 221 My VARCON package was used to create the American, British, and | 268 My VARCON package was used to create the American, British, and |
| 222 Canadian word list. | 269 Canadian word list. |
| 223 | 270 |
| 224 Since the original word lists used used in the VARCON package came | 271 Since the original word lists used in the VARCON package came |
| 225 from the Ispell distribution they are under the Ispell copyright: | 272 from the Ispell distribution they are under the Ispell copyright: |
| 226 | 273 |
| 227 Copyright 1993, Geoff Kuenning, Granada Hills, CA | 274 Copyright 1993, Geoff Kuenning, Granada Hills, CA |
| 228 All rights reserved. | 275 All rights reserved. |
| 229 | 276 |
| 230 Redistribution and use in source and binary forms, with or without | 277 Redistribution and use in source and binary forms, with or without |
| 231 modification, are permitted provided that the following conditions | 278 modification, are permitted provided that the following conditions |
| 232 are met: | 279 are met: |
| 233 | 280 |
| 234 1. Redistributions of source code must retain the above copyright | 281 1. Redistributions of source code must retain the above copyright |
| (...skipping 16 matching lines...) | |
| 251 FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL GEOFF | 298 FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL GEOFF |
| 252 KUENNING OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, | 299 KUENNING OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, |
| 253 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, | 300 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, |
| 254 BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; | 301 BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; |
| 255 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER | 302 LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER |
| 256 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT | 303 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT |
| 257 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN | 304 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN |
| 258 ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE | 305 ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE |
| 259 POSSIBILITY OF SUCH DAMAGE. | 306 POSSIBILITY OF SUCH DAMAGE. |
| 260 | 307 |
| 261 Build Date: Thu Jan 6 02:31:28 MST 2011 | 308 Build Date: Mon Aug 11 18:27:20 CEST 2014 |
| | 309 Wordlist Command: mk-list en_US 60 \| deaccent |
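
The new README's sample of 25 random words found only in the large American dictionary could be reproduced roughly as follows. This is a minimal Python sketch, not the script SCOWL itself uses: the function names are hypothetical, and it assumes the usual Hunspell .dic layout (an entry count on the first line, then one entry per line, with optional flags after a slash). Escaped slashes inside words are not handled.

```python
import random


def read_dic_words(lines):
    """Collect the bare words from the lines of a Hunspell .dic file.

    Assumption: the first line is an entry count, and each later line is
    "word" or "word/FLAGS", possibly followed by tab-separated fields.
    """
    words = set()
    for i, raw in enumerate(lines):
        entry = raw.strip()
        # Skip blank lines and the leading entry count.
        if not entry or (i == 0 and entry.isdigit()):
            continue
        # Drop affix flags and any tab-separated morphological fields.
        word = entry.split("/", 1)[0].split("\t", 1)[0]
        words.add(word)
    return words


def only_in_large(normal_lines, large_lines, k=25, seed=None):
    """Return up to k random words present in the large list only."""
    extra = sorted(read_dic_words(large_lines) - read_dic_words(normal_lines))
    rng = random.Random(seed)
    return rng.sample(extra, min(k, len(extra)))
```

To try it on real dictionaries, pass two open file objects, for example `open("en_US.dic", encoding="utf-8")` and `open("en_US-large.dic", encoding="utf-8")`. Both file names and the encoding are assumptions about how the dictionaries are installed.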