Issue 4782: Add a Turkish spell checker
Chrome Version: 0.4.154.25
There is no Turkish spell checker available in Chrome yet. The tr-spell project ( http://code.google.com/p/tr-spell/ ) provides a word-list-based Turkish hunspell dictionary that could be incorporated into Chrome easily.
Dec 04, 2008
(No comment was entered for this change.)
Status: Untriaged
Cc: js...@chromium.org sidc...@chromium.org hb...@chromium.org
Labels: I18N
Dec 04, 2008
(No comment was entered for this change.)
Status: Available
Labels: Mstone-X Area-WebKit
Jan 14, 2009
Ahmetaa, et al.,

Sorry for my very late response. I briefly studied Turkish grammar and investigated this issue this week. Technically, it consists of the three issues described below.

1. Our dictionary-converter tool (convert_dict.exe) throws a debug assertion while converting this Turkish dictionary. This is a bug in the tool: it fails when handling more than 8190 affix rules.
2. hunspell cannot handle an "AF" line (*1) whose length is more than 8192. Our dictionary-converter tool converts the affix rules attached to a word into an "AF" line and moves it to the affix section. Unfortunately, hunspell uses a fixed-size array (8192) and raises a parser error when reading an "AF" line longer than that.
3. Our hunspell implementation cannot handle a "FLAG" line. This Turkish dictionary uses the "FLAG num" option so that it can use a large number of affix rules, but our hunspell implementation does not handle "FLAG" lines.

(*1) http://sourceforge.net/docman/display_doc.php?docid=29374&group_id=143754

I sent a review request for a change that fixes the above three issues. (Even though I have not tested the dictionary thoroughly, it works OK on my local build.)

Regards,
Hironori Bono
Status: Started
Owner: hb...@chromium.org
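The interaction between "FLAG num" flags and "AF" alias lines described in the comment above can be sketched roughly as follows. This is a minimal Python illustration, not the code of convert_dict or hunspell: the function names (`parse_num_flags`, `build_af_aliases`, `overlong_lines`) and the sample words are hypothetical. It assumes that under "FLAG num" a word's flags are comma-separated decimal numbers, that each "AF" line holds one flag set, and that 8192 is the buffer limit mentioned above. A real affix file also opens the alias section with an "AF <count>" line, which is omitted here.

```python
from typing import Dict, List, Tuple

# Fixed buffer size quoted in the comment above (assumed to bound line length).
MAX_LINE = 8192


def parse_num_flags(field: str) -> Tuple[int, ...]:
    """Parse a 'FLAG num' style field such as '12,345,6789' into integers."""
    return tuple(int(part) for part in field.split(",") if part)


def build_af_aliases(dic_lines: List[str]) -> Tuple[List[str], List[str]]:
    """Replace each distinct flag set with a 1-based alias number.

    Returns (af_lines, rewritten_dic_lines). Every distinct flag set
    becomes one 'AF ...' line; words then refer to it by its position.
    """
    alias_of: Dict[Tuple[int, ...], int] = {}
    af_lines: List[str] = []
    rewritten: List[str] = []
    for line in dic_lines:
        word, sep, field = line.partition("/")
        if not sep:
            # Word without flags: nothing to alias.
            rewritten.append(word)
            continue
        flags = parse_num_flags(field)
        if flags not in alias_of:
            alias_of[flags] = len(alias_of) + 1
            af_lines.append("AF " + ",".join(str(f) for f in flags))
        rewritten.append(f"{word}/{alias_of[flags]}")
    return af_lines, rewritten


def overlong_lines(lines: List[str], limit: int = MAX_LINE) -> List[int]:
    """Return indexes of lines that would not fit a buffer of `limit` chars."""
    return [i for i, line in enumerate(lines) if len(line) >= limit]
```

With numeric flags, a word that carries a few thousand affix rules yields a single "AF" line several thousand characters long, which is the kind of line a fixed 8192-element buffer cannot hold.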
Jan 25, 2009
Hi there. I have put a compressed dict and aff file in the download section. They are reduced from 1,130,000 words using 5000 affixes. I hope it will work better for you. http://code.google.com/p/tr-spell/downloads/list
The file is: dict_aff_5000_suffix_1130000_words.zip
Feb 03, 2009
The following revision refers to this bug:
http://src.chromium.org/viewvc/chrome?view=rev&revision=9122
------------------------------------------------------------------------
r9122 | hbono@chromium.org | 2009-02-03 18:30:40 -0800 (Tue, 03 Feb 2009) | 4 lines
Changed paths:
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/third_party/hunspell/google/bdict_writer.cc?r1=9122&r2=9121
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/third_party/hunspell/src/hunspell/hashmgr.cxx?r1=9122&r2=9121
M http://src.chromium.org/viewvc/chrome/trunk/src/chrome/third_party/hunspell/src/hunspell/htypes.hxx?r1=9122&r2=9121
The first step towards Turkish spell-checker.

This is a set of fixes for supporting the Turkish dictionary provided by the tr-spell project (*1). As I wrote in http://crbug.com/4782, this issue consists of three issues: one is against our convert_dict tool, and two are against our hunspell client.

(*1) http://code.google.com/p/tr-spell/

Unfortunately, the BDIC file converted from this Turkish dictionary is huge (7.1MB) because the dictionary has a lot of affix rules (> 18,000) and most of the BDIC file is occupied by "AF" lines.
BUG=4782
Review URL: http://codereview.chromium.org/18041
------------------------------------------------------------------------
Feb 05, 2009
Sorry for my late update. I committed the change required to support your Turkish dictionary, and it works OK on my local build of Chromium. Nevertheless, to use this dictionary in Chrome, we need to upload it to our dictionary-download server so that Chrome can download and use it automatically. We would also like to include your dictionary in the Chrome source tree and publish it under a BSD license. Would it be OK for us to include your Turkish dictionary in our source tree and upload it to our dictionary-download server?
Feb 06, 2009
Hironori: Yes, it is fine for you to add the Turkish dictionary to the Chrome source tree. Maybe you have seen it already: I also put a reduced-affix dictionary file there. It may be better to use that one, but it is your call. At Mehmet's request, I also ran a small test comparing that dictionary file against a true Turkish morphological parser, using a law-related text containing around 1.5 million words. The dictionary-based mechanism accepted 98.24% of the words that the morphological parser also marked as correct. Considering that the success rate of a morphological-parser-based system is around 99.5% in general, I think this dictionary is good enough for most spell-checking needs. Thanks for the effort and good luck. PS: there is a chance we may update the dictionary file from time to time; I don't know how you would want to synchronize Chrome with that.
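The comparison described above amounts to measuring how many of the words accepted by a reference checker are also accepted by the dictionary. A minimal sketch of that calculation follows; the function name `agreement_rate` and the toy word sets are hypothetical and only illustrate the arithmetic, not the actual test harness or data used for the 98.24% figure.

```python
from typing import Callable, Iterable


def agreement_rate(
    words: Iterable[str],
    reference_accepts: Callable[[str], bool],
    candidate_accepts: Callable[[str], bool],
) -> float:
    """Percentage of reference-accepted words that the candidate also accepts."""
    accepted = [w for w in words if reference_accepts(w)]
    if not accepted:
        raise ValueError("reference checker accepted no words")
    agreed = sum(1 for w in accepted if candidate_accepts(w))
    return 100.0 * agreed / len(accepted)


# Toy stand-ins: a "morphological parser" and a smaller word-list dictionary.
parser_words = {"ev", "evler", "evlerimiz"}
dictionary_words = {"ev", "evler"}
text = ["ev", "evler", "evlerimiz", "xyz"]

rate = agreement_rate(
    text,
    lambda w: w in parser_words,
    lambda w: w in dictionary_words,
)
print(f"{rate:.2f}% agreement")
```

Words rejected by the reference checker (here "xyz") are left out of the denominator, matching the phrasing "words that the morphological parser also marked as correct".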
Feb 20, 2009
Merhaba,
Thank you for your response, and sorry for my slow update. We are now uploading your Turkish dictionary to our download server and adding the UI for choosing the Turkish dictionary. I hope we can use the Turkish dictionary soon.
Regards,
Hironori Bono
Feb 20, 2009
The Turkish and Estonian spell check dictionaries are up and live on the download servers, and the required UI changes are now in the trunk.
Feb 25, 2009
Works fine in 2.0.166.1 (Official Build 10303).
Status: Fixed
Feb 25, 2009
(No comment was entered for this change.)
Status: Verified