Finite state and Constraint Grammar based analysers, proofing tools and other resources
Present: Csilla, Jack, Trond
Work with prefixed verbs
@U.VPref.konyl
@U.VPref.nox
@U.VPref.tyg
We have flags in the code. Let us discuss with Sjur whether tags (converted into flags later on) would be better.
The test routines have been updated to take typos.txt into account when testing for missing words.
Small improvements in coverage: we reach 0.923 coverage for Luima Seripos.
242 ва̄рим # gramm, verb is not working
238 хотьют # typo хотъют
209 арыгтем # missing
196 арыгкем # missing lemma аригкем Adv
189 сыресыр # missing
173 места # missing loanword
160 ловиньтэлы̄н
159 ӯнттувес
...
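A minimal sketch, in Python, of how a frequency list like the one above could be split into known typos and genuinely missing words. The typos.txt layout (typo, tab, correction, with # or ! starting a comment line), all function names, and the coverage formula are assumptions for illustration; this is not a description of the actual test routines.

```python
def load_typos(text):
    """Collect the typo column from typos.txt-style content.

    Assumed layout: 'typo<TAB>correction'; lines starting with '#' or '!'
    are treated as comments. The real file may differ.
    """
    typos = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        typos.add(line.split("\t", 1)[0].strip())
    return typos


def parse_missing_line(line):
    """Split '238 word # note' into (238, 'word', 'note')."""
    body, _, comment = line.partition("#")
    count, word = body.split(None, 1)
    return int(count), word.strip(), comment.strip()


def split_by_typos(lines, typos):
    """Separate genuinely missing words from words already listed as typos."""
    missing, known_typos = [], []
    for line in lines:
        line = line.strip()
        if not line[:1].isdigit():
            continue  # skip blank lines and elisions such as '...'
        count, word, _ = parse_missing_line(line)
        (known_typos if word in typos else missing).append((count, word))
    return missing, known_typos


def coverage(total_tokens, unanalysed_tokens):
    """Share of running tokens that receive an analysis (assumed definition)."""
    return 1.0 - unanalysed_tokens / total_tokens


# Three lines copied from the list above.
SAMPLE = [
    "242 ва̄рим # gramm, verb is not working",
    "238 хотьют # typo хотъют",
    "209 арыгтем # missing",
]
```

With хотьют listed in typos.txt, only the first and third sample lines would remain as candidates for the lexicon.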
We go for Comp. The tags are (mostly) fixed; the cleanup will be done after the meeting.
@U.VPref.xot@хот-воратаӈкве+V:@U.VPref.xot@ворат V_A "" ;
This should work, but does not. We will discuss this with Sjur. tag_test.sh claims that @хот-воратаӈкве
is an undeclared tag, which it of course is not. The problem is that @U.VPref.xot@
is declared.
We should check whether the tag_test.sh is ok.
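To make the suspected failure mode concrete, here is a hypothetical Python sketch. We have not looked at how tag_test.sh extracts tags, so the naive pattern below is only an assumption about what could produce the report; the flag-diacritic pattern follows the usual @OP.Feature.Value@ shape. All names are made up for illustration.

```python
import re

# Flag diacritics: @OP.Feature@ or @OP.Feature.Value@ (OP is one letter).
FLAG = re.compile(r"@[PNRDCUE]\.[^.@\s]+(?:\.[^.@\s]+)?@")

# Hypothetical naive pattern: '@' followed by anything up to the next
# '@', '+', ':' or whitespace. This is NOT taken from tag_test.sh.
NAIVE_AT_TAG = re.compile(r"@[^@+:\s]+")

# Ordinary tags of the +Tag kind.
PLUS_TAG = re.compile(r"\+[^+@:\s]+")

ENTRY = '@U.VPref.xot@хот-воратаӈкве+V:@U.VPref.xot@ворат V_A "" ;'


def upper_side(entry):
    """Return the part before the first ':' (escaped %: is not handled)."""
    return entry.split(":", 1)[0]


def naive_at_tags(entry):
    """What a naive '@...' scan would report as tags."""
    return NAIVE_AT_TAG.findall(upper_side(entry))


def tags_after_stripping_flags(entry):
    """Remove flag diacritics first, then look for tags."""
    cleaned = FLAG.sub("", upper_side(entry))
    return cleaned, PLUS_TAG.findall(cleaned) + NAIVE_AT_TAG.findall(cleaned)
```

Under this assumption, the closing @ of the flag is read as the start of a new tag, so naive_at_tags(ENTRY) contains @хот-воратаӈкве, while stripping the flags first leaves only +V. If tag_test.sh behaves similarly, removing flag diacritics before tag extraction would be one possible fix.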
Trond has lifted mns from alpha to beta level (it does not show up on the page yet, though).
Our results are fantastic, taking into consideration the small lexica we have:
3881 src/fst/stems/verbs.lexc
3460 src/fst/stems/nouns.lexc
1450 src/fst/stems/adjectives.lexc
714 src/fst/stems/abbreviations.lexc
529 src/fst/stems/Missing_words_20231006.txt
368 src/fst/stems/mns-propernouns.lexc
271 src/fst/stems/adverbs.lexc
197 src/fst/stems/pronouns.lexc
121 src/fst/stems/numerals.lexc
110 src/fst/stems/postpositions.lexc
30 src/fst/stems/conjunctions.lexc
24 src/fst/stems/interjections.lexc
11 src/fst/stems/participles.lexc
We understand why: there are no compounds and not many loanwords in the newspaper. One way of improving our list would still be to look for dictionaries. This we will do later, though.
When we are down to only a few missing words with a frequency of 10 and above, we start looking seriously at the spellchecker.
Probably week 50, perhaps in Hki.