GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.
1 alphabetically 2 by probabilities
1 relative frequency in WP 2 relative frequency in the actual corpus
only words with a higher frequency in fo than in WP
we are looking for terms
could be but not so frequently
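The filtering step noted above (keep only words that are relatively more frequent in the corpus at hand than in the reference corpus) could be sketched roughly as follows. This is a minimal illustration, not the script actually used: the function names and the toy token lists are hypothetical, and the two sort helpers simply mirror the two orderings mentioned in the notes.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its count divided by the total number of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def term_candidates(target_tokens, reference_tokens):
    """Keep words whose relative frequency is higher in the target corpus
    than in the reference corpus (hypothetical helper, for illustration)."""
    target = relative_frequencies(target_tokens)
    reference = relative_frequencies(reference_tokens)
    return {
        word: freq
        for word, freq in target.items()
        if freq > reference.get(word, 0.0)  # unseen in reference counts as 0
    }


def sort_alphabetically(candidates):
    """Ordering 1: alphabetical."""
    return sorted(candidates)


def sort_by_frequency(candidates):
    """Ordering 2: highest relative frequency first, ties alphabetical."""
    return sorted(candidates, key=lambda word: (-candidates[word], word))


if __name__ == "__main__":
    # Toy data only; real input would be tokenised corpora.
    target = ["a", "b", "b", "c"]
    reference = ["a", "a", "b", "d"]
    candidates = term_candidates(target, reference)
    print(sort_alphabetically(candidates))
    print(sort_by_frequency(candidates))
```

A word that is frequent in both corpora is dropped by this filter, which fits the remark that such a word could be a term, but is less likely to be one.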
-6.146 = 50: the closer to zero, the more frequent. confidence is the confidence for the pair,
i.e. the likelihood of these words being translations of each other
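The remark that a score closer to zero means a more frequent word is what one would expect if the scores are logarithms of relative frequencies. The notes do not say which tool, base or corpus size lies behind -6.146, so the sketch below is only an illustration of that property under the assumption of natural logarithms; the function name and the numbers are made up.

```python
import math


def log_relative_frequency(count, total):
    """Natural log of count / total. Always <= 0; closer to 0 = more frequent.
    Assumption: the scores in the notes behave like this, base unknown."""
    return math.log(count / total)


if __name__ == "__main__":
    total = 100_000  # made-up corpus size
    rare = log_relative_frequency(10, total)
    common = log_relative_frequency(1_000, total)
    # The more frequent word gets the score nearer to zero.
    print(rare < common < 0)
```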
sme = dynamic compound first part nom, gen, pl
if it never changes I can add it back; the reason they are removed is to get a smaller vocabulary size
lemma for compound ok for sme
updated, with all nouns, not the ones with high containing also absolute freq
giza++ ??
n a v exit rest
árvalit+V+TV+Der2+Der/eapmi+N+SgCmp#