GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.


1 alphabetically 2 by probabilities

1 rel frequency WP 2 rel freq actual corpus

only words with higher frequency in fo than in wp

we are looking for terms
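
The filter described above (keep only words whose relative frequency is higher in fo than in WP) could be sketched roughly as follows. This is only an illustration, not the script actually used: the function names and the `fo_tokens` / `wp_tokens` parameters are hypothetical, and both inputs are assumed to be plain lists of word tokens.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its count divided by the total number of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def term_candidates(fo_tokens, wp_tokens):
    """Return (word, rel_freq_fo, rel_freq_wp) for words more frequent in fo.

    Hypothetical helper: a word missing from WP counts as frequency 0.0,
    so it is always kept. The result is sorted alphabetically.
    """
    fo = relative_frequencies(fo_tokens)
    wp = relative_frequencies(wp_tokens)
    kept = [
        (word, freq, wp.get(word, 0.0))
        for word, freq in fo.items()
        if freq > wp.get(word, 0.0)
    ]
    return sorted(kept, key=lambda row: row[0])
```

Sorting by a probability column instead of alphabetically would only need a different `key` in the final `sorted` call.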

could be but not so frequently

-6.146 = 50: the closer to zero, the more frequent. confidence is the confidence for the pair

likelihood of these words being translations of each other

sme = dynamic compound first part nom, gen, pl

if it never changes I can add it back; the reason they are removed is to get a smaller vocabulary size

lemma for compound ok for sme

updated, with all nouns, not the ones with high containing also absolute freq

GIZA++ ??

n a v exit rest

árvalit+V+TV+Der2+Der/eapmi+N+SgCmp#
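
An analysis string like the one above can be taken apart into a lemma and its tags with a few lines of code. A minimal sketch, assuming `+` separates the lemma from each tag and the trailing `#` marks a compound boundary; `split_analysis` is a hypothetical helper, not part of any GiellaLT tool:

```python
def split_analysis(analysis):
    """Split an analysis string into (lemma, list_of_tags).

    Assumption: '+' is the separator and a trailing '#' (compound
    boundary) is not part of the last tag, so it is stripped first.
    """
    parts = analysis.rstrip("#").split("+")
    return parts[0], parts[1:]


lemma, tags = split_analysis("árvalit+V+TV+Der2+Der/eapmi+N+SgCmp#")
```

Here `lemma` is the first field and `tags` holds the remaining fields in order, which makes it easy to check, for example, whether the analysis ends in a compound-form tag.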