GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology. Read more about Why. See also How to get started, and our Privacy document.

Word guesser game

The word guesser game is built on the same idea as MasterMind.

Basic setup

Creating the word lists

    cat ../../main/words/lists/smj/2021-11-03_smj_lemma.freq | # use an existing lemma list if available
    grep -v ' Prop'  | # Remove proper nouns - check the tag
    tr -s ' '   | # squeeze spaces (check output of previous command)
    cut -d' ' -f3  | # use the third field for further processing
    grep -v -e '[-é\ /.]' -e '[A-Z]' -e '[0-9]' | # Remove lines containing various noise characters
    grep '^......$'  | # Keep only words exactly 6 letters long - adjust if needed
    hfst-lookup -q lang-smj/src/fst/analyser-gt-norm.hfstol | # analyse all extracted lemmas
    grep -v 'inf$'  | # Remove unrecognised lemmas
    grep -v '^$' | cut -f1 | uniq | # clean up the analysis output
    sort -R  # randomise the list of words
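The noise and length filters in the pipeline above can be tried in isolation before running the analyser. The following is a minimal sketch, not part of the GiellaLT tooling: the function name `filter_candidates` and the sample words are made up for illustration, and the POSIX classes `[[:upper:]]` and `[[:digit:]]` are swapped in for `[A-Z]` and `[0-9]`. It also assumes a UTF-8 locale when the list contains non-ASCII letters, since `.` may otherwise match bytes instead of characters and give the wrong word length.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GiellaLT): read candidate words on stdin
# and keep only clean, unique words of exactly N characters.
# Usage: filter_candidates N < wordlist
filter_candidates() {
    local len="$1"
    grep -v -e '[[:upper:]]' -e '[[:digit:]]' -e '[-/. ]' |  # drop lines with noise characters
    grep -E "^.{${len}}\$" |                                   # keep lines of exactly N characters
    sort -u                                                    # remove duplicates
}

# Made-up sample input: only "guolle" and "vuodja" survive the filters.
printf '%s\n' bierggo Oslo guolle 'ja-ga' vuodja guolle a1b2c3 |
    filter_candidates 6
```

Taking the length as an argument makes it easy to build lists for other word lengths without editing the pattern by hand.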

Alternatively, you can grab the list of lemmas directly from the lexc files:

    ./giella-core/scripts/extract-lemmas.sh \
        lang-sje/src/fst/morphology/stems/*lexc | # Extract all lemmas from lexc
    grep '^......$'  | # Grep all and only words with correct length
    grep -v -e '[A-ZÁÆØÅÄÖ]' -e '\.' -e '[0-9]' | # Grep away problem strings
    sort -u | sort -R  # clean and randomise
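Whichever way the list is produced, it may be worth checking the result before using it in the game. The sketch below is one possible sanity check, not something the GiellaLT scripts provide: `check_wordlist` is a hypothetical name, and the check assumes the same UTF-8 locale caveat as the length filter in the pipelines above. It fails if any line has the wrong length, if a word occurs twice, or if the list is empty.

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of GiellaLT): succeed only if every line on
# stdin has exactly N characters and no line occurs more than once.
# Usage: check_wordlist N < wordlist
check_wordlist() {
    local len="$1" list
    list="$(cat)"
    # grep -v selects lines that do NOT have exactly N characters;
    # with -q it exits 0 as soon as one such line is found.
    if printf '%s\n' "$list" | grep -v -q -E "^.{${len}}\$"; then
        echo "word list contains lines of the wrong length" >&2
        return 1
    fi
    # uniq -d prints only lines that are repeated in the sorted input.
    if [ -n "$(printf '%s\n' "$list" | sort | uniq -d)" ]; then
        echo "word list contains duplicates" >&2
        return 1
    fi
}

# Made-up sample list that passes the check.
printf '%s\n' guolle vuodja | check_wordlist 6 && echo "word list OK"
```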
