Source file documentation
- Documentation written in the source files
- The source files themselves: stems / affixes / twolc / IPA / syntax
Using the analysers
- In the terminal: analyse words by writing usme, generate with dsme
- Generation of: paradigms / text /
- For more info, see How to use the morphological parsers
Projects involving North Saami
- Overview of the various FSTs for North Saami
- Dictionary projects
- ICALL
- Machine translation
- Grammar checker
- Text-to-speech
- The L2 Transducer
Tags used for analysis
- The morphological, morphological (readable version) and syntactic and
- Lemma homonymies and variants - tags for identifying and sorting
- Lemma homonymies and variants: Main documentation in English
- Documentation of how to use the tags for searching in Korp
Discussions on improving our linguistic analysis
- Discussions on issues common for Saami languages
- Discussions on restricting the generation of possessive suffixes
Morphophonology, morphology and syntax
- Documentation of the twol-sme.txt rule file
- Documentation of the lexicon files
- The use of flag diacritics
- Partly obsolete documentation of the disambiguation file
- Syntax regression testing: run sh test/src/syntax/disambiguation_developertest.sh (you may have to adjust the path following $GTBIG; the files are in $GTBIG/gt/sme/corp)
- See also the general disambiguation page.
Pre- and postprocessing
- Documentation of the preprocessing of running text
- The perl-based preprocess script, our current preprocessor
- For reference: Documentation of the old xfst-based preprocessor tok.txt is found here
- Documentation of inituppercase.regex (initial capitalisation) and allcaps.xfst, the file for words written in all-caps. Note: The latter is presently not in use.
- Translating from xerox-style to vislcg3-style is done with the script lookup2cg
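To illustrate what the xerox-style to vislcg3-style translation involves, here is a minimal Python sketch. It is not the lookup2cg script itself, only an approximation of the general idea. It assumes that the analyser prints one tab-separated analysis per line (word form, then lemma+Tag+Tag), separates words with blank lines, and marks unknown words with +? in a third field. The function name lookup_to_cg and the sample input are hypothetical.

```python
def lookup_to_cg(lookup_output: str) -> str:
    """Convert xerox-style lookup output to vislcg3-style cohorts.

    Assumed input (one analysis per line, blank line between words):
        wordform<TAB>lemma+Tag+Tag
    Assumed marking of unknown words:
        wordform<TAB>wordform<TAB>+?
    """
    cohorts = []
    # Each blank-line-separated block holds all analyses of one word form.
    for block in lookup_output.strip().split("\n\n"):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        wordform = lines[0].split("\t")[0]
        out = [f'"<{wordform}>"']          # cohort header: "<wordform>"
        for ln in lines:
            fields = ln.split("\t")
            analysis = fields[1] if len(fields) > 1 else ""
            if "+?" in fields[1:]:
                # Unknown word: keep the word form as lemma, tag it with ?
                out.append(f'\t"{wordform}" ?')
            else:
                # Known word: the lemma precedes the first +, tags follow.
                lemma, *tags = analysis.split("+")
                out.append("\t" + " ".join([f'"{lemma}"', *tags]))
        cohorts.append("\n".join(out))
    return "\n".join(cohorts)


# Illustrative input only: one analysed word and one unknown word.
sample = "viessu\tviessu+N+Sg+Nom\n\nxyz\txyz\t+?\n"
print(lookup_to_cg(sample))
```

The sketch splits each analysis at every +, so lemmas that themselves contain a plus sign, compound boundaries, and other special cases are not handled.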
Normativity issues
Speller optimisations
There is a separate page on speller optimisations for SME.
Obsolete test reports, for reference
- A test plan for sme (obsolete)
- A test diary for sme (obsolete)
- Bug report sheet (from the days before we got a bug report system) (obsolete)
- Our earlier treatment of foreign words (obsolete)