GiellaLT provides rule-based language technology aimed at minority and indigenous languages
Vislcg3 is the newest parser generation from Odense. As its predecessor, vislcg, it is open source. Vislcg3 is licensed under GPL.
Starting on March 5th 2008, we have migrated to vislcg3. Rule files for vislcg are still available in older revisions. For vislcg3 documentation we recommend the online documentation at beta.visl.sdu.dk/cg3.html. Here, we only document installation and basic usage.
/opt/local/bin/port search twigin a Terminal. If that is successfull, go on to the next step. If not install MacPorts, then issue the following commands in your terminal window:
sudo port install cmake sudo port install boost sudo port install icu cd /tmp svn co http://visl.sdu.dk/svn/visl/tools/vislcg3/trunk vislcg3 cd vislcg3/ ./cmake.sh --prefix=/usr/local make -j3 ./test/runall.pl sudo make install
The vislcg3 site has more info on Mac OS installation (including instructions for HomeBrew users), but note that we for our Apertium work assume your vislcg3 is instlled in /usr/local and not in /opt/local.
The vislcg3 site has installation instructions for Linux and Windows
The CG .rle files can be run as text files, or comiled. They will be
compiled with the
make TARGET=$LANG command d:
... | vislcg3 -g src/sme-dis.rle | ...
Vislcg3 can be run with this command:
... | vislcg3 -g src/sme-dis.rle | ...
The list of flags can be obtained by vislcg3 –help. That command prints something like this (use the newest version rather than this list):
-bash-3.00$ vislcg3 -h VISL CG-3 Disambiguator version 0.9.2.3279 Usage: vislcg3 [OPTIONS] Options: -h or -? or --help Displays this list. -V or --version Prints version number. -g or --grammar Specifies the grammar file to use for disambiguation. -p or --vislcg-compat Tells the grammar compiler to be compatible with older VISLCG syntax. --grammar-out Writes the compiled grammar back out in textual form to a file. --grammar-bin Writes the compiled grammar back out in binary form to a file. --grammar-info Writes the compiled grammar back out in textual form to a file, with lots of statistics and information. --grammar-only Compiles the grammar only. --trace Prints debug output alongside with normal output. --prefix Sets the prefix for mapping. Defaults to @. --sections Number of sections to run. Defaults to running all sections. --single-run Only runs each section once. --no-mappings Disables running any MAP, ADD, or REPLACE rules. --no-corrections Disables running any SUBSTITUTE or APPEND rules. --no-before-sections Disables running rules from BEFORE-SECTIONS. --no-sections Disables running rules from any SECTION. --no-after-sections Disables running rules from AFTER-SECTIONS. --num-windows Number of windows to keep in before/ahead buffers. Defaults to 2. --always-span Forces all scanning tests to always span across window boundaries. --soft-limit Number of cohorts after which the SOFT-DELIMITERS kick in. Defaults to 300. --hard-limit Number of cohorts after which the window is delimited forcefully. Defaults to 500. --no-magic-readings Prevents running rules on magic readings. --dep-allow-loops Allows the creation of circular dependencies. -O or --stdout A file to print output to instead of stdout. -I or --stdin A file to read input from instead of stdin. -E or --stderr A file to print errors to instead of stderr. -C or --codepage-all The codepage to use for grammar, input, and output streams. Auto-detects default from environment. --codepage-grammar Codepage to use for grammar. Overrides --codepage-all. --codepage-input Codepage to use for input. Overrides --codepage-all. --codepage-output Codepage to use for output and errors. Overrides --codepage-all. -L or --locale-all The locale to use for grammar, input, and output streams. Defaults to en_US_POSIX. --locale-grammar Locale to use for grammar. Overrides --locale-all. --locale-input Locale to use for input. Overrides --locale-all. --locale-output Locale to use for output and errors. Overrides --locale-all.