GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

Language resource maturity classification

This page presents and defines the maturity classification system used on this site. The final section describes how to add and change maturity tags.

Maturity classes

In the GiellaLT infrastructure we use a five-step classification to broadly describe the quality and development level of various linguistic resources. These categories are used as labels in README files, on the documentation front page for each resource, and in the overview pages for language models, dictionaries, keyboards and spell checkers (the maturity levels of grammar checkers, machine translation applications and speech technology are still undefined). The labels look like the following:

No.  Label                 Type                           Colour
1.   Maturity: Production  Production                     green
2.   Maturity: Beta        Beta                           yellow
3.   Maturity: Alpha       Alpha                          red
4.   Maturity: Experiment  Experiment / student exercise  black
5.   Maturity: Undefined   Undefined                      grey

Maturity class definitions (in reverse order)

Some of the criteria for the various levels are common to all resource pages and are listed under General criteria. Other criteria are application-specific:

Undefined Maturity: Undefined

Used when the maturity is not definable, or has not yet been defined/tagged.

Experiment Maturity: Experiment

This category also covers student exercises (published with permission). The point of such exercises is not to make a working system, but to explore the possibilities for language technology. Such work can of course be extended and in the end result in a fully working, production tool.

General criteria

Application specific criteria

Language model

Dictionary

Keyboard

Spell checker

Alpha Maturity: Alpha

General criteria

Application specific criteria

Language model

Dictionary

Keyboard

Spell checker

Beta Maturity: Beta

General criteria

Application specific criteria

Language model

Dictionary

Keyboard

Spell checker

Production Maturity: Production

General criteria

Application specific criteria

Language model

Dictionary

Keyboard

Spell checker

Registering maturity

The maturity classification is done using GitHub topics.

Maturity badges in READMEs, documentation and elsewhere are generated automatically from these topics. The topics are also used in the keyboard and language resource lists to group the repos automatically.

Adding maturity topic tags

Maturity tags are added as GitHub topics, which only repo or organisation owners and admins can do. Topics can also be set from the command line with gut, provided the repository does not already have them; removing or changing existing topics that way is not presently possible.
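For readers who script against GitHub directly, the following is a minimal sketch of how a topic could be added through GitHub's REST API, which has an endpoint that replaces all topics of a repository. This is not the procedure used by the GiellaLT tooling. The function name `build_topics_request`, the owner and repo names, and the topic name `maturity-beta` are hypothetical placeholders. Because the endpoint replaces the whole topic list, the sketch keeps the existing topics and appends the new one.

```python
import json
import urllib.request


def build_topics_request(owner, repo, existing, maturity_topic, token):
    """Build (but do not send) a PUT request that sets a repo's topics.

    The endpoint replaces the full topic list, so existing topics are
    carried over and the maturity topic is appended if it is missing.
    """
    names = list(existing)
    if maturity_topic not in names:
        names.append(maturity_topic)
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/topics",
        data=json.dumps({"names": names}).encode("utf-8"),
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical owner, repo, topics and token, for illustration only.
req = build_topics_request(
    "example-org", "lang-xyz", ["finite-state"], "maturity-beta", "TOKEN"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that requires a real token with sufficient rights on the repository.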

The topic tags corresponding to the labels above are as follows:

The Maturity: Undefined category does of course not have a topic - that is the definition of the category. In the lists and tables linked to above it should ideally be empty, but it is listed in any case to easily spot repositories that do not yet have a defined maturity class.

The maturity tags are turned into JSON endpoints for shields.io and stored in the gh-pages branch of each repository. This is done automatically by the CI on each push to GitHub, but requires that GitHub Pages has been configured for the repo.
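As an illustration of what such an endpoint file might contain, here is a minimal sketch that maps a repository's topics to a JSON document in the shields.io endpoint-badge schema (`schemaVersion`, `label`, `message`, `color`). The topic names in `MATURITY_BADGES` and the function `badge_endpoint` are hypothetical placeholders, not the names used by the GiellaLT CI; the colours follow the label table on this page.

```python
import json

# Hypothetical topic names (assumption); colours follow the label table.
MATURITY_BADGES = {
    "maturity-production": ("Production", "green"),
    "maturity-beta": ("Beta", "yellow"),
    "maturity-alpha": ("Alpha", "red"),
    "maturity-experiment": ("Experiment", "black"),
}


def badge_endpoint(topics):
    """Return a shields.io endpoint document for a list of repo topics.

    Repositories without a maturity topic fall into the Undefined class.
    """
    found = [t for t in topics if t in MATURITY_BADGES]
    message, color = MATURITY_BADGES[found[0]] if found else ("Undefined", "grey")
    return {
        "schemaVersion": 1,
        "label": "Maturity",
        "message": message,
        "color": color,
    }


print(json.dumps(badge_endpoint(["finite-state", "maturity-beta"]), indent=2))
```

A file like this, once published via GitHub Pages, can be rendered as a badge by pointing the shields.io endpoint badge at its URL.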

There should be only ONE maturity tag per repo. It is technically possible to add more maturity tags to a single repo, but that does not make much sense and will probably cause the JSON file creation to fail.
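The one-tag rule can be checked before anything is published. The sketch below assumes, hypothetically, that all maturity topics share the prefix `maturity-`; the helper name `check_single_maturity` is likewise made up for illustration.

```python
def check_single_maturity(topics, prefix="maturity-"):
    """Return the single maturity topic, or None if there is none.

    Raises ValueError when a repo carries more than one maturity topic.
    The "maturity-" prefix is an assumption made for this sketch.
    """
    found = sorted(t for t in topics if t.startswith(prefix))
    if len(found) > 1:
        raise ValueError(f"more than one maturity topic: {', '.join(found)}")
    return found[0] if found else None


print(check_single_maturity(["finite-state", "maturity-beta"]))
```

Returning `None` for a repo without any maturity topic matches the Undefined class, while a repo with two or more is rejected outright rather than resolved silently.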