GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.


Status and future of Xerox and other FST tools

Presently the Giella infrastructure supports three FST technologies in parallel: Hfst, Xerox and Foma.

Each of them has its strengths and weaknesses, summarised as follows:

Hfst

This compiler is the default choice for the ./configure setup, i.e. it is the compiler you use if you do not specify what compiler you want.

It is a source-code compatible clone of the Xerox tools (below), based on OpenFST, but with multiple backends (Foma, SFST, OpenFST).
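The "default unless you ask for something else" behaviour can be illustrated outside the build system. The following is a minimal sketch, not part of the GiellaLT `./configure` setup: the function names and the mapping are hypothetical, and it assumes the xfst-style front ends are installed under their usual executable names (`hfst-xfst`, `xfst`, `foma`), which may differ on a given system.

```python
import shutil

# Assumed executable names for each toolchain's xfst-style front end.
# These are the usual names; a local installation may use others.
FRONT_ENDS = {
    "hfst": "hfst-xfst",
    "xerox": "xfst",
    "foma": "foma",
}


def available_toolchains(which=shutil.which):
    """Return the toolchains whose front end is found on PATH."""
    return [name for name, exe in FRONT_ENDS.items() if which(exe) is not None]


def pick_toolchain(requested=None, which=shutil.which):
    """Return the requested toolchain, falling back to hfst when none is given."""
    name = requested or "hfst"
    if name not in FRONT_ENDS:
        raise ValueError(f"unknown toolchain: {name}")
    if which(FRONT_ENDS[name]) is None:
        raise RuntimeError(f"{FRONT_ENDS[name]} not found on PATH")
    return name


if __name__ == "__main__":
    # Prints whichever of the three toolchains are installed locally.
    print(available_toolchains())
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the functions consult the real `PATH`.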

Strengths:

Weaknesses:

Xerox

Xerox is at present (spring 2021) the compiler used for cgi-bin services and online dictionaries.

Strengths:

Weaknesses:

Source code access

Even though the source code is not released, it is possible to get a license to the source code of the c-fsm library (documented here) by requesting a license for the XLE page. Information and relevant links can be found at the bottom of the project page.

Foma

Foma is a source-code and command-interface compatible clone of the Xerox xfst tool, developed and maintained by Måns Huldén. It is open source.

Strengths:

Weaknesses:

How to cope with this…

… i.e. the lack of a future for Xerox, the lack of twolc in Foma and the lack of speed in Hfst.

Today’s dual strategy

Maintaining two compilers for different applications is becoming increasingly difficult.

The future

… is dependent upon

We should now migrate all applications to Hfst.