GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

ELAN

This page documents conventions, standards and relevant workflows used for ELAN annotations created by the Freiburg-Tromsø Speech Corpora.

Introduction

ELAN is a GUI tool for creating annotations on video and audio resources. It is used by many documentary linguists and by several language documentation projects in [DOBES http://dobes.mpi.nl/], HRELP and similar programs.
The program supports complex corpus searches using regular expressions, across multiple tiers and across multiple ELAN files (multi-corpus search), and it can visualize the search results (concordance, frequency, etc.). For ELAN files stored at [The Language Archive (TLA) TLA.html], these features also work with the online tool Trova.

We use ELAN to annotate our video and audio resources stored at TLA, as well as to annotate and present our purely written text corpora (without links to multimedia). Here are the ELAN Documentation Pages at TLA.

ELAN-xml

The file extension for ELAN files is .eaf. These are essentially XML files (and can be opened as such), but they can also be read by the program ELAN to be presented and further edited in a GUI.
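Because an .eaf file is plain XML, it can be read without ELAN itself. The following is a minimal sketch in Python, not part of our workflow scripts. It assumes the usual EAF layout in which TIME_SLOT elements carry millisecond offsets and each TIER contains ALIGNABLE_ANNOTATION elements that refer to those slots. The sample document, the tier name "orth" and the function name read_annotations are hypothetical and serve only as illustration; real files contain more header metadata and may also use reference annotations, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced .eaf content for illustration only.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="3200"/>
  </TIME_ORDER>
  <TIER TIER_ID="orth" LINGUISTIC_TYPE_REF="orthT">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>first utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
                            TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>second utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_annotations(eaf_text):
    """Return (tier_id, start_ms, end_ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_text)

    # Map each time slot ID to its offset in milliseconds.
    # Slots without a TIME_VALUE (unaligned slots) are skipped.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }

    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, text))
    return rows


rows = read_annotations(SAMPLE_EAF)
for tier_id, start, end, text in rows:
    print(tier_id, start, end, text)
```

To read a file from disk instead of the inline sample, the same function can be given the file's contents, for example the result of reading the .eaf file as UTF-8 text.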

Workflow

Current praxis

Planned extension

There is a script for this on the langdoc/elan-fst page at GitHub, maintained by Niko Partanen, Joshua Wilbur and Michael Rießler. The pipeline has been used for Komi (the Freiburg project), Pite Saami (Joshua Wilbur) and North Saami (in Oulu).

Planned external project (Zhivotova)

Annotation Conventions

ELAN

*Documentation page for the ELAN tier structures used by our projects and links to ELAN tier template files (XML files in ELAN’s .etf format)
*Documentation page for Transcription conventions applied by our projects

Related tools

*WebLicht, a web-based tool for semi-automatic annotation of texts in linguistics and humanities research. Interaction with WebLicht from ELAN is still under development.