Vlax Romani NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-rmy

The Romani languages in the Nordic countries

Classifying Romani languages for language technology purposes is complicated. There are more Romani languages than there are ISO codes, so some ISO codes cover several varieties. One and the same language may also have several diverging normative bodies.

The table below gives language names, ISO and Glottolog codes (with link to Glottolog) for Romani languages in the Nordic countries, and indicates whether the language is standardised by the authorities in Finland, Sweden or Norway. Any status in other countries is not included in the table. GiellaLT status indicates whether the language has been worked on in the GiellaLT infrastructure: Alpha indicates working language models with some content, and Experiment indicates a working setup with no linguistic content.

| Glottolog name | Alternate names | Name in official documents | GiellaLT status | ISO code*) | GiellaLT code**) | Glottolog code | Standard in country |
|---|---|---|---|---|---|---|---|
| Kalo Finnish Romani | Kaale | Suomen romani | Alpha | rmf | rmf | kalo1256 | Finland |
| Tavringer Romani | | Resanderomska | Experiment | rmu | rmu | tavr1235 | Sweden |
| Romani arli | arlikane | Arli | Experiment | rmn | rmn***) | arli1238 | Sweden |
| Romani kalderaš | kelderašicko | Kalderash | Experiment | rmy | rmy-x-kalderas | kald1238 | Sweden |
| Romani lovara | lovari, lovaricko | Romanés | no | rmy | rmy-NO | lova1240 | Norway |
| Romani lovara | lovari, lovaricko | Lovari | no | rmy | rmy-x-lovara | lova1240 | Sweden |
| Polish Romani | | Polsk romska | no | rml | rml | poli1261 | Sweden |
| Traveller Norwegian | romani rakkripa | Romani | Alpha | rmg | rmg | trav1236 | Norway |

*) Note that three of the ISO codes have a wider coverage than in the table above: rmn (Balkan Romani), rml (Baltic Romani) and rmy (Vlax Romani) are all also used in a wider European context, and for more varieties than the ones referred to here.

**) BCP47 codes used to name repositories in the GiellaLT infrastructure.

***) rmn should in the GiellaLT context really be named rmn-SE, as we presently only work with data and representatives from Sweden.
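
The GiellaLT codes in the table combine an ISO 639-3 language code with an optional region (as in rmy-NO) or a private-use part introduced by `-x-` (as in rmy-x-kalderas). As a rough illustration only, the Python sketch below splits such codes into their parts. The names `LangTag` and `split_tag` are hypothetical and not part of the GiellaLT infrastructure, and the function handles only the simplified `lang[-REGION][-x-private]` shape seen in the table, not the full BCP 47 grammar.

```python
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    """Parts of a simplified BCP 47 tag (hypothetical helper type)."""
    language: str               # ISO 639-3 code, e.g. "rmy"
    region: Optional[str]       # two-letter region, e.g. "NO"
    private_use: Optional[str]  # text after "-x-", e.g. "kalderas"


def split_tag(tag: str) -> LangTag:
    """Split a tag of the form lang[-REGION][-x-private].

    Assumption: only the shapes used in the table are supported;
    scripts, variants and extensions are ignored.
    """
    # Everything after the first "-x-" is treated as private use.
    main, sep, private = tag.partition("-x-")
    parts = main.split("-")
    language = parts[0].lower()
    region = None
    for part in parts[1:]:
        # A two-letter alphabetic subtag is taken to be a region code.
        if len(part) == 2 and part.isalpha():
            region = part.upper()
    return LangTag(language, region, private if sep else None)


# GiellaLT codes copied from the table above.
GIELLALT_CODES = [
    "rmf", "rmu", "rmn", "rmy-x-kalderas",
    "rmy-NO", "rmy-x-lovara", "rml", "rmg",
]

for code in GIELLALT_CODES:
    print(code, split_tag(code))
```

Grouping the results by `language` makes the point of footnote *) visible: three of the GiellaLT codes (rmy-x-kalderas, rmy-NO and rmy-x-lovara) share the single ISO code rmy.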

In 2017, the Swedish Language Council initiated a project aiming to revise the orthographies of the Romani languages in Sweden, cf. this orientation. At present (spring 2022), all languages marked Sweden in the table above have their own distinct orthographies, but one possible outcome of the Swedish project is that several of them may be unified.