GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

Documentation of the syntactic tags

See also separate pages on compound, semantic, morphological and dependency tags.

At the bottom of this page there is a list of all tags in alphabetical order.

Syntactic tags

Our syntactic tags, or grammatical function tags, such as @SUBJ> and @OBJ>, are a compromise between three principles:

  1. use the same tags across GiellaLT languages
  2. use the conventions from within constraint grammar (CG), e.g. as found in the VISL project for interactive syntax learning
  3. take the grammatical tradition of the language in question into account

The main difference between the CG tradition (both GiellaLT and VISL CG) and other descriptions is that CG is a linear system, where tags are given to wordforms and not to phrases. Thus, in a sentence like the Saami equivalent of “Peter’s dog barks”, only the word “dog” gets the tag @SUBJ>. The word “Peter’s” gets the tag @>N, or “modifying a noun to its right”. It is then up to the reader (or to further computer processing) to interpret the combination of @>N and @SUBJ> as a phrase. Phrase information is also available via the dependency tags when they are present.
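The interpretation step described above can be sketched in a few lines of Python. This is only an illustration, not part of the GiellaLT tools: the token format, the part-of-speech labels, the English glosses, the verb tag @+FMAINV and the "nearest noun to the right" heuristic are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# (wordform, part of speech, syntactic tag); English glosses stand in for Saami.
Token = Tuple[str, str, str]


def attach_noun_premodifiers(tokens: List[Token]) -> List[Optional[int]]:
    """For each token tagged @>N, return the index of the nearest noun to its right."""
    heads: List[Optional[int]] = [None] * len(tokens)
    for i, (_, _, tag) in enumerate(tokens):
        if tag != "@>N":
            continue
        for j in range(i + 1, len(tokens)):
            if tokens[j][1] == "N":  # assumed heuristic: first noun to the right
                heads[i] = j
                break
    return heads


def subject_phrase(tokens: List[Token]) -> List[str]:
    """Collect the @SUBJ> wordform together with the @>N words attached to it."""
    heads = attach_noun_premodifiers(tokens)
    subjects = {i for i, tok in enumerate(tokens) if tok[2] == "@SUBJ>"}
    return [tok[0] for i, tok in enumerate(tokens)
            if i in subjects or heads[i] in subjects]


# Hypothetical analysis of "Peter's dog barks"; the verb tag is assumed.
sentence = [("Peter's", "N", "@>N"), ("dog", "N", "@SUBJ>"), ("barks", "V", "@+FMAINV")]
```

Calling subject_phrase(sentence) groups “Peter’s” and “dog” into one phrase, although only “dog” carries the subject tag.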

The arrow in a syntactic tag points at the “mother” node, which means that the tag tells which part of speech (N, A, P, Pron or Num) or syntactic constituent (like ADVL) the wordform modifies or is a complement to, and whether the “mother” is to the left or to the right.

The tag syntax is thus @Mother<Daughter or @Daughter>Mother, where either the daughter or the mother node may be left unspecified, giving four tag types: @Daughter>, @>Mother, @Mother< and @<Daughter.
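As a rough illustration of this syntax, the following Python sketch splits a tag into its daughter, its mother and the direction of the mother. The names SyntacticTag and parse_tag are hypothetical. The sketch covers only the four basic patterns plus arrowless tags, and does not try to analyse tags with internal structure such as @-F<OBJ or @COMP-CS<.

```python
from typing import NamedTuple, Optional


class SyntacticTag(NamedTuple):
    daughter: Optional[str]   # function of the wordform itself, if specified
    mother: Optional[str]     # category of the mother node, if specified
    direction: Optional[str]  # "right", "left", or None for arrowless tags


def parse_tag(tag: str) -> SyntacticTag:
    """Split a simple syntactic tag at its arrow; the arrow points at the mother."""
    if not tag.startswith("@"):
        raise ValueError(f"not a syntactic tag: {tag!r}")
    body = tag[1:]
    if ">" in body:
        # @Daughter>Mother: the mother is to the right
        daughter, mother = body.split(">", 1)
        return SyntacticTag(daughter or None, mother or None, "right")
    if "<" in body:
        # @Mother<Daughter: the mother is to the left
        mother, daughter = body.split("<", 1)
        return SyntacticTag(daughter or None, mother or None, "left")
    # No arrow: a verb tag or a metatag
    return SyntacticTag(body, None, None)
```

For example, parse_tag("@>N") yields no daughter, the mother N and the direction "right", while parse_tag("@<SUBJ") yields the daughter SUBJ, no mother and the direction "left".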

In addition to these four types, some tags have no arrow. These are of two kinds. The first is the central verb tags, such as @+FAUXV. They need no direction indication: either the direction is obvious, or the node points to zero. The second is the set tag. For each tag pair, such as @SUBJ> and @<SUBJ, there is a metatag, here @SUBJ, which means “either @SUBJ> or @<SUBJ”.
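The metatag convention can be pictured as a small set expansion. The Python sketch below is an illustration only; the function names are hypothetical, and it says nothing about how the sets are actually declared in the CG rule files.

```python
def expand_metatag(metatag: str) -> frozenset:
    """Return the pair of directed tags covered by an arrowless metatag."""
    if not metatag.startswith("@") or "<" in metatag or ">" in metatag:
        raise ValueError(f"not an arrowless metatag: {metatag!r}")
    name = metatag[1:]
    return frozenset({f"@{name}>", f"@<{name}"})


def matches_metatag(tag: str, metatag: str) -> bool:
    """True if the tag is one of the directed tags the metatag stands for."""
    return tag in expand_metatag(metatag)
```

Here expand_metatag("@SUBJ") gives the two tags @SUBJ> and @<SUBJ. A tag such as @>N is not covered by any metatag of this kind, because its arrow names the mother rather than the wordform's own function.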

Note that all syntactic tags are identified by an initial @ symbol, to distinguish them from morphological tags, which do not have such a prefix. In the analysis, the syntactic tags are printed at the end of the tag string.
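Because of the @ prefix, the two kinds of tags are easy to separate mechanically. The Python sketch below assumes that an analysis is available as a whitespace-separated tag string; that format, the function name split_tags and the morphological tags in the example are assumptions for illustration.

```python
from typing import List, Tuple


def split_tags(tag_string: str) -> Tuple[List[str], List[str]]:
    """Separate morphological tags from syntactic (@-prefixed) tags."""
    tags = tag_string.split()
    morphological = [t for t in tags if not t.startswith("@")]
    syntactic = [t for t in tags if t.startswith("@")]
    return morphological, syntactic


# Hypothetical tag string with the syntactic tag printed last.
morph, synt = split_tags("N Sg Nom @SUBJ>")
```

With this input, morph holds N, Sg and Nom, and synt holds only @SUBJ>.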

The syntactic tags for Saami

We present here the tags used for the Saami languages (the best developed languages in the GiellaLT infrastructure), but linguists working on other languages will also find the presentation useful. The rules assigning the tags are found in the file lang-xxx/src/cg3/disambiguation.cg3, where xxx is the ISO code of your language.

The verb tags

These tags are self-explanatory: there are finite and non-finite main and auxiliary verbs.

The major function tags

The four main functions for subject, object and their predicatives are obvious.

These are tags for the same functions of non-finite verbs outside the verbal: mu gets @-FSUBJ> in Diet dáhpáhuvai mu dieđikeahttá (the non-finite verb gets @<ADVL), and girjji gets @-F<OBJ in Munnje lei lossat lohkat girjji (the non-finite verb gets @<SPRED).

The adverbial tags

The @ADVL>, @<ADVL and @ADVL tags mark adverbials (many, but not all, adverbials are adverbs). The first two indicate the direction of the mother node; the third covers both of them.

The adverbial of a non-finite verb outside the verbal gets @-FADVL. The <hab> tag marks the habitive construction.

The @PCLE tag marks particles, and the @COMP-CS< tag is for the complement of a @CS.

The two tags @>P and @P< are for complements of post- and prepositions, respectively.

The two tags @>ADVL and @ADVL< mark a modifier of the adverbial to the right and a complement of the adverbial to the left, respectively. Note that these tags mark modifiers of adverbials rather than adverbials themselves.

The NP-internal modifiers

The other syntactic tags for modifiers tell which word they modify, and whether they modify to the left or to the right.

The morphological tag tells which part of speech the constituent itself is.

The @Pron< tag is for e.g. numerals modifying pronouns to their left, as in Mii golmmas finaimet máná geahčen.

The @Num< tag is for complements of numerals, like máná in Sudnos leat golbma máná.

Appositions

The apposition tag marks whether it is an apposition of a noun, a pronoun, a numeral or an adverbial.

The function words

These tags mark conjunctions connecting NPs and VPs.

Sentence-external tags

These tags mark a stray noun in a sentence fragment, an interjection, and a vocative.

The @X tag

An @X tag marks that no tag has been assigned (because of omissions in our rule component).

The tags, listed alphabetically

Here is a list of the tags, each followed by a definition or description and one or more examples.