GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology. Read more about Why. See also How to get started and our Privacy document.
[to be written..]
Typically, we have a long list of words, (several tens of thousands). Do a reverse sort on that list. If it is called newwords, give the command rev newwords | sort | rev > r-newwords.
Take out a set of lines that behave the same way, e.g. all ending in “njárga”. Make a new file, C-x C-f njarga1, and another file njarga2. The file njarga1 should contain everything before the “:” mark, and the file njarga2 everything after the “:” mark. Cf. the line:
Stuoranjár0ga:Stuoranjár’ga GOAHTI ;
The file njarga1 should contain entries like
Stuoranjár0ga
and the file njarga2 should contain entries like
Stuoranjár’ga GOAHTI ;
So, starting out with a list containing words like “Stuoranjárga”, change the entries in njarga1 to “Stuoranjár0ga”, and the entries in njarga2 to “Stuoranjar’ga GOAHTI ;”. Use the command M-x queTAB-TAB.
Save both files, and leave emacs.
Then you should paste these two files together. To do that, on the command line write the command
paste -d”:” njarga1 njarga2 | less
-d tells that you want to use a separator mark, and the mark you want to use (:) must be in quotation marks. Then comes the file (the leftmost file first), and since you want to control the output, you should write | less. If everything works, you should replace the last part with a command that creates a new file (e.g. named > 2njarga), or, if you know what you are doing, you should just add it directly to the file where you want it, with the >> operator.
paste -d”:” njarga1 njarga2 > 2njarga
or
paste -d”:” njarga1 njarga2 >> propernoun-sme-lex.txt