mns meeting
- Time: April 29nd 2024
- Present: Csilla, Jaska, Trond
Agenda
- Status
- Speller
- Grammar
- More texts
- Next meeting
Status
Testing
240422: 1-(27055/707586) = 0.96176
240429: 1-(26573/707554) = 0.96244
240422: 547 1.09 46.75 49.54
240429: 579 1.16 45.71 48.86
old command for coverage:
cat test/data/Luima_Seripos_2013-2017.txt |\
hfst-tokenise tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |\
preprocess --corr=test/data/typos.txt|\
hfst-tokenise -cg tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |\
grep " ?"|cut -d'"' -f2|wc -l
New command for coverage, removing Russian words (from Russian phrases in text):
cat test/data/Luima_Seripos_2013-2017.txt |\
hfst-tokenise tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |\
preprocess --corr=test/data/typos.txt|\
hfst-tokenise -cg tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |\
grep " ?"|cut -d'"' -f2|\
hfst-lookup -q ../lang-rus/src/fst/analyser-gt-desc.hfstol |\
grep "+?"|wc -l
Trond
Has worked with Børre. The question of columns is still open.
Csilla
Work done
Work with the missing list, added verbs and some other words.
Jack
Added to the missing list (the “to be solved” entries).
Worked on the reflexive pronouns (+Emph still to be done).
Has been adding words to foreign_words.lexc.
Russian words should be expelled from foreign_words.lexc.
Speller
We now have 581 typos. More typos are welcome.
Grammar
Personal pronouns to stems/
Emphatic and refelxive personal pronouns
ам+Pron+Emph+Sg1: амки = emph = Type 1 below
ам+Pron+Refl+Sg1+Nom: амки = Types 2, 3 below
ам+Pron+Pers+Sg1+Nom: ам =
ам+Pron+Refl+Sg1+Acc: [амкинам, амкина̄м, амким] = refl = minä pesin itseni
ам+Pron+Refl+Sg1+Dat: [амкинамн, амкимн]
ам+Pron+Refl+Sg1+Abl: [амкинамныл, акимныл]
ам+Pron+Refl+Sg1+Ins: [акинамтыл, амкимтыл]
General discussion:
Arg1 V Arg2 = refl Peter känner sig själv, Jag känner mig själv
| | Pron.Refl.Sg1
R <-----+
Arg1 V Arg2 = non-refl Peter känner honom, Peter känner mig
| |
R1 R2
Peter tränade mig själv (Peter trained me himself)
Pron.Acc Emph
hun
Látom magam mirrorban/futureben. 'I see myself.'
..., látom magam (is). 'I myself see (too).'
látod magad?
itse itse+Adv 0,000000
itse itse+Pcle 0,000000
itse itse+Pron+Sg+Nom 0,000000
The fst and yaml files should be updated accordingly.
Mansi discussion:
I.
1. According to literature, nominative forms are emphatic. This is supported by the data as well.
Амки о̄с ты па̄выл-т школа а̄стла-сум.
Sg1Emph too this village+Loc school finish+V+Act+Ind+Prt+ScSg1
I myself graduated from that school too.
Амки ня̄врам ат о̄ньщигл-асум, о̄йка-м тав хӯрум пыганэ, хӯрум а̄ги-янэ янмалт-асанум.
Sg1Emph child no own+V+Act+Ind+Prt+ScSg1 husband+PxSg1 he three boy+N+Pl+Nom+PxSg3 three girl+N+Pl+Nom+PxSg3 grow(tr)+V+Act+Ind+Prt+ScSg1+OcPl
I didn't have children myself, I was raising my husband's three sons and three daughters.
2. According to the data, nominative forms can be understood in reflexive-like meanings, especially when followed by postpoisitions. It doesn't seem to be a grammatical difference, but I still feel it to be different from the previous analysis.
Ты ня̄врам-акве-г ам [PoP амки нупыл-ум] хас-сагум, то̄нт нэ̄мхотьют а̄нум ат ла̄в-ыс, тыщир ва̄руӈкве ат кос э̄р-ыс.
thus child+Dim+Du Sg1 Sg1Refl above+PxSg1 write+V+Act+Ind+Prt+ScSg1+OcDu then nobody no say+V+Act+Ind+Prt+ScSg3 this.way do+Inf no atlhough may+V+Act+Ind+Prt+ScSg3
So I adopted the two children, I wasn't told that I would not have been doing so.
Ты а̄ги-рищ-иг-пыг-рищ-иг янмалт-ым ам о̄с нила ня̄врам-акве амки пал-т-ум ви-санум.
this girl+Dim+Du-boy+Dim+Du grow(tr)+Ger Sg1 too four child+Dim Sg1Refl side-Loc-PxSg1 bring+V+Act+Ind+Prt+ScSg1+OcPl
While I was raising this little girl and boy, I took in four other children as well.
Ты ня̄врам-ыт нэ̄пак-аныл хунь ва̄р-санум, то̄нт ла̄в-ве̄сум, амки нупыл-ум воссыг ул вос хансы-янум.
this child+Pl book+Nom+PxPl3 when do-+V+Act+Ind+Prt+ScSg1+OcPl then say-+V+Act+Ind+Prt+ScSg1+OcPl Sg1Refl no.nore write+V+Act+Ind+Prt+ScSg1+OcPl
When I was creating these children's documents, only then was I told not to adopt them.
3. In some situations, I'd translate the nominative form in the meaning of Russian "свой"
Ам тав-е̄н амки та̄рвитыӈ ва̄рмал-юм урыл потыртасум.
Sg1 Sg3 Sg1Refl difficult thing+PxSg1 about speak+V+Act+Ind+Prt+ScSg1
I told him about my(own) difficult situation.
Ам амки яныг-м-ам пора-м ёмащакв номи-лум, ма̄нь па̄выл-кве-ныл яныг ӯс-н интернат-ын о̄луӈкве ке̄т-вес-ӯв.
Sg1 Sg1Refl grow+V+PrfPrc+PxSg1 time+PxSg1 well think+V+Act+Ind+Prs+ScSg1+OcSg small village+Dim+Abl big town+Lat boarding.school+Lat live-Inf send+V+Pass+Ind+Prt+ScPl1
I well remember my own childhood, we were sent to the boarding school in the big town from our small village.
Ам амки о̄парищнам-ум Енизорова о̄л-ыс.
Sg1 Sg1Refl surname+PxSg1 Yenizarova be+V+Act+Ind+Prt+ScSg3
My maiden name (my own family name) was Yenizarova.
II.
1. According to literature, accusative, dative, ablative and instrumental forms are reflexive. It is supported by the data as well.
Ольга Александровна Фомичёва, Силава па̄выл-т о̄л-нэ ма̄ньщи нэ̄, таквина̄тэ̄н ос а̄гитэ̄н тамле суп тот ёвт-ыс.
O.A.F. Silava village-Loc live+V+PrsPrc Mansi woman Sg3.Refl+Dat and girl+PxSg3+Lat this.kind dress there buy+V+Act+Ind+Prt+ScSg3
A Mansi woman from Silawa village, O.A.F. bought herself and her daughter costumes like that there.
2. Some sentences do not fit to this understanding.
Ма̄хум-акве-т, на̄н ос на̄нкина̄н ёмас ва̄р-е̄н, пӯльница-н ва̄тихал уральтахтуӈкве ял-э̄н.
people+Dim+Pl Pl2 too Pl2Refl good do+V+Act+Imprt+ScPl2 hospital+Loc often guard(refl)+Inf go(freq)+V+Act+Imprt+ScPl2
Dear people, do yourselves a favour and go to the hospital often to have yourselves checked.
на̄н ос на̄нкина̄н ёмас ва̄р-е̄н
you(pl) too you(pl)?refl+Acc good(adj) do+Imp
Solitary pronouns
- Ювле ёхт-ысум, мощ номсахт-асум, э̄лаль амкем рӯпитаӈкве та пат-сум.
- home(lat) come
ам+Pron+Pers+Sg1+Nom: [амкем, амккем, амке̄мт, амкке̄мт]
ам+Pron+Pers+Du1+Nom: [ме̄нккеме̄н, ме̄нкеме̄н, ме̄нкемнт]
ам+Pron+Pers+Pl1+Nom: [ма̄нкке̄в, ма̄нке̄в, ма̄нке̄вт]
Both, the two of (them) = сас
Па̄вл-э̄н-т тэ̄н ханты мир культура центр-ыт рӯпит-э̄г, сас ма̄щтыр нам-ыл ма-им о̄л-э̄г.
village+PxDu3+Loc Du3 Khanty people culture centre+Loc work+V+Du3, both master name+Instr give-Gerund be+Du3
New prefix:
Also found what might be a new prefix: ёла-
!@U.VPref.jola@ёла-тохруӈкве+V:@U.VPref.jol@тохр V_U “21odd” ;
$ grep jola src/fst/morphology/root.lexc
$ grep jol src/fst/morphology/root.lexc
+Pref/jol
@U.VPref.jol@ !!≈ * @CODE@
@R.VPref.jol@ !!≈ * @CODE@
More texts
Last meeting’s todo list still not done:
- We try to get a meeting with Børre (Trond to discuss with Børre and then schedule a meeting, perhaps week 20).
- Csilla to systematize conversion errors and report for the meeting. My findings on lines 306-338
Next meeting
Preliminary time: 15.00 Monday 6th Finnish time (06.00 Edmonton time).