Marks particle as error instead of the preceding Err/Orth of the same mwe #45

duomdaamaendra · 2022-01-28T03:19:14Z

(↑ is divvun/divvun-gramcheck-web#18 , ↓ is this issue)

duomdaamaendra · 2022-01-28T03:30:12Z

This example comes from an erroneous word (correct: "-diehtagis"), but the part that gets marked is correct: it is the non-enclitic particle "gis".

snomos · 2022-01-28T06:11:11Z

In both cases I need the original text to be able to reproduce and debug. The paragraph containing the problem should be enough, maybe even just the sentence.

duomdaamaendra · 2022-01-28T09:50:40Z

Jos dal vel Sámis leat sullasaš dilit go davviriikkain muđuid, de fuobmá árvvoštallamiin goit ovtta erenoamáš ášši mii earuha sámi árvvoštallamiid omd. dáža árvvoštallamiin. Girječálli birra, ja su ođđa girji ovddeš bargguiguin veardádallon, gávnnat hárve sámi árvvoštallamiin. Čiekŋaleabbo dieđu go ahte gos čálli lea riegádan ja gos ássá, gávnnat hárve. Oalle dábálaš lea dákkár diehtu lohkkái: «Mus eai leat obanassii sánitge rámidit nn čehppodaga, dajan dušše ahte áŋgirit ja čeahpit gultturbargi ii gávnna ohcaminge.» (Samefolket 1/89, s. 92). Fuobmá maiddái dán čállosa ovdamearkka vuosttas siiddus: «- rohkkes Láhpoluobbala gollenieida …» Čállái báhcá goit rápmi, jos dal ii čiekŋalit ággaduvvon.

duomdaamaendra · 2022-01-28T09:51:50Z

Jos ohcala siva dasa ahte čálli birra leat uhcán dieđut, de vástádus dáidá leat nu álki go ahte nie šaddá lunddolaččat servodagasgos [116] buohkat dovddadit. Árvvoštalli duhtá dasto daid dábálaš dieđuide mat juo buohkain leat čálli birra. Dás han maiddái lea sáhka dušše ođđa girjjiin, ja otnáš čálliin. Ii árvvoštalli arvva lebbet dieđuid čálli eallimis omd., jos dal vel oaivvildeš ahte leat leamaš váikkuheaddji áššit čálli loahpalaš bargui, go dát sáhttet leat oalle persovnnalaččat ja dan sivas eai gula almmolašvuhtii. Sáhttá gal maiddái árvvoštalli leat oaivvildeamen ahte eat dárbbaš dárkilieabbo dieđuid go mat mis juo buohkain leat. Dán dili bahá ja buriid beliid garvván dán oktavuođas. Muhto dattege, jos vuos dieavaslaččat áigut árvvoštallat sámi girjjiid, de fertet árvvoštaladettiin maiddái ohcalit ja čállit biográfalaš dieđuid, muhto dieđusge dakkár dieđuid mat leat relevánta. Geaid luhtte čálli lea ijastallan maŋemuš jagiid, ja galle luovosmáná sus leat, eai leat eanemus relevánta dieđut. Dattege sáhttet leat čálli birrasis dakkár váikkuheaddji elementtat maid birra sáhtášii leat dehálaš diehtit. Ahte čálli eallimis sáhttet leat váikkuheaddji olbmot, dáhpáhusat ja fearánat mat leat váikkuhan su go girjji čálii, leat girjjálašvuođadiehttagis dohkkehan dutkanveara áššin. Historjjábiográfalaš árvvoštallama vuogis dát lea guovddáš ášši, earret dieđusge teaksta dahje girji maid čálli almmuha.

duomdaamaendra · 2022-01-28T10:03:56Z

The same happens in Google Docs.

snomos · 2022-01-28T10:07:53Z

The first case, gavnnat / nn, seems to be the same as this bug. That is, this bug is not restricted to GDocs.

snomos · 2022-01-28T10:23:05Z

The second example, the gis bug, is most likely on our end, but needs further investigation.

duomdaamaendra · 2022-01-28T14:45:40Z

@lynnda-hill

lynnda-hill · 2022-02-02T13:42:12Z

This example comes from an erroneous word (correct: "-diehtagis"), but the part that gets marked is correct: it is the non-enclitic particle "gis".

"<girjjálašvuođadiehtta>"
"girji" Ex/N Sem/Txt Der/lasj Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> #21->21
"girji" Ex/N Sem/Txt Der/lasj Ex/A Ex/Attr Der/vuota N Cmp/SgGen Cmp <W:0.0> #21->21
"girjjálaš" Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> #21->21
"girjjálašvuohta" N Sem/Txt Cmp/SgGen Cmp <W:0.0> #21->21
""
"gis" Pcle <W:0.0> @pcle MAP:22087:r16 &typo #22->22 ADD:10066:Err/Orth-any
"diehtit" V TV Ind Prs Sg3 Err/Orth <W:0.0> SUBSTITUTE:4876 #22->22
typo
"gis" Pcle <W:0.0> @pcle MAP:22087:r16 &typo &SUGGEST #22->22 ADD:10066:Err/Orth-any COPY:10075:Err/Orth-any
"diehtit" V Ind Prs Sg3 <W:0.0> SUBSTITUTE:4876 #22->22
diehtit+V+Ind+Prs+Sg3#gis+Pcle ?
:
This seems to be the old particle problem again; we should really do something about it.

unhammer · 2022-03-10T13:43:04Z

This is what's going on:

$ echo 'girjjálašvuođadiehttagis' | modes/trace-smegramrelease3-cg.mode 
"<girjjálašvuođadiehttagis>"
        "gis" Pcle <W:0.0> "<gis>" <LastCohort> <firstCohort>
                "diehtit" V <EX-Nom-Ani> TV Ind Prs Sg3 Err/Orth <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876
                        "girji" Ex/N Sem/Txt Der/lasj Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> "<girjjálašvuođadiehtta>" <LastCohort> <firstCohort>
        "gis" Pcle <W:0.0> "<gis>" <LastCohort> <firstCohort>
                "diehtit" V <EX-Nom-Ani> TV Ind Prs Sg3 Err/Orth <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876
                        "girji" Ex/N Sem/Txt Der/lasj Ex/A Ex/Attr Der/vuota N Cmp/SgGen Cmp <W:0.0> "<girjjálašvuođadiehtta>" <LastCohort> <firstCohort>
        "gis" Pcle <W:0.0> "<gis>" <LastCohort> <firstCohort>
                "diehtit" V <EX-Nom-Ani> TV Ind Prs Sg3 Err/Orth <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876
                        "girjjálaš" Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> "<girjjálašvuođadiehtta>" <LastCohort> <firstCohort>
        "gis" Pcle <W:0.0> "<gis>" <LastCohort> <firstCohort>
                "diehtit" V <EX-Nom-Ani> TV Ind Prs Sg3 Err/Orth <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876
                        "girjjálašvuohta" N Sem/Txt Cmp/SgGen Cmp <W:0.0> "<girjjálašvuođadiehtta>" <LastCohort> <firstCohort>


$ echo 'girjjálašvuođadiehttagis' | modes/trace-smegramrelease4-mwe-split.mode
"<girjjálašvuođadiehtta>"
        "girji" Ex/N Sem/Txt Der/lasj Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
        "girji" Ex/N Sem/Txt Der/lasj Ex/A Ex/Attr Der/vuota N Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
        "girjjálaš" Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
        "girjjálašvuohta" N Sem/Txt Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
"<gis>"
        "gis" Pcle <W:0.0> <LastCohort> <firstCohort>
                "diehtit" V <EX-Nom-Ani> TV Ind Prs Sg3 Err/Orth <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876


$ echo 'girjjálašvuođadiehttagis' | modes/trace-smegramrelease.mode 
"<girjjálašvuođadiehtta>"
        "girji" Ex/N Sem/Txt Der/lasj Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
        "girji" Ex/N Sem/Txt Der/lasj Ex/A Ex/Attr Der/vuota N Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
        "girjjálaš" Ex/A Der/vuota N Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
        "girjjálašvuohta" N Sem/Txt Cmp/SgGen Cmp <W:0.0> <LastCohort> <firstCohort>
"<gis>"
        "gis" Pcle <W:0.0> <LastCohort> <firstCohort> @PCLE MAP:22090:r16 &typo ADD:10126:Err/Orth-any
                "diehtit" V <EX-Nom-Ani> TV Ind Prs Sg3 Err/Orth <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876
typo
        "gis" Pcle <W:0.0> <LastCohort> <firstCohort> @PCLE MAP:22090:r16 &typo &SUGGEST ADD:10126:Err/Orth-any COPY:10135:Err/Orth-any
                "diehtit" V <EX-Nom-Ani> Ind Prs Sg3 <W:0.0> <LastCohort> <firstCohort> SUBSTITUTE:4876
diehtit+V+Ind+Prs+Sg3#gis+Pcle  ?

Much simplified, we have the following from the analyser:

"<abc>"
	"c" Pcle "<c>"
		"b" V Err/Orth
			"a" N "<ab>"

which cg-mwesplit turns into

"<ab>"
	"a" N
"<c>"
	"c" Pcle
		"b" V Err/Orth

Now the generator gets sent

b+V#c+Pcle

which doesn't give any results. If we could send just b+V, we would get the correct form for that part, but then we'd need an input wordform mark separating "<a>" from "<b>", so that we got

"<abc>"
	"c" Pcle "<c>"
		"b" V Err/Orth "<b>"
			"a" N "<a>"

or in the original example:

"<girjjálašvuođadiehttagis>"
	"gis" Pcle "<gis>"
		"diehtit" V TV Ind Prs Sg3 Err/Orth "<diehtta>"
			"girjjálašvuohta" N Cmp/SgGen Cmp "<girjjálašvuođa>"

which cg-mwesplit would turn into

"<girjjálašvuođa>"
	"girjjálašvuohta" N Cmp/SgGen Cmp
"<diehtta>"
	"diehtit" V TV Ind Prs Sg3 Err/Orth
"<gis>"
	"gis" Pcle

At least that's one possibility – I have no idea how hard that would be to do on the lexicon side.
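
To make the simplified example concrete, here is a toy model in Python. It is only a sketch under assumptions, not the real cg-mwesplit or suggestion code: `mwe_split` and `generator_string` are hypothetical names, a cohort is reduced to a single reading chain, and dropping `Err/` tags before generation is an assumption read off the trace above. It shows that with the current analyser output the `Err/Orth` subreading ends up under "<c>" and the generator is asked for `b+V#c+Pcle`, while with a wordform mark on every part it would be asked for just `b+V`.

```python
import re
from typing import List, Tuple

# A cohort here is (wordform, readings); readings[0] is the main reading,
# the rest are its subreadings (toy model, one reading chain only).
Cohort = Tuple[str, List[str]]

WORDFORM = re.compile(r'"<[^>]*>"')


def mwe_split(chain: List[str]) -> List[Cohort]:
    """Split one reading chain at every reading that carries a wordform tag.

    `chain` lists the readings from the outermost one (last word part)
    to the innermost subreading (first word part), as in CG output.
    """
    groups: List[Cohort] = []
    for reading in chain:
        match = WORDFORM.search(reading)
        if match:
            # A wordform tag starts a new cohort; remove the tag itself.
            rest = (reading[:match.start()] + reading[match.end():]).split()
            groups.append((match.group(0), [" ".join(rest)]))
        elif groups:
            # No wordform tag: stays a subreading of the cohort above it.
            groups[-1][1].append(reading)
        else:
            raise ValueError("outermost reading has no wordform tag")
    groups.reverse()  # the innermost subreading is the first word in the text
    return groups


def generator_string(readings: List[str]) -> str:
    """Build the generator input for one cohort (assumed format lemma+Tag#lemma+Tag)."""
    parts = []
    for reading in reversed(readings):  # innermost subreading first
        tokens = reading.split()
        lemma = tokens[0].strip('"')
        tags = [t for t in tokens[1:] if not t.startswith("Err/")]  # assumption
        parts.append("+".join([lemma] + tags))
    return "#".join(parts)


current = ['"c" Pcle "<c>"', '"b" V Err/Orth', '"a" N "<ab>"']
proposed = ['"c" Pcle "<c>"', '"b" V Err/Orth "<b>"', '"a" N "<a>"']

for name, chain in (("current", current), ("proposed", proposed)):
    for wordform, readings in mwe_split(chain):
        print(name, wordform, generator_string(readings))
```

In the "current" run the cohort "<c>" yields `b+V#c+Pcle`; in the "proposed" run the error sits on its own cohort "<b>", which yields `b+V`, and "<c>" is a plain particle cohort.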

snomos changed the title ~~msword marks part of correct word as error~~ MS Word marks part of correct word as error Jan 28, 2022

snomos mentioned this issue Jan 28, 2022

GDocs/MS Word extensions mix error with identical substring in earlier word divvun/divvun-gramcheck-web#18

Closed

unhammer changed the title ~~MS Word marks part of correct word as error~~ Marks particle as error instead of the preceding Err/Orth of the same mwe Mar 10, 2022
