Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make att.repeatable work for <sequence>, <alternate>, and <anyElement> #633

Merged
merged 12 commits into from
Mar 15, 2024

Conversation

sydb
Copy link
Member

@sydb sydb commented Oct 9, 2023

Re-worked how att.repeatable attributes (@minOccurs and @maxOccurs) work for <sequence>, <alternate>, and <anyElement> in order to fix #627. I have not attacked <datatype> yet, but might do so this week.

…ne for RELAXNG, one for DTD) with a new function, tei:norMinMax(); should be no change to output; passes both Test/ and Test2/.
… untested because of #631. This comit is *not* ready for prime time.
In this commit I intended to comment out some debugging code (which renders output kinda useless, is it sticks the results of each pass into the output as RELAXNG <div> elements, rather than as comments) and update some tests. But I found to my horror that (despite message of previous 3 commits) tests were not passing. So lots of test updates, too. NB: One of the things that makes this ineligible for merging into dev is that the constraint for the <choice> element has changed from the incorrect, but workable '( model.choicePart | choice )+' to the incorrect and *not* workable '( model.choicePart | choice )+'. At the moment I have no idea why, but it does not really matter, as attacking minOccurs= and maxOccurs= on <alternate> and <sequence> is next.
…nterleave>), too. Passes tests, so *may* be eligible to merge into dev, but not really ready, yet, as need to work on repeatability of <alternate>.
 * Removed some debugging code
 * Added tests (in Test2/testPure1) for sequence/@m[ia][nx]Occurs and alternate/@m[ia][nx]Occurs, including valid instance, but not invalid
 * Updated tests (in Test/)
Copy link
Contributor

@raffazizzi raffazizzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read through the code and it makes sense. Requested very minor changes; see comments.
I'm not approving it yet because I'm not sure this is fully working yet -- repeating the rng wrapper may not be sufficient to make this work.

I tried something silly like this:

<elementSpec ident="g" mode="change">
  <content>
    <alternate>
      <sequence minOccurs="3" maxOccurs="4">
        <elementRef key="g"/>
        <elementRef key="l"/>
      </sequence>
      <elementRef key="gap"/>
    </alternate>
  </content>
</elementSpec>

which generated:

element g {
    ((test_g, test_l)
     | (test_g, test_l)
     | (test_g, test_l)
     | (test_g, test_l)?
     | test_gap),

But this validates, meaning that minOccurs="3" isn't forcing a minimum of three occurrences.

<g>
   <g></g>
    <l></l>
   <!-- expecting at least two more occurrences of the `g` `l` pair. -->
</g>

Or am I misunderstanding how this should work?

<xsl:apply-templates mode="pass3"/>
</xsl:for-each>
</xsl:template>
<xsl:variable name="pass3">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be the main change in this file, right? The rest looks like minor refactoring.

</xsl:template>

<xsl:template match="tei:elementRef|tei:classRef|tei:macroRef" mode="#default tangle">
<xsl:variable name="prefixedName" select="tei:generateRefPrefix(.)"/>
<xsl:variable name="norMinMax" select="tei:norMinMax(.)"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about calling the variable with a different name? It's confusing to have the result of tei:notMinMax() be called norMinMax

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m afraid I just disagree. I think it is helpful to have the result of tei:norMinMax() be $norMinMax. (There is no chance of confusing the two: one ends with a ‘(’, the other starts with a ‘$’, and there is nothing else that has that name.)

That said, if you really hate it, we could use $norMinMax_result_sequence or some such, I suppose.

odds/teiodds.xsl Show resolved Hide resolved
<xsl:when test="xs:integer( $maxOccurs ) lt $min"><xsl:message terminate="yes">The default value of @maxOccurs is 1. You cannot have a @minOccurs greater than the @maxOccurs.</xsl:message></xsl:when>
<xsl:otherwise><xsl:value-of select="xs:integer( $maxOccurs )"/></xsl:otherwise>
<xsl:when test="xs:integer( $maxOccurs ) lt $min">
<xsl:message terminate="yes">The default value of @maxOccurs is 1. You cannot have a @minOccurs greater than the @maxOccurs.</xsl:message>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this caught by schematron on ODD? Anyway, probably worth checking here are well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, maybe not, but either way it is not part of this PR. (The only thing I did here was add a newline before the <xsl:messsage> so that I could read the thing.)

odds/teiodds.xsl Outdated Show resolved Hide resolved
odds/teiodds.xsl Outdated Show resolved Hide resolved
odds/teiodds.xsl Show resolved Hide resolved
odds/teiodds.xsl Outdated Show resolved Hide resolved
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="tei:sequence | tei:interleave" mode="#default tangle">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a different improvement that the other requested for #633? Was tei:interleave not being processed properly here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand the question as worded, @raffazizzi , but you probably are asking if this change is part of solving this issue (#627) or TEI issue #609. <tei:interleave> was not being processed at all, anywhere. So this change solves the problem discussed in the latter (609), and technically has nothing to do with this issue. (But since it was hard to find and I was in the right place and it is only a 17-character change …)

1) Apparently there are times when the 'c' variable in the template for elementRef|classRef|macroRef boils down to an empty sequence.
2) Handle the case when there is no <datatype> child of the <attDef> being processed.
@sydb
Copy link
Member Author

sydb commented Oct 12, 2023

Read through the code and it makes sense.
. . .
Or am I misunderstanding how this should work?

Oh dear. Looks like the <sequence> is generating that which <alternate> should. (Is it possible I have them reversed? That would be embarrassing.) Looking now …

I have confirmed you are correct, @raffazizzi, I get the same (wrong) results. May take quite a bit longer to figure out why.

@sydb
Copy link
Member Author

sydb commented Oct 15, 2023

Both to keep @raffazizzi and any one else paying attention abreast of what is going on, and to leave some breadcrumbs for myself …

I have tweaked the algorithm for generating RELAX NG output in my local copy of odd2relax.xsl, and the result works on @raffazizzi’s test case, and passes Test2/ (finally — took a bit of doing) but fails on the oddbyexample and test21 cases in Test/. The reason it fails is that for the input

  <sequence minOccurs="0" maxOccurs="1">
    <elementRef key="front"/>
    <classRef key="model.global" minOccurs="0" maxOccurs="unbounded"/>
  </sequence>

my code produces

      <zeroOrMore>
        <ref name="model.global"/>
      </zeroOrMore>
      <group>
        <optional>
          <ref name="front"/>
          <zeroOrMore>
            <ref name="model.global"/>
          </zeroOrMore>
        </optional>
      </group>

whereas the current (released branch) code produces

      <zeroOrMore>
        <ref name="model.global"/>
      </zeroOrMore>
      <optional>
        <group>
          <ref name="front"/>
          <zeroOrMore>
            <ref name="model.global"/>
          </zeroOrMore>
        </group>
      </optional>

These two are exactly the same except that the <optional> and <group> are reversed. But they validate the same set of documents. Heck, trang converts them both to

model.global*, (front, model.global*)?

in the compact syntax. But when the thing that <ref name="front"> is referring to does not exist, the code in odd2relax.xsl that removes unneeded and problematic clauses (I think "pass3") fails to remove one of the model.global* when it is coded with the <group> on the outside. Sigh.

I am currently trying decide whether I should try to

  1. get my code to produce the <group> inside the <optional>, or
  2. get my code to drop the <group> entirely (it serves no purpose), or
  3. to get pass3 to be smarter; or (this last idea just struck me as I wrote this list)
  4. add another pass that converts the code I produce to that which pass3 expects.

@ebeshero
Copy link
Member

@sydb aren't the <group> elements constraining your options? (Are they supposed to, but that isn't working in the RNG conversion?)

@sydb
Copy link
Member Author

sydb commented Nov 5, 2023

I implemented (4), above.
It seems to work:

  • Stylesheets Test/ passes on my local machine
  • Stylesheets Test2/ passes on my local machine
  • Guidelines (dev branch) Test/ passes (against this branch) on my local machine
  • Guidelines (dev branch) build successfully (against this branch) on my local machine

However:

  1. There is some ickyness in the RELAX NG output. There are differences in where the RELAX NG namespace is defaulted and when the prefix is specified, and there are now extraneous <rng:group> elements in the output. They make no change to what files are valid vs invalid, but just add pointless verbosity. I am not sure why they are there at the moment.
  2. It probably deserves more testing.

@raffazizzi raffazizzi removed this from the Release 7.56.0 milestone Nov 9, 2023
@sydb
Copy link
Member Author

sydb commented Mar 9, 2024

So after 4 months there had been no progress whatsoever. Figuring that lots has happened in dev branch during that time, I merged dev into this branch. That took quite a bit of doing — several hand merges, and then quite a bit of fixing stuff up. The result is that I am pretty sure this branch is back to doing what it is supposed to. E.g., this branch is back to passing all 4 tests listed in my previous comment. Moreover, if done soon, it should be easy to merge into dev.

Copy link
Member

@ebeshero ebeshero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just tested this branch (running Stylesheets bin/teitorng) with a simple ODD and TEI file for a personography entry. I added one or more of <anyElement> in either the MathML or egXML namespace to appear after model.personTop, and the output RNG all works as expected. I tested the same thing over on dev branch, and the output is pretty close to the same--only changes are in naming the anyElement I applied. (anyElement-person on dev, and anyElement-person_2 on sydb_672).

From what I can see so far, I think it's safe to merge in...Should we test a few more ODDs? @sydb

@sydb
Copy link
Member Author

sydb commented Mar 11, 2024

I think we should test a few more ODDs that actually exercise the bug the fix addresses. So here is a quick set of instructions on how to do that.

  • Write a little test ODD that has a couple of places where @minOccurs or @minOccurs occur on <sequence>, <alternate>, or <anyElement>. I am calling it “minmaxtest.odd” here, and presuming it is int he /tmp/ directory with a prefix="tst_" on the <schemaSpec>.[1]
  • Make the Stylesheets directory your current working directory, e.g. cd /path/to/TEIC_Stylesheets_repo.
  • git checkout dev
  • bin/teitornc /tmp/minmaxtest.odd /tmp/minmaxtest_dev.rnc
  • git checkout sydb_627
  • bin/teitornc /tmp/minmaxtest.odd /tmp/minmaxtest_627.rnc
  • diff /tmp/minmaxtest_dev.rnc /tmp/minmaxtest_627.rnc

But of course feel free to use RNG instead. Presuming nothing bombs, we expect the 627 output to be at least as correct, if not better than, the dev output. We also expect a few unimportant differences like

(((tst_model.common), tst_model.global*)+,

instead of

((tst_model.common, tst_model.global*)+,

for reasons I have not figured out. But that extra set of parens has no effect on validation, so don’t fret it. (The important bit is that in both cases model.common is plain, model.global has a ‘*’, and the whole thing has a ‘+’.)

Note
[1] I will post one of my little test files and its results in a subsequent post here on the PR soon.

@sydb
Copy link
Member Author

sydb commented Mar 11, 2024

The following is a slightly modified copy of a shell transcript testing this on my machine. I have inserted a line of 34 hash marks between commands to make them easier to differentiate. (Hash mark is a comment in bash, so those lines have no effect.)

$ cd /path/to/TEIC_Stylesheets_repo/
$ ##################################
$ git pull
remote: Enumerating objects: 180, done.        
remote: Counting objects: 100% (180/180), done.        
remote: Compressing objects: 100% (37/37), done.        
remote: Total 180 (delta 145), reused 169 (delta 143), pack-reused 0        
Receiving objects: 100% (180/180), 440.00 KiB | 4.31 MiB/s, done.
Resolving deltas: 100% (145/145), completed with 60 local objects.
From github.com:TEIC/Stylesheets
   cfa97aa7..cc55edf4  dev               -> origin/dev
   91a8ff4d..885f7b54  sydb_627          -> origin/sydb_627
 * [new branch]        tri               -> origin/tri
 * [new branch]        trishaoconnor_548 -> origin/trishaoconnor_548
 * [new branch]        trishaoconnor_584 -> origin/trishaoconnor_584
Updating ef148441..cc55edf4
Fast-forward
 Documentation/teixsl.xml    | 4 ++--
 README.md                   | 4 ++--
 bin/teitorelaxng            | 2 +-
 common/common_core.xsl      | 2 +-
 common/functions.xsl        | 2 +-
 docx/from/paragraphs.xsl    | 2 +-
 docx/from/pass2.xsl         | 2 +-
 docx/to/mml2omml.xsl        | 8 ++++----
 docx/tools/build.xml        | 4 ++--
 fo/fo_textstructure.xsl     | 6 +++---
 html/html_textstructure.xsl | 6 +++---
 11 files changed, 21 insertions(+), 21 deletions(-)
$ ##################################
$ cat /tmp/minmaxtest.odd 
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>minOccurs maxOccurs test</title>
      </titleStmt>
      <publicationStmt>
        <ab>not really published, just a file for debugging stylesheets</ab>
      </publicationStmt>
      <sourceDesc>
        <ab>born digital</ab>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Just a little test schema to exercise <att>minOccurs</att>
      and <att>maxOccurs</att> a wee bit for issue 627, PR 633.</p>
    </body>
    <back>
      <schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="tst" prefix="tst_">
        <gloss>test</gloss>
        <moduleRef key="tei"/>
        <moduleRef key="textstructure"/>
        <moduleRef key="core"/>
        <moduleRef key="header"/>
        <moduleRef key="namesdates"/>
        <elementSpec ident="couplet" mode="add" ns="http://www.example.edu/ns">
          <classes>
            <memberOf key="model.lLike"/>
            <memberOf key="att.global"/>
          </classes>
          <content>
            <sequence minOccurs="2" maxOccurs="2">
              <elementRef key="l"/>
            </sequence>
          </content>
        </elementSpec>
        <elementSpec ident="triplet" mode="add" ns="http://www.example.edu/ns">
          <classes>
            <memberOf key="model.lLike"/>
            <memberOf key="att.global"/>
          </classes>
          <content>
            <sequence minOccurs="3" maxOccurs="3">
              <elementRef key="l"/>
            </sequence>
          </content>
        </elementSpec>
      </schemaSpec>
    </back>
  </text>
</TEI>
$ ##################################
$ git checkout dev
Already on 'dev'
Your branch is up to date with 'origin/dev'.
$ ##################################
$ bin/teitornc /tmp/minmaxtest.odd /tmp/minmaxtest_dev.rnc
WARNING: No localsource set. Will get a copy from /path/to/TEIC_Stylesheets_repo/source/p5subset.xml if necessary.
Convert /tmp/minmaxtest.odd to /tmp/minmaxtest_dev.rnc (tei to rnc) using profile default
     [echo] Do ODD expand processing (schema ) 
Warning: XML resolver not found; external catalogs will be ignored
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.apache.tools.ant.types.Permissions (file:/usr/share/ant/lib/ant.jar)
WARNING: Please consider reporting this to the maintainers of org.apache.tools.ant.types.Permissions
WARNING: System::setSecurityManager will be removed in a future release

BUILD SUCCESSFUL
Total time: 1 second
$ ##################################
$ git checkout sydb_627
branch 'sydb_627' set up to track 'origin/sydb_627'.
Switched to a new branch 'sydb_627'
$ ##################################
$ bin/teitornc /tmp/minmaxtest.odd /tmp/minmaxtest_627.rnc
WARNING: No localsource set. Will get a copy from /path/to/TEIC_Stylesheets_repo/source/p5subset.xml if necessary.
Convert /tmp/minmaxtest.odd to /tmp/minmaxtest_627.rnc (tei to rnc) using profile default
     [echo] Do ODD expand processing (schema ) 
Warning: XML resolver not found; external catalogs will be ignored
WARNING: A terminally deprecated method in java.lang.System has been called
WARNING: System::setSecurityManager has been called by org.apache.tools.ant.types.Permissions (file:/usr/share/ant/lib/ant.jar)
WARNING: Please consider reporting this to the maintainers of org.apache.tools.ant.types.Permissions
WARNING: System::setSecurityManager will be removed in a future release

BUILD SUCCESSFUL
Total time: 1 second
$ ##################################
$ diff -C0 /tmp/minmaxtest_dev.rnc /tmp/minmaxtest_627.rnc 
*** /tmp/minmaxtest_dev.rnc	2024-03-11 10:45:32.751098138 -0400
--- /tmp/minmaxtest_627.rnc	2024-03-11 10:45:53.916055804 -0400
***************
*** 11 ****
! # Schema generated from ODD source 2024-03-11T14:45:32Z. . 
--- 11 ----
! # Schema generated from ODD source 2024-03-11T14:45:53Z. . 
***************
*** 2637 ****
!     (tst_model.choicePart | tst_choice)+,
--- 2637,2639 ----
!     (tst_model.choicePart | tst_choice),
!     (tst_model.choicePart | tst_choice),
!     (tst_model.choicePart | tst_choice)*,
***************
*** 4112 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4114 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4173 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4175 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4189 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4191 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4205 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4207 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4221 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4223 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4237 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4239 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4253 ****
!        | (((tst_model.common), tst_model.global*)+,
--- 4255 ----
!        | ((tst_model.common, tst_model.global*)+,
***************
*** 4268 ****
!      (((tst_model.common), tst_model.global*)+,
--- 4270 ----
!      ((tst_model.common, tst_model.global*)+,
***************
*** 8378 ****
!   element couplet { tst_l, tst_att.global.attributes, empty }
--- 8380 ----
!   element couplet { (tst_l, tst_l), tst_att.global.attributes, empty }
***************
*** 8382 ****
!   element triplet { tst_l, tst_att.global.attributes, empty }
--- 8384,8386 ----
!   element triplet {
!     (tst_l, tst_l, tst_l), tst_att.global.attributes, empty
!   }
$ 

Besides the timestamps and all those extra parens around model.common, the only differences I see are:

  • Around line 2637 (which happens to be the content model of <choice>): in the dev version one or more of model.choicePart | choice is required, whereas in the 627 version two or more of them are required. (I.e., the content model of <choice> is finally expressed correctly.)
  • Around line 8379: in the dev version a <couplet> can only have one <l> child; in the 627 version it can only have two <l> children.
  • Around line 8383: in the dev version a <triplet> can only have one <l> child; in the 627 version it can only have three <l> children.

So I declare this test a success. But it only tested one combination of minOccurs= and maxOccurs= on <sequence>. (Well, also another one on <alternate> if you count the change to <choice>, but pretty much every test has that, as it is broken as is in the Guidelines.)

@ebeshero ebeshero merged commit f33f71e into dev Mar 15, 2024
4 checks passed
@ebeshero ebeshero deleted the sydb_627 branch March 15, 2024 23:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

att.repeatable attributes have no effect on <sequence> or <alternate>
4 participants