Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xmlquery issue when values are set to match another variable #4646

Open
bartgol opened this issue Jun 24, 2024 · 26 comments
Open

xmlquery issue when values are set to match another variable #4646

bartgol opened this issue Jun 24, 2024 · 26 comments
Assignees

Comments

@bartgol
Copy link
Contributor

bartgol commented Jun 24, 2024

I'm getting this error with --listall (or when query the HIST_N variable):

$ ./xmlquery HIST_N
ERROR: Entry HIST_N was listed as type int but value '$STOP_N' is not valid int
$ ./xmlquery --listall
ERROR: Entry HIST_N was listed as type int but value '$STOP_N' is not valid int

Did something change in CIME related to xmlquery? I don't think we recently changed the default for HIST_N, so I'm guessing this is a new thing? I'm using the tag cime6.0.226 (4388509).

@jedwards4b
Copy link
Contributor

This should work and recursively give you the value of STOP_N. I just tried in the latest cime master and it works as expected. (cime6.0.250)

@bartgol
Copy link
Contributor Author

bartgol commented Jun 24, 2024

Ok, then maybe it's simply time to update the cime submodule in our fork. Thanks for checking!

Edit: uhm, I tried to checkout cime6.0.250, and still got the same error. @jgfouca could there be something funky in e3sm?

@jgfouca
Copy link
Contributor

jgfouca commented Jun 24, 2024

It may be E3SM-only. I've assigned @jasonb5

@bartgol
Copy link
Contributor Author

bartgol commented Jun 24, 2024

I may have a suspect. In generic_xml.py, we have

        for m in reference_re.finditer(item_data):
            var = m.groups()[0]
            print (f"found m={m}, var={var}")
            logger.debug("find: {}".format(var))
            # The overridden versions of this method do not simply return None
            # so the pylint should not be flagging this
            # pylint: disable=assignment-from-none
            ref = self.get_value(var, subgroup=subgroup)
            ...

However, the referenced var may be in a different subgroup. E.g., in my case, HIST_N is in the group run_drv_history, while STOP_N is in run_begin_stop_restart. Removing the subgroup arg from the get_value call, makes things work. However, that means that the referenced var must be unique across the XML file, which may be a restriction.

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

I just tried ./xmlquery HIST_N on latest e3sm master with a case built back on May 14 and it works for me.

@jgfouca
Copy link
Contributor

jgfouca commented Jun 24, 2024

@rljacob , try this case: SMS.ne256pg2_ne256pg2.F2010-SCREAMv1

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

Still works. See /home/jacob/E3SM/cime/scripts/Fscream on Chrysalis.

@jgfouca
Copy link
Contributor

jgfouca commented Jun 24, 2024

Now I'm really confused.

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

Is it an old case you're trying to query? If you're doing that with a newer CIME, it may not work.

@jgfouca
Copy link
Contributor

jgfouca commented Jun 24, 2024

I just tried again, making sure i was on lastest master with subs updated, same error. This is on mappy.

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

Oh weird. I tried actually doing "create_test" and going to the casedir and now I see the error.
Try making your case with: ./create_newcase --case Fscream --compset F2010-SCREAMv1 --res ne256pg2_ne256pg2 and going to the casedir it creates.

@jedwards4b
Copy link
Contributor

It will only occur with create_test and not with create_newcase since the setting of
HIST_N=$STOP_N happens in config_tests.xml

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

I went to a test directory created in our overnight testing and it was fine. If it wasn't, all the tests would fail since there would be no coupler history output without changing HIST_N from its default.

@jedwards4b
Copy link
Contributor

I don't think that's the case - you can set the value of HIST_N - the reported problem is that you cannot query the value after setting HIST_N=$STOP_IN

@jedwards4b
Copy link
Contributor

Is that overnight test directory on mappy - where they are apparently having problems? If so can you try ./xmlquery HIST_N in one of those ERS tests?

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

Oh I see what you're saying. Its always the case that HIST_N is set as <HIST_N>$STOP_N</HIST_N> in a test. But ./xmlquery is working in at least one testdir I tried.

@jgfouca
Copy link
Contributor

jgfouca commented Jun 24, 2024

To be clear, all the basics are running fine. It's only xmlquery for specific items that's the problem.

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

No I'm trying Chrysalis. I've seen ./xmlquery HIST_N give an error on a test dir I just created but it works fine in a test dir created by the overnight testing. On the same machine.

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

It must be something about that case and the SMS test. I tried ./create_test --no-build ERP.ne256pg2_ne256pg2.F2010-SCREAMv1 and ./xmlquery HIST_N worked..

@bartgol
Copy link
Contributor Author

bartgol commented Jun 24, 2024

It must be something about that case and the SMS test. I tried ./create_test --no-build ERP.ne256pg2_ne256pg2.F2010-SCREAMv1 and ./xmlquery HIST_N worked..

For ERS/ERP tests, I suspect HIST_N is set to an integer, and not $STOP_N?

@jedwards4b
Copy link
Contributor

Not true: https://github.com/ESMCI/cime/blob/master/CIME/data/config/config_tests.xml#L270

@rljacob
Copy link
Member

rljacob commented Jun 24, 2024

By default, STOP_N is 5 for SMS and 11 for ERS.

@bartgol
Copy link
Contributor Author

bartgol commented Jun 25, 2024

I can confirm that also on my laptop ./xmlquery HIST_N works with an ERP test, but fails with an SMS one. My theory above regarding HIST_N and STOP_N being in different groups must be wrong (or incomplete).

@jasonb5
Copy link
Collaborator

jasonb5 commented Jun 26, 2024

@bartgol The issues is with HIST_N and STOP_N being in different groups. The reason ERS, ERP, IRT, etc continue work is due to it having STOP_N defined in env_test.xml. Any test with <HIST_N>$STOP_N</HIST_N> and <STOP_N>...</STOP_N> defined in config_tests.xml will work still. But anything relying on $STOP_N being populated from somewhere else will fail, e.g. PEA, PEM, PET, etc.

I'm thinking the fix is probably making sure values not provided in config_tests.xml are given a default when the file is generated. Changing the grouping could work too, might be simpler than the first. I don't think we could remove the subgroup from

cime/CIME/case/case.py

Lines 553 to 557 in 67ade4a

item = env_file.get_resolved_value(
item,
allow_unresolved_envvars=allow_unresolved_envvars,
subgroup=subgroup,
)
since it'd probably re-introduce the origin issue from #4572.

@bartgol
Copy link
Contributor Author

bartgol commented Jun 26, 2024

What if we allowed to specify indirect parameters via "scoping"? I'm thinking something like setting HIST_N to $run_begin_stop_restart::STOP_N. We'd still process $-containing strings as params to be resolved, but we'd add an extra step:

        for m in reference_re.finditer(item_data):
            var = m.groups()[0]
            logger.debug("find: {}".format(var))
            # The overridden versions of this method do not simply return None
            # so the pylint should not be flagging this
            # pylint: disable=assignment-from-none
            tokens = var.split('::')
			if len(tokens)==2:
               ref = self.get_value(tokens[0], subgroup=tokens[1])
			else:
               ref = self.get_value(var, subgroup=subgroup)

I used the :: token since I'm thinking about C++ scopes, but any other "safe" string would do (double underscore, @, ...)

Copy link
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants