CIME should be able to detect hangs #4553

Open
jgfouca opened this issue Dec 20, 2023 · 1 comment

jgfouca commented Dec 20, 2023

This would be similar to NODE_FAIL_REGEX and MPI_FAIL_REGEX: a new setting named something like HANG_TIMEOUT_SEC. Have a thread in case_run watch the modification timestamps of all the log files. If none has been updated in the last HANG_TIMEOUT_SEC seconds, consider the job hung.
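
A rough sketch of what such a watcher could look like, using only the Python standard library. None of this is existing CIME code: the function names (`newest_mtime`, `is_hung`, `watch_for_hang`), the `on_hang` callback, the polling interval, and the choice to count the watcher's start time as the last activity when no log file exists yet are all assumptions made for illustration.

```python
import os
import threading
import time


def newest_mtime(log_paths):
    """Return the latest modification time among the paths, or None if none exist."""
    newest = None
    for path in log_paths:
        try:
            mtime = os.path.getmtime(path)
        except OSError:
            # The log may not have been created yet, or was moved away.
            continue
        if newest is None or mtime > newest:
            newest = mtime
    return newest


def is_hung(log_paths, hang_timeout_sec, start_time, now=None):
    """True if no log was modified within hang_timeout_sec seconds before `now`.

    start_time is treated as the earliest possible "last activity" so that a
    job whose logs do not exist yet (or are stale leftovers from an earlier
    run) is not flagged the moment the watcher starts. This is an assumption
    of the sketch.
    """
    if now is None:
        now = time.time()
    newest = newest_mtime(log_paths)
    last_activity = start_time if newest is None else max(newest, start_time)
    return (now - last_activity) > hang_timeout_sec


def watch_for_hang(log_paths, hang_timeout_sec, stop_event, on_hang,
                   poll_interval_sec=1.0):
    """Poll the logs until a hang is seen or stop_event is set.

    Returns True if on_hang() was called, False if stopped first.
    """
    start_time = time.time()
    # Event.wait() returns False on timeout and True once the event is set,
    # so it serves as both the sleep and the shutdown check.
    while not stop_event.wait(poll_interval_sec):
        if is_hung(log_paths, hang_timeout_sec, start_time):
            on_hang()
            return True
    return False
```

The caller would start `watch_for_hang` in a `threading.Thread` before launching the model, pass a callback that kills or flags the job, and set the `threading.Event` once the run finishes normally so the thread exits. Polling modification times keeps the check independent of what the logs contain, at the cost of not noticing a job that keeps writing output without making progress.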


This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
