Reduce duplicate md link check #506

Merged on Dec 4, 2024 (1 commit)

@MonkeyCanCode MonkeyCanCode commented Dec 4, 2024

Description

We use gaurav-nelson/github-action-markdown-link-check@v1 to check Markdown links. According to https://github.com/gaurav-nelson/github-action-markdown-link-check/tree/v1, the max depth defaults to -1, which means all directory levels are checked. As a result, when folder-path lists a directory together with one of its own subdirectories (e.g. regtests and regtests/client/python), the files under the subdirectory are checked multiple times.

Here is the current behavior:

# current logic 
➜  find regtests regtests/client/python/docs regtests/client/python .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' -exec markdown-link-check '{}' --config .github/workflows/check-md-link-config.json -q ';'

# current file count, with the -exec part removed
➜  find regtests regtests/client/python/docs regtests/client/python .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' | wc -l
     515

# current file count after removing duplicate files
➜ find regtests regtests/client/python/docs regtests/client/python .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' | sort | uniq | wc -l
     206

Here is the fixed version:

# new command without the exec part
➜ find regtests .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' | wc -l
     206
# confirm the old (deduplicated) and new file lists are identical
➜ find regtests regtests/client/python/docs regtests/client/python .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' | sort | uniq > 1
➜ find regtests .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' | sort | uniq > 2
➜ diff 1 2
➜ 
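
The effect is easy to reproduce outside the repository. The following is a minimal sketch on a throwaway directory tree; the file names are made up for illustration and only the directory nesting mirrors the regtests case above.

```shell
#!/bin/sh
# Sketch: overlapping `find` roots list the same files more than once.
# The scratch tree below is hypothetical, not the real repository layout.
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/regtests/client/python/docs"
touch "$tmp/regtests/README.md" \
      "$tmp/regtests/client/python/README.md" \
      "$tmp/regtests/client/python/docs/api.md"
cd "$tmp"

# Parent plus nested roots: each nested root re-walks files the parent already found.
overlapping=$(find regtests regtests/client/python regtests/client/python/docs -name '*.md' | wc -l | tr -d ' ')
# Same roots, with duplicate paths removed.
unique=$(find regtests regtests/client/python regtests/client/python/docs -name '*.md' | sort -u | wc -l | tr -d ' ')
# Top-level root only: no duplicates to begin with.
single=$(find regtests -name '*.md' | wc -l | tr -d ' ')

echo "overlapping=$overlapping unique=$unique single=$single"
# prints: overlapping=6 unique=3 single=3
```

Three files are reported six times when the nested roots are listed, which is the same pattern as 515 versus 206 above.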

I also added helm as another directory to check. This is not strictly necessary, since file-path already checks README.md. However, folder-path lists every directory except the ones we want to skip (e.g. site, which is used by Hugo), so adding helm keeps the list consistent.

Lastly, the last element of file-path was missing its separator.
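
To keep folder-path free of overlapping entries in the future, a small check along these lines could be run locally. This is only a sketch: nested_entries is a hypothetical helper (not part of the action), and it assumes directory names contain no spaces or glob characters.

```shell
#!/bin/sh
# Sketch: flag entries in a comma-separated directory list that sit inside
# another entry of the same list. nested_entries is a hypothetical helper.
nested_entries() {
  # $1: comma-separated list, in the same shape as the folder-path input.
  # Split on commas, trim surrounding spaces, and drop trailing slashes.
  list=$(printf '%s\n' "$1" | tr ',' '\n' | sed 's/^ *//; s/ *$//; s:/*$::')
  for dir in $list; do
    for parent in $list; do
      case "$dir" in
        # Match only true path prefixes ("spec" does not cover "spec-extra").
        "$parent"/*) printf '%s is already covered by %s\n' "$dir" "$parent" ;;
      esac
    done
  done
}

nested_entries 'regtests, regtests/client/python/docs, regtests/client/python, .github'
```

For the old list this reports the two regtests subdirectories as redundant; for a list with only top-level directories it prints nothing.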

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I tested this locally via act:

➜ act -j markdown-link-check
...
| Using markdown-link-check configuration file: .github/workflows/check-md-link-config.json
| USE_QUIET_MODE: yes
| USE_VERBOSE_MODE: no
| FOLDER_PATH: regtests, .github, build-logic, polaris-core, polaris-service, extension, spec, k8, getting-started
| MAX_DEPTH: -1
| CHECK_MODIFIED_FILES: no
| FILE_EXTENSION: .md
| FILE_PATH: CHAT_BYLAWS.md, CODE_OF_CONDUCT.md, CONTRIBUTING.md, README.md, SECURITY.md
| + find regtests .github build-logic polaris-core polaris-service extension spec k8 getting-started -name '*.md' -not -path './node_modules/*' -exec markdown-link-check '{}' --config .github/workflows/check-md-link-config.json -q ';'
| + set +x
| + find . -type f '(' -wholename CHAT_BYLAWS.md -o -wholename CODE_OF_CONDUCT.md -o -wholename CONTRIBUTING.md -o -wholename README.md -o -wholename SECURITY.md ')' -not -path './node_modules/*' -exec markdown-link-check '{}' --config .github/workflows/check-md-link-config.json -q ';'
| + set +x
| =========================> MARKDOWN LINK CHECK <=========================
|
| [✔] All links are good!
|
| =========================================================================
[Check Markdown links/markdown-link-check]   ✅  Success - Main gaurav-nelson/github-action-markdown-link-check@v1
[Check Markdown links/markdown-link-check] Cleaning up container for job markdown-link-check
[Check Markdown links/markdown-link-check] 🏁  Job succeeded

Checklist:

Please delete options that are not relevant.

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • If adding new functionality, I have discussed my implementation with the community using the linked GitHub issue


@flyrain flyrain left a comment

+1 pending pipeline

@flyrain flyrain merged commit 3a8d7fe into apache:main Dec 4, 2024
5 checks passed

flyrain commented Dec 4, 2024

Thanks @MonkeyCanCode for the fix!
