Add/update cocoapods and pypi download_url (and other) support to fetchcode/package.py #116

Open
johnmhoran opened this issue Apr 2, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@johnmhoran
Contributor

In connection with a purl2url issue in packageurl-python: we cannot add download URL support for cocoapods or pypi in purl2url.py because the process involves network calls, so we will need to add/update that support in fetchcode/package.py instead.

@johnmhoran
Contributor Author

@pombredanne I have a rather detailed question for you re metadata queries (i.e., using fetchcode/package.py) for a PURL with no version. The issue: missing podspec.json files, which we use to populate the metadata object for each version.

Sometimes the default URL we use, the raw.githubusercontent.com/CocoaPods/Specs... URL from cocoapods.py (with /blob deleted), returns a 404 for one or more of the tags I retrieve for that PURL from its GitHub repo. I currently print a statement to that effect in the console and add it to a new .log file in package.py, which I can access from the metadata command in purlcli.py and include in the headers/errors section of the metadata JSON output. This is simple and clean, but it can leave us with cocoapod tags for which we report no data.

SCTK's packagedcode/cocoapods.py also has a "backup" podspec URL: https://cdn.cocoapods.org/Specs/{hashed_path}/{name}/{version}/{name}.podspec.json. I don't currently use this as a backup in my code but could. FWIW, using sqlite as an example, both the default and the backup URLs return a 404 for the tags I've tried. BTW, these are the tags for sqlite: ['0.1.23', '0.1.22', '0.1.20', '0.1.18', '0.1.16', '0.1.15', '0.1.14', '0.1.13', '0.1.12', '0.1.11', '0.1.10', '0.1.9', '0.1.8', '0.1.7', '0.1.6', '0.1.5', '0.1.4', '0.1.3', '0.1.2', '0.1.1', '0.1.0'].

Finally, a 3rd podspec URL we might have available: sometimes, the cocoapods.org page has a podspec link to a location in the pod's GitHub repo. sqlite is an example: https://github.com/CocoaPods/Specs/blob/master/Specs/4/9/7/SQLite/0.1.23/SQLite.podspec.json. And sometimes that GitHub URL will also return a podspec.json for some of the other tags for that pod -- but not always.

How would you like this search, analysis and reporting process to be handled? For each tag, exhaust all 3 URLs if needed in case one of them has a podspec.json for the tag in the list, and do that for each tag in the list of tags from GitHub (which can be numerous)?
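To make the "exhaust all 3 URLs per tag" option concrete, here is a minimal sketch of that loop. It is not the code in fetchcode/package.py: `find_podspecs`, the `fetch` callable and the shape of the warnings list are all hypothetical names chosen for illustration, and the URL templates are passed in rather than hard-coded. The network call is injected so the logic can be exercised without hitting GitHub or the CDN.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical fetcher type: takes a URL and returns the parsed podspec
# as a dict, or None when the URL does not resolve (e.g., a 404).
Fetcher = Callable[[str], Optional[dict]]


def find_podspecs(
    tags: Iterable[str],
    url_templates: List[str],
    fetch: Fetcher,
    **fields: str,
) -> Tuple[Dict[str, dict], List[str]]:
    """Try each URL template in order for every tag.

    Returns the podspecs found (keyed by tag) and a list of warning
    messages for tags where every candidate URL failed.
    """
    found: Dict[str, dict] = {}
    warnings: List[str] = []
    for tag in tags:
        for template in url_templates:
            # Unused fields are ignored by str.format, so templates may
            # reference any subset of {name}, {hashed_path}, {version}.
            url = template.format(version=tag, **fields)
            podspec = fetch(url)
            if podspec is not None:
                found[tag] = podspec
                break  # stop at the first URL that has a podspec
        else:
            # No template produced a podspec for this tag.
            warnings.append(
                f"No podspec.json found for tag {tag!r} at any candidate URL"
            )
    return found, warnings
```

With N tags and 3 templates this makes at most 3N requests, and the warnings list could be fed into the headers/errors section of the JSON output in the same way the .log file is today.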

@johnmhoran
Contributor Author

Note that the 3rd podspec URL example uses a different hashed_path from that used in the 2 cocoapods.py podspec URLs, e.g.

I don't (yet) know how that latter hashed_path is calculated....

@johnmhoran
Contributor Author

^ It's the familiar MD5 hash we already use. The different result came from applying it to a cocoapod name whose upper/lowercase does not match the "official" name shown on the cocoapods.org page for that pod, e.g., an incorrectly cased input PURL in the PURL CLI metadata command (which of course leads to a different MD5 result ;-)

We now use the cocoapods.org case structure for the hashing and (via the .log file I've added to fetchcode/package.py) disclose that in the headers/warnings section of the JSON output (e.g., "Input PURL name 'afnetworking' processed as 'AFNetworking' per https://cocoapods.org/pods/afnetworking").
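For illustration, the case sensitivity can be sketched as follows. This assumes the hashed path is the first three hex characters of the MD5 digest of the pod name, joined with "/", which is what the `4/9/7` segment in the SQLite URL above suggests; `hashed_path` and `cdn_podspec_url` are hypothetical helper names, not the functions in cocoapods.py or package.py. The URL template is the CDN one quoted earlier in this thread.

```python
import hashlib

# CDN podspec URL template as quoted earlier in this issue.
CDN_TEMPLATE = (
    "https://cdn.cocoapods.org/Specs/"
    "{hashed_path}/{name}/{version}/{name}.podspec.json"
)


def hashed_path(name: str) -> str:
    """Return the first three hex characters of md5(name), '/'-separated.

    The hash is computed on the exact bytes of the name, so 'sqlite'
    and 'SQLite' are hashed as different inputs.
    """
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return "/".join(digest[:3])


def cdn_podspec_url(name: str, version: str) -> str:
    """Build the CDN podspec URL for a correctly cased pod name."""
    return CDN_TEMPLATE.format(
        hashed_path=hashed_path(name), name=name, version=version
    )


if __name__ == "__main__":
    # Compare the path for the input PURL name with the official name.
    for candidate in ("afnetworking", "AFNetworking"):
        print(candidate, "->", hashed_path(candidate))
```

Because the digest depends on the exact casing, the name has to be normalized to the cocoapods.org spelling before hashing, which is what the warning message above reports.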

johnmhoran added a commit that referenced this issue Apr 25, 2024
Reference: #116

Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue May 8, 2024
johnmhoran added a commit that referenced this issue Jun 13, 2024
Reference: #116

Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue Jun 19, 2024
Reference: #116
Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue Jun 19, 2024
Reference: #116
Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue Jun 19, 2024
- We now have a check_package function that can
  load a file and, in the future, regenerate the
  test file(s).

Reference: #116
Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue Jun 19, 2024
johnmhoran added a commit that referenced this issue Jun 19, 2024
Reference: #116
Signed-off-by: John M. Horan <[email protected]>
johnmhoran added a commit that referenced this issue Jun 20, 2024
johnmhoran added a commit that referenced this issue Jul 19, 2024
- Adjusted data output for bitbucket, cargo, npm, pypi and rubygems
  types to return metadata (1) for all versions when the input PURL has
  no version and (2) for just the specified version when the input PURL
  has a version.
Signed-off-by: John M. Horan <[email protected]>