Skip to content

fix(candidate): use exact dist-info path in get_metadata_for_wheel()#1224

Open
rd4398 wants to merge 1 commit into
python-wheel-build:mainfrom
rd4398:incorrect-metadata
Open

fix(candidate): use exact dist-info path in get_metadata_for_wheel()#1224
rd4398 wants to merge 1 commit into
python-wheel-build:mainfrom
rd4398:incorrect-metadata

Conversation

@rd4398

@rd4398 rd4398 commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

get_metadata_for_wheel() returned the first zip entry matching *.dist-info/METADATA, which could be a vendored package's metadata when a wheel bundles multiple dist-info directories. Derive the correct {name}-{version}.dist-info/METADATA path from the wheel URL instead and read it directly.

Closes: #1146

@rd4398 rd4398 requested a review from a team as a code owner June 29, 2026 19:29
@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@rd4398, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 46 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5f6fb0de-94fb-42ea-a2b5-a8f5bad54623

📥 Commits

Reviewing files that changed from the base of the PR and between 7bd33b7 and d2c0d71.

📒 Files selected for processing (2)
  • src/fromager/candidate.py
  • tests/test_pep658_support.py

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@mergify mergify Bot added the ci label Jun 29, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/fromager/candidate.py`:
- Around line 149-152: The exception handling in the wheel metadata lookup
currently suppresses the original ZipFile member lookup context by using from
None in the KeyError handler. Update the KeyError branch in the candidate
metadata lookup logic to capture the caught exception (for example in the code
path around the metadata_path lookup) and re-raise the ValueError using
exception chaining with from err so the original lookup failure is preserved.
- Around line 163-166: The wheel filename validation in candidate.py is too
permissive, allowing malformed names like pkg-1.0-bad.whl to pass into
_wheel_metadata_path() and later produce misleading errors in
get_metadata_for_wheel(). Tighten the parsing in the URL filename handling so
only valid wheel filenames are accepted before deriving dist-info, and fail fast
with a ValueError for malformed inputs. Use the existing filename parsing logic
around urlparse(url).path and the wheel metadata path derivation in
_wheel_metadata_path() as the place to enforce the stricter check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 24d86223-c631-4e46-8dc4-6fdaaed43579

📥 Commits

Reviewing files that changed from the base of the PR and between cc322cd and 7bd33b7.

📒 Files selected for processing (2)
  • src/fromager/candidate.py
  • tests/test_pep658_support.py

Comment thread src/fromager/candidate.py Outdated
Comment thread src/fromager/candidate.py
`get_metadata_for_wheel()` returned the first zip entry matching
`*.dist-info/METADATA`, which could be a vendored package's metadata
when a wheel bundles multiple dist-info directories. Derive the
correct `{name}-{version}.dist-info/METADATA` path from the wheel
URL instead and read it directly.

Closes: python-wheel-build#1146
Co-Authored-By: Claude <claude@anthropic.com>
Signed-off-by: Rohan Devasthale <rdevasth@redhat.com>
@rd4398 rd4398 force-pushed the incorrect-metadata branch from cbbae11 to d2c0d71 Compare June 29, 2026 19:42
Comment thread src/fromager/candidate.py
return Metadata.from_email(metadata_content, validate=validate)


def _wheel_metadata_path(url: str) -> str:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using parse_wheel_filename() from packaging.utils instead of manually parsing the wheel filename. Note that it returns canonicalized names, so use filename.split("-", 1)[0] for the dist name (needed for case-sensitive ZIP lookups):


filename = posixpath.basename(urlparse(url).path)
_, dist_version, _, _ = parse_wheel_filename(filename)
dist_name = filename.split("-", 1)[0]
return f"{dist_name}-{dist_version}.dist-info/METADATA"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fromager has four implementations of mapping a wheel filename to a dist-info directory. The PR is adding the third implementation of extracting dist name and version from a wheel filename. I would prefer to have a single implementation of these algorithms.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

get_metadata_for_wheel() may return wrong metadata

3 participants