Don't extract suffix-shaped parenthesized/quoted content as nicknames by derek73 · Pull Request #189 · derek73/python-nameparser

derek73 · 2026-07-01T07:08:05Z

Summary

parse_nicknames() used to extract everything inside parens/quotes as a nickname unconditionally, so suffix-like content such as (Ret), (Jr.), (MBA) was wrongly classified as a nickname instead of a suffix (Check for known suffixes when processing nicknames #111).
Added SUFFIX_ACRONYMS_AMBIGUOUS (ed, jd) as a small standalone exception list for acronym suffixes that also plausibly collide with a common given-name nickname, wired into Constants as a plain SetManager attribute. Existing SUFFIX_ACRONYMS/suffix_acronyms are otherwise unchanged (no API break).
Moved '(ret)'/'(vet)' out of SUFFIX_ACRONYMS into SUFFIX_NOT_ACRONYMS as bare 'ret'/'vet' — those literal-paren entries were a Wikipedia-import artifact and effectively dead code under the old unconditional-extraction behavior.
Rewrote parse_nicknames() to use a per-match callback: suffix-shaped or period-terminated content is left in place (undelimited) for normal downstream parsing instead of being routed to nickname_list. JEFFREY (JD) BRICKEN still parses JD as a nickname by default (regression guard), since jd is in the ambiguous set.
Also fixed an unrelated, pre-existing bug caught during review: a missing comma between 'msc' and 'mscmsm' in SUFFIX_ACRONYMS caused Python's implicit string-literal concatenation to silently merge them into a bogus 'mscmscmsm' entry.

Known, documented limitation (not fixed, out of scope): parse_full_name's no-comma suffix detection only recognizes a trailing run of suffix-shaped pieces. A suffix-shaped word freed from parens by this fix only lands in suffix when it's already in a trailing/comma position in the source string — e.g. "Andrew Perkins (MBA)" → suffix MBA, but "Lon (Jr.) Williams" → Jr. lands in middle, identical to how the bare string "Lon Jr. Williams" already parses on master. This fix's guarantee is that the delimited and undelimited forms of a name now parse identically — not that a freed suffix always reaches suffix_list regardless of position.

Test plan

pytest -q — 978 passed, 4 skipped, 22 xfailed, no failures
New regression/behavior tests for suffix-shaped content in parens, single quotes, and double quotes
Regression test confirming JD stays a nickname by default
Behavior tests confirming suffix_acronyms_ambiguous is customizable (adding/removing entries flips classification)

Closes #111

…_ACRONYMS

…names (#111)

Includes the corrected scope-boundary note discovered during Task 2 implementation: parse_full_name's suffix detection only recognizes a trailing run of suffix-shaped pieces, so mid-name content freed from parens/quotes doesn't always land in suffix -- documented as a known, pre-existing limitation rather than worked around.

Addresses code-review minor findings on the parse_nicknames rewrite: comments explaining why is_suffix() isn't reused directly and why handle_match is shared across all three delimiter regexes, plus test coverage for suffix-shaped content in single/double quotes (previously only parenthesis was covered).

Python's implicit string-literal concatenation silently merged 'msc' and 'mscmsm' into a single bogus 'mscmscmsm' entry in SUFFIX_ACRONYMS, dropping both real entries. Caught during code review of an unrelated change; unrelated to issue #111 but small and low-risk enough to fix alongside it.

A parse-behavior test on two specific acronyms doesn't generalize to catch other missing-comma bugs in SUFFIX_ACRONYMS; it only documents this one instance. Not the right way to guard against this class of bug.

Design specs and implementation plans are working notes, not published documentation. Keep them on disk locally but stop tracking them in git.

…ACRONYMS Easier to find right after SUFFIX_NOT_ACRONYMS instead of scrolling past the large acronym list.

…knames handle_match() checked stripped (lc()-only, leading/trailing periods stripped) against suffix_acronyms, unlike is_suffix() which also strips internal periods. This meant an acronym suffix like "M.D" (periods between letters, no trailing period) fell through to nickname_list instead of being recognized as a suffix. Found during PR review by the pr-test-analyzer subagent. Also adds the release_log.rst entries for this fix, per CLAUDE.md's maintenance requirement, flagged by the code-reviewer subagent.

…ACRONYMS Without this, a future maintainer diffing suffixes.py in isolation could read the (ret)/(vet) removal as an accidental regression and re-add the parenthesized form, silently reintroducing the #111 bug. Flagged by the comment-analyzer subagent during PR review.

nameparser/config/__init__.py's suffix_acronyms_ambiguous attribute (import, docstring, type hint, constructor param, and __init__ assignment -- all from Task 1, commit 135395b) was accidentally reverted by commit c2efbd8. Root cause: a review subagent was running concurrently in this same shared worktree directory and appears to have checked out an older commit in place (rather than in an isolated copy) to compare pre/post-PR behavior for its "verified against unmodified master" claims; that checkout's staged state got swept into c2efbd8's commit alongside my own staged changes, since `git commit` includes everything staged, not just newly `git add`ed paths. The working tree itself was never wrong -- parser.py's handle_match() has referenced self.C.suffix_acronyms_ambiguous correctly this whole time, and pytest passed after every commit since the corruption, because pytest reads the filesystem, not git history. Only the *committed* nameparser/config/__init__.py (and, transiently, the already-pushed PR branch) lacked the attribute -- a fresh clone of c2efbd8..4e91b7a would have raised AttributeError on any name containing parens/quotes. Verified via `git show HEAD:...` and `git diff 135395b HEAD -- nameparser/config/__init__.py` that this commit is the only remaining gap; suffixes.py, parser.py, and the tests were independently re-added correctly in later commits. Also includes docs/usage.rst and docs/customize.rst updates documenting the new suffix_acronyms_ambiguous behavior and SUFFIX_ACRONYMS_AMBIGUOUS constant.

derek73 added 8 commits June 30, 2026 23:32

config: add SUFFIX_ACRONYMS_AMBIGUOUS, move (ret)/(vet) to SUFFIX_NOT…

135395b

…_ACRONYMS

fix: don't extract suffix-shaped parenthesized/quoted content as nick…

8995b2b

…names (#111)

test: remove test_msc_and_mscmsm_suffix_acronyms

7c91594

A parse-behavior test on two specific acronyms doesn't generalize to catch other missing-comma bugs in SUFFIX_ACRONYMS; it only documents this one instance. Not the right way to guard against this class of bug.

chore: untrack docs/superpowers planning docs

e3ffbc8

Design specs and implementation plans are working notes, not published documentation. Keep them on disk locally but stop tracking them in git.

refactor: move SUFFIX_ACRONYMS_AMBIGUOUS above the ~530-entry SUFFIX_…

30d478b

…ACRONYMS Easier to find right after SUFFIX_NOT_ACRONYMS instead of scrolling past the large acronym list.

derek73 self-assigned this Jul 1, 2026

derek73 added this to the v1.3.0 milestone Jul 1, 2026

derek73 added 3 commits July 1, 2026 00:13

derek73 merged commit bb02739 into master Jul 1, 2026
8 checks passed

derek73 deleted the worktree-suffix-in-parens branch July 1, 2026 07:40

derek73 added the enhancement label Jul 2, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't extract suffix-shaped parenthesized/quoted content as nicknames#189

Don't extract suffix-shaped parenthesized/quoted content as nicknames#189
derek73 merged 11 commits into
masterfrom
worktree-suffix-in-parens

derek73 commented Jul 1, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

derek73 commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

derek73 commented Jul 1, 2026 •

edited

Loading