fix(clone): clone agentic project documents + resolve exported agentic tools #23
Merged
…c tools
Two gaps surfaced cloning agentic ("Agentic Prompt Studio") projects:
1. Documents were dropped. Agentic projects keep their uploads in their own
store (agentic/documents/), separate from Prompt Studio prompt-documents.
The files phase only iterates the custom_tool remap, so agentic docs were
never cloned. AgenticStudioPhase now clones them per project (download raw
bytes from source, skip names already on target, upload to the target
project), and counts them in the dry-run plan.
2. Exported agentic tools were skipped in workflows. A workflow tool_instance
references a registry id; exported agentic projects register under
agentic_studio_registry, but ToolInstancePhase resolved tool_id only via
prompt_studio_registry. It now falls back to the agentic registry, so the
"Agentic tool API" workflow lands with its tool wired.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS
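The fallback described in item 2 can be sketched as below. This is a minimal illustration, not the code in tool_instance.py: `resolve_tool_id` and the plain-dict remap tables are hypothetical stand-ins for however ToolInstancePhase actually stores its remaps.

```python
from typing import Optional


def resolve_tool_id(
    source_tool_id: str,
    prompt_studio_registry: dict[str, str],
    agentic_studio_registry: dict[str, str],
) -> Optional[str]:
    """Map a source registry id to its target id, or None if unmapped.

    Hypothetical helper: both arguments are source-id -> target-id remaps.
    """
    # Primary lookup first, so Prompt Studio tools behave exactly as before.
    target = prompt_studio_registry.get(source_tool_id)
    if target is not None:
        return target
    # Fallback: exported agentic projects register under the agentic registry.
    return agentic_studio_registry.get(source_tool_id)


prompt_remap = {"ps-src-1": "ps-tgt-1"}
agentic_remap = {"ag-src-1": "ag-tgt-1"}

print(resolve_tool_id("ps-src-1", prompt_remap, agentic_remap))  # ps-tgt-1
print(resolve_tool_id("ag-src-1", prompt_remap, agentic_remap))  # ag-tgt-1
print(resolve_tool_id("unknown", prompt_remap, agentic_remap))   # None
```

Because the agentic lookup only runs when the primary one misses, an id present in both remaps still resolves to its Prompt Studio target.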
| Filename | Overview |
|---|---|
| src/unstract/clone/client.py | Adds five new client methods following the established request/error-handling pattern. |
| src/unstract/clone/phases/agentic_studio.py | Adds document and verified-data clone methods with correct file_strategy/dry-run guards. |
| src/unstract/clone/phases/tool_instance.py | Adds agentic_studio_registry fallback inside the existing lock — thread-safe and correct. |
| tests/clone/test_agentic_studio_phase.py | Eight new test cases covering all critical edge cases. |
| tests/clone/test_tool_instance_phase.py | Adds test verifying the agentic-registry fallback resolve path. |
Reviews (4): Last reviewed commit: "fix(clone): honour skip strategy in _clo..."
Agentic verified-data ("ground truth manually verified by user") is curated
input, not regenerable output, so it must be cloned. AgenticStudioPhase now
re-points each source verified-data row to the cloned target document by
filename and recreates it (skipping docs absent on target and rows already
present). Extracted/comparison data stays uncloned — both regenerate on a
re-run + re-verify.
Also honour file_strategy="skip" in the new document path, matching the files
and lookups phases: under skip, agentic documents are listed and counted as
skipped (operator re-uploads), not transferred.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS
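The re-pointing step in this commit can be illustrated with a small sketch. It assumes a simplified data shape that the PR does not confirm: `repoint_verified_rows`, the `"document"` key, and the three lookup tables are hypothetical names chosen for the example.

```python
from typing import Any


def repoint_verified_rows(
    source_rows: list[dict[str, Any]],
    source_doc_names: dict[str, str],   # source document id -> filename
    target_doc_ids: dict[str, str],     # filename -> target document id
    already_verified: set[str],         # target document ids that already have a row
) -> tuple[list[dict[str, Any]], int]:
    """Return (rows to create on target, number of rows skipped)."""
    to_create: list[dict[str, Any]] = []
    skipped = 0
    for row in source_rows:
        filename = source_doc_names.get(row["document"])
        target_id = target_doc_ids.get(filename) if filename is not None else None
        # Skip when the document never landed on target, or the row already exists.
        if target_id is None or target_id in already_verified:
            skipped += 1
            continue
        # Same payload, but pointing at the cloned target document.
        to_create.append({**row, "document": target_id})
    return to_create, skipped


rows = [
    {"document": "s1", "data": {"total": 10}},
    {"document": "s2", "data": {"total": 20}},  # b.pdf is absent on target
    {"document": "s3", "data": {"total": 30}},  # c.pdf is already verified on target
]
names = {"s1": "a.pdf", "s2": "b.pdf", "s3": "c.pdf"}
targets = {"a.pdf": "t1", "c.pdf": "t3"}

print(repoint_verified_rows(rows, names, targets, already_verified={"t3"}))
# ([{'document': 't1', 'data': {'total': 10}}], 2)
```

Matching by filename rather than by id is what makes the step safe to re-run: document ids differ between orgs, while filenames are the key the document clone itself deduplicates on.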
…tegy
Verified data FKs a document; with file_strategy=skip no docs land on target,
so a dry-run must predict the rows as skipped rather than as creates that the
real run silently drops.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS
_plan_children already forecasts verified-data as skipped under
file_strategy=skip, but the runtime path lacked the matching guard: on a
re-run where documents reached the target by other means, it would create
verified rows the plan said it would skip. Add the early-return guard,
mirroring _clone_documents.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011ja9H1rnSXmPUgQtHm8TNS
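One way to keep the dry-run forecast and the real run from drifting apart is to route both through the same guard. The sketch below shows that idea only; it is not the code in agentic_studio.py. The function names, the result shape, and the `"copy"` strategy value are assumptions made for illustration (only `"skip"` appears in the PR).

```python
from typing import Callable


def verified_data_is_skipped(file_strategy: str) -> bool:
    """Single source of truth for the skip decision (hypothetical helper)."""
    return file_strategy == "skip"


def plan_verified_data(rows: list[dict], file_strategy: str) -> dict[str, int]:
    """Dry-run forecast: count what the real run would do."""
    if verified_data_is_skipped(file_strategy):
        return {"created": 0, "skipped": len(rows)}
    return {"created": len(rows), "skipped": 0}


def clone_verified_data(
    rows: list[dict],
    file_strategy: str,
    create_row: Callable[[dict], None],
) -> dict[str, int]:
    """Real run: the early return mirrors the guard used by the plan."""
    if verified_data_is_skipped(file_strategy):
        return {"created": 0, "skipped": len(rows)}
    for row in rows:
        create_row(row)
    return {"created": len(rows), "skipped": 0}


rows = [{"document": "t1"}, {"document": "t2"}]
written: list[dict] = []

# Under "skip" the runtime result matches the plan and nothing is written,
# even if matching documents already exist on the target.
print(plan_verified_data(rows, "skip"))                   # {'created': 0, 'skipped': 2}
print(clone_verified_data(rows, "skip", written.append))  # {'created': 0, 'skipped': 2}
print(written)                                            # []
```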
Deepak-Kesavan
approved these changes
Jun 23, 2026
What
Completes org-to-org `clone` for cloud Agentic ("Agentic Prompt Studio") projects, which previously cloned only the project shell:
- Agentic documents and verified data are now cloned, honouring `file_strategy="skip"` like the files/lookups phases.
- `tool_instance.tool_id` now resolves against `agentic_studio_registry` in addition to `prompt_studio_registry`.

Why
- Agentic projects keep their uploads in their own store (`agentic/documents/`), distinct from Prompt Studio prompt-documents. The `files` phase only iterates the `custom_tool` remap, so agentic docs were silently dropped: a clone landed the project with zero documents.
- `agentic_verified_data` is "ground truth manually verified by user": curated input (the accuracy baseline), not regenerable output. Without it, the target org can't measure extraction accuracy unless a human re-verifies every doc.
- `tool_instance.tool_id` is a registry id. Exported agentic projects register under `agentic_studio_registry`, but `ToolInstancePhase` resolved only via `prompt_studio_registry`, so an agentic-tool instance found no remap and was skipped (`no registry remap for tool_id …`). The dependent workflow (e.g. "Agentic tool API") then landed with no tool wired.

All three reproduced on a staging org→org run.
How
- `client.py`: `list_agentic_documents`, `download_agentic_document` (raw binary, like `download_lookup_file`), `upload_agentic_document` (multipart to `agentic/projects/{id}/documents/upload/`, the real upload route; the `documents` viewset `upload` action is a backend stub); `list_agentic_verified_data`, `create_agentic_verified_data`.
- `agentic_studio.py`: `_clone_documents` / `_clone_one_document` and `_clone_verified_data` run after schemas, before registry republish; idempotent by filename; honour `max_file_size` and `file_strategy`; dry-run plan counts both.
- `tool_instance.py`: `tool_id` resolves via `prompt_studio_registry` or `agentic_studio_registry`; corrected the misleading "custom tool unpublished" skip message.

Like Prompt Studio uploads, this clones the file + creates the document row; extraction/summary stays a UI step.
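The document path (idempotent by filename, bounded by `max_file_size`, gated by `file_strategy`, counted in dry-run) can be sketched roughly as follows. The real phase goes through the client methods listed above, whose signatures are not shown in this PR, so the sketch swaps in a hypothetical in-memory `FakeOrg` store. The `clone_documents` name, the `"copy"` strategy value, and the count keys are likewise assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeOrg:
    """In-memory stand-in for one org's agentic document store (hypothetical)."""
    docs: dict[str, bytes] = field(default_factory=dict)


def clone_documents(
    source: FakeOrg,
    target: FakeOrg,
    *,
    file_strategy: str = "copy",
    max_file_size: Optional[int] = None,
    dry_run: bool = False,
) -> dict[str, int]:
    """Copy documents missing on target; return created/skipped counts."""
    counts = {"created": 0, "skipped": 0}
    for name, payload in source.docs.items():
        if file_strategy == "skip":
            # Listed and counted only; the operator re-uploads later.
            counts["skipped"] += 1
        elif name in target.docs:
            # Idempotent by filename: never overwrite an existing document.
            counts["skipped"] += 1
        elif max_file_size is not None and len(payload) > max_file_size:
            counts["skipped"] += 1
        else:
            if not dry_run:
                target.docs[name] = payload
            counts["created"] += 1
    return counts


source = FakeOrg({"a.pdf": b"12345", "b.pdf": b"1", "big.pdf": b"x" * 100})
target = FakeOrg({"b.pdf": b"1"})

print(clone_documents(source, target, max_file_size=10, dry_run=True))
# {'created': 1, 'skipped': 2}  (target is left untouched)
print(clone_documents(source, target, max_file_size=10))
# {'created': 1, 'skipped': 2}  (a.pdf now exists on target)
print(clone_documents(source, target, max_file_size=10))
# {'created': 0, 'skipped': 3}  (a re-run is a no-op)
```

The dry run and the real run share one code path and differ only in whether the write happens, which keeps the plan's counts consistent with what a real run would report.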
Can this PR break any existing features?
No. New client methods are additive. The doc/verified-data paths are gated to the cloud-only `AgenticStudioPhase` (probe-skipped on OSS) and add work that previously didn't happen at all. The `tool_instance` change only adds a fallback resolve when the primary lookup misses, so existing Prompt Studio tool instances are unaffected.
Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
AgenticStudioPhase.
Dependencies Versions
Notes on Testing
`tests/clone/` green (192 passing). Added: agentic doc-clone (skip-existing, dry-run count, `file_strategy=skip`), verified-data clone (filename mapping, skip-existing, skip-when-doc-missing, dry-run count), and tool_instance resolve-via-agentic-registry.
Checklist
I have read and understood the Contribution Guidelines.
🤖 Generated with Claude Code