Fix missing git info for bundles deployed from in-workspace Git folders #5709

Open
ilyakuz-db wants to merge 2 commits into main from fix-git-info-new-git-folders

Conversation

@ilyakuz-db commented Jun 24, 2026

Contributor

Changes

When a bundle is deployed from inside the workspace and its source lives in a Git folder, the CLI reads git provenance from get-status?return_git_info=true. The new in-workspace Git-folder model (git-in-dataplane) returns only id and path there (no branch/commit/url), so the CLI recorded empty git info (origin URL, branch, commit).

This parses the git folder id and, when the origin URL is missing, falls back to Repos.GetByRepoId to fill in branch/commit/url. Classic Repos return these inline from get-status and skip the extra call. A failed lookup degrades to best-effort, matching FetchRepositoryInfo's existing contract.

Why

On dataplanes with the new in-workspace Git folders, get-status returns git_info: { id, path } only, while classic Repos return the full object — so in-workspace deploys from a Git folder recorded blank git provenance (including in the deployment metadata service version). The Repos API still returns full provenance for these folders by id. (Reading on-disk .git is not an option here — workspace Repos don't expose a usable .git on the dataplane, which is why the API path exists.)

Tests

Unit tests in libs/git/info_test.go (testserver + dbr.MockRuntime): new Git folder → provenance recovered via Repos API; classic Repo → Repos API not called; Repos lookup failure → graceful degradation. go test ./libs/git/ passes; lint clean.

For acceptance tests, I don't see a proper way to test this right now, since we can't mock DBR.

This PR was written by Claude Code.

@github-actions

Contributor

Approval status: pending

/libs/git/ - needs approval

Files: libs/git/info.go, libs/git/info_test.go
Suggested: @simonfaltum
Also eligible: @Divyansh-db, @renaudhartert-db, @hectorcast-db, @parthban-db, @tanmay-db, @tejaskochar-db, @mihaimitrea-db, @chrisst, @rauchy

General files (require maintainer)

Files: NEXT_CHANGELOG.md
Based on git history:

  • @denik -- recent work in ./, libs/git/

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

Co-authored-by: Isaac
@eng-dev-ecosystem-bot

Collaborator

Integration test report

Commit: 804d517

Run: 28115881609

| Env | 💚 RECOVERED | 🙈 SKIP | ✅ pass | 🙈 skip | Time |
|---|---|---|---|---|---|
| 💚 aws linux | 7 | 13 | 244 | 1024 | 6:35 |
| 💚 aws windows | 7 | 13 | 246 | 1022 | 4:36 |
| 💚 aws-ucws linux | 7 | 13 | 334 | 940 | 7:29 |
| 💚 aws-ucws windows | 7 | 13 | 336 | 938 | 5:52 |
| 💚 azure linux | 1 | 15 | 247 | 1022 | 5:42 |
| 💚 azure windows | 1 | 15 | 249 | 1020 | 5:03 |
| 💚 azure-ucws linux | 1 | 15 | 339 | 936 | 8:27 |
| 💚 azure-ucws windows | 1 | 15 | 341 | 934 | 6:06 |
| 💚 gcp linux | 1 | 15 | 246 | 1024 | 4:28 |
| 💚 gcp windows | 1 | 15 | 248 | 1022 | 5:08 |

20 interesting tests: 13 SKIP, 7 RECOVERED

| Test Name | aws linux | aws windows | aws-ucws linux | aws-ucws windows | azure linux | azure windows | azure-ucws linux | azure-ucws windows | gcp linux | gcp windows |
|---|---|---|---|---|---|---|---|---|---|---|
| 💚 TestAccept | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R | 💚 R |
| 🙈 TestAccept/bundle/invariant/no_drift | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/permissions | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions | 💚 R | 💚 R | 💚 R | 💚 R | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions | 💚 R | 💚 R | 💚 R | 💚 R | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 💚 TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform | 💚 R | 💚 R | 💚 R | 💚 R | | | | | | |
| 🙈 TestAccept/bundle/resources/postgres_branches/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/recreate | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/replace_existing | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/update_protected | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_branches/without_branch_id | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_endpoints/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/postgres_projects/update_display_name | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/synced_database_tables/basic | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/bundle/resources/vector_search_indexes/recreate/embedding_dimension | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |
| 🙈 TestAccept/ssh/connection | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S | 🙈 S |

Top 6 slowest tests (at least 2 minutes):

| duration | env | testname |
|---|---|---|
| 4:10 | aws linux | TestSecretsPutSecretStringValue |
| 3:33 | gcp windows | TestAccept |
| 3:12 | azure-ucws windows | TestAccept |
| 2:45 | azure windows | TestAccept |
| 2:42 | aws windows | TestAccept |
| 2:36 | aws-ucws windows | TestAccept |

Comment thread libs/git/info.go
result.CurrentBranch = gi.Branch
result.WorktreeRoot = fixedPath
} else {
if gi == nil {

Contributor

What happens in the new Git folders case? Does the CLI successfully read GitInfo from the local .git? Can we add some acceptance test coverage for all three cases (new Git folder, classic Repo, and vanilla)? You can mock the API endpoints.

Contributor Author

On DBR this code never reads git info locally; see this guard:

if strings.HasPrefix(path, "/Workspace/") && dbr.RunsOnRuntime(ctx) {

That also makes it harder to cover with acceptance tests, as we can't properly mock the environment.
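To illustrate why this is easier to cover in unit tests than in acceptance tests, here is a small sketch of the same decision as a pure function. `useWorkspaceAPI` is a hypothetical helper, and the `onRuntime` bool stands in for the result of `dbr.RunsOnRuntime(ctx)`; the actual code keeps the check inline as quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

// useWorkspaceAPI mirrors the shape of the guard quoted above. onRuntime is
// passed in as a plain bool (a simplification of dbr.RunsOnRuntime(ctx)) so
// the decision can be checked without running on a cluster.
func useWorkspaceAPI(path string, onRuntime bool) bool {
	return onRuntime && strings.HasPrefix(path, "/Workspace/")
}

func main() {
	// On the runtime and under /Workspace/: git info comes from the API.
	fmt.Println(useWorkspaceAPI("/Workspace/Users/me/bundle-examples", true))
	// Same path on a laptop or in CI: the local .git is read instead.
	fmt.Println(useWorkspaceAPI("/Workspace/Users/me/bundle-examples", false))
}
```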

Contributor Author

I added some unit tests

Contributor Author

With the new Git folders, the API returns only gitInfo.ID.

See the example:

🔴 workspace/get-status doesn't return git provenance for git-in-data-plane folders

  Same repo (github.com/databricks/bundle-examples), same path, two workspaces — different
  result.

  ① AWS staging — git-in-DP folder (object_type: DIRECTORY)

  Request:
  GET /api/2.0/workspace/get-status
        ?path=/Workspace/Users/<user_name>/bundle-examples
        &return_git_info=true
  Response:
  {
    "object_type": "DIRECTORY",
    "git_info": {
      "id":   2884540697170475,
      "path": "/Users/<user_name>/bundle-examples"
    }
  }
  ❌ missing branch, head_commit_id, url, provider

  …yet the data does exist — same id, Repos API:
  GET /api/2.0/repos/2884540697170475
  {
    "branch":         "main",
    "head_commit_id": "d53214e177cd372afa03bfc044be9bb94103ba9a",
    "url":            "https://github.com/databricks/bundle-examples.git",
    "provider":       "gitHub"
  }
  ✅ full provenance
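A minimal Go sketch of how a client could tell the two response shapes apart. Only the JSON keys are taken from the example above; the `gitInfo` and `statusResponse` types and the `needsReposLookup` helper are illustrative assumptions, not the CLI's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gitInfo holds the fields seen in the two responses above. The Go type is
// illustrative; only the JSON keys come from the example.
type gitInfo struct {
	ID           int64  `json:"id"`
	Path         string `json:"path"`
	Branch       string `json:"branch"`
	HeadCommitID string `json:"head_commit_id"`
	URL          string `json:"url"`
	Provider     string `json:"provider"`
}

type statusResponse struct {
	ObjectType string   `json:"object_type"`
	GitInfo    *gitInfo `json:"git_info"`
}

// needsReposLookup reports whether a get-status body carries a repo id but
// no origin URL, which is the id-and-path-only shape shown above.
func needsReposLookup(body []byte) (bool, int64, error) {
	var resp statusResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return false, 0, err
	}
	if resp.GitInfo == nil {
		return false, 0, nil // path is not inside a Git folder at all
	}
	return resp.GitInfo.URL == "" && resp.GitInfo.ID != 0, resp.GitInfo.ID, nil
}

func main() {
	body := []byte(`{"object_type":"DIRECTORY","git_info":{"id":2884540697170475,"path":"/Users/someone/bundle-examples"}}`)
	need, id, err := needsReposLookup(body)
	fmt.Println(need, id, err)
}
```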

Contributor

There must be a way to read the commit ID. Maybe from the .git folder directly, as we do locally?

Contributor

This can be a follow-up to this PR, but isn't capturing Git metadata from Git folders an important use case, since DABs in the workspace are supported there?

Contributor Author

> This can be a follow-up to this PR, but isn't capturing Git metadata from Git folders an important use case, since DABs in the workspace are supported there?

It's important — and this PR already covers it, so it doesn't need to be a followup. To clarify the scope:

  • This only affects bundles deployed from the workspace. Local and CI/CD deploys are unaffected — they read .git directly
  • We already supported two object types: classic Repos and Git folders. Both return the full git_info inline from Workspace get-status API, so they work correctly today
  • The gap is a new Private Preview feature, Git in Dataplane. Objects created with that flag don't return a full git_info from get-status — only git_info.id (the repo id) — and there's no .git directory to read on the Dataplane either
  • This PR handles that case: it takes the id and calls the Repos API (GET /api/2.0/repos/{id}) to recover branch/commit/url. So Git metadata ends up captured for all of them
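The Repos lookup in the last bullet can be sketched with the standard library alone. This is only an illustration of the request shape against a stub server: `getRepo` and `newFakeWorkspace` are hypothetical helpers, authentication is omitted, and the PR itself goes through the SDK's `Repos.GetByRepoId` rather than raw HTTP.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// repoInfo holds the provenance fields returned by the Repos API example.
type repoInfo struct {
	Branch       string `json:"branch"`
	HeadCommitID string `json:"head_commit_id"`
	URL          string `json:"url"`
}

// getRepo issues GET {host}/api/2.0/repos/{id} and decodes the response.
// Authentication headers are left out of this sketch.
func getRepo(client *http.Client, host string, id int64) (repoInfo, error) {
	var info repoInfo
	resp, err := client.Get(fmt.Sprintf("%s/api/2.0/repos/%d", host, id))
	if err != nil {
		return info, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return info, fmt.Errorf("repos lookup failed: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&info)
	return info, err
}

// newFakeWorkspace returns a stub server that knows a single repo id and
// answers with the payload from the example in this thread.
func newFakeWorkspace() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/2.0/repos/2884540697170475", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"branch":"main","head_commit_id":"d53214e177cd372afa03bfc044be9bb94103ba9a","url":"https://github.com/databricks/bundle-examples.git","provider":"gitHub"}`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newFakeWorkspace()
	defer srv.Close()

	info, err := getRepo(srv.Client(), srv.URL, 2884540697170475)
	fmt.Println(info, err)
}
```

An unknown id makes the stub answer 404, so `getRepo` returns an error; that is the situation in which the caller would fall back to the partial git info.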

Contributor Author

I'll check if it's possible to add an acceptance test using RunOnDbr.

3 participants