Attribute remote SSH session WSFS activity via command origin #5728
sbauersfeld wants to merge 2 commits
The SSH server bootstrap notebook writes "RemoteSshServer" to /Workspace/.proc/self/metadata/command_origin so workspace-file activity from a remote SSH session is attributed to its own WSFS command origin instead of "PythonDriver". WSFS resolves each request to its leaf-most registered ancestor, so the SSH server subprocess and the shells it spawns inherit this origin.

Best-effort: never blocks server startup if .proc is unavailable.

Pairs with the WsfsOperation.CommandOrigin enum value COMMAND_ORIGIN_REMOTE_SSH_SERVER added in databricks-eng/universe.

Signed-off-by: Scott Bauersfeld <scott.bauersfeld@databricks.com>
Co-authored-by: Isaac
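The best-effort write described in the commit message could look roughly like the sketch below. The path and the origin string are taken from the description; the helper name `set_command_origin`, its boolean return value, and the temp-directory demo are assumptions for illustration, not the code in the bootstrap script.

```python
import os
import tempfile

# Path and value quoted in the PR description.
COMMAND_ORIGIN_PATH = "/Workspace/.proc/self/metadata/command_origin"
REMOTE_SSH_ORIGIN = "RemoteSshServer"


def set_command_origin(origin: str = REMOTE_SSH_ORIGIN,
                       path: str = COMMAND_ORIGIN_PATH) -> bool:
    """Hypothetical helper: write the origin and report success instead of raising."""
    try:
        with open(path, "w") as f:
            f.write(origin)
        return True
    except OSError:
        # Best-effort: a missing or read-only .proc must not block server startup.
        return False


if __name__ == "__main__":
    # Demo against a temporary directory, since /Workspace/.proc only exists on a cluster.
    with tempfile.TemporaryDirectory() as tmp:
        print(set_command_origin(path=os.path.join(tmp, "command_origin")))
        print(set_command_origin(path=os.path.join(tmp, "missing", "command_origin")))
```

Catching `OSError` covers both a missing directory (`FileNotFoundError`) and a non-writable path (`PermissionError`), since both are subclasses of it.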
81e5f5a to cf9aa85
anton-107 left a comment
Reviewed alongside the backend enum PR (databricks-eng/universe#2127479). The mechanism is sound: WSFS registers command origin per-PID and resolves it by walking up the process ancestry, so the single write of RemoteSshServer at the bootstrap notebook PID propagates to the SSH server subprocess and all session shells — which the existing PR_SET_CHILD_SUBREAPER keeps within the bootstrap's process tree. The best-effort try/except OSError is the right call for clusters where the path isn't writable (serverless/shared), and it correctly avoids turning a metadata hiccup into a bootstrap failure.
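To make the inheritance argument concrete, here is a toy model of leaf-most ancestor resolution. It is only an illustration of the idea as described in this review, not WSFS code: the function name, the dictionaries, and the PIDs are all hypothetical.

```python
from typing import Dict, Optional


def resolve_command_origin(pid: int,
                           parent_of: Dict[int, int],
                           registered: Dict[int, str]) -> Optional[str]:
    """Toy model: return the origin of the nearest registered process, starting at pid."""
    seen = set()
    current: Optional[int] = pid
    while current is not None and current not in seen:
        if current in registered:
            return registered[current]
        seen.add(current)  # guard against cycles in the toy parent map
        current = parent_of.get(current)
    return None  # no registered process anywhere on the ancestry path


# Hypothetical process tree: bootstrap (200) -> SSH server (300) -> session shell (400).
parent_of = {400: 300, 300: 200, 200: 1}
registered = {200: "RemoteSshServer"}

# The shell has no registration of its own, so it resolves to the bootstrap's origin.
print(resolve_command_origin(400, parent_of, registered))
```

In this model a single registration at the bootstrap PID is enough, as long as the server and its shells stay inside that process tree.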
Two non-blocking notes:
- Coordinate merge timing with the proto PR: until `COMMAND_ORIGIN_REMOTE_SSH_SERVER` is regenerated and deployed to the WSFS logger, sessions log as `COMMAND_ORIGIN_UNSPECIFIED`. Backward-compatible and self-healing, so order doesn't strictly matter.
- Worth one e2e confirmation when the proto is live: `ssh connect`, write a workspace file, and verify the WSFS log line shows `commandOrigin: RemoteSshServer`.
LGTM.
Changes
The SSH server bootstrap notebook (`experimental/ssh/internal/client/ssh-server-bootstrap.py`) now writes `RemoteSshServer` to `/Workspace/.proc/self/metadata/command_origin` just before launching the SSH server.

Why
The bootstrap runs as a notebook job on the cluster, so without this, all workspace-file (WSFS) activity from a remote SSH session is attributed to the generic `PythonDriver` command origin. Writing a dedicated origin makes that activity attributable in WSFS logs.

Tests
- `python3 -m py_compile` on the bootstrap script (syntax check).
- The `.proc/.../command_origin` write path is exercised server-side by the WSFS `TestMetadataCommandOrigin` unit test, which confirms the file is writable and that the write updates the command's origin.

This pull request and its description were written by Isaac.