feat: hotdata 0.4.1 + typed error handling + ruff/mypy tooling (v0.3.0)#25
Conversation
Adopt the hotdata 0.4.1 SDK and add a typed error-handling public API. - New error-handling exports: HotdataError, HotdataTerminalError, HotdataTransientError, classify_sdk_error (errors.py) and ManagedDatabaseClient (managed_client.py). - Ship py.typed so consumers pick up inline types. - Pin hotdata>=0.4.1; refresh uv.lock (root version 0.3.0). - Add ruff + mypy tooling config and dev deps (ruff>=0.5, mypy); apply ruff lint/format cleanup across the package. - Update test fixtures for 0.4.1 (QueryResponse fields) and the __all__ contract list. - Bump version to 0.3.0 with matching CHANGELOG entry. Verified: 53 tests pass, ruff clean, uv sync --locked OK on linux/amd64.
db1c8bb to
c1dbbcf
Compare
| class ManagedDatabaseClient: | ||
| """Managed-database client with bounded retries over hotdata-runtime. | ||
|
|
||
| This is the shared client used by Hotdata adapter packages (Airflow, | ||
| dlt, etc.). It wraps the lower-level RuntimeClient with retry logic, | ||
| Arrow-based result fetching, and convenience helpers for the managed | ||
| database lifecycle. | ||
| """ |
There was a problem hiding this comment.
nit: (not blocking) This PR introduces substantial new behavior — classify_sdk_error's status-code → transient/terminal mapping, _request_with_retry's backoff/attempt loop, and the async polling in _query_database_scoped/_wait_result_ready — but the only test touching the new surface is the __all__ contract assertion in test_contract.py. None of the retry, classification, or polling logic is exercised.
Since these are public exports that two downstream packages (hotdata-airflow, hotdata-dlt-destination) will depend on, please add unit tests covering at least: each branch of classify_sdk_error (TimeoutError/ConnectionError/ApiException 4xx vs 5xx vs retryable codes), that _request_with_retry retries only on HotdataTransientError and stops at max_retries, and the succeeded/failed/timeout paths of the pollers.
| if not self.table_is_synced(database, table, schema=schema): | ||
| return None | ||
| db = self._runtime.resolve_managed_database(database) | ||
| sql = f'SELECT * FROM "default"."{schema}"."{table}"' |
There was a problem hiding this comment.
nit: (not blocking) schema and table are interpolated directly into the SQL string with only static double-quoting. A name containing a " (e.g. from a dlt-generated table name) would break out of the quoted identifier — a SQL-injection / malformed-query vector. Consider validating the identifiers against an allowed pattern, or escaping embedded quotes (schema.replace('"', '""')) before interpolation.
| def _request_with_retry(self, operation: Callable[[], T]) -> T: | ||
| for attempt in range(1, self._max_retries + 1): | ||
| try: | ||
| return operation() | ||
| except Exception as error: | ||
| mapped_error = classify_sdk_error(error.__cause__ or error) | ||
| if isinstance(mapped_error, HotdataTransientError) and attempt < self._max_retries: | ||
| backoff = min(self._retry_backoff_seconds * attempt, self._MAX_BACKOFF_SECONDS) | ||
| time.sleep(backoff) | ||
| continue | ||
| raise mapped_error from error | ||
| raise RuntimeError("No retry attempts configured") |
There was a problem hiding this comment.
super nit: (not blocking) Two things on the retry loop:
max_retriesis really "max attempts" — the first call consumes attempt 1, somax_retries=3yields 3 total tries (0 retries means... ). Worth a docstring/param note so callers don't off-by-one their config.- If a caller passes
max_retries=0, the loop body never runs and every call raisesRuntimeError("No retry attempts configured")instead of executing the operation once. Consider guarding against< 1in__init__, or treating 0 as "one attempt, no retries".
| @@ -0,0 +1,31 @@ | |||
| from __future__ import annotations | |||
|
|
|||
| from hotdata.rest import ApiException | |||
There was a problem hiding this comment.
super nit: (not blocking) client.py imports ApiException from hotdata.exceptions, while here it's imported from hotdata.rest. They resolve to the same class today (rest re-exports it), but using one canonical import path across the package avoids confusion if the SDK ever changes the re-export.
There was a problem hiding this comment.
Reviewed. No blocking issues. Left a few non-blocking notes: missing test coverage for the new retry/classification/polling logic, SQL identifier interpolation in fetch_table, max_retries semantics/zero-value edge, and an ApiException import-path inconsistency. None block merge.
Summary
Rebased cleanly onto fresh
origin/main(v0.2.4 + dependabot/CI) and redone. Adopts the hotdata 0.4.1 SDK and adds a typed error-handling public API.What changed
HotdataError,HotdataTerminalError,HotdataTransientError,classify_sdk_error(hotdata_runtime/errors.py), andManagedDatabaseClient(hotdata_runtime/managed_client.py).hotdata_runtime/py.typedso downstream consumers pick up inline type info.hotdata>=0.4.1; refreshuv.lock.ruff>=0.5,mypy); apply ruff lint/format cleanup across the package. mypy is intentionally not wired into CI.QueryResponsefields) and the__all__contract list.Version
New public exported symbols = a feature, so 0.3.0 (bumped from main's released 0.2.4). pyproject version, uv.lock root version, and CHANGELOG all set to 0.3.0.
Verification
uv run pytest -q-> 53 passeduv run ruff check .-> cleanuv sync --lockedon linux/amd64 (uv 0.11.23) -> LOCKED_OK (avoids the--lockedCI trap)scripts/check-release.py-> "release metadata ok for 0.3.0"