build: prepare PyAirbyte slim package boundaries #1035

Draft
Aaron ("AJ") Steers (aaronsteers) wants to merge 8 commits into main from devin/1779853894-airbyte-slim-packaging

@aaronsteers
Member

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented May 27, 2026

Summary

This draft PR starts the PyAirbyte multi-package split requested by AJ by preparing import, dependency, and build boundaries for a new airbyte-slim distribution. It keeps the current airbyte package behavior intact while reducing eager imports of full-package-only modules, defaulting Python connector execution to uv-managed Python 3.12 with 3.11 fallback, and building matching airbyte and airbyte-slim artifacts from the same version tag.

The slim distribution is now a first-class checked-in package project at packages/airbyte-slim/pyproject.toml, using the shared airbyte/ source tree with declarative Hatch wheel/sdist excludes instead of a generated pyproject build script. It publishes package metadata as airbyte-slim, targets Python >=3.14,<3.15, excludes local execution, MCP, CLI, cache, source, destination, and SQL processor modules, and supports runtime version lookup from either the full or slim distribution metadata.

The PR also moves deterministic datetime parsing into airbyte._util.datetime_parse.parse_datetime with focused unit coverage, so cloud sync result parsing no longer depends on CDK datetime helpers.

Review & Testing Checklist for Human

  • Review packages/airbyte-slim/pyproject.toml, packages/airbyte-slim/airbyte, and .github/workflows/release_drafter.yml to confirm the first-class slim package layout and release build path should attach both airbyte and airbyte-slim wheels/sdists from one tag.
  • Review the module-level __getattr__ lazy export boundary in airbyte/__init__.py; this follows PEP 562 and keeps full-only imports deferred until their top-level names are accessed.
  • Review the uv dependency groups in pyproject.toml and uv.lock; this PR models slim, full, and dev groups for lockfile and packaging boundaries.
  • Verify the Python connector venv fallback behavior matches expectations for connector execution environments.
  • Review the stdlib datetime parsing helper and test cases to confirm the accepted timestamp formats are sufficient for Airbyte Cloud job payloads.
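For the venv fallback item in the checklist above, the expected behavior can be sketched roughly as follows. This is a minimal illustration, not the PR's `VenvExecutor` code: `create_venv_with_fallbacks`, `VenvCreationError`, and `fake_create_venv` are hypothetical names, and the real implementation raises `AirbyteConnectorInstallationError` and shells out to uv.

```python
from __future__ import annotations

from typing import Callable, Sequence


class VenvCreationError(RuntimeError):
    """Hypothetical stand-in for AirbyteConnectorInstallationError."""


def create_venv_with_fallbacks(
    create_venv: Callable[[str], None],
    python_versions: Sequence[str] = ("3.12", "3.11"),
) -> str:
    """Try each Python version in order and return the first one that works."""
    failures: list[str] = []
    for version in python_versions:
        try:
            create_venv(version)
        except Exception as ex:
            # Keep every failure so the final error explains all attempts.
            failures.append(f"Python {version}: {ex}")
        else:
            return version
    raise VenvCreationError(
        "Could not create a virtual environment:\n" + "\n".join(failures)
    )


def fake_create_venv(version: str) -> None:
    # Simulated creator for the demo: pretend only Python 3.11 is available.
    if version != "3.11":
        raise RuntimeError("interpreter not found")


selected = create_venv_with_fallbacks(fake_create_venv)
print(selected)
```

Injecting the creator as a callable keeps the ordering and error-aggregation logic testable without a real uv installation.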

Recommended end-to-end release artifact check:

  • Run UV_DYNAMIC_VERSIONING_BYPASS=0.46.0.dev0 uv build.
  • Run UV_DYNAMIC_VERSIONING_BYPASS=0.46.0.dev0 uv build packages/airbyte-slim --out-dir dist.
  • Confirm dist/ contains matching airbyte and airbyte_slim wheel/sdist artifacts.
  • Inspect the airbyte_slim wheel metadata for Name: airbyte-slim and Requires-Python: <3.15,>=3.14.
  • Install the slim wheel on Python 3.14 and verify import airbyte plus airbyte.version.get_version().
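The last check above depends on version lookup working whether the full or the slim distribution is installed. A minimal sketch of that idea, assuming a lookup order and a default value that the PR does not specify (`get_installed_version` is a hypothetical name; the real `airbyte.version.get_version` may differ):

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def get_installed_version(
    distributions: tuple[str, ...] = ("airbyte", "airbyte-slim"),
    default: str = "0.0.0",
) -> str:
    """Return the version of the first installed distribution in the list."""
    for name in distributions:
        try:
            return version(name)
        except PackageNotFoundError:
            # This distribution is not installed; try the next candidate.
            continue
    # Assumption: fall back to a placeholder when neither is installed.
    return default


print(get_installed_version())
```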

Notes

Local validation run so far:

  • uv run --project /home/ubuntu/repos/PyAirbyte ruff check /home/ubuntu/repos/PyAirbyte/airbyte/version.py /home/ubuntu/repos/PyAirbyte/packages/airbyte-slim/pyproject.toml /home/ubuntu/repos/PyAirbyte/pyproject.toml
  • TOML parsing check for root pyproject.toml and packages/airbyte-slim/pyproject.toml.
  • uv run --project /home/ubuntu/repos/PyAirbyte pyrefly check /home/ubuntu/repos/PyAirbyte/airbyte/version.py
  • uv run --project /home/ubuntu/repos/PyAirbyte pytest /home/ubuntu/repos/PyAirbyte/tests/unit_tests/test_datetime_parse.py /home/ubuntu/repos/PyAirbyte/tests/unit_tests/test_cloud_api_util.py
  • UV_DYNAMIC_VERSIONING_BYPASS=0.46.0.dev0 uv build /home/ubuntu/repos/PyAirbyte --out-dir /home/ubuntu/repos/PyAirbyte/dist-check && UV_DYNAMIC_VERSIONING_BYPASS=0.46.0.dev0 uv build /home/ubuntu/repos/PyAirbyte/packages/airbyte-slim --out-dir /home/ubuntu/repos/PyAirbyte/dist-check
  • Verified dist-check/ contained matching airbyte and airbyte_slim wheel/sdist artifacts.
  • Verified the airbyte_slim wheel and sdist exclude CLI, MCP, caches, sources, destinations, and SQL processor modules.
  • Verified the airbyte_slim wheel metadata has Name: airbyte-slim, Version: 0.46.0.dev0, and Requires-Python: <3.15,>=3.14.
  • uv run --python 3.14 --with /home/ubuntu/repos/PyAirbyte/dist-check/airbyte_slim-0.46.0.dev0-py3-none-any.whl python -c "import airbyte; from airbyte.version import get_version; print(airbyte.__name__, get_version())"
  • uv run --project /home/ubuntu/repos/PyAirbyte poe -C /home/ubuntu/repos/PyAirbyte build-slim
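The wheel metadata checks in the list above can be scripted with the standard library alone. The sketch below is one way to do it, not the procedure used in this PR; `read_wheel_metadata` is a hypothetical helper, and the demo builds a throwaway wheel-shaped archive using the field values reported above rather than reading a real build artifact.

```python
from __future__ import annotations

import tempfile
import zipfile
from email.parser import Parser
from pathlib import Path


def read_wheel_metadata(wheel_path: Path) -> dict[str, str | None]:
    """Read selected core metadata fields from a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        metadata_name = next(
            name for name in wheel.namelist() if name.endswith(".dist-info/METADATA")
        )
        # Core metadata uses an email-header style format.
        message = Parser().parsestr(wheel.read(metadata_name).decode("utf-8"))
    return {key: message[key] for key in ("Name", "Version", "Requires-Python")}


# Demo on a throwaway archive so the sketch runs without a real build.
with tempfile.TemporaryDirectory() as tmp:
    demo_wheel = Path(tmp) / "airbyte_slim-0.46.0.dev0-py3-none-any.whl"
    with zipfile.ZipFile(demo_wheel, "w") as archive:
        archive.writestr(
            "airbyte_slim-0.46.0.dev0.dist-info/METADATA",
            "Metadata-Version: 2.4\n"
            "Name: airbyte-slim\n"
            "Version: 0.46.0.dev0\n"
            "Requires-Python: <3.15,>=3.14\n",
        )
    metadata = read_wheel_metadata(demo_wheel)

print(metadata)
```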

Latest CI snapshot before the first-class package pivot was commit 9cbc54d: 20 passed, 0 failed, 0 pending. CI for latest commit bb99bc2 is running.

Current dependency compatibility finding: target Python 3.14 for the first airbyte-slim release. Python 3.15 metadata resolution works, but install reliability is blocked today by missing or source-build-failing wheels for compiled dependencies such as pydantic-core, with cryptography wheel availability also still catching up.

Link to Devin session: https://app.devin.ai/sessions/e8366a4b403b4b9daf05e812c50696b4
Requested by: Aaron ("AJ") Steers (@aaronsteers)

@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1779853894-airbyte-slim-packaging' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1779853894-airbyte-slim-packaging'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI

Helpful Resources

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

@github-actions

github-actions Bot commented May 27, 2026

PyTest Results (Fast Tests Only, No Creds)

476 tests  +9   476 ✅ +9   6m 0s ⏱️ -7s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit bb99bc2. ± Comparison against base commit 8427840.

♻️ This comment has been updated with latest results.

@github-actions

github-actions Bot commented May 27, 2026

PyTest Results (Full)

546 tests  +9   528 ✅ +9   24m 28s ⏱️ +4s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit bb99bc2. ± Comparison against base commit 8427840.

♻️ This comment has been updated with latest results.

Comment thread airbyte/__init__.py Fixed
Comment thread airbyte/cloud/workspaces.py Fixed
Comment thread airbyte/constants.py
"""


DEFAULT_CONNECTOR_PYTHON_VERSIONS: tuple[str, ...] = tuple(
@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Modernizes connector execution to try configurable Python versions, centralizes venv creation, migrates datetime utilities to stdlib with a new parse_datetime, implements lazy top-level exports, relaxes cloud destination typing, and adds slim/full dependency groups and slim build tooling.

Changes

Core Library Infrastructure Improvements

| Layer | File(s) | Summary |
| --- | --- | --- |
| Environment-configurable Python versions for connectors | airbyte/constants.py | DEFAULT_CONNECTOR_PYTHON_VERSIONS tuple is computed from AIRBYTE_CONNECTOR_PYTHON_VERSIONS env var, defaulting to 3.12,3.11 for use in connector venv fallback logic. |
| Python executor venv creation with version fallbacks | airbyte/_executors/python.py, airbyte/_executors/util.py | VenvExecutor centralizes venv command construction in _create_venv and adds _create_venv_with_fallbacks to attempt multiple Python versions when use_python=True. The executor factory now sets use_python=True for PYTHON install type and install() uses the fallback helper; failures aggregate logs into AirbyteConnectorInstallationError. |
| Module-level lazy loading of top-level exports | airbyte/__init__.py | Switches from eager imports to runtime-resolved lazy exports via __getattr__ and _LAZY_EXPORTS, caching resolved symbols in module globals and retaining TYPE_CHECKING imports for static analysis. |
| Standardize datetime handling with parse_datetime | airbyte/_util/datetime_parse.py, airbyte/_message_iterators.py, airbyte/cloud/sync_results.py, airbyte/logs.py, airbyte/progress.py, tests/unit_tests/test_datetime_parse.py | Adds parse_datetime utility and replaces ab_datetime_now/ab_datetime_parse usages with stdlib datetime/timezone and parse_datetime across trace messages, SyncResult parsing, daily log naming, and progress/log timestamps. Includes unit tests for the new parser. |
| Type flexibility and local imports in cloud modules | airbyte/cloud/workspaces.py, airbyte/cloud/sync_results.py, tests/unit_tests/test_cloud_api_util.py | Adds internal _DestinationLike Protocol and widens CloudWorkspace.deploy_destination to accept protocol instances; copies dict configs or uses _hydrated_config for objects and validates destinationType; moves heavy imports into method scope and uses forward references for CachedDataset. Adds tests verifying deploy_destination validation and payload forwarding. |
| Organize project dependencies into slim and full groups | pyproject.toml | Adds dependency-groups.slim (minimal runtime deps) and dependency-groups.full (includes integrations), updates dev to include full, and adds a poe task build-slim. |
| Slim package build script | scripts/build_slim_package.uv | Adds a script to generate a slim pyproject.toml, stage a build tree excluding incompatible packages, and run uv build to produce airbyte-slim artifacts (CLI and main provided). |
| Release workflow update | .github/workflows/release_drafter.yml | Release workflow "Build package" step now runs the main build and the slim build script, passing the same dynamic version override. |
| Unit tests for datetime parsing and deploy_destination | tests/unit_tests/* | Adds tests for parse_datetime covering multiple input forms and errors, and tests for CloudWorkspace.deploy_destination validation and payload forwarding. |
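The environment-driven version list described for airbyte/constants.py could be computed along these lines. This is a sketch under assumptions: only the variable name AIRBYTE_CONNECTOR_PYTHON_VERSIONS and the 3.12,3.11 default come from the summary above; the helper name `connector_python_versions`, the comma splitting, and the whitespace handling are illustrative guesses.

```python
from __future__ import annotations

import os
from typing import Mapping


def connector_python_versions(
    env: Mapping[str, str] | None = None,
    default: tuple[str, ...] = ("3.12", "3.11"),
) -> tuple[str, ...]:
    """Parse a comma-separated version list, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("AIRBYTE_CONNECTOR_PYTHON_VERSIONS", "")
    # Drop empty entries so stray commas or spaces do not yield "" versions.
    versions = tuple(part.strip() for part in raw.split(",") if part.strip())
    return versions or default


print(connector_python_versions({}))
print(connector_python_versions({"AIRBYTE_CONNECTOR_PYTHON_VERSIONS": "3.13, 3.12"}))
```

Accepting the environment as a parameter keeps edge cases (unset, empty, trailing commas) easy to test without mutating `os.environ`.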

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Would you like me to run a quick checklist of likely follow-ups (e.g., verify uv behavior across platforms or confirm environment variable parsing edge cases), wdyt?

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 61.76% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'build: prepare PyAirbyte slim package boundaries' directly and accurately reflects the main objective of the PR: setting up infrastructure for a slimmed-down PyAirbyte distribution with proper import and dependency boundaries. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai Bot (Contributor) left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/cloud/workspaces.py (1)

441-449: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Could we validate destinationType for both dict and destination-like inputs?

Right now the destinationType check only runs in the non-dict branch, so raw dict inputs can slip through without early validation and fail later in API calls. Could we move the validation after the branch so both paths are validated, wdyt?

Proposed patch
         if isinstance(destination, dict):
             destination_conf_dict = destination.copy()
         else:
             destination_conf_dict = destination._hydrated_config.copy()  # noqa: SLF001
             destination_conf_dict["destinationType"] = destination.name.replace("destination-", "")
-            if "destinationType" not in destination_conf_dict:
-                raise exc.PyAirbyteInputError(
-                    message="Missing `destinationType` in configuration dictionary.",
-                )
+
+        if "destinationType" not in destination_conf_dict:
+            raise exc.PyAirbyteInputError(
+                message="Missing `destinationType` in configuration dictionary.",
+            )
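As a standalone illustration of the proposed patch, the sketch below validates `destinationType` after the branch so both input shapes are checked. It is a simplified model, not the `CloudWorkspace.deploy_destination` code: `DestinationLike`, `build_destination_config`, and `FakeDestination` are hypothetical names, and `ValueError` stands in for `exc.PyAirbyteInputError`.

```python
from __future__ import annotations

from typing import Any, Protocol


class DestinationLike(Protocol):
    """Hypothetical structural type for destination objects."""

    name: str
    _hydrated_config: dict[str, Any]


def build_destination_config(
    destination: DestinationLike | dict[str, Any],
) -> dict[str, Any]:
    if isinstance(destination, dict):
        conf = destination.copy()
    else:
        conf = destination._hydrated_config.copy()
        conf["destinationType"] = destination.name.replace("destination-", "")

    # Validate after the branch so raw dicts are checked as well.
    if "destinationType" not in conf:
        raise ValueError("Missing `destinationType` in configuration dictionary.")
    return conf


class FakeDestination:
    # Minimal object satisfying the protocol, for demonstration only.
    name = "destination-duckdb"
    _hydrated_config = {"path": "/tmp/demo.duckdb"}


print(build_destination_config(FakeDestination()))
```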
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@airbyte/cloud/workspaces.py` around lines 441 - 449, The branch that builds
destination_conf_dict should validate destinationType for both input types:
after the if/else that sets destination_conf_dict (using destination.copy() for
dict inputs or destination._hydrated_config + setting
destination_conf_dict["destinationType"] =
destination.name.replace("destination-", "") for destination-like objects), move
the check that raises exc.PyAirbyteInputError if "destinationType" is missing to
immediately follow the branch so both paths are validated; update/remove the
in-else duplicate check and ensure the check references the
destination_conf_dict variable.
♻️ Duplicate comments (3)
airbyte/logs.py (1)

20-20: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: UTC import fails in docs generation (confirmed by pipeline failure).

The pipeline explicitly reports:

ImportError: cannot import name 'UTC' from 'datetime' (from 'datetime import UTC, datetime').

This confirms that the docs generation environment uses Python <3.11. You need a compatibility shim for UTC across all affected files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@airbyte/logs.py` at line 20, The import "from datetime import UTC, datetime"
fails on Python <3.11; update airbyte/logs.py to provide a compatibility shim:
import datetime and timezone (e.g., import datetime or from datetime import
datetime, timezone), then define UTC = timezone.utc when UTC is not available
(or use a try/except ImportError to fall back to timezone.utc); replace usages
that rely on UTC in this module to use the shimmed UTC variable so docs and
older Python versions don't break.
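A compatibility shim of the kind described above might look like the following sketch. `datetime.UTC` was added in Python 3.11 as an alias for `timezone.utc`, so the fallback is behaviorally equivalent; whether the project prefers this shim or a direct switch to `timezone.utc` is a decision for the PR, not something this example settles.

```python
from datetime import datetime, timezone

try:
    from datetime import UTC  # Available on Python 3.11+
except ImportError:
    # Older interpreters: timezone.utc is the same tzinfo object.
    UTC = timezone.utc

now = datetime.now(tz=UTC)
print(now.isoformat())
```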
airbyte/cloud/sync_results.py (1)

106-106: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Same UTC import compatibility issue as in _message_iterators.py.

This import requires Python 3.11+. Please apply the same compatibility fix here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@airbyte/cloud/sync_results.py` at line 106, Replace the Python-3.11-only
import "from datetime import UTC, datetime" with a backwards-compatible import
and usage: import datetime and timezone (e.g. "from datetime import datetime,
timezone") or use a try/except to import UTC and fall back to timezone.utc, then
update any use of the UTC symbol in sync_results.py to reference the chosen
fallback (timezone.utc) so code works on older Python versions; ensure
references to "UTC" and "datetime" in this module are updated accordingly.
airbyte/progress.py (1)

27-27: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Same UTC import compatibility issue.

This import requires Python 3.11+. Please apply the same compatibility fix here as well.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@airbyte/progress.py` at line 27, The import of UTC from datetime requires
Python 3.11+ and fails on older interpreters; change the import to "from datetime
import datetime, timezone" and replace usages of the symbol UTC with timezone.utc
(e.g., update any datetime(..., tzinfo=UTC) or UTC references in
functions/classes in this module to use timezone.utc) so the code consistently
uses the standard timezone.utc available across supported Python versions.
🧹 Nitpick comments (2)
pyproject.toml (2)

75-97: ⚡ Quick win

Should the full group include the slim group?

Semantically, "full" typically means the complete feature set, but the current structure treats it as only the additional dependencies beyond slim. This could be confusing when users run uv sync --only-group full expecting to get everything.

Consider adding { include-group = "slim" } at the beginning of the full group, similar to how dev includes full. This would make full truly represent the complete dependency set, wdyt?

♻️ Suggested structure
 full = [
+  { include-group = "slim" },
   "airbyte-cdk>=7.3.9,<8.0.0",
   "cyclopts>=4.0,<5.0",

This would make the dependency hierarchy clearer:

  • slim = base runtime
  • full = slim + additional features
  • dev = full + development tools
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pyproject.toml` around lines 75 - 97, The "full" dependency group currently
lists only extra packages and should include the "slim" group so it truly
represents the complete install; update the "full" group definition in
pyproject.toml to include slim by adding an include-group = "slim" entry at the
top of the full group (mirroring how the "dev" group includes "full"), ensuring
the group name "full" is used and not changing package entries themselves.

54-97: Clarify dependency duplication between project.dependencies and [dependency-groups] (slim/full)

In pyproject.toml, project.dependencies contains the full set that’s also split into [dependency-groups].slim and .full, while uv treats dependency-groups as local/dev-only requirements (not end-user published requirements). Is the intent that end users always get the “full” union via project.dependencies today, and slim/full groups exist only for local workflows/lockfiles (and future package split)? If so, wdyt adding a short comment explaining this transition so version bumps don’t drift?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pyproject.toml` around lines 54 - 97, Add a short clarifying comment near the
project.dependencies section and above the [project.optional-dependencies] /
[dependency-groups].slim and .full blocks stating that project.dependencies
intentionally contains the union (end-user published requirements) while the
slim/full groups are for local/dev workflows and lockfile management
(uv/uv.lock), and that developers should update both places when changing
published requirements to avoid drift; reference the symbols
project.dependencies, dependency-groups.slim, dependency-groups.full, and
uv.lock so future maintainers know where to sync changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 056713d3-2756-43f4-a0a4-10c09ee57715

📥 Commits

Reviewing files that changed from the base of the PR and between 8427840 and 8c12be3.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • airbyte/__init__.py
  • airbyte/_executors/python.py
  • airbyte/_executors/util.py
  • airbyte/_message_iterators.py
  • airbyte/cloud/sync_results.py
  • airbyte/cloud/workspaces.py
  • airbyte/constants.py
  • airbyte/logs.py
  • airbyte/progress.py
  • pyproject.toml

Comment thread .github/workflows/release_drafter.yml Outdated
Comment thread airbyte/cloud/sync_results.py Outdated
Comment on lines +116 to +128
def _parse_datetime(value: str | int) -> datetime:
    if isinstance(value, int) or (
        isinstance(value, str)
        and (value.isdigit() or (value.startswith("-") and value[1:].isdigit()))
    ):
        return datetime.fromtimestamp(int(value), tz=timezone.utc)

    if not isinstance(value, str):
        raise TypeError(f"Could not parse datetime string: {value}")

    normalized = value.replace("Z", "+00:00")
    return datetime.fromisoformat(normalized)

Member Author

I get why we want to reuse cdk dateparse logic but higher priority is a deterministic code path. Vendor in what's needed and drop cdk for datetime parse logic here.

Member Author

Oh. You already did? Sorry, my bad.

Contributor

No worries. I double-checked the current file and there is no remaining airbyte_cdk / ab_datetime_parse import on this path; it is already using the local deterministic helper.


Devin session

Member Author

Let's move this to a proper _util helper module and import a non-underscored helper from there.

Contributor

Applied in 9cbc54d: moved this to airbyte._util.datetime_parse.parse_datetime and updated sync_results.py to import the non-underscored helper from there. I also added focused unit coverage for Unix timestamp, RFC3339/Z, offset, date-only, and invalid inputs.
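The input forms named above (Unix timestamp, RFC 3339 with Z, explicit offset, date-only, invalid) can be exercised with a stdlib-only sketch like the one below. It is modeled on the excerpt under review, not copied from `airbyte._util.datetime_parse`; `parse_datetime_sketch` is a hypothetical name, and treating naive results as UTC is an assumption of this example.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def parse_datetime_sketch(value: str | int) -> datetime:
    """Parse a Unix timestamp or ISO 8601 / RFC 3339 string."""
    if isinstance(value, bool) or not isinstance(value, (str, int)):
        raise TypeError(f"Could not parse datetime value: {value!r}")
    if isinstance(value, int):
        return datetime.fromtimestamp(value, tz=timezone.utc)

    text = value.strip()
    digits = text[1:] if text.startswith("-") else text
    if digits.isdecimal():
        # Integer-like strings are treated as Unix timestamps in seconds.
        return datetime.fromtimestamp(int(text), tz=timezone.utc)

    # fromisoformat() only accepts a trailing "Z" on Python 3.11+.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)  # raises ValueError on bad input
    if parsed.tzinfo is None:
        # Assumption: date-only or naive inputs are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


print(parse_datetime_sketch("2026-05-27T12:00:00Z"))
print(parse_datetime_sketch("2026-05-27T12:00:00+02:00").utcoffset() == timedelta(hours=2))
```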

Validation run:

  • uv run --directory /home/ubuntu/repos/PyAirbyte ruff check scripts/build_slim_package.uv airbyte/_util/datetime_parse.py airbyte/cloud/sync_results.py tests/unit_tests/test_datetime_parse.py
  • uv run --directory /home/ubuntu/repos/PyAirbyte ruff format --check scripts/build_slim_package.uv airbyte/_util/datetime_parse.py airbyte/cloud/sync_results.py tests/unit_tests/test_datetime_parse.py
  • uv run --directory /home/ubuntu/repos/PyAirbyte pytest tests/unit_tests/test_datetime_parse.py
  • UV_DYNAMIC_VERSIONING_BYPASS=0.46.0.dev0 uv run --directory /home/ubuntu/repos/PyAirbyte scripts/build_slim_package.uv

Devin session

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/build_slim_package.uv`:
- Around line 98-101: In build_airbyte_slim ensure dist_dir and build_dir are
resolved to absolute paths up front (e.g., using Path.resolve() or
os.path.abspath) so relative-path differences don't change behavior when calling
dist_dir.mkdir(...) and subprocess.run(..., cwd=build_dir); replace any uses of
the original variables with the resolved Path objects and pass
str(resolved_dist_dir) and str(resolved_build_dir) into subprocess.run and mkdir
to keep output deterministic.
- Around line 72-73: The code currently calls shutil.rmtree(build_dir)
unconditionally if build_dir.exists(); add a safety-guard before deletion that
validates build_dir.resolve() is a non-root, non-repo-root, and within an
expected build/output directory subtree (e.g., ensure build_dir.resolve() !=
Path("/") and is a child of the project temp/output directory or matches an
allowed basename like "build" or "dist"); if the check fails, raise an exception
or abort instead of calling shutil.rmtree; update the deletion site that uses
build_dir and shutil.rmtree to perform this validation first.
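The deletion guard suggested above could be sketched as follows. This is an illustrative helper, not code from scripts/build_slim_package.uv: `remove_build_dir` and its `allowed_root` parameter are hypothetical, and the policy (only delete strict descendants of an allowed root) is one reasonable reading of the review comment.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path


def remove_build_dir(build_dir: Path, allowed_root: Path) -> Path:
    """Delete build_dir only if it resolves to a path strictly inside allowed_root."""
    resolved = build_dir.resolve()
    root = allowed_root.resolve()
    # Rejects the root itself, the filesystem root, and anything outside the tree.
    if root not in resolved.parents:
        raise ValueError(f"Refusing to delete {resolved}: not inside {root}")
    if resolved.exists():
        shutil.rmtree(resolved)
    return resolved


# Demo in a temporary directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build = root / "build" / "airbyte-slim"
    build.mkdir(parents=True)
    remove_build_dir(build, allowed_root=root)
    removed = not build.exists()
    try:
        remove_build_dir(root, allowed_root=root)
    except ValueError:
        guarded = True
    else:
        guarded = False

print(removed, guarded)
```

Resolving both paths up front also addresses the first comment above, since relative inputs no longer depend on the current working directory at deletion time.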

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3f8ddf5d-3a64-47b0-8db8-acdfb4d66401

📥 Commits

Reviewing files that changed from the base of the PR and between ba6dca9 and 52bd2a8.

📒 Files selected for processing (3)
  • .github/workflows/release_drafter.yml
  • pyproject.toml
  • scripts/build_slim_package.uv
🚧 Files skipped from review as they are similar to previous changes (1)
  • pyproject.toml

Comment thread scripts/build_slim_package.uv Outdated
Comment thread scripts/build_slim_package.uv Outdated
Comment thread airbyte/__init__.py
Comment on lines +136 to +149
_LAZY_EXPORTS = {
    "BigQueryCache": "airbyte.caches.bigquery",
    "CachedDataset": "airbyte.datasets",
    "Destination": "airbyte.destinations.base",
    "DuckDBCache": "airbyte.caches.duckdb",
    "ReadResult": "airbyte.results",
    "Source": "airbyte.sources.base",
    "WriteResult": "airbyte.results",
    "get_colab_cache": "airbyte.caches.util",
    "get_default_cache": "airbyte.caches.util",
    "get_destination": "airbyte.destinations.util",
    "get_source": "airbyte.sources.util",
    "new_local_cache": "airbyte.caches.util",
}
Member Author

I don't follow what 'lazy exports' is supposed to mean. If obvious/easy, answer here. If shameful/controversial, answer in slack.

Contributor

Not shameful. Here "lazy exports" means the top-level airbyte module keeps the existing public API names, but it no longer imports full-package modules until the specific name is accessed.

Example: import airbyte as ab stays lightweight; ab.get_source then imports airbyte.sources.util on first access and caches the resolved function. This lets airbyte-slim import the shared top-level package without immediately requiring full-only modules like caches, sources, destinations, CLI, MCP, DuckDB, SQLAlchemy, or CDK.
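The mechanism can be demonstrated in isolation with a small sketch of a PEP 562 module-level `__getattr__`. It is not the code in airbyte/__init__.py: `make_lazy_module` and `demo_pkg` are hypothetical, and stdlib modules stand in for the full-only PyAirbyte modules so the example runs anywhere.

```python
from __future__ import annotations

import importlib
import types

# Hypothetical mapping for illustration: exported name -> defining module.
_LAZY_EXPORTS = {
    "OrderedDict": "collections",
    "dumps": "json",
}


def make_lazy_module(name: str, lazy_exports: dict[str, str]) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def _lazy_getattr(attr: str):
        # Called only when normal attribute lookup fails (PEP 562).
        if attr not in lazy_exports:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        value = getattr(importlib.import_module(lazy_exports[attr]), attr)
        # Cache in the module namespace so later lookups skip this hook.
        module.__dict__[attr] = value
        return value

    module.__getattr__ = _lazy_getattr
    return module


demo = make_lazy_module("demo_pkg", _LAZY_EXPORTS)
cached_before = "dumps" in vars(demo)  # nothing resolved yet
encoded = demo.dumps({"a": 1})         # first access triggers the import
cached_after = "dumps" in vars(demo)   # now stored in the module namespace
print(cached_before, encoded, cached_after)
```

In a real package the hook is simply a function named `__getattr__` defined at module top level; building the module dynamically here only keeps the example self-contained.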


Devin session
