Improving the Dockerfile #160
Walkthrough
Splits runtime and development dependencies into
Changes
Dependency Separation and Alignment
Interesting fact: the librdkafka repository was transferred from Magnus Edenhill to Confluent Inc. in 2023. So the Dockerfile change that updates its version also had to change the repository the source is fetched from.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/check_python.yml:
- Around line 40-42: The CHANGED_FILES assignment uses gh api to list PR files
but only gets the first page; update the gh api call inside the CHANGED_FILES
assignment to enable pagination by adding the --paginate and --slurp flags so
the existing --jq '.[].filename | select(endswith(".py") or
(startswith("requirements") and endswith(".txt")))' runs over the combined JSON
result; ensure the flags are added to the gh api invocation that constructs
CHANGED_FILES so the jq expression still extracts filenames as before.
📒 Files selected for processing (5)
- .github/workflows/check_python.yml
- DEVELOPER.md
- Dockerfile
- requirements-dev.txt
- requirements.txt
    tar -xf v2.4.0.tar.gz && \
    cd /tmp/env-install-workdir/confluent-kafka/confluent-kafka-python-2.4.0 && \
    CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/opt" python setup.py install && \
    pip install -r ${LAMBDA_TASK_ROOT}/requirements.txt --no-binary confluent-kafka && \
confluent-kafka: what is this about?
It's also listed in the requirements.txt file, so why is it needed here like this?
So, if I understand it correctly, requirements.txt pins the confluent-kafka version, while --no-binary forces pip to compile it from source against the system librdkafka that is set up above. That librdkafka build includes Kerberos support.
The PyPI build has an issue: it is compiled without GSSAPI, which would break SASL_SSL authentication.
So requirements.txt holds the version, and --no-binary tells pip how to build the confluent-kafka dependency. I added a comment to the Dockerfile to make that step clearer: 9fd16fc
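One way to see which librdkafka a build ended up linked against is to ask the client at runtime. The sketch below is only an illustration: the helper name is hypothetical, and it relies on `confluent_kafka.libversion()`, which returns a version string and an integer. It reports the linked library version only and does not by itself prove that GSSAPI support is compiled in.

```python
import importlib
from typing import Optional


def linked_librdkafka_version() -> Optional[str]:
    """Return the librdkafka version the installed client is linked against.

    Returns None when confluent_kafka is not installed, so the helper can
    also be run outside the image.
    """
    try:
        ck = importlib.import_module("confluent_kafka")
    except ImportError:
        return None
    version_string, _version_int = ck.libversion()
    return version_string


print(linked_librdkafka_version())
```

Run inside the built image, the printed version can be compared with the librdkafka version the Dockerfile compiles, as a quick sanity check that the source build picked up the system library.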
Ok, I didn't know about any of this. Thanks!
🧹 Nitpick comments (1)
Dockerfile (1)
63-68: ⚡ Quick win: Drop package-manager and pip caches in this layer.
This will reduce final image size and better match the PR’s deployment-size objective.
♻️ Suggested change
-    pip install -r ${LAMBDA_TASK_ROOT}/requirements.txt --no-binary confluent-kafka && \
+    pip install --no-cache-dir -r ${LAMBDA_TASK_ROOT}/requirements.txt --no-binary confluent-kafka && \
     echo "##############" && \
     echo "### cleanup ###" && \
     echo "##############" && \
     cd /root && \
-    rm -rf /tmp/env-install-workdir
+    dnf clean all && \
+    rm -rf /var/cache/dnf /tmp/env-install-workdir
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Dockerfile` around lines 63 - 68, Add removal of package-manager and pip caches in the same Dockerfile layer after the pip install and before removing /tmp/env-install-workdir: delete apt/yum/apk caches and pip caches (for example remove /var/lib/apt/lists/*, /var/cache/apt/* or /var/cache/yum/* or /var/cache/apk/* as appropriate for the base image) and remove pip caches such as /root/.cache/pip and /home/sbx_user/.cache/pip so the cache cleanup happens in the same RUN layer as the pip install step shown in the Dockerfile snippet.
📒 Files selected for processing (4)
- .github/actions/setup-dev-python-env/action.yml
- .github/copilot-instructions.md
- .github/workflows/check_python.yml
- Dockerfile
✅ Files skipped from review due to trivial changes (2)
- .github/actions/setup-dev-python-env/action.yml
- .github/copilot-instructions.md
Overview
This pull request updates the Python dependency management and CI workflow to better separate development dependencies from production dependencies. The main changes involve introducing a new requirements-dev.txt for development tools and updating both documentation and CI to use it. Additionally, the Docker build process is improved to use this separation and update some underlying libraries.
Release Notes
Related
Closes #154