RHAIENG-568: Issue opendatahub-io/notebooks#385: decrease Python 3.11 CUDA images size by installing less #1597

Merged: 1 commit merged on Aug 11, 2025

Conversation

jiridanek
Member

@jiridanek jiridanek commented Aug 1, 2025

https://issues.redhat.com/browse/RHAIENG-568

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.

Summary by CodeRabbit

  • Refactor
    • Simplified container images by removing CUDA development libraries and tools, retaining only essential development utilities and CUDA/cuDNN runtime components.
    • Updated comments to reference runtime package sources instead of development package sources.

Contributor

coderabbitai bot commented Aug 1, 2025

Walkthrough

This change removes the installation of CUDA development packages and their associated environment variables from several Dockerfiles across Jupyter, RStudio, and runtimes images. Only minimal development tools (make and findutils) and the runtime cuDNN package remain installed. Comments are updated to reference runtime Dockerfiles instead of development ones.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Jupyter CUDA Dockerfiles: jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda, jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda, jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda | Removed installation of CUDA development packages and related environment variables; now only installs make, findutils, and runtime cuDNN. Updated comments to reference runtime Dockerfiles. |
| RStudio CUDA Dockerfiles: rstudio/c9s-python-3.11/Dockerfile.cuda, rstudio/rhel9-python-3.11/Dockerfile.cuda | Deleted CUDA development packages and associated environment variables; retained only minimal development tools and runtime cuDNN. Updated Dockerfile comments accordingly. |
| Runtimes CUDA Dockerfiles: runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda, runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda | Removed CUDA development package installation and related environment variables; kept only make, findutils, and runtime cuDNN package. Adjusted comments to reflect these changes. |
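
One way to confirm that no development packages remain in a Dockerfile is to scan its install commands for `-devel` package names. The following is a minimal sketch, not part of this PR: the function name `find_devel_packages` and the sample text are hypothetical, and the sample package names are illustrative rather than copied from the changed files.

```python
import re

# Hypothetical sample input; NOT the contents of any Dockerfile in this PR.
SAMPLE_DOCKERFILE = """\
RUN yum install -y \\
    make \\
    findutils \\
    cuda-cudart-devel-12-4 \\
    libcublas-devel-12-4 \\
    && yum clean all
"""

# Matches tokens such as "foo-devel" or "foo-devel-12-4".
DEVEL_PACKAGE = re.compile(r"^[a-z0-9][a-z0-9+._-]*-devel(?:-[0-9][0-9.-]*)?$")


def find_devel_packages(dockerfile_text: str) -> list[str]:
    """Return every whitespace-separated token that looks like a -devel package."""
    found = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comment lines
        for token in stripped.split():
            token = token.rstrip("\\")  # drop line-continuation backslashes
            if DEVEL_PACKAGE.match(token):
                found.append(token)
    return found
```

Calling `find_devel_packages(open(path).read())` on each changed Dockerfile.cuda and expecting an empty list would be one possible check; it only inspects package names and says nothing about environment variables.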

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • jesuino
  • harshad16
  • dibryant

@jiridanek jiridanek linked an issue Aug 1, 2025 that may be closed by this pull request
@openshift-ci openshift-ci bot requested review from andyatmiami and jstourac August 1, 2025 17:19
Contributor

openshift-ci bot commented Aug 1, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign daniellutz for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the size/l label Aug 1, 2025
@jiridanek
Member Author

@coderabbitai full review

Contributor

coderabbitai bot commented Aug 1, 2025

✅ Actions performed

Full review triggered.

@openshift-ci openshift-ci bot added size/l and removed size/l labels Aug 1, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda (1)

114-121: Same tooling concern as raised in RStudio image

See earlier remark about the possible need for a compiler tool-chain once CUDA -devel packages are gone.

jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda (1)

101-108: Same tooling concern as raised in RStudio image

See earlier remark about the possible need for a compiler tool-chain once CUDA -devel packages are gone.

jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda (1)

114-121: Same tooling concern as raised in RStudio image

See earlier remark about the possible need for a compiler tool-chain once CUDA -devel packages are gone.

runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda (1)

101-108: Same tooling concern as raised in RStudio image

See earlier remark about the possible need for a compiler tool-chain once CUDA -devel packages are gone.

🧹 Nitpick comments (4)
rstudio/c9s-python-3.11/Dockerfile.cuda (1)

107-113: Double-check if the stripped-down tool-chain is still sufficient

After removing all CUDA -devel RPMs, we now keep only make and findutils.
If any image layer (e.g. R packages installed from CRAN, or pip extras pulled in later) attempts to build native extensions, it will now fail because no compiler, headers, or CUDA stubs are present.

Before merging, please confirm via a CI build that:

  1. micropipenv install in every downstream stage still succeeds on both x86_64 and aarch64.
  2. At runtime, common GPU-aware R/Python libraries (torch, tensorflow, cupy, etc.) load successfully and do not attempt JIT compilation.

If unexpected build-time failures appear, we may need to re-introduce gcc-c++ (CPU only) or ship wheels exclusively.
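
The runtime part of this check could be scripted inside the built image. The sketch below is an assumption about how one might do that, not an existing test in the repository: `missing_tools` and `try_import` are hypothetical helper names, and the tool and module names are taken from the comment above.

```python
import importlib
import shutil


def missing_tools(tools=("gcc", "g++", "nvcc")) -> list[str]:
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def try_import(name: str) -> tuple[bool, str]:
    """Attempt to import a module; return (ok, error description)."""
    try:
        importlib.import_module(name)
    except Exception as exc:  # import failures can be ImportError, OSError, etc.
        return False, f"{type(exc).__name__}: {exc}"
    return True, ""
```

Running `missing_tools()` together with `try_import("torch")`, `try_import("tensorflow")`, and `try_import("cupy")` in each image would show whether the libraries load without a compiler present. It does not detect JIT compilation attempted later, for example on first kernel launch.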

runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda (1)

101-107: Do we really need make & findutils in the slimmed-down runtime layer?

Both utilities are only useful when native builds occur at runtime, but the image no longer carries a compiler tool-chain (gcc, g++, etc.).
If no package compilation happens after this layer, the two packages are dead weight; if compilation is still expected, the current tool-set is insufficient.

Consider either:

-    make \
-    findutils \
+    # (remove entirely) – no runtime compilation expected

or explicitly adding the full build tool-chain (gcc, gcc-c++, glibc-devel, …) so that future pip installs don’t fail.
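
The two options above can be expressed as a small consistency rule. This is only a sketch of the reviewer's framing, under the assumption that make and findutils count as build helpers and that gcc or gcc-c++ stands in for a compiler tool-chain; the function name and the returned labels are hypothetical.

```python
# Assumption: these groupings mirror the review comment, not a repository policy.
COMPILER_PACKAGES = {"gcc", "gcc-c++"}
BUILD_HELPERS = {"make", "findutils"}


def assess_build_tooling(installed: set[str]) -> str:
    """Classify an installed-package set by whether it can support native builds."""
    has_compiler = bool(COMPILER_PACKAGES & installed)
    has_helpers = bool(BUILD_HELPERS & installed)
    if has_compiler:
        return "can-compile"
    if has_helpers:
        # The state this comment questions: helpers present, no compiler.
        return "helpers-without-compiler"
    return "no-build-tooling"
```

With the package set left by this PR, `assess_build_tooling({"make", "findutils"})` returns `"helpers-without-compiler"`.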

rstudio/rhel9-python-3.11/Dockerfile.cuda (2)

130-136: Same devel-tools question as in TensorFlow runtime

Only make and findutils remain. If no in-container builds are planned, drop them; if builds are still required for R packages with C/C++ code, also keep gcc, gcc-c++, etc. to avoid runtime compilation failures.


138-149: Typo in comment URL (hhttps)

Line 139 currently reads hhttps://gitlab.com/... with one extra h.

-# hhttps://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/12.4.1/ubi9/runtime/cudnn/Dockerfile
+# https://gitlab.com/nvidia/container-images/cuda/-/blob/master/dist/12.4.1/ubi9/runtime/cudnn/Dockerfile
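
Typos like this can be caught mechanically by flagging URL schemes that are not on an allow-list. A minimal sketch, assuming a small hypothetical allow-list (`KNOWN_SCHEMES`) and a hypothetical helper name; neither exists in this repository.

```python
import re

# Anything of the form "<scheme>://" preceded by a word boundary.
URL_LIKE = re.compile(r"\b([a-zA-Z][a-zA-Z0-9+.-]*)://")

# Assumption: schemes considered valid for Dockerfile comments.
KNOWN_SCHEMES = {"http", "https", "ftp", "git", "ssh"}


def find_suspicious_urls(text: str) -> list[tuple[int, str]]:
    """Return (line number, scheme) for each URL whose scheme is not allow-listed."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in URL_LIKE.finditer(line):
            scheme = match.group(1).lower()
            if scheme not in KNOWN_SCHEMES:
                hits.append((lineno, scheme))
    return hits
```

For a two-line input whose first line contains `hhttps://` and whose second contains `https://`, the function reports only the first line.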
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 702b4ff and 761478f.

📒 Files selected for processing (7)
  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda (1 hunks)
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda (1 hunks)
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda (1 hunks)
  • rstudio/c9s-python-3.11/Dockerfile.cuda (1 hunks)
  • rstudio/rhel9-python-3.11/Dockerfile.cuda (1 hunks)
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda (1 hunks)
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda (1 hunks)
🧰 Additional context used
🧠 Learnings (16)
📓 Common learnings
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/kustomize/base/kustomization.yaml:8-12
Timestamp: 2025-07-08T19:09:48.746Z
Learning: jiridanek requested GitHub issue creation for misleading CUDA prefix in TrustyAI image tags during PR #1306 review, affecting both Python 3.11 and 3.12 versions. Issue #1338 was created with comprehensive problem description covering both affected images, repository pattern analysis comparing correct vs incorrect naming conventions, clear solution with code examples, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/kustomize/base/kustomization.yaml:8-12
Timestamp: 2025-07-08T19:09:48.746Z
Learning: jiridanek requested GitHub issue creation for misleading CUDA prefix in TrustyAI image tags during PR #1306 review. Issue was created with comprehensive problem description covering both Python 3.11 and 3.12 versions, repository pattern analysis showing correct vs incorrect naming, clear solution with code examples, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-08-01T14:46:03.215Z
Learning: jiridanek requested GitHub issue creation for two nitpicks during PR #1588 review: comment wording improvement in ROCm TensorFlow Dockerfile and typo fix in Jupyter DataScience Dockerfile stage header. Issues #1589 and #1590 were successfully created with comprehensive problem descriptions, specific file locations and line numbers, clear before/after solutions, detailed acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/kustomize/base/service.yaml:5-15
Timestamp: 2025-07-02T18:59:15.788Z
Learning: jiridanek creates targeted GitHub issues for specific test quality improvements identified during PR reviews in opendatahub-io/notebooks. Issue #1268 demonstrates this by converting a review comment about insufficient tf2onnx conversion test validation into a comprehensive improvement plan with clear acceptance criteria, code examples, and ROCm-specific context.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-08T19:29:32.006Z
Learning: jiridanek requested GitHub issue creation for investigating TensorFlow "and-cuda" extras usage patterns during PR #1333 review. Issue #1345 was created with comprehensive investigation framework covering platform-specific analysis, deployment scenarios, TensorFlow version compatibility, clear acceptance criteria, and testing approach, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/pytorch/ubi9-python-3.12/Dockerfile.cuda:17-25
Timestamp: 2025-07-09T08:07:30.628Z
Learning: jiridanek requested GitHub issue creation for oc client installation permission problem in PyTorch CUDA runtime Dockerfile during PR #1333 review. Issue #1356 was created with comprehensive problem description covering USER 1001 permission conflicts with root-owned /opt/app-root/bin directory, detailed impact analysis of build failures and non-executable binaries, current problematic code snippet, complete solution with user switching approach, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/pytorch/ubi9-python-3.12/utils/bootstrapper.py:619-626
Timestamp: 2025-07-08T19:33:14.340Z
Learning: jiridanek requested GitHub issue creation for Python 3.12 version check bug in bootstrapper.py during PR #1333 review. Issue #1348 was created with comprehensive problem description covering version check exclusion affecting all Python 3.12 runtime images, detailed impact analysis of bootstrapper execution failures, clear solution with code examples, affected files list including all 6 runtime bootstrapper copies, acceptance criteria for testing and verification, implementation notes about code duplication and upstream reporting, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-08T19:29:32.006Z
Learning: jiridanek requested GitHub issue creation for investigating TensorFlow "and-cuda" extras usage patterns during PR #1333 review. Issue #1340 was created with comprehensive investigation framework covering platform-specific analysis, deployment scenarios, TensorFlow version compatibility, clear acceptance criteria, testing approach, and implementation timeline, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-11T11:16:05.131Z
Learning: jiridanek requested GitHub issue creation for RStudio py311 Tekton push pipelines during PR #1379 review. Issue #1384 was successfully created covering two RStudio variants (CPU and CUDA) found in manifests/base/params-latest.env, with comprehensive problem description, implementation requirements following the same pattern as other workbench pipelines, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/rocm-tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-09T08:07:24.937Z
Learning: jiridanek requested GitHub issue creation for tensorflow_rocm Python 3.12 compatibility problem during PR #1333 review. Issue #1354 was successfully created with comprehensive problem description covering missing cp312 wheels causing build failures, three solution options (upstream TensorFlow, Python 3.11 only, custom build), clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.
Learnt from: grdryn
PR: opendatahub-io/notebooks#1320
File: rstudio/rhel9-python-3.11/Dockerfile.cuda:34-35
Timestamp: 2025-07-04T10:41:13.061Z
Learning: In the opendatahub-io/notebooks repository, when adapting NVIDIA CUDA Dockerfiles, the project intentionally maintains consistency with upstream NVIDIA patterns even when it might involve potential risks like empty variable expansions in package installation commands. This is considered acceptable because the containers only run on RHEL 9 with known yum/dnf behavior, and upstream consistency is prioritized over defensive coding practices.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-01T10:41:56.419Z
Learning: In the opendatahub-io/notebooks repository, TensorFlow packages with `extras = ["and-cuda"]` can cause build conflicts on macOS due to platform-specific CUDA packages. When the Dockerfile installs CUDA system-wide, removing the extras and letting TensorFlow find CUDA at runtime resolves these conflicts.
Learnt from: grdryn
PR: opendatahub-io/notebooks#1396
File: jupyter/tensorflow/ubi9-python-3.12/Pipfile:13-14
Timestamp: 2025-07-16T00:17:10.313Z
Learning: grdryn corrected CodeRabbit's false assessment about CUDA companion package wheel availability during PR #1396 review. The original analysis incorrectly checked all package releases instead of the specific versions that would be installed with tensorflow[and-cuda]~=2.19.0. The actual versions (nvidia-cudnn-cu12/9.3.0.75, nvidia-cuda-runtime-cu12/12.5.82, nvidia-cublas-cu12/12.5.3.2) do have aarch64 wheels available on PyPI, making the and-cuda extra compatible with arm64 builds.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-11T11:54:28.202Z
Learning: In opendatahub-io/notebooks, Python 3.12-based images (e.g., runtime-cuda-pytorch-ubi9-python-3.12) may fail container runtime tests with "libcrypt.so.1 => not found" for MySQL SASL2 plugin libraries if `libxcrypt-compat` is missing. The solution is to install `libxcrypt-compat` in the Dockerfile.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1519
File: jupyter/pytorch+llmcompressor/ubi9-python-3.11/runtime-images/llmcompressor-pytorch-ubi9-py311.json:2-9
Timestamp: 2025-07-30T08:42:19.393Z
Learning: The new architecture for runtime images in opendatahub-io/notebooks uses a dynamic system instead of static JSON files: (1) Runtime images are defined as ImageStreams in manifests/base/ with `opendatahub.io/runtime-image: "true"` label and JSON metadata in `opendatahub.io/runtime-image-metadata` annotation, (2) odh-notebook-controller dynamically creates JSONs in configmaps by listing ImageStreams on cluster, (3) Elyra-enabled workbenches get these JSONs mounted and setup-elyra.sh moves them from `/opt/app-root/pipeline-runtimes/..data/*.json` to `/opt/app-root/share/jupyter/metadata/runtime-images/`. Static JSON files in `*/runtime-images/` directories are no longer used.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1519
File: jupyter/pytorch+llmcompressor/ubi9-python-3.11/runtime-images/llmcompressor-pytorch-ubi9-py311.json:2-9
Timestamp: 2025-07-29T15:56:20.036Z
Learning: Current practice for runtime-images JSON files in opendatahub-io/notebooks: All runtime image configurations use SHA256 digest pinning (format: `@sha256:<digest>`) instead of mutable tags for immutability, use `quay.io/opendatahub/workbench-images` registry, include `schema_name: "runtime-image"` field, and follow consistent structure as seen in existing files like datascience-ubi9-py311.json, pytorch-ubi9-py311.json, etc.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-03T07:03:45.020Z
Learning: The Python 3.11 infrastructure for ROCm TensorFlow images in opendatahub-io/notebooks is already properly configured in the Makefile with both BASE_DIRS entries for Pipfile lock renewals and all-images targets for CI builds, requiring only commenting out the corresponding Python 3.12 entries when downgrading due to wheel availability issues.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1269
File: codeserver/ubi9-python-3.12/nginx/root/usr/share/container-scripts/nginx/common.sh:1-3
Timestamp: 2025-07-03T12:07:19.365Z
Learning: jiridanek consistently requests GitHub issue creation for technical improvements identified during code reviews in opendatahub-io/notebooks, ensuring systematic tracking of code quality enhancements like shell script portability issues with comprehensive descriptions, solution options, and acceptance criteria.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#968
File: codeserver/ubi9-python-3.11/Dockerfile.cpu:138-140
Timestamp: 2025-08-01T16:07:58.677Z
Learning: jiridanek prefers architectural solutions that eliminate problems entirely rather than just fixing immediate technical issues. When presented with a pipeline safety concern about micropipenv requirements generation, he suggested removing micropipenv from the build process altogether by using pre-committed requirements.txt files, demonstrating preference for simplification and deterministic builds over complex workarounds.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1269
File: codeserver/ubi9-python-3.12/utils/process.sh:17-19
Timestamp: 2025-07-03T14:00:00.909Z
Learning: jiridanek efficiently identifies when CodeRabbit review suggestions are already covered by existing comprehensive issues, demonstrating excellent issue management and avoiding duplicate tracking of the same improvements across multiple locations.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1151
File: jupyter/tensorflow/ubi9-python-3.12/test/test_notebook.ipynb:31-34
Timestamp: 2025-07-01T07:03:05.385Z
Learning: jiridanek demonstrates excellent pattern recognition for identifying duplicated code issues across the opendatahub-io/notebooks repository. When spotting a potential problem in test notebooks, he correctly assesses that such patterns are likely replicated across multiple similar files rather than being isolated incidents, leading to more effective systematic solutions.
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1259
File: jupyter/rocm/tensorflow/ubi9-python-3.12/Dockerfile.rocm:56-66
Timestamp: 2025-07-02T18:19:49.397Z
Learning: jiridanek consistently creates comprehensive follow-up GitHub issues for security concerns raised during PR reviews in opendatahub-io/notebooks, ensuring systematic tracking and resolution of supply-chain security improvements like GPG signature verification for package repositories.
📚 Learning: in the opendatahub-io/notebooks repository, when adapting nvidia cuda dockerfiles, the project inten...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: in the opendatahub-io/notebooks repository, tensorflow packages with `extras = ["and-cuda"]` can cau...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for misleading cuda prefix in trustyai image tags during p...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for misleading cuda prefix in trustyai image tags during p...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: grdryn corrected coderabbit's false assessment about cuda companion package wheel availability durin...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: in opendatahub-io/notebooks, python 3.12-based images (e.g., runtime-cuda-pytorch-ubi9-python-3.12) ...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: the python 3.11 infrastructure for rocm tensorflow images in opendatahub-io/notebooks is already pro...

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for improving fragile sed-based jupyter kernel display_nam...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1306
File: jupyter/trustyai/ubi9-python-3.12/Dockerfile.cpu:135-136
Timestamp: 2025-07-04T05:52:49.464Z
Learning: jiridanek requested GitHub issue creation for improving fragile sed-based Jupyter kernel display_name modification in jupyter/trustyai/ubi9-python-3.12/Dockerfile.cpu during PR #1306 review. Issue #1321 was created with comprehensive problem description covering JSON corruption risks, greedy regex patterns, maintenance burden, and proposed Python-based JSON parsing solution with detailed acceptance criteria, implementation guidance, and proper context linking, continuing the established pattern of systematic code quality improvements.

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for oc client installation permission problem in pytorch c...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/pytorch/ubi9-python-3.12/Dockerfile.cuda:17-25
Timestamp: 2025-07-09T08:07:30.628Z
Learning: jiridanek requested GitHub issue creation for oc client installation permission problem in PyTorch CUDA runtime Dockerfile during PR #1333 review. Issue #1356 was created with comprehensive problem description covering USER 1001 permission conflicts with root-owned /opt/app-root/bin directory, detailed impact analysis of build failures and non-executable binaries, current problematic code snippet, complete solution with user switching approach, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: the pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is use...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1230
File: jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/pod-patch.yaml:11-22
Timestamp: 2025-06-30T14:36:53.890Z
Learning: The pod-patch.yaml file in jupyter/pytorch/ubi9-python-3.12/kustomize/components/accelerator/ is used only for running tests, not production deployments. This affects the risk assessment for resource management configurations like sizeLimit on emptyDir volumes.

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: in the opendatahub-io/notebooks repository, mixing centos packages with ubi base images is bad pract...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1396
File: jupyter/tensorflow/ubi9-python-3.12/Dockerfile.cuda:192-195
Timestamp: 2025-07-18T19:01:39.811Z
Learning: In the opendatahub-io/notebooks repository, mixing CentOS packages with UBI base images is bad practice that removes supportability and creates "Frankenstein" images according to Red Hat guidance. However, using EPEL packages is acceptable, though it may require extra work with AIPCC for internal Red Hat builds. The official reference is at developers.redhat.com/articles/ubi-faq.

Applied to files:

  • jupyter/minimal/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for investigating tensorflow "and-cuda" extras usage patte...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-08T19:29:32.006Z
Learning: jiridanek requested GitHub issue creation for investigating TensorFlow "and-cuda" extras usage patterns during PR #1333 review. Issue #1340 was created with comprehensive investigation framework covering platform-specific analysis, deployment scenarios, TensorFlow version compatibility, clear acceptance criteria, testing approach, and implementation timeline, continuing the established pattern of systematic code quality improvements through detailed issue tracking.

Applied to files:

  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for investigating tensorflow "and-cuda" extras usage patte...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#1333
File: runtimes/tensorflow/ubi9-python-3.12/Pipfile:13-15
Timestamp: 2025-07-08T19:29:32.006Z
Learning: jiridanek requested GitHub issue creation for investigating TensorFlow "and-cuda" extras usage patterns during PR #1333 review. Issue #1345 was created with comprehensive investigation framework covering platform-specific analysis, deployment scenarios, TensorFlow version compatibility, clear acceptance criteria, and testing approach, continuing the established pattern of systematic code quality improvements through detailed issue tracking.

Applied to files:

  • runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/tensorflow/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: in the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytor...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-06-16T11:32:09.203Z
Learning: In the opendatahub-io/notebooks repository, there is a known issue with missing `runtimes/rocm/pytorch/ubi9-python-3.11/kustomize/base/kustomization.yaml` file that causes rocm runtime tests to fail with "no such file or directory" error. This is tracked in JIRA RHOAIENG-22044 and was intended to be fixed in PR #1015.

Applied to files:

  • runtimes/pytorch/ubi9-python-3.11/Dockerfile.cuda
  • jupyter/pytorch/ubi9-python-3.11/Dockerfile.cuda
📚 Learning: jiridanek requested github issue creation for rstudio py311 tekton push pipelines during pr #1379 re...
Learnt from: jiridanek
PR: opendatahub-io/notebooks#0
File: :0-0
Timestamp: 2025-07-11T11:16:05.131Z
Learning: jiridanek requested GitHub issue creation for RStudio py311 Tekton push pipelines during PR #1379 review. Issue #1384 was successfully created covering two RStudio variants (CPU and CUDA) found in manifests/base/params-latest.env, with comprehensive problem description, implementation requirements following the same pattern as other workbench pipelines, clear acceptance criteria, and proper context linking, continuing the established pattern of systematic code quality improvements through detailed issue tracking.

Applied to files:

  • rstudio/c9s-python-3.11/Dockerfile.cuda
  • rstudio/rhel9-python-3.11/Dockerfile.cuda
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cuda-py311-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-pytorch-cuda-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cuda-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-rocm-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-pytorch-cuda-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-rstudio-minimal-cpu-py311-c9s-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-minimal-cpu-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-tensorflow-cuda-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-pipeline-runtime-tensorflow-cuda-py311-ubi9-on-pull-request
  • GitHub Check: Red Hat Konflux / odh-workbench-jupyter-datascience-cpu-py311-ubi9-on-pull-request
🔇 Additional comments (1)
runtimes/tensorflow/ubi9-python-3.11/Dockerfile.cuda (1)

109-118: Runtime-only cuDNN install looks good – verify ABI match with TensorFlow wheels

Switching to libcudnn9-cuda-12 without the -devel package is a safe size win and should satisfy TF ≥2.16.
Please double-check that the version (9.5.1.17-1) matches the cuDNN major expected by the wheel pinned in Pipfile.lock; mismatches surface as libcudnn.so.X errors at import time.
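
One way to make this check concrete is to derive the cuDNN major version from the package version string and compare it with the major the wheel expects. The sketch below is illustrative only: the helper names are hypothetical, and the expected major has to be looked up from the pinned TensorFlow wheel, since it is not stated in this thread.

```python
def cudnn_major(package_version: str) -> int:
    """Extract the major version from an RPM-style version like '9.5.1.17-1'."""
    upstream = package_version.split("-", 1)[0]  # drop the release suffix
    return int(upstream.split(".", 1)[0])


def expected_soname(package_version: str) -> str:
    """Name of the shared library a package with this major would provide."""
    return f"libcudnn.so.{cudnn_major(package_version)}"


def matches_wheel(package_version: str, wheel_cudnn_major: int) -> bool:
    """True when the installed major equals the major the wheel was built for."""
    return cudnn_major(package_version) == wheel_cudnn_major


# The version comes from the comment above; the wheel's major is an assumption
# that the caller must supply after checking Pipfile.lock.
print(expected_soname("9.5.1.17-1"))
```

If `matches_wheel` returns `False`, the import-time `libcudnn.so.X` error described above would be the likely symptom.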

Contributor

openshift-ci bot commented Aug 1, 2025

@jiridanek: The following test failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/runtimes-ubi9-e2e-tests | 761478f | link | true | /test runtimes-ubi9-e2e-tests |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Member Author

/kfbuild odh-workbench-jupyter-tensorflow-cuda-py311-ubi9-on-pull-request

Member Author

/kfbuild odh-workbench-jupyter-tensorflow-cuda-py311-ubi9-on-pull-request

Member

atheo89 commented Aug 4, 2025

Not sure if we can remove this layer in the already distributed images 🤔
For the upcoming py312 images, I totally agree with removing it.

jiridanek changed the title from "Issue opendatahub-io/notebooks#385: decrease Python 3.11 CUDA images size by installing less" to "RHAIENG-568: Issue opendatahub-io/notebooks#385: decrease Python 3.11 CUDA images size by installing less" on Aug 8, 2025
openshift-ci bot added size/l and removed size/l labels on Aug 8, 2025
jiridanek merged commit 3c5d2f5 into opendatahub-io:main on Aug 11, 2025
41 of 46 checks passed
jiridanek deleted the jd_remove_cudad_useless branch on August 11, 2025 11:28
Development

Successfully merging this pull request may close these issues.

Install only CUDA runtime into images instead of entire CUDA SDK
2 participants