Skip to content

Conversation

@aaronsteers
Copy link
Contributor

@aaronsteers aaronsteers commented Nov 13, 2025

feat: Add bearer token authentication support (do not merge)

Summary

This PR adds support for bearer token authentication to PyAirbyte's Cloud integration as an alternative to OAuth2 client credentials flow.

Status: DRAFT - Implementation in progress

This is an early draft PR created to get visibility and early feedback. No code changes have been implemented yet.

Planned Changes:

  1. Add bearer_token parameter to CloudWorkspace class
  2. Add validation to ensure only one auth method is used (client_id+secret OR bearer_token, mutually exclusive)
  3. Add environment variable support for bearer token (AIRBYTE_CLOUD_BEARER_TOKEN)
  4. Update get_airbyte_server_instance() to support bearer token authentication
  5. Update _make_config_api_request() to use bearer token if provided
  6. Add public create_oauth_token() method to CloudWorkspace to generate and return bearer tokens
  7. Update documentation with bearer token usage examples
  8. Add tests for bearer token authentication

Review & Testing Checklist for Human

  • No code to review yet - this is an early draft PR
  • Once implemented, verify that bearer token and client credentials are mutually exclusive
  • Test that bearer token authentication works end-to-end with Cloud API
  • Verify environment variable resolution works correctly
  • Test the new create_oauth_token() method

Notes

Summary by CodeRabbit

  • New Features

    • Added bearer-token authentication as an alternative to OAuth2 client credentials across the SDK.
    • Public APIs and workspace operations now accept an optional bearer token; client ID/secret are optional and validated so exactly one auth method is used.
    • Cloud workspaces can be created with a bearer token and can generate an OAuth2 token from client credentials.
  • Documentation

    • Added guidance and an env-var helper to resolve bearer tokens from the environment.
  • Tests / Chores

    • Updated token helper renaming and related tests to reflect the new token retrieval name.

✏️ Tip: You can customize this high-level summary in your review settings.

@devin-ai-integration
Copy link
Contributor

Original prompt from AJ Steers
@Devin - Does PyAirbyte still have the ability to authenticate to Cloud using a bearer token directly rather than passing client ID & secret?
Thread URL: https://airbytehq-team.slack.com/archives/D089P0UPVT4/p1763009638868299

@devin-ai-integration
Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Copy link

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1763010134-bearer-token-auth' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1763010134-bearer-token-auth'

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test-pr - Runs tests with the updated PyAirbyte

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

📝 Edit this welcome message.

@github-actions
Copy link

github-actions bot commented Nov 13, 2025

PyTest Results (Fast Tests Only, No Creds)

348 tests  ±0   348 ✅ ±0   5m 32s ⏱️ -34s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit f64f941. ± Comparison against base commit 441105e.

♻️ This comment has been updated with latest results.

@aaronsteers aaronsteers marked this pull request as ready for review November 13, 2025 05:11
- Add bearer_token parameter to CloudWorkspace class as alternative to client credentials
- Add validation to ensure only one auth method is used (client_id+secret OR bearer_token)
- Add CLOUD_BEARER_TOKEN_ENV_VAR constant and resolve_cloud_bearer_token() function
- Update get_airbyte_server_instance() to support bearer token authentication
- Update _make_config_api_request() to use bearer token if provided
- Add create_oauth_token() public method to CloudWorkspace
- Update all api_util functions to accept optional bearer_token parameter
- Update CloudWorkspace docstring with bearer token usage examples
- Add type: ignore comments for optional client_id/client_secret parameters

Co-Authored-By: AJ Steers <[email protected]>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 13, 2025

📝 Walkthrough

Walkthrough

Add optional bearer-token authentication across Cloud clients and utilities. CloudWorkspace and many API wrappers now accept a mutually-exclusive bearer_token (or client credentials), propagate it through API calls, and CloudWorkspace gains create_oauth_token() with validation enforcing a single auth method.

Changes

Cohort / File(s) Change summary
Core API util
airbyte/_util/api_util.py
Add optional bearer_token, make client_id/client_secret optional, enforce mutual-exclusion, select bearer vs client_credentials flow, rename get_bearer_tokenget_new_bearer_token, update _make_config_api_request and many public wrappers to accept/forward bearer_token.
Cloud workspace model & ops
airbyte/cloud/workspaces.py
CloudWorkspace now accepts `bearer_token: SecretString
Cloud auth helpers & constants
airbyte/cloud/auth.py, airbyte/constants.py
Add resolve_cloud_bearer_token() helper and CLOUD_BEARER_TOKEN_ENV_VAR constant for env-based bearer token resolution.
Cloud clients: connections / connectors / sync results
airbyte/cloud/connections.py, airbyte/cloud/connectors.py, airbyte/cloud/sync_results.py
Thread bearer_token=self.workspace.bearer_token into many api_util calls; no public signature changes; minor lint comment removals.
MCP cloud ops
airbyte/mcp/cloud_ops.py
Import resolve_cloud_bearer_token; short-circuit CloudWorkspace construction to use bearer_token when present instead of client-credentials path.
Public API wrappers (various)
airbyte/_util/api_util.py (wrappers like list_connections, list_workspaces, create_source, create_destination, get_job_info, get_connection, check_connector, custom YAML ops, etc.)
Many wrappers now accept `bearer_token: SecretString
Tests
tests/integration_tests/cloud/test_cloud_api_util.py
Rename import and call from get_bearer_tokenget_new_bearer_token.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant User as SDK Consumer
participant Workspace as CloudWorkspace
participant APIUtil as api_util.get_airbyte_server_instance
participant Airbyte as Airbyte Server

User->>Workspace: construct(workspace_id, client_id?, client_secret?, bearer_token?)
Workspace->>Workspace: validate exactly-one-auth
alt bearer_token provided
    Workspace->>APIUtil: get_airbyte_server_instance(api_root, bearer_token=token)
    APIUtil->>Airbyte: HTTP requests with Authorization: Bearer <token>
else client credentials provided
    Workspace->>Workspace: (optional) create_oauth_token() -> token
    Workspace->>APIUtil: get_airbyte_server_instance(api_root, client_id, client_secret)
    APIUtil->>Airbyte: obtain/configure token via client_credentials flow
end
User->>Workspace: call operation (e.g., list_connections)
Workspace->>APIUtil: make request(with bearer_token/or token)
APIUtil->>Airbyte: HTTP request with chosen auth
Airbyte-->>APIUtil: response
APIUtil-->>Workspace: return data
Workspace-->>User: return result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~30–90 minutes

Potential areas requiring extra attention:

  • airbyte/_util/api_util.py: mutual-exclusion validation, token selection priority, header construction, and renaming of get_bearer_tokenget_new_bearer_token.
  • airbyte/cloud/workspaces.py: CloudWorkspace constructor changes, create_oauth_token() behavior, and enforcement of exactly-one-auth.
  • Call sites using # type: ignore[arg-type] where Optional creds are passed — verify runtime typing and behavior.
  • airbyte/mcp/cloud_ops.py: short-circuit path when env-provided bearer token exists.
  • Broad propagation: ensure all wrappers consistently forward bearer_token and preserve backward compatibility.

Would you like a focused test checklist to validate both auth flows and edge cases, wdyt?

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately and concisely describes the main change: adding bearer token authentication support to PyAirbyte's Cloud integration.
Docstring Coverage ✅ Passed Docstring coverage is 96.63% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch devin/1763010134-bearer-token-auth

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
airbyte/cloud/connectors.py (1)

162-168: Bearer-token auth path breaks here
When a workspace is instantiated with only bearer_token, both client_id and client_secret stay None, so this call reaches api_util.check_connector with no usable credentials at all. The new bearer-token flow then 401s instead of succeeding. Could we thread bearer_token=self.workspace.bearer_token (as you did for the workspace methods) through this and the other connector API calls so the new auth mode actually works, wdyt?

         result = api_util.check_connector(
             workspace_id=self.workspace.workspace_id,
             connector_type=self.connector_type,
             actor_id=self.connector_id,
             api_root=self.workspace.api_root,
             client_id=self.workspace.client_id,  # type: ignore[arg-type]
             client_secret=self.workspace.client_secret,  # type: ignore[arg-type]
+            bearer_token=self.workspace.bearer_token,
         )
airbyte/cloud/workspaces.py (1)

348-378: Bearer-token deletions 401
permanently_delete_source and permanently_delete_destination never forward self.bearer_token, so bearer-token-only workspaces can’t delete resources—they send neither client creds nor a token to the Config API. Could we add bearer_token=self.bearer_token to these api_util.delete_* calls to keep the new auth mode consistent, wdyt?

         api_util.delete_source(
             source_id=source.connector_id if isinstance(source, CloudSource) else source,
             api_root=this.api_root,
             client_id=self.client_id,  # type: ignore[arg-type]
             client_secret=self.client_secret,  # type: ignore[arg-type]
+            bearer_token=self.bearer_token,
         )
...
         api_util.delete_destination(
             destination_id=(destination if isinstance(destination, str) else destination.destination_id),
             api_root=self.api_root,
             client_id=self.client_id,  # type: ignore[arg-type]
             client_secret=self.client_secret,  # type: ignore[arg-type]
+            bearer_token=self.bearer_token,
         )
airbyte/_util/api_util.py (1)

1389-1395: Don't drop bearer_token during safe-mode checks

Line 1391: If callers supply only bearer_token (with client_id/client_secret unset), this safe-mode lookup hits get_custom_yaml_source_definition without any auth, so get_airbyte_server_instance raises PyAirbyteInputError and the delete path breaks for bearer-token users. Could we forward the provided bearer token here? wdyt?

         definition_info = get_custom_yaml_source_definition(
             workspace_id=workspace_id,
             definition_id=definition_id,
             api_root=api_root,
             client_id=client_id,
             client_secret=client_secret,
+            bearer_token=bearer_token,
         )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7eb746b and 49a96aa.

📒 Files selected for processing (6)
  • airbyte/_util/api_util.py (43 hunks)
  • airbyte/cloud/auth.py (1 hunks)
  • airbyte/cloud/connections.py (6 hunks)
  • airbyte/cloud/connectors.py (12 hunks)
  • airbyte/cloud/workspaces.py (16 hunks)
  • airbyte/constants.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
airbyte/cloud/auth.py (2)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/secrets/util.py (1)
  • try_get_secret (33-60)
airbyte/cloud/workspaces.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/_util/api_util.py (1)
  • get_bearer_token (1041-1066)
airbyte/_util/api_util.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/cloud/workspaces.py (1)
  • list_connections (466-492)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)

@devin-ai-integration devin-ai-integration bot changed the title feat: Add bearer token authentication support (do not merge) feat: Add bearer token authentication support Nov 13, 2025
…ules

- Add bearer_token parameter to all api_util calls in connectors.py (13 calls)
- Add bearer_token parameter to all api_util calls in connections.py (7 calls)
- Add bearer_token parameter to all api_util calls in sync_results.py (6 calls)
- Ensures bearer token authentication works throughout cloud integration

Co-Authored-By: AJ Steers <[email protected]>
@github-actions
Copy link

github-actions bot commented Nov 13, 2025

PyTest Results (Full)

416 tests  ±0   391 ✅  - 8   23m 16s ⏱️ - 1m 12s
  1 suites ±0    17 💤 ±0 
  1 files   ±0     8 ❌ +8 

For more details on these failures, see this check.

Results for commit f64f941. ± Comparison against base commit 441105e.

♻️ This comment has been updated with latest results.

@aaronsteers
Copy link
Contributor Author

Devin, update from main. Also do a thorough check to make sure we didn't miss anything.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
airbyte/_util/api_util.py (3)

1069-1117: Type safety concern with the type: ignore comments.

The # type: ignore[arg-type] comments on lines 1081-1082 suggest there's a type mismatch. The function signature allows None for client_id and client_secret, but get_bearer_token expects non-None SecretString values.

While the mutual exclusivity validation in get_airbyte_server_instance ensures this scenario won't occur for most calls, _make_config_api_request can be called directly by check_connector and get_connector_builder_project_for_definition_id.

Would it be safer to add an explicit validation here instead of using type ignores? Something like:

    if bearer_token is None:
+        if client_id is None or client_secret is None:
+            raise PyAirbyteInputError(
+                message="Must provide either bearer_token or both client_id and client_secret.",
+            )
        bearer_token = get_bearer_token(
-            client_id=client_id,  # type: ignore[arg-type]
-            client_secret=client_secret,  # type: ignore[arg-type]
+            client_id=client_id,
+            client_secret=client_secret,
            api_root=api_root,
        )

This would eliminate the need for type ignores and make the function more defensive. Wdyt?


1389-1395: Critical: Missing bearer_token propagation in safe_mode check.

In the safe_mode validation block, the call to get_custom_yaml_source_definition doesn't pass the bearer_token parameter. This will cause authentication failures when users rely on bearer token authentication.

Apply this diff to fix:

        definition_info = get_custom_yaml_source_definition(
            workspace_id=workspace_id,
            definition_id=definition_id,
            api_root=api_root,
            client_id=client_id,
            client_secret=client_secret,
+            bearer_token=bearer_token,
        )

866-883: Inconsistent parameter optionality may prevent bearer token usage.

In create_connection (and similarly in get_connection_by_name, delete_connection, patch_connection, and the custom YAML definition functions), client_id and client_secret are required parameters (typed as SecretString), while bearer_token is optional (typed as SecretString | None).

This makes it impossible to call these functions with only a bearer token, which contradicts the PR's goal of supporting bearer token as an alternative to client credentials.

Based on the past review comment, it seems the intent is to make all three authentication parameters explicitly optional (SecretString | None), allowing callers to choose which auth method to provide. Wdyt about updating these signatures?

For example:

 def create_connection(  # noqa: PLR0913  # Too many arguments
     name: str,
     *,
     source_id: str,
     destination_id: str,
     api_root: str,
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,
     workspace_id: str | None = None,
     prefix: str,
     selected_stream_names: list[str],
 ) -> models.ConnectionResponse:

The mutual exclusivity validation in get_airbyte_server_instance will still enforce that exactly one auth method is provided.

🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)

77-119: Nice mutual exclusivity validation!

The authentication validation logic looks solid. The checks correctly enforce that exactly one auth method must be provided, and the error messages are clear.

One small edge case to consider: if someone provides only client_id without client_secret (or vice versa), the current logic would treat it as "no client creds" and pass if bearer_token is provided. If neither is provided, the user gets "Must provide either client credentials or bearer_token" which might be slightly confusing since they did provide a client_id.

Would it be worth adding an explicit check that if either client_id or client_secret is provided, both must be provided? Something like:

    has_client_creds = client_id is not None and client_secret is not None
    has_bearer = bearer_token is not None
+
+    # Validate that partial client credentials aren't provided
+    if (client_id is not None or client_secret is not None) and not has_client_creds:
+        raise PyAirbyteInputError(
+            message="Both client_id and client_secret must be provided together for "
+            "client credentials authentication.",
+        )

    if has_client_creds and has_bearer:

Wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 97392bf and 6b1a26f.

📒 Files selected for processing (1)
  • airbyte/_util/api_util.py (43 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: yohannj
Repo: airbytehq/PyAirbyte PR: 716
File: airbyte/logs.py:384-402
Timestamp: 2025-07-11T19:53:44.427Z
Learning: In the PyAirbyte project, when reviewing PRs, maintain clear separation of concerns. Don't suggest changes that are outside the scope of the PR's main objective, even if they would improve consistency or fix other issues. This helps with reviewing changes and potential reverts.
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (2)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)

devin-ai-integration bot and others added 5 commits December 11, 2025 08:36
Resolved conflicts by:
- Combining bearer_token params with main's new features (offset/order_by in get_job_logs)
- Preserving main's safe_mode deletion logic while adding bearer_token support
- Keeping main's bug fix (destination_id instead of source_id in connections.py)
- Adding bearer_token to all new safe_mode delete functions

Co-Authored-By: AJ Steers <[email protected]>
- Import resolve_cloud_bearer_token in cloud_ops.py
- Update _get_cloud_workspace() to prefer bearer token auth
- Falls back to client credentials if bearer token not available
- Addresses PR comment about resolve_cloud_bearer_token() not being called

Co-Authored-By: AJ Steers <[email protected]>
… parameters

- Remove keyword-only marker (*) from function signature
- Remove default values (= None) from client_id, client_secret, bearer_token
- Forces explicit passing of all auth parameters (even if None)
- Addresses PR comment requesting explicit parameter passing for correctness

Co-Authored-By: AJ Steers <[email protected]>
…ocstrings

- Pass bearer_token parameter through instead of hardcoding None
- Add bearer_token to docstrings for all affected functions
- Update function signatures to accept SecretString | None for consistency
- Fixes type checking errors and unused argument warnings
- Addresses PR comment about ensuring bearer_token is properly threaded

Co-Authored-By: AJ Steers <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
airbyte/_util/api_util.py (8)

35-44: Duplicate import block detected

There's a duplicate TYPE_CHECKING block with from collections.abc import Callable appearing twice (lines 36 and 40). Would you want to consolidate these into a single block, wdyt?

 if TYPE_CHECKING:
     from collections.abc import Callable
-
-
-if TYPE_CHECKING:
-    from collections.abc import Callable
 
     from airbyte_api.models import (
         DestinationConfiguration,
     )

983-1001: Signature inconsistency – client_id/client_secret required even when bearer_token provided

The parameters client_id and client_secret are required (SecretString), while bearer_token is optional. However, get_airbyte_server_instance enforces mutual exclusivity — if a caller provides bearer_token along with (required) credentials, it will raise an error.

Should these be SecretString | None = None to align with the mutual exclusivity model used elsewhere in this file, wdyt?

 def create_connection(  # noqa: PLR0913  # Too many arguments
     name: str,
     *,
     source_id: str,
     destination_id: str,
     api_root: str,
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,

1025-1041: Same signature inconsistency as create_connection

client_id and client_secret are required here too, making it impossible to use bearer-token-only authentication for this function.

 def get_connection_by_name(
     workspace_id: str,
     connection_name: str,
     *,
     api_root: str,
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,

1077-1086: Same signature inconsistency

delete_connection also requires credentials even when bearer_token is provided.

 def delete_connection(
     connection_id: str,
     connection_name: str | None = None,
     *,
     api_root: str,
     workspace_id: str,
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,

1155-1162: Same signature inconsistency

patch_connection has the same issue with required credentials.

 def patch_connection(  # noqa: PLR0913  # Too many arguments
     connection_id: str,
     *,
     api_root: str,
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,

1300-1307: Same signature inconsistency for check_connector

 def check_connector(
     *,
     actor_id: str,
     connector_type: Literal["source", "destination"],
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,

1385-1394: Custom YAML functions share the same signature inconsistency

The functions create_custom_yaml_source_definition, list_custom_yaml_source_definitions, get_custom_yaml_source_definition, update_custom_yaml_source_definition, and delete_custom_yaml_source_definition all require client_id and client_secret while making bearer_token optional.

Would you consider making credentials optional across all of these to enable bearer-token-only usage, wdyt?


1621-1629: Same signature inconsistency for get_connector_builder_project_for_definition_id

 def get_connector_builder_project_for_definition_id(
     *,
     workspace_id: str,
     definition_id: str,
     api_root: str,
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1579a1d and bbff833.

📒 Files selected for processing (1)
  • airbyte/_util/api_util.py (71 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/cloud/workspaces.py (1)
  • list_connections (603-629)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: preview_docs
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (4)
airbyte/_util/api_util.py (4)

81-126: LGTM! Clean mutual exclusivity enforcement

The authentication validation logic is well-structured — it properly enforces that exactly one authentication method must be provided, and the error messages are clear about what's expected. The security mode selection is clean too.


378-407: Changes look good

The _ = workspace_id pattern here is appropriate since the parameter is kept for API consistency but not actually used in this function's implementation.


1662-1707: LGTM!

The signature here correctly has client_id and client_secret as SecretString | None, which is consistent with supporting bearer-token-only authentication.


1713-1928: LGTM! Consistent signatures in organization and workspace utilities

The remaining functions (list_organizations_for_user, list_workspaces_in_organization, get_workspace_organization_info, get_connection_state, get_connection_catalog) all correctly use optional credentials, allowing for bearer-token-only authentication.

Comment on lines 1259 to 1264
if bearer_token is None:
bearer_token = get_bearer_token(
client_id=client_id, # type: ignore[arg-type]
client_secret=client_secret, # type: ignore[arg-type]
api_root=api_root,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential runtime issue when both bearer_token and credentials are None

If bearer_token is None and either client_id or client_secret is also None, get_bearer_token will receive None where it expects SecretString. The # type: ignore[arg-type] comments suppress the warning but don't prevent the runtime failure.

Would you consider adding validation here similar to get_airbyte_server_instance, or at least an assertion before calling get_bearer_token, wdyt?

     if bearer_token is None:
+        if client_id is None or client_secret is None:
+            raise PyAirbyteInputError(
+                message="Must provide either bearer_token or both client_id and client_secret.",
+            )
         bearer_token = get_bearer_token(
-            client_id=client_id,  # type: ignore[arg-type]
-            client_secret=client_secret,  # type: ignore[arg-type]
+            client_id=client_id,
+            client_secret=client_secret,
             api_root=api_root,
         )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if bearer_token is None:
bearer_token = get_bearer_token(
client_id=client_id, # type: ignore[arg-type]
client_secret=client_secret, # type: ignore[arg-type]
api_root=api_root,
)
if bearer_token is None:
if client_id is None or client_secret is None:
raise PyAirbyteInputError(
message="Must provide either bearer_token or both client_id and client_secret.",
)
bearer_token = get_bearer_token(
client_id=client_id,
client_secret=client_secret,
api_root=api_root,
)
🤖 Prompt for AI Agents
In airbyte/_util/api_util.py around lines 1259 to 1264, the code calls
get_bearer_token while silencing type errors with "# type: ignore[arg-type]"
which can lead to runtime failures if client_id or client_secret are None; add a
validation step before calling get_bearer_token (following the pattern used in
get_airbyte_server_instance) that asserts or raises a clear ValueError when
client_id or client_secret is missing, and only call get_bearer_token when both
credentials are present (and remove the type: ignore comments).

aaronsteers and others added 2 commits December 12, 2025 13:37
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/_util/api_util.py (1)

981-1000: create_connection() / check_connector() still effectively require client creds even when bearer_token is provided—intended?

Right now, a bearer-token-only caller can’t omit client_id/client_secret (they’re required in the signature), but passing both creds + bearer triggers the mutual-exclusion error. If the goal is “either auth method”, should these switch to optional creds like the other wrappers, wdyt?

@@
 def create_connection(  # noqa: PLR0913  # Too many arguments
@@
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,
@@
 def check_connector(
@@
-    client_id: SecretString,
-    client_secret: SecretString,
+    client_id: SecretString | None = None,
+    client_secret: SecretString | None = None,
     bearer_token: SecretString | None = None,

Also applies to: 1298-1326

♻️ Duplicate comments (2)
airbyte/_util/api_util.py (2)

1219-1245: _make_config_api_request() can crash + may leak secrets in error context; can we validate + redact + add timeouts?

  1. The # type: ignore[arg-type] is masking a real runtime hazard when creds are missing.
  2. Including response.request.body / response.__dict__ in exception context is very likely to capture connector credentials/PII (and conflicts with the project’s “don’t bake dynamic/PII into error payloads” posture). Based on learnings, please keep messages/context telemetry-safe.
  3. Outbound requests should have timeouts to avoid hanging.
@@
 def get_bearer_token(
@@
-    response = requests.post(
+    response = requests.post(
         url=api_root + "/applications/token",
@@
         json={
             "client_id": client_id,
             "client_secret": client_secret,
-        }
+        },
+        timeout=30,
     )
@@
 def _make_config_api_request(
@@
     config_api_root = get_config_api_root(api_root)
     if bearer_token is None:
+        if client_id is None or client_secret is None:
+            raise PyAirbyteInputError(
+                message="Must provide either bearer_token or both client_id and client_secret.",
+            )
         bearer_token = get_bearer_token(
-            client_id=client_id,  # type: ignore[arg-type]
-            client_secret=client_secret,  # type: ignore[arg-type]
+            client_id=client_id,
+            client_secret=client_secret,
             api_root=api_root,
         )
@@
     response = requests.request(
         method="POST",
         url=full_url,
         headers=headers,
         json=json,
+        timeout=30,
     )
@@
             raise AirbyteError(
                 message=error_message,
                 context={
                     "full_url": full_url,
                     "config_api_root": config_api_root,
                     "path": path,
                     "status_code": response.status_code,
-                    "url": response.request.url,
-                    "body": response.request.body,
-                    "response": response.__dict__,
+                    # Avoid logging request/response bodies which may contain secrets/PII.
                 },
             ) from ex

Also applies to: 1247-1295


285-286: Please drop the reintroduced _ = workspace_id no-ops (regression).

These are misleading “unused var” suppressions, and you do use workspace_id (or at least include it in error context). Can we remove them again, wdyt?

@@
-    _ = workspace_id
     airbyte_instance: airbyte_api.AirbyteAPI = get_airbyte_server_instance(
@@
-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(
@@
-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(
@@
-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(

Also applies to: 335-336, 386-387, 423-424

🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)

132-146: Bearer-token threading is consistent; do you want to remove = None defaults on internal wrappers to force explicit auth?

Given get_airbyte_server_instance() now errors when all auth inputs are None, keeping defaults on many wrappers makes it easy to call them incorrectly; would you prefer requiring callers to pass auth explicitly (no = None), similar to the earlier reviewer intent, wdyt?

Also applies to: 167-188, 219-240, 451-484, 540-560, 579-593, 609-670, 685-705, 752-768, 791-805, 839-901, 916-936, 1023-1039, 1075-1138, 1153-1191, 1383-1410, 1420-1434, 1456-1471, 1495-1511, 1539-1600, 1619-1656, 1711-1736, 1751-1800, 1824-1860, 1863-1892, 1895-1927

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bbff833 and e7973fd.

📒 Files selected for processing (1)
  • airbyte/_util/api_util.py (71 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-06T23:44:31.534Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-06T23:44:31.534Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.

Applied to files:

  • airbyte/_util/api_util.py
🧬 Code graph analysis (1)
airbyte/_util/api_util.py (2)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (1)
airbyte/_util/api_util.py (1)

81-126: Auth mutual-exclusion logic is solid, and the bearer_auth field usage is correct for airbyte-api 0.53.0.

The code correctly uses models.Security(bearer_auth=str(bearer_token)) — the field name matches the SDK's expected kwarg, and converting SecretString to string is the right approach for the string-typed bearer token field.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/cloud/workspaces.py (1)

156-169: Bearer-token auth path seems broken for get_organization(): bearer token isn’t passed. If a workspace is constructed with only bearer_token, _organization_info currently calls get_workspace_organization_info(...) without it, so org lookup will likely fail. Should we thread bearer_token=self.bearer_token here too, wdyt?

         return api_util.get_workspace_organization_info(
             workspace_id=self.workspace_id,
             api_root=self.api_root,
             client_id=self.client_id,
             client_secret=self.client_secret,
+            bearer_token=self.bearer_token,
         )
🧹 Nitpick comments (2)
airbyte/cloud/workspaces.py (2)

245-251: Avoid repeated auth plumbing + type: ignore[arg-type] by centralizing auth kwargs. Would you be open to adding a small helper like _auth_kwargs() (returning either {bearer_token: ...} or {client_id: ..., client_secret: ...}) and then using **self._auth_kwargs() at call sites, so we can drop the type: ignore noise and reduce drift risk, wdyt?

Also applies to: 275-279, 360-368, 416-424, 451-459, 484-496, 531-542, 579-588, 613-621, 641-649, 669-677, 770-778, 808-814, 841-848


254-279: create_oauth_token() return-only vs “set on workspace” semantics. Right now it returns a token but doesn’t store it on self.bearer_token; should we either (a) clearly document “return-only” intent, or (b) optionally set self.bearer_token (e.g., set_on_workspace: bool = False) for convenience, wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e7973fd and ed5b844.

📒 Files selected for processing (1)
  • airbyte/cloud/workspaces.py (16 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/cloud/workspaces.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/_util/api_util.py (1)
  • get_bearer_token (1219-1244)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)

Copy link
Contributor

@aldogonzalez8 aldogonzalez8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APPROVED

CLOUD_CLIENT_SECRET_ENV_VAR: str = "AIRBYTE_CLOUD_CLIENT_SECRET"
"""The environment variable name for the Airbyte Cloud client secret."""

CLOUD_BEARER_TOKEN_ENV_VAR: str = "AIRBYTE_CLOUD_BEARER_TOKEN"
Copy link
Contributor

@aldogonzalez8 aldogonzalez8 Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks correct to me.
===============================QA===============================
Scenario:
For Organization Aldo Alberto Jesus Gonzalez Campos's Organization (4ab08a42-189b-4133-9049-841d5c34e83c) obtained a token through application/token endpoint:

curl --request POST \
  --url https://cloud.airbyte.com/api/v1/applications/token \
  --header 'content-type: application/json' \
  --data '{
  "client_id": "{{client_id}}",
  "client_secret": "{{client_secret}}"
}'

In airbyte_mcp.env, I commented out AIRBYTE_CLOUD_CLIENT_ID, AIRBYTE_CLOUD_CLIENT_SECRET, and added AIRBYTE_CLOUD_BEARER_TOKEN with the value from the previous request.

# AIRBYTE_CLOUD_CLIENT_SECRET=obfuscated
# AIRBYTE_CLOUD_WORKSPACE_ID=obfuscated
AIRBYTE_CLOUD_BEARER_TOKEN=obfuscated

After that created a server_config_local.json with local PyAirbyte (checked out from AJ's PR)

{
  "mcpServers": {
    "airbyte": {
      "command": "uvx",
      "args": [
        "--python=3.11",
        "--from",
        "/Users/aldogonzalez/dev/workspaces/support-agent-coral/PyAirbyte",
        "airbyte-mcp"
      ],
      "env": {
        "AIRBYTE_MCP_ENV_FILE": "/Users/aldogonzalez/.mcp/airbyte_mcp.env",
        "AIRBYTE_CLOUD_MCP_SAFE_MODE": "1",
        "AIRBYTE_CLOUD_MCP_READONLY_MODE": "0"
      }
    }
  }
}

After that initiated claude with mcp config:

claude --mcp-config /Users/aldogonzalez/.mcp/server_config_local.json

Finally asked to access WS in Aldo's organization and WS out of Aldo's organization (Marathon client):

> can you list my airbyte connections in 3a6efb89-8c44-4277-b865-8d009245a886 WS?

⏺ airbyte - list_deployed_cloud_connections (MCP)(workspace_id: "3a6efb89-8c44-4277-b865-8d009245a886")
  ⎿  {
       "result": [
         {
     … +25 lines (ctrl+o to expand)

⏺ Here are the connections in workspace 3a6efb89-8c44-4277-b865-8d009245a886:

  | Name                                  | Connection ID                        | Source ID                            | Destination ID                       |
  |---------------------------------------|--------------------------------------|--------------------------------------|--------------------------------------|
  | Sample Data (Faker) - s3-staging-aldo | 092afed7-743e-4b90-aec7-f2668a1e41f3 | 6542b00d-9889-4871-ae02-e6711b09dc78 | a4bf5e3e-0734-4581-a7ef-e41ac499622d |
  | Sample Data (Faker) - s3-staging-aldo | e8e3cc4b-6f48-49fd-b908-ca9761c36a7b | 11cdb83a-11fd-4006-bcf8-503c756184aa | a4bf5e3e-0734-4581-a7ef-e41ac499622d |

  Both connections share the same name and destination, but have different sources. Neither has any job history recorded yet.

> can you list my airbyte connections in c23ea6a8-ab44-4a74-9bba-47ae3a8ce904 ws?

⏺ airbyte - list_deployed_cloud_connections (MCP)(workspace_id: "c23ea6a8-ab44-4a74-9bba-47ae3a8ce904")
  ⎿  Error: Error calling tool 'list_deployed_cloud_connections': API error occurred: Status 403
     {"status":403,"type":"https://reference.airbyte.com/reference/errors#forbidden","title":"forbidden","detail":"The request is forbidden.","documentationUrl":null,"data":{"message":"Caller does not have the required WORKSPACE_READER
     permissions to access the resource(s)."}}

⏺ I don't have access to workspace c23ea6a8-ab44-4a74-9bba-47ae3a8ce904. The API returned a 403 Forbidden error indicating that the configured credentials don't have WORKSPACE_READER permissions for that workspace.

  You may need to:
  1. Verify the workspace ID is correct
  2. Ensure your Airbyte Cloud API credentials have access to this workspace
  3. Check if the workspace belongs to a different organization than your current credentials

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devin-ai-integration
Copy link
Contributor

❌ Cannot revive Devin session - the session is too old. Please start a new session instead.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
airbyte/cloud/workspaces.py (1)

163-168: Missing bearer_token parameter in API call.

The call to api_util.get_workspace_organization_info is missing the bearer_token parameter, which was added to the function signature. This will cause a runtime error.

     return api_util.get_workspace_organization_info(
         workspace_id=self.workspace_id,
         api_root=self.api_root,
         client_id=self.client_id,
         client_secret=self.client_secret,
+        bearer_token=self.bearer_token,
     )
airbyte/_util/api_util.py (1)

1247-1262: Critical: signature expects non-optional params but callers pass optionals.

The signature at lines 1252-1254 expects SecretString (non-optional), but callers throughout the file pass SecretString | None. This causes the pipeline type errors. Additionally, line 1257's condition doesn't guard against None values before calling get_bearer_token, which could cause a runtime error.

Should we update the signature to accept optional params and add validation similar to get_airbyte_server_instance, wdyt?

 def _make_config_api_request(
     *,
     api_root: str,
     path: str,
     json: dict[str, Any],
-    client_id: SecretString,
-    client_secret: SecretString,
-    bearer_token: SecretString,
+    client_id: SecretString | None,
+    client_secret: SecretString | None,
+    bearer_token: SecretString | None,
 ) -> dict[str, Any]:
     config_api_root = get_config_api_root(api_root)
-    if client_id and client_secret and not bearer_token:
+    if bearer_token is None:
+        if client_id is None or client_secret is None:
+            raise PyAirbyteInputError(
+                message="Must provide either bearer_token or both client_id and client_secret.",
+            )
         bearer_token = get_bearer_token(
             client_id=client_id,
             client_secret=client_secret,
             api_root=api_root,
         )
♻️ Duplicate comments (4)
airbyte/cloud/workspaces.py (2)

53-64: Doc example leaks the bearer token to stdout/logs.

The snippet print(f"Bearer token: {token}") will print the raw token (since SecretString is a str subclass); should we switch to masked repr ({token!r}) or avoid printing the token entirely, wdyt?

-print(f"Bearer token: {token}")
+print(f"Bearer token: {token!r}")  # masked

121-149: Public typing likely too strict for CloudWorkspace(...) inputs.

The examples pass str, but the dataclass fields are SecretString | None, which will trip mypy/pyright users; should we widen to str | SecretString | None (and still wrap to SecretString in __post_init__), wdyt?

-    client_id: SecretString | None = None
-    client_secret: SecretString | None = None
-    bearer_token: SecretString | None = None
+    client_id: str | SecretString | None = None
+    client_secret: str | SecretString | None = None
+    bearer_token: str | SecretString | None = None

Also, should we add a dedicated error when only one of client_id / client_secret is provided (vs the current generic "Must provide either…"), and reject empty-string secrets via SecretString(...).is_empty(), wdyt?

airbyte/_util/api_util.py (2)

423-423: workspace_id used in error context on line 442.

-    _ = workspace_id

285-285: Leftover no-op assignment – workspace_id used on line 298.

The line _ = workspace_id was supposedly addressed in commit 199b916 per past review comments, but it's still here. Since workspace_id is actually used later (line 298 in the request), this dummy assignment should be removed, wdyt?

-    _ = workspace_id
🧹 Nitpick comments (2)
airbyte/cloud/workspaces.py (2)

245-251: Consider type narrowing over type: ignore comments.

The type: ignore[arg-type] comments suppress legitimate type concerns. Since __post_init__ guarantees exactly one auth method is set, could we use assertions or typing.cast() to narrow types instead of silencing the checker, wdyt?

For example:

assert (self.client_id is not None and self.client_secret is not None) or self.bearer_token is not None

254-279: LGTM with a minor documentation suggestion.

The implementation correctly validates the auth method and generates a token. Should the docstring clarify whether this creates a fresh token on each call (vs caching), wdyt?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed5b844 and db8ba66.

📒 Files selected for processing (2)
  • airbyte/_util/api_util.py (71 hunks)
  • airbyte/cloud/workspaces.py (16 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-08T15:34:31.026Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.

Applied to files:

  • airbyte/_util/api_util.py
🧬 Code graph analysis (2)
airbyte/cloud/workspaces.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/_util/api_util.py (1)
  • get_bearer_token (1219-1244)
airbyte/_util/api_util.py (3)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
airbyte/cloud/workspaces.py (1)
  • list_connections (603-629)
🪛 GitHub Actions: Run Linters
airbyte/cloud/workspaces.py

[error] 218-218: Missing argument bearer_token in function airbyte._util.api_util.get_workspace_organization_info

airbyte/_util/api_util.py

[error] 1323-1323: Argument SecretString | None is not assignable to parameter client_id with type SecretString in function _make_config_api_request [bad-argument-type]


[error] 1324-1324: Argument SecretString | None is not assignable to parameter client_secret with type SecretString in function _make_config_api_request [bad-argument-type]


[error] 1325-1325: Argument SecretString | None is not assignable to parameter bearer_token with type SecretString in function _make_config_api_request [bad-argument-type]


[error] 1653-1653: Argument SecretString | None is not assignable to parameter client_id with type SecretString in function _make_config_api_request [bad-argument-type]


[error] 1654-1654: Argument SecretString | None is not assignable to parameter client_secret with type SecretString in function _make_config_api_request [bad-argument-type]


[error] 1655-1655: Argument SecretString | None is not assignable to parameter bearer_token with type SecretString in function _make_config_api_request [bad-argument-type]

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (1)
airbyte/_util/api_util.py (1)

81-126: LGTM on the mutual-exclusion validation.

The auth method validation correctly enforces exactly one method (client credentials OR bearer token). The security object construction follows the right pattern.

Note: A past review comment suggested removing the = None defaults to force explicit parameter passing, but that's a style preference—current approach works fine too, wdyt?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (4)
airbyte/_util/api_util.py (4)

335-335: Same pattern – remove unused-variable assignment.

Similar to line 285, this _ = workspace_id incorrectly suggests the variable is unused when it's actually referenced on line 348. This was also marked as addressed in a past review but remains present, wdyt about removing it?

-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(

386-386: Remove misleading assignment (same pattern).

Like the previous cases, this _ = workspace_id is misleading. While workspace_id isn't directly used in the function body currently, the no-op assignment should be removed. Was this perhaps overlooked when merging commit 199b916, wdyt?

-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(

423-423: Remove misleading assignment (last occurrence).

Final instance of the _ = workspace_id pattern. The variable is used in the error context on line 442, so this assignment is misleading. Should be removed like the others, wdyt?

-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(

285-285: Remove misleading unused-variable assignment.

This _ = workspace_id line suggests the variable is unused, but workspace_id is actually passed to ListSourcesRequest on line 298. Past reviews marked this as addressed in commit 199b916, but it appears to still be present. Should we remove it for clarity, wdyt?

-    _ = workspace_id
     airbyte_instance: airbyte_api.AirbyteAPI = get_airbyte_server_instance(
🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)

1257-1262: Consider adding defensive validation for auth parameters.

Past reviews noted that if bearer_token is None and either client_id or client_secret is also None, the function would construct an invalid Bearer None authorization header. While current callers validate through get_airbyte_server_instance, adding validation here would make the code more defensive. Would you consider adding a check similar to the pattern in get_airbyte_server_instance, wdyt?

     config_api_root = get_config_api_root(api_root)
-    if client_id and client_secret and not bearer_token:
+    if bearer_token is None:
+        if client_id is None or client_secret is None:
+            raise PyAirbyteInputError(
+                message="Must provide either bearer_token or both client_id and client_secret.",
+            )
         bearer_token = get_new_bearer_token(
             client_id=client_id,
             client_secret=client_secret,
             api_root=api_root,
         )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between db8ba66 and 4f05142.

📒 Files selected for processing (5)
  • airbyte/_util/api_util.py (72 hunks)
  • airbyte/cloud/connections.py (8 hunks)
  • airbyte/cloud/connectors.py (16 hunks)
  • airbyte/cloud/workspaces.py (17 hunks)
  • tests/integration_tests/cloud/test_cloud_api_util.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • airbyte/cloud/connections.py
  • airbyte/cloud/connectors.py
  • airbyte/cloud/workspaces.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-08T15:34:31.026Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.

Applied to files:

  • airbyte/_util/api_util.py
🧬 Code graph analysis (2)
tests/integration_tests/cloud/test_cloud_api_util.py (2)
airbyte/_util/api_util.py (1)
  • get_new_bearer_token (1219-1244)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/_util/api_util.py (2)
airbyte/secrets/base.py (1)
  • SecretString (38-143)
airbyte/exceptions.py (1)
  • PyAirbyteInputError (201-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (4)
airbyte/_util/api_util.py (3)

81-126: LGTM! Mutual exclusivity logic is well-implemented.

The authentication validation cleanly enforces exactly one auth method (client credentials OR bearer token), which aligns perfectly with the PR objectives. The error messages are clear and actionable.


1219-1244: LGTM! Function rename improves clarity.

Renaming from get_bearer_token to get_new_bearer_token better conveys that this generates a new token rather than retrieving a cached one. Implementation remains solid.


132-1927: Consistent bearer token propagation throughout the file.

All API utility functions have been consistently updated to accept and propagate the bearer_token parameter. The pattern is uniform and well-integrated across the entire module.

tests/integration_tests/cloud/test_cloud_api_util.py (1)

18-18: LGTM! Test updated to match renamed function.

The import and usage correctly reflect the function rename from get_bearer_token to get_new_bearer_token. The test continues to validate the same authentication behavior.

Also applies to: 256-256

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte/_util/api_util.py (1)

376-404: Remove unused workspace_id parameter from get_connection.

The parameter is never referenced in the function body. Since callers are explicitly passing this argument (in airbyte/cloud/sync_results.py, airbyte/cloud/connections.py, and airbyte/_util/api_util.py), removing it requires updating those call sites as well. Want to go ahead with this cleanup?

♻️ Duplicate comments (3)
airbyte/_util/api_util.py (3)

422-422: Remove leftover no-op assignment.

This _ = workspace_id assignment is unnecessary and was flagged in previous reviews. The variable is used in the error context on line 441. Should be removed, wdyt?

-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(

335-335: Remove leftover no-op assignment.

Similar to line 285, this _ = workspace_id assignment is unnecessary since workspace_id is used later (line 348). Should be removed, wdyt?

-    _ = workspace_id
     airbyte_instance = get_airbyte_server_instance(

285-285: Remove leftover no-op assignment.

This _ = workspace_id assignment was flagged in previous reviews as unnecessary since workspace_id is actually used later in the function (line 298). Should be removed, wdyt?

-    _ = workspace_id
     airbyte_instance: airbyte_api.AirbyteAPI = get_airbyte_server_instance(
🧹 Nitpick comments (1)
airbyte/_util/api_util.py (1)

1246-1294: Consider adding validation for missing authentication.

The logic correctly generates a bearer token when client credentials are provided, but if all three auth parameters are None, line 1264 would use bearer_token=None resulting in "Bearer None" in the Authorization header. While upstream validation in get_airbyte_server_instance prevents this, functions calling _make_config_api_request directly (like check_connector) might bypass that validation.

Would you consider adding a validation check similar to the one in get_airbyte_server_instance, wdyt?

 def _make_config_api_request(
     *,
     api_root: str,
     path: str,
     json: dict[str, Any],
     client_id: SecretString | None,
     client_secret: SecretString | None,
     bearer_token: SecretString | None,
 ) -> dict[str, Any]:
     config_api_root = get_config_api_root(api_root)
+    
+    has_client_creds = client_id is not None and client_secret is not None
+    has_bearer = bearer_token is not None
+    
+    if not has_client_creds and not has_bearer:
+        raise PyAirbyteInputError(
+            message="Must provide either client credentials (client_id + client_secret) "
+            "or bearer_token for authentication.",
+        )
+    
     if client_id and client_secret and not bearer_token:
         bearer_token = get_new_bearer_token(
             client_id=client_id,
             client_secret=client_secret,
             api_root=api_root,
         )
📜 Review details

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4f05142 and f64f941.

📒 Files selected for processing (1)
  • airbyte/_util/api_util.py (72 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-10-08T15:34:31.026Z
Learnt from: aaronsteers
Repo: airbytehq/PyAirbyte PR: 411
File: airbyte/cli.py:111-160
Timestamp: 2024-10-08T15:34:31.026Z
Learning: In PyAirbyte, error messages in functions like `_resolve_source_job` in `airbyte/cli.py` are designed to decouple the message text from dynamic values, following a structlog-inspired design. Dynamic values are provided via parameters like `input_value`. This approach helps avoid including PII in the message strings, which may be used in telemetry.

Applied to files:

  • airbyte/_util/api_util.py
🪛 GitHub Actions: Run Linters
airbyte/_util/api_util.py

[warning] 377-377: ARG001 Unused function argument: workspace_id

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.11, Windows)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (All, Python 3.10, Windows)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (3)
airbyte/_util/api_util.py (3)

81-126: Nice authentication refactor!

The mutual exclusivity validation between bearer token and client credentials is well-implemented. The error messages are clear, and the security model construction properly handles both authentication paths.


132-161: Excellent consistency in bearer token propagation!

The bearer token parameter has been consistently added across all API utility functions with the same pattern: bearer_token: SecretString | None = None, and properly passed through to get_airbyte_server_instance or _make_config_api_request. The docstrings are appropriately updated where needed. This makes the authentication change clean and maintainable.

Also applies to: 167-216, 219-266, 269-316, 319-370, 376-404, 407-444, 450-503, 506-533, 539-575, 578-605, 608-681, 684-723, 751-787, 790-835, 838-912, 915-954, 980-1019, 1022-1057, 1074-1149, 1152-1212, 1297-1340, 1382-1416, 1419-1452, 1455-1491, 1494-1535, 1538-1615, 1618-1656, 1659-1704, 1710-1747, 1750-1820, 1823-1859, 1862-1891, 1894-1926


1218-1243: Good naming improvement!

Renaming to get_new_bearer_token makes it clear that this generates a fresh token rather than retrieving an existing one. Nice clarification!

devin-ai-integration bot added a commit that referenced this pull request Dec 17, 2025
…on support

This commit consolidates authentication handling for Airbyte Cloud API:

- Add CloudCredentials dataclass in airbyte/cloud/credentials.py that
  encapsulates auth configuration (client_id/secret or bearer_token)
- Add bearer token authentication support to api_util functions
- Add resolve_cloud_credentials() in MCP _util.py for multi-source
  credential resolution (explicit params -> HTTP headers -> env vars)
- Add HTTP header extraction helpers for MCP HTTP/SSE transport
- Update CloudWorkspace to support bearer token authentication
- Export CloudCredentials from airbyte.cloud module

This replaces PRs #866, #867, and #914 with a unified implementation.

Co-Authored-By: AJ Steers <[email protected]>
@aaronsteers
Copy link
Contributor Author

Replaced by #916

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants