5 changes: 2 additions & 3 deletions plugins/filter_kubernetes/kube_conf.h
@@ -74,11 +74,10 @@
#define SERVICE_NAME_SOURCE_MAX_LEN 64

/*
- * Configmap used for verifying whether if FluentBit is
- * on EKS or native Kubernetes
+ * Namespace and token path used for verifying whether FluentBit is
+ * on EKS or native Kubernetes by inspecting serviceaccount token issuer
*/
#define KUBE_SYSTEM_NAMESPACE "kube-system"
Comment on lines +77 to 80

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Clarify comment accuracy regarding namespace usage.

The updated comment states "Namespace and token path used for verifying..." but based on the new JWT issuer inspection approach described in the PR and AI summary, the detection logic only reads and parses the serviceaccount token to check the issuer field—it doesn't appear to query or use the KUBE_SYSTEM_NAMESPACE for platform verification itself.

Please verify whether KUBE_SYSTEM_NAMESPACE is actually utilized in the new detection flow, or if it's a legacy constant that remains for other purposes. If the namespace is no longer part of platform detection, consider updating the comment to reflect this more accurately.

Run the following script to verify how KUBE_SYSTEM_NAMESPACE is used in the codebase:


🏁 Script executed:

#!/bin/bash
# Description: Check where KUBE_SYSTEM_NAMESPACE is referenced and how it's used

# Find all references to KUBE_SYSTEM_NAMESPACE
echo "=== References to KUBE_SYSTEM_NAMESPACE ==="
rg -n --type=c --type=h 'KUBE_SYSTEM_NAMESPACE' -A 3 -B 3

echo ""
echo "=== Check if it's used in determine_platform ==="
ast-grep --pattern $'determine_platform($$$) {
  $$$
}'

Length of output: 623


Update header comment to reflect JWT issuer–only detection
The KUBE_SYSTEM_NAMESPACE constant isn’t referenced by the new platform-detection logic; revise the comment to mention only the serviceaccount token path and issuer inspection.

🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kube_conf.h around lines 77 to 80, the header
comment incorrectly mentions namespace-based detection though the new
platform-detection only checks the serviceaccount token path and inspects the
token issuer; update the comment to remove reference to
namespace/namespace-based detection and state that detection is performed solely
by reading the serviceaccount token and checking its issuer field (JWT
issuer-only detection).

-#define AWS_AUTH_CONFIG_MAP "aws-auth"
Contributor

Should we keep this as a fallback for old clusters? Or does the issuer exist on both old and new clusters, so that it can be the definitive mechanism for determining the platform?


/*
* Possible platform values for Kubernetes plugin
106 changes: 97 additions & 9 deletions plugins/filter_kubernetes/kubernetes_aws.c
@@ -23,6 +23,7 @@
#include <fluent-bit/flb_jsmn.h>
#include <fluent-bit/flb_record_accessor.h>
#include <fluent-bit/flb_ra_key.h>
#include <fluent-bit/flb_utils.h>

#include "kube_conf.h"
#include "kube_meta.h"
@@ -282,19 +283,106 @@ int fetch_pod_service_map(struct flb_kube *ctx, char *api_server_url,
return 0;
}

-/* Determine platform by checking aws-auth configmap */
+/* Determine platform by checking serviceaccount token issuer */
int determine_platform(struct flb_kube *ctx)
{
int ret;
-    char *config_buf;
-    size_t config_size;

-    ret = get_api_server_configmap(ctx, KUBE_SYSTEM_NAMESPACE, AWS_AUTH_CONFIG_MAP, &config_buf, &config_size);
-    if (ret != -1) {
-        flb_free(config_buf);
-        return 1;
char *token_buf = NULL;
size_t token_size;
char *payload = NULL;
size_t payload_len;
char *issuer_start, *issuer_end;

/* Read serviceaccount token */
ret = flb_utils_read_file(FLB_KUBE_TOKEN, &token_buf, &token_size);
if (ret != 0 || !token_buf) {
return -1;
}

/* JWT tokens have 3 parts separated by dots: header.payload.signature */
char *first_dot = strchr(token_buf, '.');
Contributor

Make sure variable declarations are at the top of the function: https://github.com/fluent/fluent-bit/blob/master/CONTRIBUTING.md#variable-definitions
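
For illustration, the dot-splitting step could be written with every local declared before the first statement. This is only a sketch: `jwt_payload_span` is a hypothetical helper name that does not exist in the PR, and it only locates the payload segment without copying or decoding it.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the PR): locate the payload segment of a
 * JWT of the form header.payload.signature. All locals are declared at the
 * top of the function, before any statement.
 * Returns 0 on success and -1 if fewer than two dots are present.
 */
static int jwt_payload_span(const char *token, const char **start, size_t *len)
{
    const char *first_dot;
    const char *second_dot;

    first_dot = strchr(token, '.');
    if (!first_dot) {
        return -1;
    }

    second_dot = strchr(first_dot + 1, '.');
    if (!second_dot) {
        return -1;
    }

    /* The payload is everything between the two dots */
    *start = first_dot + 1;
    *len = (size_t) (second_dot - (first_dot + 1));
    return 0;
}
```

Returning a pointer and length into the token buffer, rather than a copy, would also avoid the intermediate `payload_b64` allocation, provided the decoder accepts an explicit input length.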

if (!first_dot) {
flb_free(token_buf);
return -1;
}

char *second_dot = strchr(first_dot + 1, '.');
if (!second_dot) {
flb_free(token_buf);
return -1;
}

/* Extract and decode the payload (middle part) */
size_t payload_b64_len = second_dot - (first_dot + 1);
char *payload_b64 = flb_malloc(payload_b64_len + 1);
if (!payload_b64) {
flb_free(token_buf);
return -1;
}

memcpy(payload_b64, first_dot + 1, payload_b64_len);
payload_b64[payload_b64_len] = '\0';

/* Base64 decode the payload */
payload = flb_malloc(payload_b64_len * 3 / 4 + 4); /* Conservative size estimate */
if (!payload) {
flb_free(token_buf);
flb_free(payload_b64);
return -1;
}

ret = flb_base64_decode((unsigned char *)payload, payload_b64_len * 3 / 4 + 4,
&payload_len, (unsigned char *)payload_b64, payload_b64_len);
Comment on lines +334 to +335

P1: Service account JWT is decoded with standard Base64 instead of Base64URL

determine_platform decodes the service account token payload via flb_base64_decode, which only accepts standard Base64 characters (+ and /) and padding. Kubernetes service-account tokens are JWTs encoded with Base64URL (- and _, typically unpadded), so this call fails with FLB_BASE64_ERR_INVALID_CHARACTER and the function always returns -1. As a result EKS environments are never detected and the platform stays k8s, breaking Explore Related tagging on EKS clusters. Use a Base64URL decoder or normalize the payload before decoding.
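
As a rough sketch of the "Base64URL decoder" option, the following standalone routine decodes the URL-safe alphabet (`-` and `_`) without requiring padding. The names `b64url_value` and `b64url_decode` are hypothetical and are not Fluent Bit APIs. The alternative mentioned above, rewriting `-` to `+` and `_` to `/` and re-adding `=` padding before calling the existing decoder, would work equally well.

```c
#include <stddef.h>

/* Map one Base64URL character to its 6-bit value, or -1 if invalid. */
static int b64url_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '-') return 62;
    if (c == '_') return 63;
    return -1;
}

/*
 * Hypothetical helper (not a Fluent Bit API): decode Base64URL input,
 * padded or unpadded, into out. out_size should be at least
 * (in_len * 3) / 4 bytes.
 * Returns 0 on success, -1 on invalid input or insufficient space.
 */
static int b64url_decode(const char *in, size_t in_len,
                         unsigned char *out, size_t out_size, size_t *out_len)
{
    size_t i;
    size_t written = 0;
    unsigned int acc = 0;
    int bits = 0;
    int v;

    /* Tolerate optional trailing '=' padding */
    while (in_len > 0 && in[in_len - 1] == '=') {
        in_len--;
    }

    /* A single leftover character cannot encode a full byte */
    if (in_len % 4 == 1) {
        return -1;
    }

    for (i = 0; i < in_len; i++) {
        v = b64url_value((unsigned char) in[i]);
        if (v < 0) {
            return -1;
        }
        acc = (acc << 6) | (unsigned int) v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            if (written >= out_size) {
                return -1;
            }
            out[written++] = (unsigned char) ((acc >> bits) & 0xFF);
            /* Keep only the bits that have not been emitted yet */
            acc &= (1u << bits) - 1u;
        }
    }

    *out_len = written;
    return 0;
}
```

If this were used in `determine_platform`, the output buffer would still need one extra byte for the terminating NUL that the existing code writes at `payload[payload_len]`.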


flb_free(token_buf);
flb_free(payload_b64);

if (ret != 0) {
flb_free(payload);
return -1;
}

payload[payload_len] = '\0';

/* Look for "iss" field in the JSON payload */
issuer_start = strstr(payload, "\"iss\":");
if (!issuer_start) {
flb_free(payload);
return -1;
}

/* Skip to the value part */
issuer_start = strchr(issuer_start, ':');
if (!issuer_start) {
flb_free(payload);
return -1;
}
-    return -1;
issuer_start++;

/* Skip whitespace and opening quote */
while (*issuer_start == ' ' || *issuer_start == '\t') issuer_start++;
if (*issuer_start != '"') {
flb_free(payload);
return -1;
}
issuer_start++;

/* Find closing quote */
issuer_end = strchr(issuer_start, '"');
if (!issuer_end) {
flb_free(payload);
return -1;
}
Comment on lines +347 to +375

🛠️ Refactor suggestion | 🟠 Major

Replace manual JSON parsing with a proper parser.

The current string-based parsing for the "iss" field is fragile and may fail or produce false positives in the following scenarios:

  • The "iss" field appears in a nested JSON object
  • The issuer value contains escaped quotes
  • Whitespace separates the key from the colon (e.g., "iss" : "..."), so the literal "iss": search finds nothing

Since the file already includes flb_jsmn.h (line 23), use the jsmn JSON parser for robust and reliable extraction of the issuer field.

Consider applying this approach:

/* Parse JSON payload using jsmn */
jsmn_parser parser;
jsmntok_t tokens[64];  /* Adjust size as needed */
int token_count;

jsmn_init(&parser);
token_count = jsmn_parse(&parser, payload, payload_len, tokens, sizeof(tokens)/sizeof(tokens[0]));

if (token_count < 0) {
    flb_free(payload);
    return -1;
}

/* Find and extract "iss" field value */
char *issuer = NULL;
size_t issuer_len = 0;

for (int i = 1; i < token_count; i++) {
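    /* Require an exact 3-byte match so empty or shorter keys such as "is" are not accepted */
    if (tokens[i].type == JSMN_STRING &&
        tokens[i].end - tokens[i].start == 3 &&
        strncmp(payload + tokens[i].start, "iss", 3) == 0) {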
        /* Next token is the value */
        if (i + 1 < token_count && tokens[i + 1].type == JSMN_STRING) {
            issuer = payload + tokens[i + 1].start;
            issuer_len = tokens[i + 1].end - tokens[i + 1].start;
            break;
        }
    }
}

if (!issuer) {
    flb_free(payload);
    return -1;
}
🤖 Prompt for AI Agents
plugins/filter_kubernetes/kubernetes_aws.c lines 347-375: Replace the brittle
string-based "iss" extraction with jsmn parsing: initialize a jsmn_parser, parse
payload using payload_len into a tokens array (resize tokens if needed), check
token_count for errors and free payload on failure, then iterate tokens to find
a JSMN_STRING token whose content exactly equals "iss" (use
tokens[i].start/end), verify the next token exists and is a string, and set
issuer_start = payload + tokens[i+1].start and issuer_end = payload +
tokens[i+1].end; on any failure free payload and return -1; this handles nested
objects and escaped characters correctly.


/* Check if issuer contains EKS OIDC URL pattern */
/* EKS OIDC URLs follow pattern: https://oidc.eks.{region}.amazonaws.com/id/{cluster-id} */
if (strstr(issuer_start, "oidc.eks.") && strstr(issuer_start, ".amazonaws.com/id/")) {
flb_free(payload);
return 1; /* EKS detected */
}

flb_free(payload);
return -1; /* Not EKS */
Comment on lines +377 to +385

⚠️ Potential issue | 🔴 Critical

Fix pattern matching to search within issuer value only.

The current implementation uses strstr(issuer_start, ...) which searches the entire remaining payload string, not just the issuer value between issuer_start and issuer_end. This could produce false positives if the EKS patterns appear elsewhere in the JSON payload after the issuer field.

Apply this diff to limit the pattern search to the issuer value:

-    /* Check if issuer contains EKS OIDC URL pattern */
-    /* EKS OIDC URLs follow pattern: https://oidc.eks.{region}.amazonaws.com/id/{cluster-id} */
-    if (strstr(issuer_start, "oidc.eks.") && strstr(issuer_start, ".amazonaws.com/id/")) {
-        flb_free(payload);
-        return 1; /* EKS detected */
-    }
+    /* Check if issuer contains EKS OIDC URL pattern */
+    /* EKS OIDC URLs follow pattern: https://oidc.eks.{region}.amazonaws.com/id/{cluster-id} */
+    size_t issuer_len = issuer_end - issuer_start;
+    char *issuer_value = flb_strndup(issuer_start, issuer_len);
+    if (!issuer_value) {
+        flb_free(payload);
+        return -1;
+    }
+    
+    int is_eks = (strstr(issuer_value, "oidc.eks.") && strstr(issuer_value, ".amazonaws.com/id/"));
+    flb_free(issuer_value);
+    
+    if (is_eks) {
+        flb_free(payload);
+        return 1; /* EKS detected */
+    }
🤖 Prompt for AI Agents
In plugins/filter_kubernetes/kubernetes_aws.c around lines 377 to 385, the code
currently calls strstr on issuer_start which searches the rest of the payload
and can hit patterns outside the issuer value; instead limit the search to only
the issuer substring between issuer_start and issuer_end. Extract or reference
the issuer value length (issuer_end - issuer_start) and either copy that range
into a NUL-terminated buffer or use a bounded search (e.g., memmem with the
length) to look for "oidc.eks." and ".amazonaws.com/id/" only inside that issuer
value, then free any temporary buffer and return 1 for EKS or -1 otherwise.
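
A bounded search along these lines would avoid the temporary `flb_strndup` copy. Since `memmem` is a GNU extension, this sketch uses a small portable loop instead. `bounded_find` and `issuer_is_eks` are hypothetical names introduced for illustration, not functions from the PR.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: find needle inside the first hay_len bytes of hay.
 * Portable stand-in for memmem(), which is a GNU extension.
 */
static const char *bounded_find(const char *hay, size_t hay_len,
                                const char *needle)
{
    size_t needle_len = strlen(needle);
    size_t i;

    if (needle_len == 0 || needle_len > hay_len) {
        return NULL;
    }
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0) {
            return hay + i;
        }
    }
    return NULL;
}

/* Returns 1 if the issuer span looks like an EKS OIDC URL, 0 otherwise. */
static int issuer_is_eks(const char *issuer, size_t issuer_len)
{
    return bounded_find(issuer, issuer_len, "oidc.eks.") != NULL &&
           bounded_find(issuer, issuer_len, ".amazonaws.com/id/") != NULL;
}
```

With the existing variables, the call would presumably be `issuer_is_eks(issuer_start, (size_t) (issuer_end - issuer_start))`, so text after the closing quote of the issuer value is never inspected.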

}

/* Gather pods list information from Kubelet */