
Conversation

ChuanFF
Contributor

@ChuanFF ChuanFF commented Jun 4, 2025

There is a spelling error in the k8s discovery module. post_List should be post_list.

Description

There is a spelling error in the k8s service discovery module code. As shown below, the condition in apisix/discovery/kubernetes/informer_factory.lua should check informer.post_list, not informer.post_List. Because of this typo, the post_list function is never executed.

    informer.fetch_state = "list finished"
    if informer.post_List then
        informer:post_list()
    end

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

Spelling error. post_List should be post_list
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. bug Something isn't working labels Jun 4, 2025
@ChuanFF ChuanFF changed the title fix: typos in k8s service discovery code fix: typos in k8s discovery code Jun 4, 2025
@Baoyuantop
Contributor

Hi @ChuanFF, you need to add a test for this change.

@ChuanFF
Contributor Author

ChuanFF commented Jun 5, 2025

@Baoyuantop it's just a typo; is a test really necessary?

@Baoyuantop
Contributor

The current bug causes the post_list function to not be executed.

We need to verify this.

@ChuanFF
Contributor Author

ChuanFF commented Jun 6, 2025

There are two typos in the following code:
informer.pre_List should be informer.pre_list,
informer.post_List should be informer.post_list.

    if informer.pre_List then
        informer:pre_list()
    end
    ok, reason, message = list(httpc, apiserver, informer)
    if not ok then
        informer.fetch_state = "list failed"
        core.log.error("list failed, kind: ", informer.kind,
                       ", reason: ", reason, ", message : ", message)
        return false
    end
    informer.fetch_state = "list finished"
    if informer.post_List then
        informer:post_list()
    end

Function Behavior:

  • pre_list calls shared_dict:flush_all() (docs), which expires all keys in the shared dictionary but doesn't physically delete them.
  • post_list calls shared_dict:flush_expired() (docs), which physically deletes expired keys from the shared dictionary.
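
The interaction between the two calls can be illustrated with a toy model. This is a Python sketch for illustration only; ToySharedDict and its fields are hypothetical stand-ins, since the real ngx.shared.DICT lives in Nginx shared memory and is implemented in C:

```python
import time

class ToySharedDict:
    """A minimal model of ngx.shared.DICT expiry semantics (illustration only)."""
    def __init__(self):
        self._store = {}  # key -> (value, expires_at); expires_at 0 means "never"

    def set(self, key, value, ttl=0):
        self._store[key] = (value, time.time() + ttl if ttl else 0)

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at and expires_at <= time.time():
            return None  # expired keys are invisible to readers
        return value

    def flush_all(self):
        # Like pre_list: mark every key expired; nothing is freed yet.
        now = time.time()
        self._store = {k: (v, now) for k, (v, _) in self._store.items()}

    def flush_expired(self):
        # Like post_list: physically delete the expired keys.
        now = time.time()
        self._store = {k: (v, e) for k, (v, e) in self._store.items()
                       if not e or e > now}

d = ToySharedDict()
d.set("endpoint/a", "10.0.0.1:80")
d.flush_all()
assert d.get("endpoint/a") is None   # invisible as soon as flush_all runs
assert "endpoint/a" in d._store      # but the slot is still occupied
d.flush_expired()
assert "endpoint/a" not in d._store  # now it is physically gone
```

This also makes the window visible: between flush_all in pre_list and the re-population during list, every lookup returns nothing even though flush_expired has not yet run.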

Issue After Fixing Typos:

When the typos are corrected, a new problem emerges:
The flush_all function seems to actually remove 3 key-value pairs of data from the shared_dict (code reference). This creates a critical window between the pre_list call and the moment new endpoints are populated. During this interval:

  1. some endpoint data in the shared dictionary will be missing
  2. requests may fail due to unavailable endpoint information

Therefore, this typo bug does not seem to be directly fixable. I'm also not sure whether the behavior of shared_dict:flush_all() deleting some data is itself a bug.

@Baoyuantop
Contributor

Hi @ChuanFF, I think we can optimize this by modifying the pre_list and post_list methods:

  • pre_list only records the existing data keys and does not clean up any data.
  • post_list compares the old and new data and removes old data that no longer exists.

@ChuanFF
Contributor Author

ChuanFF commented Jun 23, 2025

@Baoyuantop if we want to remove old data that no longer exists, the problem is how to record the existing data keys. ngx.shared.DICT.get_keys does not seem suitable (the official documentation carries a caution: Avoid calling this method on dictionaries with a very large number of keys as it may lock the dictionary for significant amount of time and block Nginx worker processes trying to access the dictionary.)

I think we can prepare two tables, data_history and data_existing: record data into data_existing during list and watch, assign data_existing to data_history in pre_list (resetting data_existing to {}), then compare data_history with data_existing in post_list and remove old data that no longer exists.

Test cases seem difficult to write: dirty data is generated only after the watch ends and before the next list call, and I do not know how to verify that the old data no longer exists. Do you have any suggestions?

Or can we ignore the dirty data? After all, the conditions for generating dirty data are rarely met, and it seems to have little practical effect.

@Baoyuantop
Contributor

A Lua table cannot share data between multiple workers.

@ChuanFF
Contributor Author

ChuanFF commented Jul 2, 2025

The Lua table is just used to record keys so that dirty data in the shared_dict can be cleared.

@Baoyuantop Baoyuantop moved this from 👀 In review to 📋 Backlog in ⚡️ Apache APISIX Roadmap Jul 29, 2025

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Aug 31, 2025
@Baoyuantop Baoyuantop removed the stale label Sep 1, 2025
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Sep 1, 2025
@Baoyuantop
Contributor

Hi @ChuanFF, before continuing to complete the code, could you please describe your current modification plan in the description? This will help other maintainers review this PR.

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Sep 4, 2025
@ChuanFF
Contributor Author

ChuanFF commented Sep 4, 2025

@Baoyuantop Sorry, I've been trying to fix this issue in my own repository for the past couple of days and didn't realize the PR would sync here as well. I have now completed the code changes. The approach is:

In the pre_list function, I record the existing keys from the shared_dict into a temporary table handle.existing_keys. Then, when APISIX lists Kubernetes endpoints/endpointSlices, I record the keys written to the shared_dict into another temporary table handle.current_keys_hash. Essentially, handle.existing_keys represents the old list of keys, while handle.current_keys_hash represents the new list of keys. Finally, in the post_list function, I compare the old and new data to identify and remove any dirty data.
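
The old-vs-new key comparison described above can be sketched roughly as follows. This is a Python sketch of the idea only; the actual change is Lua in informer_factory.lua. The field names existing_keys and current_keys_hash come from this comment, while on_item_written and delete_key are hypothetical helpers standing in for the informer's write path and shared_dict deletion:

```python
def pre_list(handle, current_shared_keys):
    # Snapshot the keys already in the shared dict before re-listing.
    handle["existing_keys"] = set(current_shared_keys)
    handle["current_keys_hash"] = set()

def on_item_written(handle, key):
    # Record every key written while listing endpoints/endpointSlices.
    handle["current_keys_hash"].add(key)

def post_list(handle, delete_key):
    # Keys seen before the list but not re-written during it are stale.
    stale = handle["existing_keys"] - handle["current_keys_hash"]
    for key in sorted(stale):
        delete_key(key)
    handle["existing_keys"] = set()
    handle["current_keys_hash"] = set()
    return stale

# Usage sketch, with a plain dict standing in for the shared dict:
store = {"ep/default/a": "10.0.0.1:80", "ep/default/b": "10.0.0.2:80"}
handle = {}
pre_list(handle, store.keys())
store["ep/default/a"] = "10.0.0.1:80"  # only "a" still exists in the cluster
on_item_written(handle, "ep/default/a")
removed = post_list(handle, store.pop)
# removed == {"ep/default/b"}; store now holds only "ep/default/a"
```

Unlike the flush_all approach, existing data stays readable for the whole list phase; only keys that were not re-observed are deleted at the end.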

@ChuanFF
Contributor Author

ChuanFF commented Sep 4, 2025

Additionally, the test case requires waiting for the list query to finish, so a 2-second sleep was added in the location /t block. I plan to submit an additional PR that improves the function _M.status_ready() in apisix/init.lua to check whether APISIX's Kubernetes service discovery is ready. Once that is implemented, the 2-second sleep will be replaced with a wait for the readiness check.

@ChuanFF
Contributor Author

ChuanFF commented Sep 12, 2025

Hi @ChuanFF, before continuing to complete the code, could you please describe your current modification plan in the description? This will help other maintainers review this PR.

@Baoyuantop The modification has been completed. Can we start the review?

@Baoyuantop Baoyuantop self-requested a review September 12, 2025 08:31
@Baoyuantop Baoyuantop moved this from 📋 Backlog to 🏗 In progress in ⚡️ Apache APISIX Roadmap Sep 12, 2025
@Baoyuantop Baoyuantop moved this from 🏗 In progress to 👀 In review in ⚡️ Apache APISIX Roadmap Sep 12, 2025
Labels
bug Something isn't working size:L This PR changes 100-499 lines, ignoring generated files.
Projects
Status: 👀 In review