@JiriCtvrtka (Contributor) commented Nov 27, 2025

PMM-14442

Under heavy load with 50+ PMM clients, I was able to hit a data race in HA (see the log below). Even when HA is not enabled, the HA service is still called from main. I believe we need to add mutexes around the places that call Add, Done, etc., because they are used from multiple goroutines. With this fix I can no longer reproduce the data race. Once the race is hit, it also causes a lot of disconnects and errors in the log.

WARNING: DATA RACE
Write at 0x00c00036fb08 by goroutine 107:
  runtime.racewrite()
      <autogenerated>:1 +0x1e
  github.com/percona/pmm/managed/services/ha.(*services).Wait()
      /root/go/src/github.com/percona/pmm/managed/services/ha/services.go:100 +0x2d
  github.com/percona/pmm/managed/services/ha.(*Service).Run()
      /root/go/src/github.com/percona/pmm/managed/services/ha/highavailability.go:114 +0x23af
  main.main.func20()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:1164 +0x16f

Previous read at 0x00c00036fb08 by goroutine 108:
  runtime.raceread()
      <autogenerated>:1 +0x1e
  github.com/percona/pmm/managed/services/ha.(*services).StartAllServices()
      /root/go/src/github.com/percona/pmm/managed/services/ha/services.go:69 +0x3a9
  github.com/percona/pmm/managed/services/ha.(*Service).Run.func1()
      /root/go/src/github.com/percona/pmm/managed/services/ha/highavailability.go:103 +0x365

Goroutine 107 (running) created at:
  main.main()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:1162 +0xc908
  github.com/percona/pmm/managed/services/server.(*Server).UpdateSettingsFromEnv()
      /root/go/src/github.com/percona/pmm/managed/services/server/server.go:149 +0x6d6
  main.setup()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:501 +0x345
  main.main()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:1021 +0x9f9a
  github.com/percona/pmm/managed/services/grafana.NewClient()
      /root/go/src/github.com/percona/pmm/managed/services/grafana/client.go:81 +0x504
  main.main()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:916 +0x760e
  github.com/percona/pmm/managed/services/vmalert.NewVMAlert()
      /root/go/src/github.com/percona/pmm/managed/services/vmalert/vmalert.go:75 +0x664
  main.main()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:830 +0x577b
  main.main()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:818 +0x5384
  github.com/percona/pmm/managed/models.SetupDB()
      /root/go/src/github.com/percona/pmm/managed/models/database.go:1241 +0x877
  github.com/percona/pmm/managed/models.SetupDB()
      /root/go/src/github.com/percona/pmm/managed/models/database.go:1228 +0x186
  main.migrateDB()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:587 +0x457
  main.main()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:808 +0x4ea4

Goroutine 108 (running) created at:
  github.com/percona/pmm/managed/services/ha.(*Service).Run()
      /root/go/src/github.com/percona/pmm/managed/services/ha/highavailability.go:97 +0x1c4
  main.main.func20()
      /root/go/src/github.com/percona/pmm/managed/cmd/pmm-managed/main.go:1164 +0x16f
==================
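
The fix described above can be sketched roughly as follows. This is a minimal illustration, not the actual change to `managed/services/ha/services.go`: the `serviceGroup` type, its `Start` and `Wait` methods, and `runDemo` are hypothetical names, and it assumes the `Add`/`Done` calls mentioned in the description belong to a `sync.WaitGroup`. The mutex only serializes `Add` against `Wait`, which are the two sides that appear in the race report.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serviceGroup is a hypothetical stand-in for the HA service registry.
// It is NOT the real type from managed/services/ha/services.go.
type serviceGroup struct {
	mu sync.Mutex     // serializes Add against Wait
	wg sync.WaitGroup // counts running services
}

// Start registers one service and runs it in its own goroutine.
func (g *serviceGroup) Start(run func()) {
	g.mu.Lock()
	g.wg.Add(1) // guarded by mu, so it cannot overlap with Wait
	g.mu.Unlock()

	go func() {
		// Done does not take mu, so it cannot deadlock with Wait.
		defer g.wg.Done()
		run()
	}()
}

// Wait blocks until every service started so far has finished.
// While it holds mu, a concurrent Start blocks until Wait returns.
func (g *serviceGroup) Wait() {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.wg.Wait()
}

// runDemo starts n services from a second goroutine while the caller
// waits, mirroring the two goroutines in the race report, and returns
// how many services actually ran.
func runDemo(n int) int64 {
	var g serviceGroup
	var ran atomic.Int64

	started := make(chan struct{})
	go func() {
		defer close(started)
		for i := 0; i < n; i++ {
			g.Start(func() { ran.Add(1) })
		}
	}()

	// This Wait may run before, during, or after the Start calls.
	g.Wait()
	// Make sure every Start call has been issued.
	<-started
	// Now wait for the services themselves to finish.
	g.Wait()
	return ran.Load()
}

func main() {
	fmt.Println("services run:", runDemo(3))
}
```

Running the sketch with `go run -race` should not report a race, because `Add` and `Wait` never overlap. One trade-off of this shape is that a service which tries to start another service while `Wait` is in progress would block, so the real code may need a different locking scope.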

codecov bot commented Nov 27, 2025

Codecov Report

❌ Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.67%. Comparing base (c92fe2d) to head (7c6bf6d).

Files with missing lines          Patch %   Lines
managed/services/ha/services.go   0.00%     6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##               v3    #4784      +/-   ##
==========================================
- Coverage   45.68%   45.67%   -0.01%     
==========================================
  Files         364      364              
  Lines       37759    37765       +6     
==========================================
  Hits        17249    17249              
- Misses      18851    18857       +6     
  Partials     1659     1659              
Flag      Coverage Δ
managed   46.20% <0.00%> (-0.02%) ⬇️

@JiriCtvrtka marked this pull request as ready for review on November 27, 2025 14:35
@JiriCtvrtka requested a review from a team as a code owner on November 27, 2025 14:35
@JiriCtvrtka requested review from a team, BupycHuk, ademidoff and idoqo and removed request for a team on November 27, 2025 14:35
@JiriCtvrtka (Contributor, Author) commented

Alex will prepare a PR to fix the discovered issues. Further tuning still needs to be done.
