SDN-4168: Add IPsec resilience tests #29232
Force-pushed from 49571f9 to 50854b0.
@pperiyasamy: This pull request references SDN-4168 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.
/retest
Force-pushed from 50854b0 to a6da4a6.
/retest
Job Failure Risk Analysis for sha: a6da4a6
Force-pushed from a6da4a6 to 6e19beb.
/payload-job ?
@pperiyasamy: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.
/payload-job periodic-ci-openshift-release-master-nightly-4.19-e2e-ovn-ipsec-step-registry
@pperiyasamy: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
/assign @jluhrsen
This commit adds tests to ensure that pod traffic across nodes is not impacted by multiple restarts of the IPsec daemonset pods. Signed-off-by: Periyasamy Palanisamy <[email protected]>
Force-pushed from 6e19beb to d5d96b1.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: pperiyasamy
Job Failure Risk Analysis for sha: d5d96b1
@pperiyasamy: The following tests failed, say
	},
	Parallelism:          30,
	MaximumAllowedFlakes: 15,
	TestTimeout:          20 * time.Minute,
I think you need to have a discussion with the maintainers regarding this - did it occur on another comms channel?
Moved this test to the IPsec test suite, so the timeout change is no longer needed here.
func restartIPsecDaemonSet(oc *exutil.CLI) error {
	ds, err := getDaemonSet(oc, ovnNamespace, ovnIPsecDsName)
	if err != nil {
		return err
You may need to enrich this error with context: it could be a bare API error, and we would not be able to tell where in this func it occurred. Same at line 242.
This is now a Ginkgo helper function, so this is no longer needed.
})

g.It("check pod traffic are working across nodes after ipsec daemonset restart", func() {
	if ipsecMode != v1.IPsecModeFull {
why not do this for other modes?
The previous test is valid for all IPsec modes, but this one is valid only for Full mode. Added a comment in the code, hope that helps.
})

func createWebServerPods(oc *exutil.CLI, namespace string) []corev1.Pod {
	g.GinkgoHelper()
what does this do?
/*
GinkgoHelper marks the function it's called in as a test helper. When a failure occurs inside a helper function, Ginkgo will skip the helper when analyzing the stack trace to identify where the failure occurred.
This is an alternative, simpler, mechanism to passing in a skip offset when calling Fail or using Gomega.
*/
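The "skip offset" mentioned in that doc comment can be illustrated with the standard library alone. The sketch below is not how Ginkgo is implemented, and `reportedCaller`, `helperFailureSite` and `testBody` are hypothetical names. It only shows how skipping one extra stack frame moves the reported failure site from a helper to the function that called it.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// reportedCaller names the function `skip` frames above whoever called it:
// skip=0 is the direct caller, skip=1 is that caller's caller, and so on.
//
//go:noinline
func reportedCaller(skip int) string {
	pc := make([]uintptr, 1)
	// +2 skips the frames of runtime.Callers and reportedCaller themselves.
	if runtime.Callers(skip+2, pc) == 0 {
		return "unknown"
	}
	frame, _ := runtime.CallersFrames(pc).Next()
	// Drop the package prefix so only the function name remains.
	return frame.Function[strings.LastIndex(frame.Function, ".")+1:]
}

// helperFailureSite plays the role of a test helper that detects a failure.
//
//go:noinline
func helperFailureSite(skip int) string {
	return reportedCaller(skip)
}

// testBody plays the role of the test that calls the helper.
//
//go:noinline
func testBody(skip int) string {
	return helperFailureSite(skip)
}

func main() {
	fmt.Println(testBody(0)) // helperFailureSite
	fmt.Println(testBody(1)) // testBody
}
```

With an offset of 0 the failure is attributed to the helper; with an offset of 1 it is attributed to the test body, which is the effect `GinkgoHelper()` gives without threading an offset through every call.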
for _, targetPod := range pods {
	if sourcePod.Name == targetPod.Name {
		continue
	}
Should you also make sure to ping only pods that are on different nodes?
Yes, the sourcePod.Name == targetPod.Name check avoids the self-ping.
for i := 1; i <= 5; i++ {
	g.By(fmt.Sprintf("attempt#%d restarting IPsec pods", i))
	err := restartIPsecDaemonSet(oc)
	o.Expect(err).NotTo(o.HaveOccurred())
nit: I like to see the reason why something failed in these Expects; otherwise you have to go look at the line of code it failed on.
Oh! Yes Martin, good catch. Added restartIPsecDaemonSet as a test helper function, which will solve the problem you described.
pperiyasamy left a comment:
As you suggested, I moved this change into another PR, #29563; let's continue our review there. So I'm going to close this PR.
This PR adds resilience e2e tests which ensure that the IPsec deployment and pod traffic keep working in all such scenarios.