🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge) #2486

camilamacedo86 · 2026-02-05T13:15:47Z

Description

Reviewer Checklist

API Go Documentation
Tests: Unit Tests (and E2E Tests, if appropriate)
Comprehensive Commit Messages
Links to related GitHub Issue(s)

netlify · 2026-02-05T13:15:53Z

✅ Deploy Preview for olmv1 ready!

Name	Link
🔨 Latest commit	`efbe1f6`
🔍 Latest deploy log	https://app.netlify.com/projects/olmv1/deploys/698b804fb743d10008cb22b5
😎 Deploy Preview	https://deploy-preview-2486--olmv1.netlify.app
📱 Preview on mobile	Toggle QR Code... Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copilot

Copilot wasn't able to review any files in this pull request.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

api/v1/suite_test.go

api/v1/clustercatalog_types_test.go

internal/operator-controller/applier/boxcutter_test.go

internal/operator-controller/controllers/suite_test.go

codecov · 2026-02-05T14:22:40Z

Codecov Report

❌ Patch coverage is 33.33333% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.44%. Comparing base (28102fa) to head (efbe1f6).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
test/utils.go	0.00%	12 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2486      +/-   ##
==========================================
- Coverage   73.74%   69.44%   -4.30%     
==========================================
  Files         102      102              
  Lines        8496     8504       +8     
==========================================
- Hits         6265     5906     -359     
- Misses       1747     2129     +382     
+ Partials      484      469      -15

Flag	Coverage Δ
e2e	`45.59% <100.00%> (-0.65%)`	⬇️
experimental-e2e	`12.72% <100.00%> (-41.32%)`	⬇️
unit	`57.89% <27.77%> (-0.09%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

api/v1/suite_test.go

Copilot

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

camilamacedo86 · 2026-02-06T08:06:09Z

internal/operator-controller/applier/boxcutter.go

+	// Track in a future issue: "Generate apply configurations and migrate to typed Apply methods"
+	//
+	// nolint:staticcheck // SA1019: server-side apply required, needs generated apply configurations
 	return bc.Client.Patch(ctx, rev, client.Apply, client.FieldOwner(bc.FieldOwner), client.ForceOwnership)


TODO: Lets looking on that in a follow up shows very complex to go in this PR

perdasilva · 2026-02-06T10:29:31Z

test/utils.go

+	if lastErr != nil {
+		log.Printf("StopWithRetry: timeout reached before successful teardown; last error: %v", lastErr)
+	}
+	return env.Stop() // Final attempt


is it likely that the final attempt works? could we be calling out a false negative with the previous log line?

maybe we should just return:

fmt.Errorf("StopWithRetry: timeout reached before successful teardown; last error: %v", lastErr)

?

Yes, it works as it is. Within controller-runtime v0.23. has some timing changes in the test environment teardown process. Without using Eventually, testEnv.Stop() can fail intermittently with errors related to graceful shutdown timing.

So this code does the same of:

var _ = AfterSuite(func() { By("tearing down the test environment") cancel() Eventually(func() error { return testEnv.Stop() }, time.Minute, time.Second).Should(Succeed()) })

My question is whether we can be in a situation where

StopWithRetry: timeout reached before successful teardown; last error: ... is printed out but actually it stopped successfully in the final attempt (and whether we care)

maybe just do

err := env.Stop() // Final attempt if err != nil { log ... } return err

then you also don't need to keep the lastErr

pedjak

/lgtm

openshift-ci · 2026-02-10T08:17:51Z

New changes are detected. LGTM label has been removed.

camilamacedo86 · 2026-02-10T08:26:23Z

test/upgrade-e2e/post_upgrade_test.go

+		require.NoError(ct, err)
+		require.NotNil(ct, lease.Spec.HolderIdentity, "lease should have a holder")
+		require.NotEmpty(ct, *lease.Spec.HolderIdentity, "lease holder identity should not be empty")
+	}, 3*time.Minute, time.Second)


I asked for Claude evaluate those changes
Following bellow.

The Problem (WHY the change was needed)

Old Implementation (main branch)

t.Log("Waiting for acquired leader election") leaderCtx, leaderCancel := context.WithTimeout(ctx, 3*time.Minute) defer leaderCancel() leaderSubstrings := []string{"successfully acquired lease"} leaderElected, err := watchPodLogsForSubstring(leaderCtx, &managerPod, leaderSubstrings...) require.NoError(t, err) require.True(t, leaderElected)

The watchPodLogsForSubstring function used:

podLogOpts := corev1.PodLogOptions{ Follow: true, // <-- THIS IS THE PROBLEM Container: container, }

The Race Condition

controller-runtime v0.23.1+ has FASTER leader election startup

The test starts AFTER the pod is created

By the time the test calls watchPodLogsForSubstring, leader election has already completed

Follow: true only streams NEW logs, missing messages already emitted

The test waits forever for a message that already passed

Timeline Visualization

controller-runtime v0.22.x (SLOW startup): ├─ Pod starts ├─ Test starts watching logs (Follow=true) ├─ Leader election happens ← Message logged and caught by test ✅ └─ Test passes controller-runtime v0.23.1+ (FAST startup): ├─ Pod starts ├─ Leader election happens ← Message logged ├─ Test starts watching logs (Follow=true) ← Too late! Message missed ❌ └─ Test times out waiting for message

New Implementation (deps-up branch)

Approach 1: Direct Lease Checking (for leader election verification)

t.Logf("Waiting for acquired leader election by checking lease") // Instead of watching logs (which has timing issues in controller-runtime v0.23.1+), // directly check the lease object to confirm leader election occurred require.EventuallyWithT(t, func(ct *assert.CollectT) { var lease coordinationv1.Lease err := c.Get(ctx, types.NamespacedName{Name: "catalogd-operator-lock", Namespace: "olmv1-system"}, &lease) require.NoError(ct, err) require.NotNil(ct, lease.Spec.HolderIdentity, "lease should have a holder") require.NotEmpty(ct, *lease.Spec.HolderIdentity, "lease holder identity should not be empty") }, 3*time.Minute, time.Second)

Benefits:

✅ Timing-independent - checks the actual state, not log messages

✅ More reliable - verifies the desired outcome (lease acquired)

✅ No race conditions - lease object persists

✅ Tests the right thing - we care about the lease, not the log message

Approach 2: Polling with TailLines (for other log messages)

For messages that MUST be verified from logs (e.g., "reconcile ending"), the implementation was changed to use polling:

func watchPodLogsForSubstring(ctx context.Context, pod *corev1.Pod, substrings ...string) (bool, error) { // Use a polling approach to periodically check pod logs for the substrings // This handles controller-runtime v0.23.1+ where leader election happens faster // and the message may be emitted before we start watching with Follow ticker := time.NewTicker(2 * time.Second) defer ticker.Stop() for { select { case <-ctx.Done(): return false, ctx.Err() case <-ticker.C: // Get recent logs without Follow to check if message exists logOpts := corev1.PodLogOptions{ Container: container, TailLines: ptr.To(int64(1000)), // <-- Check EXISTING logs } // ... check logs for substring ... } } }

Benefits:

✅ Checks existing logs, not just new ones

✅ Handles messages emitted before watching starts

✅ Still timing-independent

Behavior Verification

Old Behavior

❌ Waited for "successfully acquired lease" log message

❌ Could miss fast leader elections

❌ Relied on timing

New Behavior

✅ Checks that lease.Spec.HolderIdentity is set

✅ Verifies leader election occurred (same goal, better method)

✅ Independent of timing

Are they equivalent?

YES - Both verify that leader election completed successfully:

Old: "If log says 'successfully acquired lease', then leader election worked"

New: "If lease has a HolderIdentity, then leader election worked"

The new approach is actually MORE accurate because:

It checks the actual state object, not a log message

It can't be affected by log rotation, buffering, or truncation

It's the single source of truth for leader election status

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

go.mod

test/upgrade-e2e/post_upgrade_test.go

Assisted-by: Cursor/Claude

…0.23.0 controller-runtime v0.23.0 introduced type-safe webhook builders using generics. Webhook methods now receive typed objects directly instead of runtime.Object, eliminating the need for type assertions and providing compile-time type safety. Changes: - Update ClusterCatalog webhook Default method to use *ocv1.ClusterCatalog - Update SetupWebhookWithManager to pass type to NewWebhookManagedBy - Simplify tests to leverage type safety (no type assertion tests needed) Assisted-by: Cursor/Claude

controller-runtime v0.23.0 changed test environment teardown timing behavior. Direct calls to testEnv.Stop() can fail intermittently due to graceful shutdown timing, requiring retry logic to ensure proper cleanup. Changes: - Add StopWithRetry helper function to test/utils.go with retry logic - Wrap testEnv.Stop() calls with retry helper (1-minute timeout, 1-second interval) - Add error logging for each retry attempt to aid debugging - Update api/v1/suite_test.go to use test.StopWithRetry - Update internal/operator-controller/controllers/suite_test.go to use test.StopWithRetry This shared utility follows DRY principle and ensures consistent teardown behavior across all test suites, preventing flaky test failures during teardown. Assisted-by: Cursor/Claude Co-authored-by: Cursor <cursoragent@cursor.com> # Conflicts: # api/v1/suite_test.go # internal/operator-controller/controllers/suite_test.go

…v0.23.0 controller-runtime v0.23.0 added Apply method to the StatusWriter interface for server-side apply support on status subresources. Changes: - Add Apply method to statusWriterMock to satisfy StatusWriter interface - Method signature: Apply(context.Context, runtime.ApplyConfiguration, ...SubResourceApplyOption) Assisted-by: Cursor/Claude

The client.Apply patch type constant is deprecated in controller-runtime v0.23.0+ in favor of typed Apply methods that require generated apply configurations. However, server-side apply with field ownership management is essential for this controller to properly manage ClusterExtensionRevision lifecycle. The deprecated client.Apply provides: - Field ownership via client.FieldOwner() - Force ownership via client.ForceOwnership - Automatic create-or-update semantics Alternative approaches (MergeFrom, StrategicMerge) lack field management support, causing upgrade test failures where ownership conflicts prevent proper updates during controller upgrades. Changes: - Keep client.Apply usage with comprehensive nolint explanation - Document why server-side apply semantics are required - Note migration path: generate apply configurations project-wide This ensures controllers work correctly during upgrades while documenting the technical debt for future migration. Assisted-by: Cursor/Claude

The fake.NewSimpleClientset is deprecated in client-go in favor of fake.NewClientset which provides better support for field management and server-side apply testing. The migration is straightforward for tests that don't require apply configurations - simply replace NewSimpleClientset() with NewClientset(). The new implementation works seamlessly with existing reactor-based mocking. Changes: - Replace fake.NewSimpleClientset() with fake.NewClientset() - Remove nolint suppression comment - All tests pass without modifications Assisted-by: Cursor/Claude

Address code review feedback by making the Apply method implementation more defensive: - Add applyCalled tracking field to statusWriterMock - Return error if Apply is unexpectedly called during tests - This ensures tests fail immediately if the code starts using Apply, rather than silently passing with incorrect assumptions The error message clearly indicates this method is not expected to be used, making debugging easier if the interface usage changes in the future. Assisted-by: Cursor/Claude

Controller-runtime v0.23.1+ has faster leader election startup, causing timing issues with the upgrade e2e tests that watch pod logs for leader election messages. The log watching approach fails because it uses Follow=true which only streams NEW logs, missing messages emitted before the test starts watching. With faster startup in v0.23.1+, leader election often completes before the test begins watching. Changes: - Replace log watching with direct lease object checking for leader election - Poll lease.Spec.HolderIdentity to confirm leader election occurred - Refactor watchPodLogsForSubstring to use polling with TailLines instead of Follow - Add per-poll context timeout (5s) to prevent hanging on slow log streams - Add scanner error checking to detect and handle stream errors - Fix context cancellation timing - pollCancel() called after stream operations complete - Add k8s.io/utils/ptr import for log options This approach is timing-independent and directly verifies the desired state rather than relying on log message observation. The per-poll timeout prevents tests from hanging indefinitely while still respecting the outer context timeout. Fixes address Copilot review comments about context management and error handling. Assisted-by: Cursor/Claude Co-authored-by: Cursor <cursoragent@cursor.com>

…ernetes v1.35 Kubernetes v1.35 changed CEL validation error message format to include the actual validated value instead of just the type ("string" or "object"). This provides more helpful error messages for debugging validation failures. This commit was accidentally dropped during interactive rebase. Changes: - Update test expectations to include actual values in error messages - Change from 'Invalid value: "string"' to 'Invalid value: "actual-value"' - Update 20+ test cases in TestImageSourceCELValidationRules and related tests - Fix indentation consistency across all test cases Assisted-by: Cursor/Claude Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated no new comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@perdasilva

Address code review feedback from @perdasilva: the "final attempt" after timeout creates a confusing false negative. If the final attempt succeeds after we've logged "timeout reached", we return success despite logging a timeout message. Changes: - Remove final attempt after timeout - Return error immediately when timeout is reached - Clearer error message with timeout duration - Prevents false negatives where timeout is logged but success is returned This makes the function behavior match its log messages - if we say we timed out, we actually return an error. Assisted-by: Cursor/Claude Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot AI review requested due to automatic review settings February 5, 2026 13:15

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 5, 2026

Copilot AI reviewed Feb 5, 2026

View reviewed changes

camilamacedo86 force-pushed the deps-up branch from c288cf6 to 41e7cc0 Compare February 5, 2026 13:18

Copilot AI review requested due to automatic review settings February 5, 2026 13:47

Copilot started reviewing on behalf of camilamacedo86 February 5, 2026 13:47 View session

Copilot AI reviewed Feb 5, 2026

View reviewed changes

camilamacedo86 force-pushed the deps-up branch from 83cde26 to a16409e Compare February 5, 2026 14:04

Copilot AI review requested due to automatic review settings February 5, 2026 15:03

Copilot started reviewing on behalf of camilamacedo86 February 5, 2026 15:03 View session

Copilot AI reviewed Feb 5, 2026

View reviewed changes

api/v1/suite_test.go Outdated Show resolved Hide resolved

camilamacedo86 force-pushed the deps-up branch from f3bb698 to f14866b Compare February 5, 2026 15:29

Copilot AI review requested due to automatic review settings February 5, 2026 15:59

camilamacedo86 force-pushed the deps-up branch from f14866b to 4a09f4b Compare February 5, 2026 15:59

Copilot started reviewing on behalf of camilamacedo86 February 5, 2026 16:00 View session

camilamacedo86 changed the title ~~WIP: 🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge)~~ 🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge) Feb 5, 2026

camilamacedo86 changed the title ~~🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge)~~ WIP 🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge) Feb 5, 2026

Copilot AI reviewed Feb 5, 2026

View reviewed changes

camilamacedo86 force-pushed the deps-up branch from 4a09f4b to e7a5d34 Compare February 5, 2026 23:08

Copilot AI review requested due to automatic review settings February 5, 2026 23:55

Copilot started reviewing on behalf of camilamacedo86 February 5, 2026 23:55 View session

Copilot AI reviewed Feb 6, 2026

View reviewed changes

camilamacedo86 commented Feb 6, 2026

View reviewed changes

camilamacedo86 changed the title ~~WIP 🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge)~~ 🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge) Feb 6, 2026

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2026

perdasilva reviewed Feb 6, 2026

View reviewed changes

pedjak reviewed Feb 6, 2026

View reviewed changes

openshift-ci bot assigned pedjak Feb 6, 2026

camilamacedo86 force-pushed the deps-up branch from 4ce7ef4 to f03b3e9 Compare February 10, 2026 08:17

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Feb 10, 2026

Copilot AI review requested due to automatic review settings February 10, 2026 08:21

camilamacedo86 force-pushed the deps-up branch from f03b3e9 to 11681ab Compare February 10, 2026 08:21

Copilot started reviewing on behalf of camilamacedo86 February 10, 2026 08:21 View session

camilamacedo86 closed this Feb 10, 2026

camilamacedo86 reopened this Feb 10, 2026

camilamacedo86 commented Feb 10, 2026

View reviewed changes

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 10, 2026

Copilot AI reviewed Feb 10, 2026

View reviewed changes

go.mod Outdated Show resolved Hide resolved

test/upgrade-e2e/post_upgrade_test.go Outdated Show resolved Hide resolved

test/upgrade-e2e/post_upgrade_test.go Show resolved Hide resolved

camilamacedo86 force-pushed the deps-up branch from 34f8634 to eef3ba9 Compare February 10, 2026 08:35

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 10, 2026

camilamacedo86 added 7 commits February 10, 2026 08:36

chore(deps): upgrade k8s to 1.35 and Go dependencies (edge)

9732ef8

Assisted-by: Cursor/Claude

camilamacedo86 force-pushed the deps-up branch from eef3ba9 to bfc9355 Compare February 10, 2026 08:36

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 10, 2026

camilamacedo86 and others added 2 commits February 10, 2026 09:18

Copilot AI review requested due to automatic review settings February 10, 2026 09:19

camilamacedo86 force-pushed the deps-up branch from bfc9355 to e0b1619 Compare February 10, 2026 09:19

Copilot started reviewing on behalf of camilamacedo86 February 10, 2026 09:19 View session

Copilot AI reviewed Feb 10, 2026

View reviewed changes

camilamacedo86 force-pushed the deps-up branch from e0b1619 to efbe1f6 Compare February 10, 2026 19:00

🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge) #2486

Are you sure you want to change the base?

🌱 chore(deps): upgrade k8s to 1.35 and Go dependencies (edge) #2486

Conversation

camilamacedo86 commented Feb 5, 2026

Description

Reviewer Checklist

Uh oh!

netlify bot commented Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for olmv1 ready!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

camilamacedo86 Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

perdasilva Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

perdasilva Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

camilamacedo86 Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

perdasilva Feb 6, 2026

Choose a reason for hiding this comment

Uh oh!

perdasilva Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

camilamacedo86 Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

pedjak left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci bot commented Feb 10, 2026

Uh oh!

camilamacedo86 Feb 10, 2026

Choose a reason for hiding this comment

The Problem (WHY the change was needed)

Old Implementation (main branch)

The Race Condition

Timeline Visualization

New Implementation (deps-up branch)

Approach 1: Direct Lease Checking (for leader election verification)

Approach 2: Polling with TailLines (for other log messages)

Behavior Verification

Old Behavior

New Behavior

netlify bot commented Feb 5, 2026 •

edited

Loading

codecov bot commented Feb 5, 2026 •

edited

Loading

perdasilva Feb 6, 2026 •

edited

Loading