@crazy-max crazy-max commented Nov 6, 2025

needs #3220
carries and closes #3217

module github.com/moby/swarmkit/v2

go 1.21.0
go 1.24.0

@crazy-max crazy-max force-pushed the etcd-update branch 3 times, most recently from e174ab2 to 84686de Compare November 6, 2025 15:33
swarmd/go.mod Outdated
k8s.io/klog/v2 v2.90.1 // indirect
)

replace github.com/moby/swarmkit/v2 => github.com/crazy-max/swarmkit/v2 v2.0.0-20251106153346-84686debe4aa // etcd-update

@crazy-max crazy-max Nov 6, 2025

https://github.com/moby/swarmkit/actions/runs/19140979041/job/54705663625?pr=3221#step:4:266

 #19 39.64 + cd swarmd
#19 39.64 + golangci-lint run --config /go/src/github.com/docker/swarmkit/.golangci.yml ./...
#19 47.56 level=error msg="Running error: context loading failed: no go files to analyze: running `go mod tidy` may solve the problem"

https://github.com/moby/swarmkit/actions/runs/19140978994/job/54705757861?pr=3221#step:5:89

# github.com/moby/swarmkit/swarmd/cmd/swarm-rafttool
/go/pkg/mod/google.golang.org/[email protected]/status/status.go:35:2: ambiguous import: found package google.golang.org/genproto/googleapis/rpc/status in multiple modules:
	google.golang.org/genproto v0.0.0-20230306155012-7f2fa6fef1f4 (/go/pkg/mod/google.golang.org/genproto@v0.0.0-20230306155012-7f2fa6fef1f4/googleapis/rpc/status)
	google.golang.org/genproto/googleapis/rpc v0.0.0-20250303144028-a0af3efb3deb (/go/pkg/mod/google.golang.org/genproto/googleapis/rpc@v0.0.0-20250303144028-a0af3efb3deb/status)
# github.com/moby/swarmkit/swarmd/cmd/swarm-rafttool
/go/pkg/mod/go.etcd.io/etcd/api/[email protected]/etcdserverpb/rpc.pb.go:19:2: ambiguous import: found package google.golang.org/genproto/googleapis/api/annotations in multiple modules:
	google.golang.org/genproto v0.0.0-20230306155012-7f2fa6fef1f4 (/go/pkg/mod/google.golang.org/genproto@v0.0.0-20230306155012-7f2fa6fef1f4/googleapis/api/annotations)
	google.golang.org/genproto/googleapis/api v0.0.0-20250303144028-a0af3efb3deb (/go/pkg/mod/google.golang.org/genproto/googleapis/api@v0.0.0-20250303144028-a0af3efb3deb/annotations)
# github.com/moby/swarmkit/swarmd/cmd/swarm-rafttool
cmd/swarm-rafttool/common.go:15:2: no required module provides package go.etcd.io/etcd/server/v3/wal/walpb; to add it:
	go get go.etcd.io/etcd/server/v3/wal/walpb
FAIL	github.com/moby/swarmkit/swarmd/cmd/swarm-rafttool [setup failed]
# github.com/moby/swarmkit/swarmd/cmd/swarmctl
/go/pkg/mod/google.golang.org/[email protected]/status/status.go:35:2: ambiguous import: found package google.golang.org/genproto/googleapis/rpc/status in multiple modules:
	google.golang.org/genproto v0.0.0-20230306155012-7f2fa6fef1f4 (/go/pkg/mod/google.golang.org/genproto@v0.0.0-20230306155012-7f2fa6fef1f4/googleapis/rpc/status)
	google.golang.org/genproto/googleapis/rpc v0.0.0-20250303144028-a0af3efb3deb (/go/pkg/mod/google.golang.org/genproto/googleapis/rpc@v0.0.0-20250303144028-a0af3efb3deb/status)
FAIL	github.com/moby/swarmkit/swarmd/cmd/swarmctl [setup failed]
# github.com/moby/swarmkit/swarmd/cmd/swarmctl/cluster
/go/pkg/mod/google.golang.org/[email protected]/status/status.go:35:2: ambiguous import: found package google.golang.org/genproto/googleapis/rpc/status in multiple modules:
	google.golang.org/genproto v0.0.0-20230306155012-7f2fa6fef1f4 (/go/pkg/mod/google.golang.org/genproto@v0.0.0-20230306155012-7f2fa6fef1f4/googleapis/rpc/status)
	google.golang.org/genproto/googleapis/rpc v0.0.0-20250303144028-a0af3efb3deb (/go/pkg/mod/google.golang.org/genproto/googleapis/rpc@v0.0.0-20250303144028-a0af3efb3deb/status)
FAIL	github.com/moby/swarmkit/swarmd/cmd/swarmctl/cluster [setup failed]

I didn't find another way to fix this; go.work does not make things easy when running the linter and go test. I would have preferred this simple replace directive:

replace github.com/moby/swarmkit/v2 => ../

instead of using go work, but it seems this is a prerequisite for this repo. cc @thaJeztah @corhere

@crazy-max crazy-max Nov 6, 2025

Hmm, it seems that go test, when run inside a Go workspace, uses go.work instead of go.mod, and the same goes for golangci-lint.

@crazy-max

https://github.com/moby/swarmkit/actions/runs/19143203825/job/54713856103?pr=3221#step:5:76

 time="2025-11-06T16:56:47Z" level=info msg="repaired WAL error" error="unexpected EOF"
time="2025-11-06T16:56:47Z" level=info msg="repaired WAL error" error="unexpected EOF: [wal] max entry size limit exceeded when reading \"0000000000000000-0000000000000000.wal\", recBytes: 24, fileSize(200) - offset(184) - padBytes(0) = entryLimit(16)"
--- FAIL: TestReadRepairWAL (0.01s)
    walwrap_test.go:272: 
        	Error Trace:	/go/src/github.com/docker/swarmkit/manager/state/raft/storage/walwrap_test.go:272
        	Error:      	An error is expected but got nil.
        	Test:       	TestReadRepairWAL

Signed-off-by: Tonis Tiigi <[email protected]>
@tonistiigi

I updated the conditions for the failing test. The difference is in the ReadRepairWAL function implementation:

if repaired || !errors.Is(err, io.ErrUnexpectedEOF) {
	// TODO(thaJeztah): should ReadRepairWAL be updated to handle cases where
	// some (last) of the files cannot be recovered? ("best effort" recovery?)
	// Or should an informative error be produced to help the user (which could
	// mean: remove the last file?). See TestReadRepairWAL for more details.
	return nil, WALData{}, errors.Wrap(err, "irreparable WAL error")
}
if !wal.Repair(nil, walDir) {
	return nil, WALData{}, errors.Wrap(err, "WAL error cannot be repaired")
}
log.G(ctx).WithError(err).Info("repaired WAL error")
repaired = true

In both versions, the call to Repair() returns true.

In v3.5.6, the next iteration then hits the irreparable case:

return nil, WALData{}, errors.Wrap(err, "irreparable WAL error")

while in v3.6 the next read doesn't error, so the loop breaks out cleanly.


In the transport pkg, the difference is that the snapshot is now a pointer instead of an inline struct. The test can probably be made to work, but iiuc the issue is that this test assumes specific memory sizes and struct layout, which probably isn't the safest way to write tests.

In etcd v3.6 the implementation has changed and now
returns a successful repair for the test conditions.

Signed-off-by: Tonis Tiigi <[email protected]>
@crazy-max crazy-max force-pushed the etcd-update branch 2 times, most recently from e6b9898 to 599b35b Compare November 7, 2025 10:00
@crazy-max crazy-max marked this pull request as ready for review November 7, 2025 10:02

crazy-max commented Nov 7, 2025

https://github.com/moby/swarmkit/actions/runs/19164853654/job/54783119616?pr=3221#step:5:161

time="2025-11-07T10:07:44Z" level=error msg="snapshot data mismatch"
panic: invalid snapshot data

goroutine 112 [running]:
github.com/moby/swarmkit/v2/manager/state/raft/transport.(*mockRaft).StreamRaftMessage(0xc000154c00, {0x14ecb20, 0xc0001ac040})
	/go/src/github.com/docker/swarmkit/manager/state/raft/transport/mock_raft_test.go:141 +0xd65
github.com/moby/swarmkit/v2/api._Raft_StreamRaftMessage_Handler({0x1326a80, 0xc000154c00}, {0x14ea2b8, 0xc0001cc620})
	/go/src/github.com/docker/swarmkit/api/raft.pb.go:1244 +0xe6
google.golang.org/grpc.(*Server).processStreamingRPC(0xc000122800, {0x14e8460, 0xc000480690}, 0xc000155500, 0xc00011d710, 0x1b94fa0, 0x0)
	/go/src/github.com/docker/swarmkit/vendor/google.golang.org/grpc/server.go:1695 +0x1f98
google.golang.org/grpc.(*Server).handleStream(0xc000122800, {0x14e87b0, 0xc00019a340}, 0xc000155500)
	/go/src/github.com/docker/swarmkit/vendor/google.golang.org/grpc/server.go:1819 +0x1325
google.golang.org/grpc.(*Server).serveStreams.func2.1()
	/go/src/github.com/docker/swarmkit/vendor/google.golang.org/grpc/server.go:1035 +0x159
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 46
	/go/src/github.com/docker/swarmkit/vendor/google.golang.org/grpc/server.go:1046 +0x215
FAIL	github.com/moby/swarmkit/v2/manager/state/raft/transport	0.085s

We get invalid snapshot data; I wonder if this relates to etcd-io/raft#149

Any idea @dperny?


@thaJeztah thaJeztah left a comment

LGTM

@dperny dperny merged commit 6f0a8d0 into moby:master Nov 10, 2025
9 checks passed
@crazy-max crazy-max deleted the etcd-update branch November 10, 2025 19:24