-
Notifications
You must be signed in to change notification settings - Fork 18.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure plugin container is removed on failure#36715
Conversation
e2f95a4
to
086203b
Compare
@cpuguy83 : Janky failure |
@@ -52,10 +66,23 @@ func (e *Executor) Create(id string, spec specs.Spec, stdout, stderr io.WriteClo | |||
ctx := context.Background() | |||
err := e.client.Create(ctx, id, &spec, &opts) | |||
if err != nil { | |||
return err | |||
status, _ := e.client.Status(ctx, id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ignore error from Status? We could be dealing with a corrupt status in L70
if err2 := e.client.Delete(ctx, id); err2 != nil && !errdefs.IsNotFound(err2) { | ||
logrus.WithError(err2).WithField("plugin", id).Error("Error cleaning up containerd container") | ||
} | ||
err = e.client.Create(ctx, id, &spec, &opts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont understand this additional call to Create
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a recovery from left-over containerd containers. The bug leaves stuff behind on error, this cleans them up and tries again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug being fixed by this PR, introduced by the containerd 1.0 integration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha! I expected a bug number :)
} | ||
err = e.client.Create(ctx, id, &spec, &opts) | ||
} | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err
will be nil, given this block is about a failed creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err
is set again by the 2nd create above.
} | ||
|
||
_, err = e.client.Start(ctx, id, "", false, attachStreamsFunc(stdout, stderr)) | ||
if err != nil { | ||
e.client.DeleteTask(ctx, id) | ||
e.client.Delete(ctx, id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check return similar to L71
@@ -40,7 +54,7 @@ func New(rootDir string, remote libcontainerd.Remote, exitHandler ExitHandler) ( | |||
// Executor is the containerd client implementation of a plugin executor | |||
type Executor struct { | |||
rootDir string | |||
client libcontainerd.Client | |||
client Client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this mainly so that we can write the unit test with mockClient
? If so, I think we should continue using libcontainerd.Client
and think of other ways to rewrite the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This limits the scope of libcontainerd.Client to just the functions that are called here which makes the mock implementation simpler.
I would indeed like to use the real libcontainerd client implementation, but this is much harder and would either require actually spinning up containerd or building a custom containerd service. The latter sounds nice but is not a simple change.
} | ||
|
||
func (c *mockClient) Status(ctx context.Context, id string) (libcontainerd.Status, error) { | ||
c.mu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c.mu.RLock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to use a read lock here, these are fast reads and not high concurrency. The only real reason I added locking was to make sure if someone tried to use it concurrently in another test it would be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even so, a read lock would be sufficient, given the method only reads the map and doesnt update it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An RW mutex is slower and more complex than a normal one... I can update it if you think it should use one I think it is unnecessarily complex to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. Lets keep it AS-IS
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
086203b
to
f81172b
Compare
LGTM |
Janky and ppc failures are due to known flaky tests. Experimental failure is new though. And might be related to this PR.
|
Codecov Report
@@ Coverage Diff @@
## master #36715 +/- ##
=========================================
Coverage ? 35.14%
=========================================
Files ? 614
Lines ? 46245
Branches ? 0
=========================================
Hits ? 16252
Misses ? 27860
Partials ? 2133 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐅
No description provided.