
Make sure plugin container is removed on failure #36715

Merged: 1 commit, Mar 30, 2018

Conversation

cpuguy83 (Member)

No description provided.

@anusha-ragunathan (Contributor)

@cpuguy83: Janky failure

@@ -52,10 +66,23 @@ func (e *Executor) Create(id string, spec specs.Spec, stdout, stderr io.WriteClo
ctx := context.Background()
err := e.client.Create(ctx, id, &spec, &opts)
if err != nil {
return err
status, _ := e.client.Status(ctx, id)
@anusha-ragunathan (Contributor), Mar 28, 2018:

Why ignore the error from Status? We could be dealing with a corrupt status in L70.

if err2 := e.client.Delete(ctx, id); err2 != nil && !errdefs.IsNotFound(err2) {
logrus.WithError(err2).WithField("plugin", id).Error("Error cleaning up containerd container")
}
err = e.client.Create(ctx, id, &spec, &opts)
Contributor:

I don't understand this additional call to Create.

Member Author:

It's a recovery from left-over containerd containers. The bug leaves stuff behind on error; this cleans it up and tries again.

Contributor:

Which bug?

Member Author:

The bug being fixed by this PR, introduced by the containerd 1.0 integration.

@anusha-ragunathan (Contributor), Mar 29, 2018:

Haha! I expected a bug number :)

}
err = e.client.Create(ctx, id, &spec, &opts)
}
if err != nil {
Contributor:

err will be nil, given this block is about a failed creation.

Member Author:

err is set again by the second Create above.
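
For readers following along, the control flow being discussed looks roughly like the sketch below. It is a minimal reconstruction based only on the diff hunks quoted in this review: the client interface is a hypothetical, narrowed stand-in, the error handling and logging are simplified, and it is not the exact code merged in the PR.

package executor

import (
	"context"
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for the containerd client used by the
// plugin executor; only the calls quoted in this review are modelled, and
// the real signatures take more arguments (spec, options, streams).
type client interface {
	Create(ctx context.Context, id string) error
	Status(ctx context.Context, id string) (string, error)
	Delete(ctx context.Context, id string) error
	DeleteTask(ctx context.Context, id string) error
}

// errNotFound stands in for the errdefs.IsNotFound check shown in the diff.
var errNotFound = errors.New("not found")

// create sketches the cleanup-and-retry pattern: if the first Create fails
// because an earlier failure left a containerd container behind, delete the
// leftover and try once more. err is reassigned by the second Create, so
// the trailing error check reflects the outcome of the retry.
func create(ctx context.Context, c client, id string) error {
	err := c.Create(ctx, id)
	if err != nil {
		// If a container with this ID already exists, Status succeeds and the
		// leftover is cleaned up; the handling of the Status error is
		// simplified here, as it is in the diff under review.
		if _, serr := c.Status(ctx, id); serr == nil {
			if derr := c.Delete(ctx, id); derr != nil && !errors.Is(derr, errNotFound) {
				fmt.Printf("plugin %s: error cleaning up containerd container: %v\n", id, derr)
			}
			// Second attempt; on success this overwrites err with nil, so the
			// function falls through the error check below.
			err = c.Create(ctx, id)
		}
	}
	if err != nil {
		return fmt.Errorf("creating containerd container %s: %w", id, err)
	}
	return nil
}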

}

_, err = e.client.Start(ctx, id, "", false, attachStreamsFunc(stdout, stderr))
if err != nil {
e.client.DeleteTask(ctx, id)
e.client.Delete(ctx, id)
Contributor:

Check the return value here, similar to L71.
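
A sketch of what that suggestion could look like, reusing the hypothetical client interface and errNotFound stand-in from the sketch above (the helper name deleteTaskAndContainer is made up for illustration):

// deleteTaskAndContainer illustrates the suggestion: check and log failures
// from DeleteTask and Delete instead of discarding them, while still
// tolerating "not found" for containers that are already gone.
func deleteTaskAndContainer(ctx context.Context, c client, id string) {
	if err := c.DeleteTask(ctx, id); err != nil && !errors.Is(err, errNotFound) {
		fmt.Printf("plugin %s: error deleting containerd task: %v\n", id, err)
	}
	if err := c.Delete(ctx, id); err != nil && !errors.Is(err, errNotFound) {
		fmt.Printf("plugin %s: error deleting containerd container: %v\n", id, err)
	}
}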

@@ -40,7 +54,7 @@ func New(rootDir string, remote libcontainerd.Remote, exitHandler ExitHandler) (
// Executor is the containerd client implementation of a plugin executor
type Executor struct {
rootDir string
client libcontainerd.Client
client Client
Contributor:

Is this mainly so that we can write the unit test with mockClient? If so, I think we should continue using libcontainerd.Client and think of other ways to rewrite the test.

Member Author:

This limits the scope of libcontainerd.Client to just the functions that are called here, which makes the mock implementation simpler.

I would indeed like to use the real libcontainerd client implementation, but that is much harder and would require either actually spinning up containerd or building a custom containerd service. The latter sounds nice but is not a simple change.
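
The narrowing described here can be sketched as follows; the method set and signatures are simplified assumptions based on the calls quoted in this review, not the exact interface or mock in the PR.

package executor

import (
	"context"
	"fmt"
	"sync"
)

// Client is a narrow interface covering only what the plugin executor calls,
// so a unit test can supply a small in-memory mock instead of a full
// libcontainerd.Client (signatures simplified for illustration).
type Client interface {
	Create(ctx context.Context, id string) error
	Status(ctx context.Context, id string) (string, error)
	Delete(ctx context.Context, id string) error
	DeleteTask(ctx context.Context, id string) error
}

// mockClient is a test double that tracks containers in a map guarded by a
// mutex, so tests can exercise the executor without a running containerd.
type mockClient struct {
	mu         sync.Mutex
	containers map[string]bool
}

func (c *mockClient) Create(ctx context.Context, id string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.containers == nil {
		c.containers = map[string]bool{}
	}
	if c.containers[id] {
		return fmt.Errorf("container %s already exists", id)
	}
	c.containers[id] = true
	return nil
}

// Status, Delete, and DeleteTask would follow the same pattern.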

}

func (c *mockClient) Status(ctx context.Context, id string) (libcontainerd.Status, error) {
c.mu.Lock()
Contributor:

c.mu.RLock

Member Author:

There is no need to use a read lock here; these are fast reads with no high concurrency. The only real reason I added locking at all was to make sure it would be fine if someone tried to use the mock concurrently in another test.

Contributor:

Even so, a read lock would be sufficient, given that the method only reads the map and doesn't update it.

Member Author:

An RW mutex is slower and more complex than a normal one... I can update it if you think it should use one, but I think that is unnecessarily complex.

Contributor:

That's fair. Let's keep it as-is.
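
For context, the method under discussion, written against the hypothetical mockClient sketched above, would look roughly like the following; a sync.RWMutex with RLock/RUnlock would also be correct, and the thread settles on keeping the plain mutex.

// Status only reads the map, so an RWMutex would also work; a plain
// sync.Mutex is kept here because the reads are fast, the mock sees no real
// contention in tests, and a plain mutex is simpler.
func (c *mockClient) Status(ctx context.Context, id string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.containers[id] {
		return "running", nil
	}
	return "", fmt.Errorf("no such container: %s", id)
}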

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
@stevvooe (Contributor)

LGTM

@anusha-ragunathan (Contributor)

Janky and ppc failures are due to known flaky tests.

The experimental failure is new, though, and might be related to this PR.

14:12:06 FAIL: docker_api_swarm_service_test.go:614: DockerSwarmSuite.TestAPISwarmServicesPlugin
14:12:06 
14:12:06 [d699a76c39512] waiting for daemon to start
14:12:06 [d699a76c39512] daemon started
14:12:06 
14:12:06 [d0241c324309d] waiting for daemon to start
14:12:06 [d0241c324309d] daemon started
14:12:06 
14:12:06 [dbbfcc658cf89] waiting for daemon to start
14:12:06 [dbbfcc658cf89] daemon started
14:12:06 
14:12:06 docker_api_swarm_service_test.go:683:
14:12:06     waitAndAssert(c, defaultReconciliationTimeout, d1.CheckPluginRunning(repo), checker.False)
14:12:06 docker_utils_test.go:452:
14:12:06     c.Assert(v, checker, args...)
14:12:06 ... obtained bool = true
14:12:06 ... &{Config:{Args:{Description: Name: Settable:[] Value:[]} Description: DockerVersion:dev Documentation: Entrypoint:[/basic] Env:[] Interface:{Socket:basic.sock Types:[.docker.dummy/1.0/]} IpcHost:false Linux:{AllowAllDevices:false Capabilities:[] Devices:[]} Mounts:[] Network:{Type:} PidHost:false PropagatedMount: User:{GID:0 UID:0} WorkDir: Rootfs:0xc423748c00} Enabled:true ID:67342d0b4379499e319163d3f9b98b1626de91d6be8909b56555a4321e5a9733 Name:127.0.0.1:5000/swarm/test:v1 PluginReference:127.0.0.1:5000/swarm/test:v1 Settings:{Args:[] Devices:[] Env:[] Mounts:[]}}

@codecov (bot), Mar 30, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@b6a7d02).
The diff coverage is 19.23%.

@@            Coverage Diff            @@
##             master   #36715   +/-   ##
=========================================
  Coverage          ?   35.14%           
=========================================
  Files             ?      614           
  Lines             ?    46245           
  Branches          ?        0           
=========================================
  Hits              ?    16252           
  Misses            ?    27860           
  Partials          ?     2133

@vdemeester (Member)

LGTM 🐅
