windows: docker system prune not reclaiming expected space #31253
Comments
A further discovery: I have found over 4 GB of old docker-builder* files and directories under C:\Windows\Temp. These are presumably left over from building the images, but I would hope they were cleaned up as well. |
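A minimal sketch of how such leftovers could be listed for review, assuming they match the docker-builder* pattern described above. The function name `find_stale_builder_entries` and the one-day age threshold are hypothetical choices for illustration; the sketch only reports entries and deletes nothing.

```python
import time
from pathlib import Path


def find_stale_builder_entries(temp_dir, min_age_seconds=24 * 3600, now=None):
    """Return docker-builder* entries in temp_dir last modified at least
    min_age_seconds ago. Hypothetical helper: it only lists, never deletes."""
    now = time.time() if now is None else now
    stale = []
    for entry in Path(temp_dir).glob("docker-builder*"):
        # st_mtime is the last modification time in seconds since the epoch.
        if now - entry.stat().st_mtime >= min_age_seconds:
            stale.append(entry)
    return sorted(stale)
```

Calling `find_stale_builder_entries(r"C:\Windows\Temp")` would return the candidate paths; whether any of them is safe to remove while a build is running is not something this sketch can decide.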
I am seeing similar issues too. It seems some exited containers are skipped by Here is how I verified the issue.
On my test image (which I’ve been using only for several weeks) the directory has 11 left over folders. One of them contains Let me know if you need any other information. Docker version 17.03.1-ee-3, build 3fcee33 |
Hello there, We also see similar behavior. I found that the "windowsfilter" directory contains a lot of garbage on our production server. I made a copy of this server, stopped and removed all containers, and ran "docker system prune -a". After that, the directory still contains 4204 subdirectories totaling more than 800 GB. See the following output:
I assumed that it could be the result of performing But it looks like it is a common issue, as Eiichi and Alex were able to reproduce it. |
ping @PatrickLang |
@darrenstahlmsft, is this a known issue? |
I've taken a look at this previously, but have not found the root cause. The container and image configs (c:\programdata\docker\containers and c:\programdata\docker\image) are still on disk even though they are not visible from the API, which is why prune thinks there is nothing available to clean up. I'm not able to find a reliable way to cause the leaks, but it seems that some failure path doesn't remove the configs from disk correctly (and they are never reloaded into the daemon, even after a restart).
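The mismatch described above (configs on disk that the API no longer reports) can be sketched as a set difference. This assumes each subdirectory of the containers directory is named after a full container ID, and that the list of known IDs comes from something like `docker ps -aq --no-trunc`; the helper name `find_unlisted_dirs` is hypothetical.

```python
from pathlib import Path


def find_unlisted_dirs(root, known_ids):
    """Return the names of subdirectories of root that are not in known_ids.

    Assumption: directory names are full IDs as reported by the daemon.
    Hypothetical helper for inspection only; it does not delete anything.
    """
    known = set(known_ids)
    return sorted(
        p.name for p in Path(root).iterdir() if p.is_dir() and p.name not in known
    )
```

A non-empty result for c:\programdata\docker\containers would be consistent with the leak described here, though it does not by itself show that a directory is safe to delete.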
This is on my backlog of things to look at when I have time. |
@darrenstahlmsft would PR #31012 (part of 17.06) help with this? (basically, if deleting fails, do not remove the container, so that it is still visible in |
While prevention is important, some guidance on manual cleanup would be helpful too. We have customers running Windows containers in production already. For example, is there any way to look at a folder and determine whether it's safe to delete? I could write a KB article as a temporary solution. |
@ekitagawa this tool is good -> https://github.com/jhowardmsft/docker-ci-zap |
@darrenstahlmsft have you found out more about this? Also, do you think #31012 might help address this? |
@jhowardmsft or @darrenstahlmsft Any idea when you can come back to this? |
Once LCOW is done..... |
ok. i will ping you again then. |
#31012 certainly should help in a lot of cases and make tracking this down easier, but as John said, both of our available cycles are going to be totally consumed by containerd and LCOW work. I'll continue to watch out for this as I work on other things, but don't expect to be able to dedicate the necessary time in the near future. I suspect some ref counting issues in the graphdriver are at fault for some parts of this, which others could feel free to look for in the meantime. |
@pennywisdom if it's an option for you, can you try 17.06.1-ee release clients to see if #31012 helps? They're available here:
Here's approximately how to update:
|
Hi @friism, I just got around to testing docker-17.06.1-ce-rc4.zip and the results are a lot better. There are a couple of bits of feedback:
|
Thanks for the update @pennywisdom! |
@friism @darrenstahlmsft Hello Michael, Darren, Can we think about some separate PowerShell/Go automation for cleaning these directories? I can test it on cloned test servers before applying it in production. |
I just submitted a PR that I think solves this issue. If anyone who sees this regularly could verify that would be great. I've not leaked a single layer locally since applying my fix. |
Link to PR: #36728 |
The issue still exists on 18.03.0-ce on Windows 10. |
@vasicvuk could you open a new issue with details, and steps to reproduce/verify the bug? |
@vasicvuk actually; make sure you're on 18.03.1-ce, because the fix was included in the 18.03.1 patch release; see docker-archive/docker-ce#508 |
This issue should be reopened as it was never fixed @thaJeztah and it is a very serious issue |
Continues to occur in 19.03.1 and is very easy to reproduce |
Yes, I've got the issue too. I had 100 GB tied up in build cache, ran docker system prune, and it now shows 0 GB in build cache, but no hard drive space was freed. My drive is 'full'. Agree with Cloudmersive, this is a very serious issue. |
I've resolved this issue by manually compacting the WSL2 ext4.vhdx file: |
Description
I am running a CI process that builds images on a Windows Server 2016 host virtual machine. A scheduled job runs every 4 hours to free up space, but it does not seem to be reclaiming the space that docker suggests it should. The amount reclaimed has been decreasing.
Currently I have a 60 GB virtual machine running Server Core, and when I run
docker system prune -fa
I get the following output:
TYPE            TOTAL   ACTIVE  SIZE    RECLAIMABLE
Images          0       0       0 B     0 B
Containers      0       0       0 B     0 B
Local Volumes   0       0       0 B     0 B
However, running a scan of C:\ProgramData\docker, I can see that there are many GB of files here, especially under windowsfilter. With a scan in progress, where docker is reporting zero images, I currently have:
C:\ProgramData\docker - 20.5 GB - 188398 files - 66838 directories
C:\ProgramData\docker\windowsfilter - 20.4 GB - 145451 files - 66752 directories
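Figures like the ones above can be gathered with a simple tree walk. The following is a minimal sketch, not the tool used for the scan in this report; `scan_tree` is a hypothetical name, and the walk may need elevated privileges to read everything under C:\ProgramData\docker.

```python
import os


def scan_tree(root):
    """Return (total_bytes, file_count, dir_count) for everything below root."""
    total_bytes = files = dirs = 0
    # os.walk does not follow directory symlinks by default.
    for dirpath, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        for name in filenames:
            files += 1
            try:
                # lstat reports the size of the entry itself, not a link target.
                total_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # Unreadable entries are still counted as files but add no size.
                pass
    return total_bytes, files, dirs
```

Running `scan_tree(r"C:\ProgramData\docker\windowsfilter")` before and after a prune would show whether the on-disk totals actually change.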
If I run
docker images
and
docker ps -a
these are both empty.
There seems to be a gradual decline in the space that is reclaimed, as if dangling images are not being detected and picked up.
One thing to note is that I am trying to delete all images to free up space; on my next builds, windowsservercore or nanoserver is then pulled in by the build of the images. I am not explicitly pulling windowsservercore or nanoserver.
Output of
docker version
:
Output of
docker info
:
Running Windows Server 2016 on Hyper-V, fully patched with the latest Windows Updates.
Expected Outcome
As I am deleting all images I would expect the amount of space reclaimed to be consistent and not reducing over time to a point where very little is reclaimed. It seems like there are dangling or orphaned images that are remaining that are not being detected.