Volume not listed when outside on cli specified dataset #12

Summary

ZFS datasets to be used by the plugin can be specified on the cli via the --dataset-name argument. These are added to ZfsDriver.rds in main.go:41 when the driver is created. When docker asks for a volume list, these rds entries are iterated in driver.go:63, checked and then returned to docker. This works. But if a dataset is created outside the root datasets specified on the cli, it is not within the scope of this iteration.
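For illustration, a rough sketch of the kind of iteration described above (this is not the actual driver code; the Dataset type and its Children helper are simplified stand-ins):

package main

import (
    "fmt"

    "github.com/docker/go-plugins-helpers/volume"
)

// Dataset is a simplified stand-in for the plugin's zfs dataset handle.
type Dataset struct {
    Name     string
    children []*Dataset
}

func (ds *Dataset) Children() []*Dataset { return ds.children }

// ZfsDriver sketch: rds only holds the root datasets passed via --dataset-name.
type ZfsDriver struct {
    rds []*Dataset
}

// List walks only the cli-specified roots, so tank/another/data is never
// reached when only tank/docker-volumes was passed on the cli.
func (d *ZfsDriver) List() (*volume.ListResponse, error) {
    var vols []*volume.Volume
    for _, rd := range d.rds {
        for _, child := range rd.Children() {
            vols = append(vols, &volume.Volume{Name: child.Name})
        }
    }
    return &volume.ListResponse{Volumes: vols}, nil
}

func main() {
    d := &ZfsDriver{rds: []*Dataset{{
        Name:     "tank/docker-volumes",
        children: []*Dataset{{Name: "tank/docker-volumes/data"}},
    }}}
    resp, _ := d.List()
    for _, v := range resp.Volumes {
        fmt.Println(v.Name) // prints only tank/docker-volumes/data
    }
}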

Minimum viable example

zfs-plugin version: 0.5.0
docker version: 19.03.2, build 6a30dfc
Using the standard docker-zfs-plugin.service file

zfs create tank/docker-volumes
zfs create tank/another
docker volume create -d zfs --name=tank/docker-volumes/data
docker volume create -d zfs --name=tank/another/data
docker volume ls

Expected Output

DRIVER              VOLUME NAME
zfs                 tank/another/data
zfs                 tank/docker-volumes/data

Actual Output

DRIVER              VOLUME NAME
zfs                 tank/docker-volumes/data

Why is this a problem

Managing these volumes via the docker volume command then becomes impossible, because docker does not know about them. They will also not be pruned by docker volume prune, and removing a stack does not destroy its volumes either.

Proposed Solution(s)

Somehow the list command needs to identify which datasets are used as volumes.

Requiring all datasets in use to be present on the cli (e.g. passing --dataset-name once per root dataset) would allow correct identification of the datasets used as volumes. But it would break the ability to create volumes automatically on datasets outside of these: deploys would suddenly fail for people who just used the stock docker-zfs-plugin.service file, which has worked until now. This would be a breaking change, requiring a major version bump.

Using custom properties on the zfs datasets could make them identifiable. On service startup, the datasets specified via cli would be created if needed and this property would be set. Every dataset created via docker would also get this property. A list command could then identify these, but in order to do so it would probably need to iterate through a possibly large number of datasets. Also, as far as I know, transferring datasets (zfs send) does not necessarily copy all properties, which could lead to issues when restoring backups.
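A minimal sketch of what that could look like, shelling out to the zfs cli; the property name docker-zfs-plugin:managed and the pool name tank are assumptions for illustration, not anything the plugin currently does:

package main

import (
    "fmt"
    "os/exec"
    "strings"
)

// managedProp is a hypothetical zfs user property marking plugin-managed datasets.
const managedProp = "docker-zfs-plugin:managed"

// markManaged tags a dataset so it can later be identified as a docker volume.
func markManaged(dataset string) error {
    return exec.Command("zfs", "set", managedProp+"=true", dataset).Run()
}

// listManaged recursively queries the property below root and keeps the
// datasets carrying it. This is the potentially expensive walk over a large
// number of datasets mentioned above.
func listManaged(root string) ([]string, error) {
    out, err := exec.Command("zfs", "get", "-H", "-r", "-t", "filesystem",
        "-o", "name,value", managedProp, root).Output()
    if err != nil {
        return nil, err
    }
    var names []string
    for _, line := range strings.Split(strings.TrimSpace(string(out)), "\n") {
        fields := strings.Fields(line)
        if len(fields) == 2 && fields[1] == "true" {
            names = append(names, fields[0])
        }
    }
    return names, nil
}

func main() {
    _ = markManaged("tank/another/data") // would happen on volume creation
    vols, err := listManaged("tank")
    if err != nil {
        fmt.Println("zfs get failed:", err)
        return
    }
    fmt.Println(vols)
}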

Writing the information down somewhere (e.g. in the required root dataset from the cli) would also preserve it, but it would violate the single-source-of-truth principle: the actual datasets could change without that change being reflected in the stored information.

Another option may be to query docker for this information. But I suspect that docker does not provide the information about which volumes exist, because that is exactly what the plugin is supposed to report to docker, not the other way around.

IMHO I would prefer solution 1: require all used datasets on the command line. I think it is an elegant way to define which datasets are in use, and it also allows the plugin to ensure that these are available and working. If this breaks a configuration, it is because there are volumes outside the datasets specified on the cli, which in turn means the docker volume command is already broken for that setup right now. I would much prefer a one-time broken configuration plus sensible error messages when I try to create datasets outside the specified root paths (which I can fix by adding these paths) over a silently failing (volume-hiding) docker volume ls.
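To make the "sensible error messages" part concrete, a rough sketch of how Create could refuse names outside the cli-specified roots (hypothetical code, not the plugin's current behaviour; rootNames and createDataset are stand-ins):

package main

import (
    "fmt"
    "strings"

    "github.com/docker/go-plugins-helpers/volume"
)

// ZfsDriver sketch: rootNames holds the dataset names passed via --dataset-name.
type ZfsDriver struct {
    rootNames []string
}

// createDataset stands in for whatever actually creates the zfs dataset.
func (d *ZfsDriver) createDataset(name string) error {
    fmt.Println("would create", name)
    return nil
}

// Create only accepts volume names below one of the cli-specified roots and
// fails loudly otherwise, instead of silently hiding the volume later.
func (d *ZfsDriver) Create(req *volume.CreateRequest) error {
    for _, root := range d.rootNames {
        if strings.HasPrefix(req.Name, root+"/") {
            return d.createDataset(req.Name)
        }
    }
    return fmt.Errorf("volume %q is not below any dataset passed via --dataset-name %v",
        req.Name, d.rootNames)
}

func main() {
    d := &ZfsDriver{rootNames: []string{"tank/docker-volumes"}}
    fmt.Println(d.Create(&volume.CreateRequest{Name: "tank/another/data"}))
}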

Thanks again to y'all! If I'm completely wrong, please yell at me about why I'm wrong or what to do better 😉

Greets,
Chris
