James's Ramblings

zfs

Created: August 31, 2020

List pools and datasets

zpool list
zfs list

Create a key

dd if=/dev/urandom of=/root/zfsDataPool1_key bs=32 count=1
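
A sensible extra step: lock the key file down, since those 32 random bytes are the raw 256-bit key that aes-256-gcm will use.

chmod 600 /root/zfsDataPool1_key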

Create zpool

zpool create -o ashift=12 -O acltype=posixacl -O compression=lz4 -O dnodesize=auto -O normalization=formD -O relatime=on -O xattr=sa -O encryption=aes-256-gcm -O keylocation=file:///root/zfsDataPool1_key -O keyformat=raw -m /mnt/dataPool1 dataPool1 mirror ata-WDC_WD1002FAEX-00Y9A0_WD-WCAW33528363 ata-WDC_WD10EZEX-00WN4A0_WD-WCC6Y4KC9ZKH
  • ashift=12 forces 4 KiB sectors even if the drive lies and reports 512-byte sectors.

  • acltype=posixacl enables POSIX-style ACLs.

  • normalization=formD eliminates some corner cases relating to UTF-8 filename normalisation. It also enables utf8only=on, meaning that only files with valid UTF-8 filenames will be accepted.

  • xattr=sa vastly improves the performance of extended attributes, but is Linux-only. If you care about using this pool on other OpenZFS implementations, don’t specify this option.

  • relatime was introduced in the 2.6.20 kernel; it only updates the atime when the previous atime is older than the mtime or ctime. Later this behaviour was changed a bit, so relatime will also update the atime if the previous atime is older than 24h.

    So if you don’t modify your file, the atime will only be updated once every 24h. But every time you modify the file (and therefore update the mtime timestamp), you’ll also update the atime timestamp, because the previous one is older than the new mtime timestamp.

  • dnodesize=auto specifies a compatibility mode or literal value for the size of dnodes in the file system. Consider setting dnodesize to auto if the dataset uses the xattr=sa property setting and the workload makes heavy use of extended attributes.

  • Use ls /dev/disk/by-id/ for the disk names.
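
A quick sanity check that the properties stuck (dataPool1 as above):

zfs get compression,encryption,xattr,relatime,dnodesize dataPool1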

zpool status

zpool status

Unlock zpool with encryption key

zfs load-key -a
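
To confirm the key actually loaded (keystatus is a standard ZFS property; dataPool1 is the pool from above):

zfs get keystatus dataPool1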

Mount zpool

zfs mount dataPool1
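
zfs mount with no arguments just lists what is currently mounted; -a mounts everything whose key is loaded:

zfs mount      # list mounted datasets
zfs mount -a   # mount everything, after zfs load-key -a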

Export for NFS

zfs set sharenfs=on dataPool1
  • No need to manage /etc/exports by hand; ZFS handles the export.
  • Confirm:
showmount -e `hostname`
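
sharenfs can also take export options instead of plain on, e.g. to restrict access to a subnet (the subnet here is just an example value):

zfs set sharenfs="rw=@192.168.1.0/24" dataPool1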

Remove a disk

zpool remove dataPool1 [DEVICE]

Taking devices offline

  • When a device is unreliable ZFS continues to use it. To stop this and take a device offline:
    zpool offline [DATA_POOL] [DEVICE]
    
  • This does not detach the device from the storage pool.
  • The offline state does persist across reboots unless -t (temporary) is used; zpool online (below) brings the device back.
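
The counterpart, to bring the device back into use:

zpool online [DATA_POOL] [DEVICE]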

Detaching devices from a storage pool

zpool detach [DATA_POOL] [DEVICE]
  • Only works for mirror members; the operation is refused if there is no other valid replica of the data.

ZFS scrub

zpool scrub [POOL]
zpool scrub -s [POOL] # cancel
  • Recommended once per week for consumer hard drives.
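
A minimal cron sketch for that weekly scrub (the path, schedule and pool name are assumptions, adjust to taste):

# /etc/cron.d/zfs-scrub - scrub dataPool1 every Sunday at 02:00
0 2 * * 0  root  /usr/sbin/zpool scrub dataPool1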

Replace a disk - failure or larger disks

  • Record contents of zdb and zpool status -v, particularly the path and guid for each disk.

  • If we are aiming to enlarge the zpool with bigger disks: zpool set autoexpand=on [POOL].

  • Find new disk in /dev/disk/by-id.

  • Create a GPT partition table if needed: parted [DISK] mklabel gpt

  • zpool replace [POOL] [GUID_OLD_DISK] [NEW_DISK_BY_ID_PATH]

  • Check status with zpool status -v
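
If the point was bigger disks, claim and check the extra space once the resilver finishes:

zpool online -e [POOL] [NEW_DISK_BY_ID_PATH]   # only needed if autoexpand was off
zpool list [POOL]                              # SIZE / EXPANDSZ show the result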

ZFS snapshots

Take a snapshot:

zfs snapshot [POOL]@[SNAPSHOT_NAME]
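
Adding -r snapshots the dataset and all of its children in one atomic operation (the snapshot name below is just an example):

zfs snapshot -r dataPool1@2020-08-31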

List snapshots:

zfs list -t snapshot [POOL]

Destroy a snapshot:

zfs destroy [POOL]@[SNAPSHOT_NAME]

Rollback:

zfs rollback [POOL]@[SNAPSHOT_NAME]
  • Only the most recent snapshot can be rolled back to directly.
  • To roll back to an older snapshot, the snapshots in the way must be destroyed; zfs rollback -r does this for you.
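
For example (names are placeholders):

zfs rollback -r [POOL]@[OLDER_SNAPSHOT_NAME]   # destroys the newer snapshots in the way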

To mount a snapshot:

mount -t zfs [POOL]@[SNAPSHOT_NAME] [MOUNT_PATH]
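
Snapshots can also be browsed read-only without an explicit mount, through the hidden .zfs directory at the dataset's mountpoint (the path below assumes the /mnt/dataPool1 mountpoint from earlier):

ls /mnt/dataPool1/.zfs/snapshot/[SNAPSHOT_NAME]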

ZFS clones

  • Snapshots can be cloned.
zfs clone [POOL]@[SNAPSHOT_NAME] [POOL]/[CLONE_NAME]

Destroy a clone:

zfs destroy [POOL]/[CLONE_NAME]
  • Clones show in the output of zfs list.
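
A clone keeps its origin snapshot pinned (the snapshot can't be destroyed while the clone exists); zfs promote swaps that dependency around so the original can be cleaned up:

zfs promote [POOL]/[CLONE_NAME]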

More concepts

  • ZFS replication with snapshot send/receive (rough sketch after this list).
  • ZFS checkpoints - essentially snapshot everything including config.
  • ZFS automated snapshots.
  • Scheduled scrubbing.
  • Use an SSD as a cache.
  • ZFS import/export to a new system.
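
A rough sketch of the send/receive idea above (the host and target pool names are made up):

zfs snapshot dataPool1@replica-1
zfs send dataPool1@replica-1 | ssh backuphost zfs receive backupPool/dataPool1
# later, incremental: zfs send -i @replica-1 dataPool1@replica-2 | ssh backuphost zfs receive backupPool/dataPool1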