Wednesday, July 23, 2008

Install and Use Unionfs ( Merging Directories )

Howto: How Install and Use Unionfs (Merging Directories)

Unionfs, developed at Stony Brook university since 2004, is a stackable unification file system, which can merge the contents of several directories (so called branches) while keeping their physical content separate. It allows any mix of read-only and read-write branches, as well as insertion and deletion of branches on the fly. Unionfs can be used in several ways, for example to unify home directories from multiple filesystems on different disk partitions, or to merge several CDs to create a unified view of a photo archive. In a similar view, Unionfs, with copy-on-write functionality, can be used to merge read-only and read-write filesystems together and to virtually allow modification of read-only filesystems saving changes to the writable ones.

SLAX is a 177 MB Linux Live distribution which aims at compacting full featured Linux operating system to a portable medium (like usb flash drive or mini-cd) and allows everyone to boot Linux on any machine without the need to install it. It works even on computers with no harddisk at all. Unionfs is the most important part of a SLAX, it allows SLAX to seem and act as a real Linux OS with full-writable root directory tree. So let's speak about unionfs first.

Getting started
To get unionfs working, you need to create a Linux kernel module by compiling its source codes. Unionfs is available as a module extension for Linux Kernel 2.4.20 / 2.6.9 and higher. Download the latest version from FTP and extract the content of the archive by using

$ tar -xzf unionfs-x-y-z.tar.gz

Then cd to its directory and read README and INSTALL files which are part of the archive. There are many instructions how to avoid problems. Before the compilation itself, you might find it useful to know that it's possible to disable compiling debug information together with the module. Debug info is useful for reporting bugs, but significantly increases the size of kernel module. Two parameters must be set to disable debug at all:

* create a file called fistdev.mk in the directory with sources
* add this text to it:

EXTRACFLAGS=-DUNIONFS_NDEBUG
UNIONFS_DEBUG_CFLAG=

The compiled kernel module will be about 90 KB big without debug info, compared to 5 MB with it (you can download fistdev.mk used to compile unionfs for SLAX)

Another important thing to make the compilation work properly is to download and extract sources for your running kernel and to modify LINUXSRC variable in unionfs' Makefile, adding path where you actually extracted it (this can be autodetected in some cases).

Finally, use the following commands to build and install unionfs module into /lib/modules/$(KernelVersion)/kernel/fs/unionfs:

$ make
$ make install
$ depmod -a

Using unionfs

In the following example, we will merge contents of two directories into a single directory /mnt/union. We assume that all directories already exist.

$ modprobe unionfs
$ mount -t unionfs -o dirs=/mnt/cdrom1=ro:/mnt/cdrom2=ro unionfs /mnt/union

From now, the directory /mnt/union will contain all files and directories from /mnt/cdrom1 and /mnt/cdrom2, merged together and both read only. If the same filename is used in both cdrom directories, the one from cdrom1 has precedence (because it was specified leftmost in the list).
Using unionctl

Unionctl is a tool which is created (together with uniondbg) during unionfs compilation and is installed to /usr/local/sbin. Unionctl is intended to manage the existing union, to list, add, modify or delete existing branches. Some simple example follows, use unionctl command without any argument to see all available options.

To list branches in existing union, use

$ unionctl /mnt/union --list

which will produce the following output

/mnt/cdrom1 (r-)
/mnt/cdrom2 (r-)


To add another directory (/mnt/cdrom3) into existing union, use

$ unionctl /mnt/union --add --after /mnt/cdrom2 --mode ro /mnt/cdrom3

and unionctl --list will now produce

/mnt/cdrom1 (r-)
/mnt/cdrom2 (r-)
/mnt/cdrom3 (r-)


In the case when you change the content of branches themselves, execute the following command to force revalidation of the union:

uniondbg -g /mnt/union


Writing to union

Merging read-only directories is useful in many cases, but the union itself remains read-only too, until a read-write branch is added to it. In that case, all changes are stored in leftmost branch (using copy-up method, see below) and file deletions are done by using one of the two methods available:

- WHITEOUT mode, inserts a .wh (whiteout) file to mask out a real file
- DELETE_ALL mode, tries to delete all instances of a file from all branches

WHITEOUT mode is used as default. Copy-up is a special method used to handle file modifications in union. A file from ro branch can't be modified, so it is copied to upper (left) read-write branch at the time when the modification should begin. Then the modification is possible and modified file remains in rw branch.

To add a rw branch at the top of union in our example, type

$ unionctl /mnt/union --add --before /mnt/cdrom1 --mode rw /mnt/changes

All the changes will be stored in /mnt/changes and the union will look like this:

/mnt/changes (rw)
/mnt/cdrom1 (r-)
/mnt/cdrom2 (r-)
/mnt/cdrom3 (r-)


Practical unionfs application - SLAX

Data stored on a read-only medium like CD-ROM can't be modified. A Live CD Linux distribution, which is offering full write support to all directories, needs to use special techniques to allow virtual modifications and to save all changes in memory. SLAX is using these techniques for very long time, starting at the end of 2003 with ovlfs and implementing unionfs at the end of 2004. SLAX 5, released in April 2005, can give you an impression of what miracles could be, thanks to unionfs, created.


Links
- Stony Brook university: http://www.fsl.cs.sunysb.edu/
- UnionFS: http://www.fsl.cs.sunysb.edu/project-unionfs.html
- SLAX: http://www.slax.org
- Linux Live scripts: http://www.linux-live.org
- Linux kernel: http://www.kernel.org

1 comment: