In computer operating systems, union mounting is a way of combining multiple directories into one that appears to contain their combined contents.[1] Union mounting is supported in Linux, BSD and several of its successors, and Plan 9, with similar but subtly different behavior.

As an example application of union mounting, consider the need to update the information contained on a CD-ROM or DVD. While a CD-ROM is not writable, one can overlay the CD's mount point with a writable directory in a union mount. Then, updating files in the union directory will cause them to end up in the writable directory, giving the illusion that the CD-ROM's contents have been updated.[1][2]
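The overlay idea can be loosely illustrated with Python's `collections.ChainMap`, which searches a stack of mappings in order and directs all writes to the first one. This is only an analogy, not a file system: the dictionaries `cdrom` and `writable` and their contents are hypothetical stand-ins for the read-only disc and the writable directory.

```python
from collections import ChainMap

# Hypothetical contents; dictionary keys stand in for file names.
cdrom = {"README": "v1", "data.txt": "original"}  # plays the read-only layer
writable = {}                                     # writable layer stacked on top

# Lookups search `writable` first, then fall through to `cdrom`.
union = ChainMap(writable, cdrom)

# Writes through a ChainMap always land in the first mapping.
union["data.txt"] = "updated"

print(union["data.txt"])  # updated  (served from the writable layer)
print(union["README"])    # v1       (falls through to the CD-ROM layer)
print(cdrom["data.txt"])  # original (the lower layer is untouched)
```

The union view reports the new contents while the lower layer still holds the old ones, which is the illusion described above.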

Implementations

Plan 9

In the Plan 9 operating system from Bell Labs (mid-1980s onward), union mounting is a central concept, replacing several older Unix conventions with union directories; for example, several directories containing executables, unioned together at a single /bin directory, replace the PATH variable for command lookup in the shell.[3]

Plan 9 union semantics are greatly simplified compared to the implementations for POSIX-style operating systems: the union of two directories is simply the concatenation of their contents, so a directory listing of the union may display duplicate names. Also, no effort is made to recursively merge subdirectories, leading to an extremely simple implementation.[4] Directories are unioned in a controllable order; u/name, where u is a union directory, denotes the file called name in the first constituent directory that contains such a file.[4]
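These semantics can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Plan 9 code: each constituent directory is reduced to a label plus a list of entry names, and the function names `union_list` and `union_lookup` as well as the example paths are invented for illustration.

```python
def union_list(constituents):
    """List the union: plain concatenation, so duplicate names are kept."""
    names = []
    for _label, entries in constituents:
        names.extend(entries)
    return names


def union_lookup(constituents, name):
    """Resolve u/name: the first constituent containing `name` wins."""
    for label, entries in constituents:
        if name in entries:
            return label
    return None


# Hypothetical constituents of a unioned /bin, in search order.
bin_union = [
    ("/usr/alice/bin", ["ls", "mytool"]),
    ("/386/bin", ["ls", "cat"]),
]

print(union_list(bin_union))           # ['ls', 'mytool', 'ls', 'cat']
print(union_lookup(bin_union, "ls"))   # /usr/alice/bin
print(union_lookup(bin_union, "cat"))  # /386/bin
```

Note that neither function keeps any state between calls and neither descends into subdirectories, which mirrors why the scheme is so simple to implement.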

Unix and BSD

Unix/POSIX implementations of unions have different requirements from the Plan 9 implementation because of constraints imposed by traditional Unix file system behavior. These constraints greatly complicate the implementation and often lead to compromises.[5] Problems that union mounting encounters on Unix-like operating systems include:

  • Duplicate file names within a directory are not acceptable, since this would break applications' expectations of how a Unix file system works. Putting a logical, stack-like precedence ordering on the union's constituents partially solves this problem, but requires memory to record which files need to be skipped over during a directory listing (which is otherwise a nearly stateless operation).[5]
  • Deletion requires special support: if files with the same name exist in several of the union directory's constituents, simply deleting the file from one of the constituents causes a file from one of the others to reappear in its stead.[5]
  • Insertion of a directory into the stack can cause incoherency in the kernel's file name cache.[5]
  • Renaming a file within a single mounted file system (using the rename system call) should be an atomic operation, but renaming within a union mount can require changes to several of the union's constituent directories. A possible solution is to disallow rename in such situations and require applications to copy and delete instead.[2]
  • Stable inode numbers for files, hard links and memory-mapped I/O (mmap) are hard to implement correctly.[2]
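The first two problems in the list can be made concrete with a small Python sketch. It is a simplified model, assuming each layer is a dictionary from file name to contents; the function `list_union` and the sample file names are hypothetical.

```python
def list_union(layers):
    """List names top-down, skipping names already seen in a higher layer."""
    seen = set()  # the extra state that a duplicate-free listing has to carry
    result = []
    for layer in layers:  # layers[0] is the top of the stack
        for name in layer:
            if name not in seen:
                seen.add(name)
                result.append(name)
    return result


top = {"a.txt": "top copy", "b.txt": "only in top"}
bottom = {"a.txt": "bottom copy", "c.txt": "only in bottom"}
layers = [top, bottom]

before = list_union(layers)
print(before)  # ['a.txt', 'b.txt', 'c.txt']

# Naive deletion: remove the name from the top layer only.
del top["a.txt"]

after = list_union(layers)
print(after)   # ['b.txt', 'a.txt', 'c.txt']
```

After the naive deletion, `a.txt` is still listed because the copy in the lower layer has become visible, which is the reappearance problem described in the second bullet.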

Early attempts to add unioning to Unix filesystems included the 3-d filesystem (Bell Labs) and the Translucent File Service in SunOS (Sun Microsystems, 1988[2]). An implementation of union mounting was added to the BSD version of Unix in version 4.4 (1994), taking inspiration from these earlier attempts, Plan 9 and the stackable file systems in Spring (Sun, 1994).[1] 4.4BSD implements the stack-of-directories approach outlined above. As in Plan 9, operations traverse this stack top-down to resolve names, but unlike Plan 9, BSD union mounts are recursive, so that the contents of subdirectories appear merged in the union directory. Also unlike the Plan 9 version, all layers except the top are read-only: modifying files in the union causes their contents to first be copied into the top layer of the stack, where the modifications are then applied. Deletion of files is implemented by writing a special type of file called a whiteout to the top directory, which has the effect of marking the file name as non-existent and hiding files with the same name in the lower layers of the stack.[1] Whiteouts require support from the underlying file system.[4]
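The copy-on-write and whiteout behavior described above can be modelled roughly as follows. This is a minimal sketch under simplifying assumptions, not the 4.4BSD implementation: layers are flat dictionaries from file name to string contents, subdirectory merging is not modelled, and the class `UnionDir`, its method names, and the `WHITEOUT` sentinel are invented for illustration.

```python
WHITEOUT = object()  # sentinel standing in for a whiteout entry


class UnionDir:
    def __init__(self, top, *lower):
        self.top = top      # the only layer that is ever written to
        self.lower = lower  # treated as read-only

    def read(self, name):
        # Traverse the stack top-down; a whiteout ends the search.
        for layer in (self.top, *self.lower):
            if name in layer:
                if layer[name] is WHITEOUT:
                    break
                return layer[name]
        raise FileNotFoundError(name)

    def append(self, name, data):
        # Copy-up: bring the current contents into the top layer first,
        # then apply the modification there.
        self.top[name] = self.read(name)
        self.top[name] += data

    def delete(self, name):
        self.read(name)            # raises if the name is not visible
        self.top[name] = WHITEOUT  # hides same-named files in lower layers

    def listdir(self):
        seen, names = set(), []
        for layer in (self.top, *self.lower):
            for name, value in layer.items():
                if name in seen:
                    continue
                seen.add(name)
                if value is not WHITEOUT:
                    names.append(name)
        return names


lower = {"motd": "hello", "passwd": "root"}
union = UnionDir({}, lower)
union.append("motd", " world")
union.delete("passwd")

print(union.read("motd"))  # hello world
print(union.listdir())     # ['motd']
print(lower)               # {'motd': 'hello', 'passwd': 'root'}
```

The lower dictionary is never modified: the edit to `motd` lives in the top layer, and `passwd` is hidden by a whiteout rather than removed.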

Linux

Union mounting was implemented for Linux 0.99 in 1993; this initial implementation was called the Inheriting File System, but was abandoned by its developer because of its complexity.[2] The next major implementation was UnionFS, which grew out of the FiST project at Stony Brook University.[6][5] An attempt to replace UnionFS, aufs, was released in 2006, followed in 2009 by OverlayFS.[2] In 2014, the OverlayFS union mount implementation was added to the standard Linux kernel source code.[7]

Similarly, GlusterFS can mount different filesystems that are distributed across a network rather than located on the same machine.[8]

MergerFS, originally released in 2014, is an actively developed open-source FUSE-based file system that pools arbitrary directories.[9]

References

  1. ^ a b c d Pendry, Jan-Simon; Marshall Kirk McKusick (December 1995). "Union Mounts in 4.4BSD-Lite". Proceedings of the USENIX Technical Conference on UNIX and Advanced Computing Systems: 25–33. Retrieved 25 November 2007.
  2. ^ a b c d e f Aurora, Valerie; Henson (March 2009). "Unioning file systems: Architecture, features, and design choices". LWN.net. Retrieved 21 December 2009.
  3. ^ Pike, R.; Presotto, D.; Thompson, K.; Trickey, H.; Winterbottom, P. "The Use of Name Spaces in Plan 9". Random Contrarian Insurgent Organization web site cat-v.org. Bell Labs. Retrieved 27 October 2016.
  4. ^ a b c Aurora, Valerie; Henson (March 2009). "Union file systems: Implementations, part I". LWN.net. Retrieved 21 December 2009.
  5. ^ a b c d e Wright, Charles P.; Jay Dave; Puja Gupta; Harikesavan Krishnan; Erez Zadok; Mohammad Nayyer Zubair. "Versatility and Unix Semantics in a Fan-Out Unification File System". Stony Brook University Technical Report FSL-04-01b. Retrieved 25 November 2007.
  6. ^ Aurora, Valerie; Henson (April 2009). "Unioning file systems: Implementations, part 2". LWN.net. Retrieved 21 December 2009.
  7. ^ Larabel, Michael (29 September 2014). "OverlayFS Proposed for the Linux 3.18 Kernel". Phoronix.com. Retrieved 12 October 2015.
  8. ^ "About". GlusterNews. 14 November 2009. Archived from the original on 7 April 2013. Retrieved 4 March 2013. GlusterFS is an open source, distributed file system capable of scaling to several petabytes (actually, 72 brontobytes!) and handling thousands of clients. GlusterFS clusters together storage building blocks over Infiniband RDMA or TCP/IP interconnect, aggregating disk and memory resources and managing data in a single global namespace. GlusterFS is based on a stackable user space design and can deliver exceptional performance for diverse workloads.
  9. ^ "MergerFS project on GitHub". github.com. Retrieved 15 September 2021.