Talk:File system API

Latest comment: 13 years ago by DGerman in topic Rewrite in progress

Rewrite

This article is in serious need of a rewrite. I rewrote the lead section to be technically correct, but I don't really know what to make of the history section (I left a comment for each paragraph there), and I pretty much gave up when I reached the next one. The article often uses very vague terminology, and the criteria for the current classifications are unintelligible.

What's the difference between "kernel-level" (formerly "kernel-based") and "driver-level" APIs? The article currently says "The API is "driver-based" when the kernel provides facilities but the filesystem code resides totally external to the kernel (not even as a module of a modular kernel).", but how can a piece of code execute in kernel mode without being either included in the kernel or loaded as a module? Windows NT file systems certainly execute in kernel mode.

In kernel-level the filesystem driver is in the SAME OBJECT FILE AND PROCESS as the rest of the kernel components. In driver-level it is in kernel mode, but it is loaded as a library. Check OS/2 and Windows NT IFS's and you will see the differences.

From there follows the "Mixed kernel-driver-based API". Obviously I can't understand that, as it relies on the previous definition. As Windows 3.1 inherently ran in real mode, there is no actual entity called a "kernel" to speak of, just some code that happens to interface with hardware. The description seems to imply that Windows 95 file systems were wrapped for compatibility, but the file system still implements the Windows 95 file system driver API. If it's the Windows 95 API, then how is this really any different from the preceding two?

You are absolutely wrong. See "win.com /?" and you'll see it runs by default IN PROTECTED MODE. Of course, you can also run it in real mode, and then the Control Panel shows that you can't use the VFAT driver (it is called 32-bit disk access). If you need them, I can provide screenshots.
Of course Windows 9x API is the same as Windows 3.1, just evolved (as Windows Vista evolved from Windows NT 3.5).

The "user space API" (formerly "userland-level API") is refreshingly understandable; however, it fails to be informative and for some reason claims that all user space filesystem APIs are inherently incompatible (!?).

User space APIs are incompatible between them. You cannot use the ADFLIB or the HFSUTILS drivers in FUSE, not reverse. You must recompile or adapt them to work at all.

If you take all that away, there is not much left at all. Thus I decided not to do it at this point. -- intgr 21:31, 12 December 2006 (UTC)


Answering the article comments:
What does "originally" mean? NFS appeared in 1984, first network file systems were developed in the 1970s, see distributed file system — User:Intgr
There were operating system before 1984 and before 1970s and first networks. You can also check in UNIX source code that VFS was added to support NFS.
Huh? File system code is still part of the kernel on common operating systems these days. — User:Intgr
Of monolithic kernels? Yes, of modular or microkernels, they aren't. E.g.: The NT kernel needs to load a filesystem driver for EVERY filesystem (FAT, NTFS, ISO9660, UDF) for being able to read them.
Apart from that, current monolithic kernels also use a filesystem API. Before that kernels when requested to open a file DIRECTLY opened it (see DOS 1.0 for example), currently they use their filesystem API to communicate with the correct filesystem code (being network or disk), and then that code is the one that directly opens it.
FIXME: Again, what's the "originally" above? I think this one contradicts the listing of Unix under "kernel-level APIs", below. Unix predates most operating systems that are considered "old" that people still know of today — User:Intgr
Again, check the UNIX source archives. There are also hundreds of disk operating systems that predate UNIX.
If you have more doubts, feel free to email me.
Regards, Claunia 15:06, 1 February 2007 (UTC)
First of all, I want to apologize in case slapping the article full of "rewrite" templates seemed rude; but the comments stayed unanswered for a long time.
"you'll see it runs by default IN PROTECTED MODE.", "Of course Windows 9x API is the same as Windows 3.1"
Ok, I admit I was wrong there (though I still find the definition of a "mixed kernel-driver-based API" confusing, but we should sort out "kernel-based" and "driver-based" first)
"You cannot use the ADFLIB or the HFSUTILS drivers in FUSE, not reverse. You must recompile or adapt them to work at all."
Yes, they are currently incompatible, but the point is that there is no inherent reason why they should be incompatible for different file systems (or even different operating systems). There just happened to be no accepted common user space FS APIs earlier. FUSE could, in theory, become one. However, the article currently states: "the great disadvantage is that the API is uniquely to each application that implements one." Unless I'm awfully misinterpreting it, the statement is misleading at best.
I'm not the best English writer, but the idea is that ADFLIB, HFSUTILS, FUSE and others are not directly compatible with one another. If you can rephrase my wording more correctly, I would be grateful.
"Of monolithic kernels? Yes, of modular or microkernels, they aren't."
The Windows NT kernel is not a microkernel, as most parts of the kernel, including drivers, run in privileged, and not memory-protected mode (unlike real microkernels where they are entirely isolated in their own address spaces). NT has been called a "hybrid kernel", however, many people appear to consider that just marketing speak for "monolithic kernel", since it fundamentally operates in the same way (the hybrid kernel article sums that argument up well).
I consider a kernel monolithic when everything is loaded whether or not it will be used, and modular when code is loaded only when needed or when the user requests it. Currently almost all operating systems are modular (whether pure microkernels like Mach or hybrids like the Windows NT kernel: HAL, Executive, Kernel).
"There were operating system before 1984 and before 1970s and first networks.", "Again, check the UNIX source archives."
I guess what confused me was the missing context. Mentioning year numbers and Unix variants would have made that much clearer. The article seemed to imply to me that network file systems were a new thing; my knowledge of the older computing and UNIX days is admittedly vague, and I was not aware of the fact that the first Unices did not have a VFS layer.
Well, of course network filesystems were introduced at some point and, if I remember correctly, the first UNIX VFS was added by Sun in order to use NFS.
"The NT kernel needs to load a filesystem driver for EVERY filesystem (FAT, NTFS, ISO9660, UDF) for being able to read them." and "In driver-level it is in kernel mode, but it is loaded as a library."
I meant "part of the kernel" in the sense that it's running in kernel mode, and is addressed through direct function calls from the upper layers of the kernel — which is not separate from the kernel in my view, and definitely not a fundamental difference where one would draw lines. Just like dynamic libraries are only superficially separate from running processes. Naturally, dynamic libraries can also be linked statically.
If you define "driver" through the fact that it's loaded from a separate file then these are classically called "loadable kernel modules"; a driver is a driver regardless of whether it's statically linked to the kernel image or loaded dynamically. What matters is the functions that it provides for the operating system.
Again, I don't know about older Unixes, but recent BSD and Linux versions allow compiling most parts of the kernel as separate loadable modules, or build them into the kernel, and they still operate in the same way regardless of when they are loaded, and most importantly, they use the exact same APIs. Hence why I call the module loading difference superficial.
Can you explain why you think this difference is significant?
The most important thing about the driver-based API is that the kernel does not need to know that the module exists at all. With Linux, you cannot insert in the kernel a module that the compiled kernel doesn't knows about. I think I'm clear on this. If not, I'll try to explain better.
"Check OS/2 and Windows NT IFS's and you will see the differences."
I have no experience with OS/2; however, if the question boils down to loading a module or having the code linked into the kernel, I think it's irrelevant as explained above.
The NT and OS/2 IFSs operate almost identically (except for the OS/2 microIFS and miniIFS parts), so if you know one, you know both.
"Apart from that, current monolithic kernels also use a filesystem API. Before that kernels when requested to open a file DIRECTLY opened it."
I guess this is another matter of definitions. I would consider a programming interface an API whether it stands alone or is only there in a de facto form. Linux doesn't explicitly define its in-kernel APIs either, but they are APIs nevertheless. But this is irrelevant.
What really makes a difference here is the fact that DOS didn't have a file system abstraction layer (such as VFS) — not whether they interfaced with the file system using an API.
Of course but a filesystem API is an API that allows the kernel to talk with more than one filesystem handler code. Without it, when you do a readdir(), e.g. in DOS, the kernel walks the directory structure directly, while with an API it first consults the VFS/IFS in-memory table to find out which filesystem handler code handles the filesystem where the request was made, and then executes a generic_filesystem_api_readdir() (whatever filesystem handler it is) to obtain an abstraction.
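The lookup-then-dispatch step described above can be sketched roughly as follows. This is only an illustration in Python, not code from any real kernel: the names Vfs, TableFs and FlatFs are hypothetical, and the two handlers are in-memory stand-ins that store their directories differently while exposing the same readdir() call.

```python
class TableFs:
    """Hypothetical handler that stores each directory as a list of names."""

    def __init__(self, dirs):
        self.dirs = dirs  # directory path -> list of entry names

    def readdir(self, path):
        return sorted(self.dirs[path])


class FlatFs:
    """Hypothetical handler that stores only full file paths."""

    def __init__(self, paths):
        self.paths = paths

    def readdir(self, path):
        # Derive the directory listing from the stored full paths.
        prefix = path.rstrip("/") + "/"
        names = {p[len(prefix):].split("/", 1)[0]
                 for p in self.paths if p.startswith(prefix)}
        return sorted(names)


class Vfs:
    """Hypothetical stand-in for the VFS/IFS table: mount point -> handler."""

    def __init__(self):
        self.mounts = {}

    def mount(self, mount_point, handler):
        self.mounts[mount_point.rstrip("/") or "/"] = handler

    def _resolve(self, path):
        # Pick the longest mount point that is a prefix of the path.
        best = None
        for mp in self.mounts:
            if path == mp or path.startswith(mp.rstrip("/") + "/"):
                if best is None or len(mp) > len(best):
                    best = mp
        if best is None:
            raise FileNotFoundError(path)
        # Path as seen from inside the mounted file system.
        inner = path[len(best.rstrip("/")):] or "/"
        return self.mounts[best], inner

    def readdir(self, path):
        # The caller never learns which handler answered the request.
        handler, inner = self._resolve(path)
        return handler.readdir(inner)


vfs = Vfs()
vfs.mount("/", TableFs({"/": ["etc", "mnt"], "/etc": ["passwd"]}))
vfs.mount("/mnt/net", FlatFs(["/share/a.txt", "/share/b.txt", "/readme"]))
print(vfs.readdir("/etc"))
print(vfs.readdir("/mnt/net/share"))
```

The caller uses one readdir() entry point for both mounts; only the table lookup decides which handler code walks its own on-disk (here, in-memory) structures.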
I hope this clears up more doubts, and sorry for being so late in answering last time; I had just been VERY BUSY.
Have a good day,
Claunia 23:09, 5 February 2007 (UTC)
-- intgr 17:14, 5 February 2007 (UTC)

Looks like it's me who has to apologize for being late, this time. :)

Anyway, perhaps we started off the wrong way, discussing statements instead of opinions/beliefs. Here's what my arguments boil down to:

  1. Drawing a line between "file systems using this API are normally loaded as modules" and "file systems using this API are normally statically linked to the kernel" is arbitrary and is unrelated to the actual APIs.
  2. Hence, as I fail to see any technical differences between the two, the mixed API classification doesn't make any sense to me.

"With Linux, you cannot insert in the kernel a module that the compiled kernel doesn't knows about."
I'm not sure about this one. Perhaps there are some cases where it is necessary to know the available file systems in advance, but I have yet to come across one. I've done this several times; first compiling a kernel, booting into it, and subsequently compiling extra file system modules that I had not anticipated. And these have loaded fine, without a kernel recompile or a reboot.

If necessary, I'm sure there aren't any problems with writing out-of-tree file systems that would load successfully without the kernel knowing about them in advance. That's how lots of Linux device drivers (still) work, despite the attempts by kernel developers to convince vendors otherwise, for reasons outlined in their Documentation/stable_api_nonsense.txt. This is, however, a purely ideological standpoint; there are no technical differences from how other kernels implement their APIs (as far as I can tell).
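The point about loading a file system that the running kernel was never told about can be illustrated with a small registry sketch. It is loosely modelled on the way Linux file system modules call register_filesystem() from their init function, but everything here (FsRegistry, load_examplefs, the "examplefs" type) is a hypothetical Python stand-in, not kernel code.

```python
class FsRegistry:
    """Hypothetical stand-in for the kernel's table of file system types."""

    def __init__(self):
        self._types = {}  # type name -> function that mounts a device

    def register(self, name, mount_fn):
        if name in self._types:
            raise ValueError(f"file system type {name!r} already registered")
        self._types[name] = mount_fn

    def unregister(self, name):
        del self._types[name]

    def mount(self, name, device):
        try:
            mount_fn = self._types[name]
        except KeyError:
            raise LookupError(f"unknown file system type {name!r}") from None
        return mount_fn(device)


# Created at "boot": the registry knows no file systems at all.
registry = FsRegistry()


def load_examplefs(reg):
    """Plays the role of a module's init function, written after 'boot'."""

    def mount(device):
        # A real driver would read a superblock here; this just returns a tag.
        return {"type": "examplefs", "device": device}

    reg.register("examplefs", mount)


load_examplefs(registry)
print(registry.mount("examplefs", "/dev/sda1"))
```

In this sketch the registry code is the same whether a type is registered at start-up or later, which is the sense in which the static-versus-loadable distinction does not change the API.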

"Of course but a filesystem API is an API that allows the kernel to talk with more than one filesystem handler code."
Ok, this definition sounds fair. Though it describes a "virtual file system API", i.e. how the file system interfaces with the VFS layer in the kernel, and not just any API for a file system. Note that some purely user space APIs, such as hfsutils, don't fit under this definition, since they are not coupled to the VFS in any way. FUSE, however, does fit.
-- intgr 10:47, 12 February 2007 (UTC)

Primitives in a Filesystem API

I think it would be interesting and useful to have a list of typical primitives in filesystem APIs in this article.

Of course there are open / close, read / write and seek, but I expect there are others, for instance for file locking, directories, etc.

Maybe even a comparison between, for instance, MS Windows and Linux file system primitives? -- Preceding unsigned comment added by Erl (talk · contribs) 21:34, 31 October 2007 (UTC)
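As a rough starting point for such a list, the sketch below exercises the primitives named above through Python's os module, which wraps the operating system's own calls. The function name demo_primitives and the file names are made up for illustration. File locking is left out because it is platform-specific (for example, Python's fcntl module exists only on Unix-like systems).

```python
import os
import tempfile


def demo_primitives():
    """Exercise open/write/seek/read/stat/close plus a few directory calls."""
    workdir = tempfile.mkdtemp()                  # create a scratch directory
    path = os.path.join(workdir, "example.txt")

    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)   # open (and create)
    try:
        os.write(fd, b"hello world")              # write
        os.lseek(fd, 6, os.SEEK_SET)              # seek to byte offset 6
        tail = os.read(fd, 5)                     # read five bytes
        size = os.fstat(fd).st_size               # query file metadata
    finally:
        os.close(fd)                              # close

    entries = os.listdir(workdir)                 # read a directory
    os.rename(path, os.path.join(workdir, "renamed.txt"))  # rename
    renamed = os.listdir(workdir)
    os.unlink(os.path.join(workdir, "renamed.txt"))        # delete the file
    os.rmdir(workdir)                             # remove the directory
    return tail, size, entries, renamed


print(demo_primitives())
```

A Windows versus Linux comparison would map each of these onto the native calls of the respective system, which this sketch does not attempt.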

Examples

There are different examples that can be added:

  • Linux FUSE architecture
  • full userspace architectures such as:
    • the HURD

Rewrite in progress

I hope to be reworking this article in the next few days as I include a brief overview in the filesystems article. I expect to be generalizing, conceptualizing, de-windowsing and de-PCizing it, and including various other environments. Feel free to jump in with thoughts and suggestions. Please be patient as I will be updating it bit by bit (pun!) DG12 (talk) 15:13, 9 August 2011 (UTC)