The Möbius Operating System: Architecture


Tim Robinson


This page describes some of the high-level architectural aspects of the Möbius system, often with the help of colourful diagrams.

2. I/O Call Path

Figure 1

The diagram above demonstrates the chain of calls invoked when reading from a file stored in an ext2 volume on an ATA hard disk.

  1. The call chain starts at fread (C), _read (POSIX) or directly at FsReadSync (Möbius native API). Native Möbius applications can also call FsRead to perform I/O asynchronously.
  2. FsReadSync calls FsRead, which is one of the syscall stubs in libsys.dll. The stub traps into the kernel, where the i386DispatchSysCall function automatically calls the kernel's FsRead. FsRead takes a pointer to a fileop_t structure as one of its parameters; this contains two out fields (the number of bytes read and the error code) and one in field (a handle, which will be signalled once the read has finished). FsReadSync uses the file handle itself here.
  3. FsRead and FsWrite both branch to the same FsReadWrite function. FsReadWrite locks the user memory buffer (by obtaining a physical page array object using MemCreatePageArray), calls FsReadWritePhysical, and unlocks the buffer again by calling MemDeletePageArray. Note that if the file system or a driver locked the buffer in between, it will stay locked, since the pages are each reference-counted by the physical memory manager.
  4. FsReadWritePhysical validates and locks the file handle and obtains a pointer to the FSD for the file. If the FSD supports the read_file or write_file function (as appropriate), it is called; otherwise, FsReadWritePhysical returns with an ENOTIMPL error.
  5. Here the ext2 FSD's read_file points to the function Ext2ReadFile. It performs some parameter validation, such as end-of-file checks: if the read would go beyond the end of the file, it is cut off at the end of the file; if, after that, zero bytes would be read, Ext2ReadFile returns zero bytes read and an EEOF error. This is the expected behaviour of all file system drivers. Ext2ReadFile then builds an ext2_asyncio_t structure, which contains copies of the parameters supplied to Ext2ReadFile, and calls Ext2StartIo with a pointer to that structure. Note that FSDs usually don't need to keep their own queue of requests from read_file or write_file; instead they queue them immediately with their underlying device, making sure that the request they send to the device is tagged with a structure containing the relevant parameters.
  6. Ext2StartIo calculates the block number of the start of the request passed to it, then checks whether that block is contained within the file's cache (using CcIsBlockValid).
  7. If the block is valid, Ext2StartIo can temporarily map both the page array originally passed to read_file and the page array corresponding to the current cache block (obtained from CcRequestBlock) and do a simple memcpy from the cache into the buffer (or vice versa for writing).
  8. If the block is not valid, Ext2StartIo obtains the page array for the block and places it in a request structure for its underlying device (a request_dev_t structure). A pointer to the page array and the other fields are stored in the dev_read member of the request structure's params union; the offset field is set to the byte offset of that block on the disk (calculated using the block numbers in the inode), and the number of bytes to read is set to the block size. The request codes for reads and writes are DEV_READ and DEV_WRITE respectively. Ext2StartIo then queues the request with its device by calling IoRequest, making sure to specify that the ext2 FSD itself should be notified once the request is finished. If queueing the request succeeds, Ext2StartIo returns successfully; if not, it completes the request early with the error code from the device.
  9. Here IoRequest calls the ATA volume's request function, which, for a partition on a hard disk, is AtaPartitionRequest. This is a simple function which adds the offset of the start of the partition to the offset in the request and forwards it to the corresponding ATA drive; hence, AtaDriveRequest is called.
  10. AtaDriveRequest's job is to translate DEV_READ and DEV_WRITE requests into ATA_COMMAND requests on the drive's parent controller; all the real I/O is done on the controller device. Note that ATA_COMMAND (and its corresponding ATAPI_COMMAND request for CD-ROM drives) is internal to the ATA driver; however, it may be made available to applications later, as utilities such as CD burners and drive management software will find it useful. Having constructed the ATA_COMMAND request structure (which must be allocated on the heap, since it must still be available once the request has finished), AtaDriveRequest calls IoRequest to pass the request to the controller.
  11. Requests for ATA controllers are handled by the AtaCtrlRequest function; it responds to ATA_COMMAND and ATAPI_COMMAND requests by queueing them (using DevQueueRequest) and, if the controller is idle, sending the request to the hardware. The bytes sent to the drive match more-or-less exactly the fields in the request structure; some fiddling is performed to allow an unlimited number of sectors to be read or written in one request regardless of the maximum sector multiple enforced by the drive.
  12. Once the ATA driver has started I/O on the drive, and before the drive controller has triggered an interrupt, the call chain is unwound as each function returns; the FsRead syscall returns to user mode. The user-mode half of FsRead then returns to FsReadSync, which calls ThrWaitHandle to wait for the request to complete. However, if the application had called FsRead directly, it could be doing more processing (or even queueing more I/O) while it waits.
  13. By now the drive has finished reading the sector(s) corresponding to the ext2 block and it has triggered an interrupt. This is handled by the i386-specific code in the kernel, which calls the handlers associated with that interrupt. In this case it is the AtaCtrlIsr function corresponding to the drive's controller. Assuming that all the required sectors have been read, AtaCtrlIsr calls DevFinishIo, which notifies the originator of the request (in this case the ext2 FSD) that the request has finished; hence Ext2FinishIo is called.
  14. Ext2FinishIo checks the result of the device request; if it failed then it completes the file read prematurely using FsNotifyCompletion, passing the error code obtained from the device. Otherwise it marks the current cache block as both clean (it hasn't been modified since it was last read) and valid.
  15. As before, Ext2StartIo checks whether the current block is valid, which it now is; hence it can copy the data out of the cache. It checks whether the required number of bytes has been read and, if so, calls FsNotifyCompletion to indicate success.
  16. FsNotifyCompletion signals the event in the original fileop_t structure passed to FsRead, using EvtSignal. EvtSignal (which is a special case of HndSignal, specifically for handles which can be treated as simple boolean events) wakes any threads that were waiting on the handle. Now, when the kernel returns to user mode (having entered kernel mode to handle the ATA controller interrupt), it will schedule the thread which started the I/O.
  17. Back in user mode, FsReadSync updates errno (in case of failure) and returns to the caller.
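
The end-of-file rule in step 5 and the partition offset adjustment in step 9 are both small pieces of arithmetic, and a sketch may make them easier to follow. The C fragment below is illustrative only and is not taken from the Möbius sources: the function names, the SKETCH_ status codes and the parameter types are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes; the real Möbius values are not given in the text. */
enum { SKETCH_OK = 0, SKETCH_EEOF = 1 };

/*
 * Step 5 (assumed form): clamp a read of `length` bytes at `offset` so that
 * it does not run past `file_size`. The clamped length is stored in
 * *out_length. If no bytes would be read, the result is SKETCH_EEOF.
 */
static int sketch_clamp_read(uint64_t file_size, uint64_t offset,
                             size_t length, size_t *out_length)
{
    assert(out_length != NULL);

    if (offset >= file_size) {
        *out_length = 0;
        return SKETCH_EEOF;
    }

    uint64_t remaining = file_size - offset;
    if ((uint64_t)length > remaining)
        length = (size_t)remaining;   /* cut the read off at end of file */

    *out_length = length;
    return length == 0 ? SKETCH_EEOF : SKETCH_OK;
}

/*
 * Step 9 (assumed form): a partition device turns a partition-relative byte
 * offset into a drive-relative one by adding the partition's start offset.
 */
static uint64_t sketch_partition_to_drive(uint64_t partition_start,
                                          uint64_t offset)
{
    return partition_start + offset;
}
```

For instance, a 20-byte read at offset 90 of a 100-byte file would be clamped to 10 bytes, while the same read at offset 100 would yield zero bytes and the end-of-file status.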
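
The asynchronous part of the path (steps 6 to 8 and 13 to 16) can be modelled in a few lines of single-threaded C. This is a simplified sketch rather than the actual driver code: every sketch_ name and structure below is hypothetical, the event is reduced to a boolean flag, the device is simulated by passing the disk data in by hand, and it assumes that the finish routine re-enters the start routine, as step 15 suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 8

/* Hypothetical stand-in for fileop_t: two out fields and one event. */
typedef struct {
    size_t bytes;      /* out: number of bytes transferred */
    int    result;     /* out: 0 on success, otherwise an error code */
    bool   signalled;  /* stands in for the handle signalled on completion */
} sketch_fileop_t;

/* One cache block plus the validity flag tested in steps 6 and 15. */
typedef struct {
    char data[SKETCH_BLOCK_SIZE];
    bool valid;
} sketch_cache_block_t;

/* A pending read holding copies of the caller's parameters. */
typedef struct {
    sketch_fileop_t      *op;
    sketch_cache_block_t *block;
    char                 *buffer;
    size_t                length;
    bool                  device_busy;  /* a device request is outstanding */
} sketch_asyncio_t;

/* Step 16: record the outcome and signal the waiter. */
static void sketch_notify_completion(sketch_fileop_t *op, size_t bytes,
                                     int result)
{
    op->bytes = bytes;
    op->result = result;
    op->signalled = true;
}

/* Steps 6-8 and 15: copy from the cache if valid, else "queue" a device read. */
static void sketch_start_io(sketch_asyncio_t *io)
{
    assert(io->length <= SKETCH_BLOCK_SIZE);

    if (io->block->valid) {
        memcpy(io->buffer, io->block->data, io->length);
        sketch_notify_completion(io->op, io->length, 0);
    } else {
        io->device_busy = true;  /* return now; completion arrives later */
    }
}

/* Steps 13-15: called when the simulated device has finished the request. */
static void sketch_finish_io(sketch_asyncio_t *io, const char *disk_data,
                             int dev_result)
{
    io->device_busy = false;

    if (dev_result != 0) {
        /* Complete the read early with the device's error code. */
        sketch_notify_completion(io->op, 0, dev_result);
        return;
    }

    memcpy(io->block->data, disk_data, SKETCH_BLOCK_SIZE);
    io->block->valid = true;
    sketch_start_io(io);  /* block is now valid, so this copies and completes */
}
```

In this model a first call to sketch_start_io on an invalid block returns without signalling anything, which corresponds to the call chain unwinding in step 12; the later call to sketch_finish_io plays the part of the interrupt-driven completion.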
