System call

Source: Wikipedia, the free encyclopedia.

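In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services, the creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.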

In most systems, system calls can only be made from userspace processes, while in some systems, OS/360 and successors for example, privileged system code also issues system calls.[1]

For embedded systems, system calls typically do not change the privilege mode of the CPU.

Privileges

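The architecture of most modern processors, with the exception of some embedded systems, involves a security model. Under such a model a program is usually limited to its own address space, so that it cannot access or modify other running programs or the operating system itself, and is usually prevented from directly manipulating hardware devices (e.g. the frame buffer or network devices).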

However, many applications need access to these components, so system calls are made available by the operating system to provide well-defined, safe implementations for such operations. The operating system executes at the highest level of privilege, and allows applications to request services via system calls, which are often initiated via interrupts. An interrupt automatically puts the CPU into some elevated privilege level and then passes control to the kernel, which determines whether the calling program should be granted the requested service. If the service is granted, the kernel executes a specific set of instructions over which the calling program has no direct control, returns the privilege level to that of the calling program, and then returns control to the calling program.
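As a small illustration of the kernel deciding whether to grant a request, the following Python sketch asks the kernel to write through a file descriptor that was opened read-only. It assumes a POSIX-like system, and the function name is made up for this example. The kernel is expected to refuse the request and report an error code (EBADF on Linux) instead of performing the write.

```python
import errno
import os
import tempfile


def write_through_read_only_descriptor() -> int:
    """Return the errno reported by the kernel, or 0 if the write succeeded.

    Hypothetical helper for illustration; assumes a POSIX-like system.
    """
    fd, path = tempfile.mkstemp()    # scratch file, opened read/write by mkstemp
    os.close(fd)
    fd = os.open(path, os.O_RDONLY)  # reopen it with read access only
    try:
        os.write(fd, b"x")           # request a service this descriptor does not permit
        return 0
    except OSError as exc:
        return exc.errno             # the kernel refused and reported why
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    code = write_through_read_only_descriptor()
    print(errno.errorcode.get(code, code))
```

The application never touches the file system structures itself; it can only ask, and the kernel answers with either the result or an error.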

The library as an intermediary

Generally, systems provide a library or API that sits between normal programs and the operating system. The library's wrapper functions expose an ordinary function calling convention (a subroutine call on the assembly level) for using the system call, as well as making the system call more modular. Here, the primary function of the wrapper is to place all the arguments to be passed to the system call in the appropriate processor registers (and maybe on the call stack as well), and also to set a unique system call number for the kernel to call. In this way the library, which exists between the OS and the application, increases portability.

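The call to the library function itself does not cause a switch to kernel mode and is usually a normal subroutine call. The actual system call does transfer control to the kernel, and is more implementation-dependent and platform-dependent than the library call abstracting it. Making the system call directly in the application code is more complicated and may require embedded assembly code to be used (in C and C++), as well as requiring knowledge of the low-level binary interface for the system call operation, which may be subject to change over time and thus not be part of the application binary interface; the library functions are meant to abstract this away.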
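The difference between calling a wrapper and supplying the system call number by hand can be sketched in Python with ctypes. This is only an illustration under stated assumptions: it assumes a 64-bit Linux process, the function names and the GETPID_NUMBER table are made up for this example, and the two numbers shown are the Linux values for getpid on x86-64 and AArch64. On any other platform the raw path returns None rather than guessing.

```python
import ctypes
import os
import platform
import sys

# Linux system call numbers for getpid (assumption: only these two
# architectures are covered). The numbers differ per architecture, which
# is one reason applications normally go through the library wrapper.
GETPID_NUMBER = {"x86_64": 39, "aarch64": 172}

libc = ctypes.CDLL(None, use_errno=True)  # the C library already loaded in this process
libc.syscall.restype = ctypes.c_long


def getpid_via_wrapper() -> int:
    """Ordinary function call; the wrapper picks the number and registers."""
    return libc.getpid()


def getpid_via_raw_number():
    """Pass the system call number explicitly to libc's generic syscall()."""
    is_64bit_linux = (sys.platform.startswith("linux")
                      and ctypes.sizeof(ctypes.c_void_p) == 8)
    number = GETPID_NUMBER.get(platform.machine())
    if not is_64bit_linux or number is None:
        return None  # unknown platform: do not guess a number
    return libc.syscall(ctypes.c_long(number))


if __name__ == "__main__":
    print(os.getpid(), getpid_via_wrapper(), getpid_via_raw_number())
```

Both paths end in the same kernel service, but only the wrapper version keeps working unchanged when the code is moved to a platform with a different numbering.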

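On exokernel-based systems, the library is especially important as an intermediary. On exokernels, libraries shield user applications from the very low-level kernel API, and provide abstractions and resource management.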

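IBM's OS/360, DOS/360 and TSS/360 implement most system calls through a library of assembly language macros. In more recent releases of MVS/SP and in all later MVS versions, some system call macros generate Program Call (PC).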

Examples and tools

On Unix, Unix-like and other POSIX-compliant operating systems, popular system calls are open, read, write, close, wait, exec, fork, exit, and kill. Many modern operating systems have hundreds of system calls. For example, Linux and OpenBSD each have over 300 different calls,[2][3] NetBSD has close to 500,[4] FreeBSD has over 500,[5] and Windows has close to 2000, divided between win32k (graphical) and ntdll (core) system calls,[6] while Plan 9 has 51.[7]
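Several of the calls named above can be exercised from Python, whose os module functions are thin layers over the C library wrappers of the same names. The sketch below assumes a POSIX system (os.fork is not available on Windows); the two function names are made up for this example.

```python
import os
import tempfile


def file_round_trip(data: bytes) -> bytes:
    """Write data with write/close, then read it back with open/read/close."""
    fd, path = tempfile.mkstemp()         # creates and opens a scratch file
    try:
        try:
            os.write(fd, data)            # write
        finally:
            os.close(fd)                  # close
        fd = os.open(path, os.O_RDONLY)   # open
        try:
            return os.read(fd, len(data))  # read
        finally:
            os.close(fd)
    finally:
        os.unlink(path)


def child_exit_code(code: int) -> int:
    """fork a child that exits immediately, then wait for it in the parent."""
    pid = os.fork()                       # fork
    if pid == 0:
        os._exit(code)                    # exit (child), skipping Python cleanup
    _, status = os.waitpid(pid, 0)        # wait (parent)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(file_round_trip(b"hello"), child_exit_code(7))
```

Running such a script under a tracer like strace is one way to see the underlying calls, although the exact system calls issued (for example openat instead of open, or clone instead of fork) depend on the C library and kernel version.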

Tools such as strace, ftrace and truss can run a process from the start and report every system call it invokes, or attach to an already running process and intercept any system call it makes, provided the operation does not violate the permissions of the user. This capability is itself usually implemented with system calls such as ptrace or with system calls on files in procfs.

Typical implementations

Implementing system calls requires a transfer of control from user space to kernel space, which involves some sort of architecture-specific feature. A typical way to implement this is to use a software interrupt or trap. Interrupts transfer control to the operating system kernel, so software simply needs to set up some register with the system call number needed, and execute the software interrupt.

This is the only technique provided for many RISC processors, but CISC architectures such as x86 support additional techniques. For example, the x86 instruction set contains the instructions SYSCALL/SYSRET and SYSENTER/SYSEXIT (these two mechanisms were independently created by AMD and Intel, respectively, but in essence they do the same thing). These are "fast" control transfer instructions that are designed to quickly transfer control to the kernel for a system call without the overhead of an interrupt.[8] Linux 2.5 began using this on the x86, where available; formerly it used the INT instruction, where the system call number was placed in the EAX register before interrupt 0x80 was executed.[9][10]

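An older mechanism is the call gate, originally used in Multics and later available on, for example, the Intel x86. It allows a program to call a kernel function directly using a safe control transfer mechanism, which the operating system sets up in advance. This approach has been unpopular on x86, presumably due to the requirement of a far call,[11] which uses x86 memory segmentation, the resulting lack of portability it causes, and the existence of the faster instructions mentioned above.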

On the IA-64 architecture, the EPC (Enter Privileged Code) instruction is used. The first eight system call arguments are passed in registers, and the rest are passed on the stack.

In the IBM System/360 mainframe family, and its successors, a Supervisor Call instruction (SVC), with the number in the instruction rather than in a register, implements a system call for legacy facilities in most of[c] IBM's own operating systems, and for all system calls in Linux. In later versions of MVS, IBM uses the Program Call (PC) instruction for many newer facilities. In particular, PC is used when the caller might be in Service Request Block (SRB) mode.

The PDP-11 minicomputer used the EMT, TRAP and IOT instructions, which, similar to the IBM System/360 SVC and x86 INT, put the code in the instruction; they generate interrupts to specific addresses, transferring control to the operating system. The VAX 32-bit successor to the PDP-11 series used the CHMK, CHME, and CHMS instructions to make system calls to privileged code at various levels; the code is an argument to the instruction.

Categories of system calls

System calls can be grouped roughly into six major categories:[12]

  1. Process control
  2. File management
    • create file, delete file
    • open, close
    • read, write, reposition
    • get/set file attributes
  3. Device management
    • request device, release device
    • read, write, reposition
    • get/set device attributes
    • logically attach or detach devices
  4. Information maintenance
    • get/set total system information (including time, date, computer name, enterprise etc.)
    • get/set process, file, or device metadata (including author, opener, creation time and date, etc.)
  5. Communication
    • create, delete communication connection
    • send, receive messages
    • transfer status information
    • attach or detach remote devices
  6. Protection
    • get/set file permissions
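
To make the categories concrete, the following Python sketch touches a few of them through the os module. It assumes a POSIX system; the function name and the dictionary keys are made up for this example, and the mapping of each call to a category is an illustrative reading of the list above rather than an official classification.

```python
import os
import stat
import tempfile


def sample_by_category() -> dict:
    """Issue one or two calls from several categories and collect the results."""
    fd, path = tempfile.mkstemp()              # file management: create and open
    try:
        os.write(fd, b"abc")                   # file management: write
        os.lseek(fd, 0, os.SEEK_SET)           # file management: reposition
        first_byte = os.read(fd, 1)            # file management: read
        size = os.fstat(fd).st_size            # file management: get file attributes
        os.chmod(path, 0o600)                  # protection: set file permissions
        mode = stat.S_IMODE(os.stat(path).st_mode)  # protection: get file permissions
    finally:
        os.close(fd)                           # file management: close
        os.unlink(path)                        # file management: delete
    return {
        "pid": os.getpid(),                    # process control: get process attributes
        "computer_name": os.uname().nodename,  # information maintenance
        "first_byte": first_byte,
        "size": size,
        "mode": mode,
    }


if __name__ == "__main__":
    print(sample_by_category())
```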

Processor mode and context switching

System calls in most Unix-like systems are processed in kernel mode, which is accomplished by changing the processor execution mode to a more privileged one, but no process context switch is necessary, although a privilege context switch does occur. The hardware sees the world in terms of the execution mode according to the processor status register, and processes are an abstraction provided by the operating system. A system call does not generally require a context switch to another process; instead, it is processed in the context of whichever process invoked it.[13][14]
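
One rough way to observe this is to count the voluntary context switches the operating system records for a process while it issues many short, non-blocking system calls. The Python sketch below assumes a Unix-like system where the resource module and the ru_nvcsw counter are available; the function name is made up for this example. On a typical system the difference would be expected to stay at or near zero, but the exact value depends on the kernel and on whatever else the process is doing, so no particular output is claimed.

```python
import os
import resource


def voluntary_switches_during(calls: int = 10_000) -> int:
    """Count voluntary context switches recorded while issuing getppid calls."""
    before = resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw
    for _ in range(calls):
        os.getppid()  # short system call that does not block
    after = resource.getrusage(resource.RUSAGE_SELF).ru_nvcsw
    return after - before


if __name__ == "__main__":
    print(voluntary_switches_during())
```

Each getppid call enters and leaves kernel mode, yet the process keeps the CPU throughout because nothing forces the scheduler to run a different process.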

In a multithreaded process, system calls can be made from multiple threads. The handling of such calls is dependent on the design of the specific operating system kernel and the application runtime environment. The following list shows typical models followed by operating systems:[15][16]

  • Many-to-one model: All system calls from any user thread in a process are handled by a single kernel-level thread. This model has a serious drawback – any blocking system call (like awaiting input from the user) can freeze all the other threads. Also, since only one thread can access the kernel at a time, this model cannot utilize multiple cores of processors.
  • One-to-one model: Every user thread gets attached to a distinct kernel-level thread during a system call. This model solves the above problem of blocking system calls. It is found in all major Linux distributions, macOS, iOS, and recent Windows and Solaris versions.
  • Many-to-many model: In this model, a pool of user threads is mapped to a pool of kernel threads. All system calls from a user thread pool are handled by the threads in their corresponding kernel thread pool.
  • Hybrid model: This model implements both the many-to-many and one-to-one models, depending on the choice made by the kernel. This is found in old versions of IRIX, HP-UX and Solaris.
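
The benefit of the one-to-one model for blocking calls can be sketched in Python, where CPython threads are native operating system threads and the interpreter lock is released around blocking I/O. In the sketch below, which assumes a POSIX-like system and uses a made-up function name, a worker thread blocks in a read on an empty pipe while the main thread continues to compute and then supplies the data.

```python
import os
import threading


def blocked_reader_and_busy_main():
    """Let one thread block in read() while the main thread keeps working."""
    read_end, write_end = os.pipe()
    received = []

    def reader():
        # Blocks in the kernel until data arrives on the pipe.
        received.append(os.read(read_end, 5))

    worker = threading.Thread(target=reader)
    worker.start()
    progress = sum(range(1000))      # the main thread is not frozen meanwhile
    os.write(write_end, b"hello")    # unblock the worker
    worker.join(timeout=5)
    os.close(read_end)
    os.close(write_end)
    return progress, received


if __name__ == "__main__":
    print(blocked_reader_and_busy_main())
```

Under a many-to-one model the blocking read would have stalled every user thread in the process, so the main thread could never have reached the write that releases the worker.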

See also

  • Linux kernel API
  • VDSO

Notes

  1. ^ In UNIX-like operating systems, system calls are used only for the kernel.
  2. ^ In many but not all cases, IBM documented, e.g., the SVC number, the parameter registers.
  3. ^ The CP components of CP-67 and VM use the Diagnose (DIAG) instruction as a Hypervisor CALL (HVC) from a virtual machine to CP.

References

  1. ^ IBM (March 1967). "Writing SVC Routines". IBM System/360 Operating System System Programmer's Guide (PDF). Third Edition. pp. 32–36. C28-6550-2.
  2. ^ "syscalls(2) - Linux manual page".
  3. ^ OpenBSD (14 September 2013). "System call names (kern/syscalls.c)". BSD Cross Reference.
  4. ^ NetBSD (17 October 2013). "System call names (kern/syscalls.c)". BSD Cross Reference.
  5. ^ "FreeBSD syscalls.c, the list of syscall names and IDs".
  6. ^ Mateusz "j00ru" Jurczyk (5 November 2017). "Windows WIN32K.SYS System Call Table (NT/2000/XP/2003/Vista/2008/7/8/10)".
  7. ^ "sys.h". Plan 9 from Bell Labs. Archived from the original on 8 September 2023, the list of syscall names and IDs.
  8. ^ "SYSENTER". OSDev wiki. Archived from the original on 8 November 2023.
  9. ^ Anonymous (19 December 2002). "Linux 2.5 gets vsyscalls, sysenter support". KernelTrap. Retrieved 1 January 2008.
  10. ^ Manu Garg (2006). "Sysenter Based System Call Mechanism in Linux 2.6".
  11. ^ "Liberation: x86 Instruction Set Reference". renejeschke.de. Retrieved 4 July 2015.
  12. .
  13. ^ Bach, Maurice J. (1986), The Design of the UNIX Operating System, Prentice Hall, pp. 15–16.
  14. ^ Elliot, John (2011). "Discussion of system call implementation at ProgClub including quote from Bach 1986".
  15. ^ "Threads".
  16. ^ "Threading Models" (PDF).
