Zero-copy
"Zero-copy" describes
Principle
Zero-copy
Usually when a user space process has to execute system operations like reading or writing data from/to a
If data has to be copied or moved from a source to a destination and both reside in kernel space (e.g. two files, or a file and a network card), the unnecessary copies from kernel space to user space and back can be avoided by using special zero-copy system calls, which are available in recent versions of most popular operating systems.
Zero-copy versions of operating system elements, such as
As an example, reading a file and then sending it over a network the traditional way requires 2 extra data copies (1 to read from kernel to user space and 1 to write from user to kernel space) and 4 context switches per read/write cycle. Those extra data copies consume CPU time. Sending the file by mapping it with mmap and looping over write calls reduces the context switches to 2 per write call and avoids those 2 extra user data copies. Sending the same file with sendfile reduces the context switches to 2 per sendfile call and eliminates all extra CPU data copies, in both user and kernel space.[1][2][3][4]
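The three approaches compared above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names and the CHUNK size are assumptions made for the example, and the socket is assumed to be a connected, blocking stream socket.

```python
import mmap
import os
import socket

CHUNK = 64 * 1024  # arbitrary buffer size chosen for this sketch


def send_with_read_write(path, sock):
    """Traditional path: every chunk passes through a user-space buffer."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)   # kernel -> user copy
            if not chunk:
                break
            sock.sendall(chunk)     # user -> kernel copy
            sent += len(chunk)
    return sent


def send_with_mmap(path, sock):
    """mmap + write: the file is mapped, so no read() into a buffer occurs."""
    size = os.path.getsize(path)
    if size == 0:
        return 0                    # an empty file cannot be mapped
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # The memoryview exposes the mapping without creating a bytes copy.
            with memoryview(mapped) as view:
                sock.sendall(view)
    return size


def send_with_sendfile(path, sock):
    """sendfile: the kernel transfers the file to the socket directly."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform provides it
        # and otherwise falls back to ordinary send() calls.
        return sock.sendfile(f)
```

All three functions return the number of bytes handed to the socket, so they can be swapped for one another when measuring the difference on a given system.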
Zero-copy protocols are especially important for very high-speed networks in which the capacity of a network link approaches or exceeds the CPU's processing capacity. In such a case the CPU may spend nearly all of its time copying transferred data, and thus becomes a bottleneck which limits the communication rate to below the link's capacity. A rule of thumb used in the industry is that roughly one CPU clock cycle is needed to process one bit of incoming data.
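The rule of thumb can be turned into a rough estimate of CPU load. The following sketch assumes the stated one cycle per bit; the function name and the example link and clock speeds are illustrative assumptions, not measurements.

```python
def cores_needed(link_bits_per_second, cpu_hz, cycles_per_bit=1.0):
    """Estimate how many cores' worth of cycles a saturated link consumes."""
    return link_bits_per_second * cycles_per_bit / cpu_hz


# A 1 Gbit/s link on a 2 GHz core would use about half of that core.
print(cores_needed(1_000_000_000, 2_000_000_000))    # 0.5

# A 10 Gbit/s link would need the cycles of about four 2.5 GHz cores.
print(cores_needed(10_000_000_000, 2_500_000_000))   # 4.0
```

A result above 1 means that, under this rule of thumb, a single core could not keep up with the link.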
Hardware implementations
An early implementation was
Techniques for creating zero-copy software include the use of direct memory access (DMA)-based copying and memory-mapping through a memory management unit (MMU). These features require specific hardware support and usually involve particular memory alignment requirements.
A newer approach used by the Heterogeneous System Architecture (HSA) facilitates the passing of pointers between the CPU and the GPU and also other processors. This requires a unified address space for the CPU and the GPU.[5][6]
Program interfaces
Several operating systems support zero-copying of user data and file contents through specific APIs.
Only a few well-known system calls and APIs available in the most popular operating systems are listed here.
The
The Linux kernel supports zero-copy through various system calls, such as:
- sendfile, sendfile64;[9]
- splice;[10]
- tee;[11]
- vmsplice;[12]
- process_vm_readv;[13]
- process_vm_writev;[14]
- copy_file_range;[15]
- raw sockets with packet mmap[16] or AF_XDP.
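As an illustration of one call from the list above, the following Python sketch copies a file with copy_file_range, which Python exposes as os.copy_file_range on Linux (Python 3.8 or later). The function name and the fallback behaviour are assumptions made for this example; the fallback is an ordinary user-space copy, not a zero-copy path.

```python
import os
import shutil


def copy_file_in_kernel(src_path, dst_path):
    """Copy a file, preferring copy_file_range; return the method used."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if hasattr(os, "copy_file_range"):
            try:
                remaining = os.fstat(src.fileno()).st_size
                while remaining > 0:
                    copied = os.copy_file_range(
                        src.fileno(), dst.fileno(), remaining
                    )
                    if copied == 0:     # source ended earlier than expected
                        break
                    remaining -= copied
                return "copy_file_range"
            except OSError:
                # Not supported by this kernel or file system: start over.
                src.seek(0)
                dst.seek(0)
                dst.truncate()
        shutil.copyfileobj(src, dst)    # plain copy through a user buffer
        return "userspace"
```

The return value only reports which path was taken, which helps when checking whether a given system actually used the in-kernel copy.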
Some of them are specified in
FreeBSD, NetBSD, OpenBSD, DragonFly BSD, etc. support zero-copy through at least these system calls:
- sendfile;[17]
- write,[18] writev[19] with mmap.[20]
macOS should support zero-copy through the FreeBSD portion of its kernel, because it offers the same system calls (and its manual pages are still tagged BSD), such as:
- sendfile.[21]
Oracle Solaris supports zero-copy through at least these system calls:
- sendfile,[22] sendfilev;[23]
- write,[24] writev[25] with mmap.[26]
Microsoft Windows supports zero-copy through at least this system call:
- TransmitFile.[27]
Java input streams can support zero-copy through the transferTo() method of java.nio.channels.FileChannel, if the underlying operating system also supports zero-copy.[28]
RDMA (Remote Direct Memory Access) protocols rely heavily on zero-copy techniques.
See also
- AF_XDP
- Call by reference
- Device driver
- Embedded system
- InfiniBand
- Locality of reference
- NCOPY
- netsniff-ng
- Programmed input/output
- Socket Direct Protocol
- Scatter/gather I/O
References
- ^ a b Stancevic, Dragan (2003-01-01). "Zero Copy I: User-Mode Perspective". www.linuxjournal.com. Retrieved 2021-10-14.
- ^ )
- ^ a b Song, Jia; Alves-Foss, Jim (2012-01-01). "Performance Review of Zero Copy Techniques" (PDF). www.uidaho.edu. Retrieved 2021-10-14.
- ^ a b Baldwin, John (2020-05-01). "TLS offload in the kernel" (PDF). freebsdfoundation.org. Retrieved 2021-10-14.
- ^ "The programmer's guide to the APU galaxy" (PDF).
- ^ "AMD Outlines HSA Roadmap: Unified Memory for CPU/GPU". 2012-02-02.
- OpenDOS 7.01.)
- ^ Paul, Matthias R. (1997-07-30) [1994-05-01]. "II.4. Undokumentierte Eigenschaften externer Kommandos: MOVE.EXE". NWDOS-TIPs — Tips & Tricks rund um Novell DOS 7, mit Blick auf undokumentierte Details, Bugs und Workarounds. Release 157 (in German) (3 ed.). Archived from the original on 2017-09-10. Retrieved 2014-08-06.
(NB. NWDOSTIP.TXT is a comprehensive work on Novell DOS 7 and OpenDOS 7.01, including the description of many undocumented features and internals. It is part of the author's yet larger MPDOSTIP.ZIP collection maintained up to 2001 and distributed on many sites at the time. The provided link points to a HTML-converted older version of the NWDOSTIP.TXT file.)
- ^ "sendfile(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "splice(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "tee(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "vmsplice(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "process_vm_readv(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "process_vm_writev(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "copy_file_range(2) - Linux manual page". man7.org. 2021-03-22. Retrieved 2021-10-13.
- ^ "Linux PACKET_MMAP documentation". kernel.org.
- ^ "sendfile(2) - FreeBSD manual pages". www.freebsd.org. 2020-04-30. Retrieved 2021-10-13.
- ^ "write(2) - FreeBSD manual pages". www.freebsd.org. 2020-04-30. Retrieved 2021-10-13.
- ^ "writev(2) - FreeBSD manual pages". www.freebsd.org. 2020-04-30. Retrieved 2021-10-13.
- ^ "mmap(2) - FreeBSD manual pages". www.freebsd.org. 2020-04-30. Retrieved 2021-10-13.
- ^ "sendfile(2) - Mac OS X Manual Page". developer.apple.com. 2006-03-31. Retrieved 2021-10-13.
- ^ "sendfile(3C) - Solaris manual pages". docs.oracle.com. 2021-08-13. Retrieved 2021-10-13.
- ^ "sendfilev(3C) - Solaris manual pages". docs.oracle.com. 2021-08-13. Retrieved 2021-10-13.
- ^ "write(2) - Solaris manual pages". docs.oracle.com. 2021-08-13. Retrieved 2021-10-13.
- ^ "writev(2) - Solaris manual pages". docs.oracle.com. 2021-08-13. Retrieved 2021-10-13.
- ^ "mmap(2) - Solaris manual pages". docs.oracle.com. 2021-08-13. Retrieved 2021-10-13.
- ^ "TransmitFile function (Win32)". docs.microsoft.com. 2021-05-10. Retrieved 2021-10-13.
- ^ Palaniappan, Sathish K.; Nagaraja, Pramod B. (2008-09-02). "Java zero-copy". developer.ibm.com. Retrieved 2021-10-13.