Supercomputer operating system

A supercomputer operating system is an operating system intended for supercomputers. Since the end of the 20th century, supercomputer operating systems have undergone major transformations, as fundamental changes have occurred in supercomputer architecture.^[1] While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been away from in-house operating systems and toward some form of Linux,^[2] which ran all the supercomputers on the TOP500 list in November 2017. In 2021, the top 10 computers ran, for instance, Red Hat Enterprise Linux (RHEL) or a variant of it, or another Linux distribution such as Ubuntu.

Given that modern massively parallel supercomputers typically separate computations from other services by using multiple types of nodes, they usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as Compute Node Kernel (CNK) or Compute Node Linux (CNL) on compute nodes, but a larger system such as a Linux-derivative on server and input/output (I/O) nodes.^[3]^[4]

While in a traditional multi-user computer system job scheduling is in effect a tasking problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources, as well as deal gracefully with inevitable hardware failures when tens of thousands of processors are present.^[5]

Although most modern supercomputers use the Linux operating system,^[6] each manufacturer has made its own specific changes to the Linux-derivative it uses, and no industry standard exists, partly because differences in hardware architectures require changes to optimize the operating system for each hardware design.^[1]^[7]

Context and overview

In the early days of supercomputing, the basic architectural concepts were evolving rapidly, and system software had to follow hardware innovations that usually took rapid turns.^[1] In the early systems, operating systems were custom tailored to each supercomputer to gain speed, yet in the rush to develop them, serious software quality challenges surfaced and in many cases the cost and complexity of system software development became as much an issue as that of hardware.^[1]

In the 1980s the cost of software development at Cray came to equal what the company spent on hardware, and that trend was partly responsible for a move away from in-house operating systems to the adaptation of generic software.^[2] The first wave of operating system changes came in the mid-1980s, as vendor-specific operating systems were abandoned in favor of Unix. Despite early skepticism, this transition proved successful.^[1]^[2]

By the early 1990s, major changes were occurring in supercomputing system software. As hardware vendors adapted Unix to their systems, new features were added to it, e.g., fast file systems and tunable process schedulers.^[1] However, all the companies that adapted Unix made unique changes to it, rather than collaborating on an industry standard to create "Unix for supercomputers". This was partly because differences in their architectures required these changes to optimize Unix for each architecture.^[1]

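Thus, as general-purpose operating systems became stable, supercomputers began to borrow and adapt the critical system code from them and relied on the rich set of secondary functions that came with them, rather than reinventing the wheel.

Microkernels, which implement only a minimal set of operating system functions, also emerged in this period; Mach at Carnegie Mellon University and ChorusOS at INRIA were early examples.^[1]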

The separation of the operating system into separate components became necessary as supercomputers developed different types of nodes, e.g., compute nodes versus I/O nodes. Thus modern supercomputers usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as CNK or CNL on compute nodes, but a larger system such as a Linux-derivative on server and I/O nodes.^[3]^[4]

Early systems

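The CDC 6600, generally considered the first supercomputer in the world, ran the Chippewa Operating System, which influenced the later KRONOS and SCOPE systems.^[9]^[10]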

The first Cray-1 was delivered to the Los Alamos Lab with no operating system or any other software.^[11] Los Alamos developed both the application software and the operating system for it.^[11] The main timesharing system for the Cray-1, the Cray Time Sharing System (CTSS), was then developed at the Livermore Labs as a direct descendant of the Livermore Time Sharing System (LTSS), which had been written for the CDC 6600 twenty years earlier.^[11]

In developing supercomputers, rising software costs soon became dominant, as evidenced by the cost of software development at Cray growing in the 1980s to equal its cost for hardware.^[2] That trend was partly responsible for a move away from the in-house Cray Operating System to the Unix-based UNICOS.^[2] In 1985, the Cray-2 was the first system to ship with the UNICOS operating system.^[12]

Around the same time, the EOS operating system was developed by ETA Systems for use in their ETA10 supercomputers.^[13] Written in Cybil, a Pascal-like language from Control Data Corporation, EOS highlighted the difficulty of developing stable operating systems for supercomputers, and eventually a Unix-like system was offered on the same machine.^[13]^[14] The lessons learned from developing ETA system software included the high level of risk associated with developing a new supercomputer operating system, and the advantages of using Unix with its large extant base of system software libraries.^[13]

By the mid-1990s, despite the extant investment in older operating systems, the trend was toward the use of Unix-based systems, which also facilitated the use of interactive graphical user interfaces (GUIs) for scientific computing across multiple platforms.^[15] The move toward a commodity OS had opponents, who cited the fast pace and focus of Linux development as a major obstacle to adoption.^[16] As one author wrote, "Linux will likely catch up, but we have large-scale systems now". Nevertheless, that trend continued to gain momentum, and by 2005 virtually all supercomputers used some Unix-like OS.^[17] These variants of Unix included IBM AIX, the open source Linux system, and other adaptations such as UNICOS from Cray.^[17] By the end of the 20th century, Linux was estimated to command the highest share of the supercomputing market.^[1]^[18]

Modern approaches

Blue Gene/P supercomputer at Argonne National Lab

The IBM Blue Gene supercomputer uses the CNK operating system on the compute nodes, but uses a modified Linux-based kernel called I/O Node Kernel (INK) on the I/O nodes.^[3]^[19] CNK is a lightweight kernel that runs on each node and supports a single application running for a single user on that node. For the sake of efficient operation, the design of CNK was kept simple and minimal, with physical memory being statically mapped and the CNK neither needing nor providing scheduling or context switching.^[3] CNK does not even implement file I/O on the compute node, but delegates that to dedicated I/O nodes.^[19] However, given that on the Blue Gene multiple compute nodes share a single I/O node, the I/O node operating system does require multi-tasking, hence the selection of the Linux-based operating system.^[3]^[19]
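The delegation of file I/O described above can be illustrated with a small conceptual model. The sketch below is not CNK's or INK's actual interface; the class names `IONode` and `ComputeNode`, the `handle` method, and the in-memory "file system" are all hypothetical and exist only to show several compute nodes forwarding their I/O requests to one shared I/O node.

```python
class IONode:
    """Hypothetical stand-in for an I/O node that serves several compute nodes."""

    def __init__(self):
        self.files = {}   # in-memory stand-in for a real file system
        self.served = []  # log of (compute node id, operation) pairs

    def handle(self, node_id, op, path, data=None):
        # The I/O node multiplexes requests arriving from many compute nodes.
        self.served.append((node_id, op))
        if op == "write":
            self.files[path] = self.files.get(path, "") + data
            return len(data)
        if op == "read":
            return self.files.get(path, "")
        raise ValueError(f"unsupported operation: {op}")


class ComputeNode:
    """Hypothetical compute node: runs one application and performs no
    file I/O itself; every request is forwarded to its I/O node."""

    def __init__(self, node_id, io_node):
        self.node_id = node_id
        self.io_node = io_node

    def write(self, path, data):
        return self.io_node.handle(self.node_id, "write", path, data)

    def read(self, path):
        return self.io_node.handle(self.node_id, "read", path)


if __name__ == "__main__":
    io_node = IONode()
    nodes = [ComputeNode(i, io_node) for i in range(4)]
    for node in nodes:
        node.write("/results", f"node{node.node_id};")
    print(nodes[0].read("/results"))
    print(io_node.served)
```

In this model only the I/O node ever has more than one client to serve, which mirrors the reason given above for running a multi-tasking, Linux-based kernel there while keeping the compute-node kernel minimal.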

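While in traditional multi-user computer systems and early supercomputers job scheduling was in effect a task scheduling problem for processing and peripheral resources, in a massively parallel system the job management system needs to manage the allocation of both computational and communication resources.^[5] A typical parallel job scheduler has a master scheduler which instructs some number of slave schedulers to launch, monitor, and control parallel jobs, and periodically receives reports from them about the status of job progress.^[5]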
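The master and subordinate scheduler arrangement can be sketched in a few lines. This is a minimal, single-process illustration under stated assumptions, not the design of any particular job management system: the class names `MasterScheduler` and `WorkerScheduler`, the "least loaded worker" placement rule, and the modelling of a job as a fixed number of steps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerScheduler:
    """Hypothetical subordinate scheduler that launches and monitors jobs."""
    name: str
    running: dict = field(default_factory=dict)  # job id -> remaining steps

    def launch(self, job_id, steps):
        self.running[job_id] = steps

    def tick(self):
        """Advance every running job by one step and return a status report."""
        report = {}
        for job_id in list(self.running):
            self.running[job_id] -= 1
            if self.running[job_id] <= 0:
                del self.running[job_id]
                report[job_id] = "done"
            else:
                report[job_id] = "running"
        return report


class MasterScheduler:
    """Hypothetical master scheduler that places jobs and collects reports."""

    def __init__(self, workers):
        self.workers = workers
        self.status = {}  # job id -> last reported state

    def submit(self, job_id, steps):
        # Assumed placement rule: pick the worker with the fewest running jobs.
        worker = min(self.workers, key=lambda w: len(w.running))
        worker.launch(job_id, steps)
        self.status[job_id] = "running"
        return worker.name

    def poll(self):
        # Periodically gather the status reports of all subordinate schedulers.
        for worker in self.workers:
            self.status.update(worker.tick())
        return dict(self.status)


if __name__ == "__main__":
    master = MasterScheduler([WorkerScheduler("w0"), WorkerScheduler("w1")])
    master.submit("job-a", steps=1)
    master.submit("job-b", steps=2)
    print(master.poll())
    print(master.poll())
```

A real system would run the subordinate schedulers on separate nodes and exchange reports over the network; the polling loop here only stands in for that periodic reporting.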

Some, but not all, supercomputer schedulers attempt to maintain locality of job execution. The PBS Pro scheduler used on the Cray XT3 and Cray XT4 systems does not attempt to optimize locality on its three-dimensional torus interconnect, but simply uses the first available processor.^[20] On the other hand, IBM's scheduler on the Blue Gene supercomputers aims to exploit locality and minimize network contention by assigning tasks from the same application to one or more midplanes of an 8×8×8 node group.^[20] The Slurm Workload Manager scheduler uses a best-fit algorithm, and performs Hilbert curve scheduling to optimize locality of task assignments.^[20] Several modern supercomputers such as the Tianhe-2 use Slurm, which arbitrates contention for resources across the system. Slurm is open source, Linux-based, very scalable, and can manage thousands of nodes in a computer cluster with a sustained throughput of over 100,000 jobs per hour.^[21]^[22]
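The idea behind combining a Hilbert curve with best-fit allocation can be sketched as follows. This is an illustration only and not Slurm's implementation: it assumes a two-dimensional n×n grid of nodes (n a power of two) rather than a three-dimensional interconnect, and the function names `hilbert_index` and `best_fit_allocate` are hypothetical. The curve maps grid coordinates to a one-dimensional order in which consecutive positions are always neighbouring nodes, so a contiguous run along the curve is also compact in the grid.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along a Hilbert curve."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def best_fit_allocate(free_cells, n, count):
    """Choose `count` free nodes that are contiguous along the curve,
    taken from the smallest run of free nodes that is large enough.
    Returns a list of (x, y) cells, or None if no run fits."""
    ordered = sorted((hilbert_index(n, x, y), (x, y)) for x, y in free_cells)
    runs = []
    for d, cell in ordered:
        if runs and d == runs[-1][-1][0] + 1:
            runs[-1].append((d, cell))   # extends the current contiguous run
        else:
            runs.append([(d, cell)])     # starts a new run
    fitting = [run for run in runs if len(run) >= count]
    if not fitting:
        return None
    best = min(fitting, key=len)         # best fit: least leftover space
    return [cell for _, cell in best[:count]]


if __name__ == "__main__":
    n = 4
    all_cells = [(x, y) for x in range(n) for y in range(n)]
    # Assume the nodes at curve positions 5 and 9 are busy.
    free = [c for c in all_cells if hilbert_index(n, *c) not in (5, 9)]
    print(best_fit_allocate(free, n, 3))
```

A first-available policy, by contrast, would simply take the first `count` entries of `free_cells` regardless of where they lie in the grid.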

References

[1] ISBN 0-387-09765-1, pages 426–429.

[2] ISBN 0-262-63188-1, pages 149–151.

[3] ISBN 3-540-22924-8, page 835.

[4] An Evaluation of the Oak Ridge National Laboratory Cray XT3 by Sadaf R. Alam, et al., International Journal of High Performance Computing Applications, February 2008, vol. 22, no. 1, pages 52–80.

[5] ISBN 978-3-540-31024-2, pages 95–101.

[6] ZDNet. Retrieved June 20, 2013.

[7] "Top500 OS chart". Top500.org. Archived from the original on 2012-03-05. Retrieved 2010-10-31.

[8] ISBN 0-8157-2851-4, page 82.

[9] ISBN 0-262-22064-4, page 258.

[10] Design of a computer: the Control Data 6600 by James E. Thornton, Scott, Foresman Press, 1970, page 163.

[11] ISBN 0-8157-2851-4, pages 81–83.

[12] ISBN 0-444-82163-5, page 126.

[13] Lloyd M. Thorndyke, "The Demise of the ETA Systems", in Frontiers of Supercomputing II by Karyn R. Ames, Alan Brenner, 1994, ISBN 0-520-08401-2, pages 489–497.

[14] ISBN 3-540-19664-1, page 326.

[15] ISBN 0-520-08401-2, page 356.

[16] Brightwell, Ron; Riesen, Rolf; Maccabe, Arthur. "On the Appropriateness of Commodity Operating Systems for Large-Scale, Balanced Computing Systems" (PDF). Retrieved January 29, 2013.

[17] ISBN 0-309-09502-6, page 136.

[18] Forbes magazine, 03.15.05: Linux Rules Supercomputers.

[19] ISBN 3-540-37783-2.

[20] ISBN 3-642-04632-0, pages 138–144.

[21] SLURM at SchedMD.

[22] Jette, M. and M. Grondona, SLURM: Simple Linux Utility for Resource Management, in the Proceedings of ClusterWorld Conference, San Jose, California, June 2003.
