Dynamic array
In
A dynamic array is not the same thing as a
Bounded-size dynamic arrays and capacity
A simple dynamic array can be constructed by allocating a fixed-size array, typically larger than the number of elements immediately required. The elements of the dynamic array are stored contiguously at the start of the underlying array, and the remaining positions towards the end of the underlying array are reserved, or unused. Elements can be added at the end of a dynamic array in constant time by using the reserved space, until this space is completely consumed. Once all of the space is consumed and a further element is to be added, the underlying fixed-size array must be increased in size. Resizing is typically expensive because it involves allocating a new underlying array and copying each element from the original array. Elements can be removed from the end of a dynamic array in constant time, as no resizing is required. The number of elements held by the dynamic array is its logical size or size, while the size of the underlying array is called the dynamic array's capacity or physical size, which is the maximum possible size without relocating data.[2]
A fixed-size array will suffice in applications where the maximum logical size is fixed (e.g. by specification), or can be calculated before the array is allocated. A dynamic array might be preferred if:
- the maximum logical size is unknown, or difficult to calculate, before the array is allocated
- a maximum logical size given by a specification is considered likely to change
- the amortized cost of resizing a dynamic array does not significantly affect performance or responsiveness
Geometric expansion and amortized cost
To avoid incurring the cost of resizing many times, dynamic arrays resize by a large amount, such as doubling in size, and use the reserved space for future expansion. The operation of adding an element to the end might work as follows:
function insertEnd(dynarray a, element e)
    if (a.size == a.capacity)
        // resize a to twice its current capacity:
        a.capacity ← a.capacity * 2
        // (copy the contents to the new memory location here)
    a[a.size] ← e
    a.size ← a.size + 1
As n elements are inserted, the capacities form a geometric progression. Expanding the array by any constant proportion a > 1 ensures that inserting n elements takes O(n) time overall, meaning that each insertion takes amortized constant time. Many dynamic arrays also deallocate some of the underlying storage if its size drops below a certain threshold, such as 30% of the capacity. This threshold must be strictly smaller than 1/a in order to provide hysteresis (a stable band that avoids repeatedly growing and shrinking) and to support mixed sequences of insertions and removals with amortized constant cost.
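The growth and shrink policy described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the class name `DynamicArray`, the growth factor of 2, the shrink threshold of 1/4, and the `copies` counter are all choices made for this example.

```python
class DynamicArray:
    """Illustrative dynamic array with geometric growth and hysteresis."""

    GROWTH = 2        # growth factor a (assumed for this sketch)
    SHRINK_AT = 0.25  # shrink threshold, strictly smaller than 1 / GROWTH

    def __init__(self):
        self.size = 0                        # logical size
        self.capacity = 1                    # physical size
        self._data = [None] * self.capacity  # stands in for a fixed-size array
        self.copies = 0                      # elements moved by all resizes

    def _resize(self, new_capacity):
        # Allocate a new underlying array and copy every live element.
        new_data = [None] * new_capacity
        for i in range(self.size):
            new_data[i] = self._data[i]
        self.copies += self.size
        self._data = new_data
        self.capacity = new_capacity

    def append(self, value):
        if self.size == self.capacity:
            self._resize(self.capacity * self.GROWTH)
        self._data[self.size] = value
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty DynamicArray")
        self.size -= 1
        value = self._data[self.size]
        self._data[self.size] = None
        # Shrink only once occupancy falls to the threshold, so that an
        # append right after a shrink does not immediately trigger a grow.
        if self.capacity > 1 and self.size <= self.capacity * self.SHRINK_AT:
            self._resize(max(1, self.capacity // self.GROWTH))
        return value

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return self._data[index]

    def __len__(self):
        return self.size
```

After a shrink the array is half full, so at least a quarter of the new capacity must be added or removed before the next resize. That gap is the stable band the threshold is meant to provide.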
Dynamic arrays are a common example when teaching amortized analysis.[3][4]
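As a worked instance of that analysis, consider the doubling strategy starting from capacity 1 (an assumption made here for concreteness). Inserting n elements triggers k = ⌈log₂ n⌉ resizes, and the i-th resize copies 2^i elements, so the total work can be bounded as:

```latex
\begin{align*}
\text{copies}(n) &= \sum_{i=0}^{k-1} 2^{i} = 2^{k} - 1,
    \qquad k = \lceil \log_2 n \rceil \\
                 &< 2^{\log_2 n + 1} = 2n \\
\text{total cost}(n) &= \underbrace{n}_{\text{element writes}} + \text{copies}(n) < 3n \\
\frac{\text{total cost}(n)}{n} &< 3 = O(1)
\end{align*}
```

For example, n = 1000 gives k = 10 and 2^10 − 1 = 1023 copied elements, which is below 2n = 2000.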
Growth factor
The growth factor for the dynamic array depends on several factors including a space-time trade-off and algorithms used in the memory allocator itself. For growth factor a, the average time per insertion operation is about a/(a−1), while the number of wasted cells is bounded above by (a−1)n.
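The trade-off can be explored with a small simulation. The sketch below is illustrative only: `simulate_appends` is a hypothetical helper, the starting capacity of 1 and the round-up rule are assumptions, and "time" is approximated by counting element writes. The measured figure depends on how recently the last resize happened, so it should be read as being of the same order as a/(a−1) rather than equal to it.

```python
import math


def simulate_appends(n, growth):
    """Count element writes when appending n items with a given growth factor.

    Returns (writes_per_append, final_capacity). Each append writes one
    element; each resize additionally copies every existing element.
    """
    capacity, size, writes = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            # Round up so the capacity always grows by at least one slot.
            capacity = max(capacity + 1, math.ceil(capacity * growth))
            writes += size  # copy the existing elements to the new array
        writes += 1         # store the new element
        size += 1
    return writes / n, capacity


if __name__ == "__main__":
    n = 100_000
    for a in (1.25, 1.5, 2.0, 4.0):
        per_append, capacity = simulate_appends(n, a)
        print(f"a={a}: {per_append:.2f} writes per append "
              f"(a/(a-1) = {a / (a - 1):.2f}), "
              f"unused slots = {capacity - n}, bound (a-1)n = {(a - 1) * n:.0f}")
```

Smaller growth factors leave fewer unused slots but copy elements more often; larger factors do the opposite.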
Below are growth factors used by several popular implementations:
Implementation | Growth factor (a) |
---|---|
Java ArrayList[1] | 1.5 (3/2) |
Python PyListObject[7] | ~1.125 (n + (n >> 3)) |
Microsoft Visual C++ 2013[8] | 1.5 (3/2) |
G++ 5.2.0[5] | 2 |
Clang 3.6[5] | 2 |
Facebook folly/FBVector[9] | 1.5 (3/2) |
Rust Vec[10] | 2 |
Go slices[11] | between 1.25 and 2 |
Nim sequences[12] | 2 |
SBCL (Common Lisp) vectors[13] | 2 |
C# (.NET 8) List | 2 |
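Over-allocation of this kind can be observed indirectly in CPython. The sketch below assumes the CPython interpreter, where `sys.getsizeof` on a list reflects the allocated capacity; the helper name `count_resizes` is made up for this example, and the exact counts vary between interpreter versions and platforms.

```python
import sys


def count_resizes(n):
    """Append n items to a list and count how often its allocation changes."""
    items = []
    last = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        now = sys.getsizeof(items)
        if now != last:  # the underlying buffer was reallocated
            resizes += 1
            last = now
    return resizes


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, "appends caused", count_resizes(n), "reallocations")
```

Because the capacity grows geometrically, the number of reallocations grows roughly with the logarithm of n rather than with n itself.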
Performance
Data structure | Peek (index) | Insert or delete at beginning | Insert or delete at end | Insert or delete in middle | Excess space, average |
---|---|---|---|---|---|
Linked list | Θ(n) | Θ(1) | Θ(1), known end element; Θ(n), unknown end element | Θ(n)[14][15] | Θ(n) |
Array | Θ(1) | — | — | — | 0 |
Dynamic array | Θ(1) | Θ(n) | Θ(1) amortized | Θ(n) | Θ(n)[16] |
Balanced tree | Θ(log n) | Θ(log n) | Θ(log n) | Θ(log n) | Θ(n) |
Random-access list | Θ(log n)[17] | Θ(1) | —[17] | —[17] | Θ(n) |
Hashed array tree | Θ(1) | Θ(n) | Θ(1) amortized | Θ(n) | Θ(√n) |
The dynamic array has performance similar to an array, with the addition of new operations to add and remove elements:
- Getting or setting the value at a particular index (constant time)
- Iterating over the elements in order (linear time, good cache performance)
- Inserting or deleting an element in the middle of the array (linear time)
- Inserting or deleting an element at the end of the array (constant amortized time)
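The linear cost of inserting or deleting in the middle comes from shifting every later element by one position. A minimal sketch, assuming a fixed-capacity Python list as the underlying buffer and a separately tracked logical size (the function names `insert_at` and `delete_at` are illustrative):

```python
def insert_at(data, size, index, value):
    """Insert value at index in a buffer holding `size` live elements.

    `data` is a fixed-capacity list with at least one free slot.
    Returns the new logical size.
    """
    if not 0 <= index <= size:
        raise IndexError("index out of range")
    if size == len(data):
        raise OverflowError("buffer full; a dynamic array would resize here")
    # Shift the tail one slot to the right, working backwards from the end.
    for i in range(size, index, -1):
        data[i] = data[i - 1]
    data[index] = value
    return size + 1


def delete_at(data, size, index):
    """Delete the element at index and return the new logical size."""
    if not 0 <= index < size:
        raise IndexError("index out of range")
    # Shift the tail one slot to the left, overwriting the deleted element.
    for i in range(index, size - 1):
        data[i] = data[i + 1]
    data[size - 1] = None
    return size - 1
```

When `index` equals the logical size (insertion) or the last position (deletion), neither loop runs, which is why the same operations at the end take constant time apart from resizing.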
Dynamic arrays benefit from many of the advantages of arrays, including good
Compared to
A balanced tree can store a list while providing all operations of both dynamic arrays and linked lists reasonably efficiently, but both insertion at the end and iteration over the list are slower than for a dynamic array, in theory and in practice, due to non-contiguous storage and tree traversal/manipulation overhead.
Variants
Goodrich[18] presented a dynamic array algorithm called tiered vectors that provides O(n^(1/k)) performance for insertions and deletions from anywhere in the array, and O(k) get and set, where k ≥ 2 is a constant parameter.
Hashed array tree (HAT) is a dynamic array algorithm published by Sitarski in 1996.[19] A hashed array tree wastes on the order of n^(1/2) storage space, where n is the number of elements in the array. The algorithm has O(1) amortized performance when appending a series of objects to the end of a hashed array tree.
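The idea can be sketched as a top-level array of pointers to equally sized leaves, with both sizes kept at the same power of two. The Python class below is a simplified, append-only illustration and not Sitarski's published code: the class and attribute names are invented for this example, and the rebuild step copies through a temporary list for brevity, which a space-conscious implementation would avoid by migrating one leaf at a time.

```python
class HashedArrayTree:
    """Simplified append-only hashed array tree (illustrative sketch)."""

    def __init__(self):
        self._power = 1                          # leaves hold 2**power items
        self._top = [None] * (1 << self._power)  # top array of leaf pointers
        self._size = 0

    def _leaf_size(self):
        return 1 << self._power

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        # High bits select the leaf, low bits select the slot inside it.
        leaf = self._top[index >> self._power]
        return leaf[index & (self._leaf_size() - 1)]

    def append(self, value):
        if self._size == self._leaf_size() ** 2:
            self._grow()
        slot = self._size >> self._power
        if self._top[slot] is None:
            # Leaves are allocated lazily, one at a time.
            self._top[slot] = [None] * self._leaf_size()
        self._top[slot][self._size & (self._leaf_size() - 1)] = value
        self._size += 1

    def _grow(self):
        # Double both the top array and the leaf size, then reinsert.
        # Simplification: uses a temporary list holding all elements.
        old = [self[i] for i in range(self._size)]
        self._power += 1
        self._top = [None] * self._leaf_size()
        self._size = 0
        for value in old:
            self.append(value)
```

Between rebuilds, the only unused storage is the top array and the unfilled part of the last leaf, each of size proportional to the square root of the capacity.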
In a 1999 paper,[20] Brodnik et al. describe a tiered dynamic array data structure, which wastes only n^(1/2) space for n elements at any point in time, and they prove a lower bound showing that any dynamic array must waste this much space if the operations are to remain amortized constant time. Additionally, they present a variant where growing and shrinking the buffer has not only amortized but worst-case constant time.
Bagwell (2002)[21] presented the VList algorithm, which can be adapted to implement a dynamic array.
Naïve resizable arrays -- also called "the worst implementation" of resizable arrays -- keep the allocated size of the array exactly big enough for all the data it contains, perhaps by calling
Language support
C++'s std::vector and Rust's std::vec::Vec are implementations of dynamic arrays, as are the ArrayList[27] classes supplied with the Java API[28]: 236 and the .NET Framework.[29][30] The generic List<> class supplied with version 2.0 of the .NET Framework is also implemented with dynamic arrays. Smalltalk's OrderedCollection is a dynamic array with dynamic start and end indices, making the removal of the first element also O(1).
Python's list datatype implementation is a dynamic array whose growth pattern is: 0, 4, 8, 16, 24, 32, 40, 52, 64, 76, ...[31]
Ada's Ada.Containers.Vectors generic package provides a dynamic array implementation for a given subtype.
Many scripting languages such as Perl and Ruby offer dynamic arrays as a built-in primitive data type.
Several cross-platform frameworks provide dynamic array implementations for C, including CFArray and CFMutableArray in Core Foundation, and GArray and GPtrArray in GLib.
Common Lisp provides rudimentary support for resizable vectors: the built-in array type can be configured as adjustable, with the insertion position given by the fill pointer.
See also
- Stack (data structure)
- Queue (data structure)
References
- ^ a b See, for example, the source code of java.util.ArrayList class from OpenJDK 6.
- ISBN 978-1423902188
- ^ a b Goodrich, Michael T.; Tamassia, Roberto (2002), "1.5.2 Analyzing an Extendable Array Implementation", Algorithm Design: Foundations, Analysis and Internet Examples, Wiley, pp. 39–41.
- ^ ISBN 0-262-03293-7.
- ^ a b c "C++ STL vector: definition, growth factor, member functions". Archived from the original on 2015-08-06. Retrieved 2015-08-05.
- ^ "vector growth factor of 1.5". comp.lang.c++.moderated. Google Groups. Archived from the original on 2011-01-22. Retrieved 2015-08-05.
- ^ List object implementation from github.com/python/cpython/, retrieved 2020-03-23.
- ^ Brais, Hadi. "Dissecting the C++ STL Vector: Part 3 - Capacity & Size". Micromysteries. Retrieved 2015-08-05.
- ^ "facebook/folly". GitHub. Retrieved 2015-08-05.
- ^ "rust-lang/rust". GitHub. Retrieved 2020-06-09.
- ^ "golang/go". GitHub. Retrieved 2021-09-14.
- ^ "The Nim memory model". zevv.nl. Retrieved 2022-05-24.
- ^ "sbcl/sbcl". GitHub. Retrieved 2023-02-15.
- ^ Day 1 Keynote - Bjarne Stroustrup: C++11 Style at GoingNative 2012 on channel9.msdn.com from minute 45 or foil 44
- ^ Number crunching: Why you should never, ever, EVER use linked-list in your code again at kjellkod.wordpress.com
- ^ Brodnik, Andrej; Carlsson, Svante; Sedgewick, Robert; Munro, JI; Demaine, ED (1999), Resizable Arrays in Optimal Time and Space (Technical Report CS-99-09) (PDF), Department of Computer Science, University of Waterloo
- ^ .
- ISBN 978-3-540-66279-2
- ^ a b Sitarski, Edward (September 1996), "HATs: Hashed array trees", Algorithm Alley, Dr. Dobb's Journal, 21 (11)
- ^ Brodnik, Andrej; Carlsson, Svante; Sedgewick, Robert; Munro, JI; Demaine, ED (1999), Resizable Arrays in Optimal Time and Space (PDF) (Technical Report CS-99-09), Department of Computer Science, University of Waterloo
- ^ Bagwell, Phil (2002), Fast Functional Lists, Hash-Lists, Deques and Variable Length Arrays, EPFL
- ^ Mike Lam. "Dynamic Arrays".
- ^ "Amortized Time".
- ^ "Hashed Array Tree: Efficient representation of Array".
- ^ "Different notions of complexity".
- ^ Peter Kankowski. "Dynamic arrays in C".
- ^ Javadoc on ArrayList
- ISBN 978-0134685991.
- ^ ArrayList Class
- ISBN 978-1617294532.
- ^ listobject.c (github.com)
External links
- NIST Dictionary of Algorithms and Data Structures: Dynamic array
- VPOOL - C language implementation of dynamic array.
- CollectionSpy — A Java profiler with explicit support for debugging ArrayList- and Vector-related issues.
- Open Data Structures - Chapter 2 - Array-Based Lists, Pat Morin