Cartesian tree

In
Cartesian trees were introduced by
Definition
Cartesian trees are defined using
It is also possible to characterize the Cartesian tree directly rather than recursively, using its ordering properties. In any tree, the subtree rooted at any node consists of all other nodes that can reach it by repeatedly following parent pointers. The Cartesian tree for a sequence of distinct numbers is defined by the following properties:
- The Cartesian tree for a sequence is a binary tree with one node for each number in the sequence.
- A symmetric (in-order) traversal of the tree results in the original sequence. Equivalently, for each node, the numbers in its left subtree are earlier than it in the sequence, and the numbers in the right subtree are later.
- The tree has the min-heap property: the parent of any non-root node has a smaller value than the node itself.[1]
These two definitions are equivalent: the tree defined recursively as described above is the unique tree that has the properties listed above. If a sequence of numbers contains repetitions, a Cartesian tree can be determined for it by following a consistent tie-breaking rule before applying the above construction. For instance, the first of two equal elements can be treated as the smaller of the two.[2]
History
Cartesian trees were introduced and named by
Efficient construction
A Cartesian tree can be constructed in
An alternative linear-time construction algorithm is based on the all nearest smaller values problem. In the input sequence, define the left neighbor of a value to be the value that occurs prior to , is smaller than , and is closer in position to than any other smaller value. The right neighbor is defined symmetrically. The sequence of left neighbors can be found by an algorithm that maintains a
Another linear-time algorithm for Cartesian tree construction is based on divide-and-conquer. The algorithm recursively constructs the tree on each half of the input, and then merges the two trees. The merger process involves only the nodes on the left and right spines of these trees: the left spine is the path obtained by following left child edges from the root until reaching a node with no left child, and the right spine is defined symmetrically. As with any path in a min-heap, both spines have their values in sorted order, from the smallest value at their root to their largest value at the end of the path. To merge the two trees, apply a merge algorithm to the right spine of the left tree and the left spine of the right tree, replacing these two paths in two trees by a single path that contains the same nodes. In the merged path, the successor in the sorted order of each node from the left tree is placed in its right child, and the successor of each node from the right tree is placed in its left child, the same position that was previously used for its successor in the spine. The left children of nodes from the left tree and right children of nodes from the right tree remain unchanged. The algorithm is parallelizable since on each level of recursion, each of the two sub-problems can be computed in parallel, and the merging operation can be efficiently parallelized as well.[5]
Yet another linear-time algorithm, using a linked list representation of the input sequence, is based on locally maximum linking: the algorithm repeatedly identifies a local maximum element, i.e., one that is larger than both its neighbors (or than its only neighbor, in case it is the first or last element in the list). This element is then removed from the list, and attached as the right child of its left neighbor, or the left child of its right neighbor, depending on which of the two neighbors has a larger value, breaking ties arbitrarily. This process can be implemented in a single left-to-right pass of the input, and it is easy to see that each element can gain at most one left-, and at most one right child, and that the resulting binary tree is a Cartesian tree of the input sequence. [6] [7]
It is possible to maintain the Cartesian tree of a dynamic input, subject to insertions of elements and
Applications
Range searching and lowest common ancestors

Cartesian trees form part of an efficient data structure for
The same range minimization problem may also be given an alternative interpretation in terms of two dimensional range searching. A collection of finitely many points in the
The same construction, of lowest common ancestors in a Cartesian tree, makes it possible to construct a data structure with linear space that allows the distances between pairs of points in any ultrametric space to be queried in constant time per query. The distance within an ultrametric is the same as the minimax path weight in the minimum spanning tree of the metric.[12] From the minimum spanning tree, one can construct a Cartesian tree, the root node of which represents the heaviest edge of the minimum spanning tree. Removing this edge partitions the minimum spanning tree into two subtrees, and Cartesian trees recursively constructed for these two subtrees form the children of the root node of the Cartesian tree. The leaves of the Cartesian tree represent points of the metric space, and the lowest common ancestor of two leaves in the Cartesian tree is the heaviest edge between those two points in the minimum spanning tree, which has weight equal to the distance between the two points. Once the minimum spanning tree has been found and its edge weights sorted, the Cartesian tree can be constructed in linear time.[13]
As a binary search tree
The Cartesian tree of a sorted sequence is just a
This idea was applied by Seidel & Aragon (1996), who suggested the use of random numbers as priorities. The self-balancing binary search tree resulting from this random choice is called a treap, due to its combination of binary search tree and min-heap features. An insertion into a treap can be performed by inserting the new key as a leaf of an existing tree, choosing a priority for it, and then performing tree rotation operations along a path from the node to the root of the tree to repair any violations of the heap property caused by this insertion; a deletion can similarly be performed by a constant amount of change to the tree followed by a sequence of rotations along a single path in the tree.[14] A variation on this data structure called a zip tree uses the same idea of random priorities, but simplifies the random generation of the priorities, and performs insertions and deletions in a different way, by splitting the sequence and its associated Cartesian tree into two subsequences and two trees and then recombining them.[15]
If the priorities of each key are chosen randomly and independently once whenever the key is inserted into the tree, the resulting Cartesian tree will have the same properties as a
In sorting

Levcopoulos & Petersson (1989) describe a sorting algorithm based on Cartesian trees. They describe the algorithm as based on a tree with the maximum at the root,[16] but it can be modified straightforwardly to support a Cartesian tree with the convention that the minimum value is at the root. For consistency, it is this modified version of the algorithm that is described below.
The Levcopoulos–Petersson algorithm can be viewed as a version of
- Construct a Cartesian tree for the input sequence
- Initialize a priority queue, initially containing only the tree root
- While the priority queue is non-empty:
- Find and remove the minimum value in the priority queue
- Add this value to the output sequence
- Add the Cartesian tree children of the removed value to the priority queue
As Levcopoulos and Petersson show, for input sequences that are already nearly sorted, the size of the priority queue will remain small, allowing this method to take advantage of the nearly-sorted input and run more quickly. Specifically, the worst-case running time of this algorithm is , where is the sequence length and is the average, over all values in the sequence, of the number of consecutive pairs of sequence values that bracket the given value (meaning that the given value is between the two sequence values). They also prove a lower bound stating that, for any and (non-constant) , any comparison-based sorting algorithm must use comparisons for some inputs.[16]
In pattern matching
The problem of Cartesian tree matching has been defined as a generalized form of
Notes
- ^ a b In some references, the ordering of the numbers is reversed, so the root node instead holds the maximum value and the tree has the max-heap property rather than the min-heap property.
- ^ a b c Vuillemin (1980).
- ^ a b Gabow, Bentley & Tarjan (1984).
- ^ Berkman, Schieber & Vishkin (1993).
- ^ Shun & Blelloch (2014).
- ^ Kozma & Saranurak (2020).
- ^ Hartmann et al. (2021).
- ^ Bialynicka-Birula & Grossi (2006).
- ^ Gabow, Bentley & Tarjan (1984); Bender & Farach-Colton (2000).
- ^ Harel & Tarjan (1984); Schieber & Vishkin (1988).
- ^ Bender & Farach-Colton (2000).
- ^ Hu (1961); Leclerc (1981)
- ^ Demaine, Landau & Weimann (2009).
- ^ a b c Seidel & Aragon (1996).
- ^ Tarjan, Levy & Timmel (2021).
- ^ a b c Levcopoulos & Petersson (1989).
- ^ Park et al. (2019); Park et al. (2020); Song et al. (2021); Kim & Cho (2021); Nishimoto et al. (2021); Oizumi et al. (2022)
References
- Bender, Michael A.; Farach-Colton, Martin (2000), "The LCA problem revisited", Proceedings of the 4th Latin American Symposium on Theoretical Informatics, Springer-Verlag, Lecture Notes in Computer Science 1776, pp. 88–94
- Berkman, Omer;
- Bialynicka-Birula, Iwona; Grossi, Roberto (2006), "Amortized rigidness in dynamic Cartesian trees", in Durand, Bruno; Thomas, Wolfgang (eds.), STACS 2006, 23rd Annual Symposium on Theoretical Aspects of Computer Science, Marseille, France, February 23-25, 2006, Proceedings, Lecture Notes in Computer Science, vol. 3884, Springer, pp. 80–91, ISBN 978-3-540-32301-3
- ISBN 978-3-642-02926-4
- Fischer, Johannes; Heun, Volker (2006), "Theoretical and Practical Improvements on the RMQ-Problem, with Applications to LCA and LCE", Proceedings of the 17th Annual Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, vol. 4009, Springer-Verlag, pp. 36–48, ISBN 978-3-540-35455-0
- Fischer, Johannes; Heun, Volker (2007), "A New Succinct Representation of RMQ-Information and Improvements in the Enhanced Suffix Array.", Proceedings of the International Symposium on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies, Lecture Notes in Computer Science, vol. 4614, Springer-Verlag, pp. 459–470, ISBN 978-3-540-74449-8
- S2CID 17752833
- Harel, Dov; doi:10.1137/0213024
- Hartmann, Maria; Kozma, László; Sinnamon, Corwin;
- JSTOR 167055
- Kim, Sung-Hwan; Cho, Hwan-Gue (2021), "A compact index for Cartesian tree matching", in Gawrychowski, Pawel; Starikovskaya, Tatiana (eds.), 32nd Annual Symposium on Combinatorial Pattern Matching, CPM 2021, July 5-7, 2021, Wrocław, Poland, LIPIcs, vol. 191, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 18:1–18:19, ISBN 9783959771863
- Kozma, László; Saranurak, Thatchaphol (2020), "Smooth Heaps and a Dual View of Self-Adjusting Data Structures",
- Leclerc, Bruno (1981), "Description combinatoire des ultramétriques", Centre de Mathématique Sociale. École Pratique des Hautes Études. Mathématiques et Sciences Humaines (in French) (73): 5–37, 127, MR 0623034
- Levcopoulos, Christos; Petersson, Ola (1989), "Heapsort - Adapted for Presorted Files", WADS '89: Proceedings of the Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science, vol. 382, London, UK: Springer-Verlag, pp. 499–509, ISBN 978-3-540-51542-5
- Nishimoto, Akio; Fujisato, Noriki; Nakashima, Yuto; Inenaga, Shunsuke (2021), "Position heaps for Cartesian-tree matching on strings and tries", in Lecroq, Thierry; Touzet, Hélène (eds.), String Processing and Information Retrieval - 28th International Symposium, SPIRE 2021, Lille, France, October 4-6, 2021, Proceedings, Lecture Notes in Computer Science, vol. 12944, Springer, pp. 241–254, S2CID 235313506
- Oizumi, Tsubasa; Kai, Takeshi; Mieno, Takuya; Inenaga, Shunsuke; Arimura, Hiroki (2022), "Cartesian Tree Subsequence Matching", in Bannai, Hideo; Holub, Jan (eds.), 33rd Annual Symposium on Combinatorial Pattern Matching, CPM 2022, June 27-29, 2022, Prague, Czech Republic, LIPIcs, vol. 223, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 14:1–14:18, S2CID 246679910
- Park, Sung Gwan; Amir, Amihood; Landau, Gad M.; Park, Kunsoo (2019), "Cartesian tree matching and indexing", in Pisanti, Nadia; Pissis, Solon P. (eds.), 30th Annual Symposium on Combinatorial Pattern Matching, CPM 2019, June 18-20, 2019, Pisa, Italy, LIPIcs, vol. 128, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 16:1–16:14, S2CID 162168587
- Park, Sung Gwan; Bataa, Magsarjav; Amir, Amihood; Landau, Gad M.; Park, Kunsoo (2020), "Finding patterns and periods in Cartesian tree matching", S2CID 225227284
- doi:10.1007/s004539900061 (inactive 1 November 2024))
{{citation}}
: CS1 maint: DOI inactive as of November 2024 (link - Schieber, Baruch; doi:10.1137/0217079
- Shun, Julian; Blelloch, Guy E. (2014), "A Simple Parallel Cartesian Tree Algorithm and its Application to Parallel Suffix Tree Construction", ACM Transactions on Parallel Computing, 1: 1–20, S2CID 1912378
- Song, Siwoo; Gu, Geonmo; Ryu, Cheol; Faro, Simone; Lecroq, Thierry; Park, Kunsoo (2021), "Fast algorithms for single and multiple pattern Cartesian tree matching", S2CID 225142815
- S2CID 49298052
- S2CID 10462194