Tail call
In
Tail calls can be implemented without adding a new
Not all programming languages require tail-call elimination. However, in
Description
When a function is called, the computer must "remember" the place it was called from, the
For non-recursive function calls, this is usually an
Besides space and execution efficiency, tail-call elimination is important in the functional programming idiom known as continuation-passing style (CPS), which would otherwise quickly run out of stack space.
Syntactic form
A tail call can be located just before the syntactical end of a function:
function foo(data) {
a(data);
return b(data);
}
Here, both a(data)
and b(data)
are calls, but b
is the last thing the procedure executes before returning and is thus in tail position. However, not all tail calls are necessarily located at the syntactical end of a subroutine:
function bar(data) {
if (a(data)) {
return b(data);
}
return c(data);
}
Here, both calls to b
and c
are in tail position. This is because each of them lies in the end of if-branch respectively, even though the first one is not syntactically at the end of bar
's body.
In this code:
function foo1(data) {
return a(data) + 1;
}
function foo2(data) {
var ret = a(data);
return ret;
}
function foo3(data) {
var ret = a(data);
return (ret == 0) ? 1 : ret;
}
the call to a(data)
is in tail position in foo2
, but it is not in tail position either in foo1
or in foo3
, because control must return to the caller to allow it to inspect or modify the return value before returning it.
Example programs
The following program is an example in Scheme:[8]
;; factorial : number -> number
;; to calculate the product of all positive
;; integers less than or equal to n.
(define (factorial n)
(if (= n 0)
1
(* n (factorial (- n 1)))))
This is not written in a tail-recursive style, because the multiplication function ("*") is in the tail position. This can be compared to:
;; factorial : number -> number
;; to calculate the product of all positive
;; integers less than or equal to n.
(define (factorial n)
(fact-iter 1 n))
(define (fact-iter product n)
(if (= n 0)
product
(fact-iter (* product n)
(- n 1))))
This program assumes
call factorial (4) call fact-iter (1 4) call fact-iter (4 3) call fact-iter (12 2) call fact-iter (24 1) return 24 return 24 return 24 return 24 return 24
into the more efficient variant, in terms of both space and time:
call factorial (4) call fact-iter (1 4) replace arguments with (4 3) replace arguments with (12 2) replace arguments with (24 1) return 24 return 24
This reorganization saves space because no state except for the calling function's address needs to be saved, either on the stack or on the heap, and the call stack frame for fact-iter
is reused for the intermediate results storage. This also means that the programmer need not worry about running out of stack or heap space for extremely deep recursions. In typical implementations, the tail-recursive variant will be substantially faster than the other variant, but only by a constant factor.
Some programmers working in functional languages will rewrite recursive code to be tail recursive so they can take advantage of this feature. This often requires addition of an "accumulator" argument (product
in the above example) to the function. In some cases (such as filtering lists) and in some languages, full tail recursion may require a function that was previously purely functional to be written such that it mutates references stored in other variables.[citation needed]
Tail recursion modulo cons
Tail recursion modulo cons is a generalization of tail-recursion optimization introduced by
Example code
% Prolog, tail recursive modulo cons:
partition([], _, [], []).
partition([X|Xs], Pivot, [X|Rest], Bigs) :-
X @< Pivot, !,
partition(Xs, Pivot, Rest, Bigs).
partition([X|Xs], Pivot, Smalls, [X|Rest]) :-
partition(Xs, Pivot, Smalls, Rest).
|
-- In Haskell, guarded recursion:
partition [] _ = ([],[])
partition (x:xs) p
| x < p = (x:a,b)
| otherwise = (a,x:b)
where
(a,b) = partition xs p
|
% Prolog, with explicit unifications:
% non-tail recursive translation:
partition([], _, [], []).
partition(L, Pivot, Smalls, Bigs) :- L=[X|Xs],
( X @< Pivot
-> partition(Xs,Pivot,Rest,Bigs), Smalls=[X|Rest]
; partition(Xs,Pivot,Smalls,Rest), Bigs=[X|Rest]
).
|
% Prolog, with explicit unifications:
% tail-recursive translation:
partition([], _, [], []).
partition(L, Pivot, Smalls, Bigs) :- L=[X|Xs],
( X @< Pivot
-> Smalls=[X|Rest], partition(Xs,Pivot,Rest,Bigs)
; Bigs=[X|Rest], partition(Xs,Pivot,Smalls,Rest)
).
|
Thus in tail-recursive translation such a call is transformed into first creating a new list node and setting its first
field, and then making the tail call with the pointer to the node's rest
field as argument, to be filled recursively. The same effect is achieved when the recursion is guarded under a lazily evaluated data constructor, which is automatically achieved in lazy programming languages like Haskell.
C example
The following fragment defines a recursive function in C that duplicates a linked list (with some equivalent Scheme and Prolog code as comments, for comparison):
typedef struct list {
void *value;
struct list *next;
} list;
list *duplicate(const list *ls)
{
list *head = NULL;
if (ls != NULL) {
list *p = duplicate(ls->next);
head = malloc(sizeof *head);
head->value = ls->value;
head->next = p;
}
return head;
}
|
;; in Scheme,
(define (duplicate ls)
(if (not (null? ls))
(cons (car ls)
(duplicate (cdr ls)))
'()))
|
%% in Prolog,
duplicate([X|Xs],R):-
duplicate(Xs,Ys),
R=[X|Ys].
duplicate([],[]).
|
In this form the function is not tail recursive, because control returns to the caller after the recursive call duplicates the rest of the input list. Even if it were to allocate the head node before duplicating the rest, it would still need to plug in the result of the recursive call into the next
field after the call.[a]
So the function is almost tail recursive. Warren's method pushes the responsibility of filling the next
field into the recursive call itself, which thus becomes tail call.[b] Using sentinel head node to simplify the code,
typedef struct list {
void *value;
struct list *next;
} list;
void duplicate_aux(const list *ls, list *end);
list *duplicate(const list *ls)
{
list head;
duplicate_aux(ls, &head);
return head.next;
}
void duplicate_aux(const list *ls, list *end)
{
if (ls != NULL) {
end->next = malloc(sizeof *end);
end->next->value = ls->value;
duplicate_aux(ls->next, end->next);
} else {
end->next = NULL;
}
}
|
;; in Scheme,
(define (duplicate ls)
(let ((head (list 1)))
(let dup ((ls ls)
(end head))
(cond
((not (null? ls))
(set-cdr! end (list (car ls)))
(dup (cdr ls) (cdr end)))))
(cdr head)))
|
%% in Prolog,
duplicate([X|Xs],R):-
R=[X|Ys],
duplicate(Xs,Ys).
duplicate([],[]).
|
The callee now appends to the end of the growing list, rather than have the caller prepend to the beginning of the returned list. The work is now done on the way forward from the list's start, before the recursive call which then proceeds further, instead of backward from the list's end, after the recursive call has returned its result. It is thus similar to the accumulating parameter technique, turning a recursive computation into an iterative one.
Characteristically for this technique, a parent
The tail-recursive implementation can now be converted into an explicitly iterative implementation, as an accumulating
typedef struct list {
void *value;
struct list *next;
} list;
list *duplicate(const list *ls)
{
list head, *end;
end = &head;
while (ls != NULL)
{
end->next = malloc(sizeof *end);
end->next->value = ls->value;
ls = ls->next;
end = end->next;
}
end->next = NULL;
return head.next;
}
|
;; in Scheme,
(define (duplicate ls)
(let ((head (list 1)))
(do ((end head (cdr end))
(ls ls (cdr ls )))
((null? ls) (cdr head))
(set-cdr! end (list (car ls))))))
|
%% in Prolog,
%% N/A
|
History
In a paper delivered to the
Implementation methods
Tail recursion is important to some
goto &NAME;
[12]However, for language implementations which store function arguments and local variables on a
For example, in the Java virtual machine (JVM), tail-recursive calls can be eliminated (as this reuses the existing call stack), but general tail calls cannot be (as this changes the call stack).[13][14] As a result, functional languages such as Scala that target the JVM can efficiently implement direct tail recursion, but not mutual tail recursion.
The
-foptimize-sibling-calls
option is passed.[15][16][17] Though the given language syntax may not explicitly support it, the compiler can make this optimization whenever it can determine that the return types for the caller and callee are equivalent, and that the argument types passed to both function are either the same, or require the same amount of total storage space on the call stack.[18]Various implementation methods are available.
In assembly
Tail calls are often optimized by
For compilers generating assembly directly, tail-call elimination is easy: it suffices to replace a call opcode with a jump one, after fixing parameters on the stack. From a compiler's perspective, the first example above is initially translated into pseudo-assembly language (in fact, this is valid x86 assembly):
foo:
call B
call A
ret
Tail-call elimination replaces the last two lines with a single jump instruction:
foo:
call B
jmp A
After subroutine A
completes, it will then return directly to the return address of foo
, omitting the unnecessary ret
statement.
Typically, the subroutines being called need to be supplied with
function foo(data1, data2) B(data1) return A(data2)
(where data1
and data2
are parameters) a compiler might translate that as:[c]
foo:
mov reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
push reg ; put data1 on stack where B expects it
call B ; B uses data1
pop ; remove data1 from stack
mov reg,[sp+data2] ; fetch data2 from stack (sp) parameter into a scratch register.
push reg ; put data2 on stack where A expects it
call A ; A uses data2
pop ; remove data2 from stack.
ret
A tail-call optimizer could then change the code to:
foo:
mov reg,[sp+data1] ; fetch data1 from stack (sp) parameter into a scratch register.
push reg ; put data1 on stack where B expects it
call B ; B uses data1
pop ; remove data1 from stack
mov reg,[sp+data2] ; fetch data2 from stack (sp) parameter into a scratch register.
mov [sp+data1],reg ; put data2 where A expects it
jmp A ; A uses data2 and returns immediately to caller.
This code is more efficient both in terms of execution speed and use of stack space.
Through trampolining
Since many
It is possible to implement trampolines using
Using a trampoline for all function calls is rather more expensive than the normal C function call, so at least one Scheme compiler,
Relation to the while statement
Tail recursion can be related to the while statement, an explicit iteration, for instance by transforming
procedure foo(x) if p(x) return bar(x) else return foo(baz(x))
into
procedure foo(x) while true if p(x) return bar(x) else x ← baz(x)
where x may be a tuple involving more than one variable: if so, care must be taken in implementing the
More generally,
procedure foo(x) if p(x) return bar(x) else if q(x) return baz(x) ... else if r(x) return foo(qux(x)) ... else return foo(quux(x))
can be transformed into
procedure foo(x) while true if p(x) return bar(x) else if q(x) return baz(x) ... else if r(x) x ← qux(x) ... else x ← quux(x)
For instance, this Julia program gives a non-tail recursive definition fact
of the factorial:
function factorial(n)
if n == 0
return 1
else
return n * factorial(n - 1)
end
end
Indeed, n * factorial(n - 1)
wraps the call to factorial
. But it can be transformed into a tail-recursive definition by adding an argument a
called an accumulator.[8]
This Julia program gives a tail-recursive definition fact_iter
of the factorial:
function factorial(n::Integer, a::Integer)
if n == 0:
return a
else
return factorial(n - 1, n * a)
end
end
function factorial(n::Integer)
return factorial(n, 1)
end
This Julia program gives an iterative definition fact_iter
of the factorial:
function fact_iter(n::Integer, a::Integer)
while n > 0
a = n * a
n = n - 1
end
return a
end
function factorial(n::Integer)
return fact_iter(n, one(n))
end
Language support
- Clojure – Clojure has
recur
special form.[22] - Common Lisp – Some implementations perform tail-call optimization during compilation if optimizing for speed
- Elixir – Elixir implements tail-call optimization,[23] as do all languages currently targeting the BEAM VM.
- Elm – Yes[24]
- Erlang – Yes
- F# – F# implements TCO by default where possible [25]
- Go – No support[26]
- Haskell – Yes[27]
- but rejected by V8 and SpiderMonkey
- Kotlin – Has
tailrec
modifier for functions[30] - Lua – Tail recursion is required by the language definition[31]
- Objective-C – Compiler optimizes tail calls when -O1 (or higher) option specified but it is easily disturbed by calls added by Automatic Reference Counting (ARC).
- OCaml – Yes
- Perl – Explicit with a variant of the "goto" statement that takes a function name:
goto &NAME;
[32] - PureScript – Yes
- For this reason, it is expected that Python will never implement tail-call optimization.
- R – Yes, tailcall() function introduced in R.4.4.0[35]
- Racket – Yes[36]
- Ruby – Yes, but disabled by default [37]
- Rust – tail-call optimization may be done in limited circumstances, but is not guaranteed[38]
- Scala – Tail-recursive functions are automatically optimized by the compiler. Such functions can also optionally be marked with a
@tailrec
annotation, which makes it a compilation error if the function is not tail recursive[39] - Scheme – Required by the language definition[40][41]
- Swift – In some cases (as of 2014).[42]
- Tcl – Since Tcl 8.6, Tcl has a tailcall command[43]
- Zig – Yes[44]
See also
![](http://upload.wikimedia.org/wikipedia/commons/thumb/9/99/Wiktionary-logo-en-v2.svg/40px-Wiktionary-logo-en-v2.svg.png)
- Course-of-values recursion
- Recursion (computer science)
- Primitive recursive function
- Inline expansion
- Leaf subroutine
- Corecursion
Notes
- ^ Like this:
if (ls != NULL) { head = malloc( sizeof *head); head->value = ls->value; head->next = duplicate( ls->next); }
- ^ Like this:
if (ls != NULL) { head = malloc( sizeof *head); head->value = ls->value; duplicate( ls->next, &(head->next)); }
- ^ The
call
instruction first pushes the current code location onto the stack and then performs an unconditional jump to the code location indicated by the label. Theret
instruction first pops a code location off the stack, then performs an unconditional jump to the retrieved code location.
References
- ISBN 978-1-55860-320-2.
- ^ S2CID 9807843.
- ^ "The LLVM Target-Independent Code Generator — LLVM 7 documentation". llvm.org.
- ^ "recursion - Stack memory usage for tail calls - Theoretical Computer Science". Cstheory.stackexchange.com. 2011-07-29. Retrieved 2013-03-21.
- ^ a b "Revised^6 Report on the Algorithmic Language Scheme". R6rs.org. Retrieved 2013-03-21.
- ^ "Revised^6 Report on the Algorithmic Language Scheme - Rationale". R6rs.org. Retrieved 2013-03-21.
- ^ "Revised^6 Report on the Algorithmic Language Scheme". R6rs.org. Retrieved 2013-03-21.
- ^ ISBN 0-262-01077-1.
- ^ D. H. D. Warren, DAI Research Report 141, University of Edinburgh, 1980.
- ^ Daniel P. Friedman and David S. Wise, Technical Report TR19: Unwinding Structured Recursions into Iterations, Indiana University, Dec. 1974. PDF available here (webarchived copy here).
- S2CID 14069423.
- ^ Contact details. "goto". perldoc.perl.org. Retrieved 2013-03-21.
- ^ "What is difference between tail calls and tail recursion?", Stack Overflow
- ^ "What limitations does the JVM impose on tail-call optimization", Programmers Stack Exchange
- ^ Lattner, Chris. "LLVM Language Reference Manual, section: The LLVM Target-Independent Code Generator, sub: Tail Call Optimization". The LLVM Compiler Infrastructure. The LLVM Project. Retrieved 24 June 2018.
- ^ "Using the GNU Compiler Collection (GCC): Optimize Options". gcc.gnu.org.
- ^ "foptimize-sibling-calls". software.intel.com.
- ^ "Tackling C++ Tail Calls".
- ^ Probst, Mark (20 July 2000). "proper tail recursion for gcc". GCC Project. Retrieved 10 March 2015.
- ^ Samuel Jack, Bouncing on your tail. Functional Fun. April 9, 2008.
- ^ a b Henry Baker, "CONS Should Not CONS Its Arguments, Part II: Cheney on the M.T.A."
- ^ "(recur expr*)". clojure.org.
- ^ "Recursion". elixir-lang.github.com.
- ^ Czaplicki, Evan. "Functional Programming in Elm: Tail-Call Elimination".
- ^ "Tail Calls in F#". msdn. Microsoft.
- ^ "proposal: Go 2: add become statement to support tail calls". github.com.
- ^ "Tail recursion - HaskellWiki". wiki.haskell.org. Retrieved 2019-06-08.
- ^ Beres-Deak, Adam. "Worth watching: Douglas Crockford speaking about the new good parts of JavaScript in 2014". bdadam.com.
- ^ "ECMAScript 6 in WebKit". 13 October 2015.
- ^ "Functions: infix, vararg, tailrec - Kotlin Programming Language". Kotlin.
- ^ "Lua 5.3 Reference Manual". www.lua.org.
- ^ "goto - perldoc.perl.org". perldoc.perl.org.
- ^ "baruchel/tco". GitHub. 29 March 2022.
- ^ Rossum, Guido Van (22 April 2009). "Neopythonic: Tail Recursion Elimination".
- ^ "What's new in R 4.4.0?". www.jumpingrivers.com. 2024-04-25. Retrieved 2024-04-28.
- ^ "The Racket Reference". docs.racket-lang.org.
- ^ "Ruby Tail Call Optimisation". ruby-doc.org.
- ^ "Rust FAQ". prev.rust-lang.org.
- ^ "Scala Standard Library 2.13.0 - scala.annotation.tailrec". www.scala-lang.org. Retrieved 2019-06-20.
- ^ "Revised^5 Report on the Algorithmic Language Scheme". www.schemers.org.
- ^ "Revised^6 Report on the Algorithmic Language Scheme". www.r6rs.org.
- ^ "Does Swift implement tail call optimization?". 2014. Retrieved 13 March 2024.
- ^ "tailcall manual page - Tcl Built-In Commands". www.tcl.tk.
- ^ "Documentation - the Zig Programming Language".