Binary GCD algorithm

The binary GCD algorithm, also known as Stein's algorithm or the binary Euclidean algorithm,^[1]^[2] is an algorithm that computes the greatest common divisor (GCD) of two nonnegative integers. Stein's algorithm uses simpler arithmetic operations than the conventional Euclidean algorithm; it replaces division with arithmetic shifts, comparisons, and subtraction.

Although the algorithm in its contemporary form was first published by the Israeli physicist and programmer Josef Stein in 1967,^[3] it may have been known by the 2nd century BCE, in ancient China.^[4]

Algorithm

The algorithm finds the GCD of two nonnegative numbers $u$ and $v$ by repeatedly applying these identities:

1. $gcd(u,0)=u$: everything divides zero, and $u$ is the largest number that divides $u$.
2. $gcd(2u,2v)=2\cdot gcd(u,v)$: $2$ is a common divisor.
3. $gcd(u,2v)=gcd(u,v)$ if $u$ is odd: $2$ is then not a common divisor.
4. $gcd(u,v)=gcd(u,v-u)$ if $u,v$ are odd and $u\leq v$.

As GCD is commutative ( $gcd(u,v)=gcd(v,u)$ ), those identities still apply if the operands are swapped: $gcd(0,v)=v$ , $gcd(2u,v)=gcd(u,v)$ if $v$ is odd, etc.
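The identities translate almost line for line into a recursive function. The following is a minimal, unoptimized Rust sketch; the name `gcd_recursive` is chosen here purely for illustration:

```rust
/// Direct, unoptimized transcription of the four identities.
/// `gcd_recursive` is an illustrative name, not part of any library.
pub fn gcd_recursive(u: u64, v: u64) -> u64 {
    match (u, v) {
        // Identity 1: gcd(u, 0) = u, and by symmetry gcd(0, v) = v
        (u, 0) => u,
        (0, v) => v,
        // Identity 2: gcd(2u, 2v) = 2 * gcd(u, v)
        (u, v) if u % 2 == 0 && v % 2 == 0 => 2 * gcd_recursive(u / 2, v / 2),
        // Identity 3: a factor of 2 present in only one operand can be dropped
        (u, v) if u % 2 == 0 => gcd_recursive(u / 2, v),
        (u, v) if v % 2 == 0 => gcd_recursive(u, v / 2),
        // Identity 4: both operands are odd; subtract the smaller from the larger
        (u, v) if u <= v => gcd_recursive(u, v - u),
        (u, v) => gcd_recursive(v, u - v),
    }
}
```

For example, `gcd_recursive(48, 18)` unwinds as 2·gcd(24, 9), then gcd(3, 9) after three applications of identity 3, then gcd(3, 6), gcd(3, 3) and gcd(3, 0) = 3, giving 2·3 = 6.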

Implementation

While the above description of the algorithm is mathematically correct, performant software implementations typically differ from it in a few notable ways:

eschewing trial division by $2$ in favour of a single bitshift and the count trailing zeros primitive; this is functionally equivalent to repeatedly applying identity 3, but much faster;
expressing the algorithm iteratively rather than recursively: the resulting implementation can be laid out to avoid repeated work, invoking identity 2 at the start and maintaining as invariant that both numbers are odd upon entering the loop, which only needs to implement identities 3 and 4;
making the loop's body branch-free except for its exit condition ( $v=0$ ): in the example below, the exchange of $u$ and $v$ (ensuring $u\leq v$ ) compiles down to conditional moves;^[5] hard-to-predict branches can have a large, negative impact on performance.^[6]^[7]
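As a small illustration of the first point, the two helpers below (hypothetical names, not taken from any library) strip every factor of $2$ from a nonzero number, once by repeated halving and once with a single shift. This is only a sketch of the equivalence, not a benchmark:

```rust
/// Removes all factors of 2 by repeated halving
/// (identity 3 applied one bit at a time).
/// Precondition: n != 0, otherwise the loop would never terminate.
pub fn strip_twos_loop(mut n: u64) -> u64 {
    while n % 2 == 0 {
        n /= 2;
    }
    n
}

/// Same result using one count-trailing-zeros operation and one shift.
/// Precondition: n != 0, since 0u64.trailing_zeros() is 64 and
/// shifting a u64 by 64 bits overflows.
pub fn strip_twos_ctz(n: u64) -> u64 {
    n >> n.trailing_zeros()
}
```

Both functions return the odd part of `n`; for instance, 48 = 2⁴·3 is reduced to 3 by either one.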

The following is an implementation of the algorithm in Rust exemplifying those differences, adapted from uutils:

use std::cmp::min;
use std::mem::swap;

pub fn gcd(mut u: u64, mut v: u64) -> u64 {
    // Base cases: gcd(n, 0) = gcd(0, n) = n
    if u == 0 {
        return v;
    } else if v == 0 {
        return u;
    }

    // Using identities 2 and 3:
    // gcd(2ⁱ u, 2ʲ v) = 2ᵏ gcd(u, v) with u, v odd and k = min(i, j)
    // 2ᵏ is the greatest power of two that divides both 2ⁱ u and 2ʲ v
    let i = u.trailing_zeros();  u >>= i;
    let j = v.trailing_zeros();  v >>= j;
    let k = min(i, j);

    loop {
        // u and v are odd at the start of the loop
        debug_assert!(u % 2 == 1, "u = {} should be odd", u);
        debug_assert!(v % 2 == 1, "v = {} should be odd", v);

        // Swap if necessary so u ≤ v
        if u > v {
            swap(&mut u, &mut v);
        }

        // Identity 4: gcd(u, v) = gcd(u, v-u) as u ≤ v and u, v are both odd 
        v -= u;
        // v is now even

        if v == 0 {
            // Identity 1: gcd(u, 0) = u
            // The shift by k is necessary to add back the 2ᵏ factor that was removed before the loop
            return u << k;
        }

        // Identity 3: gcd(u, 2ʲ v) = gcd(u, v) as u is odd
        v >>= v.trailing_zeros();
    }
}

Note: The implementation above accepts unsigned (non-negative) integers; given that $gcd(u,v)=gcd(\pm {}u,\pm {}v)$ , the signed case can be handled as follows:

/// Computes the GCD of two signed 64-bit integers
/// The result is unsigned and not always representable as i64: gcd(i64::MIN, i64::MIN) == 1 << 63
pub fn signed_gcd(u: i64, v: i64) -> u64 {
    gcd(u.unsigned_abs(), v.unsigned_abs())
}

Complexity

Asymptotically, the algorithm requires $O(n)$ steps, where $n$ is the number of bits in the larger of the two numbers, as every two steps reduce at least one of the operands by at least a factor of $2$ . Each step involves only a few arithmetic operations ( $O(1)$ with a small constant); when working with word-sized numbers, each arithmetic operation translates to a single machine operation, so the number of machine operations is on the order of $n$ , i.e. $\log _{2}(max(u,v))$ .
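The linear step bound can be observed by instrumenting the loop. The sketch below assumes nonzero inputs and uses the illustrative names `gcd_with_steps` and `bit_len`; it counts iterations of the subtract-and-shift loop. Every iteration that does not terminate at least halves the product of the two (odd) operands, so the count never exceeds the combined bit length of the inputs:

```rust
/// Counts iterations of the subtract-and-shift loop for nonzero inputs.
/// Illustrative instrumentation only; returns (gcd, iterations).
pub fn gcd_with_steps(mut u: u64, mut v: u64) -> (u64, u32) {
    assert!(u != 0 && v != 0, "this sketch only handles nonzero inputs");
    // Largest power of two dividing both operands
    let k = (u | v).trailing_zeros();
    u >>= u.trailing_zeros();
    v >>= v.trailing_zeros();
    let mut steps = 0;
    loop {
        steps += 1;
        // Ensure u <= v, then apply identity 4
        if u > v {
            std::mem::swap(&mut u, &mut v);
        }
        v -= u;
        if v == 0 {
            return (u << k, steps);
        }
        // Identity 3: v is even and nonzero here
        v >>= v.trailing_zeros();
    }
}

/// Number of significant bits of n (0 for n == 0).
pub fn bit_len(n: u64) -> u32 {
    u64::BITS - n.leading_zeros()
}
```

With these definitions, `gcd_with_steps(a, b).1 <= bit_len(a) + bit_len(b)` holds for all nonzero `a` and `b`.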

For arbitrarily-large numbers, the asymptotic complexity of this algorithm is $O(n^{2})$ ,^[8] as each arithmetic operation (subtract and shift) involves a linear number of machine operations (one per word in the numbers' binary representation). If the numbers can be represented in the machine's memory, i.e. each number's size can be represented by a single machine word, this bound is reduced to $O\left({\frac {n^{2}}{\log _{2}n}}\right)$ .

This is the same as for the Euclidean algorithm, though a more precise analysis by Akhavi and Vallée proved that binary GCD uses about 60% fewer bit operations.^[9]

Extensions

The binary GCD algorithm can be extended in several ways: to output additional information, to deal with arbitrarily-large integers more efficiently, or to compute GCDs in domains other than the integers.

The extended binary GCD algorithm, analogous to the extended Euclidean algorithm, provides the Bézout coefficients in addition to the GCD: integers $a$ and $b$ such that $a\cdot {}u+b\cdot {}v=gcd(u,v)$ .^[10]^[11]^[12]
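One way to obtain such coefficients using only shifts, additions and subtractions is sketched below. This is a hedged illustration in the style of textbook presentations of the extended binary GCD, not a definitive or optimized implementation: the function name is hypothetical, inputs are assumed to be strictly positive, and the coefficients are kept in `i128` to leave headroom for intermediate values.

```rust
/// Sketch of an extended binary GCD for positive inputs.
/// Returns (a, b, g) such that a*x + b*y == g == gcd(x, y).
/// `extended_binary_gcd` is an illustrative name, not a library function.
pub fn extended_binary_gcd(x: u64, y: u64) -> (i128, i128, u64) {
    assert!(x > 0 && y > 0, "this sketch only handles positive inputs");
    // Remove the common power of two; at least one of x, y is now odd.
    let shift = (x | y).trailing_zeros();
    let (x, y) = ((x >> shift) as i128, (y >> shift) as i128);
    let (mut u, mut v) = (x, y);
    // Invariants: a*x + b*y == u and c*x + d*y == v
    let (mut a, mut b, mut c, mut d) = (1i128, 0i128, 0i128, 1i128);
    while u != 0 {
        while u % 2 == 0 {
            u /= 2;
            if a % 2 == 0 && b % 2 == 0 {
                a /= 2;
                b /= 2;
            } else {
                // Adding y to a and subtracting x from b leaves a*x + b*y
                // unchanged and makes both coefficients even.
                a = (a + y) / 2;
                b = (b - x) / 2;
            }
        }
        while v % 2 == 0 {
            v /= 2;
            if c % 2 == 0 && d % 2 == 0 {
                c /= 2;
                d /= 2;
            } else {
                c = (c + y) / 2;
                d = (d - x) / 2;
            }
        }
        // Both u and v are odd here: subtract the smaller from the larger.
        if u >= v {
            u -= v;
            a -= c;
            b -= d;
        } else {
            v -= u;
            c -= a;
            d -= b;
        }
    }
    // u == 0, so v is the GCD of the reduced inputs; restore the power of two.
    (c, d, (v as u64) << shift)
}
```

For instance, `extended_binary_gcd(48, 18)` returns `(-4, 11, 6)`, and indeed −4·48 + 11·18 = 6. Bézout coefficients are not unique, so other algorithms may return a different valid pair.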

In the case of large integers, the best asymptotic complexity is $O(M(n)\log n)$ , with $M(n)$ the cost of $n$ -bit multiplication; this is near-linear and vastly smaller than the binary GCD algorithm's $O(n^{2})$ , though concrete implementations only outperform older algorithms for numbers larger than about 64 kilobits (i.e. greater than 8×10¹⁹²⁶⁵). This is achieved by extending the binary GCD algorithm using ideas from the Schönhage–Strassen algorithm for fast integer multiplication.^[13]

The binary GCD algorithm has also been extended to domains other than natural numbers, such as number fields.^[18]

Historical description

An algorithm for computing the GCD of two numbers was known in ancient China, under the Han dynasty, as a method to reduce fractions:

If possible halve it; otherwise, take the denominator and the numerator, subtract the lesser from the greater, and do that alternately to make them the same. Reduce by the same number.
— Fangtian – Land surveying, The Nine Chapters on the Mathematical Art

The phrase "if possible halve it" is ambiguous:^[4]

if this applies when either of the numbers becomes even, the algorithm is the binary GCD algorithm;
if this only applies when both numbers are even, the algorithm is similar to the Euclidean algorithm.

References

[1] Brent, Richard P. (13–15 September 1999). Twenty years' analysis of the Binary Euclidean Algorithm. 1999 Oxford-Microsoft Symposium in honour of Professor Sir Antony Hoare. Oxford.
[2] arXiv:1303.2772. PRG TR-7-99.
[3] ISSN 0021-9991.
[4] ISBN 978-0-201-89684-8.
[5] Godbolt, Matt. "Compiler Explorer". Retrieved 4 February 2024.
[6] Kapoor, Rajiv (21 February 2009). "Avoiding the Cost of Branch Misprediction". Intel Developer Zone.
[7] Lemire, Daniel (15 October 2019). "Mispredicted branches can multiply your running times".
[8] "GNU MP 6.1.2: Binary GCD".
[9] CiteSeerX 10.1.1.42.7616.
[10] Knuth 1998, p. 646, answer to exercise 39 of section 4.5.2.
[11] ISBN 0-8493-8523-7. Retrieved 9 September 2017.
[12] ISBN 0-387-55640-0.
[13] S2CID 3119374, INRIA Research Report RR-5050.
[14] doi:10.1006/jsco.2000.0422.
[15] doi:10.1007/978-3-540-45077-1_11.
[16] doi:10.1007/978-3-540-24847-7_4.
[17] doi:10.1007/11682462_8.
[18] doi:10.1007/11523468_96.

Further reading

ISBN 978-0-201-89684-8. Covers the extended binary GCD, and a probabilistic analysis of the algorithm.

ISBN 0-387-55640-0. Covers a variety of topics, including the extended binary GCD algorithm which outputs Bézout coefficients, efficient handling of multi-precision integers using a variant of Lehmer's GCD algorithm, and the relationship between GCD and continued fraction expansions of real numbers.

S2CID 27441335. Archived from the original (PS) on 13 May 2011. An analysis of the algorithm in the average case, through the lens of functional analysis: the algorithms' main parameters are cast as a dynamical system, and their average value is related to the invariant measure of the system's transfer operator.

External links

NIST Dictionary of Algorithms and Data Structures: binary GCD algorithm

Cut-the-Knot: Binary Euclid's Algorithm at cut-the-knot

Analysis of the Binary Euclidean Algorithm (1976), a paper by Richard P. Brent, including a variant using left shifts

Retrieved from "https://en.wikipedia.org/w/index.php?title=Binary_GCD_algorithm&oldid=1217222861"
