Constrained least squares

In constrained least squares one solves a linear least squares problem with an additional constraint on the solution.^[1]^[2] That is, the unconstrained equation $\mathbf {X} {\boldsymbol {\beta }}=\mathbf {y}$ must be fit as closely as possible (in the least squares sense) while ensuring that some other property of ${\boldsymbol {\beta }}$ is maintained.
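
As a compact way of stating the problem described above, the fit can be written as an optimization over a feasible set. The symbol $\mathcal {C}$ is introduced here only for illustration and stands for whichever set of admissible ${\boldsymbol {\beta }}$ the chosen constraint defines:

$$\min _{\boldsymbol {\beta }}\ \|\mathbf {X} {\boldsymbol {\beta }}-\mathbf {y} \|_{2}^{2}\quad {\text{subject to}}\quad {\boldsymbol {\beta }}\in {\mathcal {C}}$$

For example, ${\mathcal {C}}=\{{\boldsymbol {\beta }}:{\boldsymbol {\beta }}\geq {\boldsymbol {0}}\}$ would correspond to a non-negativity constraint.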

There are often special-purpose algorithms for solving such problems efficiently. Some examples of constraints are given below:

Equality constrained least squares: the elements of ${\boldsymbol {\beta }}$ must exactly satisfy $\mathbf {L} {\boldsymbol {\beta }}=\mathbf {d}$ (see Ordinary least squares).
Stochastic (linearly) constrained least squares: the elements of ${\boldsymbol {\beta }}$ must satisfy $\mathbf {L} {\boldsymbol {\beta }}=\mathbf {d} +{\boldsymbol {\nu }}$, where ${\boldsymbol {\nu }}$ is a vector of random variables such that $\operatorname {E} ({\boldsymbol {\nu }})=\mathbf {0}$ and $\operatorname {E} ({\boldsymbol {\nu }}{\boldsymbol {\nu }}^{\rm {T}})=\tau ^{2}\mathbf {I}$. This effectively imposes a prior distribution for ${\boldsymbol {\beta }}$ and is therefore equivalent to Bayesian linear regression.^[3]
Regularized least squares: the elements of ${\boldsymbol {\beta }}$ must satisfy $\|\mathbf {L} {\boldsymbol {\beta }}-\mathbf {y} \|\leq \alpha$ (choosing $\alpha$ in proportion to the noise standard deviation of $\mathbf {y}$ prevents over-fitting).
Non-negative least squares (NNLS): The vector ${\boldsymbol {\beta }}$ must satisfy the vector inequality ${\boldsymbol {\beta }}\geq {\boldsymbol {0}}$ defined componentwise—that is, each component must be either positive or zero.
Box-constrained least squares: The vector ${\boldsymbol {\beta }}$ must satisfy the vector inequalities ${\boldsymbol {b}}_{\ell }\leq {\boldsymbol {\beta }}\leq {\boldsymbol {b}}_{u}$ , each of which is defined componentwise.
Integer-constrained least squares: all elements of ${\boldsymbol {\beta }}$ must be integers (instead of real numbers).
Phase-constrained least squares: all elements of ${\boldsymbol {\beta }}$ must be real numbers, or real numbers multiplied by the same complex number of unit modulus (so that all elements share a common phase).
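
Two of the constraint types listed above can be illustrated with a short NumPy sketch. It is not a reference implementation. The function names `equality_constrained_lstsq` and `box_constrained_lstsq`, the example data, and the iteration count are assumptions made for illustration. The equality-constrained case is solved through its Lagrange-multiplier (KKT) linear system, and the box-constrained case (which includes NNLS when the lower bound is zero and the upper bound is infinite) uses plain projected gradient descent, chosen for brevity rather than efficiency.

```python
import numpy as np


def equality_constrained_lstsq(X, y, L, d):
    """Minimize ||X b - y||^2 subject to L b = d via the KKT linear system."""
    n, m = X.shape[1], L.shape[0]
    # Stationarity: X^T X b + L^T lam = X^T y; feasibility: L b = d.
    # (The factor 2 from the gradient is absorbed into the multiplier lam.)
    K = np.block([[X.T @ X, L.T], [L, np.zeros((m, m))]])
    rhs = np.concatenate([X.T @ y, d])
    # Assumes K is nonsingular (L has full row rank, X^T X is positive
    # definite on the null space of L).
    return np.linalg.solve(K, rhs)[:n]


def box_constrained_lstsq(X, y, lower, upper, iters=2000):
    """Minimize ||X b - y||^2 subject to lower <= b <= upper (componentwise)
    by projected gradient descent."""
    # Step 1/sigma_max(X)^2 is the inverse Lipschitz constant of the
    # gradient of 0.5 * ||X b - y||^2.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    b = np.clip(np.zeros(X.shape[1]), lower, upper)
    for _ in range(iters):
        grad = X.T @ (X @ b - y)
        # Gradient step followed by projection onto the box.
        b = np.clip(b - step * grad, lower, upper)
    return b


X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

# Equality constraint b1 + b2 = 1.
beta_eq = equality_constrained_lstsq(
    X, np.array([1.0, 2.0, 2.0]), np.array([[1.0, 1.0]]), np.array([1.0])
)

# Non-negativity as a box constraint with lower bound 0 and no upper bound.
beta_nn = box_constrained_lstsq(X, np.array([1.0, -1.0, 0.5]), 0.0, np.inf)

print(beta_eq)  # approximately [0, 1]
print(beta_nn)  # approximately [0.75, 0]
```

In the second example the unconstrained fit would give a negative second coefficient, so the non-negativity constraint is active and pins that component at zero. Dedicated active-set or interior-point solvers are typically preferable to projected gradient for larger or ill-conditioned problems.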

If the constraint only applies to some of the variables, the mixed problem may be solved using separable least squares by letting $\mathbf {X} =[\mathbf {X} _{1}\ \mathbf {X} _{2}]$ and ${\boldsymbol {\beta }}^{\rm {T}}=[{\boldsymbol {\beta }}_{1}^{\rm {T}}\ {\boldsymbol {\beta }}_{2}^{\rm {T}}]$ represent the unconstrained (1) and constrained (2) components. Then substituting the least-squares solution for ${\boldsymbol {\beta }}_{1}$, i.e.

$${\hat {\boldsymbol {\beta }}}_{1}=\mathbf {X} _{1}^{+}(\mathbf {y} -\mathbf {X} _{2}{\boldsymbol {\beta }}_{2})$$

(where $^{+}$ indicates the Moore–Penrose pseudoinverse) back into the original expression gives (following some rearrangement) an equation that can be solved as a purely constrained problem in ${\boldsymbol {\beta }}_{2}$:

$$\mathbf {P} \mathbf {X} _{2}{\boldsymbol {\beta }}_{2}=\mathbf {P} \mathbf {y} ,$$

where $\mathbf {P} :=\mathbf {I} -\mathbf {X} _{1}\mathbf {X} _{1}^{+}$ is a projection matrix. Following the constrained estimation of ${\hat {\boldsymbol {\beta }}}_{2}$ the vector ${\hat {\boldsymbol {\beta }}}_{1}$ is obtained from the expression above.
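
The separable procedure described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a definitive implementation. The function name `separable_equality_lstsq`, the random test data, and the choice of an equality constraint $\mathbf {L} _{2}{\boldsymbol {\beta }}_{2}=\mathbf {d}$ on the constrained block are all hypothetical. The equality constraint is used only because it allows the reduced problem to be solved with a single linear system; any other constrained solver could be applied to the projected problem instead.

```python
import numpy as np


def separable_equality_lstsq(X1, X2, y, L2, d):
    """Fit y ~ X1 b1 + X2 b2 with b1 free and b2 subject to L2 b2 = d."""
    X1_pinv = np.linalg.pinv(X1)
    # P projects onto the orthogonal complement of the column space of X1.
    P = np.eye(X1.shape[0]) - X1 @ X1_pinv
    A, b = P @ X2, P @ y
    # Purely constrained problem in b2: min ||A b2 - b||^2 s.t. L2 b2 = d,
    # solved here through its KKT system (assumed nonsingular).
    n2, m = X2.shape[1], L2.shape[0]
    K = np.block([[A.T @ A, L2.T], [L2, np.zeros((m, m))]])
    beta2 = np.linalg.solve(K, np.concatenate([A.T @ b, d]))[:n2]
    # Recover the unconstrained block from the least-squares expression.
    beta1 = X1_pinv @ (y - X2 @ beta2)
    return beta1, beta2


rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
y = rng.normal(size=8)
X1, X2 = X[:, :2], X[:, 2:]
L2 = np.array([[1.0, 1.0]])  # constraint: the two entries of b2 sum to 1
d = np.array([1.0])

beta1, beta2 = separable_equality_lstsq(X1, X2, y, L2, d)

# Cross-check against solving the full problem in one KKT system,
# with the constraint padded by zeros for the unconstrained block.
L_full = np.hstack([np.zeros((1, 2)), L2])
K_full = np.block([[X.T @ X, L_full.T], [L_full, np.zeros((1, 1))]])
beta_full = np.linalg.solve(K_full, np.concatenate([X.T @ y, d]))[:4]

print(np.allclose(np.concatenate([beta1, beta2]), beta_full))
```

The cross-check reflects the fact that minimizing first over ${\boldsymbol {\beta }}_{1}$ and then over ${\boldsymbol {\beta }}_{2}$ gives the same minimizer as minimizing jointly, provided the constraint involves only ${\boldsymbol {\beta }}_{2}$.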