# Recursive Least Squares Deduction

To deduce Eqs. (3.73)-(3.75), start by considering the following lemma.

Matrix inversion lemma

For matrices $A$, $B$, $C$ and $D$ of compatible dimensions such that the indicated inversions exist, it holds that

$$(A + BCD)^{-1} = A^{-1} - A^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1}. \tag{B.1}$$

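Proof Right multiply the right-hand side of (B.1) by $A + BCD$ to get

$$
\begin{aligned}
&\left\{A^{-1} - A^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1}\right\}(A + BCD) \\
&\quad = I - A^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}D + A^{-1}BCD - A^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1}BCD \\
&\quad = I + A^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}\left\{\left[DA^{-1}B + C^{-1}\right]CD - D - DA^{-1}BCD\right\} = I.
\end{aligned}
$$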

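Now, left multiply the right-hand side of (B.1) by $A + BCD$ to get

$$
\begin{aligned}
&(A + BCD)\left\{A^{-1} - A^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1}\right\} \\
&\quad = I - B\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1} + BCDA^{-1} - BCDA^{-1}B\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1} \\
&\quad = I + BC\left\{-C^{-1} + \left[DA^{-1}B + C^{-1}\right] - DA^{-1}B\right\}\left[DA^{-1}B + C^{-1}\right]^{-1}DA^{-1} = I.
\end{aligned}
$$

□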
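The lemma can also be checked numerically. The following is a minimal sketch, not part of the original proof: it uses exact rational arithmetic so that both sides of (B.1) can be compared without a tolerance, and it takes the rank-one shape used later in the appendix ($A$ is 2x2, $B$ is a column, $C$ is a scalar, $D$ is a row). The matrix values and the helper names `matmul`, `inv2` and `add` are illustrative assumptions.

```python
from fractions import Fraction as F


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


# Illustrative values (assumptions, chosen only so that every inverse exists).
A = [[F(2), F(1)], [F(1), F(3)]]
B = [[F(1)], [F(2)]]      # 2x1
C = [[F(1, 2)]]           # 1x1
D = [[F(1), F(2)]]        # 1x2

A_inv = inv2(A)

# Left-hand side of (B.1): direct inverse of A + B C D.
lhs = inv2(add(A, matmul(matmul(B, C), D)))

# Right-hand side of (B.1). The bracketed term is 1x1 here,
# so its inverse is an ordinary reciprocal.
inner = matmul(matmul(D, A_inv), B)[0][0] + 1 / C[0][0]
outer = matmul(matmul(A_inv, B), matmul(D, A_inv))   # A^{-1} B D A^{-1}
rhs = add(A_inv, [[e / inner for e in row] for row in outer], sign=-1)

print(lhs == rhs)  # prints True
```

Because the bracketed term reduces to a scalar, the right-hand side needs no matrix inversion beyond $A^{-1}$, which is the property the recursive algorithm exploits.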

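Collect the $k$ available observations in the vector $z = [y(1) \; \cdots \; y(k)]^T$, the regressors in the matrix $\Phi_{LS}$ whose $i$-th row is $\varphi^T(i)$, and the corresponding residuals in the vector $v$. These quantities are related by the matrix regression model written for all the available data

$$z = \Phi_{LS}\vartheta + v. \tag{B.5}$$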

With this notation the least squares functional (3.72) can be written

$$J_{LS}(\vartheta) = \frac{1}{2}\,(z - \Phi_{LS}\vartheta)^T M_{LS}\,(z - \Phi_{LS}\vartheta),$$

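where $M_{LS} \in \mathbb{R}^{k \times k}$ is the diagonal matrix of weights

$$M_{LS} = \begin{bmatrix} \lambda^{k-1} & 0 & \cdots & 0 \\ 0 & \lambda^{k-2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 \end{bmatrix}.$$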

Equating the gradient of $J_{LS}(\vartheta)$ with respect to $\vartheta$ to zero yields the normal equation satisfied by the least squares estimate $\hat{\vartheta}$ of $\vartheta$

$$\Lambda(k)\,\hat{\vartheta}(k) = \Phi_{LS}^T M_{LS}\, z, \tag{B.8}$$
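The gradient step behind the normal equation can be spelled out as below. This is a sketch that assumes $M_{LS}$ is symmetric, which holds for a diagonal weight matrix, and writes $\vartheta$ for the parameter vector and $\Phi_{LS}$ for the regressor matrix.

```latex
\nabla_{\vartheta} J_{LS}(\vartheta)
  = -\Phi_{LS}^T M_{LS}\,(z - \Phi_{LS}\vartheta) = 0
\quad\Longrightarrow\quad
\Phi_{LS}^T M_{LS}\,\Phi_{LS}\,\hat{\vartheta} = \Phi_{LS}^T M_{LS}\, z .
```

The matrix multiplying $\hat{\vartheta}$ on the left is the information matrix of (B.9).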

where $\Lambda(k)$ is the information matrix given $k$ observations, given by

$$\Lambda(k) = \Phi_{LS}^T M_{LS}\, \Phi_{LS}. \tag{B.9}$$

If the experimental conditions are such that the data satisfy a persistency of excitation condition, then the inverse of $\Lambda(k)$ exists, and the least squares estimates are given by

$$\hat{\vartheta}(k) = \Lambda^{-1}(k)\,\Phi_{LS}^T M_{LS}\, z. \tag{B.10}$$

This expression can be given the form

$$\hat{\vartheta}(k) = \Lambda^{-1}(k) \sum_{i=1}^{k} \lambda^{k-i}\,\varphi(i)\,y(i). \tag{B.11}$$

Furthermore, the information matrix satisfies the recursive equation

$$\Lambda(k) = \lambda\Lambda(k-1) + \varphi(k)\varphi^T(k). \tag{B.12}$$
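The recursion (B.12) follows by splitting off the last term of the weighted sum that defines the information matrix. The short derivation below is a sketch that assumes exponential weights $\lambda^{k-i}$, as in (B.11), and writes $\varphi(i)$ for the regressor at time $i$.

```latex
\Lambda(k) = \sum_{i=1}^{k} \lambda^{k-i}\,\varphi(i)\varphi^T(i)
           = \lambda \sum_{i=1}^{k-1} \lambda^{k-1-i}\,\varphi(i)\varphi^T(i)
             + \varphi(k)\varphi^T(k)
           = \lambda\,\Lambda(k-1) + \varphi(k)\varphi^T(k).
```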

Isolate the last term of the summation in (B.11) and use (B.11) with $k$ replaced by $k-1$ to get

$$\hat{\vartheta}(k) = \Lambda^{-1}(k)\left\{\varphi(k)\,y(k) + \lambda\Lambda(k-1)\,\hat{\vartheta}(k-1)\right\} \tag{B.13}$$

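and then express $\Lambda(k-1)$ in terms of $\Lambda(k)$ using (B.12), yielding

$$\hat{\vartheta}(k) = \hat{\vartheta}(k-1) + \Lambda^{-1}(k)\,\varphi(k)\left[y(k) - \varphi^T(k)\,\hat{\vartheta}(k-1)\right]. \tag{B.14}$$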

Together with (B.12), Eq. (B.14) provides a way of recursively updating the estimate of $\vartheta$. The estimate given the data up to time $k$ is obtained by correcting the estimate given the observations up to time $k-1$ with a term given by the product of a gain (the Kalman gain, $\Lambda^{-1}(k)\varphi(k)$) and the a priori prediction error $y(k) - \varphi^T(k)\hat{\vartheta}(k-1)$. These expressions have the drawback of requiring the inverse of the information matrix $\Lambda$ to be computed at each iteration. This can be avoided by propagating the inverse of $\Lambda$ directly in time. Let

$$P(k) = \Lambda^{-1}(k).$$

Matrix $P$ is proportional to the parameter error covariance and, for simplicity, is referred to simply as the “covariance matrix”. From (B.12) it follows that

$$P^{-1}(k) = \lambda P^{-1}(k-1) + \varphi(k)\varphi^T(k).$$

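Using the matrix inversion lemma (B.1) on $\lambda^{-1}\Lambda(k) = \Lambda(k-1) + \varphi(k)\lambda^{-1}\varphi^T(k)$, that is, with $A = \Lambda(k-1)$, $B = \varphi(k)$, $C = \lambda^{-1}$ and $D = \varphi^T(k)$, yields (3.75). Equation (3.73) follows from (B.14) and the definition of $P$. □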
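The equivalence between the batch solution and the two recursive forms can be illustrated with a small exact computation. This is a hedged sketch, not the book's implementation: Eqs. (3.73)-(3.75) are not reproduced in this appendix, so the covariance update below is the form that the stated substitution into (B.1) produces, $P(k) = \lambda^{-1}\left[P(k-1) - P(k-1)\varphi(k)\varphi^T(k)P(k-1) / (\lambda + \varphi^T(k)P(k-1)\varphi(k))\right]$, and may differ in layout from the book's equations. The function names, the two-parameter restriction and the data are all illustrative assumptions.

```python
from fractions import Fraction as F


def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def batch_estimate(phis, ys, lam):
    """Information matrix (B.9) and estimate (B.11) from all data at once."""
    k = len(ys)
    Lam = [[F(0), F(0)], [F(0), F(0)]]
    rhs = [F(0), F(0)]
    for i, (phi, y) in enumerate(zip(phis, ys), start=1):
        w = lam ** (k - i)                      # weight lambda^(k-i)
        for r in range(2):
            rhs[r] += w * phi[r] * y
            for c in range(2):
                Lam[r][c] += w * phi[r] * phi[c]
    return Lam, matvec(inv2(Lam), rhs)


def information_step(Lam_prev, theta_prev, phi, y, lam):
    """Update via (B.12) and (B.14); inverts Lambda(k) at every step."""
    Lam = [[lam * Lam_prev[r][c] + phi[r] * phi[c] for c in range(2)]
           for r in range(2)]
    err = y - sum(p * t for p, t in zip(phi, theta_prev))  # a priori prediction error
    gain = matvec(inv2(Lam), phi)                           # Lambda^{-1}(k) phi(k)
    return Lam, [t + g * err for t, g in zip(theta_prev, gain)]


def covariance_step(P_prev, theta_prev, phi, y, lam):
    """Same update with P = Lambda^{-1} propagated directly (no matrix inverse)."""
    P_phi = matvec(P_prev, phi)                             # P(k-1) phi(k)
    phi_P = [sum(phi[r] * P_prev[r][c] for r in range(2))
             for c in range(2)]                             # phi^T(k) P(k-1)
    denom = lam + sum(p * v for p, v in zip(phi, P_phi))    # scalar
    P = [[(P_prev[r][c] - P_phi[r] * phi_P[c] / denom) / lam for c in range(2)]
         for r in range(2)]
    err = y - sum(p * t for p, t in zip(phi, theta_prev))
    gain = matvec(P, phi)                                   # P(k) phi(k)
    return P, [t + g * err for t, g in zip(theta_prev, gain)]


# Made-up data: regressors phi(i) = [1, i-1], outputs y(i), forgetting factor 9/10.
phis = [[F(1), F(i)] for i in range(4)]
ys = [F(1), F(3), F(4), F(8)]
lam = F(9, 10)

# Start both recursions from the batch solution on the first two samples,
# the earliest point at which Lambda is invertible for this data.
Lam, theta_info = batch_estimate(phis[:2], ys[:2], lam)
P, theta_cov = inv2(Lam), list(theta_info)
for phi, y in zip(phis[2:], ys[2:]):
    Lam, theta_info = information_step(Lam, theta_info, phi, y, lam)
    P, theta_cov = covariance_step(P, theta_cov, phi, y, lam)

Lam_batch, theta_batch = batch_estimate(phis, ys, lam)
print(theta_info == theta_batch, theta_cov == theta_batch, P == inv2(Lam_batch))
```

All three comparisons print `True`: with exact arithmetic the recursive estimates coincide with the batch estimate, and the propagated $P$ equals the inverse of the information matrix. In floating point the agreement would hold only up to rounding, and practical implementations usually initialize $P$ with a large multiple of the identity instead of a batch solution.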

Updated: August 24, 2015 — 12:55 am