MATH 536 - Numerical Solution of Partial Differential Equations

Transcription

MATH 536 - Numerical Solution of Partial Differential Equations
MATH 536 - Numerical Solution of Partial Differential Equations
Peter Minev
March 31, 2016
ii
Contents
1 Systems of Linear Equations
1.1 Direct Methods – Gaussian Elimination . . . . .
1.1.1 Forward Elimination . . . . . . . . . . . .
1.2 Gaussian Elimination with Pivoting . . . . . . .
1.3 Special Cases . . . . . . . . . . . . . . . . . . . .
1.3.1 Cholesky Decomposition . . . . . . . . . .
1.3.2 Thomas Algorithm . . . . . . . . . . . . .
1.4 Matrix Norms . . . . . . . . . . . . . . . . . . . .
1.5 Error Estimates . . . . . . . . . . . . . . . . . . .
1.6 Iterative Method Preliminaries . . . . . . . . . .
1.7 Iterative Methods . . . . . . . . . . . . . . . . . .
1.7.1 Jacobi’s Method . . . . . . . . . . . . . .
1.7.2 Gauss-Seidel (GS) Iteration . . . . . . . .
1.7.3 Successive Over-relaxation (SOR) Method
1.7.4 Conjugate Gradient Method . . . . . . . .
1.7.5 Steepest Descent Method . . . . . . . . .
1.7.6 Conjugate Gradient Method (CGM) . . .
1.7.7 Preconditioning . . . . . . . . . . . . . . .
2 Solutions to Partial Differential Equations
2.1 Classification of Partial Differential Equations .
2.1.1 First Order linear PDEs . . . . . . . . .
2.1.2 Second Order PDE . . . . . . . . . . . .
2.2 Difference Operators . . . . . . . . . . . . . . .
2.2.1 Poisson Equation . . . . . . . . . . . . .
2.2.2 Neumann Boundary Conditions . . . . .
2.3 Consistency and Convergence . . . . . . . . . .
2.4 Advection Equation . . . . . . . . . . . . . . .
2.5 Von Neumann Stability Analysis . . . . . . . .
2.6 Sufficient Conditions for Convergence . . . . .
2.7 Parabolic PDEs – Heat Equation . . . . . . . .
2.7.1 FC Scheme . . . . . . . . . . . . . . . .
2.7.2 BC scheme . . . . . . . . . . . . . . . .
2.7.3 Crank-Nicolson Scheme . . . . . . . . .
2.7.4 Leapfrog Scheme . . . . . . . . . . . . .
2.7.5 DuFort-Frankel Scheme . . . . . . . . .
2.8 Advection-Diffusion Equation . . . . . . . . . .
2.8.1 FC Scheme . . . . . . . . . . . . . . . .
2.8.2 Upwinding Scheme (FB) . . . . . . . . .
iii
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1
1
1
2
4
4
6
6
8
8
10
10
10
11
11
12
13
14
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
23
23
23
23
24
25
26
26
29
31
33
34
34
36
36
37
38
38
38
40
iv
3 Introduction to Finite Elements
3.1 Weighted Residual Methods . . . . . . .
3.1.1 Sobolev spaces . . . . . . . . . .
3.1.2 Weighted Residual Formulations
3.1.3 Collocation Methods . . . . . . .
3.2 Weak Methods . . . . . . . . . . . . . .
3.3 Finite Element Method (FEM) . . . . .
3.4 Gaussian Quadrature . . . . . . . . . . .
3.5 Error Estimates . . . . . . . . . . . . . .
3.6 Optimal Error Estimates . . . . . . . . .
3.6.1 Other Boundary Conditions . . .
3.7 Transient Problems . . . . . . . . . . . .
3.8 Finite Elements in Two Dimensions . .
3.8.1 Finite Element Basis Functions .
3.8.2 Discrete Galerkin Formulation .
CONTENTS
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
41
41
41
42
43
43
47
48
49
50
52
53
58
59
60
Chapter 1
Systems of Linear Equations
1.1
1.1.1
Direct Methods – Gaussian Elimination
Forward Elimination
(k-th step):
[k−1]
[k]
aij = aij
[k−1]
ℓi,k−1 =
[k−1]
ai,k−1
,
[k−1]
ak−1,k−1
[k−1]
[k]
[k−1] [k−1]
− ak−1,j ℓi,k−1 ,
bi = bi
In n − 1 steps:

[k−1] [k−1]
[1]
0
j = k − 1, . . . , n
k = 2, . . . , n i = k, . . . , n
− bk−1 ℓi,k−1 ,
a11

 0
 .
 .
 .
or
i = k, . . . , n,
[1]
a12
[2]
a22
..
.
0
...
...
..
.
i = k, . . . , n

[1] 
a1n
[2] 
a2n  

.. 

. 
[n]
. . . ann
x1
x2
..
.
xn


 
 
=
 

[1]
b1
[2]
b2
..
.
[n]
bn






[u] [x] = [c]
Operations Count
n−1
X
Where we have used


3
2


(n − j) +2 (n − j) (n − j + 1) = 2n + n − 7n
| {z }
|
{z
}
6
2
6
j=1
divisions
multiply and sum
m
X
j=1
j=
m (m + 1)
,
2
m
X
j2 =
j=1
m (m + 1) (2m + 1)
6
Backward Substitution


n
X
1  [i]
[i]
aij xj 
xi = [i] bi −
aii
j=i+1
1
2
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
Operation Count
n
X
i=1






2
2 (n − i)
+ |{z}
1 
=n
| {z }
Multiplication and Subtraction division
Matrix Representations
Define:

1

0

0
Lk = 
0

.
 ..
0
..
.
A[2] = L1 A[1] ,
0
1
...
. . . −ℓk+1,k
..
.
0
Lemma 1.1.1
................
...
−ℓnk
0


. . . . . . 0

0 . . . 0

1 . . . 0


..

.
0 ... 1
A[1] = A, b[1] = b
b[2] = L1 b[1]
Therefore,
A[n] = U = Ln−1 Ln−2 · · · L1 A
or
−1
−1
A = L−1
1 L2 · · · Ln−1 U = LU
Proposition 1.1.2
−1
L = L−1
1 · · · Ln−1
is a lower triangular matrix with unit diagonal.
Definition A Permutation matrix is a matrix that has exactly one nonzero entry in each row and column, and
this entry is equal to 1. The elementary permutation matrix (EPM) is defined as a matrix produced from the
identity matrix by interchanging exactly one pair k, m of its columns. We denote it as Pkm = Pk .
|{z}
k<m
For example:
P23

1
0
=
0
0
0
0
1
0
0
1
0
0

0
0

0
1
Proposition 1.1.3 If A ∈ Rn×n and Pk,m ∈ Rn×n is EPM, then Pk,m A differs from A by an interchange of
rows k and m.
1.2
Gaussian Elimination with Pivoting
A[n] = U = Ln−1 Pn−1 . . . L1 P1 A
Where U is upper triangular, Lk is defined as before, and Pk = Pkm with m ≥ k. Equivalently,
−1
A = P1 L1−1 P2 L−1
2 · · · Pn−1 Ln−1 U
{z
}
|
L∗
1.2. GAUSSIAN ELIMINATION WITH PIVOTING
3
−1
Lemma 1.2.1 If i ≥ j > k and Pj = Pji then Pj L−1
k Pj is produced form Lk by interchanging the j-th and
i-th entry in the k-th column.
Proof

..
.













1
..
.
ℓjk
..
.
..

.
1
..
ℓik
..
.
.
1
..
.













→












..

.
1
..
.
..
.
ℓik
..
.
0
ℓjk
..
.
1
1
..
.
0
..
.

..












→













.
1
..
.
..
.
ℓik
..
.
1
..
ℓjk
..
.
.
1
..
.













Theorem 1.2.2 Consider P = Pn−1 . . . P1 . Then P is a permutation matrix and P A = P L∗ U = LU , with L
being a lower triangular matrix with unit diagonal.
Proof
−1
−1
L = P P1 L−1
1 P2 L2 · · · Pn−1 Ln−1
= Pn−1 Pn−2 · · · P1 P1 L−1
P L−1 · · · Pn−1 L−1
n−1
| {z } 1 2 2
I
−1
−1
= Pn−1 Pn−2 · · · P2 L−1
1 P2 P3 · · · Pn−1 Pn−1 · · · P3 L2 · · · Pn−1 Ln−1
{z
}
|
=
L−1
1∗ Pn−1
· · · P3 L−1
2 P3
I
−1
· · · Pn−1 Ln−1
−1
−1
−1
= · · · = L−1
1∗ L2∗ · · · Ln−2∗ Ln−1
[k]
Theorem 1.2.3 The pivot entries akk , k = 1, 2, . . . , n − 1 are nonzero if and only if Âk are non-singular for
k = 1, . . . , n − 1, where


a11 . . . a1k
 a21 . . . a2k 


Âk =  . .
. 
 .. . . .. 
ak1 . . . akk
is called the k th main principle sub-matrix.
[k]
Proof (i) Suppose first that all akk 6= 0. Then since Âk = L̂k Ûk it follows that det(Âk ) = det(Ûk ) =
[k]
[k]
a11 . . . akk 6= 0
(ii) Now suppose that det(Âk ) 6= 0. Then, using a induction argument, we have:
[1]
[1]
(a) a11 = Â1 and therefore a11 6= 0.
[1]
[k]
[k+1]
(b) Assume that a11 , . . . , ak 6= 0. Then if A11
is the k + 1-th principle sub matrix of A[k+1] we can
re-write it in the following block decomposed form
#
"
[k+1]
[k+1]
A
A
11
12
A[k+1] =
[k+1]
[k+1] .
A21 A22
[k+1]
It is easy to see that A11
upper triangular i.e.
[1]
[k+1]
= (Lk )11 . . . (L1 )11 A11 . Therefore, det(A11
[k+1]
A11

[1]
a11

 0
=
 ..
 .
0

[1]
. . . a1,k+1

[2]
. . . a2,k+1 
,
..

..
.

.
[k+1]
. . . ak+1,k+1
[1]
[k+1]
) = det(A11 ) 6= 0. But A11
is
4
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
[k+1]
[1]
[k+1]
[k+1]
and then det(A11 ) = a11 . . . ak+1,k+1 . This, together with the induction hypothesis yields that ak+1,k+1 6= 0
which completes the induction proof.
Definition A matrix is strictly diagonally dominant if
X
|aii | >
|aij |
j6=i
Corollary 1.2.4 If a matrix is strictly diagonally dominant then no pivoting is necessary.
Definition Matrix A is symmetric positive definite (spd) if
1. A = AT
2. v T Av ≥ 0 for all v ∈ Rn
3. v T Av = 0 if and only if v ≡ 0
Corollary 1.2.5 If a matrix is spd then no pivoting is necessary.
Definition The spectrum of a matrix A, S (A), is the set of all eigenvalues of the matrix.
Theorem 1.2.6 If A ∈ Rn×n is symmetric the following statements are equivalent
1. A is positive definite,
2. S (A) contains only positive real numbers, and
3. Every principle sub-matrix is positive definite.
Theorem 1.2.7 (Gershgorin) The spectrum S (A) is enclosed in
!
!
n
n
\ [
[
′′
′
Di
Di
i=1
i=1
where




X
Di′ = z ∈ C : |z − aii | ≤
|aij |


j6=i




X
Di′′ = z ∈ C : |z − aii | ≤
|aji |


j6=i
1.3
1.3.1
Special Cases
Cholesky Decomposition
Theorem 1.3.1 If A ∈ Rn×n is spd then there exists a lower triangular matrix C such that CC T = A.
Furthermore, the diagonal entries of C are positive.
Proof Use induction on n:
n=1
:
C = [c11 ] =
Assume the theorem holds for all matrices in Rn×n .
A
A = Tn
a
a
an+1,n+1
√
a11
1.3. SPECIAL CASES
5
where an+1,n+1 > 0. By the induction hypothesis:
An = Cn CnT
Now try to find [X] such that
Cn
CC =
XT
T
0
c
X
A
= Tn
c
a
CnT
0
a
an+1,n+1
or such that
Cn X = a
⇒
X = Cn−1 a
and
X T X + c2 = an+1,n+1
But, is c > 0?
c=
q
an+1,n1 − X T X
T
T
X T X = Cn−1 a Cn−1 a = aT Cn−1 Cn−1 a = aT A−1
n a
therefore
Then define x :=
⇒
h
A−1
n a
T
, −1
iT
T
an+1,n+1 − aT A−1
n a = an+1,n+1 − X X
6= 0 so that
xT Ax = an+1,n+1 − aT A−1
n a> 0
Cholesky Algorithm
√
1. c11 ← a11
2. For i = 2, 3, . . . , n
3. ci1 ← ai1 /c11
4. End i
5. For j = 2, 3, . . . , n − 1
6.
cjj
v
u
j−1
u
X
← tajj −
c2jk
k=1
7. For i = j + 1, . . . , n
8.
1
cij ←
cjj
aij −
j−1
X
cik cjk
k=1
9. End i
10. End j
11.
cnn
v
u
n−1
X
u
c2nk
← tann −
k=1
12. End
!
6
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
1.3.2
Thomas Algorithm
Definition A band matrix is of the form

a11 a1,2
...
a1,l
0
..
..
..

.
.
.
 a21

 ..
..
..
..
 .
.
.
.

ak,1 . . . ak,k−1 akk ak,k+1


..
..
..
 0
.
.
.

 ..
..
..
..
 .
.
.
.

 .
.
.
..
..
 ..

 .
..
..
 ..
.
.

 .
..
 ..
.
0
..........................
...........................
..
.
..
..
.
.
. . . ak,k+l−1
0
...
..
..
..
.
.
.
..
..
.
..
.
..
..
..
.
.
..
.
.
..
.
..
.
..
.
..
..
.
.
. . . an,n−1
.
0
an,n−k+1
0
..
.
..
.
0
..
.














0


an−l+1,n 


..

.


an−1,n 
an,n
A band matrix may be stored compactly as an l + k − 1 × n matrix of the form:


k−1
}|
{
z
 0
...
0
a11
a12 . . . a1l 


. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 


 ak1 . . . ak,k−1
akk ak,k+1 . . . ak,k+l−1 


. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 


an,n−k+1 . . . an,n−1 ann
.{z
..
0} 
|0
l−1
Thomas Algorithm
For the linear system

b1
a2

0

 ..
.

 ..
.

.
 ..
0
1.
c1
b2
a3
..
.
0
c2
b3
..
.
................
0 ...........
c3
0
...
..
..
.
.
..
..
.
..
..
.
.
............
..
.
an−1
0
.
bn−1
an
2.
cj−1
[f ]
bj−1


   
 x1
d1

  x2   d2 
   
  ..  =  .. 
 .   . 
0 
 xn
dn

cn−1 
bn
[f ]
[f ]
b j = b j − aj
[f ]
[f ]
and dj = dj − aj
[f ]
[f ]
xj−1 = dj−1 − bj xj ,
1.4
0
0
0
..
.
for
dj−1
[f ]
bj−1
,
for j = 2, . . . , n
j = n, . . . , 2
Matrix Norms
Definition If A ∈ Rn×n then the mapping Rn×n → R is called a norm of A := ||A|| if and only if:
1. ||A|| ≥ 0 and ||A|| = 0 if and only if A = 0,
1.4. MATRIX NORMS
7
2. ||λA|| = |λ| ||A|| for all λ ∈ R,
3. ||A + B|| ≤ ||A|| + ||B||.
If in adddition, ||AB|| ≤ ||A|| ||B|| the norm is called multiplicative. In the sequel, we make use of multiplicative
norms only, and therefore, the notion of a norm presumes a multiplicative norm.
Definition If A ∈ Rn×n , then for a vector norm ||·|| : Rn → R,
||Ax||
||x||6=0 ||x||
||A|| := sup
is called a subordinate matrix norm.
Theorem 1.4.1 Let A ∈ Rn×n . Then,
1.
||A||∞ :=
n
X
||Ax||∞
|aij |
= max
1≤i≤n
||x||∞ 6=0 ||x||∞
j=1
sup
2.
n
X
||Ax||1
|aij |
= max
1≤j≤n
||x||1 6=0 ||x||1
i=1
||A||1 := sup
3.
||Ax||2
=
||x||2 6=0 ||x||2
||A||2 := sup
Proof
||Ax||∞
therefore,
q
ρ (AT A)


X
n
X
n


|aij |
max |xj | = M ||x||∞
aij xj ≤ max
= max 1≤j≤n
1≤i≤n
1≤i≤n j=1
j=1
||A||∞ ≤ M
Let i be the index for which
max
1≤i≤n
n
X
j=1
|aij | = M
n
is achieved. Choose x = {xj }j=1
xj :=
aij / |aij | : aij =
6 0
0
: aij = 0
so that ||x||∞ = 1 and ||Ax||∞ = M . Therefore,
||A||∞ = M
8
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
1.5
Error Estimates
Consider two vectors x1 and x2 where
Ax1 = b and Ax2 = b + r
then
x1 = A−1 b and x2 = A−1 (b + r)
and it follows immediately that
Then,
Therefore,
||x2 − x1 || = A−1 r
−1 −1 A r A ||r||
||x2 − x1 ||
||x2 − x1 ||
≤
=
≤
||A|| ||x1 ||
||Ax1 ||
||b||
||b||
||r||
||x2 − x1 ||
≤ ||A|| A−1 ||x1 ||
|
{z
} ||b||
c(A)
This provides means of estimating the error introduced in the solution of a linear system due to roundoff errors
or uncertainty in the right-hand-side or the matrix of the system.
1.6
Iterative Method Preliminaries
∞
Definition If xk ∈ Rn then the sequence {xk }k=1 converges to x ∈ Rn in a norm ||·|| i.e.
lim xk = x
k→∞
if
lim ||x − xk || = 0.
k→∞
∞
Definition If Ak ∈ Rn×n then the sequence {Ak }k=1 converges to A ∈ Rn×n in a norm ||·|| i.e.
lim Ak = A
k→∞
if
lim ||A − Ak || = 0.
k→∞
Consider the matrix power series
f (A) ∼
∞
X
ak Ak
(1.1)
k=0
It is convergent if and only if
lim
K→∞
Consider also the numerical power series
K
X
ak Ak = f (A)
k=0
f (λ) ∼
∞
X
k=0
ak λk
(1.2)
1.6. ITERATIVE METHOD PRELIMINARIES
9
Theorem 1.6.1 The matrix power series (1.1) is convergent if for all λi ∈ S (A) |λi | < ρ, where ρ is the radius
of convergence of (1.2) and S (A) is the spectrum of A. It is divergent if |λi | > ρ for some i.
Proof From Jordan’s theorem, there exists nonsingular C such that:

λk
0

 ..
B = C −1 AC, B = diag (Bk ) , Bk = 
 .
 .
 ..
0
1
λk
..
.
0
1
..
.
...
...
..
.
..
..
.
.......
.
0

0
0





1
λk
where Bk ∈ Rnk ×nk and nk is the multiplicity of the eigenvalue λk . Then, f (A) is convergent if and only if
f (B) =
∞
X
k
ak B =
k=0
∞
X
ak C
−1
k
A C=C
−1
k=0
∞
X
k=0
k
ak A
!
C
is convergent as well. Now, let’s look at Bki :

 2
λk 2λk
1
... 0
0
λ2k 2λk . . . 0 

Bk2 = 
 . . . . . . . . . . . . . . . . . . . . . . . ,
0 . . . . . . . . . . . . . . λ2k
i−1
 i

i
λk
. . . . . . . . . . . . nki−1 λki−nk +1
1 λk
i−1
i
0
λik
. . . nki−2 λki−nk +2 

1 λk
Bki = 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
0 ......................
λik
so that the i-th block of the m-th partial sum of f (B) is


(n −1)
1
1 ′
fm (λi ) . . . (ni −1)!
fm i
(λi )
fm (λi ) 1!
m


X
(n −2)
1

fm i
(λi )
fm (λi ) . . . (ni −2)!
fm (Bi ) =
ak Bik =  0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
k=0
0
...
0
fm (λi )
where
f (B) =
m
X
ak B k
k=0
Therefore, fm (B) is convergent if and only if
fm (λi ) =
m
X
ak λki
k=0
(l)
(and therefore fm (λi )) is convergent for i = 1, . . . , n and l = 1, . . . , ni . fm (λi ) is convergent if |λi | is smaller
than ρ, the radius of convergence of (1.2).
Corollary 1.6.2
f (A) = I + A + A2 + · · · + Am + · · ·
is convergent if and only if |λk | < 1 for all k = 1, . . . , n and A ∈ Rn×n .
Corollary 1.6.3 Since |λk | ≤ ||A|| for any subordinate norm ||·|| then f (A) is convergent if ||A|| < 1.
10
1.7
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
Iterative Methods
Consider
Ax = b
If A = B + C then
Ax = b
x = x + C −1 (b − Ax)
⇔
and in this form we may solve the problem iteratively:
xk+1 = xk + C −1 b − Axk
(1.3)
Theorem 1.7.1 The iteration (1.3) is convergent if and only if ρ I − C −1 A < 1.
Proof
xk+1 = I − C −1 A xk + C −1 b = D2 xk−1 + (I + D) C −1 b = . . .
|
{z
}
=D
D
k+1 0
x + I + D + D2 + . . . + Dk C −1 b
If ρ I − C −1 A < 1 then I + D + . . . + Dk + . . . is convergent and therefore
Dk →k→∞ 0
Definition R (D) = − log10 ρ (D) is called the convergence rate of the iteration.
1.7.1
Jacobi’s Method
C = diag (A)


12
13
0 − aa11
− aa11
· · · − aa1n
11

.. 
..
D = I − C −1 A =  ...
.
. 
an1
an2
an3
− ann − ann − ann · · · 0
Theorem 1.7.2 If A is strictly diagonally dominant then Jacobi is convergent.
Proof
||D||∞
Then
X aij < 1.
= max
aii i
j6=i
|λi | ≤ ||D||∞ < 1
1.7.2
.
Gauss-Seidel (GS) Iteration

a11
0


C = L =  ... . . .

a1n . . . ann
B = U =A−L

X k+1 = X k + C −1 b − AX k
Theorem 1.7.3 If A is strictly diagonally dominant then Gauss-Seidel iteration is convergent.
1.7. ITERATIVE METHODS
11
Proof The formula for the i-th component of xk+1 is:
xk+1
=−
i
X aij
j<i
aii
xk+1
−
j
X aij
j>i
aii
xkj +
bi
aii
The exact solution clearly satisfies the iteration i.e.
xi = −
X aij
j<i
aii
xj −
X aij
j>i
aii
xj +
bi
aii
The error ǫk+1 = xk+1 − x for Ax = B is found as
X aij
X aij
ǫk+1
=−
ǫk+1
−
ǫkj
i
j
a
a
ii
ii
j<i
j>i
then
Using induction on i we find that
k+1 X aij k+1 X aij k ǫj +
ǫ
≤
ǫ
i
aii aii j
j>i
j<i
Therefore,


X aij k+1 ≤
ǫ
 ǫk < ǫk i
aii ∞
∞
j6=i
k+1 ǫ
i
∞
< ǫk ∞
so that the error must strictly decrease with each iteration. Then the error must go to zero because the exact
solution is the unique fixed point of the iteration (why is it unique?).
1.7.3
Successive Over-relaxation (SOR) Method
1
D
ω
A = L+D+U
D = diag (A)
C = L+
xk+1 = − ω −1 D + L
−1 −1
b.
1 − ω −1 D + U xk + ω −1 D + L
Note that in this case L has zeros on the main diagonal and is strictly lower triangular.
Theorem 1.7.4 (Ostrowski-Reich) If A ∈ Rn×n is spd then SOR is convergent iff 0 < ω < 2.
1.7.4
Conjugate Gradient Method
Theorem 1.7.5 If A ∈ Rn×n is spd then u is a solution to the linear system Au = b if and only if u minimizes
the function:
1
F (v) = v T Av − bT v
2
Proof Let u∗ satisfy Au∗ = b then
1
1 T
u Au − bT u − u∗ T Au∗ + bT u∗
2
2
1
1
1
T
= uT Au − u∗ T Au + u∗ T Au∗ = (u − u∗ ) A (u − u∗ ) ≥ 0
2
2
2
F (u) − F (u∗ ) =
12
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
then
1
(u − u∗ )T A (u − u∗ ) = 0
2
so that F (u) ≥ F (u∗ ) and
F (u) = F (u∗ )
⇔
⇔
u = u∗
u = u∗
Definition If A is spd then for any two vectors u, v ∈ Rn , uT Av defines the energy inner product induced by
A denoted by:
hu, viA = uT Av.
Note that in the rest of the notes at some occasions (u, v)A is also used to denote the energy inner product.
1.7.5
Steepest Descent Method
Note that any iteration for finding a solution to a linear system Au = b can be written in the form:
uk+1 = uk + αk pk
where αk ∈ R is called iteration step, pk ∈ Rn is called the search direction. For example, in the case of matrix
splitting methods the search direction is chosen to be pk = C −1 rk , with rk = b − Auk being called the residual
of the iterate uk , and αk = 1. The step does not need to be constant and the gradient-type methods choose it at
each iteration step so that this choice minimizes the function F (v). The basic gradient-type iterative algorithm
therefore can be written as:
1. Choose an initial search direction
p0 = r0 = b − Au0
2. For k = 1, 2, 3, . . . do:
(a) Find αk ∈ R such that
uk+1 = uk + αk pk
minimizes F over the line uk + αpk
(b) New iterate
uk+1 = uk + αk pk
(c) The new residual is
rk+1 = b − Auk+1
(d) Find the new search direction pk+1 .
(e) Let k = k + 1 and repeat item 2 until convergence.
To find αk we minimize F along the search direction pk :
d
d
F uk + αk pk = ∇F uk + αk pk ·
uk + αk pk
dαk
dαk
T
T
k
= A u + αk pk − b pk = Auk − b pk + αk pTk Apk
= −rkT pk + αk pTk Apk = 0
so that
αk =
rkT pk
pTk Apk
1.7. ITERATIVE METHODS
13
Note that for this choice of αk we have:

T
T


k
Auk+1 − b pk = 
A u + αk pk −b pk = −rkT pk + αk pTk Apk = 0
| {z }
{z
}
|
−rk+1
uk+1
or
T
−rk+1
pk = ∇F (uk+1 ) pk = 0
In the steepest descent method we choose
pk = rk = b − Auk ,
because this is the direction of the negative gradient and in this direction the function F (v) is clearly nonincreasing. Unfortunately, this choice of search direction is not optimal. The basic reason for this is that the
step αk is chosen so that the iterate uk+1 is a minimum of F (v) along the direction pk . However, the next
minimum, along the direction pk+1 , is not guaranteed to be smaller than or equal to the current one. This issue
is discussed in the next section.
1.7.6
Conjugate Gradient Method (CGM)
Definition uk is optimal with respect to direction p 6= 0 iff F uk ≤ F uk + λp for all λ ∈ R.
Lemma 1.7.6 uk is optimal with respect to p iff
pT rk = 0.
Proof uk is optimal with respect to p iff F uk + λp has a minimum at λ = 0, i.e.
∂F k
= pT Auk − b + λpT Apλ=0 = 0
u + λp ∂λ
| {z }
λ=0
−rk
This is obviously satisfied iff pT rk = 0.
We know that the choice of αk guarantees that uk+1 is optimal with respect to pk and this applies in particular
to the steepest descent method for which pk = −rk . However, uk+2 may not be optimal with respect to pk i.e.
the next iteration can undo some of the minimization work of the previous iteration. Indeed:
uk+2 = uk+1 + αk+1 rk+1
which gives
rk+2 = rk+1 − αk+1 Ark+1 ⇒ rkT rk+2 = rkT rk+1 −αk+1 rkT Ark+1
| {z }
⇒
pTk rk+2 = −αk+1 rkT Ark+1 6= 0
=0
unless A = cI. Note that rkT rk+1 = 0 for the steepest descent method since
rkT rk+1 = rkT (b − A(uk + αk rk )) = rkT (rk − αk Ark ) = rkT (rk −
rkT rk
Ark ) = 0.
||rk ||2A
To find a different search direction pk+1 that maintains the optimality of uk+2 w.r.t. pk , we multiply by pk the
equation for the residual:
rk+2 = rk+1 − αk+1 Apk+1
14
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
and require the product to be zero:
0 = pTk rk+2 = pTk rk+1 +αk+1 pTk Apk+1 ⇔ pTk Apk+1 = 0.
| {z }
=0
For the conjugate gradient method we require that uk+2 is also optimal with respect to pk i.e. we choose
the search direction pk+1 from the condition pTk Apk+1 = 0.
Definition If u, v ∈ Rn are such that hu, viA = 0 for A (which is spd) then u, v are A-conjugate.
This means that pk+1 should be A-conjugate to pk . We search for pk+1 in the form pk+1 = rk+1 + βk pk and
hrk+1 , pk iA
then from the condition pTk Apk+1 = 0 we easily obtain that βk = −
2
||pk ||A
Algorithm (Basic CGM procedure) If A ∈ Rn×n is spd and u0 is an initial guess:
1. r0 ← b − Au0
2. p0 = r0
3. For k = 1, 2, . . . , m:
2
T
4. αk−1 ← rk−1
pk−1 / ||pk−1 ||A
5. uk ← uk−1 + αk−1 pk−1
6. rk ← rk−1 − αk−1 Apk−1
7.
pk ← rk −
hrk , pk−1 iA
2
||pk−1 ||A
pk−1
8. Next k
9. End
As it will be demonstrated below, if all arithmetic operations are exact, the algorithm computes the exact
solution in a finite number of steps (finite termination property).
Later, we will show that the error decreases as:
"p
#k
k c
(A)
−
1
k
u − u = ǫ ≤ 2 ||ǫ0 ||
A p
A
A
c (A) + 1
If c (A) ≫ 1 then the convergence is slow. Therefore, the algorithm is often modified by formally multiplying
the original system by a matrix that has a spectrum close to the spectrum of A and performing the CG method
on the modified system which has the same solution as the original one. This process is called preconditioning.
1.7.7
Preconditioning
Instead of Au = b we solve Ãũ = b̃ where
à = B −1
T
AB −1 ,
ũ = Bu,
b̃ = B −1
T
b
Since the preconditioner B T B must be an approximation to the matrix A, a natural choice for B should be an
approximation to the square root of A. Since A is s.p.d. its square root is also s.p.d. and therefore we assume
that B T = B. The preconditioner B 2 must satisfy somewhat contradicting requirements since on one hand it
1.7. ITERATIVE METHODS
15
should be a good approximation of A and on the other, it should be much easier to solve a linear system with
a matrix B 2 than with A. The second requirement follows from the fact that, as it will become clear below, on
each iteration of the CG method with preconditioning, we will need to solve one system with B 2 .
For the system Au = b we know that
u : F (u) = minn F (v)
v∈R
Now, if ṽ = Bv then
1 T
1 −1 T −1 B ṽ A B ṽ − bT B −1 ṽ
v Av − bT v =
2
2
1 T −1
−1
= ṽ |B {z
AB } ṽ − B −1 b ṽ = F̃ (ṽ)
2
| {z }
F (v) =
Ã
b̃
The preconditioned method should compute the new iterate from: ũk = ũk−1 + α̃k−1 p̃k−1 , with α̃k−1 =
T
r̃k−1
p̃k−1
||p̃k−1 ||Ã .
The search direction for the preconditioned method should be computed as p̃k = β̃k−1 p̃k−1 +r̃k where the residual
hr̃
,p̃k iÃ
. These formulae are not convenient for practical
r̃k is given by r̃k = r̃k−1 − α̃k−1 Ãp̃k−1 , and β̃k = − k+1
||p̃ ||2
k
Ã
computations since they require to somehow compute the inverse of the preconditioner, (B 2 )−1 . Fortunately,
it appears that the preconditioned method can be implemented almost exactly as the original CGM modifying
only the computation of the search direction to:

T
zk
z }| {
 2 −1 
rk  Apk−1
 B
−1
rk −
pk = B 2
pk−1 ,
(1.4)
pTk−1 Apk−1
| {z }
zk
and defining the initial iterate as u0 = B −1 ũ0 and the initial search direction as p0 = (B 2 )−1 r0 . With this
modification, the residuals rk , r̃k , search directions pk , p̃k , and iterates uk , ũk , of the modified CG algorithm
and the preconditioned CG algorithm, verify the relations: rk = Br̃k , pk = B −1 p̃k , uk = B −1 ũk . Indeed, using
the assumption that rk−1 = Br̃k−1 , pk−1 = B −1 p̃k−1 , and uk−1 = B −1 ũk−1 (these assumptions are trivially
verifiable at k = 1), we obtain that:
α̃k−1 =
T
r̃k−1
p̃k−1
2
||p̃k−1 ||Ã
=
T
rk−1
B −1 Bpk−1
T
(Bpk−1 ) B −1 AB −1 Bpk−1
=
T
rk−1
pk−1
2
||pk−1 ||A
= αk−1 ,
and subsequently that: rk = Br̃k and uk = B −1 ũk . For example,
ũk = ũk−1 + α̃k−1 p̃k−1 = B(uk−1 + αk−1 pk−1 ) = Buk .
This immediately yields that rk = Br̃k . Then, if pk is computed from (1.4) we obtain, multiplying it by B,
that:
rT B −1 B −1 AB −1 Bpk−1
Bpk−1 ,
Bpk = B −1 rk − kT
pk−1 BB −1 AB −1 Bpk−1
or, using the induction assumptions:
Bpk = r̃k −
hr̃k , p̃k−1 iÃ
2
||p̃k−1 ||Ã
p̃k−1 = p̃k
Thus, using an induction argument we establish that rk = Br̃k , pk = B −1 p̃k , uk = B −1 ũk , and therefore,
α̃k = αk , for all positive integer k, Then, it is clear that the modified CG algorithm does not require the
knowledge of à to proceed. It needs only one extra step for the computation of the new search direction.
Instead of using rk for its computation, we need to use zk that is a solution of the system B 2 zk = rk . The
conjugate gradient method with preconditioning is given by:
16
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
Algorithm Preconditioned CGM:
1. k ← 0
2. r0 ← b − Au0
3. p0 = (B 2 )−1 r0
4. For k = 1, 2, . . . , m:
2
T
5. αk−1 ← rk−1
pk−1 / ||pk−1 ||A
6. uk ← uk−1 + αk−1 pk−1
7. rk ← rk−1 − αk−1 Apk−1
8. if ||rk ||2 ≥ τ then
9. solve B 2 zk = rk
10.
pk ← z k −
hzk , pk−1 iA
2
||pk−1 ||A
pk−1
11. k = k + 1
12. Go to (5)
13. End if
14. End
Lemma 1.7.7 If A ∈ Rn×n is spd, then for m = 0, 1, 2, . . . we have
span {p0 , p1 , . . . , pm } = span {r0 , r1 , . . . , rm } = span {r0 , Ar0 , . . . , Am r0 }
Proof
1. For m = 0, this is trivial.
2. Suppose that for m = k we have
span {p0 , p1 , . . . , pk } = span {r0 , r1 , . . . , rk } = span r0 , Ar0 , . . . , Ak r0 .
3. To complete the proof we should show that the same is true for m = k + 1 i.e.
span {p0 , p1 , . . . , pk+1 } = span {r0 , r1 , . . . , rk+1 } = span r0 , Ar0 , . . . , Ak+1 r0 .
We have shown that
this implies
so that
rk+1 = rk − αk Apk
pk ∈ span r0 , Ar0 , . . . , Ak r0
Apk ∈ span Ar0 , A2 r0 , . . . , Ak+1 r0
rk+1 ∈ span r0 , Ar0 , . . . , Ak+1 r0
span {r0 , r1 , . . . , rk+1 } ⊂ span r0 , Ar0 , . . . , Ak+1 r0
1.7. ITERATIVE METHODS
17
Now,
Ak r0 ∈ span {p0 , p1 , . . . , pk }
k+1
A
So that
Ak+1 r ∈ span
Therefore



r0 ∈ span {Ap0 , Ap1 , . . . , Apk }
rk+1 = rk − αk Apk



Ap0 , Ap1 , . . . , Apk−1 , rk+1 − rk ⊆ span {r0 , r1 , . . . rk+1 }
| {z }


 |{z} |{z}

(r1 −r0 ) (r2 −r1 )
c(rk −rk−1 )
Ak+1 r0 ∈ span {r0 , r1 , . . . , rk+1 }
and by induction
span r0 , Ar0 , . . . , Ak+1 r0 ⊂ span {r0 , . . . rk+1 }
.
span {p0 , p1 , . . . , pm } = span {r0 , . . . , rm }
is proven using
pm = rm + βm−1 pm−1
and the same arguments as above.
Definition Km := span {r0 , r1 , . . . , rm−1 } is the m-th Krylov space of A.
Theorem 1.7.8 If A is spd then:
(A) hpk , pm iA = 0 for m 6= k and
(B) rkT rm = 0 for m 6= k.
Proof
1. For k, m ≤ 1:
r1T r0 = 0
hp1 , p0 iA = 0
2. Assume:
(a) hpk , pm iA = 0 for k 6= m and k, m ≤ l
(b) rkT rm = 0 for k 6= m and k, m ≤ l
3. For m < l:
rlT pm = rlT (c0 r0 + c2 r2 + · · · + cm rm ) = 0
from (b) and therefore
T
rl+1
pm = (rl − αl Apl )T pm = 0.
For m = l we have
T
T
rl+1
pl = (rl − αl Apl ) pl = rlT pl − αl hpl , pl iA = 0
by the definition of αl . But rm ∈ span {p0 , . . . , pm } for m = 1, . . . , l and therefore
T
rl+1
rm = 0,
m = 0, 1, . . . , l,
18
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
which proves (B).
To prove (A), we use:
pl+1 = rl+1 + βl pl
hpl+1 , pm iA = hrl+1 , pm iA + βl hpl , pm iA ; hpl , pm iA = 0,
m = 0, . . . , l − 1.
But
Apm ∈ span {r0 , r1 , . . . , rm+1 }
and from (B),
hrl+1 , pm iA = 0
⇒
hpl+1 , pm iA = 0 for m = 0, 1, . . . , l − 1.
By the choice of pl+1 in the CGM,
hpl+1 , pl iA = 0
and so (A) holds.
From this theorem we can conclude that
dim Kn ≤ n.
Corollary 1.7.9 If A ∈ Rn×n is spd then for some m ≤ n, rm = 0.
Theorem 1.7.10 If A is spd then uk+1 minimizes the error u − uk+1 = ǫk+1 over Kk+1 in ||·||A i.e.
||ǫk+1 ||A = min ||u − v||A
(u : Au = b)
v∈Kk+1
Proof Corollary 1.7.9 concluded that there exist some m + 1 ≤ n s.t. rm+1 = 0. Without loss of generality
we can assume that u0 = 0 since any change of the initial iterate can be interpreted as a change in the right
hand side vector b: A(u − u0 ) = b − Au0 , and therefore it does not affect the estimates that follow. Then the
m
P
αi pi . Consider some
exact solution must be a linear combination of the search directions p1 , . . . , pm i.e. u =
i=0
v ∈ Kk+1 , k < m + 1. It must have the form v =
||u − v||2A = ||
k
X
i=0
k
X
i=0
k
P
γi pi . Then we have:
i=0
(αi − γi )pi +
|αi − γi |2 ||pi ||2A +
m
X
i=k+1
m
X
i=k+1
αi pi ||2A =
|αi |2 ||pi ||2A ,
since hpi , pj iA = 0 if i 6= j. Therefore, ||u − v||A has its minimum for αi = γi , i = 1, . . . , k. But uk+1 =
which concludes the proof.
Definition Πl denotes the set of all polynomials up to order l:
Πl = pl : al xl + · · · + a0
n
Corollary 1.7.11 If A ∈ Rn×n is symmetric positive definite with a spectrum {λi }i=1 then:
||ǫk+1 ||A ≤ ||ǫ0 ||A max |q (λj )|
1≤j≤n
for all polynomials q ∈ Πk+1 such that q (0) = 1
k
P
i=0
αi pi
1.7. ITERATIVE METHODS
19
Proof From theorem 1.7.10 we have that:
||ǫk+1 ||A ≤ ||u − v||A , ∀v ∈ Kk+1 .
But
v ∈ Kk+1 = span r0 , Ar0 , . . . , Ak r0
Without loss of generality, we can assume that r0 = b i.e. u0 = 0 and then
v = α0 b + α1 Ab + · · · + αk Ak b = p (A) b.
Therefore
||ǫk+1 ||A ≤ ||u − p (A) b||A , ∀p (A) ∈ Πk
but u0 = 0, b = Au, and ǫ0 = u so that
||ǫk+1 ||A ≤ ǫ0 − p (A) A u − u0 A = ||ǫ0 (I − p (A) A)||A
≤ ||ǫ0 || I − p (A) A = ||ǫ0 || ||q (A)||A , ∀q (A) ∈ Πk+1 s.t. q (0) = 1.
|
{z
}
∈Πk+1 A
Now we will prove that
||q (A)||A = max |q (λj )| .
j
Since A is symmetric, there exists an orthonormal set of eigenvectors {vj }nj=1 of A i.e.
viT vj =
Take x ∈ Rn then
x=
0
1
if
if
i 6= j
i=j
n
X
ci vi
i=1
2
and therefore ||x||A =
n
P
i=1
2
||q (A)||A
=
c2i λi . Then we have
2
sup ||q (A) x||A =
||x||A =1
n
P
i=1
n
P
i=1
sup
c2i λi =1
n
P
i=1
sup
c2i λi =1
q (A)
n
P
j=1
cj vj
!T
n
P
ci vi =
A q (A)
i=1
q 2 (λi ) c2i λi ≤ max q 2 (λi ) .
1≤i≤n
If max q 2 (λi ) is achieved for some index l then it is clear that the equality in the last relation would be achieved
1≤i≤n
√
if we take x = vl / λl , and this concludes the proof of the corollary.
From corollary 1.7.11 it is clear that the sharpest error estimate would be obtained if we pick q to be such
that max |q (λj )| is minimal with respect to q. Let us denote the minimum and maximum eigenvalues of A
1≤j≤n
by λmin and λmax correspondingly. Before we construct such a polynomial on the interval [λmin , λmax ] we first
consider polynomials on [−1, 1] whose maximum is minimal among all possible polynomials of the same degree
i.e. polynomials with a minimal infinity norm. To remind you:
20
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
Definition The infinity norm of a polynomial on [−1, 1] is defined as
||p||L∞ [−1,1] =
sup |p (z)|
z∈[−1,1]
If p (0) = 1 then ||p||L∞ [−1,1] ≥ 1 so the minimum for such polynomials would be 1.
The polynomials that have a unit infinity norm on [−1, 1] are the Chebyshev polynomials. They are defined
recursively as follows:
Definition
T0 (z) = 1
T1 (z) = z
Tn+1 (z) = 2zTn (z) − Tn−1 (z)
Alternatively, we may represent the Chebyschev polynomials as
k k p
p
1 2
2
z+ z −1 + z− z −1
Tk (z) =
,
2
or
Tk (z) = cos(k arccos(z)).
The polynomials with a minimum infinity norm on [λmin , λmax ], such that they equal 1 at 0, are given by the
Shifted Chebyshev polynomials:
Definition
Lemma 1.7.12
z−λmin
Tk 1 − 2 λmax
−λmin
T̃k (z) =
λmin
Tk 1 + 2 λmax −λmin
T̃k (z)
L∞ [λmin ,λmax ]
=
min
p∈Πk ,p(0)=1
||p(z)||L∞ [λmin ,λmax ] .
Proof Since Tk (z) = cos(k arccos(z)) it is clear that Tk has k + 1 extrema Tk (zi ) = (−1)i , i = 0, 1, . . . , k in the
nodes zi = cos(iπ/k) i.e it has k + 1 alternating minima and maxima on [−1, 1]. T̃k is just a scaled and shifted
version of Tk so it must have exactly the same number of alternating maxima and minima on [λmin , λmax ].
Now assume towards contradiction, that there is some pk (z) such that pk (0) = 1 and ||pk (z)||L∞ [λmin ,λmax ] <
. Then, pk − T̃k must also have at least the same number of alternating extrema as T̃k
T̃k (z)
L∞ [λmin ,λmax ]
on [λmin , λmax ] since max |pk | < max |Tk | on this interval. This means that pk − T̃k has at least k zeros on
[λmin , λmax ]. But pk (0) − T̃k (0) = 0 and 0 < λmin . So we reach the contradiction that the polynomial of k-th
degree pk − T̃k has at least k + 1 zeros which concludes the proof.
From the corollary we have that
||ǫk ||A ≤ ||ǫ0 ||A max |p (λi )|
λi
for any p (x) ∈ Πk such that p (0) = 1. Therefore,
||ǫk ||A ≤ ||ǫ0 ||A
sup
λmin <z<λmax
|pk (z)| = ||ǫ0 ||A kpkL∞ [λmin ,λmax ] .
Since pk can be any polynomial such that p (0) = 1, the sharpest estimate would be obtained for polynomials
with a minimum infinity norm i.e. shifted Chebishev polynomials:
||ǫk ||A ≤ ||ǫ0 ||A
max
T̃k (z)
λmin <z<λmax
1.7. ITERATIVE METHODS
21
Then,
T 1 − 2 z−λmin
k
λ
−λ
max
min
max
T̃k (z) = max λ
λmin <z<λmax
Tk 1 + 2 λ min
−λ
max
min
1
≤ 2
≤ λmin
Tk 1 + 2 λmax
−λmin √
k
√
λ
− λ
√ max √ min
λmax + λmin
where we have used
Tk λmax + λmin = 1
λmax − λmin 2
√
√
√
k √
k !
λmax + λmin
λmax − λmin
√
√
√
+ √
λmax − λmin
λmax + λmin
√
√
k
1
λmax + λmin
√
√
≥
.
2
λmax − λmin
Then
||ǫk ||A ≤ 2 ||ǫ0 ||A
If A is spd then
√
k
√
λ
− λ
√ max √ min
λmax + λmin
λmax
c2 (A) = ||A||2 A−1 2 =
λmin
and
||ǫk ||A ≤ 2 ||ǫ0 ||A
!k
p
c2 (A) − 1
p
.
c2 (A) + 1
Let p(γ) be the smallest integer k > 0 such that ||ǫk ||A ≤ γ ||ǫ0 ||A . Then, from the estimate above (which is
sharp) it follows that
k
p
1−z
2
≤ γ, ∀z ∈ [0, 1), where z = 1/ c2 (A),
1+z
or
k
1+z
2
.
≤
γ
1−z
Note that z = 0 corresponds to the limit c2 (A) → ∞. Then we get
ln
or
2
1+z
1
1
≤ k ln
= 2k(z + z 3 + z 5 + . . . ),
γ
1−z
3
5
z ∈ [0, 1)
1
2
1
1
arctanh(z)
ln ≤ k(1 + z 2 + z 4 + . . . ) = k
.
2z γ
3
5
z
But arctanh(z)/z is a monotonically increasing function on [0, 1) and its minimum is 1. Therefore, we have;
2
1p
c2 (A) ln ≤ k,
2
γ
p
p
which means that p(γ) ≤ 1/2 c2 (A) ln(2/γ) + 1, i.e. for all practical purposes, we need of order of c2 (A)
iterations for satisfying the convergence condition ||ǫk ||A ≤ γ ||ǫ0 ||A .
22
CHAPTER 1. SYSTEMS OF LINEAR EQUATIONS
Chapter 2
Solutions to Partial Differential
Equations
2.1
Classification of Partial Differential Equations
Suppose we have a pde of the form
∂u ∂u ∂ 2 u ∂ 2 u ∂ 2 u
φ x, y, u,
,
, 2,
, 2 =0
∂x ∂y ∂x ∂x∂y ∂y
then we classify the equation in terms of the second order derivatives, if they are not present we then classify
in terms of the first order derivatives.
2.1.1
First Order linear PDEs
∂u
∂u
+ β (t, x)
= γ (t, x)
∂t
∂x
Let us try to reduce it to an ODE over some path (t (s) , x (s)) in the domain of the equation. We have that:
α (t, x)
du
∂u dt
∂u dx
=
+
ds
∂t ds ∂x ds
If we can choose a path α =
dt
ds
and β =
∂x
∂s
then
du
=γ
ds
These paths are called characteristics.
2.1.2
Second Order PDE
∂ 2u
∂u ∂u
∂2u
∂ 2u
α 2 +β
+ γ 2 = ψ x, y, u,
,
∂x
∂x∂y
∂y
∂x ∂y
We use a similar idea as in the first order case but now we try to reduce the equation to a system of ODEs for
the first derivatives of the solution i.e.
∂ 2 u dx
∂ 2 u dy
d ∂u
=
+
2
ds ∂x
∂x ds
∂x∂y ds
2
∂ u dx ∂ 2 u dy
d ∂u
=
+ 2
ds ∂y
∂x∂y ds
∂y ds
23
24
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
Then we may write this system as

α
β
 dx
ds
0
dy
ds
dx
ds
  ∂2u   ψ 
γ
∂x2
 ∂ 2 u   d ∂u 
0   ∂x∂y
 =  ds ∂x 
dy
ds
∂2u
∂y 2
d
ds
∂u
∂y
If we require, similarly to the first-order case, that the original equation is a linear combination of the two
equations for the first partial derivative, i. e. this system is linearly dependent, then
2
2
2
dy
dx dy
dy
dy
dx
α
−β
= 0 and α
−β
+γ
+ γ = 0.
ds
ds ds
ds
dx
dx
We classify the equations based on the discriminant
β 2 − 4αγ > 0
2
β − 4αγ = 0
β 2 − 4αγ < 0
2.2
⇒
⇒
⇒
hyperbolic
(2 real characteristics)
parabolic (1 real characteristic)
elliptic (0 real characteristics)
Difference Operators
Suppose that we have a grid of nodes in an interval [a, b], ∆ = {a = x0 < x1 < · · · < xN −1 < xN = b}. The
following mapping: uh : ∆ → R is called a grid function: uh = (uh,0 , . . . , uh,N )T with uh,j = uh (xj ). A classical
function u : [a, b] → R gives rise to an associated grid function and, abusing notation somewhat, we denote
this grid function by u and its value at a given node xj by uj . In addition we will make use of the following
operators on such functions, that are used to approximate the corresponding derivatives:
1
(uj+1 − uj ) ,
hj+1 = xj+1 − xj
δ + uj =
hj+1
1
(uj − uj−1 )
δ − uj =
hj
1
δuj+ 21 =
(uj+1 − uj ) ,
hj+1
1
uj − uj−1
uj+1 − uj
1
2
,
hj+ 21 = (hj + hj+1 ) .
−
δ uj =
hj+ 21
hj+1
hj
2
We will also make use of the averaging operator:
1
(uj + uj+1 ) .
2
Assuming that hj = h, ∀j, and enough regularity of the function u, and using Taylor expansions, we can derive
the following estimates for the error of these approximations:
∂uj
1
∂ 2 u j h2
1
+
uj +
h+
+ . . . − uj
δ uj = (uj+1 − uj ) =
h
h
∂x
∂x2 2
∂uj
∂uj
∂ 2 uj
=
+ h 2 + ... =
+ O (h)
∂x
∂x
∂x
∂uj
+ O (h)
δ − uj =
∂x
uj+1 − 2uj + uj−1
∂ 2 uj
δ 2 uj =
=
+ O h2
h2
∂x2
1
uj+ 21 = (uj + uj+1 ) = u xj+ 21 + O h2 .
2
Similar estimates can be derived for non-constant grid size hj , however, in the rest of the notes we will consider
only equidistant grids. The non-equidistant case requires more careful examination.
uj+ 12 =
2.2. DIFFERENCE OPERATORS
2.2.1
25
Poisson Equation
−∇2 u (u, y) = f (x, y) ,
(x, y) ∈ (0, 1) × (0, 1) = Ω
u (x, y) = g (x, y) ,
(x, y) ∈ ∂Ω
Consider the grid:
Ω̄h ≡ {(xj , yl ) ∈ [0, 1] × [0, 1] , xj = jh, yl = lh,
i, l ∈ 0, . . . , N }
Ωh ≡ {(xj , yl ) ∈ ∆, 1 ≤ j, l ≤ N − 1}
∂Ωh ≡ {(x0 , yl ) , (xN , yl ) , (xj , y0 ) , (xj , yN ) , 1 ≤ j, l ≤ N − 1}
A second order scheme for an approximation to the solution uh is given by:
− δx2 uh,j,l + δy2 uh,j,l = fh,j,l ,
(xj , yl ) ∈ Ωh
uj,l = gh,j,l ,
(xj , yl ) ∈ ∂Ωh
where fh,j,l = fj,l , (xj , yl ) ∈ Ωh ; gh,j,l = gj,l , (xj , yl ) ∈ ∂Ωh So that
− (uh,j−1,l + uh,j,l−1 − 4uh,j,l + uh,j,l+1 + uh,j+1,l ) = h2 fh,j,l
The left hand side is a discretization of ∇2 u on the following stencil
yl+1
yl
xj−1

where
T I
I T I


..

.


I T I
I T
yl−1
xj+1
xj

uh,1
uh,2
..
.


r1
r2
..
.














 = −h2 





 uh,N −2 
rN −2 
uh,N −1
rN −1






−4 1
fj,1
uh,j,1

 1 −4 1
 fj,2 


 uh,j,2 
bj






..
T ≡
 , fj ≡  ..  , rj ≡ fj + 2
 , uh,j ≡ 
..
.


h




.
.

1 −4 1 
fj,N −1
uh,j,N −1
1 −4






gj,0
gN,1 + gN −1,0
g0,1 + g1,0
 0 




gN,2
g0,2






 .. 




.
.
..
..
bj ≡  .  for j = 2, . . . , N − 2, b1 ≡ 

 , bN −1 ≡ 






 0 




gN,N −2
g0,N −2
gN,N −1 + gN −1,N
g0,N −1 + g1,N
gj,N
26
2.2.2
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
Neumann Boundary Conditions
y4
y3
y2
y1
x−1 x0
Suppose that we need to impose a Neumann condition
∂u = g|ij
∂n ij
on the left boundary of the domain sketched above. Then, we write the scheme for all nodes on this boundary
and discretize the Neumann condition using a central difference scheme, introducing an additional layer of points
(corresponding to level -1 in x direction)
uh,1,l − uh,−1,l
= g0,l ,
2h
uh,−1,l + uh,0,l−1 − 4uh,0,l + uh,0,l+1 + uh,1,l = −h2 f0,l .
In order to eliminate the additional layer of points, we combine these results to get
uh,0,l−1 − 4uh,0,l + uh,0,l+1 + 2uh,1,l = −h2 f0,l + 2hg0,l
2.3
Consistency and Convergence
Consider the continuous problem
Lu = f
u=g
in Ω
on ∂Ω
Now, consider the corresponding discrete problem
L h uh = fh
uh = gh
in Ωh
on ∂Ωh .
fh , gh are some sort of approximations of f, g and these notes we assume that fh = f, gh = g in the nodes of
the discretization grid.
Definition For a given φ ∈ C ∞ (Ω)
τh (x) ≡ (L − Lh ) φ (x)
is called a truncation error of Lh . The scheme Lh is consistent with L if
lim τh (x) = 0
h→0
for all x ∈ Ω and φ ∈ C ∞ (Ω). If τh (x) = O (hp ), then Lh is consistent to order p.
2.3. CONSISTENCY AND CONVERGENCE
27
Proposition 2.3.1
−∇2h = −δx2 − δy2
is consistent to order 2 with
−∇2 = −
∂2
∂2
− 2
2
∂x
∂y
Definition uh converges to u in a given norm ||.|| if ||ǫh || = ||u − uh || satisfies
lim ||ǫh || = 0.
h→0
If ||ǫh || = O (hp ), then p is the order of convergence.
In the section on finite difference methods we use exclusively the infinity (or maximum) norm: ||ǫh || = max |ǫh,j |.
j
Note that consistency does not guarantee that the difference between the solutions of the continuous and
discrete problems also tends to zero. It only guarantees that the action of the difference between the differential
and discrete operators on a smooth enough function tends to zero.
Let us go back to the boundary value problem for the Poisson equation
−∇2 u = f
u=g
in Ω
in ∂Ω
(2.1)
in Ωh
in ∂Ωh .
(2.2)
with the corresponding discrete problem
−∇2h uh = fh
uh = gh
From the elliptic PDEs theory we know that the solution to (2.1) satisfies a maximum principle i.e. the solution
achieves its maximum on the boundary of the domain. Similarly (2.2) satisfies a Discrete Maximum Principle:
Theorem 2.3.2 (Discrete Maximum Principle) If
∇2h vj,l ≥ 0
for all (xj , yl ) ∈ Ωh , then
max
(xj ,yl )∈Ωh
vj,l ≤
max
(xj ,yl )∈∂Ωh
vj,l
Proof
∇2h vj,l =
vj+1,l + vj−1,l + vj,l+1 + vj,l−1 − 4vj,l
≥0
h2
⇒
vj,l ≤
1
(vj+1,l + vj−1,l + vj,l+1 + vj,l−1 )
4
∀(xj , yl ) ∈ Ωh
Suppose that the maximum of v is achieved in an internal point (xJ , yL ) ∈ Ωh , then:
vJ ,L ≥ vJ −1,L ,
vJ ,L ≥ vJ +1,L ,
vJ ,L ≥ vJ ,L−1 ,
vJ ,L ≥ vJ ,L+1
so that
4vJ ,L ≥ vJ −1,L + vJ +1,L + vJ ,L−1 + vJ ,L+1
⇒
4vJ ,L = vJ −1,L + vJ +1,L + vJ ,L−1 + vJ ,L+1
Therefore, v must attain its maximum in all neighbouring points too. Applying this argument repeatedly to
the neighbours we eventually will reach a boundary point.
Corollary 2.3.3 The following two results follow from the discrete maximum principle:
28
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
1.
∇2h uh = 0
uh = 0
in Ωh
on ∂Ωh
has a unique solution uh = 0 in Ωh ∪ ∂Ωh .
2. For given fh and gh ,
∇2h uh = fh
uh = gh
in Ωh
on ∂Ωh
has a unique solution.
Definition If
v : Ωh ∪ ∂Ωh → R
then
||v||Ω ≡
||v||∂Ω ≡
max
(xj ,yl )∈Ωh
|vj,l |
max
(xj ,yl )∈∂Ωh
|vj,l |
Lemma 2.3.4 If vj,l = 0 for all (xj , yl ) ∈ ∂Ωh then
||v||Ω ≤ M∆ ∇2h v Ω
for some M∆ > 0, that depends on Ω but not on h.
Proof Let
and consider the function
2 ∇h v = ν
Ω
w : wj,l =
Denote its norm on the boundary by M∆ i.e.
⇒
−ν ≤ ∇2h v ≤ ν,
1 2
xj + yl2 ≥ 0.
4
||w||∂Ω = M∆ > 0.
Note that ∇2h w = 1. Then
∇2h (v + νw) ≥ 0
∇2h (v − νw) ≤ 0
From the maximum principle and since vh |∂Ωh = 0 we have:
in Ω
vj,l ≤ vj,l + νwj,l ≤ ν ||wj,l ||∂Ω = M∆ ∇2h vj,l Ω
vj,l ≥ vj,l − νwj,l ≥ −ν ||wj,l || = −M∆ ∇2h vj,l ∂Ω
∀(xj , yl ) ∈ Ωh
Ω
Theorem 2.3.5 (error estimate) Let u be the solution of the boundary value problem (2.1) and uh is a
solution to the discrete analog (2.2). Assuming that u ∈ C 4 Ω̄ , then there exists a constant k > 0 such that:
||u − uh ||Ω ≤ kM h2
where
)
( 4 ∂ u ∂ 4 u M ≡ max 4 ,
∂x L∞ (Ω) ∂y 4 L∞ (Ω)
2.4. ADVECTION EQUATION
29
Proof From the proposition:
h2 ∂ 4 u
∂ 4u
∇2h − ∇2 uj,l =
(ξ
,
y
)
+
(x
,
η
)
j
l
j
l
12 ∂x4
∂y 4
for some ξj , ηl s.t. xj−1 ≤ ξj ≤ xj+1 , yl−1 ≤ ηl ≤ yl+1 . Taking into account that u is the exact solution we
have:
h2 ∂ 4 u
∂4u
−∇2h uj,l = fj,l −
(ξ
,
y
)
+
(x
,
η
)
j l
j l
12 ∂x4
∂y 4
Subtract −∇2h uh,j,l = fh,j,l = fj,l and note that u − uh = 0 on ∂Ωh . Then
∇2h
∂4u
h2 ∂ 4 u
(ξj , yl ) + 4 (xj , ηl )
(uj,l − uh,j,l ) =
12 ∂x4
∂y
From the Lemma we get
2M∆
||u − uh ||Ω ≤ M∆ ∇2h (u − uh )Ω ≤
M h2 = kM h2
12
| {z }
=k
2.4
Advection Equation
Consider the initial value problem
∂u
∂u
+v
= 0,
∂t
∂x
−∞ < x < ∞,
t>0
with v > 0, u (0, x) = f (x) for −∞ < x < ∞. The exact solution is given by
u (t, x) = f (x − vt)
Consider a grid ∆ ≡ ∆t × ∆x :
∆t ≡ {nk : n = 0, 1, 2, . . .}
∆x ≡ {jh : j = 0, ±1, ±2, . . .}
One possible discretization is
or equivalently
n
un+1
unh,j − unh,j−1
h,j − uh,j
δt+ + vδx− unh,j =
+v
=0
k
h
n
n
n
un+1
h,j = uh,j − C uh,j − uh,j−1
This is called the FB (Forwards Backwards) scheme where the Courant number is defined as
C≡
vk
h
30
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
x
..
.
5h
C >1
4h
C=1
3h
2h
h
k
2k
3k
t
Define S (shift operator) such that
Sunh,j = unh,j+1
n
n
n−1
2 n−2
= 1 − C + CS −1 uh,j
= · · · = 1 − C + CS −1 u0h,j = 1 − C + CS −1 fj0
unh,j = 1 − C + CS −1 uh,j
n n X
X
n
n
m
−1 n−m
(1 − C)m C n−m fj−n+m
fj =
(1 − C) CS
=
m
m
m=0
m=0
Now, suppose that the initial condition is perturbed with an error ǫj i.e. instead of fj the initial condition is
given byfˆj = fj + ǫj . Then, the perturbed solution ûh is given by:
ûh,j
n X
n
m
(1 − C) C n−m (fj−n+m + ǫj−n+m )
=
m
m=0
and we have
n n
X
n
m
n
n−m
uh,j − ûh,j = (1 − C) C
ǫj−n+m m
m=0
n n
X
X
n
n
m
n
m n−m
|1 − C| C n−m = ||ǫ||∞ (|1 − C| + C)
|1 − C| C
|ǫj−n+m | ≤ ||ǫ||∞
≤
m
m
m=0
m=0
So that
Then, if C ≤ 1:
However, if C > 1:
n
u − ûn ≤ ||ǫ|| (|1 − C| + C)n
h,j
h,j
∞
n
u − ûn ≤ ||ǫ||
h,j
h,j
∞
n
u − ûn ≤ ||ǫ|| (|1 − C| + C)n = ||ǫ|| (2C − 1)n n→∞
→ ∞
h,j
h,j
∞
∞
Theorem 2.4.1 If C ≤ 1 and u ∈ C 2 ((0, t] × R) then the FB scheme is convergent. That is
E n = ||un − unh ||∞ ≤ tn O (k + h)
where tn = kn.
2.5. VON NEUMANN STABILITY ANALYSIS
31
Proof
k ∂2u
∂u
(t, ξ)
(t, x) +
∂t
2 ∂t2
h ∂2u
∂u
(η, x)
(t, x) −
δx− u (t, x) =
∂x
2 ∂x2
+
−
δt u + vδx u = −τk,h (t, x) = O (k + h)
δt+ u (t, x) =
δt+ uh + vδx− uh = 0
and letting ǫh = u − uh we get the following estimate:
n+1 ǫh,j ≤ (1 − C) ǫnh,j + C ǫnh,j−1 + k |τk,h (tn , xj )|
Let
E n = ||ǫnh ||∞ ,
T n = max |τk,h (tn , xj )|
j
Then we can rewrite the inequality in the form
E n+1 ≤ (1 − C) E n + CE n + k |τk,h (tn , xj )| ≤ E n + kT n ≤ E n−1 + kT n + kT n−1 ≤ E 0 + k
n
X
Tm
m=1
0
If E = 0 then
E n+1 ≤ k
2.5
n
X
m=1
T m ≤ nk max T m = tn max T m = tn O (k + h)
m
1≤m≤n
Von Neumann Stability Analysis
Definition Discrete L2 norm:
∞
X
2
||v||2 ≡ h
j=−∞
|vj |
2
Definition A FD (Finite Difference) scheme for a time dependent PDE is stable if for any time T > 0, there
exists K > 0 such that for any initial data u0h , the sequence {unh } satisfies:
||unh ||2 ≤ K u0h 2
for 0 ≤ nk ≤ T . The constant K is independent of the time and space mesh size!
Consider the following grid on the real axes
∆x = {. . . , −2h, −h, 0, h, 2h, . . .} .
Then the Fourier Transform of a given grid function v is defined as
∞
h X −ijhτ
e
vj
(F v) (τ ) ≡ √
2π j=−∞
Inverse transform:
1
vj = √
2π
Z
π
h
eijhξ (F v) (ξ) dξ
−π
h
Theorem 2.5.1 (Parseval Identity)
||F v||2L2 [− π , π ] :=
h
h
Z
π
h
−π
h
(F v)2 dξ = ||v||22
32
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
Corollary 2.5.2 A finite difference scheme is stable if and only if ∃K > 0, independent of k, h, s.t.:
||F unh ||L2 [− π , π ] ≤ K F u0h L2 [− π , π ]
h h
h h
Consider the forward backwards scheme. Since we have:
n−1
n−1
unh,j = (1 − C) uh,j
+ Cuh,j−1
Z πh
h
n
√
eijhξ (F unh ) (ξ) dξ
uh,j =
2π − πh
Z πh
h
n−1
uh,j−1
= √
ei(j−1)hξ F uhn−1 (ξ) dξ,
2π − πh
we obtain:
F unh =
F uhn−1 (ξ)
1 − C + Ce−ihξ
{z
}
|
Amplification Factor
A (hξ) = 1 − C + Ce−ihξ
Then, since we have
2
||F unh ||L2 [− π , π ]
h h
if |A (hξ)| ≤ 1, we obtain that:
=
A (hξ)
2
F uhn−1
2
dξ,
||F unh ||L2 [− π , π ] ≤ F uhn−1 L2 [− π , π ]
h
h h
||F unh ||L2 [− π , π ] ≤ F u0h L2 [− π , π ] .
h
Since
π
h
−π
h
h
and recursive application gives:
Z
h
h h
A (θ) = 1 − C + Ce−iθ
2
2
|A (θ)| = (1 − C + C cos θ) + (−C sin θ)
= 1 − 2C + C 2 + 2C cos θ − 2C 2 cos θ + C 2 cos2 θ + C 2 sin2 θ
= 1 − 2C (1 − cos θ) + 2C 2 (1 − cos θ) = 1 + 2C (1 − cos θ) (C − 1)
≤ 1 + 4C (C − 1)
it follows that if C > 1 then |A (θ)| > 1, and if C ≤ 1 then |A (θ)| ≤ 1. Also,
1 − C + Ce−iθ ≤ |1 − C| + C
If a numerical scheme requires a solution of a linear system with a non-diagonal matrix then we call it an implicit
scheme, otherwise it is called an explicit scheme.
Consider the following “Backwards Backwards” scheme
n−1
unh,j − uh,j
unh,j − unh,j−1
=0
k
h
n−1
n
n
(1 + C) uh,j − Cuh,j−1 = uh,j
+v
This is an implicit scheme.
In order to study its stability (and the stability of any other FD scheme), it is enough, instead of considering
the full Fourier transform of the solution, to consider the ”action” of the scheme on a single mode of the form:
wn eijhξ
2.6. SUFFICIENT CONDITIONS FOR CONVERGENCE
33
then plugging into the BB scheme:
wn 1 + C 1 − e−ihξ = wn−1
|
{z
}
A−1 (hξ)
wn = A (hξ) wn−1
−1
A (θ) = 1 + C 1 − e−iθ
Then,
1 + C 1 − e−ihξ ≥ |1 + C| − Ce−ihξ = 1
so that
|A (θ)| ≤ 1
∀θ
therefore the BB scheme is unconditionally stable.
We may outline a procedure for von Neumann stability analysis:
1. Consider a single mode wn eijhξ .
2. Derive conditions for A to satisfy |A (hξ)| ≤ 1
2.6
Sufficient Conditions for Convergence
Consider
∂u
= Lu, (t, x) ∈ (0, T ) × R
∂t
where L is a linear spatial operator, with an initial condition
u (0, x) = f (x) .
We assume that the boundary conditions, in case of a finite spatial domain, are incorporated in the operator L.
Then after discretization using a scheme that involves two time levels, in general we obtain a discrete system
that can be written as:
un+1 − unh
C h
= Dun+1
+ F unh
h
k
where C, D, F are linear finite dimensional operators. The consistency error is obtained when we substitute the
exact solution u in the discrete system i.e.
C
The last equation can be rewritten as:
un+1 − un
n+1
= Dun+1 + F un + τk,h
.
k
n+1
Bun+1 = Aun + kτk,h
where
B = C − kD,
A = kF − C
If det (B) 6= 0 (otherwise the scheme would be inconsistent if the original problem is well posed) then
n+1
n+1
un+1 = B −1 Aun + kB −1 τk,l
= Eun + kB −1 τk,l
n+1
where τk,h
is the truncation error.
Theorem 2.6.1 If the scheme is consistent to order (p, q) in time and space and is also stable then it is
convergent of order (p, q), that is
n+1
u
= O (k p + hq )
− un+1
h
2
provided that the initial error is zero:
0
u − u0h = 0
2
34
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
Proof
n+1
un+1 = Eun + kB −1 τk,h
,
un+1
= Eunk
h
so that
n+1
X
0
0
n+1
n
n
−1 n+1
n+1
m
u
−
u
un+1
−
u
=
E
(u
−
u
)
−
kB
τ
=
.
.
.
=
E
−k
E n−m+1 B −1 τk,h
h
h
h
k,h
| {z }
m=1
=0
Then,
n+1
n+1
X X m n+1
m n+1 .
E n−m+1 B −1 τk,h
u
E n−m+1 B −1 τk,h
−
u
≤
k
≤
k
h
2
2
2
2
| {z }2
m=1
m=1
≤M
If the scheme is stable then
integer
l, ulh 2 ≤ K u0h 2 for some positive K. Since from
l for
any
positive
the scheme we have that uh 2 ≤ E l 2 u0h 2 and this is a sharp inequality we can conclude that ∃k2 > 0
n−m+1 E
≤ k2 and we have
2
n+1
X
n+1
m m n+1 u
≤ RT max τk,h
= RT O (k p + hq )
−
u
≤
k
k2 M τk,h
h
2
2
2
m
m=1
where T = k(n + 1) is the final time and R = k2 M .
Theorem 2.6.2 (Lax Theorem) Given a consistent scheme for a well-posed initial boundary value problem,
then stability is a necessary and sufficient condition for convergence.
2.7
Parabolic PDEs – Heat Equation
∂u
∂t
Which has the solution
2.7.1
2
= D ∂∂xu2 , (t, x) ∈ (0, T ] × R
u (0, x) = f (x) , x ∈ R
1
u (t, x) = √
4πDt
Z
∞
f (ξ) e−
(x−ξ)2
4Dt
dξ
−∞
FC Scheme
This is an explicit scheme.
δt+ unh,j − Dδx2 unh,j = 0
n
n
n
un+1
h,j = (1 − 2ΓD) uh,j + ΓD uh,j+1 + uh,j−1 ,
where Γ is called a grid ratio.
tn+1
tn
tn−l
xj−1
xj
xj+1
Γ=
k
h2
2.7. PARABOLIC PDES – HEAT EQUATION
35
The consistency error is found using
∂t u − δt+ u = O (k)
∂xx u − δx2 u = O h2
so that
τk,h = O k + h2 .
Stability is a central concern of parabolic and hyperbolic PDEs. Using the Neumann analysis idea we substitute
the Fourier mode:
vjn = wn eijθ
into the scheme to get:
1
D
wn+1 − wn eijθ = 2 wn eiθ − 2wn + wn e−iθ eijθ
k
h
This gives:
w
n+1
2 θ
= 1 − 4DΓ sin
wn
2
|
{z
}
A(θ)≤1
So that
0 < DΓ ≤
1
2
⇔
|A (θ)| ≤ 1
⇒
stability
and we require:
1
Dk
≤
2
h
2
⇒
k≤
1 h2
2D
and we note that k and h are not of the same order. So, this is a conditionally stable scheme, but it is explicit
and this is to be expected.
In this problem the domain Ω is the entire real axes and therefore it is convenient to use for measuring the
error the following ∞-norm: ||vh ||Ω = max |vh,j |. Then the following theorem provides the convergence
−∞<j<∞
estimate for the FC scheme.
Theorem 2.7.1 If DΓ ≤
1
2
and if uh satisfies exactly the initial condition then:
||u − uh ||Ω = ||ǫh ||Ω = O k + h2
Proof From the scheme we have:
n
n
n
2
2
un+1
h,j = DΓuh,j−1 + (1 − 2DΓ) uh,j + DΓuh,j+1 + O k + kh
so that:
DΓ ǫnh,j+1 + O k 2 + kh2
ǫn+1
DΓ ǫnh,j−1 + (1 − 2DΓ) ǫnh,j + |{z}
h,j = |{z}
| {z }
≥0
≥0
≥0
by the triangle inequality,
n+1 ǫ
≤ ||ǫn || + O k 2 + kh2 ≤ ǫ0 + (n + 1) O k 2 + kh2 = (n + 1) k O k + h2 .
h Ω
h Ω
h
Ω
| {z }
tn+1
Note that the the right hand side of this estimate will blow up as t → ∞ for fixed k, h.
36
2.7.2
CHAPTER 2. SOLUTIONS TO PARTIAL DIFFERENTIAL EQUATIONS
BC scheme
tn+1
tn
tn−l
xj−1
xj
xj+1
2 n+1
δt− un+1
h,j = Dδx uh,j
This is a O k + h2 consistent scheme. It can be rewritten as:
n+1
n+1
n
−DΓun+1
h,j−1 + (1 + 2ΓD) uh,j − DΓuh,j+1 = uh,j
So we must invert a tridiagonal matrix. Now we will study its stability. Substituting
vjn = wn eijθ
into the scheme we have:
wn+1 − wn = DΓ eiθ − 2 + e−iθ wn+1
So that:
wn+1 =
1
wn .
1 + 4DΓ sin2 θ2
|
{z
}
A(θ)≤1
So, it is clear that the BC scheme is unconditionally stable.
2.7.3 Crank-Nicolson Scheme
It is given by
\[
\delta_t^- u_{h,j}^{n+1} = \frac{1}{2} D \left( \delta_x^2 u_{h,j}^{n+1} + \delta_x^2 u_{h,j}^n \right),
\]
or
\[
\frac{u_{h,j}^{n+1} - u_{h,j}^n}{k} = \frac{D}{2 h^2} \left( u_{h,j-1}^{n+1} - 2 u_{h,j}^{n+1} + u_{h,j+1}^{n+1} + u_{h,j-1}^n - 2 u_{h,j}^n + u_{h,j+1}^n \right).
\]
Consistency:
\[
\frac{D}{2} \left( \delta_x^2 u_j^{n+1} + \delta_x^2 u_j^n \right)
= \frac{D}{2} \left( \left. \frac{\partial^2 u}{\partial x^2} \right|_j^{n+1} + O\left( h^2 \right) + \left. \frac{\partial^2 u}{\partial x^2} \right|_j^n + O\left( h^2 \right) \right)
= D \left. \frac{\partial^2 u}{\partial x^2} \right|_j^{n+\frac{1}{2}} + O\left( k^2 + h^2 \right)
\]
and
\[
\frac{u_j^{n+1} - u_j^n}{k} = \left. \frac{\partial u}{\partial t} \right|_j^{n+\frac{1}{2}} + O\left( k^2 \right).
\]
[Stencil diagram: the Crank-Nicolson scheme couples the three nodes at level \( t_{n+1} \) with the three nodes at level \( t_n \).]
Neumann stability analysis: First, rewrite the scheme in the following two-stage form:
\[
\frac{u_{h,j}^{n+\frac{1}{2}} - u_{h,j}^n}{k/2} = D\, \frac{u_{h,j-1}^n - 2 u_{h,j}^n + u_{h,j+1}^n}{h^2}
\quad \rightarrow \quad
A_1(\theta) = 1 - 4 D \frac{\Gamma}{2} \sin^2 \frac{\theta}{2},
\]
\[
\frac{u_{h,j}^{n+1} - u_{h,j}^{n+\frac{1}{2}}}{k/2} = D\, \frac{u_{h,j-1}^{n+1} - 2 u_{h,j}^{n+1} + u_{h,j+1}^{n+1}}{h^2}
\quad \rightarrow \quad
A_2(\theta) = \frac{1}{1 + 4 D \frac{\Gamma}{2} \sin^2 \frac{\theta}{2}} .
\]
Then,
\[
w^{n+\frac{1}{2}} = A_1(\theta)\, w^n, \qquad
w^{n+1} = A_2(\theta)\, w^{n+\frac{1}{2}} = \underbrace{A_1 A_2}_{A(\theta)}\, w^n .
\]
So that \( |A(\theta)| \le 1 \), so the scheme is unconditionally stable, and \( O\left( h^2 + k^2 \right) \) accurate.
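A sketch of one Crank-Nicolson step (zero Dirichlet boundaries assumed; parameters are placeholders): the explicit half of the average goes to the right hand side, and the implicit half requires the same kind of tridiagonal solve as the BC scheme.

```python
import numpy as np
from scipy.linalg import solve_banded

# One Crank-Nicolson step for u_t = D u_xx:
# (I - r*delta_x^2) u^{n+1} = (I + r*delta_x^2) u^n,  with r = D*Gamma/2
def cn_step(u, D, gamma):
    r = 0.5 * D * gamma
    rhs = (1 - 2 * r) * u                 # u holds interior values only
    rhs[1:] += r * u[:-1]                 # explicit half of the average
    rhs[:-1] += r * u[1:]
    m = len(u)
    ab = np.zeros((3, m))
    ab[0, 1:] = -r                        # implicit half: tridiagonal matrix
    ab[1, :] = 1 + 2 * r
    ab[2, :-1] = -r
    return solve_banded((1, 1), ab, rhs)
```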
2.7.4 Leapfrog Scheme
[Stencil diagram: the leapfrog scheme couples \( (t_{n+1}, x_j) \) and \( (t_{n-1}, x_j) \) with the three nodes at level \( t_n \).]
As we use central differences in both directions, the scheme is \( O\left( k^2 + h^2 \right) \). We may write the scheme as:
\[
\frac{u_{h,j}^{n+1} - u_{h,j}^{n-1}}{2k} = D\, \frac{u_{h,j-1}^n - 2 u_{h,j}^n + u_{h,j+1}^n}{h^2} .
\]
Now we perform stability analysis: \( v_j^n = w^n e^{ij\theta} \). Then,
\[
w^{n+1} - w^{n-1} = 2 D \Gamma \left( e^{-i\theta} - 2 + e^{i\theta} \right) w^n ,
\]
and assume that \( w^n = A(\theta)\, w^{n-1} \) and \( w^{n+1} = A(\theta)\, w^n \), so that
\[
\left( A^2 - 1 \right) w^{n-1} = 4 D \Gamma (\cos \theta - 1)\, A\, w^{n-1} .
\]
This gives
\[
A_{1,2} = -4 D \Gamma \sin^2 \frac{\theta}{2} \pm \sqrt{1 + 16 D^2 \Gamma^2 \sin^4 \frac{\theta}{2}} .
\]
Clearly \( |A_2| \ge 1 \), and \( |A_2| > 1 \) for some \( \theta \), so the scheme is unconditionally unstable.
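A quick numerical check of the root formula (for an illustrative value of DΓ), confirming that the second root always has modulus greater than one:

```python
import numpy as np

# Roots A_{1,2} of the leapfrog amplification equation for D*Gamma = 0.25
dg = 0.25
theta = np.linspace(0.01, np.pi, 200)
s = dg * np.sin(theta / 2) ** 2
A2 = -4 * s - np.sqrt(1 + 16 * s**2)   # the root with |A2| >= 1
print(np.abs(A2).max())                # > 1: unconditionally unstable
```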
2.7.5 DuFort-Frankel Scheme
\[
\delta_t u_h^n = \frac{D}{h^2} \left[ u_{h,j+1}^n - \left( u_{h,j}^{n+1} + u_{h,j}^{n-1} \right) + u_{h,j-1}^n \right]
\]
Since
\[
\frac{1}{2} \left( u_{h,j}^{n+1} + u_{h,j}^{n-1} \right) = u_{h,j}^n + O\left( k^2 \right),
\]
this scheme can be considered as a stabilized version of the Leapfrog scheme. The truncation error is then:
\[
\tau_{k,h}(t, x) = \frac{k^2}{h^2} \frac{\partial^2 u}{\partial t^2}(t_n, x_j) + \frac{k^3}{3} \frac{\partial^2 u}{\partial t^2}(t_n, x_j) - \frac{h^2}{12} \frac{\partial^2 u}{\partial x^2}(t_n, x_j) + O\left( k^3 + h^3 \right)
\]
and the scheme is conditionally consistent if \( \frac{k}{h} \to 0 \). If k, h go to zero but \( \frac{k}{h} \to c \), then DF is consistent with:
\[
\frac{\partial u}{\partial t} + c^2 \frac{\partial^2 u}{\partial t^2} = D \frac{\partial^2 u}{\partial x^2} .
\]
The stability analysis reveals that this scheme is unconditionally stable.
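Solving the scheme for \( u_{h,j}^{n+1} \) gives an explicit update despite the implicit-looking term. A sketch (the scheme is three-level, so it needs two starting levels; bootstrapping the second level with one FC step is one common, if ad hoc, choice):

```python
import numpy as np

# DuFort-Frankel update for u_t = D u_xx: with a = 2*D*Gamma,
# (1 + a) u_j^{n+1} = (1 - a) u_j^{n-1} + a (u_{j+1}^n + u_{j-1}^n)
def df_step(u_prev, u_curr, D, gamma):
    a = 2 * D * gamma
    u_next = u_curr.copy()
    u_next[1:-1] = ((1 - a) * u_prev[1:-1]
                    + a * (u_curr[2:] + u_curr[:-2])) / (1 + a)
    return u_next

h, k, D = 0.01, 0.005, 1.0
x = np.arange(0.0, 1.0 + h, h)
u0 = np.sin(np.pi * x)                       # illustrative initial condition
u1 = u0.copy()                               # bootstrap with one FC step
u1[1:-1] = u0[1:-1] + (k * D / h**2) * (u0[2:] - 2 * u0[1:-1] + u0[:-2])
for n in range(100):
    u0, u1 = u1, df_step(u0, u1, D, k / h**2)
```

Here \( k/h = 0.5 \) is held fixed, so by the consistency discussion above the computed solution carries an \( O\big((k/h)^2\big) \) perturbation of the heat equation.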
2.8 Advection-Diffusion Equation
\[
\frac{\partial u}{\partial t} + v \frac{\partial u}{\partial x} = D \frac{\partial^2 u}{\partial x^2}, \quad (t, x) \in (0, T] \times (0, L),
\]
and \( v, D > 0 \). Then define \( \tau = \frac{vt}{L} \), \( \xi = \frac{x}{L} \), and the Peclet number \( Pe = \frac{vL}{D} \). So the PDE becomes:
\[
\frac{\partial u}{\partial \tau} + \frac{\partial u}{\partial \xi} = \frac{1}{Pe} \frac{\partial^2 u}{\partial \xi^2}, \quad (\tau, \xi) \in \left( 0, \frac{vT}{L} \right] \times (0, 1).
\]
2.8.1 FC Scheme
\[
\frac{u_{h,j}^{n+1} - u_{h,j}^n}{k} + \frac{u_{h,j+1}^n - u_{h,j-1}^n}{2h} - \frac{1}{Pe} \frac{u_{h,j-1}^n - 2 u_{h,j}^n + u_{h,j+1}^n}{h^2} = 0
\]
This scheme is consistent to \( O(k + h^2) \). Now perform stability analysis with \( v_j^n = w^n e^{ij\theta} \):
\[
w^{n+1} - w^n + k \left( \frac{e^{i\theta} - e^{-i\theta}}{2h} - \frac{1}{Pe} \frac{e^{-i\theta} - 2 + e^{i\theta}}{h^2} \right) w^n = 0
\]
\[
w^{n+1} = w^n \underbrace{\left[ -C i \sin \theta + \frac{2\Gamma}{Pe} (\cos \theta - 1) + 1 \right]}_{A(\theta)} .
\]
So we may write,
\[
|A(\theta)|^2 = \left( 1 - 4 \frac{\Gamma}{Pe} \sin^2 \frac{\theta}{2} \right)^2 + C^2 \sin^2 \theta .
\]
If \( \Gamma / Pe \le 1/2 \), the first term is \( \le 1 \), and since \( C^2 = k\Gamma \), then \( C^2 \sin^2 \theta \le Pe\, k / 2 = Mk \), \( M > 0 \), i.e.
\[
|A|^2 \le 1 + Mk .
\]
Theorem 2.8.1 The FC scheme is stable if \( \Gamma \le Pe/2 \), i.e. if \( |A(\theta)|^2 \le 1 + Mk \) for some \( M > 0 \).

Proof
\[
\left| F u_h^n \right|^2 (\xi) = A^2(h\xi) \left| F u_h^{n-1} \right|^2 (\xi) = A^{2n}(h\xi) \left| F u_h^0 \right|^2 (\xi) \le (1 + Mk)^n \left| F u_h^0 \right|^2 (\xi).
\]
Therefore,
\[
\left\| F u_h^n \right\|_2^2 \le (1 + Mk)^n \left\| F u_h^0 \right\|_2^2 = \left( (1 + Mk)^{\frac{1}{kM}} \right)^{n k M} \left\| F u_h^0 \right\|_2^2 \le e^{M t_n} \left\| F u_h^0 \right\|_2^2 ,
\]
where \( t_n = nk \) and F denotes the Fourier transform.
Theorem 2.8.2 Assume:
\[
\frac{\Gamma}{Pe} \le \frac{1}{2} .
\]
Then the solution of the FC scheme satisfies the maximum principle:
\[
\max_j u_{h,j}^{n+1} \le \max_j u_{h,j}^n
\]
for all \( n \ge 1 \), if and only if \( h \le \frac{2}{Pe} \).

Proof Let \( h \le \frac{2}{Pe} \); then:
\[
u_{h,j}^{n+1} = \frac{\Gamma}{Pe} \left( 1 + \frac{h Pe}{2} \right) u_{h,j-1}^n + \left( 1 - 2 \frac{\Gamma}{Pe} \right) u_{h,j}^n + \frac{\Gamma}{Pe} \left( 1 - \frac{h Pe}{2} \right) u_{h,j+1}^n \le \max_j u_{h,j}^n ,
\]
because \( h \le \frac{2}{Pe} \) and \( \frac{\Gamma}{Pe} \le \frac{1}{2} \) make all three coefficients nonnegative, with sum equal to one.
Now, assume that
\[
\max_j u_{h,j}^{n+1} \le \max_j u_{h,j}^n .
\]
Aiming towards a contradiction, assume that \( h > \frac{2}{Pe} \). Then, choose the initial data as follows:
\[
u_{h,j}^0 = \begin{cases} 1, & j = 0, 1 \\ 0, & j > 1 \end{cases}
\]
so that, since \( \frac{h Pe}{2} > 1 \),
\[
u_{h,1}^1 = \frac{\Gamma}{Pe} \left( 1 + \frac{h Pe}{2} \right) + 1 - 2 \frac{\Gamma}{Pe} > \frac{\Gamma}{Pe} \cdot 2 + 1 - 2 \frac{\Gamma}{Pe} = 1,
\]
and this gives:
\[
\max_j u_{h,j}^1 > 1 = \max_j u_{h,j}^0 ,
\]
so, by contradiction, the theorem holds.
Since violation of the maximum principle leads to unphysical solutions, we must choose \( h \le 2/Pe \); so, if the Peclet number Pe is very large, then we need to use a very small h, of the order of \( Pe^{-1} \). This corresponds to problems with boundary layers that need to be resolved by h. However, it is known from the properties of the corresponding boundary value problem that the thickness of such boundary layers is of the order of \( Pe^{-1/2} \), i.e. it can be resolved by h of the order of \( Pe^{-1/2} \). So, this scheme requires the use of a spatial step h much smaller than what is needed to resolve the actual solution. The next scheme cures this problem at the expense of some loss of accuracy.
2.8.2 Upwinding Scheme (FB)
\[
\frac{u_{h,j}^{n+1} - u_{h,j}^n}{k} + \frac{u_{h,j}^n - u_{h,j-1}^n}{h} = \frac{1}{Pe} \frac{u_{h,j-1}^n - 2 u_{h,j}^n + u_{h,j+1}^n}{h^2}
\]
Theorem 2.8.3 Assume that \( \frac{\Gamma}{Pe} \le \frac{1}{2} \). Then the FB scheme satisfies a maximum principle provided that:
\[
2 \frac{\Gamma}{Pe} + C \le 1, \qquad C = \frac{k}{h} .
\]
The proof is left as an exercise. This theorem implies that the maximum principle is satisfied if \( k \le (Pe\, h^2)/(2 + Pe\, h) \). Note that in the limit \( Pe \to \infty \), this restriction on k tends to \( k \le h \). On the other hand, the stability condition \( k \le Pe\, h^2 / 2 \) implies that if \( Pe \to \infty \) but \( Pe\, h^2 \to \text{const} \) (i.e. \( h = O(Pe^{-1/2}) \)), then k is restricted by a constant. In conclusion, if the Peclet number Pe is very large, the scheme guarantees that the solution satisfies a maximum principle, similar to the exact solution, and is stable if \( k \le h \); but k can be of the order of h, and this is true even if \( h = O(Pe^{-1/2}) \), which is sufficient to resolve boundary layers.
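A sketch of the FB update in the nondimensional variables (parameters are placeholders, chosen so that \( C + 2\Gamma/Pe \le 1 \) and \( h \sim Pe^{-1/2} \)):

```python
import numpy as np

# One step of the FB (upwind) scheme for u_t + u_x = (1/Pe) u_xx:
# u_j^{n+1} = u_j^n - C (u_j^n - u_{j-1}^n)
#             + (Gamma/Pe) (u_{j-1}^n - 2 u_j^n + u_{j+1}^n)
def fb_step(u, C, gamma_over_pe):
    u_new = u.copy()
    u_new[1:-1] = (u[1:-1]
                   - C * (u[1:-1] - u[:-2])
                   + gamma_over_pe * (u[:-2] - 2 * u[1:-1] + u[2:]))
    return u_new

Pe, h = 1.0e3, 0.03              # h ~ Pe^{-1/2}: enough to resolve the layer
k = 0.5 * h                      # then C + 2*Gamma/Pe <= 1 holds here
C, gamma = k / h, k / h**2
u = np.where(np.arange(0.0, 1.0 + h, h) < 0.3, 1.0, 0.0)  # illustrative data
for n in range(50):
    u = fb_step(u, C, gamma / Pe)
```

The maxima of u stay bounded by the initial maximum, in line with Theorem 2.8.3, whereas the FC scheme with the same h would violate the maximum principle, since here \( h > 2/Pe \).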
The scheme has a consistency error
\[
\tau_{k,h}(t, x) = O(k + h)
\]
as
\[
\delta_t^+ u_j^n = \frac{\partial u}{\partial t}(t_n, x_j) + O(k),
\]
\[
\delta_x^- u_j^n = \frac{\partial u}{\partial x}(t_n, x_j) - \frac{h}{2} \frac{\partial^2 u}{\partial x^2}(t_n, x_j) + O\left( h^2 \right),
\]
and
\[
\delta_x^2 u_j^n = \frac{\partial^2 u}{\partial x^2}(t_n, x_j) + O\left( h^2 \right).
\]
Combining these results we have:
\[
\delta_t^+ u_j^n + \delta_x^- u_j^n - \frac{1}{Pe} \delta_x^2 u_j^n = \frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} - \left( \frac{1}{Pe} + \frac{h}{2} \right) \frac{\partial^2 u}{\partial x^2} + O\left( k + h^2 \right),
\]
i.e. up to terms that are \( O(k + h^2) \), the scheme is consistent with the equation
\[
\frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} - \left( \frac{1}{Pe} + \frac{h}{2} \right) \frac{\partial^2 u}{\partial x^2} = 0.
\]
This means that the scheme adds an extra diffusion term that is of the order of h. If the leading order term in the spatial error is proportional to a derivative of even order, we call the scheme dissipative, since such schemes usually dissipate sharp changes (large gradients) of the solution. If the leading order term of the error is proportional to a derivative of odd order, the scheme is called dispersive. This is because odd derivatives in PDEs lead to the so-called dispersion in the solution. In terms of a Fourier decomposition of the solution, this effect is manifested in the fact that, due to the presence of odd derivatives, different Fourier modes travel with different speeds.
Chapter 3
Introduction to Finite Elements
3.1 Weighted Residual Methods
3.1.1 Sobolev spaces
Sobolev space:
\[
H^m(\Omega) \equiv \left\{ v : \Omega \to \mathbb{R} \;:\; \int_\Omega \left( v^2 + \left( v^{(1)} \right)^2 + \ldots + \left( v^{(m)} \right)^2 \right) < \infty \right\} .
\]
Then
\[
H^0(\Omega) = L^2(\Omega),
\]
and \( H^m(\Omega) \) is a Hilbert space with:
\[
(u, v)_m = \int_\Omega \left( uv + u^{(1)} v^{(1)} + \ldots + u^{(m)} v^{(m)} \right) dx
\]
and
\[
\|u\|_m = \sqrt{ \int_\Omega \left( u^2 + \ldots + \left( u^{(m)} \right)^2 \right) dx } .
\]
Proposition 3.1.1
1. \( H^{m+1}(\Omega) \subset H^m(\Omega) \).
2. If \( v \in H^m(\Omega) \) then:
\[
\|v\|_m^2 = \|v\|_0^2 + \left\| v^{(1)} \right\|_0^2 + \ldots + \left\| v^{(m)} \right\|_0^2 .
\]
3. \( \|v\|_{m+1} \ge \|v\|_m \) and \( \|v\|_{m+1} \ge \left\| v^{(1)} \right\|_m \).
4. If \( v, w \in H^m(\Omega) \) then \( |(v, w)_m| \le \|v\|_m \|w\|_m \); this is the Cauchy-Schwarz inequality.
5. If \( v, w \in H^m(\Omega) \) then
\[
\|v + w\|_m \le \|v\|_m + \|w\|_m .
\]
Proof We will prove the Cauchy-Schwarz inequality
\[
|(u, v)_m| \le \|u\|_m \|v\|_m
\]
for \( u, v \in H^m(\Omega) \). For any \( s \in \mathbb{R} \) we have
\[
0 \le (u - sv, u - sv)_m = \|u\|_m^2 + s^2 \|v\|_m^2 - 2s\, (u, v)_m .
\]
Choose
\[
s = \frac{(u, v)_m}{\|v\|_m^2}
\]
so that:
\[
0 \le (u - sv, u - sv)_m = \|u\|_m^2 + \frac{(u, v)_m^2}{\|v\|_m^2} - 2 \frac{(u, v)_m^2}{\|v\|_m^2} = \|u\|_m^2 - \frac{(u, v)_m^2}{\|v\|_m^2}
\]
and, upon rearranging, we have:
\[
(u, v)_m \le \|u\|_m \|v\|_m .
\]
3.1.2 Weighted Residual Formulations
Consider
\[
Lu = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega,
\]
where
\[
L = -\frac{d^2}{dx^2},
\]
so that:
\[
-\frac{d^2 u}{dx^2} = f \quad \text{in } \Omega = (0, 1), \qquad u = 0 \quad \text{on } x = 0, 1.
\]
Define the "trial function" space; for example it can be chosen to be
\[
U = \left\{ u : u \in H^2(\Omega),\; u = 0 \text{ on } \partial\Omega \right\},
\]
and we introduce the "weight function" (also called test) space; for this choice of U it can be chosen to be
\[
W = \left\{ w : w \in L^2(\Omega),\; w = 0 \text{ on } \partial\Omega \right\}.
\]
Now, find \( u \in U \) such that
\[
\int_\Omega \left( \frac{d^2 u}{dx^2} + f \right) w \, d\Omega = 0, \qquad f \in L^2(\Omega),
\]
for all w ∈ W . This is one “weighted residual formulation” of the original problem. This particular formulation
is also usually called a strong formulation. It still defines a continuous problem and in order to discretize it
we need to discretize the corresponding functional spaces i.e. to define appropriate approximations for each
function in them. We select a subset \( U_h \) of U:
\[
U_h \subset U, \qquad U_h = \mathrm{span}\, \{\phi_0, \ldots, \phi_{n-1}\},
\]
and \( W_h \), a subset of W:
\[
W_h \subset W, \qquad W_h = \mathrm{span}\, \{\psi_0, \ldots, \psi_{n-1}\},
\]
and then search for the discrete approximation \( u_h \) in the form:
\[
u_h = \sum_{i=0}^{n-1} c_i \phi_i
\]
so that:
\[
\int_\Omega \left( \frac{d^2}{dx^2} \sum_{i=0}^{n-1} c_i \phi_i - f \right) \psi_j \, d\Omega = 0
\quad \Rightarrow \quad
\sum_{i=0}^{n-1} c_i \int_\Omega \frac{d^2 \phi_i}{dx^2} \psi_j \, d\Omega = \int_\Omega f \psi_j \, d\Omega,
\qquad j = 0, 1, \ldots, n-1.
\]
This is a linear algebraic system \( Lc = F \) where:
\[
L_{ij} = \int_\Omega \frac{d^2 \phi_i}{dx^2} \psi_j \, d\Omega, \qquad F_j = \int_\Omega f \psi_j \, d\Omega .
\]
3.1.3 Collocation Methods
Let
\[
\psi_j(x) = \delta(x - x_j)
\]
so that \( L_{ij} = \frac{d^2 \phi_i}{dx^2}(x_j) \) and \( F_j = f(x_j) \). Note that the \( \psi_j \) are not in \( L^2 \); however, the formulation still makes sense if \( \phi_i \in C^2 \), because then the integrals are well defined. Next, we may approximate \( \frac{d^2}{dx^2} \sim \delta_x^2 \), and this gives us a finite difference method; or we may select \( U_h \) as the space spanned by certain sets of orthogonal polynomials (Legendre, Chebyshev, etc.), and this gives us spectral collocation.
3.2 Weak Methods
By far, the most popular weighted residual methods are based on the so-called weak formulation of the classical problem. To obtain it we start from
\[
\int_\Omega \left( \frac{d^2 u}{dx^2} + f \right) w \, d\Omega = 0 \qquad \forall\, w \in W
\]
and choose the test function space to be
\[
W = \left\{ w : w \in H_0^1(\Omega) \right\}.
\]
Here we define
\[
H_0^1(\Omega) = \left\{ u \in H^1(\Omega),\; u = 0 \text{ on } \partial\Omega \right\}.
\]
Then, integrating by parts,
\[
- \int_\Omega \frac{du}{dx} \frac{dw}{dx} + \underbrace{\int_{\partial\Omega} \frac{du}{dx} w \, ds}_{=0} = - \int_\Omega f w \, d\Omega .
\]
Then the problem is reformulated as: Find \( u \in U \) such that:
\[
\int_\Omega \frac{du}{dx} \frac{dw}{dx} = \int_\Omega f w \, d\Omega .
\]
A natural choice for U which makes all integrals well defined and incorporates the Dirichlet boundary conditions
in the solution is U = W . Such formulation is called a Galerkin formulation. The space U can be discretized by
means of piecewise polynomial functions, orthogonal polynomials, trigonometric polynomials, spline functions
etc. These choices yield various weak methods. In the remainder of the notes we will focus on the piecewise
polynomial approximations which yield the so called finite element methods.
Before we proceed with the discretization we shall prove some important results for the continuous Galerkin
formulation: Find \( u \in U = H_0^1(\Omega) \) such that
\[
\int_\Omega u' v' \, d\Omega = \int_\Omega f v \, d\Omega, \qquad \forall\, v \in U.
\]
Theorem 3.2.1 If u is a solution of the classical formulation
\[
-u'' = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega,
\]
then u is a solution of the Galerkin formulation. If u is a solution to the Galerkin Formulation then it is a
solution to the classical formulation if u ∈ C 2 ([0, 1]) and f ∈ C 0 ([0, 1]).
Proof Suppose that u solves the differential formulation:
\[
-u'' = f,
\]
then take \( v \in U \) so that:
\[
- \int_\Omega u'' v \, d\Omega = \int_\Omega f v \, d\Omega
\quad \Rightarrow \quad
\int_\Omega u' v' \, d\Omega = \int_\Omega f v \, d\Omega,
\]
i.e.
\[
(u', v')_0 = (f, v)_0 .
\]
Suppose that u solves the Galerkin formulation, so that:
\[
(u', v') = (f, v).
\]
Then
\[
\int_\Omega (u' v' - f v) \, d\Omega = 0
\quad \Rightarrow \quad
\int_\Omega (u'' + f)\, v \, d\Omega = 0 .
\]
Assume towards contradiction that \( u'' + f \neq 0 \) at some point \( x \in \Omega \). However, \( u'' \in C^0(\Omega) \) and \( f \in C^0(\Omega) \). Therefore,
\[
u'' + f \in C^0(\Omega) \implies u'' + f \neq 0
\]
in an entire open interval contained in Ω. Therefore, there exists \( (x_0, x_1) \subset \Omega \) such that \( u'' + f > 0 \) for all \( x \in (x_0, x_1) \) or \( u'' + f < 0 \) for all \( x \in (x_0, x_1) \). We consider the first possibility; the second one can be considered in exactly the same way. Let us choose
\[
v = \begin{cases} -(x - x_0)(x - x_1) & \text{in } (x_0, x_1) \\ 0 & \text{otherwise.} \end{cases}
\]
For this choice it is clear that \( v \in H_0^1(\Omega) \) and \( v \ge 0 \) in Ω. Now, since \( u'' + f > 0 \) in \( (x_0, x_1) \), then
\[
(u'' + f, v) > 0,
\]
but this contradicts the fact that u is a solution to \( (u'' + f, v) = 0 \) for all \( v \in U \). Therefore, \( u'' + f = 0 \).
Example Consider the deformation of a rod from its equilibrium position under a given load f.
[Figure: a rod deformed by a distributed load f(x), with deflection u.]
Then
\[
E(u) = \frac{1}{2} (u', u')_0 - (f, u)_0
\]
is the total energy of the system.
Therefore, E is called the energy functional of any system described by a second order elliptic differential
equation.
Theorem 3.2.2 \( u \in H_0^1(\Omega) \) is a solution to the Galerkin formulation if and only if u minimizes \( E(v) \) over \( H_0^1(\Omega) \). That is, \( E(u) \le E(v) \) for all \( v \in H_0^1(\Omega) \).
Proof
1. Suppose that u is a solution to the Galerkin formulation, so that \( (u', v')_0 = (f, v)_0 \) for all \( v \in H_0^1(\Omega) \). Writing \( v = u + w \), we have:
\[
E(v) = E(u + w) = \frac{1}{2} \left( (u + w)', (u + w)' \right)_0 - (f, u + w)_0
= \frac{1}{2} (u', u')_0 - (f, u)_0 + \frac{1}{2} (w', w')_0 - (f, w)_0 + (u', w')_0 .
\]
Since
\[
(u', w')_0 = (f, w)_0 ,
\]
we have
\[
E(v) = E(u) + \frac{1}{2} \|w'\|_0^2 \ge E(u).
\]
2. Suppose that \( E(u) \le E(v) \) for all \( v \in H_0^1(\Omega) \). Then, if we take \( s \in \mathbb{R} \), \( E(u + sw) \) has a minimum at \( s = 0 \), and therefore:
\[
\left. \frac{d}{ds} E(u + sw) \right|_{s=0} = 0,
\]
which we expand as:
\[
\left. \frac{d}{ds} \left[ \frac{1}{2} \left( (u + sw)', (u + sw)' \right)_0 - (f, u + sw)_0 \right] \right|_{s=0}
= \left. \frac{d}{ds} \left[ E(u) + s (u', w')_0 - s (f, w)_0 + \frac{s^2}{2} (w', w') \right] \right|_{s=0}
= \left. \left[ (u', w')_0 - (f, w)_0 + s (w', w') \right] \right|_{s=0} = 0 .
\]
So that:
\[
(u', w')_0 = (f, w)_0
\]
for all \( w \in H_0^1(\Omega) \). Therefore, u solves the Galerkin formulation.
Consider a Hilbert space V equipped with the product \( (\cdot, \cdot)_V \), i.e. V is complete with respect to the norm \( \|\cdot\|_V \) induced by this product. Now, consider the mapping \( a : V \times V \to \mathbb{R} \) (further called a bilinear form) such that:
1. \( a(\alpha u + \beta v, w) = \alpha\, a(u, w) + \beta\, a(v, w) \),
2. \( a(w, \alpha u + \beta v) = \alpha\, a(w, u) + \beta\, a(w, v) \),
3. there exists \( \beta > 0 \) such that \( |a(u, v)| \le \beta \|u\|_V \|v\|_V \) (such a bilinear form is called bounded w.r.t. \( \|\cdot\|_V \)), and
4. there exists \( \rho > 0 \) such that \( a(u, u) \ge \rho \|u\|_V^2 \) (such a bilinear form is called coercive w.r.t. \( \|\cdot\|_V \)).
Now, suppose that G (v) is a functional such that:
1. G (αu + βv) = αG (u) + βG (v) (such a functional is called linear) and
2. there exists δ > 0 such that |G (u)| ≤ δ ||u||V (such functional is called bounded).
The following theorem is one version of a very well known result from functional analysis.
Theorem 3.2.3 (Riesz Theorem) If V is a Hilbert space with a given inner product (., .)∗ and if G (v) is a
bounded linear functional on V (w.r.t. the norm induced by its inner product), then there is a unique element
û ∈ V such that (û, v)∗ = G (v) for all v ∈ V .
This theorem almost directly yields the proof of a somewhat restricted version of the following fundamental
result in PDE theory and their numerical analysis.
Lemma 3.2.4 (Lax-Milgram) If a (u, v) and G (v) satisfy the six conditions stated above then there exists a
unique solution û ∈ V for the following problem: Find u ∈ V s.t.
a (u, v) = G (v)
∀v ∈ V
If a (u, v) = a (v, u) (symmetric form) then û is the minimizer of
\[
E(v) = \frac{1}{2} a(v, v) - G(v).
\]
Proof We shall consider only the case where a (u, v) = a (v, u).
• The proof of the first claim is a direct consequence of the Riesz theorem if we prove that \( a(u, v) \) defines an inner product on V. We know that:
\[
a(u, u) \ge \rho \|u\|_V^2 ,
\]
which guarantees that
\[
\sqrt{a(u, u)} = \|u\|_a
\]
is a norm on V. It is not difficult then to verify that \( a(u, v) \) defines an inner product on V. Also, it is straightforward to show that the norm \( \|\cdot\|_a \) is equivalent to \( \|\cdot\|_V \), i.e.:
\[
\sqrt{\rho}\, \|u\|_V \le \|u\|_a \le \sqrt{\beta}\, \|u\|_V .
\]
Therefore, since V is complete with respect to \( \|\cdot\|_V \), it is complete with respect to the norm \( \|\cdot\|_a \), i.e. V is a Hilbert space with respect to the product \( a(\cdot, \cdot) \). As \( G(v) \) is bounded with respect to \( \|\cdot\|_V \), \( G(v) \) is bounded with respect to \( \|\cdot\|_a \). Since \( a(\cdot, \cdot) \) is an inner product on V and \( G(v) \) is bounded w.r.t. the norm induced by this inner product, the Riesz theorem implies the first claim of the lemma.
• Next we show that û minimizes \( E(v) = \frac{1}{2} a(v, v) - G(v) \). Writing \( v = \hat{u} + w \),
\[
E(v) = E(\hat{u} + w) = E(\hat{u}) + \underbrace{\tfrac{1}{2}\, a(w, w)}_{\ge 0} \ge E(\hat{u}).
\]
Example
\[
-\nabla^2 u = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega.
\]
Its Galerkin formulation reads: Find \( u \in H_0^1(\Omega) \) s.t.:
\[
(\nabla u, \nabla v)_0 = (f, v)_0 \qquad \forall\, v \in H_0^1(\Omega). \tag{3.1}
\]
Below we prove the so-called Poincaré inequality which will guarantee that a(u, v) = (∇u, ∇v)0 defines a bilinear
form satisfying the conditions of the Lax-Milgram lemma.
Lemma 3.2.5 (Poincaré Inequality) If Ω is a bounded domain and \( v \in H_0^1(\Omega) \), then \( \exists\, C > 0 \), depending only on the domain Ω, such that
\[
\|v\|_0^2 \le C \|\nabla v\|_0^2 .
\]
Proof Consider first \( v \in C_0^1(\Omega) \), i.e. v is a continuously differentiable function defined on Ω that vanishes on its boundary. We can always find \( a \in \mathbb{R} \) large enough so that the cube \( Q = \{ x \in \mathbb{R}^n : |x_j| < a,\ 1 \le j \le n \} \) contains Ω. Integrating by parts in the \( x_1 \) direction and taking into account that the surface integral vanishes since \( v = 0 \) on ∂Ω, we obtain:
\[
\|v\|_0^2 = \int_\Omega v^2 \, dx = \int_\Omega 1 \cdot v^2 \, dx = - \int_\Omega x_1 \frac{\partial v^2}{\partial x_1} \, dx
= -2 \int_\Omega x_1 v \frac{\partial v}{\partial x_1} \, dx \le 2a \int_\Omega |v| \left| \frac{\partial v}{\partial x_1} \right| dx .
\]
Using the Cauchy-Schwarz inequality for the \( L^2 \) inner product on Ω, we obtain:
\[
\|v\|_0^2 \le 2a \int_\Omega |v| \left| \frac{\partial v}{\partial x_1} \right| dx \le 2a \|v\|_0 \left\| \frac{\partial v}{\partial x_1} \right\|_0 \le 2a \|v\|_0 \|\nabla v\|_0 .
\]
Dividing by \( \|v\|_0 \) gives the desired result with \( C = (2a)^2 \) for \( v \in C_0^1(\Omega) \). Due to some classical results in functional analysis (see for example Ern, A. and Guermond, J.-L., Theory and Practice of Finite Elements, Applied Mathematical Sciences, v. 159, Springer, 2004, p. 485), we can claim that for each \( v \in H_0^1(\Omega) \) we can choose a sequence of functions \( \{v_k\}_{k=1}^\infty \subset C_0^1(\Omega) \) that converges to v in the \( H^1(\Omega) \) norm, i.e. \( \|v - v_k\|_0 \le \|v - v_k\|_1 \to 0 \) as \( k \to \infty \), and similarly \( \|\nabla v - \nabla v_k\|_0 \le \|v - v_k\|_1 \to 0 \). Using the triangle inequality in the form \( \|u - w\|_0 \ge \|u\|_0 - \|w\|_0 \) yields that \( \|v_k\|_0 \to \|v\|_0 \) and \( \|\nabla v_k\|_0 \to \|\nabla v\|_0 \). This allows us to take the limit in the Poincaré inequality for \( v_k \in C_0^1(\Omega) \) to derive the inequality for any function in \( H_0^1(\Omega) \).
Using this inequality we easily prove the following proposition.
Proposition 3.2.6 \( a(u, v) = (\nabla u, \nabla v)_0 \) is a bilinear, bounded, and coercive form on \( H_0^1(\Omega) \); \( G(v) = (f, v)_0 \) is bounded on \( H_0^1(\Omega) \) if \( \|f\|_0 < \infty \).
Proof
\[
|a(u, v)| = \left| \int_\Omega \nabla u \nabla v \, dx \right| \le \|\nabla u\|_0 \|\nabla v\|_0 \le \|u\|_1 \|v\|_1 ,
\]
as \( \|u\|_1^2 = \|\nabla u\|_0^2 + \|u\|_0^2 \ge \|\nabla u\|_0^2 \). Then, using the Poincaré inequality,
\[
a(u, u) = (\nabla u, \nabla u) = \|\nabla u\|_0^2 = \frac{1}{2} \|\nabla u\|_0^2 + \frac{1}{2} \|\nabla u\|_0^2 \ge \frac{1}{2C} \|u\|_0^2 + \frac{1}{2} \|\nabla u\|_0^2 \ge \rho \|u\|_1^2
\]
with \( \rho = \min\left( \frac{1}{2C}, \frac{1}{2} \right) \).
The last proposition guarantees that the bilinear form \( a(u, v) = \int_\Omega \nabla u \nabla v \, dx \) and the functional \( G(v) = (f, v)_0 \) satisfy the conditions of the Lax-Milgram lemma if \( \|f\|_0 < \infty \). This automatically guarantees that the corresponding weak formulation (3.1) has a unique solution in \( H_0^1(\Omega) \).
3.3 Finite Element Method (FEM)
Let us define a grid:
\[
\Delta = \{x_0, \ldots, x_M\}
\]
and the following linear space of continuous piecewise linear functions on Δ:
\[
M_0^1(\Delta) = \mathrm{span}\, \{l_0, l_1, \ldots, l_M\} \subset H^1(\Omega),
\]
where
\[
l_i = \begin{cases}
\dfrac{x - x_{i-1}}{x_i - x_{i-1}}, & x_{i-1} \le x \le x_i \\[4pt]
\dfrac{x_{i+1} - x}{x_{i+1} - x_i}, & x_i \le x \le x_{i+1} \\[4pt]
0 & \text{otherwise.}
\end{cases}
\]
Define the discrete space:
\[
V_h = \left\{ v \in M_0^1(\Delta),\; v(0) = v(1) = 0 \right\} \subset H_0^1(\Omega),
\]
which is equivalent to
\[
V_h = \mathrm{span}\, \{l_1, \ldots, l_{M-1}\}.
\]
Let us now consider the following discrete problem: Find uh ∈ Vh such that:
(u′h , vh′ )0 = (f, vh )0
for all \( v_h \in V_h \). It is clear that:
\[
u_h = \sum_{j=1}^{M-1} u_j l_j
\]
and
\[
(u_h', l_i')_0 = (f, l_i)_0, \qquad 1 \le i \le M - 1,
\]
so that:
\[
\left( \sum_{j=1}^{M-1} u_j l_j',\; l_i' \right)_0 = (f, l_i)_0
\quad \Rightarrow \quad
\sum_{j=1}^{M-1} u_j \left( l_j', l_i' \right)_0 = (f, l_i)_0, \qquad 1 \le i \le M - 1.
\]
The last set of equations clearly constitutes a linear algebraic system:
\[
A u_h = F
\]
where \( A_{ij} = \left( l_i', l_j' \right)_0 \), \( F_i = (f, l_i)_0 \), and \( u_{h,i} = u_i \). Now, as
\[
0 \le (v_h', v_h')_0 = \left( \sum_{i=1}^{M-1} v_i l_i',\; \sum_{j=1}^{M-1} v_j l_j' \right)_0 = \sum_{i=1}^{M-1} \sum_{j=1}^{M-1} v_i \left( l_i', l_j' \right)_0 v_j = v^T A v,
\]
the Lax-Milgram lemma guarantees that this system has a unique solution if \( f(x) \in L^2(\Omega) \).
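As an illustration (not from the notes), here is a minimal assembly of \( A u_h = F \) on a uniform grid: for hat functions, \( (l_i', l_j')_0 \) gives the familiar 1/h tridiagonal stencil, and the load is approximated by \( F_i \approx h f(x_i) \) (since \( \int_\Omega l_i = h \)):

```python
import numpy as np

# Assemble and solve A u = F for -u'' = f, u(0) = u(1) = 0,
# with piecewise linear elements on a uniform grid of M subintervals.
def fem_1d(f, M):
    h = 1.0 / M
    x = np.linspace(0.0, 1.0, M + 1)
    m = M - 1                                     # interior nodes
    A = (1.0 / h) * (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))
    F = h * f(x[1:-1])                            # (f, l_i)_0 ~ h f(x_i)
    u = np.zeros(M + 1)
    u[1:-1] = np.linalg.solve(A, F)               # A is spd, hence invertible
    return x, u

# Example: f = pi^2 sin(pi x), whose exact solution is u = sin(pi x)
x, u = fem_1d(lambda s: np.pi**2 * np.sin(np.pi * s), 64)
```

In practice one would use a banded or sparse solver rather than a dense one; the dense form is kept here only for brevity.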
3.4 Gaussian Quadrature
We may approximate the integral of a function \( \phi(x) \) over the range \( x \in (-1, 1) \) using Gaussian integration:
\[
\int_{-1}^{1} \phi(x) \, dx \approx \sum_{i=1}^{n} A_i^{(n)} \phi\left( x_i^{(n)} \right),
\]
where \( A_i^{(n)} \) are the weights and \( x_i^{(n)} \) are the nodes of the quadrature, and n is a positive integer that controls the accuracy. The weights and nodes are determined from the condition that the quadrature is exact for all polynomials of the highest possible degree. We have 2n undetermined coefficients (n weights and n nodes) and so we can make the quadrature exact for polynomials of degree \( 2n - 1 \) (which have 2n coefficients).
The following table contains the weights and Gauss points for the first four Gauss quadratures:

n | \( A_i^{(n)} \) | \( x_i^{(n)} \) | exact degree \( 2n - 1 \)
1 | 2 | 0 | 1
2 | 1, 1 | \( -\frac{1}{\sqrt{3}},\; \frac{1}{\sqrt{3}} \) | 3
3 | \( \frac{5}{9},\; \frac{8}{9},\; \frac{5}{9} \) | \( -\sqrt{\frac{3}{5}},\; 0,\; \sqrt{\frac{3}{5}} \) | 5
4 | 0.347854, 0.652145, 0.652145, 0.347854 | −0.861136, −0.339981, 0.339981, 0.861136 | 7
Remark We may approximate the integral over arbitrary bounds a and b using Gaussian quadrature by applying the substitution
[Figure: the linear map y(x) taking the interval [a, b] onto [−1, 1].]
\[
y(x) = \frac{2}{b - a}\, x + \frac{a + b}{a - b},
\]
so that \( y(a) = -1 \), \( y(b) = 1 \), and \( dx = \frac{b - a}{2}\, dy \). Therefore:
\[
\int_a^b \phi(x) \, dx = \frac{b - a}{2} \int_{-1}^{1} \phi\left( \frac{(b - a) y + a + b}{2} \right) dy .
\]
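A sketch of this change of variables in Python, using numpy's Gauss-Legendre nodes and weights:

```python
import numpy as np

# Gaussian quadrature on [a, b] via the map x = ((b - a) y + a + b) / 2.
def gauss_quad(phi, a, b, n):
    y, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    x = 0.5 * ((b - a) * y + a + b)
    return 0.5 * (b - a) * np.sum(w * phi(x))

# An n-point rule is exact for polynomials of degree 2n - 1:
print(gauss_quad(lambda x: x**3, 0.0, 2.0, 2))  # exactly 4.0
```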
3.5 Error Estimates
Theorem 3.5.1 If \( u_h \) is the finite element approximation to the exact solution u of the boundary value problem
\[
-u'' = f, \qquad u(0) = u(1) = 0,
\]
then there exists \( k > 0 \) such that:
\[
\|u - u_h\|_1 \le k \|u - v_h\|_1
\]
for all \( v_h \in V_h \).
Proof u is the exact solution, so that
\[
a(u, w_h) = (f, w_h)
\]
for all \( w_h \in V_h \), and
\[
a(u_h, w_h) = (f, w_h),
\]
so that
\[
a(\underbrace{u - u_h}_{\epsilon_h}, w_h) = 0
\quad \Rightarrow \quad
a(\epsilon_h, w_h) = 0 .
\]
Then, using coercivity (with \( \rho = \frac{1}{2} \)) and the fact that \( a(\epsilon_h, w_h) = 0 \),
\[
\frac{1}{2} \|\epsilon_h\|_1^2 \le a(\epsilon_h, \epsilon_h) = a(\epsilon_h, \epsilon_h) + a(\epsilon_h, w_h) = a(\epsilon_h, \epsilon_h + w_h) = a(\epsilon_h, u - u_h + w_h) = a(\epsilon_h, u - v_h) \le \|\epsilon_h\|_1 \|u - v_h\|_1 ,
\]
where \( v_h = u_h - w_h \in V_h \). So that
\[
\|\epsilon_h\|_1 \le 2 \|u - v_h\|_1 .
\]
Corollary 3.5.2 If u and \( u_h \) are as in the theorem, then we have
\[
\|u - u_h\|_1 \le c h
\]
where \( c > 0 \) is independent of h.

Proof For the time being we will assume that \( u \in C^2(\Omega) \), i.e. u is a solution to the classical problem for \( f \in C^0(\Omega) \). Let û be the piecewise linear interpolant of u in \( V_h \), i.e. \( \hat{u} = \sum_{i=1}^{M-1} u_i l_i \), where \( u_i \) are the values of u at the nodes \( x_i \) of the grid. Using Taylor expansion it is possible to prove the following estimate for the interpolation error:
\[
\|u' - \hat{u}'\|_{L^\infty} \le \|u''\|_{L^\infty} h .
\]
Recall that:
\[
\|u - u_h\|_1^2 \le 4 \|u - v\|_1^2
\]
for all \( v \in V_h \), and as \( \hat{u} \in V_h \) we have:
\[
\|u - u_h\|_1^2 \le 4 \|u - \hat{u}\|_1^2 = 4 \left( \|u - \hat{u}\|_0^2 + \|u' - \hat{u}'\|_0^2 \right) \le 8 \|u' - \hat{u}'\|_0^2 ,
\]
where, in the last step, we applied the Poincaré inequality. Continuing, we have:
\[
\|u - u_h\|_1^2 \le 8 \int_0^1 (u' - \hat{u}')^2 \, dx \le 8 \|u' - \hat{u}'\|_{L^\infty}^2 \le 8 \|u''\|_{L^\infty}^2 h^2 ,
\]
where we have used the interpolation estimate \( \|u' - \hat{u}'\|_{L^\infty}^2 \le \|u''\|_{L^\infty}^2 h^2 \). This yields
\[
\|u - u_h\|_1 \le \sqrt{8}\, \|u''\|_{L^\infty} h = O(h).
\]
And, applying Poincaré's inequality, we have:
\[
\|u - u_h\|_1^2 = \|u - u_h\|_0^2 + \|u' - u_h'\|_0^2 \ge 2 \|u - u_h\|_0^2 ,
\]
so that:
\[
\|u - u_h\|_0 \le c h \|u''\|_{L^\infty} .
\]
This is a suboptimal estimate because from interpolation theory we know that u can be approximated with a
piecewise linear function with a second order accuracy in the L2 norm.
3.6 Optimal Error Estimates
Lemma 3.6.1 If û is the finite element interpolant of \( u \in H^2(\Omega) \), i.e. \( \hat{u} = \sum_{j=1}^{M-1} u(x_j) l_j \), then:
1. \( \|u - \hat{u}\|_0 \le \left( \frac{h}{\pi} \right)^2 \|u''\|_0 \)
2. \( \|u' - \hat{u}'\|_0 \le \frac{h}{\pi} \|u''\|_0 \)
Proof The domain is subdivided into elements at the points:
\[
x_0 = 0, x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_{M-1}, x_M = 1.
\]
Consider the difference \( u - \hat{u} \) in \( [0, h] = [x_0, x_1] \) and define \( \eta(x) = u - \hat{u} \in H^1([0, h]) \). Clearly,
\[
\eta(0) = \eta(h) = 0.
\]
Since \( \eta \in C^0[0, h] \), it can be expanded in a uniformly convergent Fourier sine series:
\[
\eta(x) = \sum_{n=1}^{\infty} \eta_n \sin \frac{n \pi x}{h} .
\]
Then, using the Parseval identity,
\[
\int_0^h \eta^2 \, dx = \|\eta\|_{L^2[0,h]}^2 = \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2 .
\]
Differentiating term-wise we get:
\[
\eta'(x) = \sum_{n=1}^{\infty} \eta_n \frac{n\pi}{h} \cos \frac{n \pi x}{h}
\quad \Rightarrow \quad
\|\eta'\|_{L^2[0,h]}^2 = \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2 \left( \frac{n\pi}{h} \right)^2 ,
\]
and differentiating again we have:
\[
\eta''(x) = - \sum_{n=1}^{\infty} \eta_n \frac{n^2 \pi^2}{h^2} \sin \frac{n \pi x}{h}
\quad \Rightarrow \quad
\|\eta''\|_{L^2[0,h]}^2 = \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2 \left( \frac{n\pi}{h} \right)^4 .
\]
So that
\[
\|\eta'\|_{L^2[0,h]}^2 = \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2 \left( \frac{n\pi}{h} \right)^4 \frac{h^2}{n^2 \pi^2}
\le \frac{h^2}{\pi^2}\, \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2 \left( \frac{n\pi}{h} \right)^4
= \frac{h^2}{\pi^2} \|\eta''\|_{L^2[0,h]}^2 = \frac{h^2}{\pi^2} \|u''\|_{L^2[0,h]}^2
\]
and
\[
\|\eta\|_{L^2[0,h]}^2 = \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2
\le \frac{h^2}{\pi^2}\, \frac{h}{2} \sum_{n=1}^{\infty} \eta_n^2 \left( \frac{n\pi}{h} \right)^2
= \frac{h^2}{\pi^2} \|\eta'\|_{L^2[0,h]}^2 \le \frac{h^4}{\pi^4} \|u''\|_{L^2[0,h]}^2 ,
\]
where we used that \( \eta'' = u'' \), since û is linear on \( [x_0, x_1] \). Applying the same argument on each subsequent subinterval and summing the resulting inequalities, we complete the proof.
Corollary 3.6.2
1. \( \|u - \hat{u}\|_1 \le \frac{\sqrt{2}\, h}{\pi} \|u''\|_0 \)
2. \( \|u - u_h\|_1 \le \frac{\sqrt{8}\, h}{\pi} \|u''\|_0 \)

Proof
1.
\[
\|u - \hat{u}\|_1^2 = \|u - \hat{u}\|_0^2 + \|u' - \hat{u}'\|_0^2 \le 2 \|u' - \hat{u}'\|_0^2 \le 2 \left( \frac{h}{\pi} \right)^2 \|u''\|_0^2 .
\]
2. For all \( v_h \in V_h \) in the finite element space we have:
\[
\|u - u_h\|_1 \le 2 \|u - v_h\|_1 .
\]
Take \( v_h = \hat{u} \); then:
\[
\|u - u_h\|_1 \le 2 \|u - \hat{u}\|_1 \le \frac{\sqrt{8}\, h}{\pi} \|u''\|_0 ,
\]
so that
\[
\|u - u_h\|_1 \le C h \|u''\|_0
\]
where C is independent of h.
Theorem 3.6.3 (L2 lifting theorem) If \( u_h \in V_h \) is the Galerkin finite element approximation of
\[
-u'' = f(x), \quad x \in (0, 1), \qquad u(0) = u(1) = 0,
\]
then there exists \( \Gamma > 0 \) such that
\[
\|u - u_h\|_0 \le \Gamma h^2 \|u''\|_0 .
\]
Proof Define:
\[
\epsilon_h = u - u_h .
\]
Then,
\[
(u', v_h')_0 = (f, v_h)_0
\]
for all \( v_h \in V_h \), or, with \( a(u, v_h) = (u', v_h')_0 \),
\[
a(u, v_h) = (f, v_h)_0 , \qquad a(u_h, v_h) = (f, v_h)_0 ,
\]
so that
\[
a(\epsilon_h, v_h) = 0 . \tag{3.2}
\]
Now, consider the auxiliary problem
\[
-\varphi'' = \epsilon_h, \quad 0 < x < 1, \qquad \varphi(0) = \varphi(1) = 0.
\]
Then,
\[
\|\epsilon_h\|_0^2 = (\epsilon_h, \epsilon_h)_0 = - (\epsilon_h, \varphi'')_0 = (\epsilon_h', \varphi')_0 = a(\epsilon_h, \varphi).
\]
Now, taking \( v_h = \hat{\varphi} \) (the interpolant of φ) in (3.2), we get:
\[
a(\epsilon_h, \hat{\varphi}) = 0,
\]
and subtracting it from the above equation we obtain:
\[
\|\epsilon_h\|_0^2 = a(\epsilon_h, \varphi) = a(\epsilon_h, \varphi) - a(\epsilon_h, \hat{\varphi}) = a(\epsilon_h, \varphi - \hat{\varphi}) = \left( \epsilon_h', (\varphi - \hat{\varphi})' \right)_0 = \int_0^1 \epsilon_h' \left( \varphi' - \hat{\varphi}' \right) dx
\]
\[
\le \|\epsilon_h'\|_0 \left\| \varphi' - \hat{\varphi}' \right\|_0 \le \|\epsilon_h'\|_0 \left\| \varphi - \hat{\varphi} \right\|_1 \le \|\epsilon_h'\|_0 \frac{\sqrt{2}\, h}{\pi} \|\varphi''\|_0 = \|\epsilon_h'\|_0 \frac{\sqrt{2}\, h}{\pi} \|\epsilon_h\|_0 \le \|\epsilon_h\|_1 \frac{\sqrt{2}\, h}{\pi} \|\epsilon_h\|_0 \le \Gamma h^2 \|u''\|_0 \|\epsilon_h\|_0 .
\]
So that
\[
\|\epsilon_h\|_0 \le \Gamma h^2 \|u''\|_0
\]
for some \( \Gamma > 0 \), independent of h.
3.6.1 Other Boundary Conditions
Let us consider an elliptic problem with more general boundary conditions:
\[
-u''(x) = f, \quad 0 < x < 1, \qquad u(0) = \beta_1, \quad u'(1) = \beta_2 .
\]
Since we need to satisfy non-homogeneous boundary conditions of Dirichlet and Neumann type, this time we discretize the solution with the expansion:
\[
u_h = \beta_1 l_0(x) + \sum_{j=1}^{M} u_j l_j(x) \tag{3.3}
\]
so that
\[
u_h(0) = \beta_1 l_0(0) = \beta_1 .
\]
Now, consider the original equation:
\[
-u'' = f,
\]
multiply it by \( v \in V = H^1(\Omega) \), and integrate over Ω to obtain:
\[
(-u'', v)_0 = (f, v)_0
\quad \Rightarrow \quad
(u', v')_0 + u'(0)\, v(0) - u'(1)\, v(1) = (f, v)_0 .
\]
If we approximate u with (3.3) and take the test functions to be \( v_h \in V_h = \mathrm{span}\, \{l_1, \ldots, l_M\} \), we have:
\[
(u_h', l_j')_0 - u_h'(1)\, l_j(1) = (f, l_j)_0, \qquad j = 1, \ldots, M,
\]
or, replacing \( u_h'(1) \) by \( \beta_2 \),
\[
\left( \beta_1 l_0' + \sum_{n=1}^{M} u_n l_n',\; l_j' \right)_0 - \beta_2\, l_j(1) = (f, l_j)_0 .
\]
This yields:
\[
\left( \beta_1 l_0' + \sum_{n=1}^{M} u_n l_n',\; l_1' \right)_0 = (f, l_1)_0 ,
\]
\[
\left( \sum_{n=1}^{M} u_n l_n',\; l_j' \right)_0 = (f, l_j)_0
\]
for \( j = 2, \ldots, M - 1 \), and finally,
\[
\left( \sum_{n=1}^{M} u_n l_n',\; l_M' \right)_0 - \beta_2 = (f, l_M)_0 .
\]
3.7 Transient Problems
Consider the Initial Boundary Value Problem given by:
\[
\begin{cases}
\dfrac{\partial u}{\partial t} - \dfrac{\partial}{\partial x} \left( a(x) \dfrac{\partial u}{\partial x} \right) = f(x) & \text{for } (t, x) \in (0, T] \times (0, 1) \\
u(t, 0) = u(t, 1) = 0 & \text{for } t > 0 \\
u(0, x) = g(x) & \text{for } x \in (0, 1)
\end{cases}
\]
for \( a(x) \in C^1(\bar{\Omega}) \) such that \( 0 < \alpha \le a(x) \le A < \infty \), and \( \|f(x)\|_0 \le L < \infty \).
The Galerkin formulation of the problem is given by: Find \( u \in H_0^1(\Omega) \) such that
\[
\left( \frac{\partial u}{\partial t}, v \right)_0 + b(u, v) = (f(x), v)_0
\]
for all \( v \in H_0^1(\Omega) \), where \( b(u, v) = \left( a(x) \frac{\partial u}{\partial x}, \frac{\partial v}{\partial x} \right)_0 \). Note that \( b(u, v) \) is a coercive and bounded bilinear form on \( H_0^1(\Omega) \) (why?). In order to discretize it we define
\[
V_h = \mathrm{span}\, \{l_1, l_2, \ldots, l_{M-1}\}
\]
and search for \( u_h \in V_h \) such that:
\[
\left( \frac{\partial u_h}{\partial t}, v_h \right)_0 + b(u_h, v_h) = (f, v_h)_0
\]
for all \( v_h \in V_h \). Then, taking into account that
\[
u_h(t, x) = \sum_{j=1}^{M-1} u_j(t)\, l_j(x),
\]
we obtain the linear system
\[
\frac{\partial}{\partial t} (\mathcal{M} u) + S u = F,
\]
where
\[
\mathcal{M}_{ij} = \int_\Omega l_i l_j \, dx \quad \text{(the mass matrix)}, \qquad
S_{ij} = \int_\Omega a(x) \frac{\partial l_i}{\partial x} \frac{\partial l_j}{\partial x} \, dx \quad \text{(the stiffness matrix)},
\]
for each \( i, j \in 1, \ldots, M - 1 \), and
\[
F_j = \int_\Omega f l_j \, dx .
\]
Then, as \( \mathcal{M} \) is time-independent,
\[
\mathcal{M} \frac{\partial u}{\partial t} + S u = F .
\]
We may discretize this ODE system using, for example, a backward difference scheme (a good choice for such problems, as we know from the previous section). The exact solution then satisfies
\[
\mathcal{M} \frac{u^n - u^{n-1}}{k} + S u^n = F^n + \mathcal{M} \tau_k ,
\]
where
\[
\tau_k = - \left. \frac{\partial u}{\partial t} \right|^n + \frac{u^n - u^{n-1}}{k}
\]
is the truncation error of the approximation of the time derivative of u by a backward difference. Define
\[
A = \frac{\mathcal{M}}{k} + S ,
\]
where:
\[
\mathcal{M}_{ij} = \int_\Omega l_i l_j
\]
and
\[
S_{ij} = \int_\Omega a(x) \frac{\partial l_i}{\partial x} \frac{\partial l_j}{\partial x} \, dx \ge \alpha \int_\Omega \frac{\partial l_i}{\partial x} \frac{\partial l_j}{\partial x} \, dx .
\]
\( \mathcal{M} \) and A are spd (why?).
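A minimal sketch of the resulting time stepping (piecewise linear elements on a uniform grid, a(x) ≡ 1, the same lumped load approximation as before, and the nodal interpolant of g as the starting value rather than the elliptic projection used in the analysis below):

```python
import numpy as np

# Backward Euler for M du/dt + S u = F, i.e. A u^n = F + (M/k) u^{n-1},
# with A = M/k + S, on the interior nodes of a uniform grid.
def heat_fem(f, g, M_, k, nsteps):
    h = 1.0 / M_
    x = np.linspace(0.0, 1.0, M_ + 1)
    m = M_ - 1
    Mass = (h / 6.0) * (4 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1))
    S = (1.0 / h) * (2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))
    A = Mass / k + S                      # spd, as noted above
    F = h * f(x[1:-1])
    u = g(x[1:-1])                        # nodal interpolant of g
    for n in range(nsteps):
        u = np.linalg.solve(A, F + Mass @ u / k)
    return x, u

x, u = heat_fem(lambda s: 0.0 * s, lambda s: np.sin(np.pi * s), 64, 0.01, 100)
```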
If u is the exact solution, then
\[
\frac{\partial u}{\partial t} - \frac{\partial}{\partial x} \left( a(x) \frac{\partial u}{\partial x} \right) = f .
\]
Now we multiply by \( v_h \in V_h \) and integrate:
\[
\left( \frac{\partial u}{\partial t}, v_h \right)_0 + b(u, v_h) = (f, v_h)_0 .
\]
Applying the discretization, we get:
\[
\left( \frac{u^n - u^{n-1}}{k}, v_h \right)_0 + b(u^n, v_h) = (f^n, v_h)_0 - \left( \left. \frac{\partial u}{\partial t} \right|^n - \frac{u^n - u^{n-1}}{k}, v_h \right)_0 ,
\]
which we may rewrite as:
\[
\left( \frac{u^n - u^{n-1}}{k}, v_h \right)_0 + b(u^n, v_h) = (f^n, v_h)_0 + (\tau_k, v_h)_0 .
\]
On the other hand, we have for the finite element approximation to the solution that:
\[
\left( \frac{u_h^n - u_h^{n-1}}{k}, v_h \right)_0 + b(u_h^n, v_h) = (f^n, v_h)_0 .
\]
Letting
\[
u^n - u_h^n = \epsilon_h^n
\]
and subtracting the two preceding equations, we get:
\[
\left( \frac{\epsilon_h^n - \epsilon_h^{n-1}}{k}, v_h \right)_0 + b(\epsilon_h^n, v_h) = (\tau_k, v_h)_0 .
\]
Now, define \( w_h \) such that
\[
b(w_h, v_h) = b(u, v_h) \tag{3.4}
\]
for all \( v_h \in V_h \); \( w_h \) is called the elliptic projection of u onto \( V_h \). Then we split the error \( \epsilon_h^n \) into \( \epsilon_h^n = \eta^n + \xi^n \), where \( \eta^n = u^n - w_h^n \) is the error of the elliptic projection of u (and \( \xi^n = w_h^n - u_h^n \)). Therefore, as we have already established for the solution of (3.4),
\[
\|u^n - w_h^n\|_0 \le \Gamma h^2 \left\| \frac{\partial^2 u^n}{\partial x^2} \right\|_0 .
\]
But then, as \( \frac{\partial w_h}{\partial t} \) is the elliptic projection of \( \frac{\partial u}{\partial t} \) onto \( V_h \),
\[
\left\| \frac{\partial u^n}{\partial t} - \frac{\partial w_h^n}{\partial t} \right\|_0 \le \Gamma h^2 \left\| \frac{\partial^3 u^n}{\partial t\, \partial x^2} \right\|_0 .
\]
Now,
\[
\left( \xi^n - \xi^{n-1}, v_h \right)_0 + k\, b(\xi^n, v_h) = k (\tau_k^n, v_h)_0 - \left( \eta^n - \eta^{n-1}, v_h \right)_0 - k\, b(\eta^n, v_h).
\]
But, as \( \xi^n \in V_h \), we may take \( v_h = \xi^n \):
\[
\left( \xi^n - \xi^{n-1}, \xi^n \right)_0 + k\, b(\xi^n, \xi^n) = k (\tau_k^n, \xi^n)_0 - \left( \eta^n - \eta^{n-1}, \xi^n \right)_0 - k\, b(\eta^n, \xi^n),
\]
and, since \( b(\xi^n, \xi^n) \ge 0 \) and \( b(\eta^n, \xi^n) = 0 \), this leads to the inequality:
\[
\left( \xi^n - \xi^{n-1}, \xi^n \right)_0 \le k (\tau_k^n, \xi^n)_0 - \left( \eta^n - \eta^{n-1}, \xi^n \right)_0 ,
\]
which we may rewrite as:
\[
\|\xi^n\|_0^2 \le \left( \xi^{n-1}, \xi^n \right)_0 + k (\tau_k^n, \xi^n)_0 - \left( \eta^n - \eta^{n-1}, \xi^n \right)_0 .
\]
Lemma 3.7.1
\[
\left( \xi^{n-1}, \xi^n \right)_0 \le \frac{1}{2} \left\| \xi^{n-1} \right\|_0^2 + \frac{1}{2} \|\xi^n\|_0^2
\]
Proof
\[
0 \le \left\| \xi^n - \xi^{n-1} \right\|_0^2 = \|\xi^n\|_0^2 - 2 \left( \xi^n, \xi^{n-1} \right)_0 + \left\| \xi^{n-1} \right\|_0^2 .
\]
Upon rearranging, this yields the desired result.
Theorem 3.7.2 (Young Inequality)
\[
(u, v)_0 \le \alpha^2 \|u\|_0^2 + \frac{1}{4 \alpha^2} \|v\|_0^2
\]
Lemma 3.7.3
\[
k (\tau^n, \xi^n) \le \frac{k^2}{2} \int_{(n-1)k}^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt + \frac{k}{2} \|\xi^n\|_0^2
\]
Proof Using the integral form of the truncation error,
\[
\tau^n = \frac{\partial u}{\partial t} - \frac{u^n - u^{n-1}}{k} = \frac{1}{k} \int_{(n-1)k}^{nk} \left[ t - (n-1)k \right] \frac{\partial^2 u}{\partial t^2} \, dt,
\]
and the Cauchy-Schwarz inequality for the integral in time, we obtain:
\[
\|\tau^n\|_0^2 = \left\| \frac{1}{k} \int_{(n-1)k}^{nk} \left[ t - (n-1)k \right] \frac{\partial^2 u}{\partial t^2} \, dt \right\|_0^2
\le \frac{1}{k^2} \int_{(n-1)k}^{nk} \left[ t - (n-1)k \right]^2 dt \; \int_\Omega \int_{(n-1)k}^{nk} \left( \frac{\partial^2 u}{\partial t^2} \right)^2 dt \, dx
= \frac{k}{3} \int_{(n-1)k}^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt
\le k \int_{(n-1)k}^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt .
\]
So then, using the Young inequality,
\[
k (\tau^n, \xi^n) \le \frac{k}{2} \|\tau^n\|_0^2 + \frac{k}{2} \|\xi^n\|_0^2 \le \frac{k^2}{2} \int_{(n-1)k}^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt + \frac{k}{2} \|\xi^n\|_0^2 .
\]
Lemma 3.7.4
\[
\left( \eta^n - \eta^{n-1}, \xi^n \right)_0 \le \frac{\Gamma^2 h^4}{2} \int_{(n-1)k}^{nk} \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_0^2 dt + \frac{k}{2} \|\xi^n\|_0^2
\]
Proof Using the Cauchy-Schwarz inequality for the integral in time and, subsequently, the Young inequality, we obtain
\[
\left( \eta^n - \eta^{n-1}, \xi^n \right)_0 = \left( \int_{(n-1)k}^{nk} \frac{\partial \eta}{\partial t} \, dt,\; \xi^n \right)_0
\le \sqrt{ k \int_{(n-1)k}^{nk} \left\| \frac{\partial \eta}{\partial t} \right\|_0^2 dt } \; \|\xi^n\|_0
\le \frac{1}{2} \int_{(n-1)k}^{nk} \left\| \frac{\partial \eta}{\partial t} \right\|_0^2 dt + \frac{k}{2} \|\xi^n\|_0^2 .
\]
Then, applying the result
\[
\left\| \frac{\partial \eta}{\partial t} \right\|_0^2 = \left\| \frac{\partial u}{\partial t} - \frac{\partial w_h}{\partial t} \right\|_0^2 \le \Gamma^2 h^4 \left\| \frac{\partial^3 u}{\partial x^2\, \partial t} \right\|_0^2 ,
\]
we obtain the final claim of the lemma.
Lemma 3.7.5 Let \( k \le \frac{1}{4} \); then:
\[
\|\xi^n\|_0^2 \le 2 \left\| \xi^0 \right\|_0^2
+ 2 k^2 \left\| \frac{\partial^2 u}{\partial t^2} \right\|_{L^2(0,T;L^2(\Omega))}^2
+ 2 h^4 \Gamma^2 \left\| \frac{\partial^3 u}{\partial x^2\, \partial t} \right\|_{L^2(0,T;L^2(\Omega))}^2
+ 4k \sum_{m=0}^{n-1} \|\xi^m\|_0^2 ,
\]
where we define
\[
\|f(t, x)\|_{L^2(0,T;L^2(\Omega))}^2 = \int_0^T \int_\Omega f^2, \qquad f \in L^2((0, T) \times \Omega).
\]
Proof Using the results of the last three lemmas we easily obtain:
\[
\frac{1}{2} \|\xi^m\|_0^2 + k\, b(\xi^m, \xi^m)
\le \frac{1}{2} \left\| \xi^{m-1} \right\|_0^2
+ \frac{k^2}{2} \int_{(m-1)k}^{mk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt
+ \frac{\Gamma^2 h^4}{2} \int_{(m-1)k}^{mk} \left\| \frac{\partial^3 u}{\partial x^2\, \partial t} \right\|_0^2 dt
+ k \|\xi^m\|_0^2
\]
and, as \( b(\xi^m, \xi^m) \ge 0 \),
\[
\|\xi^m\|_0^2 \le \left\| \xi^{m-1} \right\|_0^2
+ k^2 \int_{(m-1)k}^{mk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt
+ \Gamma^2 h^4 \int_{(m-1)k}^{mk} \left\| \frac{\partial^3 u}{\partial x^2\, \partial t} \right\|_0^2 dt
+ 2k \|\xi^m\|_0^2 .
\]
Now we sum for \( m = 1, \ldots, n \) to get:
\[
\|\xi^n\|_0^2 \le \left\| \xi^0 \right\|_0^2
+ k^2 \int_0^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt
+ \Gamma^2 h^4 \int_0^{nk} \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_0^2 dt
+ 2k \sum_{m=1}^{n-1} \|\xi^m\|_0^2 + 2k \|\xi^n\|_0^2 ,
\]
i.e.
\[
(1 - 2k)\, \|\xi^n\|_0^2 \le \left\| \xi^0 \right\|_0^2
+ k^2 \int_0^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt
+ \Gamma^2 h^4 \int_0^{nk} \left\| \frac{\partial^3 u}{\partial x^2\, \partial t} \right\|_0^2 dt
+ 2k \sum_{m=1}^{n-1} \|\xi^m\|_0^2 .
\]
Since \( k \le 1/4 \), we have \( 1 - 2k \ge 1/2 \). Also \( \|\xi^0\|_0^2 \ge 0 \), so we can multiply it by 4k and add it to the right hand side of the last inequality to obtain:
\[
\|\xi^n\|_0^2 \le 2 \left\| \xi^0 \right\|_0^2
+ 2 k^2 \int_0^{nk} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_0^2 dt
+ 2 \Gamma^2 h^4 \int_0^{nk} \left\| \frac{\partial^3 u}{\partial x^2\, \partial t} \right\|_0^2 dt
+ 4k \sum_{m=0}^{n-1} \|\xi^m\|_0^2 .
\]
But \( k \le 1/4 \), and this completes the proof.
Lemma 3.7.6 (Gronwall Lemma) If the sequence \( \{\xi^n\}_{n=1}^N \) satisfies
\[
|\xi^n| \le \nu + \mu \sum_{m=0}^{n-1} |\xi^m|, \qquad \nu, \mu \ge 0,
\]
then
\[
|\xi^n| \le e^{\mu N} \left( \nu + \mu \left| \xi^0 \right| \right).
\]
Proof Consider the sequence \( z^n \) defined by
\[
z^n = \nu + \mu \sum_{m=0}^{n} |\xi^m|
\quad \Rightarrow \quad
z^n - z^{n-1} = \mu |\xi^n| ,
\]
so that
\[
z^n \le (1 + \mu)\, z^{n-1}, \qquad
z^{n-1} \le (1 + \mu)\, z^{n-2} \le \ldots \le (1 + \mu)^{n-1} z^0 = \left( (1 + \mu)^{\frac{1}{\mu}} \right)^{(n-1)\mu} z^0 \le e^{\mu (n-1)} z^0 \le e^{\mu N} z^0 ,
\]
because
\[
(1 + \mu)^{\frac{1}{\mu}} \le e .
\]
Then, \( |\xi^n| \le z^{n-1} \le e^{\mu N} z^0 = e^{\mu N} \left( \nu + \mu |\xi^0| \right) \) for \( n = 1, \ldots, N \).
Now let \( \mu = 4k \) and
\[
\nu = 2 \left\| \xi^0 \right\|_0^2
+ 2 k^2 \left\| \frac{\partial^2 u}{\partial t^2} \right\|_{L^2(0,T;L^2(\Omega))}^2
+ 2 \Gamma^2 h^4 \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_{L^2(0,T;L^2(\Omega))}^2 .
\]
Denoting the final time instant by \( T = nk \) and using the results of Lemma 3.7.5 we find that
\[
\|\xi^n\|_0^2 \le e^{4T} \left(
3 \left\| \xi^0 \right\|_0^2
+ 2 k^2 \left\| \frac{\partial^2 u}{\partial t^2} \right\|_{L^2(0,T;L^2(\Omega))}^2
+ 2 \Gamma^2 h^4 \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_{L^2(0,T;L^2(\Omega))}^2
\right).
\]
Let us choose \( u_h^0 = w_h^0 \), i.e.
\[
b\left( u_h^0, v_h \right) = b(g, v_h)
\]
for all \( v_h \in V_h \). Then \( \xi^0 = 0 \). Therefore,
\[
\|\xi^n\|_0^2 \le e^{4T} \left(
2 k^2 \left\| \frac{\partial^2 u}{\partial t^2} \right\|_{L^2(0,T;L^2(\Omega))}^2
+ 2 \Gamma^2 h^4 \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_{L^2(0,T;L^2(\Omega))}^2
\right)
= O\left( k^2 + h^4 \right).
\]
Theorem 3.7.7 Consider the scheme:
\[
\mathcal{M} \frac{u^n - u^{n-1}}{k} + S u^n = F^n
\quad \left( \text{i.e. } A u^n = F^n + \frac{\mathcal{M}}{k} u^{n-1} \right).
\]
If \( k < \frac{1}{4} \) and \( u_h^0 = w_h^0 \), then
\[
\|\epsilon_h^n\|_0 \le O\left( k + h^2 \right).
\]
Proof
\[
\left\| \xi^n \right\|_0^2 \le a^2 k^2 + b^2 h^4 \le \left( ak + bh^2 \right)^2
\]
where we define:
\[
a^2 = 2 e^{4T} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_{L^2(0,T;L^2(\Omega))}^2 , \qquad
b^2 = 2 e^{4T} \Gamma^2 \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_{L^2(0,T;L^2(\Omega))}^2 .
\]
Then,
\[
\|\epsilon_h^n\|_0 \le \|\eta^n\|_0 + \|\xi^n\|_0
\le \sqrt{2}\, e^{2T} \left\| \frac{\partial^2 u}{\partial t^2} \right\|_{L^2(0,T;L^2(\Omega))} k
+ h^2 \Gamma \left[ \max_{0 \le n \le N} \left\| \frac{\partial^2 u^n}{\partial x^2} \right\|_0 + \sqrt{2}\, e^{2T} \left\| \frac{\partial^3 u}{\partial t\, \partial x^2} \right\|_{L^2(0,T;L^2(\Omega))} \right].
\]
3.8 Finite Elements in Two Dimensions
Consider the model problem:
\[
-(p u_x)_x - (p u_y)_y + q u = g \quad \text{in } \Omega, \qquad u = \gamma \quad \text{on } \partial\Omega.
\]
[Figure: a triangulation of Ω with global node numbers 1–10 and elements e1–e8.]
This is the global numbering; the local numbering (for \( j = 1, 2, 3 \)) is:
\[
1 \to i_{k1}, \qquad 2 \to i_{k2}, \qquad 3 \to i_{k3},
\]
where k is the element number and \( i_{kj} \) is the global number of the j-th local point in the k-th element.
3.8.1 Finite Element Basis Functions
With each global node i we associate one basis function \( \phi_i(x, y) \). If \( N_j \) is the j-th point in the grid, we require:
\[
\phi_i(N_j) = \delta_{ij},
\]
where \( \phi_j |_{e_k} \in P^1 \), the space of all first degree polynomials. Then,
\[
V_h = \mathrm{span}\, \{\phi_1, \ldots, \phi_M\},
\]
where M is the total number of points in the grid, and
\[
u_h(x, y) = \sum_{i=1}^{M} u_i \phi_i(x, y).
\]
[Figure: the global basis function \( \phi_i(x, y) \), equal to 1 at node \( N_i \), and the local basis function \( \phi_r^{(k)}(x, y) \) on element \( e_k \),]
where \( i_{kr} = N_i \) and r can be 1, 2, or 3 (local numbers).
[Figure: the reference triangle with vertices 1, 2, 3 in the \( (\xi, \eta) \) plane, mapped by \( T_k \) onto the element \( e_k \) in the \( (x, y) \) plane.]
Now, we define:
\[
T_k \equiv \begin{cases} x = x(\xi, \eta) = a_1 \xi + a_2 \eta + a_3 \\ y = y(\xi, \eta) = b_1 \xi + b_2 \eta + b_3 . \end{cases}
\]
Then,
\[
\phi_1^k(\xi, \eta) = 1 - \xi - \eta, \qquad \phi_2^k(\xi, \eta) = \xi, \qquad \phi_3^k(\xi, \eta) = \eta,
\]
so that:
\[
\int_\Omega \phi_i(x, y)\, \phi_j(x, y) \, d\Omega = \sum_{k=1}^{K_e} \int_{e_k} \phi_i(\xi, \eta)\, \phi_j(\xi, \eta) \, de_k ,
\]
where \( K_e \) is the total number of finite elements, \( de_k = |J_{T_k}| \, d\xi d\eta \), and \( J_{T_k} \) is the Jacobian of \( T_k \).
3.8.2 Discrete Galerkin Formulation
Find \( u_h \in V_h \) such that:
\[
\int_\Omega p\, u_{h,x} v_{h,x} + \int_\Omega p\, u_{h,y} v_{h,y} + \int_\Omega q\, u_h v_h = \int_\Omega g\, v_h
\]
and, for simplicity, we choose \( \gamma \equiv 0 \), \( p = \) constant and \( q = \) constant. Then, substituting
\[
u_h = \sum_{i=1}^{M} u_i \phi_i
\]
and \( v_h = \phi_j \) for \( j = 1, \ldots, M \) yields the linear system:
\[
(p S + q \mathcal{M})\, u_h = G
\]
where:
\[
S_{ij} = \int_\Omega \nabla \phi_i \nabla \phi_j \, d\Omega \quad \text{(the stiffness matrix)}, \qquad
\mathcal{M}_{ij} = \int_\Omega \phi_i \phi_j \, d\Omega \quad \text{(the mass matrix)}, \qquad
G_j = \int_\Omega g\, \phi_j \, d\Omega .
\]