def doolittle(A):
"""Performs Doolittle LU decomposition: A = LU, with L unit lower triangular and U upper triangular."""
n = len(A)
L = [[0.0]*n for _ in range(n)]
U = [[0.0]*n for _ in range(n)]
for i in range(n):
# Upper Triangular
for k in range(i, n):
U[i][k] = A[i][k] - sum(L[i][j]*U[j][k] for j in range(i))
# Lower Triangular
L[i][i] = 1.0
for k in range(i+1, n):
if U[i][i] == 0:
raise ZeroDivisionError("Zero pivot encountered.")
L[k][i] = (A[k][i] - sum(L[k][j]*U[j][i] for j in range(i))) / U[i][i]
return L, U
def print_matrix(M, name="Matrix"):
print(f"{name} =")
for row in M:
print(" [" + " ".join(f"{val:8.3f}" for val in row) + "]")
print()O.1 Cholesky Decomposition
The Cholesky decomposition is a method for factorizing a positive definite matrix into the product of a lower triangular matrix and its transpose. It is particularly useful for solving systems of linear equations and for generating samples from multivariate normal distributions.
Definition O.1 (Definition of the Cholesky Decomposition) André-Louis Cholesky (1875–1918) was a cartographer in the French army, who introduced a method for decomposing a symmetric positive definite matrix into the product of a lower triangular matrix and its transpose. This decomposition is known as the Cholesky decomposition.
Let A be a symmetric positive definite matrix. The Cholesky decomposition of A is a factorization of the form: A = LL^\top \tag{O.1} where L is a lower triangular matrix with positive diagonal entries.
The Cholesky decomposition is unique for a given positive definite matrix, and it can be computed efficiently using algorithms such as the Doolittle algorithm or the Crout algorithm.
O.1.1 Doolittle Algorithm
Algorithm O.1
A = [
[2, 3, 1],
[4, 7, 7],
[6, 18, 22]
]
print_matrix(A, name="A")
L, U = doolittle(A)
print_matrix(L, name="L")
print_matrix(U, name="U")A =
[ 2.000 3.000 1.000]
[ 4.000 7.000 7.000]
[ 6.000 18.000 22.000]
L =
[ 1.000 0.000 0.000]
[ 2.000 1.000 0.000]
[ 3.000 9.000 1.000]
U =
[ 2.000 3.000 1.000]
[ 0.000 1.000 5.000]
[ 0.000 0.000 -26.000]
O.1.2 Doolittle’s Algorithm with Partial Pivoting
When performing LU decomposition, it is often necessary to use partial pivoting to ensure numerical stability and to handle cases where the matrix may be singular or nearly singular. Partial pivoting involves swapping rows of the matrix to place the largest absolute value in the pivot position.
Adding partial pivoting is algebraically equivalent to multiplying the original matrix by a permutation matrix P, such that PA = LU, where P is a permutation matrix, L is a lower triangular matrix, and U is an upper triangular matrix.
def doolittle_partial_pivoting(A):
"""Performs LU decomposition with partial pivoting: PA = LU."""
n = len(A)
# Deep copy of A
A = [row[:] for row in A]
P = list(range(n))
L = [[0.0]*n for _ in range(n)]
U = [[0.0]*n for _ in range(n)]
for k in range(n):
# Partial pivoting: find row with max abs value in column k
pivot_row = max(range(k, n), key=lambda i: abs(A[i][k]))
if A[pivot_row][k] == 0:
raise ZeroDivisionError("Matrix is singular.")
# Swap rows in A and record permutation
A[k], A[pivot_row] = A[pivot_row], A[k]
P[k], P[pivot_row] = P[pivot_row], P[k]
for i in range(k):
L[k][i], L[pivot_row][i] = L[pivot_row][i], L[k][i]
# Compute U[k][k:] and L[k+1:][k]
L[k][k] = 1.0
for j in range(k, n):
U[k][j] = A[k][j] - sum(L[k][s]*U[s][j] for s in range(k))
for i in range(k+1, n):
L[i][k] = (A[i][k] - sum(L[i][s]*U[s][k] for s in range(k))) / U[k][k]
# Permutation matrix P as a 2D matrix
P_matrix = [[1 if j == P[i] else 0 for j in range(n)] for i in range(n)]
return P_matrix, L, UDemo for Doolittle’s algorithm with partial pivoting:
A = [
[0, 3, 1],
[4, 7, 7],
[6, 18, 22]
]
print_matrix(A, name="A")
P, L, U = doolittle_partial_pivoting(A)
print_matrix(P, name="P")
print_matrix(L, name="L")
print_matrix(U, name="U")A =
[ 0.000 3.000 1.000]
[ 4.000 7.000 7.000]
[ 6.000 18.000 22.000]
P =
[ 0.000 0.000 1.000]
[ 0.000 1.000 0.000]
[ 1.000 0.000 0.000]
L =
[ 1.000 0.000 0.000]
[ 0.667 1.000 0.000]
[ 0.000 -0.600 1.000]
U =
[ 6.000 18.000 22.000]
[ 0.000 -5.000 -7.667]
[ 0.000 0.000 -3.600]
A = [
[2, 1, 1, 3, 2],
[1, 2, 2, 1, 1],
[3, 2, 3, 2, 1],
[2, 1, 2, 2, 1],
[1, 1, 1, 1, 1]
]
P, L, U = doolittle_partial_pivoting(A)
print_matrix(P, "P")
print_matrix(L, "L")
print_matrix(U, "U")
def matmul(A, B):
return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
for j in range(len(B[0]))]
for i in range(len(A))]
def permute(A, P):
"""P is a permutation matrix; return PA."""
return matmul(P, A)
PA = permute(A, P)
LU = matmul(L, U)
# Print comparison
print_matrix(PA, "PA")
print_matrix(LU, "LU")P =
[ 0.000 0.000 1.000 0.000 0.000]
[ 0.000 1.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 1.000 0.000]
[ 1.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 1.000]
L =
[ 1.000 0.000 0.000 0.000 0.000]
[ 0.333 1.000 0.000 0.000 0.000]
[ 0.667 -0.250 1.000 0.000 0.000]
[ 0.667 -0.250 -3.000 1.000 0.000]
[ 0.333 0.250 -1.000 0.250 1.000]
U =
[ 3.000 2.000 3.000 2.000 1.000]
[ 0.000 1.333 1.000 0.333 0.667]
[ 0.000 0.000 0.250 0.750 0.500]
[ 0.000 0.000 0.000 4.000 3.000]
[ 0.000 0.000 0.000 0.000 0.250]
PA =
[ 3.000 2.000 3.000 2.000 1.000]
[ 1.000 2.000 2.000 1.000 1.000]
[ 2.000 1.000 2.000 2.000 1.000]
[ 2.000 1.000 1.000 3.000 2.000]
[ 1.000 1.000 1.000 1.000 1.000]
LU =
[ 3.000 2.000 3.000 2.000 1.000]
[ 1.000 2.000 2.000 1.000 1.000]
[ 2.000 1.000 2.000 2.000 1.000]
[ 2.000 1.000 1.000 3.000 2.000]
[ 1.000 1.000 1.000 1.000 1.000]
O.1.3 Vectorization of the doolittle algorithm
import numpy as np
def doolittle_numpy(A):
"""LU decomposition with partial pivoting using NumPy. Returns P, L, U such that PA = LU."""
A = np.array(A, dtype=float)
n = A.shape[0]
P = np.eye(n)
L = np.zeros((n, n))
U = A.copy()
for k in range(n):
# Partial pivoting
pivot = np.argmax(abs(U[k:, k])) + k
if U[pivot, k] == 0:
raise ZeroDivisionError("Matrix is singular.")
if pivot != k:
U[[k, pivot]] = U[[pivot, k]]
P[[k, pivot]] = P[[pivot, k]]
L[[k, pivot], :k] = L[[pivot, k], :k]
L[k, k] = 1.0
L[k+1:, k] = U[k+1:, k] / U[k, k]
U[k+1:] -= np.outer(L[k+1:, k], U[k])
return P, L, Udef random_sign_matrix(n, seed=None):
"""Generate an n×n matrix with random entries in {-1, 0, 1}."""
rng = np.random.default_rng(seed)
return rng.choice([-1, 0, 1], size=(n, n))
A = [
[2, 3, 1],
[4, 7, 7],
[6, 18, 22]
]
A = random_sign_matrix(16, seed=42)
print_matrix(A, name="A")
P, L, U = doolittle_numpy(A)
print_matrix(P, name="P")
print_matrix(L, name="L")
print_matrix(U, name="U")A =
[ -1.000 1.000 0.000 0.000 0.000 1.000 -1.000 1.000 -1.000 -1.000 0.000 1.000 1.000 1.000 1.000 1.000]
[ 0.000 -1.000 1.000 0.000 0.000 0.000 -1.000 1.000 1.000 0.000 0.000 1.000 0.000 0.000 0.000 -1.000]
[ -1.000 0.000 1.000 -1.000 1.000 1.000 -1.000 0.000 -1.000 1.000 1.000 0.000 -1.000 1.000 0.000 1.000]
[ 1.000 1.000 1.000 -1.000 0.000 0.000 0.000 -1.000 0.000 -1.000 1.000 1.000 1.000 1.000 0.000 1.000]
[ 0.000 -1.000 1.000 0.000 -1.000 0.000 1.000 -1.000 0.000 -1.000 1.000 0.000 -1.000 -1.000 0.000 1.000]
[ 1.000 0.000 -1.000 1.000 0.000 1.000 -1.000 -1.000 1.000 1.000 0.000 1.000 1.000 0.000 1.000 -1.000]
[ -1.000 1.000 0.000 -1.000 1.000 -1.000 1.000 -1.000 1.000 1.000 1.000 0.000 0.000 1.000 -1.000 1.000]
[ 0.000 0.000 0.000 0.000 -1.000 -1.000 -1.000 -1.000 0.000 1.000 0.000 0.000 1.000 0.000 -1.000 1.000]
[ 0.000 0.000 0.000 0.000 -1.000 0.000 1.000 -1.000 0.000 -1.000 0.000 0.000 1.000 -1.000 -1.000 0.000]
[ 1.000 1.000 -1.000 -1.000 1.000 -1.000 1.000 -1.000 1.000 -1.000 0.000 0.000 -1.000 0.000 0.000 1.000]
[ 1.000 0.000 0.000 0.000 0.000 1.000 -1.000 -1.000 0.000 -1.000 -1.000 -1.000 1.000 1.000 1.000 0.000]
[ 1.000 -1.000 1.000 0.000 1.000 -1.000 0.000 1.000 0.000 0.000 -1.000 0.000 -1.000 -1.000 1.000 0.000]
[ 0.000 0.000 1.000 -1.000 0.000 -1.000 0.000 1.000 0.000 1.000 0.000 1.000 0.000 -1.000 1.000 1.000]
[ -1.000 1.000 -1.000 1.000 1.000 0.000 1.000 -1.000 -1.000 -1.000 0.000 1.000 -1.000 0.000 1.000 -1.000]
[ 1.000 -1.000 1.000 0.000 0.000 -1.000 0.000 1.000 -1.000 1.000 0.000 1.000 0.000 0.000 -1.000 0.000]
[ -1.000 0.000 -1.000 0.000 0.000 -1.000 1.000 0.000 1.000 -1.000 -1.000 0.000 -1.000 -1.000 0.000 -1.000]
P =
[ 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000]
L =
[ 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -1.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -1.000 1.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -1.000 0.500 0.750 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -1.000 0.000 -0.500 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 1.000 -0.500 -0.750 -1.000 0.667 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -0.000 0.000 -0.000 0.000 -0.667 -0.571 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 1.000 0.000 0.500 0.667 0.667 -0.714 -0.938 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ 1.000 0.000 -0.000 -0.667 0.333 -0.143 -0.187 0.537 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -0.000 -0.500 -0.750 -0.333 0.333 0.286 0.375 -0.232 0.670 1.000 0.000 0.000 0.000 0.000 0.000 0.000]
[ -1.000 0.500 0.250 0.333 0.000 0.429 0.125 -0.189 -0.393 0.619 1.000 0.000 0.000 0.000 0.000 0.000]
[ -0.000 0.000 -0.500 -0.667 0.000 0.000 0.438 -0.242 0.214 -0.636 -0.177 1.000 0.000 0.000 0.000 0.000]
[ -1.000 0.000 -0.500 0.000 0.333 -0.143 0.250 -0.211 -0.625 -0.996 -0.786 0.408 1.000 0.000 0.000 0.000]
[ -0.000 0.000 -0.000 0.000 -0.667 -0.143 -0.188 0.032 -0.161 0.360 0.480 0.771 0.733 1.000 0.000 0.000]
[ -0.000 -0.500 -0.750 -0.333 -0.333 0.143 -0.250 0.042 -0.045 0.933 0.070 0.108 0.143 0.431 1.000 0.000]
[ 1.000 -0.500 0.250 -0.333 -0.333 -0.286 0.062 0.158 0.554 0.895 0.568 0.802 0.025 0.140 -0.221 1.000]
U =
[ -1.000 1.000 0.000 0.000 0.000 1.000 -1.000 1.000 -1.000 -1.000 0.000 1.000 1.000 1.000 1.000 1.000]
[ 0.000 2.000 1.000 -1.000 0.000 1.000 -1.000 0.000 -1.000 -2.000 1.000 2.000 2.000 2.000 1.000 2.000]
[ 0.000 0.000 -2.000 0.000 1.000 -1.000 1.000 0.000 1.000 0.000 -1.000 -1.000 -2.000 -1.000 0.000 0.000]
[ 0.000 0.000 0.000 1.500 -0.750 2.250 -2.250 0.000 -0.250 1.000 0.250 1.750 2.500 0.750 1.500 -1.000]
[ 0.000 0.000 0.000 0.000 1.500 -0.500 -0.500 2.000 -0.500 -1.000 -1.500 0.500 -1.000 -0.500 2.000 1.000]
[ 0.000 0.000 0.000 0.000 0.000 2.333 -1.667 -2.333 0.333 2.667 2.000 0.667 0.667 1.333 -0.333 -0.667]
[ 0.000 0.000 0.000 0.000 0.000 0.000 -2.286 -1.000 -0.143 1.857 0.143 0.714 0.714 0.429 0.143 1.286]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -5.938 0.104 3.646 2.896 0.146 -0.854 0.688 -2.437 -1.271]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.965 1.772 0.425 0.151 1.688 0.568 -0.379 -0.172]
[ 0.000 0.000 0.000 0.000 0.000 -0.000 0.000 0.000 0.000 -2.134 0.095 1.141 -1.120 -0.096 0.064 -1.137]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -1.551 -2.328 1.486 0.788 0.474 0.854]
[ 0.000 0.000 0.000 0.000 0.000 -0.000 0.000 0.000 0.000 0.000 0.000 1.671 -0.663 -1.065 1.553 -1.072]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.449 2.208 -1.697 -0.149]
[ 0.000 0.000 0.000 0.000 0.000 -0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 -2.134 0.123 1.760]
[ 0.000 0.000 0.000 0.000 0.000 -0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.765 2.843]
[ 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.802]
O.1.4 Crout’s Algorithm
Algorithm O.2
Crout’s algorithm is another method for performing LU decomposition, similar to Doolittle’s algorithm. It constructs the lower triangular matrix L directly, while the upper triangular matrix U is obtained from the original matrix A.
the key parts are the formula for computing the entries of L:
L[i][j] = A[i][j] - \sum_{k=0}^{j-1} L[i][k] U[k][j]
and using this to computing the entries of U:
U[j][i] = \frac{A[j][i] - \sum_{k=0}^{j-1} L[j][k] U[k][i]}{L[j][j]}
import numpy as np
def crout_lu(A, *, pivot=True, rtol=1e-9, atol=None):
"""
Robust Crout LU with partial pivoting.
Improvements
------------
1. Pivot is selected from the *updated* column, eliminating the
“false zero-pivot” issue.
2. Test `abs(pivot) < max(atol, rtol*‖A‖∞)` so tolerance scales with data.
Returns L, U, P such that P @ A = L @ U.
"""
A = np.asarray_chkfinite(A, dtype=float)
n = A.shape[0]
if A.ndim != 2 or n != A.shape[1]:
raise ValueError("square matrix required")
L = np.zeros_like(A)
U = np.eye(n, dtype=A.dtype)
P = np.eye(n, dtype=A.dtype)
rows = np.arange(n)
if atol is None:
atol = np.finfo(A.dtype).eps * np.linalg.norm(A, np.inf) * 10
for k in range(n):
# current residual column
col = A[rows[k:], k] - L[k:, :k] @ U[:k, k]
if pivot:
j = k + np.argmax(np.abs(col)) # best row
if np.abs(col[j - k]) < max(atol, rtol*np.abs(col).max()):
raise np.linalg.LinAlgError("matrix is numerically singular")
if j != k: # swap logical rows
rows[[k, j]] = rows[[j, k]]
L[[k, j], :k] = L[[j, k], :k]
P[[k, j]] = P[[j, k]]
col[[0, j - k]] = col[[j - k, 0]]
L[k:, k] = col
if np.abs(L[k, k]) < max(atol, rtol*np.abs(col).max()):
raise np.linalg.LinAlgError("zero pivot encountered")
# row k of U (unit diagonal)
U[k, k+1:] = (A[rows[k], k+1:] - L[k, :k] @ U[:k, k+1:]) / L[k, k]
return L, U, PTest the Crout LU decomposition with a random full rank sign matrix:
def random_full_rank_sign_matrix(n, seed=42):
rng = np.random.default_rng(seed)
while True:
A = rng.choice([-1,0, 1], size=(n, n))
if np.linalg.matrix_rank(A) == n:
return A
A = random_full_rank_sign_matrix(12, seed=42)
print_matrix(A, name="A")
A_orig = A.copy() # Keep original for verification
L, U, P = crout_lu(A)
# Check correctness
assert np.allclose(P @ A_orig, L @ U, atol=1e-8)A =
[ -1.000 1.000 0.000 0.000 0.000 1.000 -1.000 1.000 -1.000 -1.000 0.000 1.000]
[ 1.000 1.000 1.000 1.000 0.000 -1.000 1.000 0.000 0.000 0.000 -1.000 1.000]
[ 1.000 0.000 0.000 1.000 0.000 0.000 0.000 -1.000 -1.000 0.000 1.000 -1.000]
[ 1.000 1.000 -1.000 0.000 -1.000 1.000 1.000 0.000 -1.000 1.000 0.000 1.000]
[ 1.000 1.000 1.000 -1.000 0.000 0.000 0.000 -1.000 0.000 -1.000 1.000 1.000]
[ 1.000 1.000 0.000 1.000 0.000 -1.000 1.000 0.000 -1.000 0.000 1.000 -1.000]
[ 0.000 -1.000 1.000 0.000 -1.000 -1.000 0.000 1.000 1.000 0.000 -1.000 1.000]
[ 0.000 1.000 -1.000 -1.000 1.000 1.000 0.000 1.000 1.000 0.000 1.000 -1.000]
[ -1.000 1.000 0.000 -1.000 1.000 -1.000 1.000 -1.000 1.000 1.000 1.000 0.000]
[ 0.000 1.000 -1.000 1.000 0.000 0.000 0.000 0.000 -1.000 -1.000 -1.000 -1.000]
[ 0.000 1.000 0.000 0.000 1.000 0.000 -1.000 1.000 0.000 0.000 0.000 0.000]
[ -1.000 0.000 1.000 -1.000 0.000 -1.000 0.000 0.000 1.000 -1.000 -1.000 0.000]
The Cholesky decomposition has several important properties:
- If A is positive definite, then L is unique and has positive diagonal entries.
- The Cholesky decomposition can be used to solve linear systems of equations of the form Ax = b by first solving Ly = b for y, and then solving L^\top x = y.
- The Cholesky decomposition can be used to generate samples from a multivariate normal distribution by transforming samples from a standard normal distribution using the Cholesky factor L.
The Cholesky decomposition is widely used in various applications, including numerical optimization, Bayesian inference, and machine learning. It is particularly useful for solving linear systems efficiently and for generating samples from multivariate normal distributions.
O.2 Kronecker Product and Hadamard Product
The Kronecker product and Hadamard product are two important operations in linear algebra that are used to manipulate matrices in various ways.
However when it comes to programming, these operations can be used to formalize many matrix operations in a more compact and efficient way which otherwise couldn’t be understood as an algebraic operation.
These operations are particularly useful in applications such as signal processing, image processing, and machine learning, where they can be used as short hand notation for certain matrix operations used in many algorithms.
O.2.1 Kronecker Product
The Kronecker product is a matrix operation that takes two matrices and produces a block matrix by multiplying each element of the first matrix by the entire second matrix. A programmer might describe it as a “block-wise multiplication” of two matrices.
The Hadamard product, on the other hand, is an element-wise multiplication of two matrices of the same dimensions.
Definition O.2 (Definition of the Kronecker Product) The Kronecker product of two matrices A and B, denoted by A \otimes B, is defined as the block matrix formed by multiplying each element of A by the entire matrix B. If A is an m \times n matrix and B is a p \times q matrix, then the Kronecker product A \otimes B is an (mp) \times (nq) matrix given by:
A \otimes B = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix} \tag{O.2}
where a_{ij} are the elements of matrix A.
some properties of the Kronecker product include:
Theorem O.1 (Properties of the Kronecker Product) Let A be an m \times n matrix and B be a p \times q matrix. Then:
- a \otimes A = A\otimes a for any scalar a (scalar multiplication property)
- (aA) \otimes (bB) = ab(A \otimes B) for any scalars a and b (scalar multiplication property)
- (A \otimes B) \otimes C = A \otimes (B\otimes C) (associative property)
- (A+B) \otimes C = (A \otimes C) + (B \otimes C) when A,B,C are matrices of the same size (distributive property)
- A \otimes (B+C) = (A \otimes B) + (A \otimes C) when A,B,C are matrices of the same size (distributive property)
- (A \otimes B)^{\prime} = A^{\prime} \otimes B^{\prime} (transpose property)
- (\mathbf{a} \otimes \mathbf{b})^{\prime} = \mathbf{a}^{\prime} \otimes \mathbf{b}^{\prime} (commutativity property for vectors)
- \text{det}(A \otimes B) = \text{det}(A)^{p} \cdot \text{det}(B)^{m} (determinant property)
- \text{rank}(A \otimes B) = \text{rank}(A) \cdot \text{rank}(B) (rank property)
- (A \otimes B)^{-1} = A^{-1} \otimes B^{-1}, if both A and B are invertible (inverse property)
- (A \otimes B)(C \otimes D) = (AC) \otimes (BD) (multiplication property) ;. \text{tr}(A \otimes B) = \text{tr}(A) \cdot \text{tr}(B) (trace property)