Mathematics for Machine Learning

A collection of quizzes on the topic "Mathematics for Machine Learning". It starts from the very basics of what a vector is and gradually builds in complexity as you progress.

by Shiju P John
Oct 29, 2025
14 Quizzes
Public
FREE

Quiz Book Content

14 quizzes included in this collection

Math for Machine Learning Chapter 1: Vectors

This quiz covers fundamental concepts of vectors, distinguishing between their geometric (arrows in space) and algebraic (lists of numbers) representations. It delves into core vector operations such as vector addition and scalar multiplication, emphasizing their intuitive meaning and practical applications in Machine Learning. Understanding these concepts is paramount, as vectors form the bedrock for representing data points, features, parameters, and directions of change in ML algorithms.

Key formulas and concepts include:

**1. Algebraic Vector Representation:** A vector $$ \mathbf{v} $$ in $$ n $$-dimensional space (e.g., $$ \mathbb{R}^n $$) is typically represented as an ordered list of $$ n $$ numbers (components): $$ \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} $$ or $$ \mathbf{v} = (v_1, v_2, \dots, v_n) $$. Each $$ v_i $$ is a real number.

**2. Geometric Vector Representation:** An arrow in space, with its length representing **magnitude** and its orientation representing **direction**. A position vector starts at the origin and points to a specific coordinate. A free vector represents a displacement and can be drawn starting from any point.

**3. Vector Addition:**
- **Algebraic:** Performed component-wise. If $$ \mathbf{u} = (u_1, u_2, \dots, u_n) $$ and $$ \mathbf{v} = (v_1, v_2, \dots, v_n) $$, then $$ \mathbf{u} + \mathbf{v} = (u_1+v_1, u_2+v_2, \dots, u_n+v_n) $$.
- **Geometric:**
  - **Triangle Rule:** Place the tail of the second vector at the head of the first. The sum is the vector from the tail of the first to the head of the second.
  - **Parallelogram Rule:** Place both vectors tail-to-tail and complete the parallelogram. The sum is the diagonal starting from the common tail.
- **Properties:** Commutative ($$ \mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u} $$) and associative ($$ (\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u} + (\mathbf{v} + \mathbf{w}) $$).

**4. Scalar Multiplication:**
- **Algebraic:** Multiply each component of the vector by the scalar. If $$ c $$ is a scalar and $$ \mathbf{v} = (v_1, v_2, \dots, v_n) $$, then $$ c\mathbf{v} = (cv_1, cv_2, \dots, cv_n) $$.
- **Geometric:** Scales the magnitude of the vector by $$ |c| $$. If $$ c > 0 $$, the direction remains the same; if $$ c < 0 $$, the direction reverses (by 180 degrees).
- **Properties:** Associative ($$ (cd)\mathbf{v} = c(d\mathbf{v}) $$), distributive over vector addition ($$ c(\mathbf{u} + \mathbf{v}) = c\mathbf{u} + c\mathbf{v} $$), and distributive over scalar addition ($$ (c+d)\mathbf{v} = c\mathbf{v} + d\mathbf{v} $$).

**5. Vectors in Machine Learning:**
- **Data Points:** Features of a data instance (e.g., pixel values of an image, attributes of a customer) are represented as components of a vector.
- **Parameters:** Weights and biases of models (e.g., neural networks, linear regression) are often vectors.
- **Directions of Change:** Gradients in optimization algorithms are vectors that indicate the direction of steepest ascent (or descent).

This quiz evaluates your understanding of these core concepts, their interrelations, and their practical significance in the field of Machine Learning.
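
As a quick illustration of these operations, here is a minimal NumPy sketch (not part of the quiz; the vectors are made-up examples):

```python
import numpy as np

# Two example vectors in R^3 (hypothetical feature vectors)
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -1.0, 0.5])

# Vector addition is component-wise
print(u + v)          # [5.  1.  3.5]

# Scalar multiplication scales every component
print(2.5 * v)        # [10.  -2.5  1.25]

# A negative scalar reverses the direction
print(-1 * u)         # [-1. -2. -3.]

# Commutativity and distributivity hold component-wise
assert np.allclose(u + v, v + u)
assert np.allclose(2.0 * (u + v), 2.0 * u + 2.0 * v)
```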

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 11: Eigenvalues and Eigenvectors

This quiz is designed to test a deep understanding of eigenvalues and eigenvectors, focusing on their theoretical foundations, geometric intuition, and properties of special matrix types relevant to machine learning. It covers advanced concepts such as algebraic and geometric multiplicity, diagonalizability, the Cayley-Hamilton Theorem, the Perron-Frobenius Theorem, and applications to covariance matrices, aiming to solidify expertise in 'eigen-everything'. The questions are crafted to be challenging, requiring analytical reasoning beyond simple formula recall.

**Key Formulas and Concepts:**

* **Eigenvalue Equation**: $$ Av = \lambda v $$
* **Characteristic Equation**: $$ \det(A - \lambda I) = 0 $$
* **Eigenspace**: $$ E_{\lambda} = \text{Null}(A - \lambda I) $$
* **Algebraic Multiplicity (AM)**: The multiplicity of $$ \lambda $$ as a root of the characteristic polynomial.
* **Geometric Multiplicity (GM)**: $$ \dim(E_{\lambda}) $$. Always $$ 1 \le GM \le AM $$.
* **Diagonalization**: A matrix $$ A $$ is diagonalizable if and only if $$ A = PDP^{-1} $$, where $$ D $$ is a diagonal matrix of eigenvalues and $$ P $$ is an invertible matrix of eigenvectors. This occurs if and only if $$ GM = AM $$ for all eigenvalues.
* **Trace and Determinant**: $$ \text{tr}(A) = \sum \lambda_i $$, $$ \det(A) = \prod \lambda_i $$.
* **Symmetric Matrices**: Real eigenvalues, orthogonal eigenvectors (for distinct eigenvalues), diagonalizable by an orthogonal matrix ($$ A = QDQ^T $$).
* **Positive Definite Matrices**: Symmetric, with all eigenvalues strictly positive; $$ x^T A x > 0 $$ for all non-zero $$ x $$.
* **Projection Matrices**: Eigenvalues are only 0 or 1.
* **Orthogonal Matrices**: Eigenvalues have magnitude 1.
* **Cayley-Hamilton Theorem**: Every square matrix satisfies its own characteristic equation, i.e., if $$ p(\lambda) $$ is the characteristic polynomial of $$ A $$, then $$ p(A) = 0 $$.
* **Perron-Frobenius Theorem (for positive matrices)**: Guarantees a unique largest positive eigenvalue (the Perron root) with a strictly positive eigenvector.
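
The following NumPy sketch (illustrative only, using an arbitrarily chosen symmetric matrix) numerically checks several of the properties listed above:

```python
import numpy as np

# A small symmetric matrix (values chosen arbitrarily for illustration)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Symmetric matrices have real eigenvalues and orthonormal eigenvectors
eigvals, Q = np.linalg.eigh(A)

# Check the eigenvalue equation A v = lambda v for each pair
for lam, v in zip(eigvals, Q.T):
    assert np.allclose(A @ v, lam * v)

# Trace = sum of eigenvalues, determinant = product of eigenvalues
assert np.isclose(np.trace(A), eigvals.sum())
assert np.isclose(np.linalg.det(A), eigvals.prod())

# Orthogonal diagonalization: A = Q D Q^T
assert np.allclose(A, Q @ np.diag(eigvals) @ Q.T)

# All eigenvalues strictly positive => A is positive definite
print(eigvals)
```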

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 12: Eigendecomposition and Diagonalization

This quiz rigorously tests your advanced understanding of Eigendecomposition and Diagonalization, crucial concepts in linear algebra for machine learning. The questions cover theoretical foundations, computational aspects, and practical implications, designed to challenge even expert learners.

Key formulas and concepts:

* **Eigenvalue Equation**: For a square matrix $$A$$, a non-zero vector $$v$$ is an eigenvector if $$Av = \lambda v$$, where $$ \lambda $$ is the corresponding eigenvalue.
* **Characteristic Equation**: Eigenvalues are found by solving $$ \det(A - \lambda I) = 0 $$, where $$I$$ is the identity matrix.
* **Diagonalization**: A matrix $$A$$ is diagonalizable if there exists an invertible matrix $$P$$ and a diagonal matrix $$D$$ such that $$A = PDP^{-1}$$. The columns of $$P$$ are the linearly independent eigenvectors of $$A$$, and the diagonal entries of $$D$$ are the corresponding eigenvalues.
* **Conditions for Diagonalization**: $$A$$ is diagonalizable if and only if, for every eigenvalue $$ \lambda $$, its algebraic multiplicity equals its geometric multiplicity. A sufficient condition is that $$A$$ has $$n$$ distinct eigenvalues. Real symmetric matrices are always orthogonally diagonalizable ($$A = QDQ^T$$).
* **Properties of Eigenvalues**:
  * Trace: $$ \text{tr}(A) = \sum_{i=1}^n \lambda_i $$
  * Determinant: $$ \det(A) = \prod_{i=1}^n \lambda_i $$
  * Eigenvalues of $$A^k$$ are $$ \lambda^k $$.
  * Eigenvalues of $$A^{-1}$$ are $$ \lambda^{-1} $$ (if $$A$$ is invertible).
* **Matrix Functions**: For an analytic function $$f(x)$$, if $$A = PDP^{-1}$$, then $$f(A) = P f(D) P^{-1}$$.
* **Geometric Interpretation**: Eigendecomposition reveals the invariant directions (eigenvectors) along which a linear transformation acts merely as a scaling (eigenvalues).
* **Applications**: Fundamental to PCA (Principal Component Analysis) for dimensionality reduction, spectral clustering, solving systems of linear ODEs and recurrence relations, and understanding the stability of dynamical systems.

This quiz demands a deep understanding of these principles and their interconnections. Good luck!
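
As an illustration, here is a short NumPy sketch (using a made-up matrix that happens to have distinct real eigenvalues) showing how diagonalization makes matrix powers and matrix functions easy to compute:

```python
import numpy as np

# A diagonalizable matrix with distinct eigenvalues (illustrative values)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)      # columns of P are eigenvectors
D = np.diag(eigvals)
P_inv = np.linalg.inv(P)

# A = P D P^{-1}
assert np.allclose(A, P @ D @ P_inv)

# Powers are cheap once A is diagonalized: A^k = P D^k P^{-1}
A_cubed = P @ np.diag(eigvals ** 3) @ P_inv
assert np.allclose(A_cubed, np.linalg.matrix_power(A, 3))

# Same idea for analytic functions, e.g. f(A) = P f(D) P^{-1} with f = exp
expA = P @ np.diag(np.exp(eigvals)) @ P_inv
print(expA)
```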

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 3: Linear Independence and Basis

This quiz rigorously tests your understanding of linear independence, span, basis, and dimension, crucial concepts in Linear Algebra for Machine Learning. It covers identifying redundant vectors, understanding coordinate systems, and implications for feature engineering and dimensionality reduction. Expect challenging questions that require a deep conceptual grasp and analytical skills.

**Key Formulae and Concepts:**

* **Linear Combination**: A vector $$ \mathbf{v} $$ is a linear combination of vectors $$ \mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k $$ if there exist scalars $$ c_1, \ldots, c_k $$ such that $$ \mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \ldots + c_k\mathbf{v}_k $$.
* **Linear Independence**: A set of vectors $$ \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\} $$ is linearly independent if the only solution to the vector equation $$ c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \ldots + c_k\mathbf{v}_k = \mathbf{0} $$ is $$ c_1 = c_2 = \ldots = c_k = 0 $$. If non-zero scalars exist, the vectors are linearly dependent (redundant).
* **Span**: The span of a set of vectors $$ S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\} $$, denoted by $$ \text{span}(S) $$, is the set of all possible linear combinations of the vectors in $$ S $$.
* **Basis**: A set of vectors $$ \mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_d\} $$ is a basis for a vector space $$ V $$ if: 1. $$ \mathcal{B} $$ is linearly independent, and 2. $$ \text{span}(\mathcal{B}) = V $$.
* **Dimension**: The dimension of a vector space $$ V $$, denoted $$ \text{dim}(V) $$, is the number of vectors in any basis for $$ V $$. All bases for a given finite-dimensional vector space have the same number of vectors.
* **Coordinate Vector**: If $$ \mathcal{B} = \{\mathbf{b}_1, \ldots, \mathbf{b}_n\} $$ is a basis for $$ V $$, then for any $$ \mathbf{x} \in V $$, there exist unique scalars $$ c_1, \ldots, c_n $$ such that $$ \mathbf{x} = c_1\mathbf{b}_1 + \ldots + c_n\mathbf{b}_n $$. The coordinate vector of $$ \mathbf{x} $$ relative to $$ \mathcal{B} $$ is $$ [\mathbf{x}]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} $$.
* **Determinant and Linear Independence**: For $$ n $$ vectors in $$ \mathbb{R}^n $$, they are linearly independent if and only if the determinant of the matrix formed by these vectors as columns (or rows) is non-zero.
* **Rank-Nullity Theorem**: For an $$ m \times n $$ matrix $$ A $$, $$ \text{rank}(A) + \text{nullity}(A) = n $$, where $$ \text{rank}(A) $$ is the dimension of the column space (or row space) and $$ \text{nullity}(A) $$ is the dimension of the null space.
* **Properties**:
  * Any set of $$ k $$ vectors in an $$ n $$-dimensional space is linearly dependent if $$ k > n $$.
  * Any set of $$ k $$ vectors in an $$ n $$-dimensional space cannot span the space if $$ k < n $$.
  * A set of vectors containing the zero vector is always linearly dependent.
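
A brief NumPy sketch (using made-up vectors) illustrating how rank and determinant checks detect a redundant vector:

```python
import numpy as np

# Three candidate vectors in R^3, stacked as matrix columns (made-up values)
v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = np.array([2.0, 1.0, 5.0])   # v3 = 2*v1 + 1*v2, so the set is linearly dependent

A = np.column_stack([v1, v2, v3])

# Rank < number of vectors  =>  linear dependence (a redundant vector exists)
print(np.linalg.matrix_rank(A))        # 2, not 3

# Equivalently, for n vectors in R^n the determinant is zero iff they are dependent
print(np.linalg.det(A))                # ~0 up to floating-point error

# Dropping the redundant vector leaves a linearly independent set
B = np.column_stack([v1, v2])
print(np.linalg.matrix_rank(B))        # 2 = number of vectors, so independent
```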

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 2: Advanced Vector Spaces & Subspaces

This quiz is designed for an expert-level assessment of Vector Spaces and Subspaces, fundamental concepts in linear algebra that form the bedrock of many machine learning algorithms. It goes beyond basic definitions to test deep conceptual understanding, the ability to verify axioms in non-standard spaces, and the skill to identify subtle properties of span, linear independence, and subspaces. These concepts are the 'playground' for data vectors, model parameters, and error vectors in ML. A strong grasp is crucial for understanding topics like dimensionality reduction (PCA), regularization, and the geometry of model solution spaces.

Key Concepts Tested:

- **Vector Space Axioms**: A set $$V$$ is a vector space over a field $$F$$ (e.g., $$\mathbb{R}$$) if for any $$u, v, w \in V$$ and scalars $$c, d \in F$$, it satisfies ten axioms including closure, associativity, commutativity, identity and inverse elements, and distributivity.
- **Subspace Criteria**: A non-empty subset $$W$$ of a vector space $$V$$ is a subspace if and only if it is closed under vector addition ($$u, v \in W \implies u+v \in W$$) and scalar multiplication ($$v \in W \implies c v \in W$$).
- **Linear Combination & Span**: A vector $$v$$ is a linear combination of $$v_1, \ldots, v_k$$ if $$v = \sum_{i=1}^{k} c_i v_i$$. The span of a set of vectors is the set of all their possible linear combinations, which always forms a subspace.
- **Linear Independence**: A set of vectors $$\{v_1, \ldots, v_k\}$$ is linearly independent if the only solution to $$\sum_{i=1}^{k} c_i v_i = 0$$ is $$c_1 = c_2 = \ldots = c_k = 0$$.
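
To make the span/subspace idea concrete, here is a small NumPy sketch (illustrative values only) that tests whether a vector belongs to the span of two others by solving for the combination coefficients:

```python
import numpy as np

# Does b lie in the span (a subspace) of v1 and v2?  Values are made up.
v1 = np.array([1.0, 2.0, 0.0])
v2 = np.array([0.0, 1.0, 1.0])
b  = np.array([2.0, 5.0, 1.0])        # equals 2*v1 + 1*v2, so it should be in the span

A = np.column_stack([v1, v2])

# Solve A c ~= b in the least-squares sense; an exact fit means b is in span{v1, v2}
c, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(c)                              # coefficients of the linear combination, ~[2, 1]
print(np.allclose(A @ c, b))          # True -> b is in the subspace

# A vector outside the span cannot be reproduced exactly
b_out = np.array([1.0, 0.0, 5.0])
c2, *_ = np.linalg.lstsq(A, b_out, rcond=None)
print(np.allclose(A @ c2, b_out))     # False -> b_out is not in the subspace
```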

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 5: Norms and Distance Metrics

This quiz provides an in-depth assessment of your understanding of norms and distance metrics, fundamental concepts in linear algebra that are critical for machine learning. You will be challenged on the theoretical underpinnings and practical implications of various vector norms, including their definitions, properties, and geometric interpretations. The quiz covers the L1 (Manhattan), L2 (Euclidean), L-infinity (Chebyshev), and the general L-p norms.

Key formulas you should be familiar with:

- **L-p Norm:** For a vector $$\mathbf{x} \in \mathbb{R}^n$$, the L-p norm is defined as $$||\mathbf{x}||_p = \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$$
- **L1 Norm (p = 1):** $$||\mathbf{x}||_1 = \sum_{i=1}^{n} |x_i|$$
- **L2 Norm (p = 2):** $$||\mathbf{x}||_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$$
- **L-infinity Norm ($$p \to \infty$$):** $$||\mathbf{x}||_\infty = \max_{i} |x_i|$$
- **Distance Metric:** The distance between two vectors $$\mathbf{x}$$ and $$\mathbf{y}$$ induced by a norm is $$d_p(\mathbf{x}, \mathbf{y}) = ||\mathbf{x} - \mathbf{y}||_p$$

Questions will require you to go beyond simple calculations and apply these concepts to solve complex problems related to optimization (regularization), geometry in high-dimensional spaces, and the behavior of machine learning algorithms. Prepare to analyze the properties of norms, compare their effects, and understand their role in shaping model characteristics like sparsity and robustness to outliers.
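
A small NumPy sketch (arbitrary example vectors) computing these norms and the induced distance:

```python
import numpy as np

x = np.array([3.0, -4.0, 1.0])   # an arbitrary example vector
y = np.array([0.0,  2.0, 1.0])

# L1 (Manhattan), L2 (Euclidean), and L-infinity (Chebyshev) norms
print(np.linalg.norm(x, ord=1))       # |3| + |-4| + |1| = 8
print(np.linalg.norm(x, ord=2))       # sqrt(9 + 16 + 1) ~= 5.099
print(np.linalg.norm(x, ord=np.inf))  # max(|3|, |-4|, |1|) = 4

# General L-p norm computed from the definition, e.g. p = 3
p = 3
print(np.sum(np.abs(x) ** p) ** (1.0 / p))

# The distance induced by a norm: d_p(x, y) = ||x - y||_p
print(np.linalg.norm(x - y, ord=1))   # Manhattan distance between x and y
```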

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 8: Linear Systems of Equations

This quiz delves into advanced aspects of systems of linear equations, a cornerstone of mathematics for machine learning. Moving beyond basic solution methods, these questions rigorously test your understanding of the underlying theory, geometric interpretations, and the conditions for solution existence and uniqueness.

We explore the profound differences between the 'row picture' (viewing equations as intersecting hyperplanes) and the 'column picture' (expressing the right-hand side vector as a linear combination of the matrix's column vectors). Key concepts include the rank of a matrix ($$\text{rank}(A)$$), the column space ($$C(A)$$), the null space ($$N(A)$$), and their critical roles in determining system consistency ($$\mathbf{b} \in C(A)$$ implies consistency) and uniqueness ($$N(A)$$ containing only the zero vector implies uniqueness). You will apply these concepts to various matrix dimensions ($$m \times n$$) and scenarios, including overdetermined and underdetermined systems, homogeneous systems ($$A\mathbf{x}=\mathbf{0}$$), and the general solution structure $$\mathbf{x} = \mathbf{x}_p + \mathbf{x}_h$$. Prepare to analyze the implications of matrix properties like invertibility, linear dependence of rows/columns, and the powerful rank-nullity theorem. This quiz is designed to solidify your expertise in deciphering the intricate behavior of linear systems, essential for fields like regression, optimization, and dimensionality reduction in machine learning.

Key Formulas:

$$ A\mathbf{x} = \mathbf{b} $$

$$ \text{Consistency condition: } \text{rank}(A) = \text{rank}([A|\mathbf{b}]) $$

$$ \text{Rank-Nullity Theorem: } \text{rank}(A) + \dim(N(A)) = n \quad (\text{where } n \text{ is the number of columns of } A) $$

$$ \text{General Solution: } \mathbf{x} = \mathbf{x}_p + \mathbf{x}_h, \quad \text{where } A\mathbf{x}_p = \mathbf{b} \text{ (particular solution) and } A\mathbf{x}_h = \mathbf{0} \text{ (homogeneous solution)} $$
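
A minimal NumPy sketch (with a made-up overdetermined system) illustrating the rank-based consistency condition and the rank-nullity count:

```python
import numpy as np

# An illustrative 3x2 (overdetermined) system A x = b
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b_consistent   = np.array([1.0, 2.0, 3.0])   # lies in C(A): 1*col1 + 2*col2
b_inconsistent = np.array([1.0, 2.0, 0.0])   # does not lie in C(A)

def is_consistent(A, b):
    """Consistency check: rank(A) == rank([A | b])."""
    augmented = np.column_stack([A, b])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(augmented)

print(is_consistent(A, b_consistent))    # True
print(is_consistent(A, b_inconsistent))  # False

# Rank-nullity: rank(A) + dim(N(A)) = n  (n = number of columns)
n = A.shape[1]
rank = np.linalg.matrix_rank(A)
print(rank, n - rank)                    # nullity is 0 here, so any solution is unique
```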

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 9: Determinants

This quiz rigorously tests your understanding of determinants, focusing on their role in linear transformations and implications for machine learning. It delves into the geometric interpretation of determinants as measures of space scaling and orientation, the critical concept of singularity, and its direct link to matrix invertibility. Questions will challenge your knowledge of advanced determinant properties, their application to various matrix types (orthogonal, similar, block matrices), and their connection to eigenvalues and system solvability. Mastery of these concepts is crucial for understanding topics like PCA, linear regression, and optimization in ML.

Key formulas and concepts covered:

- Determinant of a $$2 \times 2$$ matrix $$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$: $$\text{det}(A) = ad - bc$$
- Cofactor expansion for $$n \times n$$ matrices: $$\text{det}(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij}$$ (for expansion along row $$i$$, where $$M_{ij}$$ is the determinant of the submatrix formed by removing row $$i$$ and column $$j$$).
- Product rule: $$\text{det}(AB) = \text{det}(A) \text{det}(B)$$
- Inverse rule: $$\text{det}(A^{-1}) = \frac{1}{\text{det}(A)}$$ (if $$A$$ is invertible)
- Transpose rule: $$\text{det}(A^T) = \text{det}(A)$$
- Scalar multiplication: $$\text{det}(kA) = k^n \text{det}(A)$$ for an $$n \times n$$ matrix $$A$$.
- Singularity: A matrix $$A$$ is singular if and only if $$\text{det}(A) = 0$$. This implies the columns (or rows) are linearly dependent, and the transformation collapses space.
- Invertibility: A matrix $$A$$ is invertible if and only if $$\text{det}(A) \neq 0$$. An invertible matrix represents a transformation that can be undone.
- Geometric Interpretation: $$|\text{det}(A)|$$ represents the scaling factor of volume (or area in 2D) under the transformation represented by $$A$$. The sign of $$\text{det}(A)$$ indicates whether the transformation preserves ($$+$$) or reverses ($$-$$) orientation.
- Eigenvalues: The product of the eigenvalues of a matrix $$A$$ equals its determinant: $$\text{det}(A) = \prod_{i=1}^{n} \lambda_i$$.
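
The following NumPy sketch (arbitrary example matrices) numerically verifies several of the determinant rules above:

```python
import numpy as np

# Two arbitrary 2x2 matrices for illustration
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
B = np.array([[1.0, 4.0],
              [2.0, 1.0]])

# det of a 2x2 matrix: ad - bc
print(np.linalg.det(A))                                   # 2*3 - 1*0 = 6

# Product, transpose, inverse and scaling rules
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))
assert np.isclose(np.linalg.det(np.linalg.inv(A)), 1.0 / np.linalg.det(A))
k, n = 3.0, A.shape[0]
assert np.isclose(np.linalg.det(k * A), k**n * np.linalg.det(A))

# det(A) equals the product of the eigenvalues
eigvals = np.linalg.eigvals(A)
assert np.isclose(np.linalg.det(A), np.prod(eigvals))

# A singular matrix (linearly dependent columns) has determinant 0
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
print(np.linalg.det(S))                                   # ~0
```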

Math for Machine Learning
FREE
Math for Machine Learning - Chapter 14: Gram-Schmidt Process and QR Decomposition

This expert-level quiz delves into the intricacies of the Gram-Schmidt process and QR decomposition, two fundamental tools in linear algebra for machine learning. Moving beyond basic application, these questions will challenge your understanding of numerical stability, algorithmic variants, geometric interpretations, and practical applications in solving least squares problems. You will be tested on the subtle but critical differences between Classical and Modified Gram-Schmidt, the reasons for preferring Householder or Givens methods, and the deep connections between the factorization $$A=QR$$ and properties like determinants, rank, and condition numbers.

To succeed, you must be familiar with the following concepts and formulae:

**Gram-Schmidt Orthogonalization:** Given a basis $$\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$$, an orthogonal basis $$\{\mathbf{u}_1, \dots, \mathbf{u}_n\}$$ is constructed as:

- $$ \mathbf{u}_1 = \mathbf{v}_1 $$
- $$ \mathbf{u}_k = \mathbf{v}_k - \sum_{j=1}^{k-1} \text{proj}_{\mathbf{u}_j}(\mathbf{v}_k) = \mathbf{v}_k - \sum_{j=1}^{k-1} \frac{\langle \mathbf{v}_k, \mathbf{u}_j \rangle}{\|\mathbf{u}_j\|^2} \mathbf{u}_j $$

An orthonormal basis $$\{\mathbf{q}_1, \dots, \mathbf{q}_n\}$$ is then $$ \mathbf{q}_k = \frac{\mathbf{u}_k}{\|\mathbf{u}_k\|} $$.

**QR Decomposition:** Any real matrix $$A$$ with linearly independent columns can be factored as $$A = QR$$, where:

- $$Q$$ is a matrix with orthonormal columns ($$Q^T Q = I$$).
- $$R$$ is an upper triangular, invertible matrix. The entries of $$R$$ are given by $$r_{ij} = \langle \mathbf{a}_j, \mathbf{q}_i \rangle$$ for $$i < j$$ and $$r_{ii} = \| \mathbf{a}_i - \sum_{k=1}^{i-1} r_{ki} \mathbf{q}_k \|_2$$.

This quiz is designed to solidify your expertise, preparing you for advanced topics in numerical optimization and data analysis.
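
For intuition, here is a compact sketch of the Classical Gram-Schmidt variant in NumPy (the helper name `classical_gram_schmidt` is ours; illustrative only, since production code would typically use the Householder-based `np.linalg.qr` for better numerical stability):

```python
import numpy as np

def classical_gram_schmidt(A):
    """Classical Gram-Schmidt QR of a matrix with linearly independent columns.
    Returns Q (orthonormal columns) and R (upper triangular) with A = Q R."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        u = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]     # r_jk = <a_k, q_j>
            u -= R[j, k] * Q[:, j]          # subtract the projection onto q_j
        R[k, k] = np.linalg.norm(u)
        Q[:, k] = u / R[k, k]               # normalize to obtain q_k
    return Q, R

A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q, R = classical_gram_schmidt(A)

assert np.allclose(Q.T @ Q, np.eye(2))      # orthonormal columns
assert np.allclose(Q @ R, A)                # A = Q R
assert np.allclose(np.triu(R), R)           # R is upper triangular

# Library routine (Householder-based) for comparison
Q_np, R_np = np.linalg.qr(A)
```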

Math for Machine Learning, Linear Algebra
FREE
Math for Machine Learning - Chapter 10: Fundamental Subspaces (Column space, Row space and Null space)

This advanced quiz delves into the foundational concepts of linear algebra crucial for machine learning: the four fundamental subspaces of a matrix. Understanding the Column Space, Row Space, Null Space, and Left Null Space is paramount for grasping how linear transformations behave, the solvability of linear systems, and the underlying geometry of data. This quiz challenges your comprehension of their definitions, properties, bases, dimensions, and intricate orthogonal relationships.

Key concepts include:

- The **Column Space** $$C(A)$$: The span of the columns of $$A$$, representing the set of all possible outputs $$A\mathbf{x}$$.
- The **Row Space** $$R(A)$$: The span of the rows of $$A$$, which is equivalent to $$C(A^T)$$.
- The **Null Space** $$N(A)$$: The set of all vectors $$\mathbf{x}$$ such that $$A\mathbf{x} = \mathbf{0}$$.
- The **Left Null Space** $$N(A^T)$$: The set of all vectors $$\mathbf{y}$$ such that $$A^T\mathbf{y} = \mathbf{0}$$, which is the null space of $$A^T$$.
- **Rank-Nullity Theorem**: For an $$m \times n$$ matrix $$A$$, $$\text{rank}(A) + \text{nullity}(A) = n$$.
- **Fundamental Theorem of Linear Algebra**: The Row Space and Null Space are orthogonal complements ($$R(A) = N(A)^\perp$$), and the Column Space and Left Null Space are orthogonal complements ($$C(A) = N(A^T)^\perp$$).
- **Orthogonal Projection**: The projection of a vector $$\mathbf{b}$$ onto a subspace $$S$$ with basis columns $$U$$ is given by $$P_S \mathbf{b}$$, where $$P_S = U(U^T U)^{-1} U^T$$. If $$U$$ contains an orthonormal basis for $$S$$, then $$P_S = U U^T$$.

This quiz will push your analytical skills to master these theoretical underpinnings, essential for tasks like dimensionality reduction, understanding model capacities, and optimizing algorithms in machine learning.
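
A short NumPy sketch (with a made-up rank-deficient matrix) illustrating the rank-nullity count, a null-space vector, and projection onto the column space:

```python
import numpy as np

# A rank-deficient 3x3 matrix (third column = first + second); values are illustrative
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])

m, n = A.shape
rank = np.linalg.matrix_rank(A)
print(rank, n - rank)                 # rank(A) = 2, nullity = 1  (rank-nullity: 2 + 1 = 3)

# x = (1, 1, -1) lies in the null space N(A): A x = 0,
# i.e. x is orthogonal to every row of A (row space and null space are complements)
x = np.array([1.0, 1.0, -1.0])
print(A @ x)                          # ~[0, 0, 0]

# Orthogonal projector onto the column space C(A), here built via the pseudoinverse
P = A @ np.linalg.pinv(A)
b = np.array([1.0, 2.0, 0.0])
residual = b - P @ b                  # the part of b outside C(A) lives in N(A^T)
assert np.allclose(A.T @ residual, 0) # orthogonal to every column of A, i.e. A^T y = 0
```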

Math for Machine Learning
FREE