quatica.solver
CGNEQSolver
Conjugate Gradient on the Normal Equations (CGNE–Q) to solve XA = I_n in matrix form.
Minimizes f(X) = 1/2 || X A - I_n ||_F^2 with quaternion-native operations.
- Column case (full column rank): A in H^{m x n} with m >= n
- Iterates stay in span{A^H} when initialized with X0 = alpha * A^H
- Converges to A^† as ||I_n - X_k A||_F -> 0
compute(A)
Compute pseudoinverse using Conjugate Gradient on Normal Equations (CGNE–Q).
Minimizes f(X) = 1/2 || X A - I_n ||_F^2 with quaternion-native operations. This method provides global convergence to the Moore-Penrose pseudoinverse by solving the normal equations via conjugate gradient.
Parameters:
A : np.ndarray
    Quaternion matrix of shape (m, n) with m >= n (full column rank)
Returns:
tuple[np.ndarray, dict]
    X : np.ndarray
        Pseudoinverse of A, shape (n, m)
    info : dict
        Dictionary containing convergence information:
        - 'iterations': Number of CG iterations performed
        - 'residual_norms': List of relative residual norms per iteration
        - 'iteration_times': List of CPU times per iteration
        - 'total_time': Total computation time
        - 'converged': Boolean indicating whether convergence was achieved
        - 'preconditioner_rank': Rank of the preconditioner used (0 if none)
Notes:
- Requires m >= n (full column rank matrices)
- Initializes with X_0 = (1/||A||_F^2) * A^H to stay in span{A^H}
- Uses Fletcher-Reeves CG with exact line search
- Supports optional thin Nyström-type right preconditioning
- All operations are quaternion-native with no embeddings
- Converges to A^† as ||I_n - X_k A||_F -> 0
- Per iteration: one A multiply and one A^H multiply
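A minimal usage sketch for CGNEQSolver.compute. The quaternion test matrix is built with the numpy-quaternion package and the solver is constructed with defaults; both are assumptions, since neither the expected quaternion representation nor the constructor signature is documented in this section.

```python
import numpy as np
import quaternion  # numpy-quaternion package; assumed representation of quaternion matrices

from quatica.solver import CGNEQSolver

# Assumed: a tall quaternion matrix (m >= n, full column rank) from random components.
m, n = 50, 10
A = quaternion.as_quat_array(np.random.randn(m, n, 4))

solver = CGNEQSolver()           # default construction assumed
X, info = solver.compute(A)      # X approximates A^dagger, shape (n, m)

print(info["converged"], info["iterations"])
print(info["residual_norms"][-1])   # relative residual ||I_n - X A||_F at exit
```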
DeepLinearNewtonSchulz
Deep Linear Newton-Schulz solver for quaternion matrices.
Implements the alternating gradient method for deep linear networks using proper Newton-Schulz updates for each layer individually.
The method optimizes ||X * W_1 * ... * W_d - I||_F -> min where each W_i is updated using the damped Newton-Schulz formula: W_i^(k+1) = W_i^(k) - γ * W_i^(k) * (hat_W * hat_X * W_i^(k) - I)
compute(X, layers)
Compute deep linear decomposition using alternating Newton-Schulz updates.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Input quaternion matrix (n_samples x input_dim) | required |
layers | list[int] | List of layer dimensions [d0, d1, ..., dk] where d0 = input_dim, dk = output_dim | required |
Returns:
Name | Type | Description |
---|---|---|
weights | list[ndarray] | List of optimized quaternion weight matrices |
residuals | dict[str, list[float]] | Dictionary of residual norms over iterations |
deviations | list[float] | List of total deviation norms over iterations |
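A hedged usage sketch of compute(X, layers). The input construction and the default constructor are assumptions; the layer list follows the [d0, d1, ..., dk] convention from the table above.

```python
import numpy as np
import quaternion  # numpy-quaternion package; assumed representation

from quatica.solver import DeepLinearNewtonSchulz

# Assumed: a random quaternion data matrix of shape (n_samples, input_dim).
n_samples, input_dim = 100, 16
X = quaternion.as_quat_array(np.random.randn(n_samples, input_dim, 4))

solver = DeepLinearNewtonSchulz()   # default construction assumed
weights, residuals, deviations = solver.compute(X, layers=[16, 8, 16])

print(len(weights))     # one weight matrix per consecutive pair of layer dimensions
print(deviations[-1])   # final total deviation ||X W_1 ... W_d - I||_F
```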
HigherOrderNewtonSchulzPseudoinverse
Third-order Newton–Schulz pseudoinverse solver (no damping).
Iteration (T := X_k): X_{k+1} = 3 T - 3 T A T + T (A T)^2
Initialization:
    T_0 = A^H / ||A||_F^2
Residuals tracked per iteration:
    E1 = ||A X A - A||_F
    E2 = ||X A X - X||_F
    E3 = ||(A X)^H - A X||_F
    E4 = ||(X A)^H - X A||_F
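The third-order recursion is easy to check numerically with a real matrix standing in for the quaternion case. The NumPy sketch below is not quatica code; it only illustrates that the update above, started from T_0 = A^H / ||A||_F^2, drives the Penrose residuals (here E1 and E4) to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))                    # real stand-in for a quaternion matrix

T = A.conj().T / np.linalg.norm(A, "fro") ** 2     # T_0 = A^H / ||A||_F^2
for _ in range(30):
    AT = A @ T
    T = 3 * T - 3 * T @ AT + T @ AT @ AT           # X_{k+1} = 3 T - 3 T A T + T (A T)^2

print(np.linalg.norm(A @ T @ A - A))               # E1, ~0 at convergence
print(np.linalg.norm((T @ A).conj().T - T @ A))    # E4, ~0 at convergence
```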
HybridRSPNewtonSchulz
Hybrid RSP-Q + NS (column variant): alternate T randomized sketch-and-project steps with one exact hyperpower (order p) step on the right residual.
Parameters
r : int
    Sketch block size for the RSP-Q phase
p : int
    Hyperpower order for the NS step (e.g., 2, 4, 8)
T : int
    Number of RSP steps per cycle before one NS step
tol : float
    Stopping tolerance for the proxy residual computed with the test sketch Pi
max_iter : int
    Maximum total number of RSP steps (cycles * T is bounded by this)
verbose : bool
    Verbose logging
seed : int | None
    RNG seed
compute(A)
Compute pseudoinverse using hybrid RSP-Q + NS algorithm.
Alternates T randomized sketch-and-project steps with one exact hyperpower (order p) step on the right residual. This combines the efficiency of randomized methods with the accuracy of exact Newton-Schulz iterations.
Parameters:
A : np.ndarray
    Quaternion matrix of shape (m, n) with m >= n (full column rank)
Returns:
tuple[np.ndarray, dict]
    X : np.ndarray
        Pseudoinverse of A, shape (n, m)
    info : dict
        Dictionary containing convergence information:
        - 'iterations_rsp': Number of RSP iterations performed
        - 'residual_norms': List of residual norms after each cycle
        - 'total_time': Total computation time
        - 'converged': Boolean indicating whether convergence was achieved
        - 'r': Block size used for RSP steps
        - 'p': Hyperpower order used for NS steps
        - 'T': Number of RSP steps per cycle
Notes:
- Currently supports only column variant (m >= n)
- Initializes with X_0 = (1/||A||_F^2) * A^H
- Performs T RSP steps followed by one NS hyperpower step per cycle
- Monitors convergence using test sketch to avoid full residual computation
- Combines benefits of randomized efficiency and exact convergence
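A usage sketch based on the constructor parameters listed above (r, p, T, tol, max_iter, seed); the quaternion matrix construction again assumes the numpy-quaternion convention.

```python
import numpy as np
import quaternion  # numpy-quaternion package; assumed representation

from quatica.solver import HybridRSPNewtonSchulz

# Assumed: a tall quaternion matrix (m >= n), as required by the column variant.
m, n = 200, 40
A = quaternion.as_quat_array(np.random.randn(m, n, 4))

solver = HybridRSPNewtonSchulz(r=8, p=4, T=10, tol=1e-8, max_iter=500, seed=0)
X, info = solver.compute(A)

print(info["converged"], info["iterations_rsp"])
print(info["residual_norms"][-1])   # residual after the last RSP/NS cycle
```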
NewtonSchulzPseudoinverse
Compute the Moore–Penrose pseudoinverse of quaternion matrices via damped Newton–Schulz.
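Only the one-line summary above is documented for this class, so no quatica-specific example is attempted. As background, the sketch below shows the textbook damped Newton-Schulz recursion X_{k+1} = X_k + γ (X_k - X_k A X_k) on a real stand-in matrix (γ = 1 recovers the classic second-order step); the exact damping schedule and interface used by quatica are not specified here.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))                     # real stand-in for a quaternion matrix

gamma = 1.0                                         # gamma = 1 gives the classic NS step
X = A.conj().T / np.linalg.norm(A, "fro") ** 2      # scaled A^H initialization
for _ in range(100):
    X = X + gamma * (X - X @ A @ X)                 # X_{k+1} = X_k + gamma (X_k - X_k A X_k)

print(np.linalg.norm(X - np.linalg.pinv(A)))        # ~0: X converges to the pseudoinverse
```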
QGMRESSolver
Quaternion Generalized Minimal Residual (Q-GMRES) solver.
Implements the Q-GMRES algorithm for solving quaternion linear systems A * x = b, where A is a quaternion matrix and b is a quaternion vector.
Based on the implementation by Zhigang Jia and Michael K. Ng: "Structure Preserving Quaternion Generalized Minimal Residual Method", SIMAX, 2021
__init__(tol=1e-06, max_iter=None, verbose=False, preconditioner=None)
Initialize Q-GMRES solver.
Parameters:
tol : float, optional
    Tolerance for convergence (default: 1e-6)
max_iter : int, optional
    Maximum number of iterations (default: None, uses the matrix dimension)
verbose : bool, optional
    Whether to print convergence information (default: False)
preconditioner : optional
    Preconditioner to apply to the system (default: None)
solve(A, b)
Solve the quaternion linear system A * x = b using Q-GMRES.
Parameters:
A : np.ndarray
    Quaternion matrix (m x n)
b : np.ndarray
    Quaternion right-hand side vector (m x 1)
Returns:
x : np.ndarray
    Solution vector (n x 1)
info : dict
    Information about the solution process, including:
    - 'iterations': Number of iterations performed
    - 'residual': Final residual norm
    - 'residual_history': List of residual norms
    - 'converged': Whether the method converged
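A usage sketch of the documented constructor and solve(A, b); the system matrix and right-hand side are random quaternion arrays built under the numpy-quaternion assumption.

```python
import numpy as np
import quaternion  # numpy-quaternion package; assumed representation

from quatica.solver import QGMRESSolver

n = 30
A = quaternion.as_quat_array(np.random.randn(n, n, 4))
b = quaternion.as_quat_array(np.random.randn(n, 1, 4))

solver = QGMRESSolver(tol=1e-8, max_iter=100, verbose=False)
x, info = solver.solve(A, b)

print(info["converged"], info["iterations"], info["residual"])
```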
RandomizedSketchProjectPseudoinverse
Randomized Sketch-and-Project (RSP-Q) for quaternion matrix pseudoinverse.
Implements the randomized sketch-and-project method for computing the Moore-Penrose pseudoinverse of quaternion matrices. This method provides global linear convergence in expectation through cheap randomized projections onto sketched identity constraints.
The method works by drawing random sketches and projecting onto the constraint set {X : X Y_k = Omega_k}, where Y_k = A Omega_k and Omega_k is a random sketch. This provides a block Kaczmarz-style approach that converges globally.
__init__(block_size=16, max_iter=1000, tol=1e-06, test_sketch_size=8, verbose=False, seed=None, column_solver='qr')
Initialize the RSP-Q solver.
Parameters:
block_size : int, optional
    Size of the random sketch block r (default: 16)
max_iter : int, optional
    Maximum number of iterations (default: 1000)
tol : float, optional
    Convergence tolerance (default: 1e-6)
test_sketch_size : int, optional
    Size of the test sketch used for the convergence check (default: 8)
verbose : bool, optional
    Whether to print convergence information (default: False)
seed : int, optional
    Random seed for reproducibility (default: None)
column_solver : str, optional
    Update strategy for the column variant (QR or SPD; default: 'qr')
compute(A)
Compute the Moore-Penrose pseudoinverse using RSP-Q.
Automatically selects column or row variant based on matrix dimensions. This is the main entry point for RSP-Q pseudoinverse computation.
Parameters:
A : np.ndarray
    Quaternion matrix of shape (m, n)
Returns:
tuple[np.ndarray, dict]
    X : np.ndarray
        Moore-Penrose pseudoinverse of A, shape (n, m)
    convergence_info : dict
        Dictionary containing convergence information from the selected variant
Notes:
- For m >= n (tall/square matrices): uses column variant (compute_column_variant)
- For m < n (wide matrices): uses row variant (compute_row_variant)
- Automatically clamps block_size to safe range [1, min(m, n)]
- Both variants provide global linear convergence in expectation
- The choice between variants is optimal for computational efficiency
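A usage sketch of the top-level compute entry point with the documented constructor options; the tall shape makes it dispatch to the column variant, and the quaternion input assumes the numpy-quaternion convention.

```python
import numpy as np
import quaternion  # numpy-quaternion package; assumed representation

from quatica.solver import RandomizedSketchProjectPseudoinverse

m, n = 120, 30
A = quaternion.as_quat_array(np.random.randn(m, n, 4))   # m >= n: column variant is used

solver = RandomizedSketchProjectPseudoinverse(block_size=8, max_iter=500, tol=1e-8, seed=0)
X, info = solver.compute(A)      # X has shape (n, m)

print(info["converged"], info["iterations"])
```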
compute_column_variant(A)
Compute pseudoinverse using column variant RSP-Q for full column rank A.
Solves XA = I_n by iterative sketch-and-project updates. This variant is designed for tall matrices (m >= n) and uses randomized sketching to reduce computational complexity.
Parameters:
A : np.ndarray
    Full column rank quaternion matrix of shape (m, n) with m >= n
Returns:
tuple[np.ndarray, dict]
    X : np.ndarray
        Pseudoinverse of A, shape (n, m)
    convergence_info : dict
        Dictionary containing convergence information:
        - 'iterations': Number of iterations performed
        - 'residual_norms': List of residual norms per iteration
        - 'iteration_times': List of CPU times per iteration
        - 'total_time': Total computation time
        - 'converged': Boolean indicating whether convergence was achieved
Notes:
- Initializes with X_0 = (1/||A||_F^2) * A^H
- Uses random sketches of size block_size for each iteration
- Supports both QR and SPD update strategies via column_solver parameter
- Monitors convergence using a test sketch to avoid computing full residuals
- Global linear convergence in expectation for well-conditioned matrices
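For intuition, the sketch-and-project step described above can be written out over real matrices as a stand-in for the quaternion-native arithmetic: with sketch Omega and Y = A Omega, the Frobenius-nearest update satisfying X Y = Omega is X + (Omega - X Y)(Y^H Y)^{-1} Y^H. The NumPy sketch below illustrates that idea only; it is not quatica's internal code.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 60, 12, 4
A = rng.standard_normal((m, n))                   # real stand-in, full column rank

X = A.conj().T / np.linalg.norm(A, "fro") ** 2    # X_0 = A^H / ||A||_F^2
for _ in range(500):
    Omega = rng.standard_normal((n, r))           # random sketch
    Y = A @ Omega                                 # Y = A Omega
    R = Omega - X @ Y                             # residual of the sketched constraint
    X = X + R @ np.linalg.solve(Y.T @ Y, Y.T)     # project onto {X : X Y = Omega}

print(np.linalg.norm(X - np.linalg.pinv(A)))      # shrinks as more projections are applied
```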
compute_row_variant(A)
RSP-Q row variant: solve A X = I_m for full row rank A (m <= n).
This variant is designed for wide matrices (m <= n) and uses left sketching to project onto sketched identity constraints. The update formula is: X <- X + Z^H (Z Z^H)^{-1} (S^H - Z X), where Z = S^H A.
Parameters:
A : np.ndarray
    Full row rank quaternion matrix of shape (m, n) with m <= n
Returns:
tuple[np.ndarray, dict]
    X : np.ndarray
        Pseudoinverse of A, shape (n, m)
    info : dict
        Dictionary containing convergence information:
        - 'iterations': Number of iterations performed
        - 'residual_norms': List of residual norms per iteration
        - 'iteration_times': List of CPU times per iteration
        - 'total_time': Total computation time
        - 'converged': Boolean indicating whether convergence was achieved
Notes:
- Initializes with X_0 = 0 (zero initialization is sufficient for RSP)
- Uses left sketches S of size block_size for each iteration
- Solves (Z Z^H) W = R using CG micro-solver with fallback to Newton-Schulz
- Monitors convergence using proxy residual with test sketch
- Applies tiny ridge regularization (1e-10) to Gram matrix for stability
- Global linear convergence in expectation for well-conditioned matrices
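The row-variant update formula above can likewise be checked on a real stand-in matrix; starting from X_0 = 0, the projections drive the right residual ||A X - I_m||_F toward zero. This is an illustrative sketch only, not quatica's implementation (which adds ridge regularization and a CG micro-solver, as noted above).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 12, 60, 4
A = rng.standard_normal((m, n))                   # wide real stand-in, full row rank

X = np.zeros((n, m))                              # X_0 = 0
for _ in range(500):
    S = rng.standard_normal((m, r))               # left sketch
    Z = S.conj().T @ A                            # Z = S^H A, shape (r, n)
    R = S.conj().T - Z @ X                        # S^H - Z X
    X = X + Z.conj().T @ np.linalg.solve(Z @ Z.conj().T, R)   # X <- X + Z^H (Z Z^H)^{-1} (S^H - Z X)

print(np.linalg.norm(A @ X - np.eye(m)))          # right residual shrinks toward 0
```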
maxvol_submatrix_quat(A, k, tol=1e-06, max_sweeps=50, track_history=True)
Two-sided MaxVol for quaternion matrices: select k×k submatrix with large volume.
Parameters:
- A: quaternion ndarray of shape (m, n)
- k: target core size (k ≤ min(m, n))
- tol: stopping tolerance; phases stop when max coeff ≤ 1 + tol
- max_sweeps: maximum number of alternating row/column sweeps
- track_history: if True, records accepted core snapshots for external checks
Returns:
- I: list[int] of selected row indices (length k)
- J: list[int] of selected column indices (length k)
- B: quaternion ndarray (k×k), the final core A[I, J]
- info: dict with keys {"row_swaps", "col_swaps", "iterations", "B_history"}
Notes:
- Row phase uses C = A[:, J] @ B^{-1} (m×k)
- Column phase uses C' = B^{-1} @ A[I, :] (k×n)
- Rank-1 updates are done quaternion-natively; scalar divisions use right-multiplication by q^{-1}.
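A usage sketch of maxvol_submatrix_quat with the documented signature; the quaternion input again assumes the numpy-quaternion convention.

```python
import numpy as np
import quaternion  # numpy-quaternion package; assumed representation

from quatica.solver import maxvol_submatrix_quat

A = quaternion.as_quat_array(np.random.randn(40, 25, 4))

I, J, B, info = maxvol_submatrix_quat(A, k=6, tol=1e-6, max_sweeps=50)

print(I, J)                  # selected row and column indices, each of length k
print(B.shape)               # (6, 6) core A[I, J]
print(info["row_swaps"], info["col_swaps"], info["iterations"])
```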