cvx-linalg 0.2.0__tar.gz

@@ -0,0 +1,80 @@
1
+ # Configure Git Auth for Private Packages
2
+
3
+ This composite action configures git to use token authentication for private GitHub packages.
4
+
5
+ ## Usage
6
+
7
+ Add this step before installing dependencies that include private GitHub packages:
8
+
9
+ ```yaml
10
+ - name: Configure git auth for private packages
11
+ uses: ./.github/actions/configure-git-auth
12
+ with:
13
+ token: ${{ secrets.GH_PAT }}
14
+ ```
15
+
16
+ The `GH_PAT` secret should be a Personal Access Token with `repo` scope.
17
+
18
+ ## What It Does
19
+
20
+ This action runs:
21
+
22
+ ```bash
23
+ git config --global url."https://<token>@github.com/".insteadOf "https://github.com/"
24
+ ```
25
+
26
+ This tells git to automatically inject the token into all HTTPS GitHub URLs, enabling access to private repositories.
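To make the rewrite concrete, here is a minimal Python sketch of the prefix substitution that the `insteadOf` rule performs. It only illustrates the effect: git applies the rule internally, and the function name `rewrite_github_url` and the `TOKEN` placeholder are made up for this example.

```python
GITHUB_PREFIX = "https://github.com/"


def rewrite_github_url(url: str, token: str) -> str:
    """Mimic the effect of the insteadOf rule on one URL (illustration only)."""
    if url.startswith(GITHUB_PREFIX):
        # Swap the plain prefix for one that embeds the token.
        return f"https://{token}@github.com/" + url[len(GITHUB_PREFIX):]
    # URLs that do not start with the GitHub HTTPS prefix are left untouched.
    return url


print(rewrite_github_url("https://github.com/your-org/private-package.git", "TOKEN"))
print(rewrite_github_url("https://example.com/repo.git", "TOKEN"))
```

Only URLs that begin with `https://github.com/` are affected, so SSH remotes and other hosts keep working as before.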
27
+
28
+ ## When to Use
29
+
30
+ Use this action when your project has dependencies defined in `pyproject.toml` like:
31
+
32
+ ```toml
33
+ [tool.uv.sources]
34
+ private-package = { git = "https://github.com/your-org/private-package.git", rev = "v1.0.0" }
35
+ ```
36
+
37
+ ## Token Requirements
38
+
39
+ If no `token` input is provided, or the provided value is empty, this action falls back to the workflow’s built-in `GITHUB_TOKEN` (`github.token`); internally it evaluates `inputs.token || github.token`.
40
+
41
+ The `GITHUB_TOKEN` is usually sufficient when:
42
+
43
+ - installing dependencies hosted in the **same repository** as the workflow, or
44
+ - accessing **public** repositories.
45
+
46
+ The default `GITHUB_TOKEN` typically does **not** have permission to read other private repositories, even within the same organization. For that scenario, you should create a Personal Access Token (PAT) with `repo` scope and store it as `secrets.GH_PAT`, then pass it to the action via the `token` input.
47
+
48
+ If you configure the step as in the example (`token: ${{ secrets.GH_PAT }}`) and `secrets.GH_PAT` is not defined, GitHub Actions passes an empty string to the action. The composite action then falls back to `github.token`, so the configuration step itself still succeeds. However, any subsequent step that tries to access private repositories that are not covered by the permissions of `GITHUB_TOKEN` will fail with an authentication error.
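The fallback behaves like a "first non-empty value" selection. The Python sketch below models that logic; `resolve_token` is a hypothetical helper, not part of the action, and the token strings are placeholders.

```python
from typing import Optional


def resolve_token(input_token: Optional[str], default_token: str) -> str:
    """Return the explicit token if it is non-empty, otherwise the default one."""
    # An undefined secret arrives as an empty string, which is falsy, so `or`
    # picks the default token, analogous to `inputs.token || github.token`.
    return input_token or default_token


print(resolve_token("", "builtin-token"))        # secret not defined
print(resolve_token("my-pat", "builtin-token"))  # secret defined
```

The selection never fails on its own, which is why a missing `GH_PAT` only surfaces later, when a private repository is actually fetched.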
49
+ ## Example Workflow
50
+
51
+ ```yaml
52
+ name: CI
53
+
54
+ on: [push, pull_request]
55
+
56
+ jobs:
57
+ test:
58
+ runs-on: ubuntu-latest
59
+ steps:
60
+ - uses: actions/checkout@v6
61
+
62
+ - name: Install uv
63
+ uses: astral-sh/setup-uv@v7
64
+
65
+ - name: Configure git auth for private packages
66
+ uses: ./.github/actions/configure-git-auth
67
+ with:
68
+ token: ${{ secrets.GH_PAT }}
69
+
70
+ - name: Install dependencies
71
+ run: uv sync --frozen
72
+
73
+ - name: Run tests
74
+ run: uv run pytest
75
+ ```
76
+
77
+ ## See Also
78
+
79
+ - [PRIVATE_PACKAGES.md](../../../.rhiza/docs/PRIVATE_PACKAGES.md) - Complete guide to using private packages
80
+ - [TOKEN_SETUP.md](../../../.rhiza/docs/TOKEN_SETUP.md) - Setting up Personal Access Tokens
@@ -0,0 +1,122 @@
1
+ ### Python template
2
+ .idea
3
+ .venv
4
+ .ruff_cache
5
+ .ty_cache
6
+
7
+ # HTML outputs from docstring examples (e.g. report.save("output/..."))
8
+ output/
9
+
10
+ ### Don't expose API keys, etc.
11
+ .env
12
+
13
+ __marimo__
14
+
15
+ _tests
16
+ _book
17
+ _pdoc
18
+ docs/notebooks/*.html
19
+ docs/reports
20
+ docs/reports.md
21
+ _marimushka
22
+ _mkdocs
23
+ _benchmarks
24
+ _jupyter
25
+ _site
26
+
27
+ # LaTeX build artifacts
28
+ docs/paper/*.aux
29
+ docs/paper/*.fdb_latexmk
30
+ docs/paper/*.fls
31
+ docs/paper/*.log
32
+ docs/paper/*.out
33
+ docs/paper/*.toc
34
+
35
+ # temp file used by Junie
36
+ .output.txt
37
+
38
+ # folder used for programs, e.g. uv, uvx, task, etc.
39
+ bin
40
+
41
+ # Byte-compiled / optimized / DLL files
42
+ __pycache__/
43
+ *.py[cod]
44
+ *$py.class
45
+
46
+ # Generated presentation files
47
+ presentation.html
48
+ presentation.pdf
49
+ *.pptx
50
+
51
+ # C extensions
52
+ *.so
53
+
54
+ # .DS_Store
55
+ .DS_Store
56
+
57
+ # Distribution / packaging
58
+ .Python
59
+ build/
60
+ develop-eggs/
61
+ dist/
62
+ downloads/
63
+ eggs/
64
+ .eggs/
65
+ lib/
66
+ lib64/
67
+ parts/
68
+ sdist/
69
+ var/
70
+ wheels/
71
+ share/python-wheels/
72
+ *.egg-info/
73
+ .installed.cfg
74
+ *.egg
75
+ MANIFEST
76
+
77
+ # Installer logs
78
+ pip-log.txt
79
+ pip-delete-this-directory.txt
80
+
81
+ # Unit test / coverage reports
82
+ htmlcov/
83
+ .tox/
84
+ .nox/
85
+ .coverage
86
+ .coverage.*
87
+ .cache
88
+ nosetests.xml
89
+ coverage.xml
90
+ coverage.json
91
+ *.cover
92
+ *.py,cover
93
+ .hypothesis/
94
+ .benchmarks/
95
+ .pytest_cache/
96
+ cover/
97
+
98
+ # Security scanning baselines (regenerate as needed)
99
+ .bandit-baseline.json
100
+
101
+ # Translations
102
+ *.mo
103
+ *.pot
104
+
105
+ # Django stuff:
106
+ *.log
107
+ local_settings.py
108
+ db.sqlite3
109
+ db.sqlite3-journal
110
+
111
+ # Flask stuff:
112
+ instance/
113
+ .webassets-cache
114
+
115
+ # Cython debug symbols
116
+ cython_debug/
117
+
118
+ # Makefile
119
+ local.mk
120
+
121
+ .bandit-baseline.json
122
+
@@ -0,0 +1,27 @@
1
+ # Requirements Folder
2
+
3
+ This folder contains the development dependencies for the Rhiza project, organized by purpose.
4
+
5
+ ## Files
6
+
7
+ - **tests.txt** - Testing dependencies (pytest, pytest-cov, pytest-html, pytest-mock, PyYAML, defusedxml, hypothesis, pytest-benchmark, pygal)
8
+ - **marimo.txt** - Marimo notebook dependencies
9
+ - **docs.txt** - Documentation generation dependencies (interrogate, mkdocs, mkdocs-material, mkdocstrings)
10
+ - **tools.txt** - Development tools (pre-commit, python-dotenv, typer, ty)
11
+
12
+ ## Usage
13
+
14
+ These requirements files are automatically installed by the `make install` command.
15
+
16
+ To install specific requirement files manually:
17
+
18
+ ```bash
19
+ uv pip install -r .rhiza/requirements/tests.txt
20
+ uv pip install -r .rhiza/requirements/marimo.txt
21
+ uv pip install -r .rhiza/requirements/docs.txt
22
+ uv pip install -r .rhiza/requirements/tools.txt
23
+ ```
24
+
25
+ ## CI/CD
26
+
27
+ GitHub Actions workflows automatically install these requirements as needed.
@@ -0,0 +1,169 @@
1
+ # Rhiza Test Suite
2
+
3
+ This directory contains the comprehensive test suite for the Rhiza project.
4
+
5
+ ## Test Organization
6
+
7
+ Tests are organized into purpose-driven subdirectories:
8
+
9
+ ### `structure/`
10
+ Static assertions about file and directory presence. These tests verify that the repository contains the expected files, directories, and configuration structure without executing any subprocesses.
11
+
12
+ - `test_project_layout.py` — Validates root-level files and directories
13
+ - `test_requirements.py` — Validates `.rhiza/requirements/` structure
14
+
15
+ ### `api/`
16
+ Makefile target validation via dry-runs. These tests verify that Makefile targets are properly defined and would execute the expected commands.
17
+
18
+ - `test_makefile_targets.py` — Core Makefile targets (install, test, fmt, etc.)
19
+ - `test_makefile_api.py` — Makefile API (delegation, extension, hooks, overrides)
20
+ - `test_github_targets.py` — GitHub-specific Makefile targets
21
+
22
+ ### `integration/`
23
+ Tests requiring sandboxed git repositories or subprocess execution. These tests verify end-to-end workflows.
24
+
25
+ - `test_release.py` — Release script functionality
26
+ - `test_book_targets.py` — Documentation book build targets
27
+
28
+ ### `sync/`
29
+ Template sync, workflows, versioning, and content validation tests. These tests ensure that template synchronization and content validation work correctly.
30
+
31
+ - `test_rhiza_version.py` — Version reading and workflow validation
32
+ - `test_readme_validation.py` — README code block execution and validation
33
+ - `test_docstrings.py` — Doctest validation across source modules
34
+
35
+ #### Skipping README code blocks with `+RHIZA_SKIP`
36
+
37
+ By default, every `python` and `bash` code block in `README.md` is executed or
38
+ syntax-checked by `test_readme_validation.py`. To mark a block as intentionally
39
+ non-runnable (e.g. illustrative snippets or environment-specific commands), add
40
+ `+RHIZA_SKIP` to the opening fence line:
41
+
42
+ ~~~markdown
43
+ ```python +RHIZA_SKIP
44
+ # This block will NOT be executed or syntax-checked
45
+ from my_env import some_function
46
+ some_function()
47
+ ```
48
+
49
+ ```bash +RHIZA_SKIP
50
+ # This bash block will NOT be syntax-checked
51
+ run-something --only-on-ci
52
+ ```
53
+ ~~~
54
+
55
+ Markdown renderers (including GitHub) ignore everything after the first word on
56
+ a fence line, so the block still renders as a normal highlighted code block.
57
+ Blocks without `+RHIZA_SKIP` continue to be validated as before.
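As a rough illustration of the rule above, the following Python sketch shows one way a validator could separate blocks to check from blocks to skip. It is an assumption about the approach, not the code in `test_readme_validation.py`; the names `blocks_to_validate`, `BLOCK_RE`, and the sample `readme` text are invented for this example.

```python
import re

FENCE = "`" * 3  # built programmatically so the sample text needs no literal fences
SKIP_MARKER = "+RHIZA_SKIP"

# Opening fence, info string (rest of that line), body, closing fence.
BLOCK_RE = re.compile(
    rf"^{FENCE}(?P<info>[^\n]*)\n(?P<body>.*?)^{FENCE}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def blocks_to_validate(markdown, languages=("python", "bash")):
    """Yield (language, body) for fenced blocks that should be validated."""
    for match in BLOCK_RE.finditer(markdown):
        words = match.group("info").split()
        if not words or words[0] not in languages:
            continue  # not a language this sketch validates
        if SKIP_MARKER in words[1:]:
            continue  # explicitly marked as non-runnable
        yield words[0], match.group("body")


readme = "\n".join([
    FENCE + "python",
    "print('run me')",
    FENCE,
    "",
    FENCE + "python +RHIZA_SKIP",
    "from my_env import some_function",
    FENCE,
    "",
    FENCE + "bash +RHIZA_SKIP",
    "run-something --only-on-ci",
    FENCE,
    "",
])

for language, body in blocks_to_validate(readme):
    print(language, repr(body))
```

Because the marker is matched as a separate word after the language name, the first word of the fence line stays intact for syntax highlighting.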
58
+
59
+ ### `utils/`
60
+ Tests for utility code and test infrastructure. These tests validate the testing framework itself and utility scripts.
61
+
62
+ - `test_git_repo_fixture.py` — Validates the `git_repo` fixture
63
+
64
+ ### `deps/`
65
+ Dependency validation tests. These tests ensure that project dependencies are correctly specified and healthy.
66
+
67
+ - `test_dependency_health.py` — Validates pyproject.toml and requirements files
68
+
69
+ ### `stress/`
70
+ Stress tests that verify Rhiza's stability under heavy load. These tests execute Rhiza-specific operations under concurrent load and repeated execution to detect race conditions, resource leaks, and performance degradation.
71
+
72
+ - `test_makefile_stress.py` — Makefile operations under concurrent/repeated load
73
+ - `test_git_stress.py` — Git operations under concurrent load
74
+
75
+ See [stress/README.md](stress/README.md) for detailed documentation.
76
+
77
+ ## Running Tests
78
+
79
+ ### Run all tests
80
+ ```bash
81
+ uv run pytest .rhiza/tests/
82
+ # or
83
+ make test
84
+ ```
85
+
86
+ ### Run tests from a specific category
87
+ ```bash
88
+ uv run pytest .rhiza/tests/structure/
89
+ uv run pytest .rhiza/tests/api/
90
+ uv run pytest .rhiza/tests/integration/
91
+ uv run pytest .rhiza/tests/sync/
92
+ uv run pytest .rhiza/tests/utils/
93
+ uv run pytest .rhiza/tests/deps/
94
+ uv run pytest .rhiza/tests/stress/
95
+ ```
96
+
97
+ ### Run stress tests with custom parameters
98
+ ```bash
99
+ # Run all stress tests (default: 100 iterations, 10 workers)
100
+ uv run pytest .rhiza/tests/stress/ -v
101
+
102
+ # Run with fewer iterations (faster)
103
+ uv run pytest .rhiza/tests/stress/ -v --iterations=10
104
+
105
+ # Skip stress tests when running full test suite
106
+ uv run pytest .rhiza/tests/ -v -m "not stress"
107
+ ```
108
+
109
+ ### Run a specific test file
110
+ ```bash
111
+ uv run pytest .rhiza/tests/structure/test_project_layout.py
112
+ ```
113
+
114
+ ### Run with verbose output
115
+ ```bash
116
+ uv run pytest .rhiza/tests/ -v
117
+ ```
118
+
119
+ ### Run with coverage
120
+ ```bash
121
+ uv run pytest .rhiza/tests/ --cov
122
+ ```
123
+
124
+ ## Fixtures
125
+
126
+ ### Root-level fixtures (`conftest.py`)
127
+ - `root` — Repository root path (session-scoped)
128
+ - `logger` — Configured logger instance (session-scoped)
129
+ - `git_repo` — Sandboxed git repository (function-scoped)
130
+
131
+ ### Category-specific fixtures
132
+ - `api/conftest.py` — `setup_tmp_makefile`, `run_make`, `setup_rhiza_git_repo`
133
+ - `sync/conftest.py` — `setup_sync_env`
134
+
135
+ ## Writing Tests
136
+
137
+ ### Conventions
138
+ - Use descriptive test names that explain what is being tested
139
+ - Group related tests in classes when appropriate
140
+ - Use appropriate fixtures for setup/teardown
141
+ - Add docstrings to test modules and complex test functions
142
+ - Use `pytest.mark.skip` for tests that depend on optional features
143
+
144
+ ### Import Patterns
145
+ ```python
146
+ # Import shared helpers from test_utils
147
+ from test_utils import strip_ansi, run_make, setup_rhiza_git_repo
148
+
149
+ # Import from local category conftest (for fixtures and category-specific helpers)
150
+ from api.conftest import SPLIT_MAKEFILES, setup_tmp_makefile
151
+
152
+ # Note: Fixtures defined in conftest.py are automatically available in tests
153
+ # and don't need to be explicitly imported
154
+ ```
155
+
156
+ ## Test Coverage
157
+
158
+ The test suite aims for high coverage across:
159
+ - Configuration validation (structure, dependencies)
160
+ - Makefile target correctness (api)
161
+ - End-to-end workflows (integration)
162
+ - Template synchronization (sync)
163
+ - Utility code (utils)
164
+
165
+ ## Notes
166
+
167
+ - Benchmarks are located in `tests/benchmarks/` and run via `make benchmark`
168
+ - Integration tests use sandboxed git repositories to avoid affecting the working tree
169
+ - All Makefile tests use dry-run mode (`make -n`) to avoid side effects
@@ -0,0 +1,143 @@
1
+ # Stress Tests for Rhiza Framework
2
+
3
+ This directory contains stress tests that verify the stability and performance of the Rhiza framework under heavy load conditions.
4
+
5
+ ## Overview
6
+
7
+ Stress tests differ from regular integration tests and benchmarks:
8
+ - **Integration tests** verify that workflows execute correctly
9
+ - **Benchmarks** measure performance of individual operations
10
+ - **Stress tests** verify system stability under concurrent load and repeated operations
11
+
12
+ These tests focus specifically on Rhiza's core operations: Makefile execution and Git operations used by release and sync workflows.
13
+
14
+ ## Test Categories
15
+
16
+ ### 1. Makefile Stress Tests (`test_makefile_stress.py`)
17
+
18
+ Tests Rhiza's Makefile operations under stress:
19
+ - Concurrent invocations of targets (help, dry-run)
20
+ - Repeated executions to detect resource leaks
21
+ - Parallel variable printing and Makefile parsing
22
+
23
+ ### 2. Git Operations Stress Tests (`test_git_stress.py`)
24
+
25
+ Tests Git operations used by Rhiza (release scripts, sync) under concurrent load:
26
+ - Concurrent git status/log/diff/show commands
27
+ - Repeated git operations (status, log, branch, rev-parse)
28
+ - Rapid git rev-parse (used in release script)
29
+
30
+ ## Running Stress Tests
31
+
32
+ ### Run all stress tests
33
+ ```bash
34
+ uv run pytest .rhiza/tests/stress/ -v
35
+ ```
36
+
37
+ ### Run specific stress test category
38
+ ```bash
39
+ uv run pytest .rhiza/tests/stress/test_makefile_stress.py -v
40
+ uv run pytest .rhiza/tests/stress/test_git_stress.py -v
41
+ ```
42
+
43
+ ### Run with custom iteration count
44
+ ```bash
45
+ # Reduce iterations for faster testing
46
+ uv run pytest .rhiza/tests/stress/ -v --iterations=10
47
+
48
+ # Increase iterations for more thorough testing
49
+ uv run pytest .rhiza/tests/stress/ -v --iterations=500
50
+ ```
51
+
52
+ ### Run with custom worker count
53
+ ```bash
54
+ # Test with more concurrent workers
55
+ uv run pytest .rhiza/tests/stress/ -v --workers=20
56
+ ```
57
+
58
+ ### Skip stress tests (when running full test suite)
59
+ ```bash
60
+ uv run pytest .rhiza/tests/ -v -m "not stress"
61
+ ```
62
+
63
+ ## Test Markers
64
+
65
+ All tests in this directory are marked with `@pytest.mark.stress` to allow selective execution:
66
+
67
+ ```python
68
+ @pytest.mark.stress
69
+ def test_concurrent_operations():
70
+ # Test concurrent operations
71
+ pass
72
+ ```
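For readers unfamiliar with how such options and markers are wired up, here is a hedged sketch of the two pytest hooks a `conftest.py` could define. The actual conftest is not shown in this README, so the help texts and marker description are assumptions; only the option names and the defaults (100 iterations, 10 workers) mirror the values documented for this suite.

```python
def pytest_addoption(parser):
    """Register the command-line options used by the stress tests."""
    parser.addoption(
        "--iterations",
        action="store",
        type=int,
        default=100,
        help="Number of repetitions per stress test (assumed help text)",
    )
    parser.addoption(
        "--workers",
        action="store",
        type=int,
        default=10,
        help="Number of concurrent workers (assumed help text)",
    )


def pytest_configure(config):
    """Declare the custom marker so that `-m "not stress"` runs without warnings."""
    config.addinivalue_line("markers", "stress: long-running stress test")
```

Fixtures such as `stress_iterations` and `concurrent_workers` could then read these values with `request.config.getoption("--iterations")` and `request.config.getoption("--workers")`.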
73
+
74
+ ## Expected Behavior
75
+
76
+ Stress tests should:
77
+ 1. **Pass consistently** - No flakiness or race conditions
78
+ 2. **Complete in reasonable time** - Generally < 60 seconds per test
79
+ 3. **Clean up resources** - No leaked file handles, processes, or temporary files
80
+ 4. **Report clear failures** - When failures occur, provide actionable error messages
81
+
82
+ ## Acceptance Criteria
83
+
84
+ For Rhiza framework stress tests, we aim for:
85
+ - **100% success rate** - All operations should complete successfully
86
+ - **No resource leaks** - Memory and file handles should be cleaned up
87
+ - **Deterministic behavior** - Tests should produce consistent results
88
+ - **Reasonable performance** - Operations should complete within expected time bounds
89
+
90
+ ## Troubleshooting
91
+
92
+ ### Tests timeout
93
+ - Reduce iteration count: `pytest --iterations=10`
94
+ - Reduce worker count: `pytest --workers=5`
95
+ - Check system resources (CPU, memory, disk)
96
+
97
+ ### Intermittent failures
98
+ - Run with verbose output: `pytest -vv`
99
+ - Check for resource contention with other processes
100
+ - Verify git configuration (may affect git operations)
101
+
102
+ ### Out of memory errors
103
+ - Reduce concurrent workers
104
+ - Check for memory leaks in test code
105
+ - Ensure proper cleanup in fixtures
106
+
107
+ ## Contributing
108
+
109
+ When adding new stress tests:
110
+ 1. Use the `@pytest.mark.stress` decorator
111
+ 2. Use provided fixtures (`stress_iterations`, `concurrent_workers`)
112
+ 3. Ensure proper cleanup (use context managers or fixtures)
113
+ 4. Document expected behavior and acceptance criteria
114
+ 5. Keep tests focused on one stress scenario
115
+ 6. Provide clear assertion messages
116
+
117
+ Example:
118
+ ```python
119
+ import pytest
120
+ import concurrent.futures
121
+
122
+ @pytest.mark.stress
123
+ def test_concurrent_operation(stress_iterations, concurrent_workers):
124
+ """Test concurrent execution of operation X."""
125
+
126
+ def perform_operation():
127
+ # Operation to stress test
128
+ return True
129
+
130
+ with concurrent.futures.ThreadPoolExecutor(max_workers=concurrent_workers) as executor:
131
+ futures = [executor.submit(perform_operation) for _ in range(stress_iterations)]
132
+ results = [f.result() for f in concurrent.futures.as_completed(futures)]
133
+
134
+ success_rate = sum(results) / len(results)
135
+ assert success_rate == 1.0, f"Expected 100% success rate, got {success_rate * 100:.1f}%"
136
+ ```
137
+
138
+ ## See Also
139
+
140
+ - [Main Test README](../README.md) - Overview of all test categories
141
+ - [Integration Tests](../integration/) - End-to-end workflow tests
142
+ - [Benchmarks](../../../tests/benchmarks/) - Performance benchmarks
143
+ - [Property Tests](../../../tests/property/) - Property-based tests
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Jebel Quant Research
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,44 @@
1
+ Metadata-Version: 2.4
2
+ Name: cvx-linalg
3
+ Version: 0.2.0
4
+ Summary: Linear algebra utilities for portfolio optimization
5
+ Project-URL: repository, https://github.com/Jebel-Quant/linalg
6
+ Author-email: Thomas Schmelzer <thomas.schmelzer@gmail.com>
7
+ License-Expression: MIT
8
+ License-File: LICENSE
9
+ Classifier: Development Status :: 4 - Beta
10
+ Classifier: Intended Audience :: Financial and Insurance Industry
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: Programming Language :: Python :: 3 :: Only
13
+ Classifier: Programming Language :: Python :: 3.11
14
+ Classifier: Programming Language :: Python :: 3.12
15
+ Classifier: Programming Language :: Python :: 3.13
16
+ Classifier: Programming Language :: Python :: 3.14
17
+ Classifier: Topic :: Scientific/Engineering :: Mathematics
18
+ Requires-Python: >=3.11
19
+ Requires-Dist: numpy>=2.0.0
20
+ Requires-Dist: polars>=1.0.0
21
+ Description-Content-Type: text/markdown
22
+
23
+ # cvx-linalg
24
+
25
+ Linear algebra utilities for portfolio optimization, part of the [jebel-quant](https://github.com/jebel-quant) ecosystem.
26
+
27
+ ## Installation
28
+
29
+ ```bash
30
+ pip install cvx-linalg
31
+ ```
32
+
33
+ ## Usage
34
+
35
+ ```python
36
+ from cvx.linalg import cholesky, pca, rand_cov, valid
37
+ ```
38
+
39
+ ## Functions
40
+
41
+ - **`cholesky(cov)`** — Upper triangular Cholesky factor R such that R.T @ R = cov
42
+ - **`pca(returns, n_components)`** — Principal Component Analysis via SVD
43
+ - **`rand_cov(n, seed)`** — Random positive semi-definite covariance matrix
44
+ - **`valid(matrix)`** — Extract valid submatrix by removing rows/columns with non-finite diagonal entries
@@ -0,0 +1,22 @@
1
+ # cvx-linalg
2
+
3
+ Linear algebra utilities for portfolio optimization, part of the [jebel-quant](https://github.com/jebel-quant) ecosystem.
4
+
5
+ ## Installation
6
+
7
+ ```bash
8
+ pip install cvx-linalg
9
+ ```
10
+
11
+ ## Usage
12
+
13
+ ```python
14
+ from cvx.linalg import cholesky, pca, rand_cov, valid
15
+ ```
16
+
17
+ ## Functions
18
+
19
+ - **`cholesky(cov)`** — Upper triangular Cholesky factor R such that R.T @ R = cov
20
+ - **`pca(returns, n_components)`** — Principal Component Analysis via SVD
21
+ - **`rand_cov(n, seed)`** — Random positive semi-definite covariance matrix
22
+ - **`valid(matrix)`** — Extract valid submatrix by removing rows/columns with non-finite diagonal entries
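To illustrate what "upper triangular factor R such that R.T @ R = cov" means, here is a dependency-free Python sketch of the decomposition. It is not the package's implementation (which builds on NumPy); `upper_cholesky` is a hypothetical name, and the input is assumed to be symmetric positive definite.

```python
import math


def upper_cholesky(cov):
    """Return the upper triangular factor R with R^T R = cov (pure-Python sketch)."""
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                # Diagonal entry; math.sqrt raises ValueError if cov is not positive definite.
                lower[i][j] = math.sqrt(cov[i][i] - partial)
            else:
                lower[i][j] = (cov[i][j] - partial) / lower[j][j]
    # Transpose the lower factor to obtain the upper factor.
    return [[lower[j][i] for j in range(n)] for i in range(n)]


cov = [[4.0, 2.0], [2.0, 5.0]]
R = upper_cholesky(cov)
print(R)  # [[2.0, 1.0], [0.0, 2.0]]
```

Multiplying the transpose of `R` by `R` reproduces `cov`, which is the property the `cholesky` function in this package documents.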
@@ -0,0 +1,51 @@
1
+ [project]
2
+ name = "cvx-linalg"
3
+ version = "0.2.0"
4
+ description = "Linear algebra utilities for portfolio optimization"
5
+ readme = "README.md"
6
+ requires-python = ">=3.11"
7
+ dependencies = [
8
+ "numpy>=2.0.0",
9
+ "polars>=1.0.0",
10
+ ]
11
+ authors = [{name = "Thomas Schmelzer", email = "thomas.schmelzer@gmail.com"}]
12
+ license = "MIT"
13
+ license-files = ["LICENSE"]
14
+ classifiers = [
15
+ "Development Status :: 4 - Beta",
16
+ "Intended Audience :: Financial and Insurance Industry",
17
+ "Intended Audience :: Science/Research",
18
+ "Topic :: Scientific/Engineering :: Mathematics",
19
+ "Programming Language :: Python :: 3 :: Only",
20
+ "Programming Language :: Python :: 3.11",
21
+ "Programming Language :: Python :: 3.12",
22
+ "Programming Language :: Python :: 3.13",
23
+ "Programming Language :: Python :: 3.14",
24
+ ]
25
+
26
+ [project.urls]
27
+ repository = "https://github.com/Jebel-Quant/linalg"
28
+
29
+
30
+ [build-system]
31
+ requires = ["hatchling"]
32
+ build-backend = "hatchling.build"
33
+
34
+ [tool.hatch.build.targets.wheel]
35
+ packages = ["src/cvx"]
36
+
37
+ [tool.hatch.build]
38
+ include = [
39
+ "LICENSE",
40
+ "README.md",
41
+ "src/cvx/linalg",
42
+ ]
43
+
44
+ [tool.interrogate]
45
+ fail-under = 100
46
+ ignore-init-method = true
47
+ ignore-magic = true
48
+
49
+ [tool.deptry.package_module_name_map]
50
+ numpy = "numpy"
51
+ polars = "polars"
@@ -0,0 +1,40 @@
1
+ """Linear algebra utilities for risk models.
2
+
3
+ This subpackage provides linear algebra utilities commonly used in risk modeling,
4
+ including Cholesky decomposition, Principal Component Analysis, and matrix
5
+ validation.
6
+
7
+ Example:
8
+ >>> import numpy as np
9
+ >>> from cvx.linalg import cholesky, pca, rand_cov, valid
10
+ >>> # Cholesky decomposition
11
+ >>> cov = np.array([[4.0, 2.0], [2.0, 5.0]])
12
+ >>> R = cholesky(cov)
13
+ >>> np.allclose(R.T @ R, cov)
14
+ True
15
+
16
+ Functions:
17
+ cholesky: Compute upper triangular Cholesky decomposition
18
+ pca: Compute principal components of return data
19
+ rand_cov: Generate a random positive semi-definite covariance matrix
20
+ valid: Extract valid submatrix by removing rows/columns with non-finite diagonal entries
21
+
22
+ """
23
+
24
+ # Copyright 2023 Stanford University Convex Optimization Group
25
+ #
26
+ # Licensed under the Apache License, Version 2.0 (the "License");
27
+ # you may not use this file except in compliance with the License.
28
+ # You may obtain a copy of the License at
29
+ #
30
+ # http://www.apache.org/licenses/LICENSE-2.0
31
+ #
32
+ # Unless required by applicable law or agreed to in writing, software
33
+ # distributed under the License is distributed on an "AS IS" BASIS,
34
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
35
+ # See the License for the specific language governing permissions and
36
+ # limitations under the License.
37
+ from .cholesky import cholesky as cholesky
38
+ from .pca import pca as pca
39
+ from .rand_cov import rand_cov as rand_cov
40
+ from .valid import valid as valid
@@ -0,0 +1,117 @@
1
+ """Cholesky decomposition utilities for covariance matrices.
2
+
3
+ This module provides a function to compute the upper triangular Cholesky
4
+ decomposition of a positive definite covariance matrix.
5
+
6
+ Example:
7
+ Compute the Cholesky decomposition of a covariance matrix:
8
+
9
+ >>> import numpy as np
10
+ >>> from cvx.linalg import cholesky
11
+ >>> # Create a positive definite matrix
12
+ >>> cov = np.array([[4.0, 2.0], [2.0, 5.0]])
13
+ >>> # Compute upper triangular Cholesky factor
14
+ >>> R = cholesky(cov)
15
+ >>> # Verify: R.T @ R = cov
16
+ >>> np.allclose(R.T @ R, cov)
17
+ True
18
+
19
+ """
20
+
21
+ # Copyright 2023 Stanford University Convex Optimization Group
22
+ #
23
+ # Licensed under the Apache License, Version 2.0 (the "License");
24
+ # you may not use this file except in compliance with the License.
25
+ # You may obtain a copy of the License at
26
+ #
27
+ # http://www.apache.org/licenses/LICENSE-2.0
28
+ #
29
+ # Unless required by applicable law or agreed to in writing, software
30
+ # distributed under the License is distributed on an "AS IS" BASIS,
31
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
32
+ # See the License for the specific language governing permissions and
33
+ # limitations under the License.
34
+ from __future__ import annotations
35
+
36
+ import numpy as np
37
+ from numpy.linalg import cholesky as _cholesky
38
+
39
+
40
+ def cholesky(cov: np.ndarray) -> np.ndarray:
41
+ """Compute the upper triangular part of the Cholesky decomposition.
42
+
43
+ This function computes the Cholesky decomposition of a positive definite matrix
44
+ and returns the upper triangular matrix R such that R^T @ R = cov.
45
+
46
+ The Cholesky decomposition is useful in portfolio optimization because it
47
+ provides an efficient way to compute portfolio risk as ||R @ w||_2, where
48
+ w is the portfolio weights vector.
49
+
50
+ Args:
51
+ cov: A positive definite covariance matrix of shape (n, n).
52
+
53
+ Returns:
54
+ The upper triangular Cholesky factor R of shape (n, n).
55
+
56
+ Example:
57
+ Basic usage with a simple covariance matrix:
58
+
59
+ >>> import numpy as np
60
+ >>> from cvx.linalg import cholesky
61
+ >>> # Identity matrix
62
+ >>> cov = np.eye(3)
63
+ >>> R = cholesky(cov)
64
+ >>> np.allclose(R, np.eye(3))
65
+ True
66
+
67
+ With a more complex covariance matrix:
68
+
69
+ >>> cov = np.array([[1.0, 0.5, 0.0],
70
+ ... [0.5, 1.0, 0.5],
71
+ ... [0.0, 0.5, 1.0]])
72
+ >>> R = cholesky(cov)
73
+ >>> np.allclose(R.T @ R, cov)
74
+ True
75
+
76
+ Verify upper triangular structure:
77
+
78
+ >>> R = cholesky(np.array([[4.0, 2.0], [2.0, 5.0]]))
79
+ >>> # R is upper triangular (zeros below diagonal)
80
+ >>> bool(np.allclose(R[1, 0], 0.0))
81
+ True
82
+ >>> bool(R[0, 0] > 0 and R[1, 1] > 0) # Positive diagonal
83
+ True
84
+
85
+ Portfolio risk computation via Cholesky factor:
86
+
87
+ >>> cov = np.array([[0.04, 0.01], [0.01, 0.09]])
88
+ >>> R = cholesky(cov)
89
+ >>> w = np.array([0.6, 0.4])
90
+ >>> # Risk via Cholesky: ||R @ w||_2
91
+ >>> risk_chol = np.linalg.norm(R @ w)
92
+ >>> # Risk via covariance: sqrt(w^T @ cov @ w)
93
+ >>> risk_cov = np.sqrt(w @ cov @ w)
94
+ >>> bool(np.isclose(risk_chol, risk_cov))
95
+ True
96
+
97
+ Relationship between upper (R) and lower (L) triangular factors:
98
+
99
+ >>> cov = np.array([[9.0, 3.0], [3.0, 5.0]])
100
+ >>> R = cholesky(cov)
101
+ >>> L = np.linalg.cholesky(cov) # numpy returns lower triangular
102
+ >>> # R = L^T
103
+ >>> np.allclose(R, L.T)
104
+ True
105
+ >>> # Both reconstruct the covariance
106
+ >>> np.allclose(L @ L.T, cov)
107
+ True
108
+ >>> np.allclose(R.T @ R, cov)
109
+ True
110
+
111
+ Note:
112
+ This function returns the upper triangular factor (R), whereas
113
+ numpy.linalg.cholesky returns the lower triangular factor (L).
114
+ The relationship is: L @ L^T = cov and R^T @ R = cov, where R = L^T.
115
+
116
+ """
117
+ return _cholesky(cov).transpose()
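The risk identity quoted in the docstring follows directly from the defining property of the factor. Writing $\Sigma$ for `cov` (a notation introduced here, not used in the source) and using $R^\top R = \Sigma$:

$$
\lVert R w \rVert_2^2 = (R w)^\top (R w) = w^\top R^\top R \, w = w^\top \Sigma \, w
\quad\Longrightarrow\quad
\lVert R w \rVert_2 = \sqrt{w^\top \Sigma \, w}.
$$

For the values in the doctest, $\Sigma = \begin{pmatrix} 0.04 & 0.01 \\ 0.01 & 0.09 \end{pmatrix}$ and $w = (0.6, 0.4)^\top$:

$$
w^\top \Sigma \, w = 0.6^2 \cdot 0.04 + 2 \cdot 0.6 \cdot 0.4 \cdot 0.01 + 0.4^2 \cdot 0.09 = 0.0144 + 0.0048 + 0.0144 = 0.0336,
$$

so both expressions give a risk of $\sqrt{0.0336} \approx 0.1833$.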
@@ -0,0 +1,220 @@
1
+ # Copyright 2023 Stanford University Convex Optimization Group
2
+ #
3
+ # Licensed under the Apache License, Version 2.0 (the "License");
4
+ # you may not use this file except in compliance with the License.
5
+ # You may obtain a copy of the License at
6
+ #
7
+ # http://www.apache.org/licenses/LICENSE-2.0
8
+ #
9
+ # Unless required by applicable law or agreed to in writing, software
10
+ # distributed under the License is distributed on an "AS IS" BASIS,
11
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ # See the License for the specific language governing permissions and
13
+ # limitations under the License.
14
+ """PCA analysis (pure NumPy implementation).
15
+
16
+ This module provides Principal Component Analysis (PCA) for dimensionality
17
+ reduction of return data. PCA is commonly used to construct factor models
18
+ for portfolio optimization.
19
+
20
+ Example:
21
+ Perform PCA on stock returns:
22
+
23
+ >>> import numpy as np
24
+ >>> import polars as pl
25
+ >>> from cvx.linalg import pca
26
+ >>> # Create sample returns data
27
+ >>> np.random.seed(42)
28
+ >>> returns = pl.DataFrame(
29
+ ... np.random.randn(100, 5),
30
+ ... schema=['A', 'B', 'C', 'D', 'E']
31
+ ... )
32
+ >>> # Compute PCA with 3 components
33
+ >>> result = pca(returns, n_components=3)
34
+ >>> # Access explained variance
35
+ >>> len(result.explained_variance)
36
+ 3
37
+ >>> # Access factors (principal components)
38
+ >>> result.factors.shape
39
+ (100, 3)
40
+ >>> # Access factor exposures (loadings)
41
+ >>> result.exposure.shape
42
+ (3, 5)
43
+
44
+ """
45
+
46
+ from __future__ import annotations
47
+
48
+ from collections import namedtuple
49
+
50
+ import numpy as np
51
+ import polars as pl
52
+
53
+ PCA = namedtuple(
54
+ "PCA",
55
+ ["explained_variance", "factors", "exposure", "cov", "systematic", "idiosyncratic"],
56
+ )
57
+ """Named tuple containing the results of PCA analysis.
58
+
59
+ Attributes:
60
+ explained_variance: Explained variance ratio for each component.
61
+ An array of shape (n_components,) where each element represents
62
+ the proportion of total variance explained by that component.
63
+ factors: Factor returns (principal components) as a DataFrame.
64
+ Shape is (n_samples, n_components). Each column is a factor.
65
+ exposure: Factor exposures (loadings) for each asset as a DataFrame.
66
+ Shape is (n_components, n_assets). Each row contains the loadings
67
+ of one component on all assets.
68
+ cov: Covariance matrix of the factors as a DataFrame.
69
+ Shape is (n_components, n_components).
70
+ systematic: Systematic returns explained by the factors as a DataFrame.
71
+ Shape is (n_samples, n_assets). This is the part of returns
72
+ explained by the factor model.
73
+ idiosyncratic: Idiosyncratic returns not explained by factors as a DataFrame.
74
+ Shape is (n_samples, n_assets). This is the residual part of returns.
75
+
76
+ Example:
77
+ >>> import numpy as np
78
+ >>> import polars as pl
79
+ >>> from cvx.linalg import pca
80
+ >>> np.random.seed(42)
81
+ >>> returns = pl.DataFrame(np.random.randn(50, 4))
82
+ >>> result = pca(returns, n_components=2)
83
+ >>> # Check explained variance sums to less than 1
84
+ >>> result.explained_variance.sum() < 1
85
+ True
86
+ >>> # Systematic + idiosyncratic approximately equals original
87
+ >>> np.allclose(
88
+ ... result.systematic.to_numpy() + result.idiosyncratic.to_numpy(),
89
+ ... returns.to_numpy(),
90
+ ... atol=1e-10
91
+ ... )
92
+ True
93
+
94
+ """
95
+
96
+
97
+ def pca(returns: pl.DataFrame, n_components: int = 10) -> PCA:
98
+ """Compute the first n principal components for a return matrix using SVD.
99
+
100
+ This function performs Principal Component Analysis on asset returns to
101
+ extract the main sources of variance. The results can be used to construct
102
+ a factor model for portfolio optimization.
103
+
104
+ Args:
105
+ returns: DataFrame of asset returns with shape (n_samples, n_assets).
106
+ Rows represent time periods, columns represent assets.
107
+ n_components: Number of principal components to extract. Defaults to 10. Should not exceed min(n_samples, n_assets), the number of singular values the SVD returns.
108
+
109
+ Returns:
110
+ PCA named tuple containing:
111
+ - explained_variance: Ratio of variance explained by each component
112
+ - factors: Factor returns (scores)
113
+ - exposure: Factor exposures (loadings)
114
+ - cov: Factor covariance matrix
115
+ - systematic: Returns explained by factors
116
+ - idiosyncratic: Residual returns
117
+
118
+ Example:
119
+ Basic PCA on synthetic returns:
120
+
121
+ >>> import numpy as np
122
+ >>> import polars as pl
123
+ >>> from cvx.linalg import pca
124
+ >>> np.random.seed(42)
125
+ >>> # Create returns with 100 periods and 10 assets
126
+ >>> returns = pl.DataFrame(np.random.randn(100, 10))
127
+ >>> result = pca(returns, n_components=3)
128
+ >>> # First component explains most variance
129
+ >>> bool(result.explained_variance[0] > result.explained_variance[1])
130
+ True
131
+ >>> # Factors are orthogonal
132
+ >>> factor_corr = np.corrcoef(result.factors.to_numpy().T)
133
+ >>> bool(np.allclose(factor_corr, np.eye(3), atol=0.1))
134
+ True
135
+
136
+ Verifying variance decomposition (systematic + idiosyncratic = total):
137
+
138
+ >>> np.random.seed(123)
139
+ >>> returns = pl.DataFrame(np.random.randn(50, 5))
140
+ >>> result = pca(returns, n_components=3)
141
+ >>> # Systematic variance + idiosyncratic variance ≈ total variance
142
+ >>> total_var = np.var(returns.to_numpy(), axis=0, ddof=1).sum()
143
+ >>> systematic_var = np.var(result.systematic.to_numpy(), axis=0, ddof=1).sum()
144
+ >>> idio_var = np.var(result.idiosyncratic.to_numpy(), axis=0, ddof=1).sum()
145
+ >>> # The two parts are orthogonal, so the identity holds up to rounding error
146
+ >>> bool(np.isclose(systematic_var + idio_var, total_var, rtol=0.1))
147
+ True
148
+
149
+ Exposure matrix has orthonormal rows (loadings are orthogonal):
150
+
151
+ >>> np.random.seed(42)
152
+ >>> returns = pl.DataFrame(np.random.randn(100, 6))
153
+ >>> result = pca(returns, n_components=3)
154
+ >>> # V^T @ V should be identity (orthonormal loadings)
155
+ >>> VtV = result.exposure.to_numpy() @ result.exposure.to_numpy().T
156
+ >>> bool(np.allclose(VtV, np.eye(3), atol=1e-10))
157
+ True
158
+
159
+ Explained variance is ordered (first component explains most):
160
+
161
+ >>> all(result.explained_variance[i] >= result.explained_variance[i+1]
162
+ ... for i in range(len(result.explained_variance)-1))
163
+ True
164
+
165
+ Reconstructing returns from factors and exposures:
166
+
167
+ >>> # systematic = factors @ exposure (plus mean)
168
+ >>> reconstructed = result.factors.to_numpy() @ result.exposure.to_numpy()
169
+ >>> # Should match systematic (centered part)
170
+ >>> centered_systematic = result.systematic.to_numpy() - returns.to_numpy().mean(axis=0)
171
+ >>> bool(np.allclose(reconstructed, centered_systematic, atol=1e-10))
172
+ True
173
+
174
+ """
175
+ # Demean the returns
176
+ x = returns.to_numpy()
177
+ x_mean = x.mean(axis=0)
178
+ x_centered = x - x_mean
179
+
180
+ # Singular Value Decomposition
181
+ # x = u s V^T, where columns of V are principal axes
182
+ u, s_full, vt = np.linalg.svd(x_centered, full_matrices=False)
183
+
184
+ # Take only the first n components
185
+ u = u[:, :n_components]
186
+ s = s_full[:n_components]
187
+ vt = vt[:n_components, :]
188
+
189
+ pc_names = [f"PC{i + 1}" for i in range(n_components)]
190
+
191
+ # Factor exposures (loadings): each component's weight per asset
192
+ exposure = pl.DataFrame(vt, schema=returns.columns)
193
+
194
+ # Factor returns (scores): projection of data onto components
195
+ factors = pl.DataFrame(u * s, schema=pc_names)
196
+
197
+ # Explained variance ratio (normalize by total variance across ALL components)
198
+ explained_variance = (s**2) / np.sum(s_full**2)
199
+
200
+ # Covariance of factor returns
201
+ cov = pl.DataFrame(np.cov((u * s).T), schema=pc_names)
202
+
203
+ # Systematic + Idiosyncratic returns
204
+ systematic = pl.DataFrame(
205
+ (u * s) @ vt + x_mean,
206
+ schema=returns.columns,
207
+ )
208
+ idiosyncratic = pl.DataFrame(
209
+ x_centered - (u * s) @ vt,
210
+ schema=returns.columns,
211
+ )
212
+
213
+ return PCA(
214
+ explained_variance=explained_variance,
215
+ factors=factors,
216
+ exposure=exposure,
217
+ cov=cov,
218
+ systematic=systematic,
219
+ idiosyncratic=idiosyncratic,
220
+ )
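The SVD steps in `pca` can be traced with NumPy alone. The following is a minimal sketch, not the package's implementation: `pca_numpy` is a hypothetical helper, and clamping `n_components` to the number of available singular values is an assumption added here for illustration. It checks that the factor covariance is diagonal up to rounding and that `loadings.T @ factor_cov @ loadings` reproduces the covariance of the systematic part.

```python
import numpy as np


def pca_numpy(x: np.ndarray, n_components: int):
    """NumPy-only sketch of the SVD steps in pca(); a hypothetical helper."""
    # Assumption: clamp to the number of singular values that exist.
    k = min(n_components, *x.shape)
    x_centered = x - x.mean(axis=0)
    u, s_full, vt = np.linalg.svd(x_centered, full_matrices=False)
    scores = u[:, :k] * s_full[:k]  # factor returns, shape (n_samples, k)
    loadings = vt[:k, :]  # factor exposures, shape (k, n_assets)
    ratio = s_full[:k] ** 2 / np.sum(s_full**2)  # explained variance ratio
    return scores, loadings, ratio


rng = np.random.default_rng(0)
x = rng.standard_normal((100, 6))
scores, loadings, ratio = pca_numpy(x, n_components=3)

# Factor covariance: diagonal up to rounding, since the scores are uncorrelated.
factor_cov = np.cov(scores.T)
# Covariance implied by the factor model for the systematic returns.
systematic_cov = loadings.T @ factor_cov @ loadings
# Asking for more components than exist is clamped in this sketch.
clamped_scores, _, _ = pca_numpy(x, n_components=10)
print(factor_cov.shape, systematic_cov.shape, clamped_scores.shape)
```

Adding the covariance of the idiosyncratic returns to `systematic_cov` recovers the full sample covariance, because the two parts are orthogonal.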
@@ -0,0 +1,106 @@
1
+ """Random covariance matrix generation utilities.
2
+
3
+ This module provides functions for generating random positive semi-definite
4
+ covariance matrices. These are useful for testing and simulation purposes.
5
+
6
+ Example:
7
+ Generate a random covariance matrix:
8
+
9
+ >>> import numpy as np
10
+ >>> from cvx.linalg import rand_cov
11
+ >>> # Generate a 5x5 random covariance matrix
12
+ >>> cov = rand_cov(5, seed=42)
13
+ >>> cov.shape
14
+ (5, 5)
15
+ >>> # Verify it's symmetric
16
+ >>> bool(np.allclose(cov, cov.T))
17
+ True
18
+ >>> # Verify it's positive semi-definite
19
+ >>> bool(np.all(np.linalg.eigvals(cov) >= -1e-10))
20
+ True
21
+
22
+ """
23
+
24
+ # Copyright 2023 Stanford University Convex Optimization Group
25
+ #
26
+ # Licensed under the Apache License, Version 2.0 (the "License");
27
+ # you may not use this file except in compliance with the License.
28
+ # You may obtain a copy of the License at
29
+ #
30
+ # http://www.apache.org/licenses/LICENSE-2.0
31
+ #
32
+ # Unless required by applicable law or agreed to in writing, software
33
+ # distributed under the License is distributed on an "AS IS" BASIS,
34
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
35
+ # See the License for the specific language governing permissions and
36
+ # limitations under the License.
37
+ from __future__ import annotations
38
+
39
+ import numpy as np
40
+
41
+
42
+ def rand_cov(n: int, seed: int | None = None) -> np.ndarray:
43
+ """Construct a random positive semi-definite covariance matrix of size n x n.
44
+
45
+ The matrix is constructed as A^T @ A where A is a random n x n matrix with
46
+ elements drawn from a standard normal distribution. This ensures the result
47
+ is symmetric and positive semi-definite.
48
+
49
+ Args:
50
+ n: Size of the covariance matrix (n x n).
51
+ seed: Random seed for reproducibility. If None, the generator is
52
+ seeded with fresh, unpredictable entropy from the operating system.
53
+
54
+ Returns:
55
+ A random positive semi-definite n x n covariance matrix.
56
+
57
+ Example:
58
+ Generate a reproducible random covariance matrix:
59
+
60
+ >>> import numpy as np
61
+ >>> from cvx.linalg import rand_cov
62
+ >>> cov1 = rand_cov(3, seed=42)
63
+ >>> cov2 = rand_cov(3, seed=42)
64
+ >>> np.allclose(cov1, cov2)
65
+ True
66
+
67
+ Verify positive definiteness via Cholesky decomposition:
68
+
69
+ >>> cov = rand_cov(5, seed=123)
70
+ >>> # If Cholesky succeeds without error, matrix is positive definite
71
+ >>> L = np.linalg.cholesky(cov)
72
+ >>> bool(np.allclose(L @ L.T, cov))
73
+ True
74
+
75
+ Eigenvalue verification:
76
+
77
+ >>> cov = rand_cov(3, seed=99)
78
+ >>> eigenvalues = np.linalg.eigvalsh(cov)
79
+ >>> # All eigenvalues should be positive for PD matrix
80
+ >>> bool(np.all(eigenvalues > 0))
81
+ True
82
+
83
+ Different seeds produce different matrices:
84
+
85
+ >>> cov1 = rand_cov(3, seed=1)
86
+ >>> cov2 = rand_cov(3, seed=2)
87
+ >>> bool(not np.allclose(cov1, cov2))
88
+ True
89
+
90
+ Without a seed, consecutive calls draw fresh entropy and almost surely differ:
91
+
92
+ >>> # Only the shapes are deterministic, so the values are not compared
93
+ >>> cov_a = rand_cov(2, seed=None)
94
+ >>> cov_b = rand_cov(2, seed=None)
95
+ >>> cov_a.shape == cov_b.shape == (2, 2)
96
+ True
97
+
98
+ Note:
99
+ The generated matrix is guaranteed to be positive semi-definite because
100
+ it is constructed as A^T @ A. A square Gaussian matrix A is almost surely
101
+ invertible, so the result is positive definite with probability one; for large n the smallest eigenvalue can still be numerically close to zero.
102
+
103
+ """
104
+ rng = np.random.default_rng(seed)
105
+ a = rng.standard_normal((n, n))
106
+ return np.transpose(a) @ a
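The seeding behaviour follows from `np.random.default_rng`. The sketch below uses a stand-in, `rand_cov_sketch` (a hypothetical name that repeats the construction above so the example runs without the package), to show three things: an integer seed is reproducible, the legacy global `np.random.seed` does not affect unseeded calls, and an existing `Generator` can be passed through. The last point is a property of `default_rng`, not a documented feature of `rand_cov`, whose type hint only allows `int | None`.

```python
import numpy as np


def rand_cov_sketch(n, seed=None):
    """Same A^T @ A construction as rand_cov; a stand-in so this runs alone."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n))
    return a.T @ a


# An integer seed gives a reproducible matrix.
reproducible = np.allclose(rand_cov_sketch(4, seed=7), rand_cov_sketch(4, seed=7))

# The legacy global seed has no effect on default_rng(None).
np.random.seed(0)
first = rand_cov_sketch(4)
np.random.seed(0)
second = rand_cov_sketch(4)
independent_of_global_seed = not np.allclose(first, second)

# default_rng returns an existing Generator unaltered, so one stream can be shared.
shared = np.random.default_rng(7)
cov_a = rand_cov_sketch(3, seed=shared)
cov_b = rand_cov_sketch(3, seed=shared)  # continues the same stream, so it differs

smallest_eigenvalue = np.linalg.eigvalsh(cov_a).min()
print(reproducible, independent_of_global_seed, smallest_eigenvalue)
```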
@@ -0,0 +1,138 @@
1
+ """Matrix validation utilities for handling non-finite values.
2
+
3
+ This module provides functions for validating and cleaning matrices that may
4
+ contain non-finite values (NaN or infinity). This is particularly useful when
5
+ working with financial data where missing values are common.
6
+
7
+ Example:
8
+ Extract the valid submatrix from a covariance matrix with missing data:
9
+
10
+ >>> import numpy as np
11
+ >>> from cvx.linalg import valid
12
+ >>> # Create a covariance matrix with some NaN values on diagonal
13
+ >>> cov = np.array([[np.nan, 0.5, 0.2],
14
+ ... [0.5, 2.0, 0.3],
15
+ ... [0.2, 0.3, np.nan]])
16
+ >>> # Get valid indicator and submatrix
17
+ >>> v, submatrix = valid(cov)
18
+ >>> v # Second row/column is valid
19
+ array([False,  True, False])
20
+ >>> submatrix
21
+ array([[2.]])
22
+
23
+ """
24
+
25
+ # Copyright 2023 Stanford University Convex Optimization Group
26
+ #
27
+ # Licensed under the Apache License, Version 2.0 (the "License");
28
+ # you may not use this file except in compliance with the License.
29
+ # You may obtain a copy of the License at
30
+ #
31
+ # http://www.apache.org/licenses/LICENSE-2.0
32
+ #
33
+ # Unless required by applicable law or agreed to in writing, software
34
+ # distributed under the License is distributed on an "AS IS" BASIS,
35
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
36
+ # See the License for the specific language governing permissions and
37
+ # limitations under the License.
38
+ from __future__ import annotations
39
+
40
+ import numpy as np
41
+
42
+
43
+ def valid(matrix: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
44
+ """Extract the valid subset of a matrix by removing rows/columns with non-finite values.
45
+
46
+ This function identifies rows and columns in a square matrix that contain
47
+ non-finite values (NaN or infinity) on the diagonal and removes them,
48
+ returning both the indicator vector and the resulting valid submatrix.
49
+
50
+ This is useful when working with covariance matrices where some assets
51
+ may have missing or invalid data.
52
+
53
+ Args:
54
+ matrix: A square n x n matrix to be validated. Typically a covariance
55
+ or correlation matrix.
56
+
57
+ Returns:
58
+ A tuple containing:
59
+ - v: Boolean vector of shape (n,) indicating which rows/columns are
60
+ valid (True for valid, False for invalid).
61
+ - submatrix: The valid submatrix with invalid rows/columns removed.
62
+ Shape is (k, k) where k is the number of True values in v.
63
+
64
+ Raises:
65
+ AssertionError: If the input matrix is not square (n x n).
66
+
67
+ Example:
68
+ Basic usage with a covariance matrix:
69
+
70
+ >>> import numpy as np
71
+ >>> from cvx.linalg import valid
72
+ >>> # Create a 3x3 matrix with one invalid entry
73
+ >>> cov = np.array([[1.0, 0.5, 0.2],
74
+ ... [0.5, np.nan, 0.3],
75
+ ... [0.2, 0.3, 1.0]])
76
+ >>> v, submatrix = valid(cov)
77
+ >>> v
78
+ array([ True, False,  True])
79
+ >>> submatrix
80
+ array([[1. , 0.2],
81
+ [0.2, 1. ]])
82
+
83
+ Handling a fully valid matrix:
84
+
85
+ >>> cov = np.array([[1.0, 0.5], [0.5, 1.0]])
86
+ >>> v, submatrix = valid(cov)
87
+ >>> v
88
+ array([ True,  True])
89
+ >>> np.allclose(submatrix, cov)
90
+ True
91
+
92
+ Handling infinity values:
93
+
94
+ >>> cov = np.array([[1.0, 0.5, 0.2],
95
+ ... [0.5, np.inf, 0.3],
96
+ ... [0.2, 0.3, 1.0]])
97
+ >>> v, submatrix = valid(cov)
98
+ >>> v
99
+ array([ True, False,  True])
100
+ >>> submatrix
101
+ array([[1. , 0.2],
102
+ [0.2, 1. ]])
103
+
104
+ Multiple invalid entries:
105
+
106
+ >>> cov = np.array([[np.nan, 0.1, 0.2, 0.3],
107
+ ... [0.1, 2.0, 0.4, 0.5],
108
+ ... [0.2, 0.4, np.nan, 0.6],
109
+ ... [0.3, 0.5, 0.6, 3.0]])
110
+ >>> v, submatrix = valid(cov)
111
+ >>> v
112
+ array([False,  True, False,  True])
113
+ >>> submatrix.shape
114
+ (2, 2)
115
+ >>> submatrix
116
+ array([[2. , 0.5],
117
+ [0.5, 3. ]])
118
+
119
+ Non-square matrix raises assertion:
120
+
121
+ >>> try:
122
+ ... valid(np.array([[1, 2, 3], [4, 5, 6]]))
123
+ ... except AssertionError:
124
+ ... print("Caught assertion error for non-square matrix")
125
+ Caught assertion error for non-square matrix
126
+
127
+ Note:
128
+ The function checks only the diagonal elements for validity. It assumes
129
+ that if the diagonal is finite, the entire row/column is valid. This is
130
+ a common assumption for covariance matrices.
131
+
132
+ """
133
+ # make sure the matrix is square
134
+ if matrix.shape[0] != matrix.shape[1]:
135
+ raise AssertionError
136
+
137
+ v = np.isfinite(np.diag(matrix))
138
+ return v, matrix[:, v][v]
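Because `valid` inspects only the diagonal, a non-finite off-diagonal entry between two kept assets passes through unnoticed. Where that assumption is too weak, one option is a stricter check on the extracted submatrix. The sketch below is an illustration only: `valid_strict` is a hypothetical name, and raising `ValueError` is a choice made here rather than the package's behaviour.

```python
import numpy as np


def valid_strict(matrix: np.ndarray):
    """Hypothetical stricter variant of valid(); not part of the package."""
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        raise ValueError("matrix must be square")
    # Same diagonal test as valid()
    v = np.isfinite(np.diag(matrix))
    sub = matrix[np.ix_(v, v)]
    # Extra step: reject non-finite entries that survive the diagonal filter
    if not np.isfinite(sub).all():
        raise ValueError("non-finite off-diagonal entry among the kept rows/columns")
    return v, sub


# An asset with no data: its whole row and column are NaN and get dropped.
cov = np.array([[1.0, np.nan, 0.2],
                [np.nan, np.nan, np.nan],
                [0.2, np.nan, 1.0]])
v, sub = valid_strict(cov)

# Finite diagonal but NaN covariance between the two assets: rejected here.
bad = np.array([[1.0, np.nan], [np.nan, 1.0]])
try:
    valid_strict(bad)
    rejected = False
except ValueError:
    rejected = True
```

Checking the submatrix rather than every full row keeps the common case working, where a missing asset contaminates one entry in every other row.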