PyPI - pyhrp - Versions diffs - 0.0.0__tar.gz - Mend

pyhrp 0.0.0__tar.gz


Files changed (10)

pyhrp-0.0.0/LICENSE.txt ADDED

@@ -0,0 +1,21 @@
+The MIT License (MIT)
+Copyright (c) 2020 Thomas Schmelzer
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

pyhrp-0.0.0/PKG-INFO ADDED

@@ -0,0 +1,65 @@
+Metadata-Version: 2.1
+Name: pyhrp
+Version: 0.0.0
+Summary: ...
+Home-page: https://github.com/tschm/pyhrp
+Author: Thomas Schmelzer
+Requires-Python: >=3.9.0
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Requires-Dist: matplotlib (>=3.3.3)
+Requires-Dist: pandas (>=1.2.0)
+Requires-Dist: scikit-learn (>=0.24.1)
+Requires-Dist: scipy (>=1.6.0)
+Project-URL: Repository, https://github.com/tschm/pyhrp
+Description-Content-Type: text/markdown
+# pyhrp
+[![DeepSource](https://deepsource.io/gh/tschm/hrp.svg/?label=active+issues&show_trend=true&token=qjT_aLQgo_1Xbe2Z9ZNdH3Cx)](https://deepsource.io/gh/tschm/hrp/?ref=repository-badge)
+A recursive implementation of the Hierarchical Risk Parity (hrp) approach by Marcos Lopez de Prado.
+We make heavy use of the scipy.cluster.hierarchy package.
+Here's a simple example:
+```python
+import pandas as pd
+from pyhrp.graph import dendrogram
+from pyhrp.hrp import build_cluster, dist, linkage, tree
+prices = pd.read_csv("test/resources/stock_prices.csv", index_col=0, parse_dates=True)
+returns = prices.pct_change().dropna(axis=0, how="all")
+cov, cor = returns.cov(), returns.corr()
+links = linkage(dist(cor.values), method='ward')
+node = tree(links)
+rootcluster = build_cluster(node, cov)
+ax = dendrogram(links, orientation="left")
+ax.get_figure().savefig("dendrogram.png")
+```
+For your convenience, you can bypass the construction of the covariance and correlation matrices, the links and the node, i.e. the root of the tree (dendrogram).
+```python
+import pandas as pd
+from pyhrp.hrp import hrp
+prices = pd.read_csv("test/resources/stock_prices.csv", index_col=0, parse_dates=True)
+root = hrp(prices=prices)
+```
+You may expect a weight series here, but the `hrp` function instead returns a `Cluster` object. The `Cluster` simplifies all further post-analysis.
+```python
+print(root.weights)
+print(root.variance)
+# You can drill into the graph by going downstream
+print(root.left)
+print(root.right)
+```
+## Installation:
+```
+pip install pyhrp
+```

pyhrp-0.0.0/README.md ADDED

@@ -0,0 +1,46 @@
+# pyhrp
+[![DeepSource](https://deepsource.io/gh/tschm/hrp.svg/?label=active+issues&show_trend=true&token=qjT_aLQgo_1Xbe2Z9ZNdH3Cx)](https://deepsource.io/gh/tschm/hrp/?ref=repository-badge)
+A recursive implementation of the Hierarchical Risk Parity (hrp) approach by Marcos Lopez de Prado.
+We make heavy use of the scipy.cluster.hierarchy package.
+Here's a simple example:
+```python
+import pandas as pd
+from pyhrp.graph import dendrogram
+from pyhrp.hrp import build_cluster, dist, linkage, tree
+prices = pd.read_csv("test/resources/stock_prices.csv", index_col=0, parse_dates=True)
+returns = prices.pct_change().dropna(axis=0, how="all")
+cov, cor = returns.cov(), returns.corr()
+links = linkage(dist(cor.values), method='ward')
+node = tree(links)
+rootcluster = build_cluster(node, cov)
+ax = dendrogram(links, orientation="left")
+ax.get_figure().savefig("dendrogram.png")
+```
+For your convenience, you can bypass the construction of the covariance and correlation matrices, the links and the node, i.e. the root of the tree (dendrogram).
+```python
+import pandas as pd
+from pyhrp.hrp import hrp
+prices = pd.read_csv("test/resources/stock_prices.csv", index_col=0, parse_dates=True)
+root = hrp(prices=prices)
+```
+You may expect a weight series here, but the `hrp` function instead returns a `Cluster` object. The `Cluster` simplifies all further post-analysis.
+```python
+print(root.weights)
+print(root.variance)
+# You can drill into the graph by going downstream
+print(root.left)
+print(root.right)
+```
+## Installation:
+```
+pip install pyhrp
+```

pyhrp-0.0.0/pyhrp/__init__.py ADDED

File without changes

pyhrp-0.0.0/pyhrp/cluster.py ADDED

@@ -0,0 +1,97 @@
+"""risk parity for clusters"""
+from dataclasses import dataclass
+from typing import Dict
+import numpy as np
+import pandas as pd
+def risk_parity(cluster_left, cluster_right, cov):
+    """
+    Given two clusters compute in a bottom-up approach their parent.
+    :param cluster_left: left cluster
+    :param cluster_right: right cluster
+    :param cov: (global) covariance matrix. Will pick the correct sub-matrix
+    """
+    # combine two clusters
+    def parity(v_left, v_right):
+        """
+        Compute the weights for a risk parity portfolio of two assets
+        :param v_left: Variance of the "left" portfolio
+        :param v_right: Variance of the "right" portfolio
+        :return: w, 1-w the weights for the left and the right portfolio.
+                 Since w*v_left == (1-w)*v_right, we get w = v_right / (v_right + v_left)
+        """
+        return v_right / (v_left + v_right), v_left / (v_left + v_right)
+    if not set(cluster_left.assets).isdisjoint(set(cluster_right.assets)):
+        raise AssertionError
+    # split is s.t. v_left * alpha_left == v_right * alpha_right and alpha_left + alpha_right = 1
+    alpha_left, alpha_right = parity(cluster_left.variance, cluster_right.variance)
+    # assets in the cluster are the assets of the left and right cluster
+    # further downstream
+    assets = {
+        **(alpha_left * cluster_left.weights).to_dict(),
+        **(alpha_right * cluster_right.weights).to_dict(),
+    }
+    weights = np.array(list(assets.values()))
+    # pick the sub-matrix for the assets of this cluster (rows and columns)
+    covariance = cov.loc[list(assets), list(assets)]
+    var = np.linalg.multi_dot((weights, covariance, weights))
+    return Cluster(
+        assets=assets,
+        variance=var,
+        left=cluster_left,
+        right=cluster_right,
+    )
+@dataclass(frozen=True)
+class Cluster:
+    """
+    Clusters are the nodes of the graphs we build.
+    Each cluster is aware of the left and the right cluster
+    it is connecting to.
+    """
+    assets: Dict[str, float]
+    variance: float
+    left: object = None
+    right: object = None
+    def __post_init__(self):
+        """check input"""
+        if self.variance <= 0:
+            raise AssertionError
+        if self.left is None:
+            # if there is no left, there can't be a right
+            if self.right is not None:
+                raise AssertionError
+        else:
+            # left is not None, hence both left and right have to be clusters
+            if not isinstance(self.left, Cluster):
+                raise AssertionError
+            if not isinstance(self.right, Cluster):
+                raise AssertionError
+            if not set(self.left.assets.keys()).isdisjoint(
+                set(self.right.assets.keys())
+            ):
+                raise AssertionError
+    def is_leaf(self):
+        """true if this cluster is a leaf, i.e. no clusters follow downstream"""
+        return self.left is None and self.right is None
+    @property
+    def weights(self):
+        """weight series"""
+        return pd.Series(self.assets, name="Weights").sort_index()

pyhrp-0.0.0/pyhrp/graph.py ADDED

@@ -0,0 +1,11 @@
+"""display a dendrogram"""
+import matplotlib.pyplot as plt
+import scipy.cluster.hierarchy as sch
+def dendrogram(links, ax=None, **kwargs):
+    """Plot a dendrogram using matplotlib"""
+    if ax is None:
+        _, ax = plt.subplots(figsize=(25, 20))
+    sch.dendrogram(links, ax=ax, **kwargs)
+    return ax

pyhrp-0.0.0/pyhrp/hrp.py ADDED

@@ -0,0 +1,72 @@
+"""the hrp algorithm"""
+import numpy as np
+import scipy.cluster.hierarchy as sch
+import scipy.spatial.distance as ssd
+from pyhrp.cluster import Cluster, risk_parity
+def dist(cor):
+    """
+    Compute the correlation based distance matrix d,
+    compare with page 239 of the first book by Marcos Lopez de Prado
+    :param cor: the n x n correlation matrix
+    :return: The condensed form of the matrix d, where d[i, j] is the
+             distance between column i and column j.
+             Note that all the diagonal entries are zero.
+    """
+    # https://stackoverflow.com/questions/18952587/
+    matrix = np.sqrt(np.clip((1.0 - cor) / 2.0, a_min=0.0, a_max=1.0))
+    np.fill_diagonal(matrix, val=0.0)
+    return ssd.squareform(matrix)
+def linkage(dist_vec, method="ward", **kwargs):
+    """
+    Based on distance matrix compute the underlying links
+    :param dist_vec: The distance vector based on the correlation matrix
+    :param method: "single", "ward", etc.
+    :return: links  The links describing the graph (useful to draw the dendrogram)
+                    and basis for constructing the tree object
+    """
+    # compute the linkage matrix from the condensed distance vector
+    return sch.linkage(dist_vec, method=method, **kwargs)
+def tree(links):
+    """
+    Compute the root ClusterNode.
+    :param links: The Linkage matrix compiled by the linkage function above
+    :return: The root node. From there it's possible to reach the entire graph
+    """
+    return sch.to_tree(links, rd=False)
+def build_cluster(node, cov):
+    """compute a cluster"""
+    if node.is_leaf():
+        # a node is a leaf if it has no further relatives downstream.
+        # no leaves, no branches, ...
+        asset = cov.keys().to_list()[node.id]
+        return Cluster(assets={asset: 1.0}, variance=cov[asset][asset])
+    # drill down on the left
+    cluster_left = build_cluster(node.left, cov)
+    # drill down on the right
+    cluster_right = build_cluster(node.right, cov)
+    # combine left and right into a new cluster
+    return risk_parity(cluster_left, cluster_right, cov=cov)
+def hrp(prices, node=None, method="single"):
+    """
+    Computes the root node for the hierarchical risk parity portfolio
+    :param prices: DataFrame of asset prices, one column per asset
+    :param node: Optional. This is the rootnode of the graph describing the dendrogram
+    :param method: linkage method used if no node is given, e.g. "single" or "ward"
+    :return: the root cluster of the risk parity portfolio
+    """
+    returns = prices.pct_change().dropna(axis=0, how="all")
+    cov, cor = returns.cov(), returns.corr()
+    node = node or tree(linkage(dist(cor.values), method=method))
+    return build_cluster(node, cov)

pyhrp-0.0.0/pyhrp/marcos.py ADDED

@@ -0,0 +1,67 @@
+"""Replicate the implementation of HRP by Marcos Lopez de Prado using this package.
+The original implementation by Marcos Lopez de Prado uses recursive bisection
+on a ranked list of columns of the covariance matrix.
+To get to this list, Lopez de Prado uses a matrix quasi-diagonalization
+induced by the order (from left to right) of the dendrogram.
+Based on that we build a tree reflecting the recursive bisection.
+With that tree and the covariance matrix we go back to the hrp algorithm."""
+import pandas as pd
+import scipy.cluster.hierarchy as sch
+from pyhrp.hrp import build_cluster, dist, linkage, tree
+def bisection(ids):
+    """
+    Compute the graph underlying the recursive bisection of Marcos Lopez de Prado
+    :param ids: A (ranked) list of unique indices
+    :return: The root ClusterNode of this tree
+    """
+    def split(ids):
+        """split the vector ids in two parts, split in the middle"""
+        if len(ids) < 2:
+            raise AssertionError
+        num = len(ids)
+        return ids[: num // 2], ids[num // 2 :]
+    if len(ids) < 1:
+        raise AssertionError
+    if len(ids) != len(set(ids)):
+        raise AssertionError
+    if len(ids) == 1:
+        return sch.ClusterNode(id=ids[0])
+    left, right = split(ids)
+    return sch.ClusterNode(id=0, left=bisection(ids=left), right=bisection(ids=right))
+def marcos(prices, node=None, method=None):
+    """The algorithm as implemented in the book by Marcos Lopez de Prado"""
+    # make sure the prices are a DataFrame
+    if not isinstance(prices, pd.DataFrame):
+        raise AssertionError
+    # convert into returns
+    returns = prices.pct_change().dropna(axis=0, how="all")
+    # compute covariance matrix and correlation matrices (both as DataFrames)
+    cov, cor = returns.cov(), returns.corr()
+    # Compute the root node of the tree
+    method = method or "ward"
+    node = node or tree(linkage(dist(cor.values), method=method))
+    # this is an interesting step
+    ids = node.pre_order()
+    # apply bisection, root is now a ClusterNode of the graph
+    root = bisection(ids=ids)
+    # It's not clear to me why Marcos is going down this route
+    # rather than sticking with the graph computed above.
+    return build_cluster(node=root, cov=cov)

pyhrp-0.0.0/pyproject.toml ADDED

@@ -0,0 +1,23 @@
+[tool.poetry]
+name = "pyhrp"
+version = "0.0.0"
+description = "..."
+authors = ["Thomas Schmelzer"]
+readme = "README.md"
+repository = "https://github.com/tschm/pyhrp"
+packages = [{include = "pyhrp"}]
+[tool.poetry.dependencies]
+python = ">=3.9.0"
+pandas = ">=1.2.0"
+scipy = ">=1.6.0"
+matplotlib = ">=3.3.3"
+scikit-learn = ">=0.24.1"
+[tool.poetry.dev-dependencies]
+pytest = "7.2.0"
+pytest-cov = "*"
+[build-system]
+requires = ["poetry>=1.0.2"]
+build-backend = "poetry.masonry.api"

pyhrp-0.0.0/setup.py ADDED

@@ -0,0 +1,30 @@
+# -*- coding: utf-8 -*-
+from setuptools import setup
+packages = \
+['pyhrp']
+package_data = \
+{'': ['*']}
+install_requires = \
+['matplotlib>=3.3.3', 'pandas>=1.2.0', 'scikit-learn>=0.24.1', 'scipy>=1.6.0']
+setup_kwargs = {
+    'name': 'pyhrp',
+    'version': '0.0.0',
+    'description': '...',
+    'long_description': '# pyhrp\n\n[![DeepSource](https://deepsource.io/gh/tschm/hrp.svg/?label=active+issues&show_trend=true&token=qjT_aLQgo_1Xbe2Z9ZNdH3Cx)](https://deepsource.io/gh/tschm/hrp/?ref=repository-badge)\n\nA recursive implementation of the Hierarchical Risk Parity (hrp) approach by Marcos Lopez de Prado.\nWe take heavily advantage of the scipy.cluster.hierarchy package. \n\nHere\'s a simple example\n\n```python\nimport pandas as pd\nfrom pyhrp.hrp import dist, linkage, tree, _hrp\n\nprices = pd.read_csv("test/resources/stock_prices.csv", index_col=0, parse_dates=True)\n\nreturns = prices.pct_change().dropna(axis=0, how="all")\ncov, cor = returns.cov(), returns.corr()\nlinks = linkage(dist(cor.values), method=\'ward\')\nnode = tree(links)\n\nrootcluster = _hrp(node, cov)\n\nax = dendrogram(links, orientation="left")\nax.get_figure().savefig("dendrogram.png")\n```\nFor your convenience you can bypass the construction of the covariance and correlation matrix, the links and the node, e.g. the root of the tree (dendrogram).\n```python\nimport pandas as pd\nfrom pyhrp.hrp import hrp\n\nprices = pd.read_csv("test/resources/stock_prices.csv", index_col=0, parse_dates=True)\nroot = hrp(prices=prices)\n```\nYou may expect a weight series here but instead the `hrp` function returns a `Cluster` object. The `Cluster` simplifies all further post-analysis.\n```python\nprint(cluster.weights)\nprint(cluster.variance)\n# You can drill into the graph by going downstream\nprint(cluster.left)\nprint(cluster.right)\n```\n\n## Installation:\n```\npip install pyhpr\n```\n',
+    'author': 'Thomas Schmelzer',
+    'author_email': 'None',
+    'maintainer': 'None',
+    'maintainer_email': 'None',
+    'url': 'https://github.com/tschm/pyhrp',
+    'packages': packages,
+    'package_data': package_data,
+    'install_requires': install_requires,
+    'python_requires': '>=3.9.0',
+}
+setup(**setup_kwargs)