PyPI - jax-hpc-profiler - Versions diffs - 0.2.0__tar.gz → 0.2.1__tar.gz - Mend

jax-hpc-profiler 0.2.0.tar.gz → 0.2.1.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

{jax_hpc_profiler-0.2.0 → jax_hpc_profiler-0.2.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: jax_hpc_profiler
-Version: 0.2.0
+Version: 0.2.1
 Summary: HPC Plotter and profiler for benchmarking data made for JAX
 Author: Wassim Kabalan
 License:                     GNU GENERAL PUBLIC LICENSE
@@ -678,6 +678,19 @@ License:                     GNU GENERAL PUBLIC LICENSE
         Public License instead of this License.  But first, please read
         <https://www.gnu.org/licenses/why-not-lgpl.html>.
+Project-URL: Homepage, https://github.com/ASKabalan/jax-hpc-profiler
+Keywords: jax,hpc,profiler,plotter,benchmarking
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3 :: Only
+Requires-Python: >=3.8
 Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: numpy
@@ -686,9 +699,11 @@ Requires-Dist: matplotlib
 Requires-Dist: seaborn
 Requires-Dist: tabulate
-# HPC Plotter
+Here's the updated README with the additional information about the timer.report and the multi-GPU setup:
-HPC Plotter is a tool designed for benchmarking and visualizing performance data in high-performance computing (HPC) environments. It provides functionalities to generate, concatenate, and plot CSV data from various runs.
+# JAX HPC Profiler
+JAX HPC Profiler is a tool designed for benchmarking and visualizing performance data in high-performance computing (HPC) environments. It provides functionalities to generate, concatenate, and plot CSV data from various runs.
 ## Table of Contents
 - [Introduction](#introduction)
@@ -697,9 +712,10 @@ HPC Plotter is a tool designed for benchmarking and visualizing performance data
 - [CSV Structure](#csv-structure)
 - [Concatenating Files from Different Runs](#concatenating-files-from-different-runs)
 - [Plotting CSV Data](#plotting-csv-data)
+- [Examples](#examples)
 ## Introduction
-HPC Plotter allows users to:
+JAX HPC Profiler allows users to:
 1. Generate CSV files containing performance data.
 2. Concatenate multiple CSV files from different runs.
 3. Plot the performance data for analysis.
@@ -709,53 +725,80 @@ HPC Plotter allows users to:
 To install the package, run the following command:
 ```bash
-pip install hpc-plotter
+pip install jax-hpc-profiler
 ```
 ## Generating CSV Files Using the Timer Class
-To generate CSV files, you can use the `Timer` class provided in the `hpc_plotter.timer` module. This class helps in timing functions and saving the timing results to CSV files.
+To generate CSV files, you can use the `Timer` class provided in the `jax_hpc_profiler.timer` module. This class helps in timing functions and saving the timing results to CSV files.
 ### Example Usage
 ```python
-import time
-from hpc_plotter.timer import Timer
 import jax
-# Define the functions you want to time
-def example_function():
-    time.sleep(1)  # Simulating a task
-# Create a Timer instance
-timer = Timer()
-# Time the function
-timer.chrono_jit(example_function)
-for _ in range(5):
-    timer.chrono_fun(example_function)
-# Metadata for the CSV file
-metadata = {
-    'rank': jax.process_index(),
-    'function_name': 'example_function',
-    'precision': 'float32',
-    'x': '1024',
-    'y': '1024',
-    'z': '1024',
-    'px': '4',
-    'py': '4',
-    'backend': 'NCCL',
-    'nodes': '2'
+from jax_hpc_profiler import Timer
+def fcn(m, n, k):
+    return jax.numpy.dot(m, n) + k
+timer = Timer(save_jaxpr=True)
+m = jax.numpy.ones((1000, 1000))
+n = jax.numpy.ones((1000, 1000))
+k = jax.numpy.ones((1000, 1000))
+timer.chrono_jit(fcn, m, n, k)
+for i in range(10):
+    timer.chrono_fun(fcn, m, n, k)
+meta_data = {
+  "function": "fcn",
+  "precision": "float32",
+  "x": 1000,
+  "y": 1000,
+  "z": 1000,
+  "px": 1,
+  "py": 1,
+  "backend": "NCCL",
+  "nodes": 1
+}
+extra_info = {
+    "done": "yes"
 }
-# Print the results to a CSV file
-timer.print_to_csv('output.csv', **metadata)
+timer.report("examples/profiling/test.csv", **meta_data,  extra_info=extra_info)
 ```
+`timer.report` has sensible defaults and this is the API for the `Timer` class:
+- `csv_filename`: The path to the CSV file to save the timing data **(required)**.
+- `function`: The name of the function being timed **(required)**.
+- `x`: The size of the input data in the x dimension **(required)**.
+- `y`: The size of the input data in the y dimension (by default same as x).
+- `z`: The size of the input data in the z dimension (by default same as x).
+- `precision`: The precision of the data (default: "float32").
+- `px`: The number of partitions in the x dimension (default: 1).
+- `py`: The number of partitions in the y dimension (default: 1).
+- `backend`: The backend used for computation (default: "NCCL").
+- `nodes`: The number of nodes used for computation (default: 1).
+- `md_filename`: The path to the markdown file containing the compiled code and other information (default: {csv_folder}/{x}_{px}_{py}_{backend}_{precision}_{function}.md).
+- `extra_info`: Additional information to include in the report (default: {}
+`px` and `py` are used to specify the data decomposition. For example, if you have a 2D array of size 1000x1000 and you partition it into 4 parts (2x2), you would set `px=2` and `py=2`.\
+they can also be used in a single device run to specify batch size.
+Some decomposition parameters are generated and that are specific to 3D data decomposition.\
+`slab_yz` if the distributed axis is the y-axis.\
+`slab_xy` if the distributed axis is the x-axis.\
+`pencils` if the distributed axis are the x and y axes.
+### Multi-GPU Setup
+In a multi-GPU setup, the times are automatically averaged across ranks, providing a single performance metric for the entire setup.
 ## CSV Structure
 The CSV files should follow a specific structure to ensure proper processing and concatenation. The directory structure should be organized by GPU type, with subdirectories for the number of GPUs and the respective CSV files.
 ### Example Directory Structure
 ```
@@ -790,12 +833,12 @@ root_directory/
 ## Concatenating Files from Different Runs
-The `plot` function expects the directory to be organized as described above, but with the different number of GPUs toghether in the same directory. The `concatenate` function can be used to concatenate the CSV files from different runs into a single file.
+The `plot` function expects the directory to be organized as described above, but with the different number of GPUs together in the same directory. The `concatenate` function can be used to concatenate the CSV files from different runs into a single file.
 ### Example Usage
 ```bash
-hpc-plotter concat /path/to/root_directory /path/to/output
+jax-hpc-profiler concat /path/to/root_directory /path/to/output
 ```
 And the output will be:
@@ -812,8 +855,6 @@ out_directory/
     └── method_3.csv
 ```
 ## Plotting CSV Data
 You can plot the performance data using the `plot` command. The plotting command provides various options to customize the plots.
@@ -821,27 +862,41 @@ You can plot the performance data using the `plot` command. The plotting command
 ### Usage
 ```bash
-hpc-plotter plot -f <csv_files> [options]
+jax-hpc-profiler plot -f <csv_files> [options]
 ```
-with options :
+### Options
 - `-f, --csv_files`: List of CSV files to plot (required).
-- `-g, --gpus`: Filter GPUs. List of number of GPUs to plot.
-- `-d, --data_size`: Filter data sizes. List of data sizes to plot.
+- `-g, --gpus`: List of number of GPUs to plot.
+- `-d, --data_size`: List of data sizes to plot.
 - `-fd, --filter_pdims`: List of pdims to filter (e.g., 1x4 2x2 4x8).
-- `-ps, --pdims_strategy`: Strategy for plotting pdims (`plot_all` or `plot_fastest`).
-  - `plot_all`: Plot every decomposition. 1xX and Xx1 as slabs, XxX as pencils.
+- `-ps, --pdim_strategy`: Strategy for plotting pdims. This argument can be multiple ones (`plot_all`, `plot_fastest`, `slab_yz`, `slab_xy`, `pencils`).
+  - `plot_all`: Plot every decomposition.
   - `plot_fastest`: Plot the fastest decomposition.
-- `-p, --precision`: Precision to filter by (`float32` or `float64`).
-- `-fn, --function_name`: Function name to filter.
-- `-ta, --time_aggregation`: Time aggregation method (`mean`, `min`, `max`).
-- `-tc, --time_column`: Time column to plot (`jit_time`, `min_time`, `max_time`, `mean_time`, `std_div`, `last_time`).
+- `-pr, --precision`: Precision to filter by. This argument can be multiple ones (`float32`, `float64`).
+- `-fn, --function_name`: Function names to filter. This argument can be multiple ones.
+- `-pt, --plot_times`: Time columns to plot (`jit_time`, `min_time`, `max_time`, `mean_time`, `std_time`, `last_time`). Note: You cannot plot memory and time together.
+- `-pm, --plot_memory`: Memory columns to plot (`generated_code`, `argument_size`, `output_size`, `temp_size`). Note: You cannot plot memory and time together.
+- `-mu, --memory_units`: Memory units to plot (`KB`, `MB`, `GB`, `TB`).
 - `-fs, --figure_size`: Figure size.
-- `-nl, --nodes_in_label`: Use node names in labels.
 - `-o, --output`: Output file (if none then only show plot).
 - `-db, --dark_bg`: Use dark background for plotting.
-- `-pd, --print_decompositions`: Print decompositions on plot (only for `plot_fastest`).
-- `-b, --backends`: List of backends to include.
-- `-sc, --scaling`: Scaling type (`Weak` or `Strong`).
+- `-pd, --print_decompositions`: Print decompositions on plot (experimental).
+- `-b, --backends`: List of backends to include. This argument can be multiple ones.
+- `-sc, --scaling`: Scaling type (`Weak`, `Strong`).
+- `-l, --label_text`: Custom label for the plot. You can use placeholders: `%decomposition%` (or `%p%`), `%precision%` (or `%pr%`), `%plot_name%` (or `%pn%`), `%backend%` (or `%b%`), `%node%` (or `%n%`), `%methodname%` (or `%m%`).
+## Examples
+The repository includes examples for both profiling and plotting.
+### Profiling Example
+See the `examples/profiling` directory for profiling examples, including `function.py`, `test.csv`, and the generated markdown report.
+### Plotting Example
+See the `examples/plotting` directory for plotting examples, including `generator.py`, `sample_data1.csv`, `sample_data2.csv`, and `sample_data3.csv`.
+a multi GPU example comparing distributed FFT can be found here [jaxdecomp-bechmarks](https://github.com/ASKabalan/jaxdecomp-benchmarks)

jax_hpc_profiler-0.2.1/README.md ADDED

@@ -0,0 +1,201 @@
+Here's the updated README with the additional information about the timer.report and the multi-GPU setup:
+# JAX HPC Profiler
+JAX HPC Profiler is a tool designed for benchmarking and visualizing performance data in high-performance computing (HPC) environments. It provides functionalities to generate, concatenate, and plot CSV data from various runs.
+## Table of Contents
+- [Introduction](#introduction)
+- [Installation](#installation)
+- [Generating CSV Files Using the Timer Class](#generating-csv-files-using-the-timer-class)
+- [CSV Structure](#csv-structure)
+- [Concatenating Files from Different Runs](#concatenating-files-from-different-runs)
+- [Plotting CSV Data](#plotting-csv-data)
+- [Examples](#examples)
+## Introduction
+JAX HPC Profiler allows users to:
+1. Generate CSV files containing performance data.
+2. Concatenate multiple CSV files from different runs.
+3. Plot the performance data for analysis.
+## Installation
+To install the package, run the following command:
+```bash
+pip install jax-hpc-profiler
+```
+## Generating CSV Files Using the Timer Class
+To generate CSV files, you can use the `Timer` class provided in the `jax_hpc_profiler.timer` module. This class helps in timing functions and saving the timing results to CSV files.
+### Example Usage
+```python
+import jax
+from jax_hpc_profiler import Timer
+def fcn(m, n, k):
+    return jax.numpy.dot(m, n) + k
+timer = Timer(save_jaxpr=True)
+m = jax.numpy.ones((1000, 1000))
+n = jax.numpy.ones((1000, 1000))
+k = jax.numpy.ones((1000, 1000))
+timer.chrono_jit(fcn, m, n, k)
+for i in range(10):
+    timer.chrono_fun(fcn, m, n, k)
+meta_data = {
+  "function": "fcn",
+  "precision": "float32",
+  "x": 1000,
+  "y": 1000,
+  "z": 1000,
+  "px": 1,
+  "py": 1,
+  "backend": "NCCL",
+  "nodes": 1
+}
+extra_info = {
+    "done": "yes"
+}
+timer.report("examples/profiling/test.csv", **meta_data,  extra_info=extra_info)
+```
+`timer.report` has sensible defaults and this is the API for the `Timer` class:
+- `csv_filename`: The path to the CSV file to save the timing data **(required)**.
+- `function`: The name of the function being timed **(required)**.
+- `x`: The size of the input data in the x dimension **(required)**.
+- `y`: The size of the input data in the y dimension (by default same as x).
+- `z`: The size of the input data in the z dimension (by default same as x).
+- `precision`: The precision of the data (default: "float32").
+- `px`: The number of partitions in the x dimension (default: 1).
+- `py`: The number of partitions in the y dimension (default: 1).
+- `backend`: The backend used for computation (default: "NCCL").
+- `nodes`: The number of nodes used for computation (default: 1).
+- `md_filename`: The path to the markdown file containing the compiled code and other information (default: {csv_folder}/{x}_{px}_{py}_{backend}_{precision}_{function}.md).
+- `extra_info`: Additional information to include in the report (default: {}
+`px` and `py` are used to specify the data decomposition. For example, if you have a 2D array of size 1000x1000 and you partition it into 4 parts (2x2), you would set `px=2` and `py=2`.\
+they can also be used in a single device run to specify batch size.
+Some decomposition parameters are generated and that are specific to 3D data decomposition.\
+`slab_yz` if the distributed axis is the y-axis.\
+`slab_xy` if the distributed axis is the x-axis.\
+`pencils` if the distributed axis are the x and y axes.
+### Multi-GPU Setup
+In a multi-GPU setup, the times are automatically averaged across ranks, providing a single performance metric for the entire setup.
+## CSV Structure
+The CSV files should follow a specific structure to ensure proper processing and concatenation. The directory structure should be organized by GPU type, with subdirectories for the number of GPUs and the respective CSV files.
+### Example Directory Structure
+```
+root_directory/
+├── gpu_1/
+│   ├── 2/
+│   │   ├── method_1.csv
+│   │   ├── method_2.csv
+│   │   └── method_3.csv
+│   ├── 4/
+│   │   ├── method_1.csv
+│   │   ├── method_2.csv
+│   │   └── method_3.csv
+│   └── 8/
+│       ├── method_1.csv
+│       ├── method_2.csv
+│       └── method_3.csv
+└── gpu_2/
+    ├── 2/
+    │   ├── method_1.csv
+    │   ├── method_2.csv
+    │   └── method_3.csv
+    ├── 4/
+    │   ├── method_1.csv
+    │   ├── method_2.csv
+    │   └── method_3.csv
+    └── 8/
+        ├── method_1.csv
+        ├── method_2.csv
+        └── method_3.csv
+```
+## Concatenating Files from Different Runs
+The `plot` function expects the directory to be organized as described above, but with the different number of GPUs together in the same directory. The `concatenate` function can be used to concatenate the CSV files from different runs into a single file.
+### Example Usage
+```bash
+jax-hpc-profiler concat /path/to/root_directory /path/to/output
+```
+And the output will be:
+```
+out_directory/
+├── gpu_1/
+│   ├── method_1.csv
+│   ├── method_2.csv
+│   └── method_3.csv
+└── gpu_2/
+    ├── method_1.csv
+    ├── method_2.csv
+    └── method_3.csv
+```
+## Plotting CSV Data
+You can plot the performance data using the `plot` command. The plotting command provides various options to customize the plots.
+### Usage
+```bash
+jax-hpc-profiler plot -f <csv_files> [options]
+```
+### Options
+- `-f, --csv_files`: List of CSV files to plot (required).
+- `-g, --gpus`: List of number of GPUs to plot.
+- `-d, --data_size`: List of data sizes to plot.
+- `-fd, --filter_pdims`: List of pdims to filter (e.g., 1x4 2x2 4x8).
+- `-ps, --pdim_strategy`: Strategy for plotting pdims. This argument can be multiple ones (`plot_all`, `plot_fastest`, `slab_yz`, `slab_xy`, `pencils`).
+  - `plot_all`: Plot every decomposition.
+  - `plot_fastest`: Plot the fastest decomposition.
+- `-pr, --precision`: Precision to filter by. This argument can be multiple ones (`float32`, `float64`).
+- `-fn, --function_name`: Function names to filter. This argument can be multiple ones.
+- `-pt, --plot_times`: Time columns to plot (`jit_time`, `min_time`, `max_time`, `mean_time`, `std_time`, `last_time`). Note: You cannot plot memory and time together.
+- `-pm, --plot_memory`: Memory columns to plot (`generated_code`, `argument_size`, `output_size`, `temp_size`). Note: You cannot plot memory and time together.
+- `-mu, --memory_units`: Memory units to plot (`KB`, `MB`, `GB`, `TB`).
+- `-fs, --figure_size`: Figure size.
+- `-o, --output`: Output file (if none then only show plot).
+- `-db, --dark_bg`: Use dark background for plotting.
+- `-pd, --print_decompositions`: Print decompositions on plot (experimental).
+- `-b, --backends`: List of backends to include. This argument can be multiple ones.
+- `-sc, --scaling`: Scaling type (`Weak`, `Strong`).
+- `-l, --label_text`: Custom label for the plot. You can use placeholders: `%decomposition%` (or `%p%`), `%precision%` (or `%pr%`), `%plot_name%` (or `%pn%`), `%backend%` (or `%b%`), `%node%` (or `%n%`), `%methodname%` (or `%m%`).
+## Examples
+The repository includes examples for both profiling and plotting.
+### Profiling Example
+See the `examples/profiling` directory for profiling examples, including `function.py`, `test.csv`, and the generated markdown report.
+### Plotting Example
+See the `examples/plotting` directory for plotting examples, including `generator.py`, `sample_data1.csv`, `sample_data2.csv`, and `sample_data3.csv`.
+a multi GPU example comparing distributed FFT can be found here [jaxdecomp-bechmarks](https://github.com/ASKabalan/jaxdecomp-benchmarks)
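
The `md_filename` default listed for `timer.report` is given as `{csv_folder}/...`, while the `timer.py` hunk in this diff builds the folder from the CSV path with its extension removed. A minimal sketch of that path logic, copied from the hunk but without creating any directories; the helper name `default_md_filename` is hypothetical and not part of the package:

```python
import os


def default_md_filename(csv_filename, function, x, px=1, py=1,
                        backend="NCCL", precision="float32"):
    """Rebuild the default markdown path the way the report() hunk does."""
    dirname = os.path.dirname(csv_filename)
    # File name without directory and without the .csv extension.
    stem = os.path.splitext(os.path.basename(csv_filename))[0]
    # The report folder is named after the CSV file itself.
    report_folder = stem if dirname == "" else f"{dirname}/{stem}"
    return f"{report_folder}/{x}_{px}_{py}_{backend}_{precision}_{function}.md"


print(default_md_filename("examples/profiling/test.csv", "fcn", 1000))
```

Under this reading, the README example would write its report under `examples/profiling/test/` rather than directly next to `test.csv`.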

jax_hpc_profiler-0.2.1/pyproject.toml ADDED

@@ -0,0 +1,49 @@
+[build-system]
+requires = ["setuptools", "wheel"]
+build-backend = "setuptools.build_meta"
+[project]
+name = "jax_hpc_profiler"
+version = "0.2.1"
+description = "HPC Plotter and profiler for benchmarking data made for JAX"
+authors = [
+    { name="Wassim Kabalan" }
+]
+dependencies = [
+    "numpy",
+    "pandas",
+    "matplotlib",
+    "seaborn",
+    "tabulate"
+]
+readme = "README.md"
+license = { file = "LICENSE" }
+requires-python = ">=3.8"
+keywords = ["jax", "hpc", "profiler", "plotter", "benchmarking"]
+# For a list of valid classifiers, see https://pypi.org/classifiers/
+classifiers = [
+  "Development Status :: 4 - Beta",
+  # Indicate who your project is intended for
+  "Intended Audience :: Developers",
+  # Pick your license as you wish
+  "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
+  # Specify the Python versions you support here. In particular, ensure
+  # that you indicate you support Python 3. These classifiers are *not*
+  # checked by "pip install". See instead "requires-python" key in this file.
+  "Programming Language :: Python :: 3",
+  "Programming Language :: Python :: 3.8",
+  "Programming Language :: Python :: 3.9",
+  "Programming Language :: Python :: 3.10",
+  "Programming Language :: Python :: 3.11",
+  "Programming Language :: Python :: 3.12",
+  "Programming Language :: Python :: 3 :: Only",
+]
+urls = { "Homepage" = "https://github.com/ASKabalan/jax-hpc-profiler" }
+[project.scripts]
+jhp = "jax_hpc_profiler.main:main"
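
The `[project.scripts]` table above registers one console script, `jhp`, whereas the README commands in this diff are written as `jax-hpc-profiler ...`. Whether another entry point is defined elsewhere is not visible here. A small standard-library sketch for checking which command names an installed copy actually exposes; the helper name `console_scripts` is hypothetical:

```python
from importlib import metadata


def console_scripts(dist_name):
    """Return {script_name: target} for an installed distribution, or {} if absent."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return {}
    # Keep only entry points that become shell commands.
    return {ep.name: ep.value
            for ep in dist.entry_points
            if ep.group == "console_scripts"}


print(console_scripts("jax_hpc_profiler"))
```

If only `jhp` is listed, the README invocations would presumably be run as `jhp concat ...` and `jhp plot ...`.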

{jax_hpc_profiler-0.2.0 → jax_hpc_profiler-0.2.1}/src/jax_hpc_profiler/create_argparse.py RENAMED

@@ -89,9 +89,9 @@ def create_argparser():
                                 ],
                                 help='Memory columns to plot')
     plot_parser.add_argument('-mu',
-                        '--memory_units',
-                        default='GB',
-                        help='Memory units to plot (KB, MB, GB, TB)')
+                             '--memory_units',
+                             default='GB',
+                             help='Memory units to plot (KB, MB, GB, TB)')
     # Plot customization arguments
     plot_parser.add_argument('-fs',

{jax_hpc_profiler-0.2.0 → jax_hpc_profiler-0.2.1}/src/jax_hpc_profiler/main.py RENAMED

@@ -15,7 +15,7 @@ def main():
         dataframes, available_gpu_counts, available_data_sizes = clean_up_csv(
             args.csv_files, args.precision, args.function_name, args.gpus,
             args.data_size, args.filter_pdims, args.pdim_strategy,
-            args.backends,args.memory_units)
+            args.backends, args.memory_units)
         if len(dataframes) == 0:
             print(f"No dataframes found for the given arguments. Exiting...")
             sys.exit(1)
@@ -29,12 +29,10 @@ def main():
             if data_size in available_data_sizes
         ]
         if len(args.gpus) == 0:
-            print(
-                f"No dataframes found for the given GPUs. Exiting...")
+            print(f"No dataframes found for the given GPUs. Exiting...")
             sys.exit(1)
         if len(args.data_size) == 0:
-            print(
-                f"No dataframes found for the given data sizes. Exiting...")
+            print(f"No dataframes found for the given data sizes. Exiting...")
             sys.exit(1)
         if args.scaling == 'Weak':

{jax_hpc_profiler-0.2.0 → jax_hpc_profiler-0.2.1}/src/jax_hpc_profiler/timer.py RENAMED

@@ -1,12 +1,13 @@
 import os
 import time
 from functools import partial
-from typing import Any, Callable, List
+from typing import Any, Callable, List, Tuple
 import jax
 import jax.numpy as jnp
 import numpy as np
 from jax import make_jaxpr
+from jax.experimental import mesh_utils
 from jax.experimental.shard_map import shard_map
 from jax.sharding import Mesh, NamedSharding
 from jax.sharding import PartitionSpec as P
@@ -22,6 +23,19 @@ class Timer:
         self.compiled_code = {}
         self.save_jaxpr = save_jaxpr
+    def _read_cost_analysis(self, cost_analysis: Any) -> str | None:
+        if cost_analysis is None:
+            return None
+        return cost_analysis[0]['flops']
+    def _read_memory_analysis(self, memory_analysis: Any) -> Tuple:
+        if memory_analysis is None:
+            return None, None, None, None
+        return (memory_analysis.generated_code_size_in_bytes,
+                memory_analysis.argument_size_in_bytes,
+                memory_analysis.output_size_in_bytes,
+                memory_analysis.temp_size_in_bytes)
     def chrono_jit(self, fun: Callable, *args, ndarray_arg=None) -> np.ndarray:
         start = time.perf_counter()
         out = jax.jit(fun)(*args)
@@ -38,18 +52,17 @@ class Timer:
         lowered = jax.jit(fun).lower(*args)
         compiled = lowered.compile()
-        memory_analysis = compiled.memory_analysis()
+        memory_analysis = self._read_memory_analysis(
+            compiled.memory_analysis())
+        cost_analysis = self._read_cost_analysis(compiled.cost_analysis())
         self.compiled_code["LOWERED"] = lowered.as_text()
         self.compiled_code["COMPILED"] = compiled.as_text()
-        self.profiling_data["FLOPS"] = compiled.cost_analysis()[0]['flops']
-        self.profiling_data[
-            "generated_code"] = memory_analysis.generated_code_size_in_bytes
-        self.profiling_data[
-            "argument_size"] = memory_analysis.argument_size_in_bytes
-        self.profiling_data[
-            "output_size"] = memory_analysis.output_size_in_bytes
-        self.profiling_data["temp_size"] = memory_analysis.temp_size_in_bytes
+        self.profiling_data["FLOPS"] = cost_analysis
+        self.profiling_data["generated_code"] = memory_analysis[0]
+        self.profiling_data["argument_size"] = memory_analysis[0]
+        self.profiling_data["output_size"] = memory_analysis[0]
+        self.profiling_data["temp_size"] = memory_analysis[0]
         return out
     def chrono_fun(self, fun: Callable, *args, ndarray_arg=None) -> np.ndarray:
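
As written, the hunk above stores `memory_analysis[0]` under all four memory keys, although `_read_memory_analysis` returns a four-element tuple ordered as generated code, argument, output, and temp size. A minimal sketch of pairing each key with its own element, assuming that was the intent; `MEMORY_KEYS`, `memory_profile`, and the stand-in object with made-up sizes are hypothetical:

```python
from types import SimpleNamespace

MEMORY_KEYS = ("generated_code", "argument_size", "output_size", "temp_size")


def read_memory_analysis(memory_analysis):
    """Same shape as the helper in the hunk: a 4-tuple, or four Nones."""
    if memory_analysis is None:
        return None, None, None, None
    return (memory_analysis.generated_code_size_in_bytes,
            memory_analysis.argument_size_in_bytes,
            memory_analysis.output_size_in_bytes,
            memory_analysis.temp_size_in_bytes)


def memory_profile(memory_analysis):
    """Map each profiling key to its own tuple element instead of index 0."""
    return dict(zip(MEMORY_KEYS, read_memory_analysis(memory_analysis)))


# Stand-in for the object returned by compiled.memory_analysis(); sizes are invented.
fake = SimpleNamespace(generated_code_size_in_bytes=10,
                       argument_size_in_bytes=20,
                       output_size_in_bytes=30,
                       temp_size_in_bytes=40)
print(memory_profile(fake))
```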
@@ -63,56 +76,62 @@ class Timer:
         self.times.append((end - start) * 1e3)
         return out
-    def _get_mean_times(self, times_array: jnp.ndarray,
-                        sharding: NamedSharding):
-        mesh = sharding.mesh
-        specs = sharding.spec
-        valid_letters = [letter for letter in specs if letter is not None]
-        assert len(valid_letters
-                   ) > 0, "Sharding was provided but with no partition specs"
+    def _get_mean_times(self) -> np.ndarray:
+        if jax.device_count() == 1:
+            return np.array(self.times)
+        devices = mesh_utils.create_device_mesh((jax.device_count(), ))
+        mesh = Mesh(devices, ('x', ))
+        sharding = NamedSharding(mesh, P('x'))
+        times_array = jnp.array(self.times)
+        global_shape = (jax.device_count(), times_array.shape[0])
+        global_times = jax.make_array_from_callback(
+            shape=global_shape,
+            sharding=sharding,
+            data_callback=lambda x: times_array)
         @partial(shard_map,
                  mesh=mesh,
-                 in_specs=specs,
+                 in_specs=P('x'),
                  out_specs=P(),
                  check_rep=False)
         def get_mean_times(times):
-            mean = jax.lax.pmean(times, axis_name=valid_letters[0])
-            for axis_name in valid_letters[1:]:
-                mean = jax.lax.pmean(mean, axis_name=axis_name)
-            return mean
+            return jax.lax.pmean(times, axis_name='x')
-        times_array = get_mean_times(times_array)
+        times_array = get_mean_times(global_times)
         times_array.block_until_ready()
-        return times_array
+        return np.array(times_array.addressable_data(0))
     def report(self,
                csv_filename: str,
                function: str,
-               precision: str,
                x: int,
-               y: int,
-               z: int,
-               px: int,
-               py: int,
-               backend: str,
-               nodes: int,
-               sharding: NamedSharding | None = None,
+               y: int | None = None,
+               z: int | None = None,
+               precision: str = "float32",
+               px: int = 1,
+               py: int = 1,
+               backend: str = "NCCL",
+               nodes: int = 1,
                md_filename: str | None = None,
                extra_info: dict = {}):
-        times_array = jnp.array(self.times)
         if md_filename is None:
-            dirname, filename = os.path.dirname(csv_filename), os.path.splitext(os.path.basename(csv_filename))[0]
+            dirname, filename = os.path.dirname(
+                csv_filename), os.path.splitext(
+                    os.path.basename(csv_filename))[0]
             report_folder = filename if dirname == "" else f"{dirname}/{filename}"
-            print(f"report_folder: {report_folder} csv_filename: {csv_filename}")
+            print(
+                f"report_folder: {report_folder} csv_filename: {csv_filename}")
             os.makedirs(report_folder, exist_ok=True)
             md_filename = f"{report_folder}/{x}_{px}_{py}_{backend}_{precision}_{function}.md"
-        if sharding is not None:
-            times_array = self._get_mean_times(times_array, sharding)
+        y = x if y is None else y
+        z = x if z is None else z
+        times_array = self._get_mean_times()
-        times_array = np.array(times_array)
         min_time = np.min(times_array)
         max_time = np.max(times_array)
         mean_time = np.mean(times_array)
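
The new `_get_mean_times` averages the recorded times across devices with `jax.lax.pmean` before `report` computes the minimum, maximum, and mean. The arithmetic can be pictured without JAX; this is a plain-Python stand-in with illustrative numbers, not the package's code, and `mean_across_ranks` is a hypothetical name:

```python
def mean_across_ranks(times_per_rank):
    """Element-wise mean over ranks: one averaged time per timed iteration."""
    n_ranks = len(times_per_rank)
    # zip(*...) groups the i-th measurement of every rank together.
    return [sum(step) / n_ranks for step in zip(*times_per_rank)]


# Two ranks, three timed iterations each (milliseconds, invented values).
rank_times = [[10.0, 12.0, 11.0],
              [14.0, 12.0, 13.0]]
averaged = mean_across_ranks(rank_times)
print(averaged)  # [12.0, 12.0, 12.0]
print(min(averaged), max(averaged), sum(averaged) / len(averaged))
```

The summary statistics are then taken over the averaged series, so a single slow rank shifts every statistic rather than only the maximum.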
@@ -163,10 +182,18 @@ class Timer:
         with open(md_filename, 'w') as f:
             f.write(f"# Reporting for {function}\n")
             f.write(f"## Parameters\n")
-            f.write(tabulate(param_dict.items() , headers=["Parameter" , "Value"] , tablefmt='github'))
+            keys = list(param_dict.keys())
+            values = list(param_dict.values())
+            f.write(
+                tabulate(param_dict.items(),
+                         headers=["Parameter", "Value"],
+                         tablefmt='github'))
             f.write("\n---\n")
             f.write(f"## Profiling Data\n")
-            f.write(tabulate(profiling_result.items() , headers=["Parameter" , "Value"] , tablefmt='github'))
+            f.write(
+                tabulate(profiling_result.items(),
+                         headers=["Parameter", "Value"],
+                         tablefmt='github'))
             f.write("\n---\n")
             f.write(f"## Compiled Code\n")
             f.write(f"```hlo\n")

{jax_hpc_profiler-0.2.0 → jax_hpc_profiler-0.2.1}/src/jax_hpc_profiler/utils.py RENAMED

@@ -6,6 +6,7 @@ import numpy as np
 import pandas as pd
 from matplotlib.axes import Axes
 def inspect_data(dataframes: Dict[str, pd.DataFrame]):
     """
     Inspect the dataframes.
@@ -203,17 +204,17 @@ def concatenate_csvs(root_dir: str, output_dir: str):
                 if file.endswith('.csv'):
                     csv_file_path = os.path.join(root, file)
                     print(f'Concatenating {csv_file_path}...')
-                    df = pd.read_csv(
-                        csv_file_path,
-                        header=None,
-                        names=[
-                            "function", "precision", "x", "y", "z", "px",
-                            "py", "backend", "nodes", "jit_time", "min_time",
-                            "max_time", "mean_time", "std_div", "last_time",
-                            "generated_code", "argument_size", "output_size",
-                            "temp_size", "flops"
-                        ],
-                        index_col=False)
+                    df = pd.read_csv(csv_file_path,
+                                     header=None,
+                                     names=[
+                                         "function", "precision", "x", "y",
+                                         "z", "px", "py", "backend", "nodes",
+                                         "jit_time", "min_time", "max_time",
+                                         "mean_time", "std_div", "last_time",
+                                         "generated_code", "argument_size",
+                                         "output_size", "temp_size", "flops"
+                                     ],
+                                     index_col=False)
                     if file not in combined_dfs:
                         combined_dfs[file] = df
                     else:
@@ -340,23 +341,23 @@ def clean_up_csv(
         if pdims:
             px_list, py_list = zip(*[map(int, p.split('x')) for p in pdims])
             df = df[(df['px'].isin(px_list)) & (df['py'].isin(py_list))]
-        # convert memory units columns to remquested memory_units
+        # convert memory units columns to remquested memory_units
         match memory_units:
-          case 'KB':
-            factor = 1024
-          case 'MB':
-            factor = 1024**2
-          case 'GB':
-            factor = 1024**3
-          case 'TB':
-            factor = 1024**4
-          case _:
-            factor = 1
+            case 'KB':
+                factor = 1024
+            case 'MB':
+                factor = 1024**2
+            case 'GB':
+                factor = 1024**3
+            case 'TB':
+                factor = 1024**4
+            case _:
+                factor = 1
         df['generated_code'] = df['generated_code'] / factor
-        df['argument_size'] = df['argument_size'] / factor
-        df['output_size'] = df['output_size'] / factor
+        df['argument_size'] = df['argument_size'] / factor
+        df['output_size'] = df['output_size'] / factor
         df['temp_size'] = df['temp_size'] / factor
         # in case of the same test is run multiple times, keep the last one
         df = df.drop_duplicates(subset=[
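Note that the `match` statement in the hunk above needs Python 3.10 or newer, while the metadata added in this release declares `Requires-Python: >=3.8`. As a sketch only (not the package's code), the same conversion can be written as a dictionary lookup that also runs on 3.8. The names `UNIT_FACTORS` and `to_units` are hypothetical, and the raw values are assumed to be byte counts.

```python
# Hypothetical helper mirroring the `match memory_units` block above.
UNIT_FACTORS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def to_units(n_bytes, memory_units):
    """Convert a byte count to the requested unit; unknown units leave it unchanged."""
    factor = UNIT_FACTORS.get(memory_units, 1)  # same fallback as `case _`
    return n_bytes / factor

print(to_units(3 * 1024**2, "MB"))  # 3.0
print(to_units(2048, "bytes"))      # 2048.0 (unrecognised unit, factor 1)
```

A lookup table keeps the unit list in one place, which may be convenient if the CLI choices for `--memory_units` are derived from it.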
@@ -383,7 +384,7 @@ def clean_up_csv(
             df['decomp'] = df.apply(get_decomp_from_px_py, axis=1)
             df.drop(columns=['px', 'py'], inplace=True)
             if not 'plot_all' in pdims_strategy:
-              df = df[df['decomp'].isin(pdims_strategy)]
+                df = df[df['decomp'].isin(pdims_strategy)]
         # check available gpus in dataset
         available_gpu_counts.update(df['gpus'].unique())
         available_data_sizes.update(df['x'].unique())
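The `pdims` filter in `clean_up_csv` above parses strings such as `1x4` and then tests `px` and `py` independently. The pure-Python sketch below reproduces that logic on made-up `(px, py)` rows to show one consequence: a row can pass when its `px` comes from one requested pair and its `py` from another. Whether that is intended is not stated in the diff; the `exact` variant is a hypothetical stricter alternative, not something the package does.

```python
def parse_pdims(pdims):
    """Split strings such as '1x4' into parallel tuples of px and py values."""
    px_list, py_list = zip(*[map(int, p.split("x")) for p in pdims])
    return px_list, py_list

px_list, py_list = parse_pdims(["1x4", "2x2"])
print(px_list, py_list)  # (1, 2) (4, 2)

# Made-up (px, py) rows standing in for the DataFrame.
rows = [(1, 4), (2, 2), (1, 2), (4, 1)]

# Same logic as the two independent `isin` checks in the hunk.
independent = [r for r in rows if r[0] in px_list and r[1] in py_list]

# Hypothetical stricter alternative: keep only the exact requested pairs.
requested = set(zip(px_list, py_list))
exact = [r for r in rows if r in requested]

print(independent)  # [(1, 4), (2, 2), (1, 2)]
print(exact)        # [(1, 4), (2, 2)]
```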

{jax_hpc_profiler-0.2.0 → jax_hpc_profiler-0.2.1}/src/jax_hpc_profiler.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: jax_hpc_profiler
-Version: 0.2.0
+Version: 0.2.1
 Summary: HPC Plotter and profiler for benchmarking data made for JAX
 Author: Wassim Kabalan
 License:                     GNU GENERAL PUBLIC LICENSE
@@ -678,6 +678,19 @@ License:                     GNU GENERAL PUBLIC LICENSE
         Public License instead of this License.  But first, please read
         <https://www.gnu.org/licenses/why-not-lgpl.html>.
+Project-URL: Homepage, https://github.com/ASKabalan/jax-hpc-profiler
+Keywords: jax,hpc,profiler,plotter,benchmarking
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3 :: Only
+Requires-Python: >=3.8
 Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: numpy
@@ -686,9 +699,11 @@ Requires-Dist: matplotlib
 Requires-Dist: seaborn
 Requires-Dist: tabulate
-# HPC Plotter
+Here's the updated README with the additional information about the timer.report and the multi-GPU setup:
-HPC Plotter is a tool designed for benchmarking and visualizing performance data in high-performance computing (HPC) environments. It provides functionalities to generate, concatenate, and plot CSV data from various runs.
+# JAX HPC Profiler
+JAX HPC Profiler is a tool designed for benchmarking and visualizing performance data in high-performance computing (HPC) environments. It provides functionalities to generate, concatenate, and plot CSV data from various runs.
 ## Table of Contents
 - [Introduction](#introduction)
@@ -697,9 +712,10 @@ HPC Plotter is a tool designed for benchmarking and visualizing performance data
 - [CSV Structure](#csv-structure)
 - [Concatenating Files from Different Runs](#concatenating-files-from-different-runs)
 - [Plotting CSV Data](#plotting-csv-data)
+- [Examples](#examples)
 ## Introduction
-HPC Plotter allows users to:
+JAX HPC Profiler allows users to:
 1. Generate CSV files containing performance data.
 2. Concatenate multiple CSV files from different runs.
 3. Plot the performance data for analysis.
@@ -709,53 +725,80 @@ HPC Plotter allows users to:
 To install the package, run the following command:
 ```bash
-pip install hpc-plotter
+pip install jax-hpc-profiler
 ```
 ## Generating CSV Files Using the Timer Class
-To generate CSV files, you can use the `Timer` class provided in the `hpc_plotter.timer` module. This class helps in timing functions and saving the timing results to CSV files.
+To generate CSV files, you can use the `Timer` class provided in the `jax_hpc_profiler.timer` module. This class helps in timing functions and saving the timing results to CSV files.
 ### Example Usage
 ```python
-import time
-from hpc_plotter.timer import Timer
 import jax
-# Define the functions you want to time
-def example_function():
-    time.sleep(1)  # Simulating a task
-# Create a Timer instance
-timer = Timer()
-# Time the function
-timer.chrono_jit(example_function)
-for _ in range(5):
-    timer.chrono_fun(example_function)
-# Metadata for the CSV file
-metadata = {
-    'rank': jax.process_index(),
-    'function_name': 'example_function',
-    'precision': 'float32',
-    'x': '1024',
-    'y': '1024',
-    'z': '1024',
-    'px': '4',
-    'py': '4',
-    'backend': 'NCCL',
-    'nodes': '2'
+from jax_hpc_profiler import Timer
+def fcn(m, n, k):
+    return jax.numpy.dot(m, n) + k
+timer = Timer(save_jaxpr=True)
+m = jax.numpy.ones((1000, 1000))
+n = jax.numpy.ones((1000, 1000))
+k = jax.numpy.ones((1000, 1000))
+timer.chrono_jit(fcn, m, n, k)
+for i in range(10):
+    timer.chrono_fun(fcn, m, n, k)
+meta_data = {
+  "function": "fcn",
+  "precision": "float32",
+  "x": 1000,
+  "y": 1000,
+  "z": 1000,
+  "px": 1,
+  "py": 1,
+  "backend": "NCCL",
+  "nodes": 1
+}
+extra_info = {
+    "done": "yes"
 }
-# Print the results to a CSV file
-timer.print_to_csv('output.csv', **metadata)
+timer.report("examples/profiling/test.csv", **meta_data,  extra_info=extra_info)
 ```
+`timer.report` has sensible defaults. Its parameters are:
+- `csv_filename`: The path to the CSV file to save the timing data **(required)**.
+- `function`: The name of the function being timed **(required)**.
+- `x`: The size of the input data in the x dimension **(required)**.
+- `y`: The size of the input data in the y dimension (by default same as x).
+- `z`: The size of the input data in the z dimension (by default same as x).
+- `precision`: The precision of the data (default: "float32").
+- `px`: The number of partitions in the x dimension (default: 1).
+- `py`: The number of partitions in the y dimension (default: 1).
+- `backend`: The backend used for computation (default: "NCCL").
+- `nodes`: The number of nodes used for computation (default: 1).
+- `md_filename`: The path to the markdown file containing the compiled code and other information (default: {csv_folder}/{x}_{px}_{py}_{backend}_{precision}_{function}.md).
+- `extra_info`: Additional information to include in the report (default: {}).
+`px` and `py` are used to specify the data decomposition. For example, if you have a 2D array of size 1000x1000 and you partition it into 4 parts (2x2), you would set `px=2` and `py=2`.\
+They can also be used in a single-device run to specify the batch size.
+Some decomposition parameters are generated automatically and are specific to 3D data decomposition:\
+`slab_yz` if the distributed axis is the y-axis.\
+`slab_xy` if the distributed axis is the x-axis.\
+`pencils` if the distributed axes are the x and y axes.
+### Multi-GPU Setup
+In a multi-GPU setup, the times are automatically averaged across ranks, providing a single performance metric for the entire setup.
 ## CSV Structure
 The CSV files should follow a specific structure to ensure proper processing and concatenation. The directory structure should be organized by GPU type, with subdirectories for the number of GPUs and the respective CSV files.
 ### Example Directory Structure
 ```
@@ -790,12 +833,12 @@ root_directory/
 ## Concatenating Files from Different Runs
-The `plot` function expects the directory to be organized as described above, but with the different number of GPUs toghether in the same directory. The `concatenate` function can be used to concatenate the CSV files from different runs into a single file.
+The `plot` function expects the directory to be organized as described above, but with the different number of GPUs together in the same directory. The `concatenate` function can be used to concatenate the CSV files from different runs into a single file.
 ### Example Usage
 ```bash
-hpc-plotter concat /path/to/root_directory /path/to/output
+jax-hpc-profiler concat /path/to/root_directory /path/to/output
 ```
 And the output will be:
@@ -812,8 +855,6 @@ out_directory/
     └── method_3.csv
 ```
 ## Plotting CSV Data
 You can plot the performance data using the `plot` command. The plotting command provides various options to customize the plots.
@@ -821,27 +862,41 @@ You can plot the performance data using the `plot` command. The plotting command
 ### Usage
 ```bash
-hpc-plotter plot -f <csv_files> [options]
+jax-hpc-profiler plot -f <csv_files> [options]
 ```
-with options :
+### Options
 - `-f, --csv_files`: List of CSV files to plot (required).
-- `-g, --gpus`: Filter GPUs. List of number of GPUs to plot.
-- `-d, --data_size`: Filter data sizes. List of data sizes to plot.
+- `-g, --gpus`: List of number of GPUs to plot.
+- `-d, --data_size`: List of data sizes to plot.
 - `-fd, --filter_pdims`: List of pdims to filter (e.g., 1x4 2x2 4x8).
-- `-ps, --pdims_strategy`: Strategy for plotting pdims (`plot_all` or `plot_fastest`).
-  - `plot_all`: Plot every decomposition. 1xX and Xx1 as slabs, XxX as pencils.
+- `-ps, --pdim_strategy`: Strategy for plotting pdims. Multiple values can be given (`plot_all`, `plot_fastest`, `slab_yz`, `slab_xy`, `pencils`).
+  - `plot_all`: Plot every decomposition.
   - `plot_fastest`: Plot the fastest decomposition.
-- `-p, --precision`: Precision to filter by (`float32` or `float64`).
-- `-fn, --function_name`: Function name to filter.
-- `-ta, --time_aggregation`: Time aggregation method (`mean`, `min`, `max`).
-- `-tc, --time_column`: Time column to plot (`jit_time`, `min_time`, `max_time`, `mean_time`, `std_div`, `last_time`).
+- `-pr, --precision`: Precision to filter by. Multiple values can be given (`float32`, `float64`).
+- `-fn, --function_name`: Function names to filter. Multiple values can be given.
+- `-pt, --plot_times`: Time columns to plot (`jit_time`, `min_time`, `max_time`, `mean_time`, `std_time`, `last_time`). Note: You cannot plot memory and time together.
+- `-pm, --plot_memory`: Memory columns to plot (`generated_code`, `argument_size`, `output_size`, `temp_size`). Note: You cannot plot memory and time together.
+- `-mu, --memory_units`: Memory units to plot (`KB`, `MB`, `GB`, `TB`).
 - `-fs, --figure_size`: Figure size.
-- `-nl, --nodes_in_label`: Use node names in labels.
 - `-o, --output`: Output file (if none then only show plot).
 - `-db, --dark_bg`: Use dark background for plotting.
-- `-pd, --print_decompositions`: Print decompositions on plot (only for `plot_fastest`).
-- `-b, --backends`: List of backends to include.
-- `-sc, --scaling`: Scaling type (`Weak` or `Strong`).
+- `-pd, --print_decompositions`: Print decompositions on plot (experimental).
+- `-b, --backends`: List of backends to include. Multiple values can be given.
+- `-sc, --scaling`: Scaling type (`Weak`, `Strong`).
+- `-l, --label_text`: Custom label for the plot. You can use placeholders: `%decomposition%` (or `%p%`), `%precision%` (or `%pr%`), `%plot_name%` (or `%pn%`), `%backend%` (or `%b%`), `%node%` (or `%n%`), `%methodname%` (or `%m%`).
+## Examples
+The repository includes examples for both profiling and plotting.
+### Profiling Example
+See the `examples/profiling` directory for profiling examples, including `function.py`, `test.csv`, and the generated markdown report.
+### Plotting Example
+See the `examples/plotting` directory for plotting examples, including `generator.py`, `sample_data1.csv`, `sample_data2.csv`, and `sample_data3.csv`.
+A multi-GPU example comparing distributed FFTs can be found at [jaxdecomp-benchmarks](https://github.com/ASKabalan/jaxdecomp-benchmarks).
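The README text added in this hunk documents a default `md_filename` of `{csv_folder}/{x}_{px}_{py}_{backend}_{precision}_{function}.md`. The sketch below shows how such a path could be derived from the documented `timer.report` defaults. The function `default_md_filename` is hypothetical, and it assumes `csv_folder` means the directory containing the CSV file; the package's actual implementation is not shown in this diff.

```python
import os

def default_md_filename(csv_filename, function, x, px=1, py=1,
                        backend="NCCL", precision="float32"):
    """Build the documented default report path (hypothetical helper).

    Assumes `csv_folder` is the directory that contains `csv_filename`.
    """
    csv_folder = os.path.dirname(csv_filename)
    name = f"{x}_{px}_{py}_{backend}_{precision}_{function}.md"
    return os.path.join(csv_folder, name)

# Values taken from the README example: function "fcn", x=1000, defaults otherwise.
print(default_md_filename("examples/profiling/test.csv", "fcn", 1000))
# On a POSIX system: examples/profiling/1000_1_1_NCCL_float32_fcn.md
```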

jax_hpc_profiler-0.2.0/README.md DELETED

@@ -1,159 +0,0 @@
-# HPC Plotter
-HPC Plotter is a tool designed for benchmarking and visualizing performance data in high-performance computing (HPC) environments. It provides functionalities to generate, concatenate, and plot CSV data from various runs.
-## Table of Contents
-- [Introduction](#introduction)
-- [Installation](#installation)
-- [Generating CSV Files Using the Timer Class](#generating-csv-files-using-the-timer-class)
-- [CSV Structure](#csv-structure)
-- [Concatenating Files from Different Runs](#concatenating-files-from-different-runs)
-- [Plotting CSV Data](#plotting-csv-data)
-## Introduction
-HPC Plotter allows users to:
-1. Generate CSV files containing performance data.
-2. Concatenate multiple CSV files from different runs.
-3. Plot the performance data for analysis.
-## Installation
-To install the package, run the following command:
-```bash
-pip install hpc-plotter
-```
-## Generating CSV Files Using the Timer Class
-To generate CSV files, you can use the `Timer` class provided in the `hpc_plotter.timer` module. This class helps in timing functions and saving the timing results to CSV files.
-### Example Usage
-```python
-import time
-from hpc_plotter.timer import Timer
-import jax
-# Define the functions you want to time
-def example_function():
-    time.sleep(1)  # Simulating a task
-# Create a Timer instance
-timer = Timer()
-# Time the function
-timer.chrono_jit(example_function)
-for _ in range(5):
-    timer.chrono_fun(example_function)
-# Metadata for the CSV file
-metadata = {
-    'rank': jax.process_index(),
-    'function_name': 'example_function',
-    'precision': 'float32',
-    'x': '1024',
-    'y': '1024',
-    'z': '1024',
-    'px': '4',
-    'py': '4',
-    'backend': 'NCCL',
-    'nodes': '2'
-}
-# Print the results to a CSV file
-timer.print_to_csv('output.csv', **metadata)
-```
-## CSV Structure
-The CSV files should follow a specific structure to ensure proper processing and concatenation. The directory structure should be organized by GPU type, with subdirectories for the number of GPUs and the respective CSV files.
-### Example Directory Structure
-```
-root_directory/
-├── gpu_1/
-│   ├── 2/
-│   │   ├── method_1.csv
-│   │   ├── method_2.csv
-│   │   └── method_3.csv
-│   ├── 4/
-│   │   ├── method_1.csv
-│   │   ├── method_2.csv
-│   │   └── method_3.csv
-│   └── 8/
-│       ├── method_1.csv
-│       ├── method_2.csv
-│       └── method_3.csv
-└── gpu_2/
-    ├── 2/
-    │   ├── method_1.csv
-    │   ├── method_2.csv
-    │   └── method_3.csv
-    ├── 4/
-    │   ├── method_1.csv
-    │   ├── method_2.csv
-    │   └── method_3.csv
-    └── 8/
-        ├── method_1.csv
-        ├── method_2.csv
-        └── method_3.csv
-```
-## Concatenating Files from Different Runs
-The `plot` function expects the directory to be organized as described above, but with the different number of GPUs toghether in the same directory. The `concatenate` function can be used to concatenate the CSV files from different runs into a single file.
-### Example Usage
-```bash
-hpc-plotter concat /path/to/root_directory /path/to/output
-```
-And the output will be:
-```
-out_directory/
-├── gpu_1/
-│   ├── method_1.csv
-│   ├── method_2.csv
-│   └── method_3.csv
-└── gpu_2/
-    ├── method_1.csv
-    ├── method_2.csv
-    └── method_3.csv
-```
-## Plotting CSV Data
-You can plot the performance data using the `plot` command. The plotting command provides various options to customize the plots.
-### Usage
-```bash
-hpc-plotter plot -f <csv_files> [options]
-```
-with options :
-- `-f, --csv_files`: List of CSV files to plot (required).
-- `-g, --gpus`: Filter GPUs. List of number of GPUs to plot.
-- `-d, --data_size`: Filter data sizes. List of data sizes to plot.
-- `-fd, --filter_pdims`: List of pdims to filter (e.g., 1x4 2x2 4x8).
-- `-ps, --pdims_strategy`: Strategy for plotting pdims (`plot_all` or `plot_fastest`).
-  - `plot_all`: Plot every decomposition. 1xX and Xx1 as slabs, XxX as pencils.
-  - `plot_fastest`: Plot the fastest decomposition.
-- `-p, --precision`: Precision to filter by (`float32` or `float64`).
-- `-fn, --function_name`: Function name to filter.
-- `-ta, --time_aggregation`: Time aggregation method (`mean`, `min`, `max`).
-- `-tc, --time_column`: Time column to plot (`jit_time`, `min_time`, `max_time`, `mean_time`, `std_div`, `last_time`).
-- `-fs, --figure_size`: Figure size.
-- `-nl, --nodes_in_label`: Use node names in labels.
-- `-o, --output`: Output file (if none then only show plot).
-- `-db, --dark_bg`: Use dark background for plotting.
-- `-pd, --print_decompositions`: Print decompositions on plot (only for `plot_fastest`).
-- `-b, --backends`: List of backends to include.
-- `-sc, --scaling`: Scaling type (`Weak` or `Strong`).

jax_hpc_profiler-0.2.0/pyproject.toml DELETED

@@ -1,23 +0,0 @@
-[build-system]
-requires = ["setuptools", "wheel"]
-build-backend = "setuptools.build_meta"
-[project]
-name = "jax_hpc_profiler"
-version = "0.2.0"
-description = "HPC Plotter and profiler for benchmarking data made for JAX"
-authors = [
-    { name="Wassim Kabalan" }
-]
-dependencies = [
-    "numpy",
-    "pandas",
-    "matplotlib",
-    "seaborn",
-    "tabulate"
-]
-readme = "README.md"
-license = { file = "LICENSE" }
-[project.scripts]
-jhp = "jax_hpc_profiler.main:main"