torchax 0.0.4-py3-none-any.whl → 0.0.6-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


@@ -1,7 +1,7 @@
  Metadata-Version: 2.4
  Name: torchax
- Version: 0.0.4
- Summary: torchax is a library for running PyTorch on TPU
+ Version: 0.0.6
+ Summary: torchax is a library for running Jax and PyTorch together
  Project-URL: Homepage, https://github.com/pytorch/xla/tree/master/torchax
  Author-email: Han Qi <qihan.dev@gmail.com>, Pytorch/XLA team <pytorchxla-dev@google.com>
  License: BSD 3-Clause License
@@ -51,111 +51,75 @@ Classifier: Topic :: Software Development :: Libraries :: Python Modules
  Requires-Python: >=3.10
  Provides-Extra: cpu
  Requires-Dist: jax[cpu]; extra == 'cpu'
- Requires-Dist: jax[cpu]>=0.4.30; extra == 'cpu'
+ Requires-Dist: jax[cpu]>=0.6.2; extra == 'cpu'
  Provides-Extra: cuda
- Requires-Dist: jax[cpu]>=0.4.30; extra == 'cuda'
+ Requires-Dist: jax[cpu]>=0.6.2; extra == 'cuda'
  Requires-Dist: jax[cuda12]; extra == 'cuda'
  Provides-Extra: odml
  Requires-Dist: jax[cpu]; extra == 'odml'
- Requires-Dist: jax[cpu]>=0.4.30; extra == 'odml'
+ Requires-Dist: jax[cpu]>=0.6.2; extra == 'odml'
  Provides-Extra: tpu
- Requires-Dist: jax[cpu]>=0.4.30; extra == 'tpu'
+ Requires-Dist: jax[cpu]>=0.6.2; extra == 'tpu'
  Requires-Dist: jax[tpu]; extra == 'tpu'
  Description-Content-Type: text/markdown

- # torchax: Running PyTorch on TPU
+ # torchax: Running PyTorch on TPU via JAX

- **torchax!** is a backend for PyTorch, allowing users to run
- PyTorch on Google CloudTPUs. **torchax!** is also a library for providing
- graph-level interoperability between PyTorch and Jax.
+ **torchax** is a backend for PyTorch, allowing users to run
+ PyTorch on Google Cloud TPUs. **torchax** is also a library for providing
+ graph-level interoperability between PyTorch and JAX.

  This means, with **torchax** you can:
- * Run PyTorch code on TPU with as little as 2 lines of code change.
- * Call a jax function from a pytorch function, passing in `jax.Array`s
- * Call a pytorch function from a jax function, passing in a `torch.Tensor` subclass.
- * Use jax features such as `jax.grad`, `optax` and `GSMPD` to train a Pytorch model.
- * Use a Pytorch model as feature extractor and use it with a Jax model.
+ * Run PyTorch code on TPUs with as little as 2 lines of code change.
+ * Call a JAX function from a PyTorch function, passing in `jax.Array`s.
+ * Call a PyTorch function from a JAX function, passing in a `torch.Tensor`s.
+ * Use JAX features such as `jax.grad`, `optax`, and `GSPMD` to train a PyTorch
+ model.
+ * Use a PyTorch model as feature extractor and use it with a JAX model.
  etc etc.

  ## Install

-
- ### On Google Cloud TPU:
  First install torch CPU:

  ```bash
+ # On Linux.
  pip install torch --index-url https://download.pytorch.org/whl/cpu
- ```
-
- Then install jax TPU:
-
- ```bash
- pip install -U jax[tpu]
- ```
-
- Finally install torchax

- ```bash
- pip install torchax
+ # Or on Mac.
+ pip install torch
  ```

- ### On GPU machines:
- First install torch CPU:
+ Then install JAX for the accelerator you want to use:

  ```bash
- pip install torch --index-url https://download.pytorch.org/whl/cpu
- ```
-
- Then install jax CUDA:
+ # On Google Cloud TPU.
+ pip install -U jax[tpu]

- ```bash
+ # Or, on GPU machines.
  pip install -U jax[cuda12]
- ```
-
- Finally install torchax
-
- ```bash
- pip install torchax
- ```

- ### On CPU machines (mac included)
- First install torch CPU:
-
- ```bash
- # Linux
- pip install torch --index-url https://download.pytorch.org/whl/cpu
-
- # OR Mac:
- pip install torch
- ```
-
- Then install jax CPU:
-
- ```bash
+ # Or, on Linux CPU machines or Macs (see the note below).
  pip install -U jax
  ```

- Finally install torchax
-
- ```bash
- pip install torchax
- ```
-
  NOTE: if you like metal support for Apple devices then install the
- metal version of jax: https://developer.apple.com/metal/jax/
+ metal version of JAX: https://developer.apple.com/metal/jax/

- ### Installing `torchax` from source
-
- Still need to install `torch` CPU and `Jax` of your accelerator (GPU, TPU or None).
+ Finally install torchax:

  ```bash
+ # Install pre-built torchax.
+ pip install torchax
+
+ # Or, install torchax from source.
  pip install git+https://github.com/pytorch/xla.git#subdirectory=torchax
  ```

  ## Run a model

- Now let's execute a model under torchax. We'll start with a simple 2-layer model
- it can be in theory any instance of `torch.nn.Module`.
+ Now let's execute a model under torchax. We'll start with a simple 2-layer model.
+ In theory, we can use any instance of `torch.nn.Module`.

  ```python
  import torch
@@ -179,75 +143,73 @@ class MyModel(nn.Module):

  m = MyModel()

- # Execute this model using torch
+ # Execute this model using torch.
  inputs = torch.randn(3, 3, 28, 28)
  print(m(inputs))
  ```

- This model `m` contains 2 parts: the weights that is stored inside of the model
- and it's submodules (`nn.Linear`).
-
- To execute this model with `torchax`; we need to enable torchax to capture pytorch ops.
- To enable this, use:
+ To execute this model with `torchax`, we need to enable torchax to capture PyTorch ops:

  ```python
  import torchax
  torchax.enable_globally()
  ```
- Then, a `jax` device will be available to use
+
+ Then, we can use a `jax` device:

  ```python
  inputs = torch.randn(3, 3, 28, 28, device='jax')
- m = MyModel()
+ m = MyModel().to('jax')
  res = m(inputs)
  print(type(res)) # outputs torchax.tensor.Tensor
  ```

  `torchax.tensor.Tensor` is a `torch.Tensor` subclass that holds
- a `jax.Array`. You can inspect that jax array with `res.jax()`
-
+ a `jax.Array`. You can inspect that JAX array with `res.jax()`.

- ## What is happening behind the scene:
+ ## What is happening behind the scene

- We took the approach detailed in [new device](https://github.com/albanD/subclass_zoo/blob/main/new_device.py) recipe by Alban (@albanD); using `jax.Array` for the `raw_data`.
+ We took the approach detailed in the
+ [new device](https://github.com/albanD/subclass_zoo/blob/main/new_device.py)
+ recipe by Alban (@albanD), using `jax.Array` for `raw_data`.

- In other words, When a torch op is executed inside of `env` context manager (which is enabled with `torchax.enable_globally()`), we can swap out the
- implementation of that op written in Jax.
+ In other words, when a torch op is executed inside an `env` context manager,
+ which is enabled by `torchax.enable_globally()`, we will swap out the
+ implementation of that op with JAX.

  When a model's constructor runs, it will call some tensor constructor, such as
- `torch.rand`, `torch.ones` or `torch.zeros` etc to create its weights. The constructor
- will create an `torch.Tensor` subclass that contains a `jax.Array`.
+ `torch.rand`, `torch.ones`, or `torch.zeros` to create its weights. When torchax
+ is enabled, these constructors will create a `torchax.tensor.Tensor`, which
+ contains a `jax.Array`.

- Then, each subsequent op can unpack the `jax.Array`, call the op implementation,
- and wraps it back into `torch.Tensor` subclass.
-
- See more at [how_it_works](docs/how_it_works.md) and [ops registry](docs/ops_registry.md).
+ Then, each subsequent op will extract the `jax.Array`, call the op's JAX
+ implementation, and wrap the result back into a `torchax.tensor.Tensor`,

+ See more at [how it works](docs/how_it_works.md) and\
+ [ops registry](docs/ops_registry.md).

  ### Executing with jax.jit

- The above script will execute the model using eager mode Jax as backend. This
- does allow executing torch models on TPU, but is often slower than what we can
+ The above script will execute the model using eager mode JAX as the backend. This
+ does allow executing torch models on TPUs, but is often slower than what we can
  achieve with `jax.jit`.

- `jax.jit` is a function that takes a Jax function (i.e. a function that takes jax array
- and returns jax array) into the same function, but faster.
+ `jax.jit` is a function that takes a JAX function (i.e. a function that takes JAX arrays
+ and returns JAX arrays) into a compiled (thus faster) version of the same function.

- We have made the `jax_jit` decorator that would accomplish the same with functions
- that takes and returns `torch.Tensor`. To use this, the first step is to create
+ We have made a `jax_jit` decorator that would accomplish the same with functions
+ that takes and returns `torch.Tensor`s. To use this, the first step is to create
  a functional version of this model: this means the parameters should be passed in
- as input instead of being attributes on class:
-
+ as input instead of being attributes of the class:

  ```python
-
  def model_func(param, inputs):
    return torch.func.functional_call(m, param, inputs)
-
  ```
+
  Here we use [torch.func.functional_call](https://pytorch.org/docs/stable/generated/torch.func.functional_call.html)
- from PyTorch to replace the model
- weights with `param`, then call the model. This is roughly equivalent to:
+ from PyTorch to replace the model weights with `param` and then call the
+ model. This is roughly equivalent to:

  ```python
  def model_func(param, inputs):
@@ -255,87 +217,91 @@ def model_func(param, inputs):
    return m(*inputs)
  ```

- Now, we can apply `jax_jit`
+ Now, we can apply `jax_jit` on `module_func`:

  ```python
  from torchax.interop import jax_jit
+
  model_func_jitted = jax_jit(model_func)
  print(model_func_jitted(new_state_dict, inputs))
  ```

- See more examples at [eager_mode.py](examples/eager_mode.py) and the (examples folder)[examples/]
-
- However, to ease the idiom of creating functional model and calling it with parameters,
- we also created the `JittableModule` helper class.
+ See more examples at [eager_mode.py](examples/eager_mode.py) and the
+ [examples folder](examples/).

- So the above can be written as:
+ To ease the idiom of creating functional model and calling it with parameters,
+ we also created the `JittableModule` helper class. It lets us rewrite the
+ above as:

  ```python
-
  from torchax.interop import JittableModule

  m_jitted = JittableModule(m)
  res = m_jitted(...)
  ```

- The first time that `m_jitted` is called , it will trigger `jax.jit`
- then the subsequent computation with inputs of same shape will be fast.
-
+ The first time `m_jitted` is called, it will trigger `jax.jit` to compile the
+ compile for the given input shapes. Subsequent calls with the same input shapes
+ will be fast as the compilation is cached.

+ ## Citation

- # Citation:
-
+ ```
  @software{torchax,
  author = {Han Qi, Chun-nien Chan, Will Cromar, Manfei Bai, Kevin Gleanson},
- title = {torchax: PyTorch on TPU and Jax interoperability},
+ title = {torchax: PyTorch on TPU and JAX interoperability},
  url = {https://github.com/pytorch/xla/tree/master/torchax}
  version = {0.0.4},
  date = {2025-02-24},
  }
+ ```

  # Maintainers & Contributors:

  This library is created and maintained by the PyTorch/XLA team at Google Cloud.

- However, it benefitted from many direct and indirect
+ It benefitted from many direct and indirect
  contributions outside of the team. Many of them done by
- fellow Googlers using [Google's 20% project policy](https://ebsedu.org/blog/google-tapping-workplace-actualization-20-time-rule), others by partner teams.
+ fellow Googlers using [Google's 20% project policy](https://ebsedu.org/blog/google-tapping-workplace-actualization-20-time-rule).
+ Others by partner teams at Google and other companies.

- Here is the full list of contributors by 2025-02-25.
+ Here is the list of contributors by 2025-02-25.

- Han Qi (qihqi), Pytorch / XLA
- Manfei Bai (manfeibai), Pytorch / XLA
+ ```
+ Han Qi (qihqi), PyTorch/XLA
+ Manfei Bai (manfeibai), PyTorch/XLA
  Will Cromar (will-cromar), Meta
- Milad Mohammadi (miladm), Pytorch / XLA
- Siyuan Liu (lsy323), Pytorch / XLA
- Bhavya Bahl (bhavya01), Pytorch / XLA
- Pei Zhang (zpcore), Pytorch / XLA
- Yifei Teng (tengyifei), Pytorch / XLA
+ Milad Mohammadi (miladm), PyTorch/XLA
+ Siyuan Liu (lsy323), PyTorch/XLA
+ Bhavya Bahl (bhavya01), PyTorch/XLA
+ Pei Zhang (zpcore), PyTorch/XLA
+ Yifei Teng (tengyifei), PyTorch/XLA
  Chunnien Chan (chunnienc), Google, ODML
- Alban Desmaison (albanD), Meta, Pytorch
- Simon Teo (simonteozw), Google(20%)
- David Huang (dvhg), Google(20%)
- Barni Seetharaman (barney-s), Google(20%)
- Anish Karthik (anishfish2) , Google(20%)
- Yao Gu (guyao) , Google(20%)
- Yenkai Wang (yenkwang) , Google(20%)
- Greg Shikhman (commander) , Google(20%)
- Matin Akhlaghinia (matinehAkhlaghinia), Google(20%)
- Tracy Chen (tracych477), Google(20%)
- Matthias Guenther (mrguenther) , Google(20%)
- WenXin Dong (wenxindongwork), Google(20%)
- Kevin Gleason (GleasonK) , Google, StableHLO
- Nupur Baghel (nupurbaghel), Google(20%)
- Gwen Mittertreiner (gmittert), Google(20%)
+ Alban Desmaison (albanD), Meta, PyTorch
+ Simon Teo (simonteozw), Google (20%)
+ David Huang (dvhg), Google (20%)
+ Barni Seetharaman (barney-s), Google (20%)
+ Anish Karthik (anishfish2), Google (20%)
+ Yao Gu (guyao), Google (20%)
+ Yenkai Wang (yenkwang), Google (20%)
+ Greg Shikhman (commander), Google (20%)
+ Matin Akhlaghinia (matinehAkhlaghinia), Google (20%)
+ Tracy Chen (tracych477), Google (20%)
+ Matthias Guenther (mrguenther), Google (20%)
+ WenXin Dong (wenxindongwork), Google (20%)
+ Kevin Gleason (GleasonK), Google, StableHLO
+ Nupur Baghel (nupurbaghel), Google (20%)
+ Gwen Mittertreiner (gmittert), Google (20%)
  Zeev Melumian (zmelumian), Lightricks
- Vyom Sharma (vyom1611), Google(20%)
+ Vyom Sharma (vyom1611), Google (20%)
  Shitong Wang (ShitongWang), Adobe
- Rémi Doreau (ayshiff), Google(20%)
+ Rémi Doreau (ayshiff), Google (20%)
  Lance Wang (wang2yn84), Google, CoreML
- Hossein Sarshar (hosseinsarshar) , Google(20%)
- Daniel Vega-Myhre (danielvegamyhre) , Google(20%)
- Tianqi Fan (tqfan28), Google(20%)
- Jim Lin (jimlinntu), Google(20%)
+ Hossein Sarshar (hosseinsarshar), Google (20%)
+ Daniel Vega-Myhre (danielvegamyhre), Google (20%)
+ Tianqi Fan (tqfan28), Google (20%)
+ Jim Lin (jimlinntu), Google (20%)
  Fanhai Lu (FanhaiLu1), Google Cloud
  DeWitt Clinton (dewitt), Google PyTorch
- Aman Gupta (aman2930) , Google(20%)
+ Aman Gupta (aman2930), Google (20%)
+ ```
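
To make the workflow the updated README describes concrete, here is a minimal end-to-end sketch assembled from the snippets above (`torchax.enable_globally()`, the `'jax'` device, `torch.func.functional_call`, and `jax_jit` / `JittableModule` from `torchax.interop`). The two-layer model, the shapes, and the use of `m.state_dict()` as the parameter mapping are illustrative assumptions made for this review, not content of the 0.0.6 wheel.

```python
import torch
import torch.nn as nn
import torchax
from torchax.interop import JittableModule, jax_jit

# Route torch ops to their JAX implementations and expose the 'jax' device.
torchax.enable_globally()

class TwoLayer(nn.Module):  # hypothetical stand-in for the README's MyModel
  def __init__(self):
    super().__init__()
    self.fc1 = nn.Linear(784, 128)
    self.fc2 = nn.Linear(128, 10)

  def forward(self, x):
    return self.fc2(torch.relu(self.fc1(x)))

m = TwoLayer().to('jax')                    # weights become torchax tensors
inputs = torch.randn(8, 784, device='jax')  # inputs hold jax.Arrays

# Functional form: parameters are passed in instead of read from attributes.
def model_func(param, inputs):
  return torch.func.functional_call(m, param, inputs)

model_func_jitted = jax_jit(model_func)  # compiled on first call per input shape
print(model_func_jitted(m.state_dict(), inputs))

# Or let JittableModule handle the functional-call idiom.
m_jitted = JittableModule(m)
print(m_jitted(inputs))
```

Whether `state_dict()` is exactly the parameter container `jax_jit` expects follows the README's own `new_state_dict` example; treat this as a sketch of the documented API rather than verified 0.0.6 behaviour.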
@@ -0,0 +1,33 @@
+ torchax/CONTRIBUTING.md,sha256=VOL0us6kS-uc4yE6IlSm6SDHYHnx-gw-0upFnP0VkSQ,1369
+ torchax/__init__.py,sha256=c98iIGugRTbEVcsx8eWnbAjsC4mpcDrK23ZQqiMycLg,3157
+ torchax/amp.py,sha256=-k8t4lrCsJLKHEhI6J0aHE3MAPEL-4DP6wCKtMwo1AM,11791
+ torchax/config.py,sha256=O9yF96AShWb02hcwkT5ToPTt_hpOo3dMJNO30A7dmac,922
+ torchax/configuration.py,sha256=O9yF96AShWb02hcwkT5ToPTt_hpOo3dMJNO30A7dmac,922
+ torchax/decompositions.py,sha256=1p5TFZfAJ2Bs9BiSO1vXbnWEXnbPfC_gCQ54rDXhd9k,28859
+ torchax/device_module.py,sha256=7fkdPwXG0qCBTmvDYHp0fvv4xK0W9avV_Ua3MeMzczE,349
+ torchax/environment.py,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
+ torchax/export.py,sha256=xU-UbrQBvQWUy-GM2FfeIHymlEdmYDYcPymjlcXM23w,8969
+ torchax/flax.py,sha256=2Tg8inGskgAfByPxJQh4ItZHHAb-960gYq156bSO8V4,1280
+ torchax/interop.py,sha256=7HvJwtxdodcCrMyJzs-Wr47hkHuoh6CWb2-YKoBwqV0,11076
+ torchax/mesh_util.py,sha256=Ab4ic2eHWmQ3Mw3jpERvi-TKLIcDvQQoC6tuIZ9ig7Q,9314
+ torchax/tensor.py,sha256=XjAp7khpQNhoVsSMzDj-V8l4DFT9jBaL4NVCi88a6K0,20893
+ torchax/tf_integration.py,sha256=d_h4vSJm7N9rJXpUPNCDOiUz3J1-UPo3KU8D9Wi4nnc,4074
+ torchax/train.py,sha256=rtvj6HkdnG9fc3VWYPNwHuxGlUxFJkUXJWED8azgtok,3855
+ torchax/types.py,sha256=j4ERjkgDgwhgi9zrwwbbiv4HMDlrJ1IEMUCmP_BIJ9M,388
+ torchax/util.py,sha256=cb-eudDE7AX2s-6zYtXdowgyzyvqPqE9MPP82PfH23g,3069
+ torchax/view.py,sha256=1ekqRN04lAPd_icgZMKbSYWhr738DzVloc34ynml4wo,11121
+ torchax/ops/__init__.py,sha256=Vr1p8zDHwfXZBUbw70iNiCJLZLNdI6gR_vUlaiA7Usg,270
+ torchax/ops/jaten.py,sha256=WxfZU6p7b7OR98B3z0LCXKlV6U5aslXxJMJirBr6lns,165835
+ torchax/ops/jax_reimplement.py,sha256=idkmFWNCXBilkmaHBGdivKz0XhsjSpqLNlGXxbBOKWQ,7302
+ torchax/ops/jc10d.py,sha256=OzSYYle_5jBmNVP64SuJPz9S-rRGD6H7e1a9HHIKsjU,1322
+ torchax/ops/jimage.py,sha256=P0lAauYX_au_xjIHDsG7H6jO7Jf54_VCAjzZuIZdhO0,3182
+ torchax/ops/jlibrary.py,sha256=YfYUQbf5dKiMtEHUMfdgHTeLuNvvSTJ-l8s7wQNIvO0,2930
+ torchax/ops/jtorch.py,sha256=wR4ZdDscxqG4VpxjcLGzgdUKmipa3fp7S0mK3DcD--A,17161
+ torchax/ops/jtorchvision_nms.py,sha256=HSnhwU0gFaHucT7EvrEruJdnWkAWTw4T35GY525ohO8,8903
+ torchax/ops/mappings.py,sha256=AESERtXJ6i_Hm0ycwEw7z5OJnHu-7QteWlSs-mlUPE4,3492
+ torchax/ops/op_base.py,sha256=MLKFxMojIXgz4lkTE6k-8F-ddve-9vEiXkzj3P-YJPs,3739
+ torchax/ops/ops_registry.py,sha256=qADpG1up0JOThoybiOQoRDWtAe5TOkHlqcj1bSHjtGY,1594
+ torchax-0.0.6.dist-info/METADATA,sha256=uB9hoyxdfrAD14pHy0U8Gh1uCHbYwok-oEW12pEa6qs,10753
+ torchax-0.0.6.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
+ torchax-0.0.6.dist-info/licenses/LICENSE,sha256=ZHyir3-ltOerFLt9JH1bjf7lIxIWipFmqeMnB_8z_aU,1498
+ torchax-0.0.6.dist-info/RECORD,,
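
The RECORD file listed above pairs every file in the 0.0.6 wheel with a `sha256=` digest. Per the wheel spec (PEP 376/427), that digest is the URL-safe base64 encoding of the raw SHA-256 hash with the trailing `=` padding stripped. The small helper below was written for this review (it is not part of the package) and shows how an entry can be re-derived from an unpacked wheel, e.g. after `pip download torchax==0.0.6 --no-deps` and unzipping.

```python
# Hypothetical spot-check of a RECORD entry from an unpacked torchax 0.0.6 wheel.
import base64
import hashlib

def record_hash(path: str) -> str:
  """Return a digest in the urlsafe-base64, unpadded form that RECORD uses."""
  with open(path, "rb") as f:
    digest = hashlib.sha256(f.read()).digest()
  return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

# Should reproduce the value listed above for torchax/device_module.py.
print(record_hash("torchax/device_module.py"))
```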
torchax/distributed.py DELETED
@@ -1,246 +0,0 @@
- """`torch.distributed` backend implemented with JAX collective ops.
-
- EXPERIMENTAL: This module is still highly experimental, and it may be removed
- before any stable release.
-
- Note: JAX collective ops require that axis names be defined in `pmap` or
- `shmap`. The distributed backend only supports one axis, named `torch_dist`.
- This name is defined by our mirror implementation of `spawn`.
- """
-
- import datetime
- import functools
- import logging
- import os
- from typing import List, Optional, Union
-
- import jax
- import numpy as np
- import torch
- import torch.distributed as dist
- import torch.distributed._functional_collectives
- from torch._C._distributed_c10d import ProcessGroup # type: ignore
- import torch.distributed
- import torchax
- from jax.sharding import NamedSharding
- from jax.sharding import Mesh, PartitionSpec as P
- from jax.experimental import mesh_utils
- import torch.utils._pytree as torch_pytree
- from torchax import interop
-
-
- class ProcessGroupJax(ProcessGroup):
-   """Distributed backend implemented with JAX."""
-
-   def __init__(self, prefix_store, rank, size, timeout):
-     super().__init__(rank, size)
-     self._group_name = None
-
-   def getBackendName(self):
-     return "jax"
-
-   # TODO(wcromar): why doesn't default group name setter work?
-   # https://github.com/pytorch/pytorch/blob/7b1988f9222f3dec5cc2012afce84218199748ae/torch/csrc/distributed/c10d/ProcessGroup.cpp#L148-L152
-   def _set_group_name(self, name: str) -> None:
-     self._group_name = name
-
-   @property
-   def group_name(self):
-     assert self._group_name
-     return self._group_name
-
-   @staticmethod
-   def _work(
-       tensors: Union[torch.Tensor, List[torch.Tensor], List[List[torch.Tensor]]],
-   ) -> dist.Work:
-     fut = torch.futures.Future()
-     fut.set_result(tensors)
-     return torch._C._distributed_c10d._create_work_from_future(fut)
-
-   def _allgather_base(
-       self,
-       output: torch.Tensor,
-       input: torch.Tensor,
-       opts=...,
-   ) -> dist.Work:
-     assert isinstance(input, torchax.tensor.Tensor)
-     assert isinstance(output, torchax.tensor.Tensor)
-     torch.distributed._functional_collectives.all_gather_tensor_inplace(
-         output, input, group=self
-     )
-     return self._work(output)
-
-   def allreduce(
-       self,
-       tensors: List[torch.Tensor],
-       opts: dist.AllreduceOptions = ...,
-   ) -> dist.Work:
-     assert len(tensors) == 1
-     assert isinstance(tensors[0], torchax.tensor.Tensor)
-     torch.distributed._functional_collectives.all_reduce_inplace(
-         tensors[0],
-         torch.distributed._functional_collectives.REDUCE_OP_TO_STR[
-             opts.reduceOp.op
-         ],
-         self,
-     )
-
-     return self._work(tensors)
-
-   def broadcast(
-       self,
-       tensors: List[torch.Tensor],
-       opts: dist.BroadcastOptions = ...,
-   ) -> dist.Work:
-     assert len(tensors) == 1
-     assert isinstance(tensors[0], torchax.tensor.Tensor)
-     tensors[0].copy_(
-         torch.distributed._functional_collectives.broadcast(
-             tensors[0], opts.rootRank, group=self
-         )
-     )
-
-     return self._work(tensors)
-
-
- dist.Backend.register_backend("jax", ProcessGroupJax)
-
-
- def jax_rendezvous_handler(
-     url: str, timeout: datetime.timedelta = ..., **kwargs
- ):
-   """Initialize distributed store with JAX process IDs.
-
-   Requires `$MASTER_ADDR` and `$MASTER_PORT`.
-   """
-   # TODO(wcromar): jax.distributed.initialize(...) for multiprocess on GPU
-   # TODO(wcromar): Can we use the XLA coordinator as a Store? This isn't part
-   # of their public Python API
-   master_ip = os.environ["MASTER_ADDR"]
-   master_port = int(os.environ["MASTER_PORT"])
-   # TODO(wcromar): Use `torchrun`'s store if available
-   store = dist.TCPStore(
-       master_ip,
-       master_port,
-       jax.process_count(),
-       is_master=jax.process_index() == 0,
-   )
-
-   yield (store, jax.process_index(), jax.process_count())
-
-
- dist.register_rendezvous_handler("jax", jax_rendezvous_handler)
-
-
- def spawn(f, args=(), env: Optional[torchax.tensor.Environment] = None):
-   """Wrap `f` in a JAX `pmap` with the axis name `torch_dist` defined.
-   `f` is expected to take the replica index as a positional argument, similar
-   to `torch.multiprocessing.spawn`.
-   Note: `spawn` does not actually create parallel processes.
-   """
-   env = env or torchax.default_env()
-
-   def jax_wrapper(index, jax_args):
-     index, args = env.j2t_iso([index, jax_args])
-     torch_outputs = f(index, *args)
-     return env.t2j_iso(torch_outputs)
-
-   jax_outputs = jax.pmap(jax_wrapper, axis_name="torch_dist")(
-       np.arange(jax.device_count()), env.t2j_iso(args)
-   )
-   return env.j2t_iso(jax_outputs)
-
-
- class DistributedDataParallel(torch.nn.Module):
-   """Re-implementation of DistributedDataParallel using JAX SPMD.
-
-   Splits inputs along batch dimension (assumed to be 0) across all devices in
-   JAX runtime, including remote devices. Each process should load a distinct
-   shard of the input data using e.g. DistributedSampler. Each process' shard
-   is then further split among the addressable devices (e.g. local TPU chips)
-   by `shard_input`.
-
-   Note: since parameters are replicated across addressable devices, inputs
-   must also be SPMD sharded using `shard_input` or `replicate_input`.
-
-   Example usage:
-
-   ```
-   jax_model = torchax.distributed.DistributedDataParallel(create_model())
-   for data, dataloader:
-     jax_data = jax_model.shard_input(data)
-     jax_output = jax_model(jax_data)
-   ```
-   """
-   def __init__(
-       self,
-       module: torch.nn.Module,
-       env: Optional[torchax.tensor.Environment] = None,
-       **kwargs,
-   ):
-     if kwargs:
-       logging.warning(f"Unsupported kwargs {kwargs}")
-
-     super().__init__()
-     self._env = env or torchax.default_env()
-     self._mesh = Mesh(
-         mesh_utils.create_device_mesh((jax.device_count(),)),
-         axis_names=("batch",),
-     )
-     replicated_state = torch_pytree.tree_map_only(
-         torch.Tensor,
-         lambda t: self._env.j2t_iso(
-             jax.device_put(
-                 self._env.to_xla(t)._elem, NamedSharding(self._mesh, P())
-             )
-         ),
-         module.state_dict(),
-     )
-     # TODO: broadcast
-     module.load_state_dict(replicated_state, assign=True)
-     self._module = module
-
-   def shard_input(self, inp):
-     per_process_batch_size = inp.shape[0] # assumes batch dim is 0
-     per_replica_batch_size = per_process_batch_size // jax.local_device_count()
-     per_replica_batches = torch.chunk(inp, jax.local_device_count())
-     global_batch_size = per_replica_batch_size * jax.device_count()
-     global_batch_shape = (global_batch_size,) + inp.shape[1:]
-
-     sharding = NamedSharding(self._mesh, P("batch"))
-     return self._env.j2t_iso(jax.make_array_from_single_device_arrays(
-         global_batch_shape,
-         NamedSharding(self._mesh, P("batch")),
-         arrays=[
-             jax.device_put(self._env.to_xla(batch)._elem, device)
-             for batch, device in zip(
-                 per_replica_batches, sharding.addressable_devices
-             )
-         ],
-     ))
-
-   def replicate_input(self, inp):
-     return self._env.j2t_iso(
-         jax.device_put(inp._elem, NamedSharding(self._mesh, P()))
-     )
-
-   def jit_step(self, func):
-     @functools.partial(interop.jax_jit,
-                        kwargs_for_jax_jit={'donate_argnums': 0})
-     def _jit_fn(states, args):
-       self.load_state_dict(states)
-       outputs = func(*args)
-       return self.state_dict(), outputs
-
-     @functools.wraps(func)
-     def inner(*args):
-       jax_states = self.state_dict()
-       new_states, outputs = _jit_fn(jax_states, args)
-       self.load_state_dict(new_states)
-       return outputs
-
-     return inner
-
-   def forward(self, *args):
-     with self._env:
-       return self._module(*args)