pyRDDLGym-jax 2.4.tar.gz → 2.6.tar.gz
This diff compares the contents of publicly available package versions as released to their public registries. It is provided for informational purposes only.
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/PKG-INFO +17 -30
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/README.md +13 -27
- pyrddlgym_jax-2.6/pyRDDLGym_jax/__init__.py +1 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/compiler.py +23 -10
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/logic.py +6 -8
- pyrddlgym_jax-2.6/pyRDDLGym_jax/core/model.py +595 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/planner.py +317 -99
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/simulator.py +37 -13
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/tuning.py +25 -10
- pyrddlgym_jax-2.6/pyRDDLGym_jax/entry_point.py +59 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/tuning_drp.cfg +1 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/tuning_replan.cfg +1 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/tuning_slp.cfg +1 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/run_plan.py +1 -1
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/run_tune.py +8 -2
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax.egg-info/PKG-INFO +17 -30
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax.egg-info/SOURCES.txt +1 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax.egg-info/requires.txt +1 -1
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/setup.py +2 -2
- pyrddlgym_jax-2.4/pyRDDLGym_jax/__init__.py +0 -1
- pyrddlgym_jax-2.4/pyRDDLGym_jax/entry_point.py +0 -27
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/LICENSE +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/__init__.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/assets/__init__.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/assets/favicon.ico +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/core/visualization.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/__init__.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Cartpole_Continuous_gym_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Cartpole_Continuous_gym_replan.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Cartpole_Continuous_gym_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/HVAC_ippc2023_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/HVAC_ippc2023_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/MountainCar_Continuous_gym_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/MountainCar_ippc2023_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/PowerGen_Continuous_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/PowerGen_Continuous_replan.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/PowerGen_Continuous_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Quadcopter_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Quadcopter_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Reservoir_Continuous_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Reservoir_Continuous_replan.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Reservoir_Continuous_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/UAV_Continuous_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Wildfire_MDP_ippc2014_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Wildfire_MDP_ippc2014_replan.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/Wildfire_MDP_ippc2014_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/__init__.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/default_drp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/default_replan.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/configs/default_slp.cfg +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/run_gradient.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/run_gym.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax/examples/run_scipy.py +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax.egg-info/dependency_links.txt +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax.egg-info/entry_points.txt +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/pyRDDLGym_jax.egg-info/top_level.txt +0 -0
- {pyrddlgym_jax-2.4 → pyrddlgym_jax-2.6}/setup.cfg +0 -0
````diff
--- pyrddlgym_jax-2.4/PKG-INFO
+++ pyrddlgym_jax-2.6/PKG-INFO
@@ -1,6 +1,6 @@
-Metadata-Version: 2.
+Metadata-Version: 2.4
 Name: pyRDDLGym-jax
-Version: 2.4
+Version: 2.6
 Summary: pyRDDLGym-jax: automatic differentiation for solving sequential planning problems in JAX.
 Home-page: https://github.com/pyrddlgym-project/pyRDDLGym-jax
 Author: Michael Gimelfarb, Ayal Taitler, Scott Sanner
@@ -20,7 +20,7 @@ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Requires-Python: >=3.9
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: pyRDDLGym>=2.
+Requires-Dist: pyRDDLGym>=2.3
 Requires-Dist: tqdm>=4.66
 Requires-Dist: jax>=0.4.12
 Requires-Dist: optax>=0.1.9
@@ -39,6 +39,7 @@ Dynamic: description
 Dynamic: description-content-type
 Dynamic: home-page
 Dynamic: license
+Dynamic: license-file
 Dynamic: provides-extra
 Dynamic: requires-dist
 Dynamic: requires-python
@@ -54,7 +55,7 @@ Dynamic: summary
 
 [Installation](#installation) | [Run cmd](#running-from-the-command-line) | [Run python](#running-from-another-python-application) | [Configuration](#configuring-the-planner) | [Dashboard](#jaxplan-dashboard) | [Tuning](#tuning-the-planner) | [Simulation](#simulation) | [Citing](#citing-jaxplan)
 
-**pyRDDLGym-jax (
+**pyRDDLGym-jax (or JaxPlan) is an efficient gradient-based planning algorithm based on JAX.**
 
 Purpose:
 
@@ -83,7 +84,7 @@ and was moved to the individual logic components which have their own unique wei
 
 > [!NOTE]
 > While JaxPlan can support some discrete state/action problems through model relaxations, on some discrete problems it can perform poorly (though there is an ongoing effort to remedy this!).
-> If you find it is not making
+> If you find it is not making progress, check out the [PROST planner](https://github.com/pyrddlgym-project/pyRDDLGym-prost) (for discrete spaces) or the [deep reinforcement learning wrappers](https://github.com/pyrddlgym-project/pyRDDLGym-rl).
 
 ## Installation
 
@@ -116,7 +117,7 @@ pip install pyRDDLGym-jax[extra,dashboard]
 A basic run script is provided to train JaxPlan on any RDDL problem:
 
 ```shell
-jaxplan plan <domain> <instance> <method> <episodes>
+jaxplan plan <domain> <instance> <method> --episodes <episodes>
 ```
 
 where:
@@ -219,13 +220,7 @@ controller = JaxOfflineController(planner, **train_args)
 ## JaxPlan Dashboard
 
 Since version 1.0, JaxPlan has an optional dashboard that allows keeping track of the planner performance across multiple runs,
-and visualization of the policy or model, and other useful debugging features.
-
-<p align="middle">
-<img src="https://github.com/pyrddlgym-project/pyRDDLGym-jax/blob/main/Images/dashboard.png" width="480" height="248" margin=0/>
-</p>
-
-To run the dashboard, add the following entry to your config file:
+and visualization of the policy or model, and other useful debugging features. To run the dashboard, add the following to your config file:
 
 ```ini
 ...
@@ -234,14 +229,12 @@ dashboard=True
 ...
 ```
 
-More documentation about this and other new features will be coming soon.
-
 ## Tuning the Planner
 
 A basic run script is provided to run automatic Bayesian hyper-parameter tuning for the most sensitive parameters of JaxPlan:
 
 ```shell
-jaxplan tune <domain> <instance> <method> <trials> <iters> <workers> <dashboard>
+jaxplan tune <domain> <instance> <method> --trials <trials> --iters <iters> --workers <workers> --dashboard <dashboard> --filepath <filepath>
 ```
 
 where:
@@ -251,7 +244,8 @@ where:
 - ``trials`` is the (optional) number of trials/episodes to average in evaluating each hyper-parameter setting
 - ``iters`` is the (optional) maximum number of iterations/evaluations of Bayesian optimization to perform
 - ``workers`` is the (optional) number of parallel evaluations to be done at each iteration, e.g. the total evaluations = ``iters * workers``
-- ``dashboard`` is whether the optimizations are tracked in the dashboard application
+- ``dashboard`` is whether the optimizations are tracked in the dashboard application
+- ``filepath`` is the optional file path where a config file with the best hyper-parameter setting will be saved.
 
 It is easy to tune a custom range of the planner's hyper-parameters efficiently.
 First create a config file template with patterns replacing concrete parameter values that you want to tune, e.g.:
@@ -291,23 +285,16 @@ env = pyRDDLGym.make(domain, instance, vectorized=True)
 with open('path/to/config.cfg', 'r') as file:
     config_template = file.read()
 
-#
+# tune weight from 10^-1 ... 10^5 and lr from 10^-5 ... 10^1
 def power_10(x):
-    return 10.0 ** x
-
-hyperparams = [
-    Hyperparameter('TUNABLE_WEIGHT', -1., 5., power_10),  # tune weight from 10^-1 ... 10^5
-    Hyperparameter('TUNABLE_LEARNING_RATE', -5., 1., power_10),  # tune lr from 10^-5 ... 10^1
-]
+    return 10.0 ** x
+hyperparams = [Hyperparameter('TUNABLE_WEIGHT', -1., 5., power_10),
+               Hyperparameter('TUNABLE_LEARNING_RATE', -5., 1., power_10)]
 
 # build the tuner and tune
 tuning = JaxParameterTuning(env=env,
-                            config_template=config_template,
-                            hyperparams=hyperparams,
-                            online=False,
-                            eval_trials=trials,
-                            num_workers=workers,
-                            gp_iters=iters)
+                            config_template=config_template, hyperparams=hyperparams,
+                            online=False, eval_trials=trials, num_workers=workers, gp_iters=iters)
 tuning.tune(key=42, log_file='path/to/log.csv')
 ```
````
````diff
--- pyrddlgym_jax-2.4/README.md
+++ pyrddlgym_jax-2.6/README.md
@@ -8,7 +8,7 @@
 
 [Installation](#installation) | [Run cmd](#running-from-the-command-line) | [Run python](#running-from-another-python-application) | [Configuration](#configuring-the-planner) | [Dashboard](#jaxplan-dashboard) | [Tuning](#tuning-the-planner) | [Simulation](#simulation) | [Citing](#citing-jaxplan)
 
-**pyRDDLGym-jax (
+**pyRDDLGym-jax (or JaxPlan) is an efficient gradient-based planning algorithm based on JAX.**
 
 Purpose:
 
@@ -37,7 +37,7 @@ and was moved to the individual logic components which have their own unique wei
 
 > [!NOTE]
 > While JaxPlan can support some discrete state/action problems through model relaxations, on some discrete problems it can perform poorly (though there is an ongoing effort to remedy this!).
-> If you find it is not making
+> If you find it is not making progress, check out the [PROST planner](https://github.com/pyrddlgym-project/pyRDDLGym-prost) (for discrete spaces) or the [deep reinforcement learning wrappers](https://github.com/pyrddlgym-project/pyRDDLGym-rl).
 
 ## Installation
 
@@ -70,7 +70,7 @@ pip install pyRDDLGym-jax[extra,dashboard]
 A basic run script is provided to train JaxPlan on any RDDL problem:
 
 ```shell
-jaxplan plan <domain> <instance> <method> <episodes>
+jaxplan plan <domain> <instance> <method> --episodes <episodes>
 ```
 
 where:
@@ -173,13 +173,7 @@ controller = JaxOfflineController(planner, **train_args)
 ## JaxPlan Dashboard
 
 Since version 1.0, JaxPlan has an optional dashboard that allows keeping track of the planner performance across multiple runs,
-and visualization of the policy or model, and other useful debugging features.
-
-<p align="middle">
-<img src="https://github.com/pyrddlgym-project/pyRDDLGym-jax/blob/main/Images/dashboard.png" width="480" height="248" margin=0/>
-</p>
-
-To run the dashboard, add the following entry to your config file:
+and visualization of the policy or model, and other useful debugging features. To run the dashboard, add the following to your config file:
 
 ```ini
 ...
@@ -188,14 +182,12 @@ dashboard=True
 ...
 ```
 
-More documentation about this and other new features will be coming soon.
-
 ## Tuning the Planner
 
 A basic run script is provided to run automatic Bayesian hyper-parameter tuning for the most sensitive parameters of JaxPlan:
 
 ```shell
-jaxplan tune <domain> <instance> <method> <trials> <iters> <workers> <dashboard>
+jaxplan tune <domain> <instance> <method> --trials <trials> --iters <iters> --workers <workers> --dashboard <dashboard> --filepath <filepath>
 ```
 
 where:
@@ -205,7 +197,8 @@ where:
 - ``trials`` is the (optional) number of trials/episodes to average in evaluating each hyper-parameter setting
 - ``iters`` is the (optional) maximum number of iterations/evaluations of Bayesian optimization to perform
 - ``workers`` is the (optional) number of parallel evaluations to be done at each iteration, e.g. the total evaluations = ``iters * workers``
-- ``dashboard`` is whether the optimizations are tracked in the dashboard application
+- ``dashboard`` is whether the optimizations are tracked in the dashboard application
+- ``filepath`` is the optional file path where a config file with the best hyper-parameter setting will be saved.
 
 It is easy to tune a custom range of the planner's hyper-parameters efficiently.
 First create a config file template with patterns replacing concrete parameter values that you want to tune, e.g.:
@@ -245,23 +238,16 @@ env = pyRDDLGym.make(domain, instance, vectorized=True)
 with open('path/to/config.cfg', 'r') as file:
     config_template = file.read()
 
-#
+# tune weight from 10^-1 ... 10^5 and lr from 10^-5 ... 10^1
 def power_10(x):
-    return 10.0 ** x
-
-hyperparams = [
-    Hyperparameter('TUNABLE_WEIGHT', -1., 5., power_10),  # tune weight from 10^-1 ... 10^5
-    Hyperparameter('TUNABLE_LEARNING_RATE', -5., 1., power_10),  # tune lr from 10^-5 ... 10^1
-]
+    return 10.0 ** x
+hyperparams = [Hyperparameter('TUNABLE_WEIGHT', -1., 5., power_10),
+               Hyperparameter('TUNABLE_LEARNING_RATE', -5., 1., power_10)]
 
 # build the tuner and tune
 tuning = JaxParameterTuning(env=env,
-                            config_template=config_template,
-                            hyperparams=hyperparams,
-                            online=False,
-                            eval_trials=trials,
-                            num_workers=workers,
-                            gp_iters=iters)
+                            config_template=config_template, hyperparams=hyperparams,
+                            online=False, eval_trials=trials, num_workers=workers, gp_iters=iters)
 tuning.tune(key=42, log_file='path/to/log.csv')
 ```
````
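Putting the README fragments above together, a complete offline tuning run under the 2.6 API might look like the following sketch. The import path `pyRDDLGym_jax.core.tuning` and the concrete domain, instance, and budget values are assumptions for illustration; the `JaxParameterTuning` keywords and the `Hyperparameter` signature are taken verbatim from the diff.

```python
import pyRDDLGym
# assumed import path, based on the changed core/tuning.py module
from pyRDDLGym_jax.core.tuning import JaxParameterTuning, Hyperparameter

# create the vectorized environment (domain/instance chosen for illustration)
env = pyRDDLGym.make('Wildfire_MDP_ippc2014', '1', vectorized=True)

# read a config template whose TUNABLE_* placeholders are filled during tuning
with open('path/to/config.cfg', 'r') as file:
    config_template = file.read()

# tune weight from 10^-1 ... 10^5 and lr from 10^-5 ... 10^1
def power_10(x):
    return 10.0 ** x

hyperparams = [Hyperparameter('TUNABLE_WEIGHT', -1., 5., power_10),
               Hyperparameter('TUNABLE_LEARNING_RATE', -5., 1., power_10)]

# build the tuner and tune (trial/iteration budgets are placeholders)
tuning = JaxParameterTuning(env=env,
                            config_template=config_template, hyperparams=hyperparams,
                            online=False, eval_trials=5, num_workers=4, gp_iters=20)
tuning.tune(key=42, log_file='path/to/log.csv')
```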
```diff
--- /dev/null
+++ pyrddlgym_jax-2.6/pyRDDLGym_jax/__init__.py
@@ -0,0 +1 @@
+__version__ = '2.6'
```
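Since the sdist now ships a top-level `__init__.py` defining `__version__`, the installed release can be checked at runtime; a trivial sketch, assuming the package is importable under its distribution name:

```python
import pyRDDLGym_jax

# the 2.6 sdist sets __version__ in the new top-level __init__.py
print(pyRDDLGym_jax.__version__)  # '2.6' for this release
```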
````diff
--- pyrddlgym_jax-2.4/pyRDDLGym_jax/core/compiler.py
+++ pyrddlgym_jax-2.6/pyRDDLGym_jax/core/compiler.py
@@ -237,7 +237,8 @@ class JaxRDDLCompiler:
 
     def compile_transition(self, check_constraints: bool=False,
                            constraint_func: bool=False,
-                           init_params_constr: Dict[str, Any]={}) -> Callable:
+                           init_params_constr: Dict[str, Any]={},
+                           cache_path_info: bool=False) -> Callable:
         '''Compiles the current RDDL model into a JAX transition function that
         samples the next state.
 
@@ -274,6 +275,7 @@ class JaxRDDLCompiler:
         returned log and does not raise an exception
         :param constraint_func: produces the h(s, a) function described above
         in addition to the usual outputs
+        :param cache_path_info: whether to save full path traces as part of the log
         '''
         NORMAL = JaxRDDLCompiler.ERROR_CODES['NORMAL']
         rddl = self.rddl
@@ -322,8 +324,11 @@ class JaxRDDLCompiler:
             errors |= err
 
             # calculate fluent values
-            fluents = {name: values for (name, values) in subs.items()
-                       if name not in rddl.non_fluents}
+            if cache_path_info:
+                fluents = {name: values for (name, values) in subs.items()
+                           if name not in rddl.non_fluents}
+            else:
+                fluents = {}
 
             # set the next state to the current state
             for (state, next_state) in rddl.next_state.items():
@@ -367,7 +372,9 @@ class JaxRDDLCompiler:
                          n_batch: int,
                          check_constraints: bool=False,
                          constraint_func: bool=False,
-                         init_params_constr: Dict[str, Any]={}) -> Callable:
+                         init_params_constr: Dict[str, Any]={},
+                         model_params_reduction: Callable=lambda x: x[0],
+                         cache_path_info: bool=False) -> Callable:
         '''Compiles the current RDDL model into a JAX transition function that
         samples trajectories with a fixed horizon from a policy.
 
@@ -399,10 +406,13 @@ class JaxRDDLCompiler:
         returned log and does not raise an exception
         :param constraint_func: produces the h(s, a) constraint function
         in addition to the usual outputs
+        :param model_params_reduction: how to aggregate updated model_params across runs
+        in the batch (defaults to selecting the first element's parameters in the batch)
+        :param cache_path_info: whether to save full path traces as part of the log
         '''
         rddl = self.rddl
         jax_step_fn = self.compile_transition(
-            check_constraints, constraint_func, init_params_constr)
+            check_constraints, constraint_func, init_params_constr, cache_path_info)
 
         # for POMDP only observ-fluents are assumed visible to the policy
         if rddl.observ_fluents:
@@ -421,7 +431,6 @@ class JaxRDDLCompiler:
             return jax_step_fn(subkey, actions, subs, model_params)
 
         # do a batched step update from the policy
-        # TODO: come up with a better way to reduce the model_param batch dim
        def _jax_wrapped_batched_step_policy(carry, step):
             key, policy_params, hyperparams, subs, model_params = carry
             key, *subkeys = random.split(key, num=1 + n_batch)
@@ -430,7 +439,7 @@ class JaxRDDLCompiler:
                 _jax_wrapped_single_step_policy,
                 in_axes=(0, None, None, None, 0, None)
             )(keys, policy_params, hyperparams, step, subs, model_params)
-            model_params = jax.tree_map(lambda x: x[0], model_params)
+            model_params = jax.tree_util.tree_map(model_params_reduction, model_params)
             carry = (key, policy_params, hyperparams, subs, model_params)
             return carry, log
 
@@ -440,7 +449,7 @@ class JaxRDDLCompiler:
             start = (key, policy_params, hyperparams, subs, model_params)
             steps = jnp.arange(n_steps)
             end, log = jax.lax.scan(_jax_wrapped_batched_step_policy, start, steps)
-            log = jax.tree_map(partial(jnp.swapaxes, axis1=0, axis2=1), log)
+            log = jax.tree_util.tree_map(partial(jnp.swapaxes, axis1=0, axis2=1), log)
             model_params = end[-1]
             return log, model_params
 
@@ -707,7 +716,10 @@ class JaxRDDLCompiler:
             sample = jnp.asarray(value, dtype=self._fix_dtype(value))
             new_slices = [None] * len(jax_nested_expr)
             for (i, jax_expr) in enumerate(jax_nested_expr):
-                new_slices[i], key, err, params = jax_expr(x, params, key)
+                new_slice, key, err, params = jax_expr(x, params, key)
+                if not jnp.issubdtype(jnp.result_type(new_slice), jnp.integer):
+                    new_slice = jnp.asarray(new_slice, dtype=self.INT)
+                new_slices[i] = new_slice
                 error |= err
             new_slices = tuple(new_slices)
             sample = sample[new_slices]
@@ -986,7 +998,8 @@ class JaxRDDLCompiler:
         sample_cases = [None] * len(jax_cases)
         for (i, jax_case) in enumerate(jax_cases):
             sample_cases[i], key, err_case, params = jax_case(x, params, key)
-            err |= err_case
+            err |= err_case
+        sample_cases = jnp.asarray(sample_cases)
         sample_cases = jnp.asarray(sample_cases, dtype=self._fix_dtype(sample_cases))
 
         # predicate (enum) is an integer - use it to extract from case array
````
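The new `model_params_reduction` argument resolves the removed TODO: after `jax.vmap` maps the single-step function over the batch, every leaf of the returned `model_params` pytree gains a leading batch axis, and the reduction decides how to collapse it back. The toy pytree below is illustrative, not the library's actual parameter structure; it just demonstrates the `jax.tree_util.tree_map` pattern used at the call site, with batch averaging as one alternative a caller could pass.

```python
import jax
import jax.numpy as jnp

# toy stand-in for model_params after vmap: every leaf has a leading batch axis (size 3)
batched_params = {'weight': jnp.array([1.0, 2.0, 3.0]),
                  'slope': jnp.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])}

# the 2.6 default reduction: keep the first element of the batch (lambda x: x[0])
first = jax.tree_util.tree_map(lambda x: x[0], batched_params)

# a caller-supplied alternative: average the updated parameters over the batch
mean = jax.tree_util.tree_map(lambda x: jnp.mean(x, axis=0), batched_params)

print(first['weight'], first['slope'])  # 1.0 [0.1 0.2]
print(mean['weight'], mean['slope'])    # 2.0 [0.3 0.4]
```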
```diff
--- pyrddlgym_jax-2.4/pyRDDLGym_jax/core/logic.py
+++ pyrddlgym_jax-2.6/pyRDDLGym_jax/core/logic.py
@@ -1056,15 +1056,13 @@ class ExactLogic(Logic):
     def control_if(self, id, init_params):
         return self._jax_wrapped_calc_if_then_else_exact
 
-    @staticmethod
-    def _jax_wrapped_calc_switch_exact(pred, cases, params):
-        pred = pred[jnp.newaxis, ...]
-        sample = jnp.take_along_axis(cases, pred, axis=0)
-        assert sample.shape[0] == 1
-        return sample[0, ...], params
-
     def control_switch(self, id, init_params):
-        return self._jax_wrapped_calc_switch_exact
+        def _jax_wrapped_calc_switch_exact(pred, cases, params):
+            pred = jnp.asarray(pred[jnp.newaxis, ...], dtype=self.INT)
+            sample = jnp.take_along_axis(cases, pred, axis=0)
+            assert sample.shape[0] == 1
+            return sample[0, ...], params
+        return _jax_wrapped_calc_switch_exact
 
     # ===========================================================================
     # random variables
```