pyRDDLGym-jax 1.1.tar.gz → 1.3.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/PKG-INFO +29 -13
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/README.md +11 -10
- pyrddlgym_jax-1.3/pyRDDLGym_jax/__init__.py +1 -0
- pyrddlgym_jax-1.3/pyRDDLGym_jax/core/assets/favicon.ico +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/planner.py +73 -40
- pyrddlgym_jax-1.3/pyRDDLGym_jax/entry_point.py +27 -0
- pyrddlgym_jax-1.3/pyRDDLGym_jax/examples/configs/__init__.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_plan.py +20 -13
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_tune.py +5 -3
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/PKG-INFO +29 -13
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/SOURCES.txt +4 -0
- pyrddlgym_jax-1.3/pyRDDLGym_jax.egg-info/entry_points.txt +2 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/setup.py +10 -3
- pyrddlgym_jax-1.1/pyRDDLGym_jax/__init__.py +0 -1
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/LICENSE +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/__init__.py +0 -0
- {pyrddlgym_jax-1.1/pyRDDLGym_jax/examples → pyrddlgym_jax-1.3/pyRDDLGym_jax/core/assets}/__init__.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/compiler.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/logic.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/simulator.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/tuning.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/visualization.py +0 -0
- {pyrddlgym_jax-1.1/pyRDDLGym_jax/examples/configs → pyrddlgym_jax-1.3/pyRDDLGym_jax/examples}/__init__.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Cartpole_Continuous_gym_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Cartpole_Continuous_gym_replan.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Cartpole_Continuous_gym_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/HVAC_ippc2023_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/HVAC_ippc2023_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/MountainCar_Continuous_gym_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/MountainCar_ippc2023_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/PowerGen_Continuous_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/PowerGen_Continuous_replan.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/PowerGen_Continuous_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Quadcopter_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Quadcopter_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Reservoir_Continuous_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Reservoir_Continuous_replan.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Reservoir_Continuous_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/UAV_Continuous_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Wildfire_MDP_ippc2014_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Wildfire_MDP_ippc2014_replan.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/Wildfire_MDP_ippc2014_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/default_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/default_replan.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/default_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/tuning_drp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/tuning_replan.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/configs/tuning_slp.cfg +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_gradient.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_gym.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_scipy.py +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/dependency_links.txt +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/requires.txt +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/top_level.txt +0 -0
- {pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/setup.cfg +0 -0
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/PKG-INFO

@@ -1,17 +1,21 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.2
 Name: pyRDDLGym-jax
-Version: 1.1
+Version: 1.3
 Summary: pyRDDLGym-jax: automatic differentiation for solving sequential planning problems in JAX.
 Home-page: https://github.com/pyrddlgym-project/pyRDDLGym-jax
 Author: Michael Gimelfarb, Ayal Taitler, Scott Sanner
 Author-email: mike.gimelfarb@mail.utoronto.ca, ataitler@gmail.com, ssanner@mie.utoronto.ca
 License: MIT License
-Classifier: Development Status ::
+Classifier: Development Status :: 5 - Production/Stable
 Classifier: Intended Audience :: Science/Research
 Classifier: License :: OSI Approved :: MIT License
 Classifier: Natural Language :: English
 Classifier: Operating System :: OS Independent
 Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
 Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Requires-Python: >=3.9
 Description-Content-Type: text/markdown

@@ -28,6 +32,17 @@ Requires-Dist: rddlrepository>=2.0; extra == "extra"
 Provides-Extra: dashboard
 Requires-Dist: dash>=2.18.0; extra == "dashboard"
 Requires-Dist: dash-bootstrap-components>=1.6.0; extra == "dashboard"
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: home-page
+Dynamic: license
+Dynamic: provides-extra
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
 
 # pyRDDLGym-jax
 

@@ -95,27 +110,28 @@ pip install pyRDDLGym-jax[extra,dashboard]
 
 ## Running from the Command Line
 
-A basic run script is provided to
+A basic run script is provided to train JaxPlan on any RDDL problem:
 
 ```shell
-
+jaxplan plan <domain> <instance> <method> <episodes>
 ```
 
 where:
 - ``domain`` is the domain identifier as specified in rddlrepository (i.e. Wildfire_MDP_ippc2014), or a path pointing to a valid ``domain.rddl`` file
 - ``instance`` is the instance identifier (i.e. 1, 2, ... 10), or a path pointing to a valid ``instance.rddl`` file
-- ``method`` is the planning method to use (i.e. drp, slp, replan)
+- ``method`` is the planning method to use (i.e. drp, slp, replan) or a path to a valid .cfg file (see section below)
 - ``episodes`` is the (optional) number of episodes to evaluate the learned policy.
 
-The ``method`` parameter supports
+The ``method`` parameter supports four possible modes:
 - ``slp`` is the basic straight line planner described [in this paper](https://proceedings.neurips.cc/paper_files/paper/2017/file/98b17f068d5d9b7668e19fb8ae470841-Paper.pdf)
 - ``drp`` is the deep reactive policy network described [in this paper](https://ojs.aaai.org/index.php/AAAI/article/view/4744)
-- ``replan`` is the same as ``slp`` except the plan is recalculated at every decision time step
+- ``replan`` is the same as ``slp`` except the plan is recalculated at every decision time step
+- any other argument is interpreted as a file path to a valid configuration file.
 
-For example, the following will train JaxPlan on the Quadcopter domain with 4 drones:
+For example, the following will train JaxPlan on the Quadcopter domain with 4 drones (with default config):
 
 ```shell
-
+jaxplan plan Quadcopter 1 slp
 ```
 
 ## Running from Another Python Application

@@ -197,7 +213,7 @@ controller = JaxOfflineController(planner, **train_args)
 ...
 ```
 
-
+## JaxPlan Dashboard
 
 Since version 1.0, JaxPlan has an optional dashboard that allows keeping track of the planner performance across multiple runs,
 and visualization of the policy or model, and other useful debugging features.

@@ -217,7 +233,7 @@ dashboard=True
 
 More documentation about this and other new features will be coming soon.
 
-
+## Tuning the Planner
 
 It is easy to tune the planner's hyper-parameters efficiently and automatically using Bayesian optimization.
 To do this, first create a config file template with patterns replacing concrete parameter values that you want to tune, e.g.:

@@ -280,7 +296,7 @@ tuning.tune(key=42, log_file='path/to/log.csv')
 A basic run script is provided to run the automatic hyper-parameter tuning for the most sensitive parameters of JaxPlan:
 
 ```shell
-
+jaxplan tune <domain> <instance> <method> <trials> <iters> <workers>
 ```
 
 where:
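The `jaxplan` console command documented in the README changes above is new in this release. For readers who prefer Python, a rough equivalent using the `run_from_args` helper added in the diff below (argument values are illustrative):

```python
# Rough Python equivalent of `jaxplan plan Quadcopter 1 slp 1`,
# assuming pyRDDLGym-jax >= 1.3 is installed.
from pyRDDLGym_jax.examples import run_plan

# args mirror the CLI: <domain> <instance> <method> [<episodes>]
run_plan.run_from_args(['Quadcopter', '1', 'slp', '1'])
```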
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/README.md

Hunks @@ -64,27 +64,28 @@, @@ -166,7 +167,7 @@, @@ -186,7 +187,7 @@ and @@ -249,7 +250,7 @@ contain the same README edits shown in the PKG-INFO diff above (whose long description embeds README.md), at README-local line offsets.
pyrddlgym_jax-1.3/pyRDDLGym_jax/__init__.py (added)

@@ -0,0 +1 @@
+__version__ = '1.3'

pyrddlgym_jax-1.3/pyRDDLGym_jax/core/assets/favicon.ico (added)

Binary file; no text diff.
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/core/planner.py

@@ -655,7 +655,10 @@ class JaxStraightLinePlan(JaxPlan):
                 if ranges[var] == 'bool':
                     param_flat = jnp.ravel(param)
                     if noop[var]:
-
+                        if wrap_sigmoid:
+                            param_flat = -param_flat
+                        else:
+                            param_flat = 1.0 - param_flat
                     scores.append(param_flat)
             scores = jnp.concatenate(scores)
             descending = jnp.sort(scores)[::-1]
@@ -666,7 +669,10 @@ class JaxStraightLinePlan(JaxPlan):
             new_params = {}
             for (var, param) in params.items():
                 if ranges[var] == 'bool':
-
+                    if noop[var]:
+                        new_param = param + surplus
+                    else:
+                        new_param = param - surplus
                     new_param = _jax_project_bool_to_box(var, new_param, hyperparams)
                 else:
                     new_param = param
@@ -687,57 +693,73 @@ class JaxStraightLinePlan(JaxPlan):
         elif use_constraint_satisfaction and not self._use_new_projection:
 
             # calculate the surplus of actions above max-nondef-actions
-            def _jax_wrapped_sogbofa_surplus(
-                sum_action,
-                for (var,
+            def _jax_wrapped_sogbofa_surplus(actions):
+                sum_action, k = 0.0, 0
+                for (var, action) in actions.items():
                     if ranges[var] == 'bool':
-                        action = _jax_bool_param_to_action(var, param, hyperparams)
                         if noop[var]:
-
-
-
-                        sum_action += jnp.sum(action)
-                        count += jnp.sum(action > 0)
+                            action = 1 - action
+                        sum_action += jnp.sum(action)
+                        k += jnp.count_nonzero(action)
                 surplus = jnp.maximum(sum_action - allowed_actions, 0.0)
-
-                return surplus / count
+                return surplus, k
 
             # return whether the surplus is positive or reached compute limit
             max_constraint_iter = self._max_constraint_iter
 
             def _jax_wrapped_sogbofa_continue(values):
-                it, _,
-                return jnp.logical_and(
+                it, _, surplus, k = values
+                return jnp.logical_and(
+                    it < max_constraint_iter, jnp.logical_and(surplus > 0, k > 0))
 
             # reduce all bool action values by the surplus clipping at minimum
             # for no-op = True, do the opposite, i.e. increase all
             # bool action values by surplus clipping at maximum
             def _jax_wrapped_sogbofa_subtract_surplus(values):
-                it,
-
-
+                it, actions, surplus, k = values
+                amount = surplus / k
+                new_actions = {}
+                for (var, action) in actions.items():
                     if ranges[var] == 'bool':
-
-
-
-
+                        if noop[var]:
+                            new_actions[var] = jnp.minimum(action + amount, 1)
+                        else:
+                            new_actions[var] = jnp.maximum(action - amount, 0)
                     else:
-
-
-                new_surplus = _jax_wrapped_sogbofa_surplus(new_params, hyperparams)
+                        new_actions[var] = action
+                new_surplus, new_k = _jax_wrapped_sogbofa_surplus(new_actions)
                 new_it = it + 1
-                return new_it,
+                return new_it, new_actions, new_surplus, new_k
 
             # apply the surplus to the actions until it becomes zero
             def _jax_wrapped_sogbofa_project(params, hyperparams):
-
-
+
+                # convert parameters to actions
+                actions = {}
+                for (var, param) in params.items():
+                    if ranges[var] == 'bool':
+                        actions[var] = _jax_bool_param_to_action(var, param, hyperparams)
+                    else:
+                        actions[var] = param
+
+                # run SOGBOFA loop on the actions to get adjusted actions
+                surplus, k = _jax_wrapped_sogbofa_surplus(actions)
+                _, actions, surplus, k = jax.lax.while_loop(
                     cond_fun=_jax_wrapped_sogbofa_continue,
                     body_fun=_jax_wrapped_sogbofa_subtract_surplus,
-                    init_val=(0,
+                    init_val=(0, actions, surplus, k)
                 )
                 converged = jnp.logical_not(surplus > 0)
-
+
+                # convert the adjusted actions back to parameters
+                new_params = {}
+                for (var, action) in actions.items():
+                    if ranges[var] == 'bool':
+                        action = jnp.clip(action, min_action, max_action)
+                        new_params[var] = _jax_bool_action_to_param(var, action, hyperparams)
+                    else:
+                        new_params[var] = action
+                return new_params, converged
 
             # clip actions to valid bounds and satisfy constraint on max actions
             def _jax_wrapped_slp_project_to_max_constraint(params, hyperparams):
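For orientation, the reworked projection now threads the action dictionary itself through `jax.lax.while_loop`, spreading the surplus over the `k` currently-active actions each iteration. A self-contained sketch of that loop on a single action vector (hypothetical names; the planner's no-op handling and parameter-to-action conversion are omitted):

```python
import jax
import jax.numpy as jnp

def project_to_budget(actions, budget, max_iter=100):
    # surplus above the action budget, and the number of active entries
    def surplus_and_k(a):
        surplus = jnp.maximum(jnp.sum(a) - budget, 0.0)
        return surplus, jnp.count_nonzero(a)

    def cond(val):
        it, _, surplus, k = val
        return jnp.logical_and(it < max_iter,
                               jnp.logical_and(surplus > 0, k > 0))

    def body(val):
        it, a, surplus, k = val
        a = jnp.maximum(a - surplus / k, 0.0)   # spread surplus, clip at zero
        new_surplus, new_k = surplus_and_k(a)
        return it + 1, a, new_surplus, new_k

    surplus, k = surplus_and_k(actions)
    _, actions, surplus, _ = jax.lax.while_loop(
        cond, body, (0, actions, surplus, k))
    # projected actions and a convergence flag, as in the planner code above
    return actions, jnp.logical_not(surplus > 0)

acts, ok = project_to_budget(jnp.array([0.9, 0.8, 0.7]), budget=2.0)
```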
@@ -1415,6 +1437,7 @@ r"""
 
         # optimization
         self.update = self._jax_update(train_loss)
+        self.check_zero_grad = self._jax_check_zero_gradients()
 
     def _jax_return(self, use_symlog):
         gamma = self.rddl.discount
@@ -1497,6 +1520,18 @@ r"""
 
         return jax.jit(_jax_wrapped_plan_update)
 
+    def _jax_check_zero_gradients(self):
+
+        def _jax_wrapped_zero_gradient(grad):
+            return jnp.allclose(grad, 0)
+
+        def _jax_wrapped_zero_gradients(grad):
+            leaves, _ = jax.tree_util.tree_flatten(
+                jax.tree_map(_jax_wrapped_zero_gradient, grad))
+            return jnp.all(jnp.asarray(leaves))
+
+        return jax.jit(_jax_wrapped_zero_gradients)
+
     def _batched_init_subs(self, subs):
         rddl = self.rddl
         n_train, n_test = self.batch_size_train, self.batch_size_test
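The new `_jax_check_zero_gradients` replaces a NumPy-side check with a jitted pytree reduction. A minimal standalone version of the same pattern (the gradient pytree here is hypothetical):

```python
import jax
import jax.numpy as jnp

def all_zero(grad_tree):
    # map jnp.allclose over every leaf, then reduce with jnp.all,
    # so the whole check stays jittable and on-device
    leaves = jax.tree_util.tree_leaves(
        jax.tree_util.tree_map(lambda g: jnp.allclose(g, 0), grad_tree))
    return jnp.all(jnp.asarray(leaves))

grads = {'w': jnp.zeros((2, 3)), 'b': jnp.zeros(3)}
print(jax.jit(all_zero)(grads))   # True: the optimizer can make no progress
```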
@@ -1795,7 +1830,6 @@ r"""
         rolling_test_loss = RollingMean(test_rolling_window)
         log = {}
         status = JaxPlannerStatus.NORMAL
-        is_all_zero_fn = lambda x: np.allclose(x, 0)
 
         # initialize stopping criterion
         if stopping_rule is not None:
@@ -1836,9 +1870,7 @@ r"""
             # ==================================================================
 
             # no progress
-
-                jax.tree_map(is_all_zero_fn, train_log['grad']))
-            if np.all(grad_norm_zero):
+            if self.check_zero_grad(train_log['grad']):
                 status = JaxPlannerStatus.NO_PROGRESS
 
             # constraint satisfaction problem
@@ -2035,8 +2067,8 @@ r"""
             # must be numeric array
             # exception is for POMDPs at 1st epoch when observ-fluents are None
             dtype = np.atleast_1d(values).dtype
-            if not
-            and not
+            if not np.issubdtype(dtype, np.number) \
+            and not np.issubdtype(dtype, np.bool_):
                 if step == 0 and var in self.rddl.observ_fluents:
                     subs[var] = self.test_compiled.init_values[var]
                 else:
@@ -2077,10 +2109,11 @@ def mean_variance_utility(returns: jnp.ndarray, beta: float) -> float:
 
 @jax.jit
 def cvar_utility(returns: jnp.ndarray, alpha: float) -> float:
-
-
-
-
+    var = jnp.percentile(returns, q=100 * alpha)
+    mask = returns <= var
+    weights = mask / jnp.maximum(1, jnp.sum(mask))
+    return jnp.sum(returns * weights)
+
 
 # ***********************************************************************
 # ALL VERSIONS OF CONTROLLERS
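For reference, the filled-in `cvar_utility` computes the conditional value at risk: the mean of the worst `alpha`-fraction of returns. A small sketch cross-checking those steps against plain NumPy on illustrative data:

```python
import numpy as np
import jax.numpy as jnp

returns = jnp.asarray(np.random.default_rng(0).normal(size=1000))
alpha = 0.1

var = jnp.percentile(returns, q=100 * alpha)     # value at risk (alpha-quantile)
mask = returns <= var                            # worst alpha-fraction of returns
cvar = jnp.sum(returns * mask) / jnp.maximum(1, jnp.sum(mask))

# direct NumPy computation of the same quantity
worst = np.sort(np.asarray(returns))[:int(alpha * len(returns))]
print(float(cvar), worst.mean())                 # the two agree closely
```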
pyrddlgym_jax-1.3/pyRDDLGym_jax/entry_point.py (added)

@@ -0,0 +1,27 @@
+import argparse
+
+from pyRDDLGym_jax.examples import run_plan, run_tune
+
+def main():
+    parser = argparse.ArgumentParser(description="Command line parser for the JaxPlan planner.")
+    subparsers = parser.add_subparsers(dest="jaxplan", required=True)
+
+    # planning
+    parser_plan = subparsers.add_parser("plan", help="Executes JaxPlan on a specified RDDL problem and method (slp, drp, or replan).")
+    parser_plan.add_argument('args', nargs=argparse.REMAINDER)
+
+    # tuning
+    parser_tune = subparsers.add_parser("tune", help="Tunes JaxPlan on a specified RDDL problem and method (slp, drp, or replan).")
+    parser_tune.add_argument('args', nargs=argparse.REMAINDER)
+
+    # dispatch
+    args = parser.parse_args()
+    if args.jaxplan == "plan":
+        run_plan.run_from_args(args.args)
+    elif args.jaxplan == "tune":
+        run_tune.run_from_args(args.args)
+    else:
+        parser.print_help()
+
+if __name__ == "__main__":
+    main()
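Note: the `jaxplan` console command resolves to this `main()` through the `console_scripts` entry point declared in the setup.py diff below; each subcommand simply forwards its remaining arguments to `run_plan.run_from_args` or `run_tune.run_from_args`.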
pyrddlgym_jax-1.3/pyRDDLGym_jax/examples/configs/__init__.py (added, empty)

File without changes.
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_plan.py

@@ -12,7 +12,7 @@ The syntax for running this example is:
 
 where:
     <domain> is the name of a domain located in the /Examples directory
     <instance> is the instance number
-    <method> is
+    <method> is slp, drp, replan, or a path to a valid .cfg file
     <episodes> is the optional number of evaluation rollouts
 '''
 import os

@@ -32,12 +32,19 @@ def main(domain, instance, method, episodes=1):
     env = pyRDDLGym.make(domain, instance, vectorized=True)
 
     # load the config file with planner settings
-
-
-
-
-
-
+    if method in ['drp', 'slp', 'replan']:
+        abs_path = os.path.dirname(os.path.abspath(__file__))
+        config_path = os.path.join(abs_path, 'configs', f'{domain}_{method}.cfg')
+        if not os.path.isfile(config_path):
+            raise_warning(f'Config file {config_path} was not found, '
+                          f'using default_{method}.cfg.', 'red')
+            config_path = os.path.join(abs_path, 'configs', f'default_{method}.cfg')
+    elif os.path.isfile(method):
+        config_path = method
+    else:
+        print('method must be slp, drp, replan, or a path to a valid .cfg file.')
+        exit(1)
+
     planner_args, _, train_args = load_config(config_path)
     if 'dashboard' in train_args:
         train_args['dashboard'].launch()

@@ -54,16 +61,16 @@ def main(domain, instance, method, episodes=1):
     controller.evaluate(env, episodes=episodes, verbose=True, render=True)
     env.close()
 
-
-if __name__ == "__main__":
-    args = sys.argv[1:]
+
+def run_from_args(args):
     if len(args) < 3:
         print('python run_plan.py <domain> <instance> <method> [<episodes>]')
         exit(1)
-    if args[2] not in ['drp', 'slp', 'replan']:
-        print('<method> in [drp, slp, replan]')
-        exit(1)
     kwargs = {'domain': args[0], 'instance': args[1], 'method': args[2]}
     if len(args) >= 4: kwargs['episodes'] = int(args[3])
     main(**kwargs)
+
+
+if __name__ == "__main__":
+    run_from_args(sys.argv[1:])
 
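With the config-resolution change above, `method` may be a path to a configuration file instead of one of the three method names. A minimal sketch calling the script's `main` directly (the .cfg path is hypothetical):

```python
from pyRDDLGym_jax.examples import run_plan

# a 'method' that is not drp/slp/replan is treated as a config file path
run_plan.main('Quadcopter', '1', 'configs/my_experiment.cfg', episodes=1)
```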
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax/examples/run_tune.py

@@ -75,8 +75,7 @@ def main(domain, instance, method, trials=5, iters=20, workers=4):
     env.close()
 
 
-if __name__ == "__main__":
-    args = sys.argv[1:]
+def run_from_args(args):
     if len(args) < 3:
         print('python run_tune.py <domain> <instance> <method> [<trials>] [<iters>] [<workers>]')
         exit(1)

@@ -88,4 +87,7 @@ if __name__ == "__main__":
     if len(args) >= 5: kwargs['iters'] = int(args[4])
     if len(args) >= 6: kwargs['workers'] = int(args[5])
     main(**kwargs)
-
+
+
+if __name__ == "__main__":
+    run_from_args(sys.argv[1:])
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/PKG-INFO

Identical to the PKG-INFO diff at the top of this page.
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/pyRDDLGym_jax.egg-info/SOURCES.txt

@@ -2,9 +2,11 @@ LICENSE
 README.md
 setup.py
 pyRDDLGym_jax/__init__.py
+pyRDDLGym_jax/entry_point.py
 pyRDDLGym_jax.egg-info/PKG-INFO
 pyRDDLGym_jax.egg-info/SOURCES.txt
 pyRDDLGym_jax.egg-info/dependency_links.txt
+pyRDDLGym_jax.egg-info/entry_points.txt
 pyRDDLGym_jax.egg-info/requires.txt
 pyRDDLGym_jax.egg-info/top_level.txt
 pyRDDLGym_jax/core/__init__.py

@@ -14,6 +16,8 @@ pyRDDLGym_jax/core/planner.py
 pyRDDLGym_jax/core/simulator.py
 pyRDDLGym_jax/core/tuning.py
 pyRDDLGym_jax/core/visualization.py
+pyRDDLGym_jax/core/assets/__init__.py
+pyRDDLGym_jax/core/assets/favicon.ico
 pyRDDLGym_jax/examples/__init__.py
 pyRDDLGym_jax/examples/run_gradient.py
 pyRDDLGym_jax/examples/run_gym.py

pyrddlgym_jax-1.3/pyRDDLGym_jax.egg-info/entry_points.txt (added; content implied by the console_scripts declaration in setup.py below)

@@ -0,0 +1,2 @@
+[console_scripts]
+jaxplan = pyRDDLGym_jax.entry_point:main
{pyrddlgym_jax-1.1 → pyrddlgym_jax-1.3}/setup.py

@@ -19,7 +19,7 @@ long_description = (Path(__file__).parent / "README.md").read_text()
 
 setup(
     name='pyRDDLGym-jax',
-    version='1.1',
+    version='1.3',
     author="Michael Gimelfarb, Ayal Taitler, Scott Sanner",
     author_email="mike.gimelfarb@mail.utoronto.ca, ataitler@gmail.com, ssanner@mie.utoronto.ca",
     description="pyRDDLGym-jax: automatic differentiation for solving sequential planning problems in JAX.",

@@ -41,15 +41,22 @@ setup(
         'dashboard': ['dash>=2.18.0', 'dash-bootstrap-components>=1.6.0']
     },
     python_requires=">=3.9",
-    package_data={'': ['*.cfg']},
+    package_data={'': ['*.cfg', '*.ico']},
     include_package_data=True,
+    entry_points={
+        'console_scripts': ['jaxplan=pyRDDLGym_jax.entry_point:main'],
+    },
     classifiers=[
-        "Development Status ::
+        "Development Status :: 5 - Production/Stable",
         "Intended Audience :: Science/Research",
         "License :: OSI Approved :: MIT License",
         "Natural Language :: English",
         "Operating System :: OS Independent",
         "Programming Language :: Python :: 3",
+        "Programming Language :: Python :: 3.9",
+        "Programming Language :: Python :: 3.10",
+        "Programming Language :: Python :: 3.11",
+        "Programming Language :: Python :: 3.12",
         "Topic :: Scientific/Engineering :: Artificial Intelligence",
     ],
 )
pyrddlgym_jax-1.1/pyRDDLGym_jax/__init__.py (removed)

@@ -1 +0,0 @@
-__version__ = '1.1'
All remaining files listed above with +0 -0 are unchanged; the only renames are the version-prefix moves and the two __init__.py relocations already shown in the file list at the top of this page.