PyPI - strainOptimizer - Versions diffs - 0.1.0__tar.gz - Mend

strainOptimizer 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (98)

strainoptimizer-0.1.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2023 Hongzhong Lu @Shanghai Jiao Tong University
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

strainoptimizer-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,161 @@
+Metadata-Version: 2.4
+Name: strainOptimizer
+Version: 0.1.0
+Summary: Computational strain design using enzyme-constrained GEMs and ETFL models
+Author-email: Haoyu Wang <wanghy@dicp.ac.cn>
+Maintainer-email: Haoyu Wang <wanghy@dicp.ac.cn>
+License: MIT
+Project-URL: Homepage, https://github.com/hongzhonglu/strainOptimizer
+Project-URL: Repository, https://github.com/hongzhonglu/strainOptimizer
+Project-URL: Bug Tracker, https://github.com/hongzhonglu/strainOptimizer/issues
+Project-URL: Documentation, https://github.com/hongzhonglu/strainOptimizer/blob/main/README.md
+Project-URL: Zenodo Archive, https://doi.org/10.5281/zenodo.20770724
+Keywords: metabolic engineering,strain design,genome-scale model,enzyme-constrained model,ecGEM,ETFL,systems biology,flux balance analysis,synthetic biology
+Classifier: Development Status :: 4 - Beta
+Classifier: Intended Audience :: Science/Research
+Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
+Classifier: Topic :: Scientific/Engineering :: Chemistry
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Operating System :: OS Independent
+Requires-Python: <3.11,>=3.9
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: cobra==0.24.0
+Requires-Dist: python-libsbml>=5.19.0
+Requires-Dist: optlang==1.5.2
+Requires-Dist: pytfa>=0.9.1
+Requires-Dist: sympy==1.6.2
+Requires-Dist: numpy<2.0,>=1.23
+Requires-Dist: scipy>=1.13
+Requires-Dist: pandas<2.0,>=1.3.5
+Requires-Dist: matplotlib>=3.9
+Requires-Dist: networkx>=3.2
+Requires-Dist: biopython>=1.85
+Requires-Dist: openpyxl>=3.1
+Requires-Dist: xlrd>=2.0
+Requires-Dist: pydantic<2.0,>=1.10
+Requires-Dist: tqdm>=4.60
+Requires-Dist: rich>=13.0
+Requires-Dist: six>=1.16
+Requires-Dist: python-dateutil>=2.9
+Provides-Extra: gurobi
+Requires-Dist: gurobipy>=10.0; extra == "gurobi"
+Provides-Extra: cplex
+Requires-Dist: cplex>=22.1; extra == "cplex"
+Provides-Extra: omics
+Requires-Dist: troppo>=0.1; extra == "omics"
+Provides-Extra: dev
+Requires-Dist: pytest>=8.0; extra == "dev"
+Requires-Dist: build>=1.0; extra == "dev"
+Requires-Dist: twine>=5.0; extra == "dev"
+Requires-Dist: ruff>=0.4; extra == "dev"
+Provides-Extra: all
+Requires-Dist: strainOptimizer[dev,omics]; extra == "all"
+Dynamic: license-file
+# strainOptimizer
+[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20770724.svg)](https://doi.org/10.5281/zenodo.20770724)
+Repo for strain design based on ETFL/ecGEM models.
+## Overview
+strainOptimizer is a Python package for strain design using enzyme-constrained genome-scale metabolic models (ecGEMs) and ETFL models. It implements advanced strain design algorithms such as ecFactory and ecFSEOF, which leverage the additional constraints provided by enzyme capacity to identify more realistic genetic modifications for enhanced production of target metabolites.
+## Installation
+1. Clone the repo
+```
+git clone https://github.com/hongzhonglu/strainOptimizer.git
+```
+2. Set up the environment
+```
+conda create -n strainOptimizer python=3.10
+conda activate strainOptimizer
+conda env update --file environment.yml
+pip install -e .
+```
+3. Solvers
+We recommend using commercial solvers such as CPLEX or Gurobi. Please install one of them and set up the corresponding Python API.
+- GUROBI (version 10.0)
+We recommend obtaining a valid Gurobi license; otherwise, adjust the code to use a different optimizer. To retrieve your license:
+```
+grbgetkey <YOUR-LICENSE-KEY>
+```
+For details, see the Gurobi licensing guide: https://www.gurobi.com/documentation/
+- CPLEX
+After installing CPLEX on your system, install the CPLEX Python API into your environment:
+```
+python [cplex_path]/python/setup.py install
+```
+## Usage guide
+1. [ecFactory design example](examples/1.ecFactory_design_example.ipynb)
+2. [ecFSEOF design example](examples/2.ecFSEOF_design_example.ipynb)
+More examples can be found in the [examples](examples) folder.
+## Example script
+Here is an example script for 2-phenylethanol (2-PE) strain design:
+```python
+from strainOptimizer import strainOptimizer_engine,WorkflowParameters
+# Model parameters - model path, type, solver, growth reaction
+model_params = {
+    'model_path': 'example/models/yeast/ecYeastGEM_batch.xml',
+    'model_type': 'ecGEM',
+    'solver': 'optlang-gurobi',
+    'growth_id': 'r_2111',
+}
+# Strain parameters - target product and growth conditions
+strain_params = {
+    'target_id': 'r_1589',
+    'product_name': '2-phenylethanol',
+    'c_source': 'r_1714_REV',  # glucose exchange reaction
+    'c_uptake': 5,  # glucose uptake rate (mmol/gDW/h)
+}
+# Algorithm control parameters - workflow and output settings
+algorithm_params = {
+    'design_algorithm': 'ecFactory',
+    'simulation_method': 'ppfba',
+    'experimental_yield': None, # if without experimental yield data, use the 1/2
+    'remove_essential': True,
+    'output_directory': './results',
+    'steps':123,
+    'action_thresholds':[0.05,0.3,1.1]
+    # 'save_results': False,
+    # 'only_final_result': True,
+    # Note: ecFactory-specific parameters like steps, action_thresholds, etc.
+    # would need to be added to AlgorithmControl if they're used
+}
+# Create WorkflowParameters using the three-level structure
+params = WorkflowParameters(
+    model=model_params,
+    strain=strain_params,
+    algorithm=algorithm_params
+)
+# Create workflow engine using the new framework
+engine = strainOptimizer_engine(params)
+# Load model
+engine.load_model()
+# Run the design workflow
+final_result=engine.run_design()
+# More detailed results can be accessed via engine.all_results.
+# And results at each level have been saved in output_directory.
+```
+## Contribution
+* For contributors: Fork the repository to your GitHub account and create a new branch from [`dev`](https://github.com/hongzhonglu/strainOptimizer/tree/dev).
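
Note on PKG-INFO: the `Requires-Dist` lines above separate core requirements from extras through the `extra == "..."` marker. The following is a minimal sketch, assuming only the standard-library `email` parser and a shortened excerpt of the metadata above, of how those lines can be grouped. `split_requirements` is a hypothetical helper, not part of strainOptimizer, and its marker handling is deliberately simplistic (it only recognises a marker that starts with `extra ==`).

```python
from email import message_from_string

# Shortened excerpt of the PKG-INFO headers shown above.
PKG_INFO_EXCERPT = """\
Metadata-Version: 2.4
Name: strainOptimizer
Version: 0.1.0
Requires-Dist: cobra==0.24.0
Requires-Dist: gurobipy>=10.0; extra == "gurobi"
Requires-Dist: cplex>=22.1; extra == "cplex"
Requires-Dist: troppo>=0.1; extra == "omics"
"""


def split_requirements(pkg_info_text):
    """Group Requires-Dist entries into core requirements and extras."""
    msg = message_from_string(pkg_info_text)  # PKG-INFO uses RFC 822 style headers
    core, extras = [], {}
    for line in msg.get_all("Requires-Dist", []):
        requirement, _, marker = line.partition(";")
        marker = marker.strip()
        if marker.startswith("extra =="):
            # Take the quoted extra name after '=='.
            name = marker.split("==", 1)[1].strip().strip("\"'")
            extras.setdefault(name, []).append(requirement.strip())
        else:
            core.append(requirement.strip())
    return core, extras


core, extras = split_requirements(PKG_INFO_EXCERPT)
print(core)
print(extras)
```

With this grouping, `pip install "strainOptimizer[gurobi]"` corresponds to the core list plus the entries stored under the `gurobi` key.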

strainoptimizer-0.1.0/README.md ADDED

@@ -0,0 +1,103 @@
+# strainOptimizer
+[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20770724.svg)](https://doi.org/10.5281/zenodo.20770724)
+Repo for strain design based on ETFL/ecGEM models.
+## Overview
+strainOptimizer is a Python package for strain design using enzyme-constrained genome-scale metabolic models (ecGEMs) and ETFL models. It implements advanced strain design algorithms such as ecFactory and ecFSEOF, which leverage the additional constraints provided by enzyme capacity to identify more realistic genetic modifications for enhanced production of target metabolites.
+## Installation
+1. Clone the repo
+```
+git clone https://github.com/hongzhonglu/strainOptimizer.git
+```
+2. Set up the environment
+```
+conda create -n strainOptimizer python=3.10
+conda activate strainOptimizer
+conda env update --file environment.yml
+pip install -e .
+```
+3. Solvers
+We recommend using commercial solvers such as CPLEX or Gurobi. Please install one of them and set up the corresponding Python API.
+- GUROBI (version 10.0)
+We recommend obtaining a valid Gurobi license; otherwise, adjust the code to use a different optimizer. To retrieve your license:
+```
+grbgetkey <YOUR-LICENSE-KEY>
+```
+For details, see the Gurobi licensing guide: https://www.gurobi.com/documentation/
+- CPLEX
+After installing CPLEX on your system, install the CPLEX Python API into your environment:
+```
+python [cplex_path]/python/setup.py install
+```
+## Usage guide
+1. [ecFactory design example](examples/1.ecFactory_design_example.ipynb)
+2. [ecFSEOF design example](examples/2.ecFSEOF_design_example.ipynb)
+More examples can be found in the [examples](examples) folder.
+## Example script
+Here is an example script for 2-phenylethanol (2-PE) strain design:
+```python
+from strainOptimizer import strainOptimizer_engine,WorkflowParameters
+# Model parameters - model path, type, solver, growth reaction
+model_params = {
+    'model_path': 'example/models/yeast/ecYeastGEM_batch.xml',
+    'model_type': 'ecGEM',
+    'solver': 'optlang-gurobi',
+    'growth_id': 'r_2111',
+}
+# Strain parameters - target product and growth conditions
+strain_params = {
+    'target_id': 'r_1589',
+    'product_name': '2-phenylethanol',
+    'c_source': 'r_1714_REV',  # glucose exchange reaction
+    'c_uptake': 5,  # glucose uptake rate (mmol/gDW/h)
+}
+# Algorithm control parameters - workflow and output settings
+algorithm_params = {
+    'design_algorithm': 'ecFactory',
+    'simulation_method': 'ppfba',
+    'experimental_yield': None, # if without experimental yield data, use the 1/2
+    'remove_essential': True,
+    'output_directory': './results',
+    'steps':123,
+    'action_thresholds':[0.05,0.3,1.1]
+    # 'save_results': False,
+    # 'only_final_result': True,
+    # Note: ecFactory-specific parameters like steps, action_thresholds, etc.
+    # would need to be added to AlgorithmControl if they're used
+}
+# Create WorkflowParameters using the three-level structure
+params = WorkflowParameters(
+    model=model_params,
+    strain=strain_params,
+    algorithm=algorithm_params
+)
+# Create workflow engine using the new framework
+engine = strainOptimizer_engine(params)
+# Load model
+engine.load_model()
+# Run the design workflow
+final_result=engine.run_design()
+# More detailed results can be accessed via engine.all_results.
+# And results at each level have been saved in output_directory.
+```
+## Contribution
+* For contributors: Fork the repository to your GitHub account and create a new branch from [`dev`](https://github.com/hongzhonglu/strainOptimizer/tree/dev).
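
Note on the README example script: model paths written with hard-coded separators are easy to get wrong across operating systems, and in a Python string literal a backslash also starts an escape sequence. The following is a minimal sketch, assuming the model file sits under the directory components used in the example script, of building the path with `pathlib` instead. `build_model_path` is a hypothetical helper, not part of strainOptimizer; the dictionary keys are the ones shown in the README.

```python
from pathlib import Path


def build_model_path(*parts: str) -> Path:
    """Join path components using the separator of the current OS."""
    return Path(*parts)


# Directory components taken from the README example script.
model_path = build_model_path("example", "models", "yeast", "ecYeastGEM_batch.xml")

model_params = {
    "model_path": str(model_path),
    "model_type": "ecGEM",
    "solver": "optlang-gurobi",
    "growth_id": "r_2111",
}

print(model_path.as_posix())
```

Passing `str(model_path)` keeps the value a plain string, which is what the README example supplies for `model_path`.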

strainoptimizer-0.1.0/pyproject.toml ADDED

@@ -0,0 +1,167 @@
+[build-system]
+requires = ["setuptools>=68", "wheel"]
+build-backend = "setuptools.build_meta"
+# ---------------------------------------------------------------------------
+# Project metadata
+# ---------------------------------------------------------------------------
+[project]
+name = "strainOptimizer"
+version = "0.1.0"
+description = "Computational strain design using enzyme-constrained GEMs and ETFL models"
+readme = "README.md"
+license = { text = "MIT" }
+authors = [
+    { name = "Haoyu Wang", email = "wanghy@dicp.ac.cn" },
+]
+maintainers = [
+    { name = "Haoyu Wang", email = "wanghy@dicp.ac.cn" },
+]
+keywords = [
+    "metabolic engineering",
+    "strain design",
+    "genome-scale model",
+    "enzyme-constrained model",
+    "ecGEM",
+    "ETFL",
+    "systems biology",
+    "flux balance analysis",
+    "synthetic biology",
+]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Intended Audience :: Science/Research",
+    "Topic :: Scientific/Engineering :: Bio-Informatics",
+    "Topic :: Scientific/Engineering :: Chemistry",
+    "License :: OSI Approved :: MIT License",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Operating System :: OS Independent",
+]
+# Supported Python versions (3.9 and 3.10)
+requires-python = ">=3.9,<3.11"
+# ---------------------------------------------------------------------------
+# Core runtime dependencies
+# ---------------------------------------------------------------------------
+# Commercial solvers (Gurobi, CPLEX) are NOT listed here because they require
+# separate license agreements and installation steps. See [project.optional-dependencies]
+# for the optional gurobi and cplex extras.
+# pytfa is available at: https://github.com/EPFL-LCSB/pytfa
+# ---------------------------------------------------------------------------
+dependencies = [
+    # Metabolic modelling
+    "cobra==0.24.0",
+    "python-libsbml>=5.19.0",
+    "optlang==1.5.2",
+    # Thermodynamic flux analysis (ETFL dependency)
+    "pytfa>=0.9.1",
+    # Symbolic mathematics (ETFL uses sympy internals; v1.6.x required for
+    # compatibility with pytfa and the bundled ETFL code)
+    "sympy==1.6.2",
+    # Numerical / data
+    "numpy>=1.23,<2.0",
+    "scipy>=1.13",
+    "pandas>=1.3.5,<2.0",
+    # Visualisation
+    "matplotlib>=3.9",
+    # Graph utilities (used by cobra internals)
+    "networkx>=3.2",
+    # Bioinformatics
+    "biopython>=1.85",
+    # Excel I/O for results export
+    "openpyxl>=3.1",
+    "xlrd>=2.0",
+    # Data validation (pydantic v1 API; v2 is not compatible)
+    "pydantic>=1.10,<2.0",
+    # Progress bars
+    "tqdm>=4.60",
+    # Terminal output
+    "rich>=13.0",
+    # Miscellaneous
+    "six>=1.16",
+    "python-dateutil>=2.9",
+]
+# ---------------------------------------------------------------------------
+# Optional dependencies (extras)
+# ---------------------------------------------------------------------------
+[project.optional-dependencies]
+# Commercial solvers — install only if you hold a valid license
+gurobi = [
+    "gurobipy>=10.0",
+]
+cplex = [
+    # IBM CPLEX Python API must be installed manually from the CPLEX directory;
+    # the PyPI package below is the community/stub package. For the full solver
+    # install CPLEX and run: python <cplex>/python/setup.py install
+    "cplex>=22.1",
+]
+# Transcriptomics integration (GIMME example requires troppo)
+omics = [
+    "troppo>=0.1",
+]
+# Development and testing
+dev = [
+    "pytest>=8.0",
+    "build>=1.0",
+    "twine>=5.0",
+    "ruff>=0.4",
+]
+# Install everything (except commercial solvers which need manual setup)
+all = [
+    "strainOptimizer[omics,dev]",
+]
+# ---------------------------------------------------------------------------
+# Project URLs (shown on the PyPI page)
+# ---------------------------------------------------------------------------
+[project.urls]
+Homepage        = "https://github.com/hongzhonglu/strainOptimizer"
+Repository      = "https://github.com/hongzhonglu/strainOptimizer"
+"Bug Tracker"   = "https://github.com/hongzhonglu/strainOptimizer/issues"
+Documentation   = "https://github.com/hongzhonglu/strainOptimizer/blob/main/README.md"
+"Zenodo Archive" = "https://doi.org/10.5281/zenodo.20770724"
+# ---------------------------------------------------------------------------
+# Package discovery
+# ---------------------------------------------------------------------------
+[tool.setuptools.packages.find]
+where = ["src"]
+# Include model files and data bundled inside the package
+[tool.setuptools.package-data]
+"*" = ["*.json", "*.xml", "*.csv", "*.tsv", "*.xlsx", "*.yaml", "*.yml"]
+# ---------------------------------------------------------------------------
+# Build / upload workflow (for reference — not executed by setuptools)
+# ---------------------------------------------------------------------------
+# To build and upload to PyPI:
+#
+#   pip install build twine
+#   python -m build                     # creates dist/*.whl and dist/*.tar.gz
+#   twine check dist/*                  # validate metadata before upload
+#   twine upload dist/*                 # upload to PyPI (prompts for credentials)
+#
+# For a test upload first:
+#   twine upload --repository testpypi dist/*
+#
+# Then verify installation from TestPyPI:
+#   pip install --index-url https://test.pypi.org/simple/ strainOptimizer
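
Note on `requires-python`: the range `>=3.9,<3.11` admits Python 3.9 and 3.10 only. The following is a minimal sketch, assuming a plain major/minor tuple comparison is sufficient for this range, of a pre-install check. `python_is_supported` is a hypothetical helper for illustration and does not implement full version-specifier parsing.

```python
import sys

SUPPORTED_MIN = (3, 9)   # inclusive lower bound, from ">=3.9"
SUPPORTED_MAX = (3, 11)  # exclusive upper bound, from "<3.11"


def python_is_supported(version=None):
    """Return True if the (major, minor) version falls inside the declared range."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor < SUPPORTED_MAX


if not python_is_supported():
    print("This interpreter is outside the range declared in requires-python.")
```

pip performs this check itself at install time; a sketch like this is only useful for failing early in scripts that set up an environment.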

strainoptimizer-0.1.0/setup.cfg ADDED

@@ -0,0 +1,4 @@
+[egg_info]
+tag_build =
+tag_date = 0

strainoptimizer-0.1.0/src/strainOptimizer/__init__.py ADDED

@@ -0,0 +1,12 @@
+# StrainOptimizer package
+from .strainDesign import (
+    strainOptimizer_engine,
+    WorkflowParameters,
+)
+__version__ = "0.1.0"
+__all__ = [
+    "strainOptimizer_engine",
+    "WorkflowParameters",
+]

strainoptimizer-0.1.0/src/strainOptimizer/analysis/FCC.py ADDED

@@ -0,0 +1,109 @@
+from strainOptimizer.simulation import mopa,moma
+def calculate_FCC_by_abundance(protID,model,productID,c_source='r_1714_REV', c_uptake=10, growthID='r_2111',objective='r_4046',objective_direction='max',delta_conc=1):
+    """
+    Calculate the flux control coefficients (FCCs) of growth and production by perturbing the abundance of one enzyme.
+    Args:
+        protID (str): The protein ID to calculate FCC for (looked up as a reaction ID in the model).
+        model (cobra.Model): The GEM model object.
+        productID (str): The reaction ID of the product output reaction.
+        c_source (str): The reaction ID of the carbon source uptake reaction. default is 'r_1714_REV' (glucose uptake).
+        c_uptake (float): The uptake rate of the carbon source.
+        growthID (str): The reaction ID of the growth reaction.
+        objective (str): The reaction ID of the objective reaction. default is NGAM (r_4046).
+        objective_direction (str): The direction of the objective reaction. 'max' or 'min'. default is 'max'.
+        delta_conc (float): Relative increase of the enzyme level; new = ref*(1+delta_conc). default is 1.
+    Returns:
+        tuple: FCCg, FCCp
+    """
+    # calculate the reference strain by maximizing NGAM
+    # ref_growth=0.2
+    with model:
+        model.reactions.get_by_id(c_source).bounds = (c_uptake, c_uptake)  # set uptake rate
+        model.objective = growthID
+        model.objective_direction='max'
+        ref_growth=model.slim_optimize()/4  # set growth rate to 25% of max to allow for production
+        model.objective=productID
+        model.objective_direction='max'
+        max_production=model.slim_optimize()
+        ref_production=max_production/4
+        model.reactions.get_by_id(productID).bounds = (ref_production, 1000)
+        model.reactions.get_by_id(growthID).bounds = (ref_growth, 1000)  # set growth reaction bounds
+        model.objective = objective  # NGAM maximize as objective
+        model.objective_direction=objective_direction
+        ref_solution= model.optimize()
+    # calculate FCCg and FCCp
+    with model:
+        # overexpression for target protein
+        ref_conc=ref_solution.fluxes[protID]
+        new_conc=ref_conc*(1+delta_conc)  # increase protein concentration by delta_conc
+        # set the protein concentration
+        model.reactions.get_by_id(protID).lower_bound= new_conc
+        solution=mopa(model,reference_solution=ref_solution,linear=True)
+        # solution=moma(model,reference_solution=ref_solution,linear=False)
+        new_growth=solution.fluxes[growthID]
+        new_production=solution.fluxes[productID]
+    # FCCg
+    FCCg= ((new_growth-ref_growth)/ref_growth)/delta_conc
+    # calculate FCCp
+    FCCp=((new_production-ref_production)/ref_production)/delta_conc
+    # print('growth:',new_growth,'vs',ref_growth,'production:',new_production,'vs',ref_production)
+    return FCCg, FCCp
+def calculate_FCC_by_kcat(protID,model,productID, c_uptake=10, growthID='r_2111',delta_kcat=1):
+    '''
+    Calculate the flux control coefficients (FCCs) of growth and production by perturbing the kcat of one enzyme.
+    Enzyme constraint:   v/kcat<=protein_pool/MW
+    After scaling kcat:  v/(kcat*(1+delta_kcat))<=protein_pool/MW
+    Rearranged:          v/kcat<=protein_pool*(1+delta_kcat)/MW
+                         v/kcat<=protein_pool/(MW/(1+delta_kcat))
+    The kcat perturbation is therefore applied by rescaling the prot_pool coefficient
+    of the draw-protein reaction:
+    MW'=MW/(1+delta_kcat)
+    Args:
+        protID (str): The protein ID to calculate FCC for (looked up as a reaction ID in the model).
+        model (cobra.Model): The GEM model object.
+        productID (str): The reaction ID of the product output reaction.
+        c_uptake (float): The uptake rate of the carbon source.
+        growthID (str): The reaction ID of the growth reaction.
+        delta_kcat (float): Relative increase of kcat. default is 1.
+    Returns:
+        tuple: FCCg, FCCp
+    '''
+    c_source='r_1714_REV'  # glucose uptake reaction
+    # NOTE: set outside the `with` blocks below, so this bound change persists on the caller's model
+    model.reactions.get_by_id(c_source).bounds = 0, c_uptake  # set uptake rate
+    with model:
+        model.objective=productID
+        model.objective_direction='max'
+        ref_production=model.slim_optimize()
+        model.objective = growthID
+        model.objective_direction='max'
+        ref_growth=model.slim_optimize()
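+    # perturb kcat
+    with model:
+        prot_pool=model.metabolites.get_by_id('prot_pool[c]')
+        draw_rxn=model.reactions.get_by_id(protID)
+        ref_mw=draw_rxn.metabolites[prot_pool]
+        # Reaction.metabolites returns a copy, so assigning into it leaves the model unchanged;
+        # add_metabolites(..., combine=False) overwrites the stored coefficient instead
+        draw_rxn.add_metabolites({prot_pool: ref_mw/(1+delta_kcat)}, combine=False)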
+        model.objective = productID
+        model.objective_direction='max'
+        new_production=model.slim_optimize()
+        model.objective = growthID
+        model.objective_direction='max'
+        new_growth=model.slim_optimize()
+    FCCg=((new_growth-ref_growth)/ref_growth)/delta_kcat
+    FCCp=((new_production-ref_production)/ref_production)/delta_kcat
+    return FCCg,FCCp
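
Note on FCC.py: both functions reduce to the same finite-difference expression, FCC = ((new - ref) / ref) / delta, and the kcat variant applies its perturbation through MW' = MW / (1 + delta_kcat). The following is a minimal sketch of those two calculations in isolation, assuming scalar inputs. The helper names and the numbers in the usage lines are made up for illustration and are not part of strainOptimizer; the guard against a zero reference flux is an addition that the package code does not have.

```python
import math


def flux_control_coefficient(ref_flux, new_flux, delta):
    """Relative flux change divided by the relative size of the perturbation."""
    if delta == 0:
        raise ValueError("delta must be non-zero")
    if ref_flux == 0 or math.isnan(ref_flux):
        # The relative change is undefined without a non-zero reference flux.
        return float("nan")
    return ((new_flux - ref_flux) / ref_flux) / delta


def rescale_pool_coefficient(coefficient, delta_kcat):
    """MW' = MW / (1 + delta_kcat); the sign of the coefficient is preserved."""
    if delta_kcat <= -1:
        raise ValueError("delta_kcat must be greater than -1")
    return coefficient / (1 + delta_kcat)


# Illustrative numbers only: growth rises from 0.10 to 0.12 after doubling the enzyme level.
fcc_growth = flux_control_coefficient(ref_flux=0.10, new_flux=0.12, delta=1.0)
# Doubling kcat halves the protein-pool cost per unit of enzyme.
new_coefficient = rescale_pool_coefficient(-50.0, delta_kcat=1.0)
print(fcc_growth, new_coefficient)
```

A coefficient near zero suggests that the enzyme exerts little control over that flux at the chosen reference state, while a value near one suggests that the flux scales almost proportionally with the perturbation.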

strainoptimizer-0.1.0/src/strainOptimizer/analysis/__init__.py ADDED

@@ -0,0 +1,4 @@
+from .ecGEM_utils import prepare_prot_solution_for_ec, prepare_metabolic_solution_for_ec
+from .etfl_utils import prepare_prot_solution_for_etfl, prepare_metabolic_solution_for_etfl

strainoptimizer-0.1.0/src/strainOptimizer/analysis/dataset.py ADDED

@@ -0,0 +1,73 @@
+# -*- coding: utf-8 -*-
+'''Load standard datasets for strain design algorithm evaluation
+'''
+import pandas as pd
+import os
+# get the path of this file
+FILE_PATH = os.path.dirname(os.path.abspath(__file__))
+def load_experiment_targets(product:str, data_dir=FILE_PATH+'/../../../data/experiment_targets'):
+    '''Load experiment targets for a specific product
+    '''
+    available_products = [f.replace('_exp_targets.tsv','') for f in os.listdir(data_dir) if f.endswith('_exp_targets.tsv')]
+    if product not in available_products:
+        print('Available products:', available_products)
+        raise ValueError('The product %s is not available!' % product)
+    else:
+        df = pd.read_csv(os.path.join(data_dir, product+'_exp_targets.tsv'), sep='\t', index_col=0)
+        return df
+def calculate_exp_consistency(predict_result, exp_data, show=True, merge_ko_kd=False):
+    '''
+    Calculate the experimental consistency of the prediction results by comparing the predicted gene targets with the experimental gene targets.
+    Args:
+        merge_ko_kd: if True, treat KO and KD as one down-regulation category ('KD') before comparison.
+    '''
+    predict_result = predict_result[predict_result['action'].isin(['OE', 'KD', 'KO'])].copy()
+    exp_data = exp_data.copy()
+    if merge_ko_kd:
+        predict_result['action'] = predict_result['action'].replace('KO', 'KD')
+        exp_data['action'] = exp_data['action'].replace('KO', 'KD')
+    predict_group = predict_result.groupby('action')
+    exp_group = exp_data.groupby('action')
+    exp_consistency = dict()
+    overall_exp_num = 0
+    overall_hit_num = 0
+    overall_predict_num = 0
+    for key in exp_group.groups.keys():
+        exp_geneList = exp_group.get_group(key).index.tolist()
+        try:
+            predict_geneList = predict_group.get_group(key).index.tolist()
+        except KeyError:  # no predictions for this action category
+            predict_geneList = []
+        hit_geneList = list(set(exp_geneList).intersection(set(predict_geneList)))
+        overall_exp_num += len(exp_geneList)
+        overall_hit_num += len(hit_geneList)
+        overall_predict_num += len(predict_geneList)
+        exp_consistency[key] = {'exp': exp_geneList, 'predict': predict_geneList, 'hit': hit_geneList,
+                                'exp_num': len(exp_geneList), 'hit_num': len(hit_geneList),
+                                'consistency': len(set(exp_geneList).intersection(set(hit_geneList))) / len(
+                                    exp_geneList)}
+    if overall_predict_num==0:
+        return None
+    exp_consistency['overall'] = {'exp_num': overall_exp_num, 'hit_num': overall_hit_num,
+                                  'predict_num': overall_predict_num,
+                                  'consistency': overall_hit_num / overall_exp_num,
+                                  'precision': overall_hit_num / overall_predict_num}
+    if show:
+        for key in exp_consistency.keys():
+            print(f'{key}:')
+            print(exp_consistency[key])
+    return exp_consistency
+def gene_id_to_name(geneIDlist,annotation_file=FILE_PATH+'/../../../data/s288c_geneNames.csv'):
+    df=pd.read_csv(annotation_file,index_col=0)
+    df_geneName=df[df.index.isin(geneIDlist)]['geneName']
+    return df_geneName
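
Note on dataset.py: `calculate_exp_consistency` reports two ratios per comparison, `consistency` (hits divided by experimental targets) and `precision` (hits divided by predicted targets). The following is a minimal sketch of that arithmetic for a single action category, assuming plain lists of gene identifiers rather than the DataFrames the package function takes. `consistency_and_precision` is a hypothetical helper and the gene names are illustrative, not taken from the package's datasets.

```python
def consistency_and_precision(predicted, experimental):
    """Return (hits / experimental targets, hits / predicted targets)."""
    predicted, experimental = set(predicted), set(experimental)
    hits = predicted & experimental
    # Each ratio is undefined when its denominator is empty.
    consistency = len(hits) / len(experimental) if experimental else None
    precision = len(hits) / len(predicted) if predicted else None
    return consistency, precision


# Illustrative gene lists for one action category (e.g. 'OE').
predicted_genes = ["ARO10", "ARO80", "PDC5"]
experimental_genes = ["ARO10", "ARO4", "ARO7", "ARO80"]

consistency, precision = consistency_and_precision(predicted_genes, experimental_genes)
print(consistency, precision)
```

Here two of the four experimental targets are recovered and two of the three predictions are hits, so the two ratios differ. Reporting both separates missed targets from spurious predictions.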