PyPI - factorforge-cds - Versions diffs - 3.1.8__tar.gz → 3.2.0__tar.gz - Mend

factorforge-cds 3.1.8.tar.gz → 3.2.0.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (88)

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: factorforge-cds
-Version: 3.1.8
+Version: 3.2.0
 Summary: FactorForge - open-source constraint-based CDS design engine by Eijex.
 Author-email: Eijex <eijex.lab@gmail.com>
 License-Expression: AGPL-3.0-only
@@ -20,6 +20,7 @@ Requires-Dist: requests>=2.31
 Requires-Dist: click>=8.0
 Requires-Dist: pydantic>=2.0
 Provides-Extra: dev
+Requires-Dist: jsonschema>=4.0; extra == "dev"
 Requires-Dist: pytest>=7.0; extra == "dev"
 Requires-Dist: pytest-cov>=4.0; extra == "dev"
 Requires-Dist: ruff>=0.1; extra == "dev"
@@ -31,7 +32,7 @@ Dynamic: license-file
 # FactorForge
-**Open-source constraint-based CDS design engine for plant expression workflows, with initial focus on *Nicotiana benthamiana* and Tobacco BY-2.**
+**Open-source constraint-based CDS design engine for sequence-level CDS design, with primary support for *Nicotiana benthamiana* (Tobacco BY-2: experimental).**
 [![License](https://img.shields.io/badge/license-AGPL--3.0-blue.svg)](LICENSE)
 [![Python](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/)
@@ -41,7 +42,7 @@ Dynamic: license-file
 [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20407331.svg)](https://doi.org/10.5281/zenodo.20407331)
 [![Web App](https://img.shields.io/badge/web-factorforge.eijex.com-brightgreen.svg)](https://factorforge.eijex.com)
-FactorForge optimizes protein sequences into host-compatible CDS by maximizing CAI, controlling GC content, eliminating PolyA signals, and producing MoClo/Golden Gate-ready constructs. Supports *N. benthamiana* (agroinfiltration) and Tobacco BY-2 (`--host by2`, bioreactor/cGMP workflows).
+FactorForge performs profile-guided CDS design with CAI/GC metrics, PolyA-signal screening, and Golden Gate/MoClo-aware checks. Primary support: *N. benthamiana* (agroinfiltration). Experimental host context: Tobacco BY-2 (`--host by2`).
 **→ [Full Documentation](https://eijex.github.io/factorforge-cds/)**
@@ -65,7 +66,7 @@ Or use the **[web app](https://factorforge.eijex.com)** — no installation requ
 | **Web App** | No installation, demo & light use | [factorforge.eijex.com](https://factorforge.eijex.com) |
 | **CLI / Python** | Local use, batch processing, data privacy | `pip install factorforge-cds` |
 | **Docker** | Full web interface locally | `docker pull ghcr.io/eijex/factorforge-cds:latest` |
-| **Eijex MCP** | AI agent access (Claude Code, Cursor) | [mcp.eijex.com](https://mcp.eijex.com) |
+| **Eijex MCP** | MCP-compatible agent access | [mcp.eijex.com](https://mcp.eijex.com) |
 ---
@@ -82,59 +83,38 @@ and are not imported by the installed package or exposed as supported engines.
 ---
-## Development History
-FactorForge has gone through several implementation generations before the current public release:
-| Generation | Status | Description |
-|-----------|--------|-------------|
-| **v1** — NBent_OptiCodon | Internal | Thesis-derived codon optimization baseline for *N. benthamiana* |
-| **v2** — Rule-Based Engine | Internal → Production | Deterministic, constraint-aware design engine; became the foundation for the public release |
-| **v3-alpha** — ML Prototype | Archived | ML-based design attempt; performance was insufficient for production use; preserved under `archive/v3-ml-prototype/` |
-| **v3.0+** — Current release | Public | Open-source release of the matured v2 engine under `factorforge.engines.profile` |
-| **v3.7+** — ML Engine | Planned | ML-based design as `--engine ml`; added once sufficient wet-lab data is available |
-The `archive/` directory preserves all three earlier tracks for provenance. None are installed or exposed by the current package.
----
 ## ⚠️ Validation Status
-FactorForge predictions are **in-silico only** and have not been experimentally validated in wet-lab conditions. See [Validation](https://eijex.github.io/factorforge-cds/validation/) and [VALIDATION.md](VALIDATION.md).
+FactorForge outputs are **in-silico only** and have not been experimentally validated in wet-lab conditions. See [Validation](https://eijex.github.io/factorforge-cds/validation/) and [VALIDATION.md](VALIDATION.md).
 ---
 ## Citing
 ```
-FactorForge v3.1.8 (2026). Open-source constraint-based CDS design engine.
+FactorForge v3.2.0 (2026). Open-source constraint-based CDS design engine.
 Eijex. https://github.com/eijex/factorforge-cds
 ```
-*A citable publication is in preparation.*
 ---
-## Contributors
+## Maintainer
-| | Name | Role |
-|--|------|------|
-| 👤 | Mun-Kyu Kim ([@eijex](https://github.com/eijex)) | Author & maintainer |
-| 🤖 | Claude (Anthropic) | Design, analysis, planning |
-| 🤖 | Codex (OpenAI) | Implementation |
+Mun-Kyu Kim ([@eijex](https://github.com/eijex))
 ## License
 GNU Affero General Public License v3.0 — see [LICENSE](LICENSE).
-**Disclaimer:** FactorForge is provided for research purposes only. Predictions are computational and have not been experimentally validated.
+**Disclaimer:** FactorForge is provided for research purposes only. Outputs are computational and have not been experimentally validated.
 ---
 ## Get in Touch
 - **Docs** — [eijex.github.io/factorforge-cds](https://eijex.github.io/factorforge-cds/)
-- **Wet-lab Results** — [Submit via Google Form](https://docs.google.com/forms/d/e/1FAIpQLSeSx-wYvF6YwHhSPdLMl-L44frCugdm25X_eDz50OaqTD66qA/viewform?usp=header) (recommended) or [GitHub Issue](https://github.com/eijex/factorforge-cds/issues/new?template=wet_lab_result.yml)
+- **Wet-lab Results** — Public-safe validation summaries are welcome. Do not submit raw sequences, confidential construct details, internal batch IDs, patient data, private contact information, exact process parameters, or confidential partner/customer data. See [VALIDATION.md](VALIDATION.md) before submitting.
 - **GitHub Issues** — bugs, features: [github.com/eijex/factorforge-cds/issues](https://github.com/eijex/factorforge-cds/issues)
 - **Email** — eijex.lab@gmail.com
-- **Web** — [factorforge.eijex.com](https://factorforge.eijex.com)
+- **FactorForge** — [factorforge.eijex.com](https://factorforge.eijex.com)
+- **Lab** — [www.eijex.com](https://www.eijex.com)

factorforge_cds-3.2.0/README.md ADDED

@@ -0,0 +1,88 @@
+# FactorForge
+**Open-source constraint-based CDS design engine for sequence-level CDS design, with primary support for *Nicotiana benthamiana* (Tobacco BY-2: experimental).**
+[![License](https://img.shields.io/badge/license-AGPL--3.0-blue.svg)](LICENSE)
+[![Python](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/)
+[![PyPI](https://img.shields.io/pypi/v/factorforge-cds.svg)](https://pypi.org/project/factorforge-cds/)
+[![CI](https://github.com/eijex/factorforge-cds/actions/workflows/ci.yml/badge.svg)](https://github.com/eijex/factorforge-cds/actions/workflows/ci.yml)
+[![codecov](https://codecov.io/gh/eijex/factorforge-cds/branch/main/graph/badge.svg)](https://codecov.io/gh/eijex/factorforge-cds)
+[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.20407331.svg)](https://doi.org/10.5281/zenodo.20407331)
+[![Web App](https://img.shields.io/badge/web-factorforge.eijex.com-brightgreen.svg)](https://factorforge.eijex.com)
+FactorForge performs profile-guided CDS design with CAI/GC metrics, PolyA-signal screening, and Golden Gate/MoClo-aware checks. Primary support: *N. benthamiana* (agroinfiltration). Experimental host context: Tobacco BY-2 (`--host by2`).
+**→ [Full Documentation](https://eijex.github.io/factorforge-cds/)**
+---
+## Quick Start
+```bash
+pip install factorforge-cds
+factorforge optimize my_protein.fasta -o output.fasta
+```
+Or use the **[web app](https://factorforge.eijex.com)** — no installation required.
+---
+## Access Options
+| Method | Description | Link |
+|--------|-------------|------|
+| **Web App** | No installation, demo & light use | [factorforge.eijex.com](https://factorforge.eijex.com) |
+| **CLI / Python** | Local use, batch processing, data privacy | `pip install factorforge-cds` |
+| **Docker** | Full web interface locally | `docker pull ghcr.io/eijex/factorforge-cds:latest` |
+| **Eijex MCP** | MCP-compatible agent access | [mcp.eijex.com](https://mcp.eijex.com) |
+---
+## Repository Structure
+The supported production engine is the deterministic profile engine under:
+```text
+src/factorforge/engines/profile/
+```
+Historical implementation tracks are preserved under `archive/` for provenance
+and are not imported by the installed package or exposed as supported engines.
+---
+## ⚠️ Validation Status
+FactorForge outputs are **in-silico only** and have not been experimentally validated in wet-lab conditions. See [Validation](https://eijex.github.io/factorforge-cds/validation/) and [VALIDATION.md](VALIDATION.md).
+---
+## Citing
+```
+FactorForge v3.2.0 (2026). Open-source constraint-based CDS design engine.
+Eijex. https://github.com/eijex/factorforge-cds
+```
+---
+## Maintainer
+Mun-Kyu Kim ([@eijex](https://github.com/eijex))
+## License
+GNU Affero General Public License v3.0 — see [LICENSE](LICENSE).
+**Disclaimer:** FactorForge is provided for research purposes only. Outputs are computational and have not been experimentally validated.
+---
+## Get in Touch
+- **Docs** — [eijex.github.io/factorforge-cds](https://eijex.github.io/factorforge-cds/)
+- **Wet-lab Results** — Public-safe validation summaries are welcome. Do not submit raw sequences, confidential construct details, internal batch IDs, patient data, private contact information, exact process parameters, or confidential partner/customer data. See [VALIDATION.md](VALIDATION.md) before submitting.
+- **GitHub Issues** — bugs, features: [github.com/eijex/factorforge-cds/issues](https://github.com/eijex/factorforge-cds/issues)
+- **Email** — eijex.lab@gmail.com
+- **FactorForge** — [factorforge.eijex.com](https://factorforge.eijex.com)
+- **Lab** — [www.eijex.com](https://www.eijex.com)

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "factorforge-cds"
-version = "3.1.8"
+version = "3.2.0"
 description = "FactorForge - open-source constraint-based CDS design engine by Eijex."
 readme = "README.md"
 license = "AGPL-3.0-only"
@@ -28,6 +28,7 @@ dependencies = [
 [project.optional-dependencies]
 dev = [
+    "jsonschema>=4.0",
     "pytest>=7.0",
     "pytest-cov>=4.0",
     "ruff>=0.1",

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/__init__.py RENAMED

@@ -4,7 +4,7 @@ FactorForge - Codon Optimization Platform
 profile: constraint-aware rule/profile engine
 """
-__version__ = "3.1.8"
+__version__ = "3.2.0"
 __author__ = "Eijex"
 # Auto-register engines (safe when running from source tree)

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/analysis/feasibility.py RENAMED

@@ -14,6 +14,16 @@ from factorforge.analysis.metrics import (
 )
+# Defaults calibrated to nbenthamiana profile engine output distribution
+# (analysis 004, n=49): avg CAI=0.76, avg GC=60.1% (range 55-71%).
+# DEFAULT_CAI_TARGET=0.82 aligns with industry practice (>0.8) and is achievable.
+# Exported as named constants so tests/test_registry_production_sync.py can
+# strictly compare them against the registry (single source of truth).
+DEFAULT_CAI_TARGET: float = 0.82
+DEFAULT_GC_LOW: float = 55.0
+DEFAULT_GC_HIGH: float = 65.0
 AA_TO_CODONS: dict[str, list[str]] = {}
 for _codon, _aa in STANDARD_GENETIC_CODE.items():
     if _aa == "*":
@@ -88,9 +98,9 @@ def _reconstruct_sequence(
 def analyze_feasibility(
     protein_sequence: str,
     codon_weights: dict[str, float],
-    target_cai: float = 0.82,
-    target_gc_low: float = 55.0,
-    target_gc_high: float = 65.0,
+    target_cai: float = DEFAULT_CAI_TARGET,
+    target_gc_low: float = DEFAULT_GC_LOW,
+    target_gc_high: float = DEFAULT_GC_HIGH,
     gc_ranges: list[tuple[float, float]] | None = None,
 ) -> dict[str, Any]:
     """Compute exact CAI/GC feasibility over synonymous codon choices.
@@ -99,9 +109,8 @@ def analyze_feasibility(
     global GC count. This is exact for global GC and CAI under the supplied
     codon weights.
-    Defaults calibrated to nbenthamiana profile engine output distribution
-    (analysis 004, n=49): avg CAI=0.76, avg GC=60.1% (range 55-71%).
-    target_cai=0.82 aligns with industry practice (>0.8) and is achievable.
+    See module-level DEFAULT_CAI_TARGET / DEFAULT_GC_LOW / DEFAULT_GC_HIGH for
+    the calibration rationale (analysis 004, n=49).
     """
     protein = "".join(protein_sequence.upper().split()).rstrip("*")
     if not protein:

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/analysis/metrics.py RENAMED

@@ -277,6 +277,32 @@ def calculate_cai(sequence: str, codon_weights: dict[str, float]) -> float:
     return math.exp(log_sum / count) if count else 0.0
+def calculate_dinucleotide_score(
+    sequence: str,
+    cpg_weight: float = 0.0,
+    tpa_weight: float = 1.0,
+) -> float:
+    """Score dinucleotide avoidance.
+    Plant default: CpG inactive (cpg_weight=0.0), only TpA is penalized.
+    Mammalian opt-in: set cpg_weight=1.0 and tpa_weight=1.0 to penalize both.
+    """
+    from factorforge.engines.profile.utils import calculate_dinucleotide_ratio
+    if len(sequence) < 6:
+        return 1.0
+    total_weight = cpg_weight + tpa_weight
+    if total_weight == 0:
+        return 1.0
+    cpg_ratio = calculate_dinucleotide_ratio(sequence, "CG")
+    tpa_ratio = calculate_dinucleotide_ratio(sequence, "TA")
+    cpg_score = max(0.0, 1.0 - cpg_ratio / 2.0)
+    tpa_score = max(0.0, 1.0 - tpa_ratio / 2.0)
+    return (cpg_weight * cpg_score + tpa_weight * tpa_score) / total_weight
 def codon_usage_profile(sequence: str) -> dict[str, dict[str, float | int | str]]:
     """Return codon counts and frequencies for a DNA sequence."""
     codons = _codons(sequence)
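
The weighted form of `calculate_dinucleotide_score` changes the default behaviour: 3.1.8 averaged the CpG and TpA scores equally, while the 3.2.0 defaults (`cpg_weight=0.0`, `tpa_weight=1.0`) score TpA only. The sketch below reproduces the weighting arithmetic so the effect can be checked in isolation. `dinucleotide_ratio` is a hypothetical stand-in that assumes an observed/expected ratio; the package's `calculate_dinucleotide_ratio` is not shown in the diff and may differ.

```python
def dinucleotide_ratio(sequence: str, dinuc: str) -> float:
    """Observed/expected dinucleotide ratio (assumed definition, not the package's)."""
    seq = sequence.upper()
    n = len(seq)
    if n < 2:
        return 0.0
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == dinuc) / (n - 1)
    expected = (seq.count(dinuc[0]) / n) * (seq.count(dinuc[1]) / n)
    return observed / expected if expected else 0.0


def dinucleotide_score(
    sequence: str,
    cpg_weight: float = 0.0,
    tpa_weight: float = 1.0,
) -> float:
    """Weighted average of CpG and TpA avoidance scores, as in the 3.2.0 hunk."""
    if len(sequence) < 6:
        return 1.0
    total_weight = cpg_weight + tpa_weight
    if total_weight == 0:
        return 1.0
    # Each component is 1.0 at ratio 0 and clamps to 0.0 at ratio >= 2.0.
    cpg_score = max(0.0, 1.0 - dinucleotide_ratio(sequence, "CG") / 2.0)
    tpa_score = max(0.0, 1.0 - dinucleotide_ratio(sequence, "TA") / 2.0)
    return (cpg_weight * cpg_score + tpa_weight * tpa_score) / total_weight


if __name__ == "__main__":
    cpg_rich = "CGCGCGCGCG"
    print(dinucleotide_score(cpg_rich))                                   # plant default
    print(dinucleotide_score(cpg_rich, cpg_weight=1.0, tpa_weight=1.0))   # both penalized
```

With a CpG-rich, TpA-free input, the plant default ignores CpG entirely, whereas equal weights recover the 3.1.8-style average.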

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/__init__.py RENAMED

@@ -13,7 +13,7 @@ def register_builtin_engines() -> None:
         "profile",
         RuleBasedOptimizer,
         metadata={
-            "version": "3.1.8",
+            "version": "3.2.0",
             "engine_type": "profile_rule_based",
             "role": "stable_profile_engine",
             "stable": True,

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/__init__.py RENAMED

@@ -5,7 +5,7 @@ Production system (2026)
 Plant-specific rule-based optimization
 """
-__version__ = "3.1.8"
+__version__ = "3.2.0"
 from .optimizer import RuleBasedOptimizer
 from .pipeline import OptimizationPipeline

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/optimizer.py RENAMED

@@ -9,7 +9,7 @@ from factorforge.core.interfaces import OptimizationResult, OptimizerEngine
 from .exporter import SequenceExporter
 from .rules.reverse_translator import OptimizationProfile, ReverseTranslator
 from .rules.rule_engine import RuleEngine
-from .scoring import calculate_composite_score
+from .scoring import calculate_composite_score, compute_mfe_evidence
 from .validator import InputValidator
@@ -17,7 +17,7 @@ class RuleBasedOptimizer(OptimizerEngine):
     """Profile-based rule optimization engine."""
     name = "Profile-based"
-    version = "3.1.8"
+    version = "3.2.0"
     def __init__(self) -> None:
         self.validator = InputValidator()
@@ -30,6 +30,7 @@ class RuleBasedOptimizer(OptimizerEngine):
         sequence: str,
         profile: str | None = "balanced",
         host: str = "nbenthamiana",
+        seed: int | None = None,
         **kwargs: Any,
     ) -> OptimizationResult:
         """
@@ -91,7 +92,7 @@ class RuleBasedOptimizer(OptimizerEngine):
             candidates = [{"sequence": optimized_dna, "cai": cai, "gc": gc, "score": score}]
         else:
             candidates = translator.generate_candidates(
-                processed_seq, profile=opt_profile, n=1
+                processed_seq, profile=opt_profile, n=1, seed=seed
             )
             if not candidates:
                 raise ValueError("No candidates generated for input sequence.")
@@ -117,6 +118,10 @@ class RuleBasedOptimizer(OptimizerEngine):
             "score": candidates[0]["score"],
             "violations": sum(len(v) for v in scan_results.values()),
         }
+        # MFE provenance: expose whether MFE was actually computed so downstream
+        # artifacts (API response, Design Package) never report an uncomputed
+        # MFE as a misleading 0.0 (016 audit). Score value is unchanged.
+        metrics.update(compute_mfe_evidence(optimized_dna, profile=profile_value))
         return OptimizationResult(
             sequence=optimized_dna,

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/pipeline.py RENAMED

@@ -18,9 +18,14 @@ from factorforge.engines.profile.rules.reverse_translator import (
     ReverseTranslator,
 )
 from factorforge.engines.profile.rules.rule_engine import RuleEngine
-from factorforge.engines.profile.scoring import calculate_composite_score
+from factorforge.engines.profile.scoring import (
+    calculate_composite_score,
+    compute_mfe_evidence,
+)
 from factorforge.engines.profile.validator import InputValidator
+from factorforge.analysis.metrics import translate_dna
 from factorforge.utils.construct_id import generate_construct_id
+from factorforge.utils.sequence_validator import validate_cds_output
 logger = logging.getLogger(__name__)
@@ -48,7 +53,15 @@ class PipelineResult:
             "optimization_profile": self.metadata.get("profile", ""),
             "cai_score": round(metrics.get("cai", 0.0), 4),
             "gc_content_pct": round(metrics.get("gc", 0.0), 2),
-            "mfe_kcal_mol": round(metrics.get("mfe", 0.0), 2),
+            # MFE provenance (016 audit): None when not computed (e.g. ViennaRNA
+            # unavailable) — never report an uncomputed MFE as a misleading 0.0.
+            "mfe_kcal_mol": (
+                round(metrics["mfe_kcal_mol"], 2)
+                if metrics.get("mfe_kcal_mol") is not None
+                else None
+            ),
+            "mfe_status": metrics.get("mfe_status", "not_computed"),
+            "mfe_used": metrics.get("mfe_used", False),
             "polya_signal_count": len(scan.get("polya", [])),
             "domestication_edits": len(dom.get("removed_sites", [])),
             "sequence_length_aa": len(self.sequence) // 3,
@@ -175,6 +188,7 @@ class OptimizationPipeline:
         if seq_type == "dna":
             optimized_dna = processed
+            expected_protein = translate_dna(processed).rstrip("*")
             cai = translator.calculate_cai(optimized_dna)
             gc = translator.calculate_gc_content(optimized_dna)
             score = calculate_composite_score(
@@ -182,6 +196,7 @@ class OptimizationPipeline:
             )
             candidate_metrics = {"cai": cai, "gc": gc, "score": score}
         else:
+            expected_protein = processed.rstrip("*")
             logger.debug(f"Generating candidates with profile: {opt_profile.value}")
             candidates = translator.generate_candidates(processed, profile=opt_profile, n=1)
             if not candidates:
@@ -251,7 +266,20 @@ class OptimizationPipeline:
         assembly_standard = kwargs.get("assembly_standard", "golden_gate")
         domestication = domesticator.domesticate(optimized_dna, standard=assembly_standard)
+        if not domestication.get("success", False):
+            unfixable = domestication.get("unfixable", [])
+            error = domestication.get("error")
+            detail = error or f"unfixable restriction sites: {unfixable}"
+            raise ValueError(f"Domestication failed for {assembly_standard}: {detail}")
         domesticated_sequence = domestication.get("domesticated_seq", optimized_dna)
+        final_validation = validate_cds_output(expected_protein, domesticated_sequence)
+        if not final_validation["passed"]:
+            raise ValueError(
+                "Final CDS validation failed: "
+                f"{final_validation['errors']} "
+                f"(aa_identity={final_validation['aa_identity']:.4f})"
+            )
         template_name = construct_template or self.construct_template
         if template_name:
@@ -269,6 +297,13 @@ class OptimizationPipeline:
             construct_record = None
             final_sequence = domesticated_sequence
+        # MFE provenance for the final output sequence (016 audit): record
+        # whether MFE was computed so export_features / Design Package never
+        # report an uncomputed MFE as 0.0.
+        candidate_metrics.update(
+            compute_mfe_evidence(domesticated_sequence, profile=effective_profile)
+        )
         metadata: dict[str, Any] = {
             "construct_id": generate_construct_id(),
             "profile": effective_profile,
@@ -278,6 +313,7 @@ class OptimizationPipeline:
             "validation": val_result,
             "scan_results": scan_results,
             "domestication": domestication,
+            "final_validation": final_validation,
             "metrics": candidate_metrics,
             "scan_mode": scan_mode,
         }
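
The pipeline now fails hard when the domesticated sequence no longer encodes the expected protein. The body of `validate_cds_output` is not part of this diff, so the following is only a sketch of what such a check could look like. `validate_cds` and its identity calculation are assumptions; only the returned keys (`passed`, `errors`, `aa_identity`) mirror what the pipeline reads.

```python
from itertools import product
from typing import Any

# Standard genetic code in the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def validate_cds(expected_protein: str, dna: str) -> dict[str, Any]:
    """Translate `dna` and compare it with `expected_protein` (hypothetical helper)."""
    errors: list[str] = []
    dna = dna.upper()
    if len(dna) % 3 != 0:
        errors.append("CDS length is not a multiple of 3")
    usable = len(dna) - len(dna) % 3
    # Unknown codons translate to "X" so they count as mismatches.
    translated = "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    ).rstrip("*")
    expected = expected_protein.upper().rstrip("*")
    if translated != expected:
        errors.append("translation does not match the expected protein")
    longest = max(len(expected), len(translated))
    matches = sum(a == b for a, b in zip(expected, translated))
    aa_identity = matches / longest if longest else 0.0
    return {"passed": not errors, "errors": errors, "aa_identity": aa_identity}


if __name__ == "__main__":
    result = validate_cds("MAK", "ATGGCTAGATAA")  # AGA encodes R, not K
    if not result["passed"]:
        print(f"Final CDS validation failed: {result['errors']} "
              f"(aa_identity={result['aa_identity']:.4f})")
```

Running the check after domestication, rather than after codon selection, catches any synonymous-edit step that accidentally changes an amino acid.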

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/rules/domesticator.py RENAMED

@@ -20,6 +20,18 @@ class Domesticator:
     - BioBricks (EcoRI, XbaI, SpeI, PstI)
     """
+    # Canonical Golden Gate Type IIS enzyme set, exported as GOLDEN_GATE_ENZYMES
+    # so tests/test_registry_production_sync.py::test_type_iis_sync can strictly
+    # compare it against the registry (single source of truth) instead of warning.
+    #
+    # BpiI and BbsI share the same GAAGAC Type IIS recognition/cut behavior in
+    # FactorForge's Golden Gate scanning context. The existing FactorForge
+    # production code and documentation consistently use BpiI as the canonical
+    # label; BbsI is a common synonym/vendor naming convention for the same
+    # scanning target. This is a naming normalization, not a biological
+    # threshold change. Order matches the registry value for stable comparison.
+    GOLDEN_GATE_ENZYMES: tuple[str, ...] = ("BsaI", "BpiI", "BsmBI")
     # Assembly standard definitions
     ASSEMBLY_STANDARDS: dict[str, dict[str, Any]] = {
         "golden_gate": {

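The comment on `GOLDEN_GATE_ENZYMES` treats BpiI and BbsI as one scanning target because both recognize GAAGAC. A minimal sketch of a two-strand scan for the three canonical sites follows. `find_type_iis_sites` and its return shape are hypothetical and are not the `Domesticator` API; the recognition sequences are the standard ones for BsaI, BpiI/BbsI, and BsmBI.

```python
# Recognition sequences for the enzymes named in GOLDEN_GATE_ENZYMES.
SITES = {"BsaI": "GGTCTC", "BpiI": "GAAGAC", "BsmBI": "CGTCTC"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_type_iis_sites(seq: str) -> list[tuple[str, str, int]]:
    """Return (enzyme, strand, 0-based start) for every site on either strand."""
    seq = seq.upper()
    hits: list[tuple[str, str, int]] = []
    for enzyme, site in SITES.items():
        # Type IIS sites are non-palindromic, so the reverse strand is scanned
        # by searching for the reverse complement of the recognition sequence.
        for strand, motif in (("+", site), ("-", reverse_complement(site))):
            start = seq.find(motif)
            while start != -1:
                hits.append((enzyme, strand, start))
                start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    print(find_type_iis_sites("AAAGGTCTCAAA"))
    print(find_type_iis_sites("TTGTCTTCTT"))
```

Because the scan keys on the recognition sequence, labelling the hit BpiI or BbsI makes no difference to which positions are reported.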
{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/rules/reverse_translator.py RENAMED

@@ -551,8 +551,8 @@ class ReverseTranslator:
         """Apply N-terminal codon ramp for co-translational folding.
         Replaces the first `ramp_codons` codons with lower-frequency synonymous
-        codons (bottom 50% by frequency) to slow the ribosome at the N-terminus.
-        Single-codon amino acids (Met, Trp) are left unchanged.
+        codons (bottom 25% by frequency; cutoff = 3*len//4) to slow the ribosome
+        at the N-terminus. Single-codon amino acids (Met, Trp) are left unchanged.
         TODO: ramp profile is currently not in VALID_PROFILES (not publicly accessible).
         Before re-enabling, revisit ramp_codons=50:
@@ -671,6 +671,7 @@ class ReverseTranslator:
         protein_seq: str,
         profile: OptimizationProfile = OptimizationProfile.BALANCED,
         n: int = 5,
+        seed: int | None = None,
         **kwargs: Any,
     ) -> list[dict[str, Any]]:
         """
@@ -697,6 +698,9 @@ class ReverseTranslator:
         if n < 1:
             raise ValueError("n must be >= 1")
+        # Seed before any candidate generation (covers both n=1 fast path and n>1).
+        random.seed(seed if seed is not None else secrets.randbits(32))
         def _build_candidate() -> dict[str, Any]:
             dna_seq = self.reverse_translate(protein_seq, profile, **kwargs)
             cai = self.calculate_cai(dna_seq)
@@ -720,7 +724,6 @@ class ReverseTranslator:
         candidates: list[dict[str, Any]] = []
         last_error: Exception | None = None
-        random.seed(secrets.randbits(32))
         for attempt in range(n):
             try:

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/rules/rule_engine.py RENAMED

@@ -354,13 +354,20 @@ class RuleEngine:
         max_gc: float = 75,
     ) -> list[dict[str, Any]]:
         """
-        Detect extreme GC regions
+        Detect extreme GC regions in a sliding local window.
+        This is a LOCAL synthesis/extreme-window guard (default 25-75% over a
+        50 bp window), NOT the global GC target. Global GC is governed separately
+        by the scoring band (GC_OPT_MIN/MAX, ~55-65%) and the API/DP gc_min/gc_max
+        constraints. The wide 25-75% band intentionally flags only synthesis-hostile
+        local windows; narrowing it toward the global optimum would raise false
+        positives against the engine's own output distribution (analysis 004: 55-71%).
         Args:
             seq: DNA sequence
             window: Window size (bp)
-            min_gc: Minimum GC% threshold
-            max_gc: Maximum GC% threshold
+            min_gc: Minimum local GC% threshold (synthesis guard, not global target)
+            max_gc: Maximum local GC% threshold (synthesis guard, not global target)
         Returns:
             List of violations
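
The revised docstring separates the local extreme-window guard (25-75% over 50 bp) from the global GC target (~55-65%). A sketch of a sliding-window check with those defaults follows. The function name, the step size of 1 bp, and the violation dict layout are assumptions, since the method body is not in the diff.

```python
from typing import Any


def find_extreme_gc_windows(
    seq: str,
    window: int = 50,
    min_gc: float = 25.0,
    max_gc: float = 75.0,
) -> list[dict[str, Any]]:
    """Flag every `window`-bp stretch whose local GC% falls outside [min_gc, max_gc]."""
    seq = seq.upper()
    violations: list[dict[str, Any]] = []
    # Sequences shorter than one window produce no windows and no violations.
    for start in range(0, len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc_pct = 100.0 * (chunk.count("G") + chunk.count("C")) / window
        if gc_pct < min_gc or gc_pct > max_gc:
            violations.append({"start": start, "end": start + window, "gc_pct": gc_pct})
    return violations


if __name__ == "__main__":
    print(find_extreme_gc_windows("GC" * 25))    # one window at 100% GC
    print(find_extreme_gc_windows("ATGC" * 25))  # about 50% GC everywhere
```

A sequence whose global GC sits at the upper end of the engine's reported output range (55-71%) passes this guard as long as no single window exceeds 75%, which is the point the docstring makes about false positives.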

{factorforge_cds-3.1.8 → factorforge_cds-3.2.0}/src/factorforge/engines/profile/scoring.py RENAMED

@@ -34,6 +34,8 @@ class ScoringConfig:
     w_gc: float = 0.3
     w_mfe: float = 0.2
     w_dinuc: float = 0.0  # CpG/TpA dinucleotide penalty (opt-in, default off)
+    cpg_weight: float = 0.0  # plant default: CpG inactive
+    tpa_weight: float = 1.0  # plant default: TpA active
     w_syncodonlm: float = 0.0  # SynCodonLM quality score (opt-in, default off)
     gc_opt: float = GC_OPT_MID  # no longer used by calculate_composite_score (superseded by
                                 # gc_min/gc_max band); retained for external API compatibility
@@ -187,31 +189,19 @@ def gc_band_score(
     return max(0.0, 1.0 - distance / decay_width)
-def calculate_dinucleotide_score(sequence: str) -> float:
-    """Calculate a dinucleotide avoidance score (0-1, higher = fewer CpG/TpA).
-    Combines CpG and TpA observed/expected ratios. A sequence with no CpG
-    and no TpA scores 1.0; high density scores toward 0.0.
-    Args:
-        sequence: DNA sequence.
+def calculate_dinucleotide_score(
+    sequence: str,
+    cpg_weight: float = 0.0,
+    tpa_weight: float = 1.0,
+) -> float:
+    """Score dinucleotide avoidance.
-    Returns:
-        Dinucleotide avoidance score (0-1).
+    Plant default: CpG inactive (cpg_weight=0.0), only TpA is penalized.
+    Mammalian opt-in: set cpg_weight=1.0 and tpa_weight=1.0 to penalize both.
     """
-    from factorforge.engines.profile.utils import calculate_dinucleotide_ratio
+    from factorforge.analysis.metrics import calculate_dinucleotide_score as _score
-    if len(sequence) < 6:
-        return 1.0
-    cpg_ratio = calculate_dinucleotide_ratio(sequence, "CG")
-    tpa_ratio = calculate_dinucleotide_ratio(sequence, "TA")
-    # Score: 1.0 when ratio=0, 0.0 when ratio>=2.0
-    cpg_score = max(0.0, 1.0 - cpg_ratio / 2.0)
-    tpa_score = max(0.0, 1.0 - tpa_ratio / 2.0)
-    return (cpg_score + tpa_score) / 2.0
+    return _score(sequence, cpg_weight=cpg_weight, tpa_weight=tpa_weight)
 def calculate_composite_score(
@@ -279,7 +269,11 @@ def calculate_composite_score(
     dinuc_score = 0.5  # neutral default
     actual_w_dinuc = config.w_dinuc
     if actual_w_dinuc > 0 and sequence is not None:
-        dinuc_score = calculate_dinucleotide_score(sequence)
+        dinuc_score = calculate_dinucleotide_score(
+            sequence,
+            cpg_weight=config.cpg_weight,
+            tpa_weight=config.tpa_weight,
+        )
     elif actual_w_dinuc > 0:
         actual_w_dinuc = 0.0  # Cannot compute without sequence
@@ -308,3 +302,50 @@ def calculate_composite_score(
     )
     return round(score, 3)
+def compute_mfe_evidence(
+    sequence: str | None,
+    config: ScoringConfig | None = None,
+    profile: str | None = None,
+) -> dict[str, Any]:
+    """Return MFE provenance metadata for a scored sequence.
+    Mirrors the MFE branch of ``calculate_composite_score`` WITHOUT changing the
+    score. Its purpose is honesty: when MFE is not computed (e.g. ViennaRNA is
+    unavailable in the deployment, as on Vercel), callers must be able to tell
+    that ``mfe_kcal_mol`` is absent rather than a genuine 0.0.
+    Returns a dict with:
+        mfe_kcal_mol: float | None   (None when not computed)
+        mfe_status:   "computed" | "not_computed"
+        mfe_used:     bool           (whether MFE contributed to the score)
+        mfe_warning:  str | None     (reason when not used)
+        score_components: {cai_used, gc_used, mfe_used}
+    """
+    if config is None:
+        profile_name = (profile or "balanced").lower()
+        config = PROFILE_SCORING_CONFIGS.get(profile_name) or PROFILE_SCORING_CONFIGS["balanced"]
+    mfe_value: float | None = None
+    reason: str | None = None
+    if not config.use_mfe:
+        reason = "MFE scoring is disabled for this profile."
+    elif sequence is None:
+        reason = "MFE was not computed because no sequence was provided."
+    elif not _check_vienna_available():
+        reason = "MFE was not computed because ViennaRNA is unavailable in this environment."
+    else:
+        mfe_value = calculate_mfe(sequence)
+        if mfe_value is None:
+            reason = "MFE computation failed for this sequence."
+    mfe_used = mfe_value is not None
+    return {
+        "mfe_kcal_mol": round(mfe_value, 2) if mfe_used else None,
+        "mfe_status": "computed" if mfe_used else "not_computed",
+        "mfe_used": mfe_used,
+        "mfe_warning": None if mfe_used else reason,
+        "score_components": {"cai_used": True, "gc_used": True, "mfe_used": mfe_used},
+    }
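
Downstream code has to treat `mfe_kcal_mol` as optional now that `compute_mfe_evidence` returns `None` for an uncomputed value. The sketch below shows one way a consumer could build export fields from the metrics dict, following the same fallbacks the `pipeline.py` hunk uses. `export_mfe_fields` is a hypothetical helper, not a function in the package.

```python
from typing import Any


def export_mfe_fields(metrics: dict[str, Any]) -> dict[str, Any]:
    """Build MFE export fields without turning a missing value into 0.0."""
    mfe = metrics.get("mfe_kcal_mol")
    return {
        # `is not None` keeps a genuine 0.0 distinct from "not computed".
        "mfe_kcal_mol": round(mfe, 2) if mfe is not None else None,
        "mfe_status": metrics.get("mfe_status", "not_computed"),
        "mfe_used": metrics.get("mfe_used", False),
    }


if __name__ == "__main__":
    # ViennaRNA unavailable: nothing was computed, so the value stays None.
    print(export_mfe_fields({}))
    # ViennaRNA available: the computed value is rounded and passed through.
    print(export_mfe_fields(
        {"mfe_kcal_mol": -12.3456, "mfe_status": "computed", "mfe_used": True}
    ))
```

Testing with `is not None` rather than truthiness matters here, because a truthiness check would silently drop a legitimately computed 0.0.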
