aurelian 0.3.2__py3-none-any.whl → 0.3.4__py3-none-any.whl

@@ -1,5 +1,6 @@
  """
  Agent for working with gene information using the UniProt API and NCBI Entrez.
+ Provides structured information in the form of Narrative, Functional Terms Table, and Gene Summary Table.
  """
  from pydantic_ai import Agent
 
@@ -70,38 +71,54 @@ The analysis will cover multiple types of relationships:
  - Physical interactions
  - Genetic interactions
 
- IMPORTANT: For gene set analysis, ALWAYS include a distinct section titled "## Terms"
- that contains a semicolon-delimited list of functional terms relevant to the gene set,
- ordered by relevance. These terms should include:
- - Gene Ontology biological process terms (e.g., DNA repair, oxidative phosphorylation, signal transduction)
- - Molecular function terms (e.g., kinase activity, DNA binding, transporter activity)
- - Cellular component/localization terms (e.g., nucleus, plasma membrane, mitochondria)
- - Pathway names (e.g., glycolysis, TCA cycle, MAPK signaling)
- - Co-regulation terms (e.g., stress response regulon, heat shock response)
- - Interaction networks (e.g., protein complex formation, signaling cascade)
- - Metabolic process terms (e.g., fatty acid synthesis, amino acid metabolism)
- - Regulatory mechanisms (e.g., transcriptional regulation, post-translational modification)
- - Disease associations (if relevant, e.g., virulence, pathogenesis, antibiotic resistance)
- - Structural and functional domains/motifs (e.g., helix-turn-helix, zinc finger)
-
- Example of Terms section:
- ## Terms
- DNA damage response; p53 signaling pathway; apoptosis; cell cycle regulation; tumor suppression; DNA repair; protein ubiquitination; transcriptional regulation; nuclear localization; cancer predisposition
-
- IMPORTANT: After the Terms section, ALWAYS include a "## Gene Summary Table" with a markdown table
- summarizing the genes analyzed, with the following columns in this exact order:
- - ID: The gene identifier (same as Gene Symbol)
- - Annotation: Genomic coordinates or accession with position information
- - Genomic Context: Information about the genomic location (chromosome, plasmid, etc.)
- - Organism: The organism the gene belongs to
- - Description: The protein/gene function description
+ For gene set analysis, your output MUST always include three distinct sections:
+
+ 1. First, a "## Narrative" section providing a concise explanation of the functional and categorical relationships between the genes. This should:
+ - Prioritize explanations involving most or all genes in the set
+ - Refer to specific subsets of genes when discussing specialized functions
+ - Highlight the most significant shared pathways, processes, or disease associations
+ - Be clear, concise, and focused on biological meaning
+
+ 2. Second, a "## Functional Terms Table" that presents key functional terms in a tabular format with these columns:
+ - Functional Term: The biological term or concept (e.g., DNA repair, kinase activity)
+ - Genes: The genes associated with this term (comma-separated list)
+ - Source: The likely source database or ontology (e.g., GO-BP, KEGG, Reactome, GO-MF, GO-CC, Disease)
+
+ The functional terms should include various types:
+ - Gene Ontology biological process terms (e.g., DNA repair, oxidative phosphorylation)
+ - Molecular function terms (e.g., kinase activity, DNA binding)
+ - Cellular component/localization terms (e.g., nucleus, plasma membrane)
+ - Pathway names (e.g., glycolysis, MAPK signaling)
+ - Disease associations (if relevant)
+ - Structural and functional domains/motifs (if relevant)
+
+ Example of Functional Terms Table:
+ ## Functional Terms Table
+ | Functional Term | Genes | Source |
+ |-----------------|-------|--------|
+ | DNA damage response | BRCA1, BRCA2, ATM | GO-BP |
+ | Homologous recombination | BRCA1, BRCA2 | Reactome |
+ | Tumor suppression | BRCA1, BRCA2, ATM | Disease |
+ | Nuclear localization | BRCA1, BRCA2, ATM | GO-CC |
+ | Kinase activity | ATM | GO-MF |
+ | PARP inhibitor sensitivity | BRCA1, BRCA2, PARP1 | Pathway |
+
+ 3. Third, a "## Gene Summary Table" with a markdown table summarizing the genes analyzed,
+ with the following columns in this exact order:
+ - ID: The gene identifier (same as Gene Symbol)
+ - Annotation: Genomic coordinates or accession with position information
+ - Genomic Context: Information about the genomic location (chromosome, plasmid, etc.)
+ - Organism: The organism the gene belongs to
+ - Description: The protein/gene function description
 
  Example of Gene Summary Table:
  ## Gene Summary Table
  | ID | Annotation | Genomic Context | Organism | Description |
  |-------------|-------------|----------|----------------|------------|
  | BRCA1 | NC_000017.11 (43044295..43125483) | Chromosome 17 | Homo sapiens | Breast cancer type 1 susceptibility protein |
- | TP53 | NC_000017.11 (7668402..7687550) | Chromosome 17 | Homo sapiens | Tumor suppressor protein |
+ | BRCA2 | NC_000013.11 (32315474..32400266) | Chromosome 13 | Homo sapiens | Breast cancer type 2 susceptibility protein |
+ | ATM | NC_000011.10 (108222484..108369102) | Chromosome 11 | Homo sapiens | ATM serine/threonine kinase |
+ | PARP1 | NC_000001.11 (226360251..226408516) | Chromosome 1 | Homo sapiens | Poly(ADP-ribose) polymerase 1 |
 
  For bacterial genes, the table format would be:
  | ID | Annotation | Genomic Context | Organism | Description |
@@ -123,4 +140,4 @@ talisman_agent = Agent(
  talisman_agent.tool(get_gene_description)
  talisman_agent.tool(get_gene_descriptions)
  talisman_agent.tool(get_genes_from_list)
- talisman_agent.tool(analyze_gene_set)
+ #talisman_agent.tool(analyze_gene_set)
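
The updated prompt requires three `## ` sections (Narrative, Functional Terms Table, Gene Summary Table), so a response can be checked mechanically. The following is a minimal sketch, assuming the agent returns plain markdown with those exact headings and pipe-delimited tables. The helpers `split_sections`, `parse_markdown_table`, and `missing_sections`, as well as the `sample` text, are hypothetical illustrations and are not part of aurelian.

```python
import re

# Section titles taken from the updated prompt text in this diff.
REQUIRED_SECTIONS = ["Narrative", "Functional Terms Table", "Gene Summary Table"]


def split_sections(markdown: str) -> dict[str, str]:
    """Map each '## Heading' to the text that follows it (hypothetical helper)."""
    sections: dict[str, str] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return sections


def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse the pipe-delimited table rows in text into a list of row dicts."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator row
    return [dict(zip(header, row)) for row in body]


def missing_sections(markdown: str) -> list[str]:
    """Return the required section titles that do not appear in the response."""
    found = split_sections(markdown)
    return [name for name in REQUIRED_SECTIONS if name not in found]


# Made-up response fragment, used only to exercise the helpers.
sample = """## Narrative
BRCA1, BRCA2 and ATM act in the DNA damage response.

## Functional Terms Table
| Functional Term | Genes | Source |
|-----------------|-------|--------|
| DNA damage response | BRCA1, BRCA2, ATM | GO-BP |
| Kinase activity | ATM | GO-MF |
"""

terms = parse_markdown_table(split_sections(sample)["Functional Terms Table"])
print(missing_sections(sample))
print([row["Functional Term"] for row in terms])
```

A check of this kind only validates the shape of the response. It says nothing about whether the terms or gene assignments are biologically correct.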