themefinder 0.6.2__py3-none-any.whl → 0.6.3__py3-none-any.whl
- themefinder/__init__.py +4 -0
- themefinder/core.py +129 -33
- themefinder/llm_batch_processor.py +32 -80
- themefinder/models.py +307 -94
- themefinder/prompts/detail_detection.txt +19 -0
- themefinder/prompts/sentiment_analysis.txt +0 -14
- themefinder/prompts/theme_condensation.txt +2 -22
- themefinder/prompts/theme_generation.txt +6 -38
- themefinder/prompts/theme_mapping.txt +6 -23
- themefinder/prompts/theme_refinement.txt +2 -12
- themefinder/prompts/theme_target_alignment.txt +2 -10
- {themefinder-0.6.2.dist-info → themefinder-0.6.3.dist-info}/METADATA +23 -8
- themefinder-0.6.3.dist-info/RECORD +17 -0
- {themefinder-0.6.2.dist-info → themefinder-0.6.3.dist-info}/WHEEL +1 -1
- themefinder-0.6.2.dist-info/RECORD +0 -16
- {themefinder-0.6.2.dist-info → themefinder-0.6.3.dist-info}/LICENCE +0 -0
@@ -10,17 +10,9 @@ Requirements:
 - Each consolidated theme should capture all relevant information from its source themes
 - Final descriptions should be concise but thorough
 - The merged themes should be distinct from each other with minimal overlap
+- The source_topic_count field should be included for each theme and represent the sum of all source themes that were combined to create it
+- You cannot return more than {target_n_themes}

-Return your output in the following JSON format:
-
-{{
-"responses": [
-{{"topic_id": "A", "topic": "{{topic label 1}}: {{topic description 1}}"}},
-{{"topic_id": "B", "topic": "{{topic label 2}}: {{topic description 2}}"}},
-{{"topic_id": "C", "topic": "{{topic label 3}}: {{topic description 3}}"}},
-// Additional topics as necessary
-]
-}}

 Themes to analyze:
 {responses}
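The two requirements added in this hunk can be checked mechanically once the model output has been parsed. The following is a minimal sketch, not the package's own validation: `AlignedTheme` and `validate_aligned_themes` are hypothetical names, `topic_id` and `topic` are carried over from the JSON example removed above (whether 0.6.3 still uses them is an assumption), and the rule that every theme has at least one source theme is inferred from the wording of the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedTheme:
    """Hypothetical shape of one returned theme (not themefinder's own model)."""

    topic_id: str
    topic: str
    source_topic_count: int  # number of source themes merged into this one


def validate_aligned_themes(themes: list[AlignedTheme], target_n_themes: int) -> None:
    """Raise ValueError if the output breaks either requirement added in this hunk."""
    # Requirement: "You cannot return more than {target_n_themes}".
    if len(themes) > target_n_themes:
        raise ValueError(f"got {len(themes)} themes, limit is {target_n_themes}")
    # Assumption: a merged theme always has at least one source theme.
    for theme in themes:
        if theme.source_topic_count < 1:
            raise ValueError(f"theme {theme.topic_id!r} has no source themes")
```

A caller would run this after parsing and before using the themes, so that an over-long or malformed response fails early.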
@@ -1,6 +1,6 @@
 Metadata-Version: 2.3
 Name: themefinder
-Version: 0.6.2
+Version: 0.6.3
 Summary: A topic modelling Python package designed for analysing one-to-many question-answer data eg free-text survey responses.
 License: MIT
 Author: i.AI
@@ -49,9 +49,9 @@ ThemeFinder takes as input a [pandas DataFrame](https://pandas.pydata.org/docs/r
 - `response_id`: A unique identifier for each response
 - `response`: The free text survey response

-ThemeFinder
+ThemeFinder now supports a range of language models through structured outputs.

-The function `find_themes` identifies common themes in
+The function `find_themes` identifies common themes in responses and labels them, it also outputs results from intermediate steps in the theme finding pipeline.

 For this example, import the following Python packages into your virtual environment: `asyncio`, `pandas`, `lanchain`. And import `themefinder` as described above.

@@ -81,7 +81,6 @@ load_dotenv()
 llm = AzureChatOpenAI(
     model="gpt-4o",
     temperature=0,
-    model_kwargs={"response_format": {"type": "json_object"}},
 )

 # Set up your data
@@ -97,18 +96,15 @@ question = "What do you think of ThemeFinder?"
 # Make the system prompt specific to your use case
 system_prompt = "You are an AI evaluation tool analyzing survey responses about a Python package."

-# Run the function to find themes
-# We use asyncio to query LLM endpoints asynchronously, so we need to await our function
+# Run the function to find themes, we use asyncio to query LLM endpoints asynchronously, so we need to await our function
 async def main():
     result = await find_themes(responses_df, llm, question, system_prompt=system_prompt)
     print(result)

 if __name__ == "__main__":
     asyncio.run(main())
-
 ```

-
 ## ThemeFinder pipeline

 ThemeFinder's pipeline consists of five distinct stages, each utilizing a specialized LLM prompt:
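The README comment in this hunk says the call must be awaited because the pipeline queries LLM endpoints asynchronously. The sketch below shows that calling pattern in isolation. It assumes nothing about the real `find_themes` beyond its being a coroutine function: `fake_find_themes` is a hypothetical stand-in that makes no network call, and the dictionary it returns is invented for illustration.

```python
import asyncio


async def fake_find_themes(question: str) -> dict:
    """Hypothetical stand-in for find_themes; it makes no LLM request."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return {"question": question, "themes": []}


async def main() -> dict:
    # Calling a coroutine function only creates a coroutine object;
    # `await` is what runs it and produces the result.
    return await fake_find_themes("What do you think of ThemeFinder?")


if __name__ == "__main__":
    # asyncio.run starts an event loop, runs main() to completion, then closes the loop.
    print(asyncio.run(main()))
```

In a notebook or other environment that already runs an event loop, the coroutine would typically be awaited directly rather than passed to `asyncio.run`.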
@@ -145,6 +141,25 @@ The file `src/themefinder.core.py` contains the function `find_themes` which run
 **For more detail - see the docs: [https://i-dot-ai.github.io/themefinder/](https://i-dot-ai.github.io/themefinder/).**


+## Model Compatibility
+
+ThemeFinder's structured output approach makes it compatible with a wide range of language models from various providers. This list is non-exhaustive, and other models may also work effectively:
+
+### OpenAI Models
+- GPT-4, GPT-4o, GPT-4.1
+- All Azure OpenAI deployments
+
+### Google Models
+- Gemini series (1.5 Pro, 2.0 Pro, etc.)
+
+### Anthropic Models
+- Claude series (Claude 3 Opus, Sonnet, Haiku, etc.)
+
+### Open Source Models
+- Llama 2, Llama 3
+- Mistral models (e.g., Mistral 7B, Mixtral)
+
+
 ## License

 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -0,0 +1,17 @@
+themefinder/__init__.py,sha256=yfIyHWPMM59u23m79igHSllT-w3r4l_euLCDZygo22Q,431
+themefinder/core.py,sha256=J4BJZO8BNN9xbX3LsKah4ZOGkW6YJcg_iYB9HCH7UR0,22768
+themefinder/llm_batch_processor.py,sha256=zdrQH1bvMR9FHWDaDp1tvdiADTHTaNDg_Z-3QQ0771k,17641
+themefinder/models.py,sha256=RN_7WzucXgKWSVXEoizijTgAM63rMVvXW6vdGD3o6Z8,12332
+themefinder/prompts/consultation_system_prompt.txt,sha256=_A07oY_an4hnRx-9pQ0y-TLXJz0dd8vDI-MZne7Mdb4,89
+themefinder/prompts/detail_detection.txt,sha256=6Vr_oN7rF5BCFipnCIHTSF8MmjerGyCixRWRT3vni1U,941
+themefinder/prompts/sentiment_analysis.txt,sha256=vYCDhtEsG5I9xixwVhZbvKPJGU1Gqpw4-xAqGz72xhU,1671
+themefinder/prompts/theme_condensation.txt,sha256=pHWuCtfU58gdtP2BfGZWOTvcb0MnTpb9OhOCGtkJv8U,1672
+themefinder/prompts/theme_generation.txt,sha256=QRKW7DtcMSb2olT6j5jmdEPcXPMeZgogM-NYddEIKRk,1871
+themefinder/prompts/theme_mapping.txt,sha256=HtGuStm-622TIEaqdb9LTaBs9xE-n9lvmcGQTG2_JOQ,2042
+themefinder/prompts/theme_refinement.txt,sha256=va9SPBbuR6F5th78Nx4lCREXDFltSO80JUsShR0FRgE,2556
+themefinder/prompts/theme_target_alignment.txt,sha256=g7AVZLiP_xIH010X5SIZyG3q7gA6OBAplPv3xvmstOY,855
+themefinder/themefinder_logging.py,sha256=n5SUQovEZLC4skEbxicjz_fOGF9mOk3S-Wpj5uXsaL8,314
+themefinder-0.6.3.dist-info/LICENCE,sha256=C9ULIN0ctF60ZxUWH_hw1H434bDLg49Z-Qzn6BUHgqs,1060
+themefinder-0.6.3.dist-info/METADATA,sha256=RtE3wRVnyr-DaHo4XFFsQtEIYJCByCmo5PsIMD0Tzh0,6850
+themefinder-0.6.3.dist-info/WHEEL,sha256=b4K_helf-jlQoXBBETfwnf4B04YC67LOev0jo4fX5m8,88
+themefinder-0.6.3.dist-info/RECORD,,
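Each RECORD entry above has the form `path,sha256=<digest>,<size>`, where the digest is the file's SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below shows how one entry could be checked against file contents, assuming the bytes have already been read from the wheel. The function names `record_digest` and `matches_record_line` are hypothetical helpers, not part of any packaging library.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return the SHA-256 digest in RECORD form: URL-safe base64, '=' padding stripped."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def matches_record_line(line: str, data: bytes) -> bool:
    """Check file contents against one 'path,sha256=<digest>,<size>' RECORD line."""
    # Split from the right so a comma inside the path cannot shift the fields.
    _path, hash_field, size = line.rsplit(",", 2)
    algorithm, _, expected = hash_field.partition("=")
    # The entry for RECORD itself has empty hash and size fields, so it never matches;
    # short-circuiting on the algorithm check also avoids calling int("") for it.
    return (
        algorithm == "sha256"
        and expected == record_digest(data)
        and int(size) == len(data)
    )
```

Comparing both the digest and the size mirrors the two fields stored per file. Files whose digest is identical in both RECORD listings, such as `themefinder_logging.py`, are byte-for-byte unchanged between the two versions.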
@@ -1,16 +0,0 @@
-themefinder/__init__.py,sha256=wSpW2fEnC4gTzbeNC78nSD3DpJq43-h_H-LK_cqt1cw,327
-themefinder/core.py,sha256=u1DY9gbzn-tFhQS3hrXQ8_1mIbR-iBWYVAdKeAX1BdE,18304
-themefinder/llm_batch_processor.py,sha256=OrFEl1nSi5ninbSZSiE1HFMcYZiQ-NzuYPj_iDcPPoE,19988
-themefinder/models.py,sha256=Y5-okndYwtBO09n_qUlYNVmHRVNEnJviArQZukm8Ox8,4251
-themefinder/prompts/consultation_system_prompt.txt,sha256=_A07oY_an4hnRx-9pQ0y-TLXJz0dd8vDI-MZne7Mdb4,89
-themefinder/prompts/sentiment_analysis.txt,sha256=9-LkdR95JTHXRKUXknAgNf86uVdv6jSaXMf-OtFL9_0,1948
-themefinder/prompts/theme_condensation.txt,sha256=DB4pqUmMpo0OG4AZWGTj0FfLFfjbX6wOMUr44HBxZ1o,2433
-themefinder/prompts/theme_generation.txt,sha256=JMXuNojxdSAcxPRU1Jg12Xunv_dX4hNvXYU2pXMWTAw,2500
-themefinder/prompts/theme_mapping.txt,sha256=YcRGMkuTyTPzPQPtsDY31DUwX60c8AdmdHKw0XeUejQ,2258
-themefinder/prompts/theme_refinement.txt,sha256=hBXwZnNZmhmoEFXpY5OJinp-7xxdoDRf_5LmgrilYgc,2713
-themefinder/prompts/theme_target_alignment.txt,sha256=-_ghr4--KAN6Tz8ExO9s2IXvI6pjWaEA_nG5L83GV5I,1035
-themefinder/themefinder_logging.py,sha256=n5SUQovEZLC4skEbxicjz_fOGF9mOk3S-Wpj5uXsaL8,314
-themefinder-0.6.2.dist-info/LICENCE,sha256=C9ULIN0ctF60ZxUWH_hw1H434bDLg49Z-Qzn6BUHgqs,1060
-themefinder-0.6.2.dist-info/METADATA,sha256=gI9Hp754EjopJQWw0QZIPb9dex8TalPMGnorUEOJlp0,6498
-themefinder-0.6.2.dist-info/WHEEL,sha256=fGIA9gx4Qxk2KDKeNJCbOEwSrmLtjWCwzBz351GyrPQ,88
-themefinder-0.6.2.dist-info/RECORD,,
File without changes