themefinder 0.5.3__py3-none-any.whl → 0.5.4__py3-none-any.whl

themefinder/core.py CHANGED
@@ -197,7 +197,7 @@ async def theme_condensation(
  themes_df: pd.DataFrame,
  llm: Runnable,
  question: str,
- batch_size: int = 100,
+ batch_size: int = 75,
  prompt_template: str | Path | PromptTemplate = "theme_condensation",
  system_prompt: str = CONSULTATION_SYSTEM_PROMPT,
  **kwargs,
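The hunk above only lowers the default `batch_size` of `theme_condensation` from 100 to 75. The batching code itself is not part of this diff, so the following is a minimal sketch of the likely effect, assuming themes are split into consecutive chunks. `split_into_batches` is a hypothetical helper, not a themefinder function.

```python
def split_into_batches(rows: list, batch_size: int = 75) -> list[list]:
    """Hypothetical helper: split rows into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


# 200 placeholder themes, standing in for the rows of themes_df.
themes = [f"theme {i}" for i in range(200)]

old_batches = split_into_batches(themes, batch_size=100)  # 0.5.3 default
new_batches = split_into_batches(themes, batch_size=75)   # 0.5.4 default

print([len(b) for b in old_batches])  # [100, 100]
print([len(b) for b in new_batches])  # [75, 75, 50]
```

Under this assumption, a smaller default means more LLM calls per condensation pass, each with a shorter topic list. Callers that pass `batch_size` explicitly are unaffected by the change.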
themefinder/prompts/theme_condensation.txt CHANGED
@@ -1,30 +1,43 @@
  {system_prompt}

- Below is a question and a list of topics extracted from answers to that question. Each topic has a topic_label and a topic_description.
+ Below is a question and a list of topics extracted from answers to that question. Each topic has a topic_label, topic_description, and may have a source_topic_count field indicating how many original topics it represents.

  Your task is to analyze these topics and produce a refined list that:
  1. Identifies and preserves core themes that appear frequently
  2. Combines redundant topics while maintaining nuanced differences
  3. Ensures the final list represents the full spectrum of viewpoints present in the original data
+ 4. Tracks the total number of original topics combined into each new topic

  Guidelines for Topic Analysis:
  - Begin by identifying distinct concept clusters in the topics
  - Consider the context of the question when determining topic relevance
  - Look for complementary perspectives that could enrich understanding of the same core concept
  - Consider the key ideas behind themes when merging, don't simply focus on the words used in the label and description
+ - When combining topics:
+ * For topics without a source_topic_count field, assume count = 1
+ * For topics with source_topic_count, use their existing count
+ * The new topic's count should be the sum of all combined topics' counts

  For each topic in your output:
  1. Choose a clear, representative label that captures the essence of the combined or preserved topic
  2. Write a concise description that incorporates key insights from all constituent topics, this should only be a single sentence
-
- Return at most 30 topics
+ 3. Include the total count of original topics combined by summing the source_topic_counts of merged topics (or 1 for topics without a count)

  The final output should be in the following JSON format:

  {{"responses": [
- {{"topic_label": "{{label for condensed topic 1}}", "topic_description": "{{description for condensed topic 1}}"}},
- {{"topic_label": "{{label for condensed topic 2}}", "topic_description": "{{description for condensed topic 2}}"}},
- {{"topic_label": "{{label for condensed topic 3}}", "topic_description": "{{description for condensed topic 3}}"}},
+ {{"topic_label": "{{label for condensed topic 1}}",
+ "topic_description": "{{description for condensed topic 1}}",
+ "source_topic_count": {{sum of source_topic_counts from combined topics}}
+ }},
+ {{"topic_label": "{{label for condensed topic 2}}",
+ "topic_description": "{{description for condensed topic 2}}",
+ "source_topic_count": {{sum of source_topic_counts from combined topics}}
+ }},
+ {{"topic_label": "{{label for condensed topic 3}}",
+ "topic_description": "{{description for condensed topic 3}}",
+ "source_topic_count": {{sum of source_topic_counts from combined topics}}
+ }},
  // Additional topics as necessary
  ]}}

themefinder/prompts/theme_refinement.txt CHANGED
@@ -53,9 +53,9 @@ You will produce a list of NEUTRAL TOPICS based on the input. Each neutral topic
  Return your output in the following JSON format:
  {{
  "responses": [
- {{"topic_id": "A", "topic": "{{topic label 1}}: {{topic description 1}}"}},
- {{"topic_id": "B", "topic": "{{topic label 2}}: {{topic description 2}}"}},
- {{"topic_id": "C", "topic": "{{topic label 3}}: {{topic description 3}}"}},
+ {{"topic_id": "A", "topic": "{{topic label 1}}: {{topic description 1}}", "source_topic_count": {{count1}}}},
+ {{"topic_id": "B", "topic": "{{topic label 2}}: {{topic description 2}}", "source_topic_count": {{count2}}}},
+ {{"topic_id": "C", "topic": "{{topic label 3}}: {{topic description 3}}", "source_topic_count": {{count3}}}},
  // Additional topics as necessary
  ]
  }}
@@ -64,11 +64,14 @@ Return your output in the following JSON format:
  ## EXAMPLE

  OPINIONATED TOPIC:
- "Economic impact: Many respondents who support the policy believe it will create jobs and boost the economy, it could raise GDP by 2%."
+ "Economic impact: Many respondents who support the policy believe it will create jobs and boost the economy, it could raise GDP by 2%. [source_topic_count: 15]"

  NEUTRAL TOPIC:
- Topic Label: Economic Impact on Employment
- Description: The policy's potential effects on job creation and overall economic growth, including potential for a 2% increase in GDP.
+ {{
+ "topic_id": "A",
+ "topic": "Economic Impact on Employment: The policy's potential effects on job creation and overall economic growth, including potential for a 2% increase in GDP.",
+ "source_topic_count": 15
+ }}

  Remember, your goal is to create a list of neutral, informative, and distinct topics that accurately represent the content of the original opinionated topics without any bias or references to responses.

@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: themefinder
- Version: 0.5.3
+ Version: 0.5.4
  Summary: A topic modelling Python package designed for analysing one-to-many question-answer data eg free-text survey responses.
  License: MIT
  Author: i.AI
@@ -1,15 +1,15 @@
  themefinder/__init__.py,sha256=p6QoCgA-BYWljk8yPOeTgkNcN5m_gA_o3Q86Eh0QjSM,327
- themefinder/core.py,sha256=B6Du59rPsZbBcP8tkKmXQn6h5vvLN_PZIferPnF3LNY,17538
+ themefinder/core.py,sha256=yH68-DtpIv0jX__LnjuBaKJn01hj-VurW3WnFxk0wMQ,17537
  themefinder/llm_batch_processor.py,sha256=SDDeMJeX1J3u7FGFddRhVSxty6U8lFVXwG4eNI_0C5o,12573
  themefinder/prompts/consultation_system_prompt.txt,sha256=_A07oY_an4hnRx-9pQ0y-TLXJz0dd8vDI-MZne7Mdb4,89
  themefinder/prompts/sentiment_analysis.txt,sha256=e3DcUKga6pSFcfeo2TAq8x9LXk0YDV-D7P2gtymcyuc,1832
- themefinder/prompts/theme_condensation.txt,sha256=GFwwQO_oZHhqhPnAfTn887fDzAIVxKoCyj0hXagyBIU,1645
+ themefinder/prompts/theme_condensation.txt,sha256=DB4pqUmMpo0OG4AZWGTj0FfLFfjbX6wOMUr44HBxZ1o,2433
  themefinder/prompts/theme_generation.txt,sha256=JMXuNojxdSAcxPRU1Jg12Xunv_dX4hNvXYU2pXMWTAw,2500
  themefinder/prompts/theme_mapping.txt,sha256=nb_D7gwKGd8BzrAlzSZC3mQIPYaCRXdE6XmoJaJEKZQ,2405
- themefinder/prompts/theme_refinement.txt,sha256=HCgvWAoz-cpFgjX_QS_VVY0X06d4ds0ekBgcoWyFyfg,3360
+ themefinder/prompts/theme_refinement.txt,sha256=_NVHdXBfqCFX2u0R5oZEqWQo70MAjJ5nXQfZ7p_HRAM,3528
  themefinder/prompts/theme_target_alignment.txt,sha256=-_ghr4--KAN6Tz8ExO9s2IXvI6pjWaEA_nG5L83GV5I,1035
  themefinder/themefinder_logging.py,sha256=n5SUQovEZLC4skEbxicjz_fOGF9mOk3S-Wpj5uXsaL8,314
- themefinder-0.5.3.dist-info/LICENCE,sha256=C9ULIN0ctF60ZxUWH_hw1H434bDLg49Z-Qzn6BUHgqs,1060
- themefinder-0.5.3.dist-info/METADATA,sha256=o9rzrhRK-4PMAv9wS8ZrnmTw1rTSYGU8zfPbB31r1DU,6483
- themefinder-0.5.3.dist-info/WHEEL,sha256=XbeZDeTWKc1w7CSIyre5aMDU_-PohRwTQceYnisIYYY,88
- themefinder-0.5.3.dist-info/RECORD,,
+ themefinder-0.5.4.dist-info/LICENCE,sha256=C9ULIN0ctF60ZxUWH_hw1H434bDLg49Z-Qzn6BUHgqs,1060
+ themefinder-0.5.4.dist-info/METADATA,sha256=JKSxdzARGcJ-OJwrd5ScuPzm4Uln2cBQ_SnrxFAhQLQ,6483
+ themefinder-0.5.4.dist-info/WHEEL,sha256=XbeZDeTWKc1w7CSIyre5aMDU_-PohRwTQceYnisIYYY,88
+ themefinder-0.5.4.dist-info/RECORD,,
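Each RECORD row above is a CSV triple of path, hash, and size in bytes. In the wheel format the hash is the URL-safe base64 encoding of the digest with the trailing `=` padding removed. The following sketch shows how such an entry could be parsed and how a matching digest is computed. `parse_record_line` and `record_digest` are illustrative names, not part of any packaging library.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """SHA-256 of data, encoded as in a wheel RECORD (urlsafe base64, no padding)."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_record_line(line: str):
    """Split one RECORD row into (path, algorithm, digest, size); empty fields become None."""
    path, hash_field, size = next(csv.reader([line]))
    algorithm, _, digest = hash_field.partition("=")
    return path, algorithm or None, digest or None, int(size) if size else None


entry = parse_record_line(
    "themefinder/core.py,sha256=yH68-DtpIv0jX__LnjuBaKJn01hj-VurW3WnFxk0wMQ,17537"
)
print(entry)

# The RECORD file lists itself with empty hash and size fields.
print(parse_record_line("themefinder-0.5.4.dist-info/RECORD,,"))
```

To check an unpacked wheel, one would read each listed file as bytes and compare `record_digest(data)` and `len(data)` against the parsed entry.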