sdg-hub 0.6.1__py3-none-any.whl → 0.7.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
sdg_hub/_version.py CHANGED
@@ -28,7 +28,7 @@ version_tuple: VERSION_TUPLE
  commit_id: COMMIT_ID
  __commit_id__: COMMIT_ID

- __version__ = version = '0.6.1'
- __version_tuple__ = version_tuple = (0, 6, 1)
+ __version__ = version = '0.7.0'
+ __version_tuple__ = version_tuple = (0, 7, 0)

  __commit_id__ = commit_id = None
sdg_hub/flows/evaluation/rag/__init__.py ADDED
File without changes (empty module marker)
sdg_hub/flows/evaluation/rag/answer_generation.yaml ADDED
@@ -0,0 +1,21 @@
+ - role: system
+   content: |
+     You are an extractive question-answering system. Your answers must be FULLY GROUNDED in the provided context.
+
+     Strict Rules:
+     1. Use ONLY information explicitly stated in the context
+     2. Do NOT make inferences, assumptions, or add general knowledge
+     3. Do NOT elaborate beyond what the context says
+     4. Quote or paraphrase the context directly
+     5. If information is missing, acknowledge it
+
+ - role: user
+   content: |
+     Context:
+     {{context}}
+
+     Question:
+     {{question}}
+
+     Provide a direct, extractive answer using ONLY the information stated above.
+
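These prompt configs are plain lists of chat messages with Jinja2-style placeholders; inside the flow they are consumed by PromptBuilderBlock via prompt_config_path. As a rough illustration only (not sdg_hub's internal code), a template like the one above could be rendered with PyYAML and Jinja2; the file name and the variable values below are made up for the example:

# Sketch: render a prompt config such as answer_generation.yaml into chat messages.
# Assumes PyYAML and Jinja2 are installed; values are illustrative, not from the package.
import yaml
from jinja2 import Template

with open("answer_generation.yaml", encoding="utf-8") as f:
    messages = yaml.safe_load(f)  # list of {"role": ..., "content": ...}

variables = {
    "context": "The Eiffel Tower is 330 metres tall.",
    "question": "How tall is the Eiffel Tower?",
}

rendered = [
    {"role": m["role"], "content": Template(m["content"]).render(**variables)}
    for m in messages
]
print(rendered[1]["content"])  # the filled-in user message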
sdg_hub/flows/evaluation/rag/conceptual_qa_generation.yaml ADDED
@@ -0,0 +1,25 @@
+ - role: system
+   content: |
+     A "reasoning" question is a question with the following properties:
+     - It is a natural language question.
+     - It requires the reader to think critically and make an inference or draw a conclusion based on the information provided.
+
+     I will provide you with an abstract description of some content, the actual text content, and a specific topic to focus on.
+
+     Your Task:
+     1. Focus on the following topic: {{topic}}
+     2. Think of a "reasoning" question about this topic that can be answered using ONLY the provided text.
+
+     ## Abstract Description of Content
+     {{document_outline}}
+
+     ## Text Content
+     {{document}}
+
+     ## Topic to Focus On
+     {{topic}}
+
+     State the generated question. Do not say anything other than the question.
+ - role: user
+   content: |
+     Generate the question.
sdg_hub/flows/evaluation/rag/context_extraction.yaml ADDED
@@ -0,0 +1,23 @@
+ - role: system
+   content: |
+     You are an expert data annotator. Your task is to extract the EXACT sentences from the provided context that answer the question.
+
+     Rules:
+     1. Extract ONLY sentences that contain information used to answer the question.
+     2. Do NOT modify the sentences - copy them exactly as they appear.
+     3. If multiple sentences are needed, output them one per line.
+     4. If no sentence directly answers the question (based on the provided answer), output "No relevant sentences found."
+     5. Output ONLY the sentences, no other text.
+
+ - role: user
+   content: |
+     Context:
+     {{context}}
+
+     Question:
+     {{question}}
+
+     Answer:
+     {{answer}}
+
+     Relevant Sentences:
sdg_hub/flows/evaluation/rag/flow.yaml ADDED
@@ -0,0 +1,201 @@
+ metadata:
+   name: RAG Evaluation Dataset Flow
+   description: Generates Q&A pairs for RAG evaluation.
+   version: 1.0.0
+   author: "Red Hat AI RAG Contributors"
+   license: "Apache-2.0"
+   recommended_models:
+     default: "openai/gpt-oss-120b"
+     compatible:
+       - "meta-llama/Llama-3.3-70B-Instruct"
+       - "microsoft/phi-4"
+   tags:
+     - rag-evaluation
+     - qa-pairs
+   dataset_requirements:
+     required_columns:
+       - document
+       - document_outline
+     description: Input dataset should contain documents with text content and document outlines.
+   id: loud-dawn-245
+ blocks:
+   - block_type: DuplicateColumnsBlock
+     block_config:
+       block_name: duplicate_to_context
+       input_cols: {document: context}
+
+   - block_type: PromptBuilderBlock
+     block_config:
+       block_name: topic_prompt
+       input_cols: [document]
+       output_cols: topic_messages
+       prompt_config_path: topic_generation.yaml
+
+   - block_type: LLMChatBlock
+     block_config:
+       block_name: gen_topic
+       input_cols: topic_messages
+       output_cols: topic_response
+       async_mode: true
+       n: 1
+       max_tokens: 2048
+       temperature: 0.7
+
+   - block_type: LLMParserBlock
+     block_config:
+       block_name: parse_topic
+       input_cols: topic_response
+       field_prefix: topic_
+       extract_content: true
+
+   - block_type: RenameColumnsBlock
+     block_config:
+       block_name: rename_topic
+       input_cols: {topic_content: topic}
+
+   - block_type: PromptBuilderBlock
+     block_config:
+       block_name: conceptual_prompt
+       input_cols:
+         document: document
+         document_outline: document_outline
+         topic: topic
+       output_cols: conceptual_messages
+       prompt_config_path: conceptual_qa_generation.yaml
+
+   - block_type: LLMChatBlock
+     block_config:
+       block_name: gen_conceptual_question
+       input_cols: conceptual_messages
+       output_cols: question_response
+       async_mode: true
+       n: 1
+       max_tokens: 2048
+       temperature: 0.7
+
+   - block_type: LLMParserBlock
+     block_config:
+       block_name: parse_question
+       input_cols: question_response
+       field_prefix: question_
+       extract_content: true
+
+   - block_type: PromptBuilderBlock
+     block_config:
+       block_name: evolution_prompt
+       input_cols: {question_content: question}
+       output_cols: evolution_messages
+       prompt_config_path: question_evolution.yaml
+
+   - block_type: LLMChatBlock
+     block_config:
+       block_name: evolve_question
+       input_cols: evolution_messages
+       output_cols: evolution_response
+       async_mode: true
+       n: 1
+       max_tokens: 4096
+       temperature: 0.7
+
+   - block_type: LLMParserBlock
+     block_config:
+       block_name: parse_evolved_question
+       input_cols: evolution_response
+       field_prefix: "evolved_"
+       extract_content: true
+
+   - block_type: PromptBuilderBlock
+     block_config:
+       block_name: answer_prompt
+       input_cols:
+         context: context
+         evolved_content: question
+       output_cols: answer_messages
+       prompt_config_path: answer_generation.yaml
+
+   - block_type: LLMChatBlock
+     block_config:
+       block_name: gen_answer
+       input_cols: answer_messages
+       output_cols: answer_response
+       async_mode: true
+       n: 1
+       max_tokens: 4096
+       temperature: 0.2
+
+   - block_type: LLMParserBlock
+     block_config:
+       block_name: parse_answer
+       input_cols: answer_response
+       field_prefix: "answer_"
+       extract_content: true
+
+   - block_type: PromptBuilderBlock
+     block_config:
+       block_name: critic_prompt
+       input_cols:
+         context: context
+         evolved_content: question
+         answer_content: answer
+       output_cols: critic_messages
+       prompt_config_path: groundedness_critic.yaml
+
+   - block_type: LLMChatBlock
+     block_config:
+       block_name: gen_critic_score
+       input_cols: critic_messages
+       output_cols: critic_response
+       async_mode: true
+       n: 1
+       max_tokens: 512
+       temperature: 0.0
+
+   - block_type: LLMParserBlock
+     block_config:
+       block_name: parse_critic_score
+       input_cols: critic_response
+       field_prefix: "critic_"
+       extract_content: true
+
+   - block_type: ColumnValueFilterBlock
+     block_config:
+       block_name: filter_ungrounded
+       input_cols: critic_content
+       filter_value: [4, 5]
+       operation: "eq"
+       convert_dtype: "int"
+
+   - block_type: PromptBuilderBlock
+     block_config:
+       block_name: extraction_prompt
+       input_cols:
+         context: context
+         evolved_content: question
+         answer_content: answer
+       output_cols: extraction_messages
+       prompt_config_path: context_extraction.yaml
+
+   - block_type: LLMChatBlock
+     block_config:
+       block_name: extract_context
+       input_cols: extraction_messages
+       output_cols: extraction_response
+       async_mode: true
+       n: 1
+       max_tokens: 4096
+       temperature: 0.0
+
+   - block_type: LLMParserBlock
+     block_config:
+       block_name: parse_extracted_context
+       input_cols: extraction_response
+       field_prefix: "ground_truth_"
+       extract_content: true
+
+   - block_type: RenameColumnsBlock
+     block_config:
+       block_name: rename_final_columns
+       input_cols:
+         evolved_content: question
+         answer_content: response
+         ground_truth_content: ground_truth_context
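The flow above chains topic generation, question generation, question evolution, answer generation, a groundedness critic, filtering, and ground-truth context extraction. As a reading aid only, the sketch below (assuming PyYAML is available) loads flow.yaml and prints the block pipeline in order; it merely inspects the configuration and does not execute the flow:

# Sketch: list the block pipeline defined in flow.yaml without running it.
import yaml

with open("flow.yaml", encoding="utf-8") as f:
    flow = yaml.safe_load(f)

print(flow["metadata"]["name"])
for i, block in enumerate(flow["blocks"], start=1):
    cfg = block["block_config"]
    print(f"{i:2d}. {block['block_type']:<24s} {cfg['block_name']}")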
sdg_hub/flows/evaluation/rag/groundedness_critic.yaml ADDED
@@ -0,0 +1,24 @@
+ - role: system
+   content: |
+     You are a strict evaluator for a RAG system. Your task is to rate how well the provided Answer is supported by the Context.
+
+     Score 1: The answer is completely unsupported or contradicts the context.
+     Score 2: The answer relies heavily on external knowledge or weak inferences.
+     Score 3: The answer is partially supported but includes some unsupported details.
+     Score 4: The answer is mostly supported, with only minor inferences.
+     Score 5: The answer is fully and explicitly supported by the context.
+
+     Output ONLY the integer score (1, 2, 3, 4, or 5). Do not output any other text.
+
+ - role: user
+   content: |
+     Context:
+     {{context}}
+
+     Question:
+     {{question}}
+
+     Answer:
+     {{answer}}
+
+     Score:
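In flow.yaml, the critic's output feeds the filter_ungrounded block (filter_value: [4, 5], operation "eq", convert_dtype "int"), so only Q&A pairs scored 4 or 5 survive. The snippet below is an illustrative sketch of that filtering logic, not sdg_hub's ColumnValueFilterBlock implementation; the rows are made-up examples:

# Sketch of the filter_ungrounded step's effect: keep rows whose critic score is 4 or 5.
rows = [
    {"question": "Q1", "response": "A1", "critic_content": "5"},
    {"question": "Q2", "response": "A2", "critic_content": "2"},
]

def is_grounded(row, accepted=(4, 5)):
    try:
        return int(row["critic_content"]) in accepted  # convert_dtype: "int"
    except ValueError:
        return False  # non-integer critic output gets dropped

kept = [r for r in rows if is_grounded(r)]
print(len(kept))  # -> 1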
sdg_hub/flows/evaluation/rag/question_evolution.yaml ADDED
@@ -0,0 +1,18 @@
+ - role: system
+   content: |
+     You are an experienced linguistics expert for building testsets for large language model applications.
+
+     Your task is to rewrite the following question in a more indirect and compressed form, following these rules:
+     1. Make the question more indirect
+     2. Make the question shorter
+     3. Use abbreviations if possible
+     4. Keep the core meaning intact so it remains answerable
+
+     Output ONLY the rewritten question with a question mark "?" at the end. Do not provide any other explanation or text.
+
+ - role: user
+   content: |
+     Question to rewrite:
+     {{question}}
+
+     Rewritten Question:
sdg_hub/flows/evaluation/rag/topic_generation.yaml ADDED
@@ -0,0 +1,12 @@
+ - role: system
+   content: |
+     You are a helpful assistant that identifies a specific topic within a text.
+     Output only the topic. Do not include "The topic is" or any other text.
+ - role: user
+   content: |
+     Identify a specific topic in the following text.
+
+     Text:
+     {{document}}
+
+     Topic:
sdg_hub-0.6.1.dist-info/METADATA → sdg_hub-0.7.0.dist-info/METADATA CHANGED
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: sdg_hub
- Version: 0.6.1
+ Version: 0.7.0
  Summary: Synthetic Data Generation
  Author-email: Red Hat AI Innovation <abhandwa@redhat.com>
  License: Apache-2.0
sdg_hub-0.6.1.dist-info/RECORD → sdg_hub-0.7.0.dist-info/RECORD CHANGED
@@ -1,5 +1,5 @@
  sdg_hub/__init__.py,sha256=TlkZT40-70urdcWLqv3kupaJj8s-SVgd2QyvlSFwb4A,510
- sdg_hub/_version.py,sha256=7vNQiXfKffK0nbqts6Xy6-E1b1YOm4EGigvgaHr83o4,704
+ sdg_hub/_version.py,sha256=uLbRjFSUZAgfl7V7O8zKV5Db36k7tz87ZIVq3l2SWs0,704
  sdg_hub/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  sdg_hub/core/__init__.py,sha256=e3BoejbqjYhasf9t__L4qE52lkD9EBjx4o--2kqKdro,460
  sdg_hub/core/blocks/__init__.py,sha256=8Rn1SglH8V3jGmTD_cG-h7qk9ktAab2eaBdyk7RN_hY,865
@@ -37,6 +37,14 @@ sdg_hub/core/utils/logger_config.py,sha256=6_cnsIHtSAdq1iTTZ7Q7nAJ1dmldlxSZ0AB49
  sdg_hub/core/utils/path_resolution.py,sha256=yWof4kGNpQ5dKcrVHg0h9KfOKLZ6ROjdfsLAZsQT5rM,2000
  sdg_hub/core/utils/time_estimator.py,sha256=rM3_R-Ka5DEtvOtlJoA_5pXSyQ6tT6t4h6qh3_5BCZo,12639
  sdg_hub/core/utils/yaml_utils.py,sha256=tShCd-FFkp0xlKnLe7dXsMOR4AvT9d2qRUmu4ZnPSEY,1458
+ sdg_hub/flows/evaluation/rag/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ sdg_hub/flows/evaluation/rag/answer_generation.yaml,sha256=dxsHIPyEs14e9fH6JeEJgnrLIV-nLqXmnynj0XF_4os,624
+ sdg_hub/flows/evaluation/rag/conceptual_qa_generation.yaml,sha256=cvU8P3EUj9-Cr19Y3ASxkxEBh9ll_NYMC3s6-x1Monc,847
+ sdg_hub/flows/evaluation/rag/context_extraction.yaml,sha256=StAAU8yCTzaeGFKieJKFIDRfe21aqk7VIekMH1oEuxA,724
+ sdg_hub/flows/evaluation/rag/flow.yaml,sha256=ZDkCrQaN9WfvwWaMjgfA2qUrTVz7pCw-PiHzOyzXKio,5276
+ sdg_hub/flows/evaluation/rag/groundedness_critic.yaml,sha256=r5zqetGnNvg4UxCuENTzdWhCFbG6TnkY-seDMVRBBko,782
+ sdg_hub/flows/evaluation/rag/question_evolution.yaml,sha256=d3G11dQ3Wkgz0JBNyqTi-6QMGIdODOVcGNw1x9OnTEE,649
+ sdg_hub/flows/evaluation/rag/topic_generation.yaml,sha256=DhY_Wt7NzzjfirYlQQqABrXn73vMQj9W2XLZZEaofKc,303
  sdg_hub/flows/qa_generation/document_grounded_qa/enhanced_multi_summary_qa/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  sdg_hub/flows/qa_generation/document_grounded_qa/enhanced_multi_summary_qa/generate_answers.yaml,sha256=THRT3cY44KGI_69B2wqt2Q89EknnOSE7B4A_jdnxlIU,330
  sdg_hub/flows/qa_generation/document_grounded_qa/enhanced_multi_summary_qa/generate_multiple_qa.yaml,sha256=Cs-yeiXs4yac3dZsurdXBZj-kkwWdK-xBywjvBlgtGI,669
@@ -76,8 +84,8 @@ sdg_hub/flows/text_analysis/structured_insights/extract_entities.yaml,sha256=Q_S
  sdg_hub/flows/text_analysis/structured_insights/extract_keywords.yaml,sha256=_nPPMdHnxag_lYbhYUjGJGo-CvRwWvwdGX7cQhdZ1S0,847
  sdg_hub/flows/text_analysis/structured_insights/flow.yaml,sha256=BBV18SdvuVTAESjwkJ7V1jbb-cSTBvNl3SCycd0oEQ4,4934
  sdg_hub/flows/text_analysis/structured_insights/summarize.yaml,sha256=WXwQak1pF8e1OwnOoI1EHu8QB6iUNW89rfkTdi1Oq54,687
- sdg_hub-0.6.1.dist-info/licenses/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
- sdg_hub-0.6.1.dist-info/METADATA,sha256=JQxLH1YwDrV5D1cAaaRziFFiF17buxN-fnyse5lQVV8,9584
- sdg_hub-0.6.1.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- sdg_hub-0.6.1.dist-info/top_level.txt,sha256=TqI7d-HE1n6zkXFkU0nF3A1Ct0P0pBaqI675uFokhx4,8
- sdg_hub-0.6.1.dist-info/RECORD,,
+ sdg_hub-0.7.0.dist-info/licenses/LICENSE,sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ,11357
+ sdg_hub-0.7.0.dist-info/METADATA,sha256=ABg2y-NjvyUPbMdqyDgrzQhpxdnv4oCwuOarTT86ahI,9584
+ sdg_hub-0.7.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ sdg_hub-0.7.0.dist-info/top_level.txt,sha256=TqI7d-HE1n6zkXFkU0nF3A1Ct0P0pBaqI675uFokhx4,8
+ sdg_hub-0.7.0.dist-info/RECORD,,
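Each RECORD line above follows the standard wheel format "path,sha256=<urlsafe-base64 digest>,<size in bytes>". Purely as an illustration of how those entries are derived (not sdg_hub code), a RECORD-style entry for a locally available file could be recomputed like this:

# Sketch: recompute a wheel RECORD entry ("path,sha256=<digest>,<size>") for one file.
import base64
import hashlib
from pathlib import Path

def record_entry(path: str) -> str:
    data = Path(path).read_bytes()
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=").decode()
    return f"{path},sha256={digest},{len(data)}"

print(record_entry("sdg_hub/_version.py"))  # path must exist in the unpacked wheel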