qtype 0.1.12__py3-none-any.whl → 0.1.13__py3-none-any.whl
- qtype/` +0 -0
- qtype/application/__init__.py +0 -2
- qtype/application/converters/tools_from_api.py +28 -22
- qtype/application/converters/tools_from_module.py +66 -32
- qtype/commands/generate.py +90 -7
- qtype/commands/run.py +116 -44
- qtype/docs/.pages +8 -0
- {docs → qtype/docs}/Concepts/mental-model-and-philosophy.md +1 -1
- qtype/docs/Contributing/.pages +4 -0
- {docs → qtype/docs}/Contributing/index.md +8 -1
- {docs → qtype/docs}/Gallery/dataflow_pipelines.md +3 -2
- {docs → qtype/docs}/Gallery/research_assistant.md +3 -4
- {docs → qtype/docs}/Gallery/simple_chatbot.md +3 -1
- {docs → qtype/docs}/How To/Authentication/configure_aws_authentication.md +2 -2
- {docs → qtype/docs}/How To/Authentication/use_api_key_authentication.md +2 -2
- {docs → qtype/docs}/How To/Command Line Usage/load_multiple_inputs_from_files.md +24 -9
- {docs → qtype/docs}/How To/Command Line Usage/pass_inputs_on_the_cli.md +3 -3
- {docs → qtype/docs}/How To/Command Line Usage/serve_with_auto_reload.md +3 -2
- {docs → qtype/docs}/How To/Data Processing/adjust_concurrency.md +3 -4
- {docs → qtype/docs}/How To/Data Processing/cache_step_results.md +2 -2
- {docs → qtype/docs}/How To/Data Processing/decode_json_xml.md +1 -1
- {docs → qtype/docs}/How To/Data Processing/explode_collections.md +2 -2
- {docs → qtype/docs}/How To/Data Processing/gather_results.md +4 -4
- qtype/docs/How To/Data Processing/invoke_other_flows.md +71 -0
- qtype/docs/How To/Data Processing/load_data_from_athena.md +49 -0
- qtype/docs/How To/Data Processing/read_data_from_files.md +61 -0
- {docs → qtype/docs}/How To/Data Processing/read_sql_databases.md +2 -3
- {docs → qtype/docs}/How To/Data Processing/write_data_to_file.md +1 -2
- {docs → qtype/docs}/How To/Invoke Models/call_large_language_models.md +1 -1
- {docs → qtype/docs}/How To/Invoke Models/create_embeddings.md +1 -1
- {docs → qtype/docs}/How To/Invoke Models/reuse_prompts_with_templates.md +2 -3
- {docs → qtype/docs}/How To/Language Features/include_raw_text_from_other_files.md +2 -1
- {docs → qtype/docs}/How To/Language Features/reference_entities_by_id.md +2 -2
- qtype/docs/How To/Language Features/use_agent_skills.md +29 -0
- {docs → qtype/docs}/How To/Language Features/use_environment_variables.md +2 -1
- qtype/docs/How To/Language Features/use_optional_variables.md +42 -0
- {docs → qtype/docs}/How To/Language Features/use_qtype_mcp.md +4 -4
- {docs → qtype/docs}/How To/Observability & Debugging/trace_calls_with_open_telemetry.md +1 -1
- {docs → qtype/docs}/How To/Observability & Debugging/validate_qtype_yaml.md +3 -2
- {docs → qtype/docs}/How To/Observability & Debugging/visualize_application_architecture.md +1 -1
- {docs → qtype/docs}/How To/Qtype Server/serve_flows_as_apis.md +3 -3
- {docs → qtype/docs}/How To/Qtype Server/serve_flows_as_ui.md +2 -3
- {docs → qtype/docs}/How To/Qtype Server/use_conversational_interfaces.md +1 -4
- {docs → qtype/docs}/How To/Qtype Server/use_variables_with_ui_hints.md +3 -2
- {docs → qtype/docs}/How To/Tools & Integration/bind_tool_inputs_and_outputs.md +1 -2
- {docs → qtype/docs}/How To/Tools & Integration/create_tools_from_openapi_specifications.md +10 -14
- {docs → qtype/docs}/How To/Tools & Integration/create_tools_from_python_modules.md +5 -8
- {docs → qtype/docs}/Reference/cli.md +13 -15
- {docs → qtype/docs}/Reference/plugins.md +4 -0
- {docs → qtype/docs}/Reference/semantic-validation-rules.md +6 -1
- qtype/docs/Tutorials/.pages +1 -0
- {docs → qtype/docs}/Tutorials/01-first-qtype-application.md +3 -2
- {docs → qtype/docs}/Tutorials/02-conversational-chatbot.md +3 -3
- {docs → qtype/docs}/Tutorials/03-structured-data.md +9 -10
- {docs → qtype/docs}/Tutorials/04-tools-and-function-calling.md +12 -19
- {docs → qtype/docs}/components/APITool.md +1 -1
- qtype/docs/components/Aggregate.md +7 -0
- qtype/docs/components/Collect.md +6 -0
- qtype/docs/components/Construct.md +6 -0
- {docs → qtype/docs}/components/DocumentEmbedder.md +0 -1
- {docs → qtype/docs}/components/DocumentSplitter.md +0 -1
- qtype/docs/components/Explode.md +5 -0
- {docs → qtype/docs}/components/FieldExtractor.md +2 -1
- qtype/docs/components/InvokeFlow.md +8 -0
- qtype/docs/components/InvokeTool.md +8 -0
- {docs → qtype/docs}/components/PrimitiveTypeEnum.md +0 -1
- {docs → qtype/docs}/components/Source.md +0 -1
- {docs → qtype/docs}/components/Step.md +0 -1
- {docs → qtype/docs}/components/Tool.md +2 -2
- {docs → qtype/docs}/components/Variable.md +2 -0
- qtype/docs/legacy_how_tos/.pages +6 -0
- qtype/docs/skills/architect/SKILL.md +188 -0
- qtype/docs/skills/architect/references/cheatsheet.md +198 -0
- qtype/docs/skills/architect/references/patterns.md +29 -0
- qtype/docs/stylesheets/extra.css +27 -0
- qtype/dsl/linker.py +8 -0
- qtype/dsl/model.py +177 -84
- qtype/examples/data_processing/athena_query.qtype.yaml +56 -0
- qtype/examples/data_processing/batch_inputs.csv +5 -0
- qtype/examples/data_processing/create_sample_db.py +129 -0
- qtype/examples/data_processing/invoke_other_flows.qtype.yaml +98 -0
- qtype/examples/data_processing/reviews.db +0 -0
- qtype/examples/data_processing/sample_article.txt +1 -0
- qtype/examples/data_processing/sample_documents.jsonl +5 -0
- qtype/examples/language_features/optional_variables.qtype.yaml +32 -0
- qtype/examples/language_features/story_prompt.txt +6 -0
- qtype/examples/legacy/data/customers.csv +6 -0
- qtype/examples/legacy/echo/readme.md +29 -0
- qtype/examples/legacy/qtype_plugin_example.py +51 -0
- qtype/examples/legacy/sample_data.txt +43 -0
- qtype/examples/legacy/vertex/README.md +11 -0
- qtype/examples/research_assistant/tavily.qtype.yaml +216 -0
- {examples → qtype/examples}/tutorials/03_structured_data.qtype.yaml +2 -2
- {examples → qtype/examples}/tutorials/04_tools_and_function_calling.qtype.yaml +5 -5
- qtype/interpreter/base/stream_emitter.py +19 -13
- qtype/interpreter/converters.py +142 -26
- qtype/interpreter/executors/agent_executor.py +2 -3
- qtype/interpreter/executors/aggregate_executor.py +3 -4
- qtype/interpreter/executors/construct_executor.py +15 -15
- qtype/interpreter/executors/doc_to_text_executor.py +1 -3
- qtype/interpreter/executors/field_extractor_executor.py +13 -12
- qtype/interpreter/executors/file_source_executor.py +18 -31
- qtype/interpreter/executors/invoke_embedding_executor.py +1 -4
- qtype/interpreter/executors/invoke_flow_executor.py +2 -2
- qtype/interpreter/executors/invoke_tool_executor.py +19 -18
- qtype/interpreter/executors/llm_inference_executor.py +16 -18
- qtype/interpreter/executors/prompt_template_executor.py +1 -3
- qtype/interpreter/tools/function_tool_helper.py +11 -10
- qtype/interpreter/types.py +89 -4
- qtype/interpreter/typing.py +31 -32
- qtype/mcp/server.py +312 -57
- {schema → qtype/schema}/qtype.schema.json +77 -79
- qtype/semantic/checker.py +19 -0
- qtype/semantic/generate.py +3 -6
- qtype/semantic/model.py +26 -33
- qtype/semantic/resolver.py +7 -0
- qtype/semantic/visualize.py +8 -3
- {qtype-0.1.12.dist-info → qtype-0.1.13.dist-info}/METADATA +47 -46
- qtype-0.1.13.dist-info/RECORD +352 -0
- {qtype-0.1.12.dist-info → qtype-0.1.13.dist-info}/WHEEL +1 -2
- docs/How To/Data Processing/read_data_from_files.md +0 -35
- docs/components/Aggregate.md +0 -8
- docs/components/InvokeFlow.md +0 -8
- docs/components/InvokeTool.md +0 -8
- docs/components/ToolParameter.md +0 -6
- examples/research_assistant/tavily.qtype.yaml +0 -289
- qtype/application/facade.py +0 -177
- qtype-0.1.12.dist-info/RECORD +0 -325
- qtype-0.1.12.dist-info/top_level.txt +0 -1
- {docs → qtype/docs}/Contributing/roadmap.md +0 -0
- {docs → qtype/docs}/Decisions/ADR-001-Chat-vs-Completion-Endpoint-Features.md +0 -0
- {docs → qtype/docs}/Gallery/dataflow_pipelines.mermaid +0 -0
- {docs → qtype/docs}/Gallery/research_assistant.mermaid +0 -0
- {docs → qtype/docs}/Gallery/simple_chatbot.mermaid +0 -0
- {docs → qtype/docs}/How To/Language Features/include_qtype_yaml.md +0 -0
- {docs → qtype/docs}/How To/Observability & Debugging/visualize_example.mermaid +0 -0
- {docs → qtype/docs}/How To/Qtype Server/flow_as_ui.png +0 -0
- {docs → qtype/docs}/Tutorials/example_chat.png +0 -0
- {docs → qtype/docs}/Tutorials/index.md +0 -0
- {docs → qtype/docs}/components/APIKeyAuthProvider.md +0 -0
- {docs → qtype/docs}/components/AWSAuthProvider.md +0 -0
- {docs → qtype/docs}/components/AWSSecretManager.md +0 -0
- {docs → qtype/docs}/components/Agent.md +0 -0
- {docs → qtype/docs}/components/AggregateStats.md +0 -0
- {docs → qtype/docs}/components/Application.md +0 -0
- {docs → qtype/docs}/components/AuthorizationProvider.md +0 -0
- {docs → qtype/docs}/components/AuthorizationProviderList.md +0 -0
- {docs → qtype/docs}/components/BearerTokenAuthProvider.md +0 -0
- {docs → qtype/docs}/components/BedrockReranker.md +0 -0
- {docs → qtype/docs}/components/ChatContent.md +0 -0
- {docs → qtype/docs}/components/ChatMessage.md +0 -0
- {docs → qtype/docs}/components/ConstantPath.md +0 -0
- {docs → qtype/docs}/components/CustomType.md +0 -0
- {docs → qtype/docs}/components/Decoder.md +0 -0
- {docs → qtype/docs}/components/DecoderFormat.md +0 -0
- {docs → qtype/docs}/components/DocToTextConverter.md +0 -0
- {docs → qtype/docs}/components/Document.md +0 -0
- {docs → qtype/docs}/components/DocumentIndex.md +0 -0
- {docs → qtype/docs}/components/DocumentSearch.md +0 -0
- {docs → qtype/docs}/components/DocumentSource.md +0 -0
- {docs → qtype/docs}/components/Echo.md +0 -0
- {docs → qtype/docs}/components/Embedding.md +0 -0
- {docs → qtype/docs}/components/EmbeddingModel.md +0 -0
- {docs → qtype/docs}/components/FileSource.md +0 -0
- {docs → qtype/docs}/components/FileWriter.md +0 -0
- {docs → qtype/docs}/components/Flow.md +0 -0
- {docs → qtype/docs}/components/FlowInterface.md +0 -0
- {docs → qtype/docs}/components/Index.md +0 -0
- {docs → qtype/docs}/components/IndexUpsert.md +0 -0
- {docs → qtype/docs}/components/InvokeEmbedding.md +0 -0
- {docs → qtype/docs}/components/LLMInference.md +0 -0
- {docs → qtype/docs}/components/ListType.md +0 -0
- {docs → qtype/docs}/components/Memory.md +0 -0
- {docs → qtype/docs}/components/MessageRole.md +0 -0
- {docs → qtype/docs}/components/Model.md +0 -0
- {docs → qtype/docs}/components/ModelList.md +0 -0
- {docs → qtype/docs}/components/OAuth2AuthProvider.md +0 -0
- {docs → qtype/docs}/components/PromptTemplate.md +0 -0
- {docs → qtype/docs}/components/PythonFunctionTool.md +0 -0
- {docs → qtype/docs}/components/RAGChunk.md +0 -0
- {docs → qtype/docs}/components/RAGDocument.md +0 -0
- {docs → qtype/docs}/components/RAGSearchResult.md +0 -0
- {docs → qtype/docs}/components/Reranker.md +0 -0
- {docs → qtype/docs}/components/SQLSource.md +0 -0
- {docs → qtype/docs}/components/Search.md +0 -0
- {docs → qtype/docs}/components/SearchResult.md +0 -0
- {docs → qtype/docs}/components/SecretManager.md +0 -0
- {docs → qtype/docs}/components/SecretReference.md +0 -0
- {docs → qtype/docs}/components/TelemetrySink.md +0 -0
- {docs → qtype/docs}/components/ToolList.md +0 -0
- {docs → qtype/docs}/components/TypeList.md +0 -0
- {docs → qtype/docs}/components/VariableList.md +0 -0
- {docs → qtype/docs}/components/VectorIndex.md +0 -0
- {docs → qtype/docs}/components/VectorSearch.md +0 -0
- {docs → qtype/docs}/components/VertexAuthProvider.md +0 -0
- {docs → qtype/docs}/components/Writer.md +0 -0
- {docs → qtype/docs}/example_ui.png +0 -0
- {docs → qtype/docs}/index.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Configuration/modular-yaml.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Configuration/phoenix_projects.png +0 -0
- {docs → qtype/docs}/legacy_how_tos/Configuration/phoenix_traces.png +0 -0
- {docs → qtype/docs}/legacy_how_tos/Configuration/reference-by-id.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Configuration/telemetry-setup.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Data Types/custom-types.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Data Types/domain-types.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Debugging/visualize-apps.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Tools/api-tools.md +0 -0
- {docs → qtype/docs}/legacy_how_tos/Tools/python-tools.md +0 -0
- {examples → qtype/examples}/authentication/aws_authentication.qtype.yaml +0 -0
- {examples → qtype/examples}/conversational_ai/hello_world_chat.qtype.yaml +0 -0
- {examples → qtype/examples}/conversational_ai/simple_chatbot.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/batch_processing.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/cache_step_results.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/collect_results.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/dataflow_pipelines.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/decode_json.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/explode_items.qtype.yaml +0 -0
- {examples → qtype/examples}/data_processing/read_file.qtype.yaml +0 -0
- {examples → qtype/examples}/invoke_models/create_embeddings.qtype.yaml +0 -0
- {examples → qtype/examples}/invoke_models/simple_llm_call.qtype.yaml +0 -0
- {examples → qtype/examples}/language_features/include_raw.qtype.yaml +0 -0
- {examples → qtype/examples}/language_features/ui_hints.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/data_analysis_with_telemetry.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/hello_world.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/hello_world_chat.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/hello_world_chat_with_telemetry.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/hello_world_chat_with_thinking.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/hello_world_completion.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/hello_world_completion_with_auth.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/bedrock/simple_agent_chat.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/chat_with_langfuse.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/data_processor.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/echo/debug_example.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/echo/prompt.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/echo/test.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/echo/video.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/field_extractor_example.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/multi_flow_example.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/openai/hello_world_chat.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/openai/hello_world_chat_with_telemetry.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/rag.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/time_utilities.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/vertex/hello_world_chat.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/vertex/hello_world_completion.qtype.yaml +0 -0
- {examples → qtype/examples}/legacy/vertex/hello_world_completion_with_auth.qtype.yaml +0 -0
- {examples → qtype/examples}/observability_debugging/trace_with_opentelemetry.qtype.yaml +0 -0
- {examples → qtype/examples}/research_assistant/research_assistant.qtype.yaml +0 -0
- {examples → qtype/examples}/research_assistant/tavily.oas.yaml +0 -0
- {examples → qtype/examples}/tutorials/01_hello_world.qtype.yaml +0 -0
- {examples → qtype/examples}/tutorials/02_conversational_chat.qtype.yaml +0 -0
- {qtype-0.1.12.dist-info → qtype-0.1.13.dist-info}/entry_points.txt +0 -0
- {qtype-0.1.12.dist-info → qtype-0.1.13.dist-info}/licenses/LICENSE +0 -0
qtype/commands/run.py
CHANGED
|
@@ -11,11 +11,10 @@ import warnings
|
|
|
11
11
|
from pathlib import Path
|
|
12
12
|
from typing import Any
|
|
13
13
|
|
|
14
|
-
import pandas as pd
|
|
15
14
|
from pydantic.warnings import UnsupportedFieldAttributeWarning
|
|
16
15
|
|
|
17
|
-
from qtype.application.facade import QTypeFacade
|
|
18
16
|
from qtype.base.exceptions import InterpreterError, LoadError, ValidationError
|
|
17
|
+
from qtype.interpreter.converters import read_dataframe_from_file
|
|
19
18
|
|
|
20
19
|
logger = logging.getLogger(__name__)
|
|
21
20
|
|
|
@@ -29,50 +28,124 @@ for name in ["httpx", "urllib3", "qdrant_client", "opensearch"]:
|
|
|
29
28
|
logging.getLogger(name).setLevel(logging.WARNING)
|
|
30
29
|
|
|
31
30
|
|
|
32
|
-
def
|
|
33
|
-
"""
|
|
34
|
-
|
|
31
|
+
def register_telemetry(spec) -> None:
|
|
32
|
+
"""Register telemetry if enabled in the spec."""
|
|
33
|
+
from qtype.interpreter.telemetry import register
|
|
34
|
+
from qtype.semantic.model import Application as SemanticApplication
|
|
35
|
+
|
|
36
|
+
if isinstance(spec, SemanticApplication) and spec.telemetry:
|
|
37
|
+
logger.info(
|
|
38
|
+
f"Telemetry enabled with endpoint: {spec.telemetry.endpoint}"
|
|
39
|
+
)
|
|
40
|
+
secret_mgr = create_secret_manager_for_spec(spec)
|
|
41
|
+
register(spec.telemetry, secret_mgr, spec.id)
|
|
42
|
+
|
|
43
|
+
|
|
44
|
+
def create_secret_manager_for_spec(spec):
|
|
45
|
+
"""Create a secret manager based on the specification."""
|
|
46
|
+
from qtype.interpreter.base.secrets import create_secret_manager
|
|
47
|
+
from qtype.semantic.model import Application as SemanticApplication
|
|
48
|
+
|
|
49
|
+
if isinstance(spec, SemanticApplication):
|
|
50
|
+
return create_secret_manager(spec.secret_manager)
|
|
51
|
+
else:
|
|
52
|
+
raise ValueError(
|
|
53
|
+
"Can't create secret manager for non-Application spec"
|
|
54
|
+
)
|
|
55
|
+
|
|
56
|
+
|
|
57
|
+
async def execute_workflow(
|
|
58
|
+
path: Path,
|
|
59
|
+
inputs: dict | Any,
|
|
60
|
+
flow_name: str | None = None,
|
|
61
|
+
**kwargs: Any,
|
|
62
|
+
) -> Any:
|
|
63
|
+
"""Execute a complete workflow from document to results.
|
|
64
|
+
|
|
65
|
+
Args:
|
|
66
|
+
path: Path to the QType specification file
|
|
67
|
+
inputs: Dictionary of input values or DataFrame for batch
|
|
68
|
+
flow_name: Optional name of flow to execute
|
|
69
|
+
**kwargs: Additional dependencies for execution
|
|
70
|
+
|
|
71
|
+
Returns:
|
|
72
|
+
DataFrame with results (one row per input)
|
|
35
73
|
"""
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
74
|
+
import pandas as pd
|
|
75
|
+
from opentelemetry import trace
|
|
76
|
+
|
|
77
|
+
from qtype.interpreter.base.executor_context import ExecutorContext
|
|
78
|
+
from qtype.interpreter.converters import (
|
|
79
|
+
dataframe_to_flow_messages,
|
|
80
|
+
flow_messages_to_dataframe,
|
|
81
|
+
)
|
|
82
|
+
from qtype.interpreter.flow import run_flow
|
|
83
|
+
from qtype.interpreter.types import Session
|
|
84
|
+
from qtype.semantic.loader import load
|
|
85
|
+
from qtype.semantic.model import Application as SemanticApplication
|
|
86
|
+
|
|
87
|
+
# Load the semantic application
|
|
88
|
+
semantic_model, type_registry = load(path)
|
|
89
|
+
assert isinstance(semantic_model, SemanticApplication)
|
|
90
|
+
register_telemetry(semantic_model)
|
|
91
|
+
|
|
92
|
+
# Find the flow to execute
|
|
93
|
+
if flow_name:
|
|
94
|
+
target_flow = None
|
|
95
|
+
for flow in semantic_model.flows:
|
|
96
|
+
if flow.id == flow_name:
|
|
97
|
+
target_flow = flow
|
|
98
|
+
break
|
|
99
|
+
if target_flow is None:
|
|
100
|
+
raise ValueError(f"Flow '{flow_name}' not found")
|
|
101
|
+
else:
|
|
102
|
+
if semantic_model.flows:
|
|
103
|
+
target_flow = semantic_model.flows[0]
|
|
55
104
|
else:
|
|
56
|
-
raise ValueError(
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
"application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
|
|
66
|
-
"application/vnd.ms-excel",
|
|
67
|
-
]:
|
|
68
|
-
return pd.read_excel(file_path)
|
|
69
|
-
elif mime_type in ["application/vnd.parquet", "application/octet-stream"]:
|
|
70
|
-
return pd.read_parquet(file_path)
|
|
105
|
+
raise ValueError("No flows found in application")
|
|
106
|
+
|
|
107
|
+
logger.info(f"Executing flow {target_flow.id} from {path}")
|
|
108
|
+
|
|
109
|
+
# Convert inputs to DataFrame (normalize single dict to 1-row DataFrame)
|
|
110
|
+
if isinstance(inputs, dict):
|
|
111
|
+
input_df = pd.DataFrame([inputs])
|
|
112
|
+
elif isinstance(inputs, pd.DataFrame):
|
|
113
|
+
input_df = inputs
|
|
71
114
|
else:
|
|
72
115
|
raise ValueError(
|
|
73
|
-
f"
|
|
116
|
+
f"Inputs must be dict or DataFrame, got {type(inputs)}"
|
|
74
117
|
)
|
|
75
118
|
|
|
119
|
+
# Create session
|
|
120
|
+
session = Session(
|
|
121
|
+
session_id=kwargs.pop("session_id", "default"),
|
|
122
|
+
conversation_history=kwargs.pop("conversation_history", []),
|
|
123
|
+
)
|
|
124
|
+
|
|
125
|
+
# Convert DataFrame to FlowMessages with type conversion
|
|
126
|
+
initial_messages_list = dataframe_to_flow_messages(
|
|
127
|
+
input_df, target_flow.inputs, session=session
|
|
128
|
+
)
|
|
129
|
+
|
|
130
|
+
# Execute the flow
|
|
131
|
+
secret_manager = create_secret_manager_for_spec(semantic_model)
|
|
132
|
+
|
|
133
|
+
context = ExecutorContext(
|
|
134
|
+
secret_manager=secret_manager,
|
|
135
|
+
tracer=trace.get_tracer(__name__),
|
|
136
|
+
)
|
|
137
|
+
results = await run_flow(
|
|
138
|
+
target_flow,
|
|
139
|
+
initial_messages_list,
|
|
140
|
+
context=context,
|
|
141
|
+
**kwargs,
|
|
142
|
+
)
|
|
143
|
+
|
|
144
|
+
# Convert results back to DataFrame
|
|
145
|
+
results_df = flow_messages_to_dataframe(results, target_flow)
|
|
146
|
+
|
|
147
|
+
return results_df
|
|
148
|
+
|
|
76
149
|
|
|
77
150
|
def run_flow(args: Any) -> None:
|
|
78
151
|
"""Run a QType YAML spec file by executing its flows.
|
|
@@ -82,7 +155,6 @@ def run_flow(args: Any) -> None:
|
|
|
82
155
|
"""
|
|
83
156
|
import asyncio
|
|
84
157
|
|
|
85
|
-
facade = QTypeFacade()
|
|
86
158
|
spec_path = Path(args.spec)
|
|
87
159
|
|
|
88
160
|
try:
|
|
@@ -90,7 +162,7 @@ def run_flow(args: Any) -> None:
|
|
|
90
162
|
|
|
91
163
|
if args.input_file:
|
|
92
164
|
logger.info(f"Loading input data from file: {args.input_file}")
|
|
93
|
-
input: Any =
|
|
165
|
+
input: Any = read_dataframe_from_file(args.input_file)
|
|
94
166
|
else:
|
|
95
167
|
# Parse input JSON
|
|
96
168
|
try:
|
|
@@ -99,9 +171,9 @@ def run_flow(args: Any) -> None:
|
|
|
99
171
|
logger.error(f"❌ Invalid JSON input: {e}")
|
|
100
172
|
return
|
|
101
173
|
|
|
102
|
-
# Execute the workflow using the
|
|
174
|
+
# Execute the workflow using the standalone function
|
|
103
175
|
result_df = asyncio.run(
|
|
104
|
-
|
|
176
|
+
execute_workflow(
|
|
105
177
|
spec_path,
|
|
106
178
|
flow_name=args.flow,
|
|
107
179
|
inputs=input,
|
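Reviewer note on the `execute_workflow` hunk above: the flow-selection branch (use the flow whose `id` matches `flow_name`, otherwise fall back to the first declared flow) can be read in isolation. The following is a minimal sketch, not the package's code. The `Flow` dataclass and the `select_flow` helper are hypothetical stand-ins introduced only for illustration; the real function works on `semantic_model.flows` inline.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical stand-in for a semantic flow; only `id` is modeled."""

    id: str


def select_flow(flows: list[Flow], flow_name: str | None = None) -> Flow:
    """Return the flow named `flow_name`, or the first flow if no name is given."""
    if flow_name:
        # Explicit name: it must match a declared flow id exactly.
        for flow in flows:
            if flow.id == flow_name:
                return flow
        raise ValueError(f"Flow '{flow_name}' not found")
    if flows:
        # No name given: default to the first declared flow.
        return flows[0]
    raise ValueError("No flows found in application")


flows = [Flow(id="summarize_text"), Flow(id="main")]
print(select_flow(flows).id)
print(select_flow(flows, "main").id)
```

The error messages mirror the two `ValueError` strings in the hunk, so an unknown flow name and an application with no flows remain distinguishable.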
qtype/docs/.pages
ADDED
|
@@ -128,7 +128,7 @@ Output Variables
|
|
|
128
128
|
|
|
129
129
|
**Linear execution:** Steps run sequentially in declaration order. Each step waits for its inputs to be available. Parallelism is supported for multiple inputs.
|
|
130
130
|
|
|
131
|
-
**1-to-many cardinality:** Some steps (like `Explode`) can produce multiple outputs for one input, creating fan-out patterns. Other steps (like `Collect`)
|
|
131
|
+
**1-to-many cardinality:** Some steps (like `Explode`) can produce multiple outputs for one input, creating fan-out patterns. Other steps (like `Collect`) gather many inputs into one collection. This enables batch processing patterns.
|
|
132
132
|
|
|
133
133
|
---
|
|
134
134
|
|
|
@@ -57,6 +57,14 @@ After installation, you should be able to run the `qtype` command from anywhere:
|
|
|
57
57
|
qtype --help
|
|
58
58
|
```
|
|
59
59
|
|
|
60
|
+
## Running the MCP In Dev Mode
|
|
61
|
+
|
|
62
|
+
To start it in `dev` mode with the inspector:
|
|
63
|
+
```
|
|
64
|
+
mcp dev qtype/mcp/server.py:mcp
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
|
|
60
68
|
## Running Tests
|
|
61
69
|
|
|
62
70
|
The project uses pytest for testing with coverage measurement:
|
|
@@ -161,7 +169,6 @@ pre-commit install
|
|
|
161
169
|
|
|
162
170
|
Settings are in `.pre-commit-config.yaml`:
|
|
163
171
|
|
|
164
|
-
|
|
165
172
|
## Project Structure
|
|
166
173
|
|
|
167
174
|
- `qtype/` – Python package for parsing, validating, and interpreting QType specs
|
|
@@ -76,5 +76,6 @@ result_file: results.parquet
|
|
|
76
76
|
|
|
77
77
|
## Learn More
|
|
78
78
|
|
|
79
|
-
- Tutorial:
|
|
80
|
-
-
|
|
79
|
+
- [Tutorial: Your First QType Application](../Tutorials/01-first-qtype-application.md)
|
|
80
|
+
- [Read Data from SQL Databases](../How%20To/Data%20Processing/read_sql_databases.md)
|
|
81
|
+
- [Adjust Concurrency](../How%20To/Data%20Processing/adjust_concurrency.md)
|
|
@@ -92,7 +92,6 @@ When running with the topic "Latest developments in retrieval augmented generati
|
|
|
92
92
|
|
|
93
93
|
## Learn More
|
|
94
94
|
|
|
95
|
-
-
|
|
96
|
-
-
|
|
97
|
-
-
|
|
98
|
-
- How-To: [Call Large Language Models](../How%20To/Invoke%20Models/call_large_language_models.md)
|
|
95
|
+
- [Create Tools from OpenAPI Specifications](../How%20To/Tools%20%26%20Integration/create_tools_from_openapi_specifications.md)
|
|
96
|
+
- [Bind Tool Inputs and Outputs](../How%20To/Tools%20%26%20Integration/bind_tool_inputs_and_outputs.md)
|
|
97
|
+
- [Call Large Language Models](../How%20To/Invoke%20Models/call_large_language_models.md)
|
|
@@ -33,4 +33,6 @@ qtype serve examples/conversational_ai/simple_chatbot.qtype.yaml
|
|
|
33
33
|
|
|
34
34
|
## Learn More
|
|
35
35
|
|
|
36
|
-
- Tutorial:
|
|
36
|
+
- [Tutorial: Conversational Chatbot](../Tutorials/02-conversational-chatbot.md)
|
|
37
|
+
- [Use Conversational Interfaces](../How%20To/Qtype%20Server/use_conversational_interfaces.md)
|
|
38
|
+
- [ChatMessage Reference](../components/ChatMessage.md)
|
|
@@ -55,6 +55,6 @@ models:
|
|
|
55
55
|
## See Also
|
|
56
56
|
|
|
57
57
|
- [AWSAuthProvider Reference](../../components/AWSAuthProvider.md)
|
|
58
|
+
- [Use API Key Authentication](use_api_key_authentication.md)
|
|
59
|
+
- [Call Large Language Models](../Invoke%20Models/call_large_language_models.md)
|
|
58
60
|
- [Model Reference](../../components/Model.md)
|
|
59
|
-
- [How-To: Use API Key Authentication](use_api_key_authentication.md)
|
|
60
|
-
- [How-To: Manage Secrets with Secret Manager](../Authentication/manage_secrets.md)
|
|
@@ -36,5 +36,5 @@ models:
|
|
|
36
36
|
|
|
37
37
|
- [APIKeyAuthProvider Reference](../../components/APIKeyAuthProvider.md)
|
|
38
38
|
- [Use Environment Variables](../Language%20Features/use_environment_variables.md)
|
|
39
|
-
- [
|
|
40
|
-
- [Tutorial: Your First QType Application](../../Tutorials/
|
|
39
|
+
- [Configure AWS Authentication](configure_aws_authentication.md)
|
|
40
|
+
- [Tutorial: Your First QType Application](../../Tutorials/01-first-qtype-application.md)
|
|
@@ -10,8 +10,8 @@ qtype run app.qtype.yaml --input-file inputs.csv
|
|
|
10
10
|
|
|
11
11
|
### Supported File Formats
|
|
12
12
|
|
|
13
|
-
- **CSV**: Columns map to input variable names
|
|
14
|
-
- **JSON**: Array of objects or records format
|
|
13
|
+
- **CSV**: Columns map to input variable names (best for primitive types)
|
|
14
|
+
- **JSON**: Array of objects or records format (best for nested/complex types)
|
|
15
15
|
- **Parquet**: Efficient columnar format for large datasets
|
|
16
16
|
- **Excel**: `.xlsx` or `.xls` files
|
|
17
17
|
|
|
@@ -19,10 +19,25 @@ qtype run app.qtype.yaml --input-file inputs.csv
|
|
|
19
19
|
|
|
20
20
|
When you provide `--input-file`, QType:
|
|
21
21
|
1. Reads the file into a pandas DataFrame
|
|
22
|
-
2.
|
|
23
|
-
3.
|
|
24
|
-
4.
|
|
25
|
-
5.
|
|
22
|
+
2. Automatically converts data to match input variable types
|
|
23
|
+
3. Each row becomes one execution of the flow
|
|
24
|
+
4. Column names must match flow input variable IDs
|
|
25
|
+
5. Processes rows with configured concurrency
|
|
26
|
+
6. Returns results as a DataFrame (can be saved with `--output`)
|
|
27
|
+
|
|
28
|
+
### Type Conversion
|
|
29
|
+
|
|
30
|
+
QType automatically converts file data to match your flow's input types:
|
|
31
|
+
|
|
32
|
+
- **Primitive types** (`int`, `float`, `bool`, `text`): Converted from file values
|
|
33
|
+
- **Custom types**: Validated and instantiated from dict/object columns (use JSON format)
|
|
34
|
+
- **Domain types**: Built-in types like `ChatMessage` or `SearchResult` (use JSON format)
|
|
35
|
+
|
|
36
|
+
**Format Selection Guide:**
|
|
37
|
+
|
|
38
|
+
- Use **CSV** for simple data with primitive types (strings, numbers, booleans)
|
|
39
|
+
- Use **JSON** for complex data with custom types, nested objects, or domain types
|
|
40
|
+
- Use **Parquet** for large datasets with mixed types and efficient storage
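Reviewer note on the "Type Conversion" text above: the idea that CSV columns arrive as strings and are coerced to the declared primitive input types can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not QType's implementation (the diff points to `qtype.interpreter.converters` for that). The `CONVERTERS` table, the `parse_bool` and `rows_from_csv` helpers, and the accepted boolean spellings are all hypothetical.

```python
from __future__ import annotations

import csv
import io


def parse_bool(value: str) -> bool:
    """Parse a boolean cell; the accepted spellings are an assumption."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise ValueError(f"Cannot interpret {value!r} as bool")


# Hypothetical mapping from primitive type names to converters.
CONVERTERS = {"int": int, "float": float, "bool": parse_bool, "text": str}


def rows_from_csv(text: str, input_types: dict[str, str]) -> list[dict]:
    """Read CSV text and coerce each column to its declared primitive type."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for raw in reader:
        # Column names must match the flow's input variable IDs.
        missing = set(input_types) - set(raw)
        if missing:
            raise ValueError(f"Missing input columns: {sorted(missing)}")
        rows.append(
            {name: CONVERTERS[kind](raw[name]) for name, kind in input_types.items()}
        )
    return rows


sample = "topic,count,enabled\nrag,3,true\n"
print(rows_from_csv(sample, {"topic": "text", "count": "int", "enabled": "bool"}))
```

Each returned dict corresponds to one row, which matches the "each row becomes one execution of the flow" behavior described in the list above.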
|
|
26
41
|
|
|
27
42
|
## Complete Example
|
|
28
43
|
|
|
@@ -57,6 +72,6 @@ qtype run batch_processing.qtype.yaml \
|
|
|
57
72
|
|
|
58
73
|
## See Also
|
|
59
74
|
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
- [
|
|
75
|
+
- [Pass Inputs On The CLI](pass_inputs_on_the_cli.md)
|
|
76
|
+
- [Adjust Concurrency](../Data%20Processing/adjust_concurrency.md)
|
|
77
|
+
- [Gallery: Dataflow Pipelines](../../Gallery/dataflow_pipelines.md)
|
|
@@ -47,6 +47,6 @@ flows:
|
|
|
47
47
|
|
|
48
48
|
## See Also
|
|
49
49
|
|
|
50
|
-
- [Load Multiple Inputs from Files](
|
|
51
|
-
- [
|
|
52
|
-
- [
|
|
50
|
+
- [Load Multiple Inputs from Files](load_multiple_inputs_from_files.md)
|
|
51
|
+
- [CLI Reference](../../Reference/cli.md)
|
|
52
|
+
- [Gallery: Dataflow Pipelines](../../Gallery/dataflow_pipelines.md)
|
|
@@ -22,5 +22,6 @@ qtype serve --reload -p 8080 examples/tutorials/01_hello_world.qtype.yaml
|
|
|
22
22
|
|
|
23
23
|
## See Also
|
|
24
24
|
|
|
25
|
-
- [Serve
|
|
26
|
-
- [
|
|
25
|
+
- [Serve Flows as APIs](../Qtype%20Server/serve_flows_as_apis.md)
|
|
26
|
+
- [Serve Flows as UI](../Qtype%20Server/serve_flows_as_ui.md)
|
|
27
|
+
- [CLI Reference](../../Reference/cli.md)
|
|
@@ -35,7 +35,6 @@ The following step types support `concurrency_config`:
|
|
|
35
35
|
|
|
36
36
|
## See Also
|
|
37
37
|
|
|
38
|
-
- [
|
|
39
|
-
- [
|
|
40
|
-
- [
|
|
41
|
-
- [LLM Processing Pipelines](../../Gallery/dataflow_pipelines.md)
|
|
38
|
+
- [Step Reference](../../components/Step.md)
|
|
39
|
+
- [Cache Step Results](cache_step_results.md)
|
|
40
|
+
- [Gallery: Dataflow Pipelines](../../Gallery/dataflow_pipelines.md)
|
|
@@ -66,6 +66,6 @@ qtype run examples/data_processing/cache_step_results.qtype.yaml --progress -i
|
|
|
66
66
|
|
|
67
67
|
## See Also
|
|
68
68
|
|
|
69
|
-
- [
|
|
69
|
+
- [Step Reference](../../components/Step.md)
|
|
70
70
|
- [Adjust Concurrency](adjust_concurrency.md)
|
|
71
|
-
- [Tutorial: Your First QType Application](../../Tutorials/
|
|
71
|
+
- [Tutorial: Your First QType Application](../../Tutorials/01-first-qtype-application.md)
|
|
@@ -21,4 +21,4 @@ Parse string data in JSON or XML format into structured outputs. This is particu
|
|
|
21
21
|
|
|
22
22
|
- [Decoder Reference](../../components/Decoder.md)
|
|
23
23
|
- [CustomType Reference](../../components/CustomType.md)
|
|
24
|
-
- [Tutorial:
|
|
24
|
+
- [Tutorial: Structured Data](../../Tutorials/03-structured-data.md)
|
|
@@ -35,6 +35,6 @@ qtype run examples/data_processing/explode_items.qtype.yaml \
|
|
|
35
35
|
|
|
36
36
|
## See Also
|
|
37
37
|
|
|
38
|
-
- [
|
|
38
|
+
- [Gather Results into a List](gather_results.md)
|
|
39
39
|
- [Explode Reference](../../components/Explode.md)
|
|
40
|
-
- [Adjust Concurrency](
|
|
40
|
+
- [Adjust Concurrency](adjust_concurrency.md)
|
|
@@ -22,8 +22,8 @@ steps:
 
 - **Collect**: Gathers all input values from multiple messages into a single list output
 - **Common ancestors**: Only variables that have the exact same value across ALL input messages are preserved in the output message
-- **Fan-out pattern**: Typically used after `Explode` to reverse the fan-out and
-- **Single output**: Always produces exactly one output message containing the
+- **Fan-out pattern**: Typically used after `Explode` to reverse the fan-out and accumulate results
+- **Single output**: Always produces exactly one output message containing the accumulated list
 
 ### Understanding Common Ancestors
 
@@ -63,6 +63,6 @@ all_processed: ['Processed: Phone', 'Processed: Laptop', 'Processed: Tablet']
 
 ## See Also
 
-- [Explode Collections
+- [Explode Collections](explode_collections.md)
 - [Collect Reference](../../components/Collect.md)
-- [
+- [Aggregate Reference](../../components/Aggregate.md)
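The Collect behaviour described in the two hunks above (gather every value into one list, keep only variables whose value is identical across all input messages) can be sketched in plain Python. This is an illustration of the documented semantics, not QType's implementation: the `collect` function, the dict-based messages, and the `category` and `item` fields are assumptions made for the example. Only the `all_processed` values come from the hunk header above.

```python
from typing import Any


def collect(messages: list[dict[str, Any]], field: str, output: str) -> dict[str, Any]:
    """Gather `field` from every message into a single list stored under `output`.

    Hypothetical helper: messages are modelled as plain dicts.
    """
    if not messages:
        # Assumption: with no inputs, still emit one message with an empty list.
        return {output: []}

    first = messages[0]
    # "Common ancestors": keep a variable only if every message carries it
    # with exactly the same value.
    common = {
        name: value
        for name, value in first.items()
        if name != field and all(name in m and m[name] == value for m in messages)
    }
    # Exactly one output message, holding the accumulated list.
    common[output] = [m[field] for m in messages]
    return common


messages = [
    {"category": "Electronics", "item": "Phone", "processed": "Processed: Phone"},
    {"category": "Electronics", "item": "Laptop", "processed": "Processed: Laptop"},
    {"category": "Electronics", "item": "Tablet", "processed": "Processed: Tablet"},
]
result = collect(messages, field="processed", output="all_processed")
print(result)
```

In this sketch `category` survives because it is identical in all three messages, while `item` is dropped because it differs between them.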
@@ -0,0 +1,71 @@
+# Invoke Other Flows
+
+Reuse flows as composable building blocks by invoking them from other flows with input and output bindings.
+
+### QType YAML
+
+```yaml
+flows:
+  # Define reusable flow
+  - type: Flow
+    id: summarize_text
+    variables:
+      - id: input_text
+        type: text
+      - id: output_summary
+        type: text
+    inputs: [input_text]
+    outputs: [output_summary]
+    steps:
+      - type: LLMInference
+        id: summarizer
+        model: my_model
+        inputs: [input_text]
+        outputs: [output_summary]
+
+  # Main flow invokes the reusable flow
+  - type: Flow
+    id: main
+    variables:
+      - id: article
+        type: text
+      - id: summary
+        type: text
+    inputs: [article]
+    outputs: [summary]
+    steps:
+      - type: InvokeFlow
+        id: get_summary
+        flow: summarize_text # Reference to flow by ID
+        input_bindings:
+          input_text: article # Map flow input to step variable
+        output_bindings:
+          output_summary: summary # Map flow output to step variable
+```
+
+### Explanation
+
+- **InvokeFlow**: Step type that executes another flow with variable mapping
+- **flow**: ID of the flow to invoke (must be defined in the application)
+- **input_bindings**: Maps flow input variables to the invoking step's variables (format: `flow_input_name: step_variable_name`)
+- **output_bindings**: Maps flow output variables to the invoking step's variables (format: `flow_output_name: step_variable_name`)
+- **Reusability**: Flows can be invoked multiple times with different bindings
+
+## Complete Example
+
+```yaml
+--8<-- "../examples/data_processing/invoke_other_flows.qtype.yaml"
+```
+
+**Run it:**
+```bash
+qtype run examples/data_processing/invoke_other_flows.qtype.yaml \
+  --flow main \
+  --input '{"article_text": "Your article text here..."}'
+```
+
+## See Also
+
+- [InvokeFlow Reference](../../components/InvokeFlow.md)
+- [Flow Reference](../../components/Flow.md)
+- [Use Agent Skills](../Language%20Features/use_agent_skills.md)
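The binding rules described in the Invoke Other Flows page above (`flow_input_name: step_variable_name` and `flow_output_name: step_variable_name`) can be sketched in plain Python. This is only a model of the mapping direction, not QType's runtime: `invoke_flow` is a hypothetical helper, flows are modelled as functions over dicts, and `summarize_text` is a stand-in that truncates text instead of calling an LLM.

```python
from typing import Any, Callable

# Assumption: a flow is modelled as a function from named inputs to named outputs.
Flow = Callable[[dict[str, Any]], dict[str, Any]]


def invoke_flow(
    flow: Flow,
    caller_vars: dict[str, Any],
    input_bindings: dict[str, str],
    output_bindings: dict[str, str],
) -> dict[str, Any]:
    """Run `flow`, translating variable names in both directions."""
    # input_bindings: flow input name -> variable name in the invoking flow.
    flow_inputs = {
        flow_name: caller_vars[caller_name]
        for flow_name, caller_name in input_bindings.items()
    }
    flow_outputs = flow(flow_inputs)

    # output_bindings: flow output name -> variable name in the invoking flow.
    updated = dict(caller_vars)
    for flow_name, caller_name in output_bindings.items():
        updated[caller_name] = flow_outputs[flow_name]
    return updated


def summarize_text(inputs: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for the LLMInference step: keep only the first 20 characters.
    return {"output_summary": inputs["input_text"][:20]}


article = "QType flows can call other flows."
result = invoke_flow(
    summarize_text,
    caller_vars={"article": article},
    input_bindings={"input_text": "article"},
    output_bindings={"output_summary": "summary"},
)
print(result["summary"])
```

Because the invoked flow only ever sees its own names (`input_text`, `output_summary`), the same flow can be reused from another caller with a different pair of bindings.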
@@ -0,0 +1,49 @@
+# Load Data from Athena
+
+Query AWS Athena databases using standard SQL with the `SQLSource` step, which supports Athena through SQLAlchemy connection strings and AWS authentication.
+
+### QType YAML
+
+```yaml
+flows:
+  - id: query-athena
+    steps:
+      - type: SQLSource
+        id: load_sales
+        connection: "awsathena+rest://:@athena.us-east-1.amazonaws.com:443/sales_db?s3_staging_dir=s3://my-results-bucket/athena-results/&work_group=primary&catalog_name=some_catalog"
+        query: |
+          SELECT
+            product_id,
+            product_name,
+            total_sales
+          FROM product_sales
+          WHERE total_sales >= :min_sales
+          ORDER BY total_sales DESC
+        inputs:
+          - min_sales
+        outputs:
+          - product_id
+          - product_name
+          - total_sales
+```
+
+### Explanation
+
+- **awsathena+rest**: PyAthena SQLAlchemy dialect for accessing Athena via REST API
+- **Connection string format**: `awsathena+rest://:@athena.{REGION}.amazonaws.com:443/{DATABASE}?s3_staging_dir={S3_PATH}&work_group={WORKGROUP}&catalog_name={CATALOG}`
+- **s3_staging_dir**: S3 location where Athena writes query results (required by Athena)
+- **work_group**: Athena workgroup name (e.g., `primary`)
+- **auth**: Reference to AWSAuthProvider for AWS credentials
+- **query**: Standard SQL query with parameter substitution using `:parameter_name` syntax
+
+## Complete Example
+
+```yaml
+--8<-- "../examples/data_processing/athena_query.qtype.yaml"
+```
+
+## See Also
+
+- [SQLSource Reference](../../components/SQLSource.md)
+- [Configure AWS Authentication](../Authentication/configure_aws_authentication.md)
+- [Read Data from SQL Databases](read_sql_databases.md)
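The connection string format listed in the Load Data from Athena page above is easy to mistype by hand, so a small helper that fills in the template may be useful. The sketch below is an assumption-laden illustration: `athena_connection_string` is a hypothetical function that is not part of QType or PyAthena, and it inserts values verbatim, exactly as the page's example does. Whether values with special characters need URL-encoding is not covered by the page and is not handled here.

```python
def athena_connection_string(
    region: str,
    database: str,
    s3_staging_dir: str,
    work_group: str,
    catalog_name: str,
) -> str:
    """Fill in the documented template (hypothetical helper).

    Credentials are left empty (`://:@`), matching the page's example,
    where AWS credentials are supplied through an auth provider instead.
    """
    return (
        f"awsathena+rest://:@athena.{region}.amazonaws.com:443/{database}"
        f"?s3_staging_dir={s3_staging_dir}"
        f"&work_group={work_group}"
        f"&catalog_name={catalog_name}"
    )


connection = athena_connection_string(
    region="us-east-1",
    database="sales_db",
    s3_staging_dir="s3://my-results-bucket/athena-results/",
    work_group="primary",
    catalog_name="some_catalog",
)
print(connection)
```

With the values from the page's YAML snippet, the helper reproduces the `connection` string shown there.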
@@ -0,0 +1,61 @@
+# Read Data from Files
+
+Load structured data from files using FileSource, which supports CSV, JSON, JSONL, and Parquet formats with automatic format detection and type conversion.
+
+### QType YAML
+
+```yaml
+steps:
+  - id: read_data
+    type: FileSource
+    path: batch_inputs.csv
+    outputs:
+      - query
+      - topic
+```
+
+### Explanation
+
+- **FileSource**: Step that reads structured data from files using fsspec-compatible URIs
+- **path**: File path (relative to YAML file or absolute), supports local files and cloud storage (s3://, gs://, etc.)
+- **outputs**: Column names from the file to extract as variables (must match actual column names)
+- **Format detection**: Automatically determined by file extension (.csv, .json, .jsonl, .parquet)
+- **Type conversion**: Automatically converts data to match variable types (primitives, domain types, custom types)
+- **Streaming**: Emits one FlowMessage per row, enabling downstream steps to process data in parallel
+
+### Automatic Type Conversion
+
+FileSource automatically converts data from files to match your variable types:
+
+- **Primitive types** (`int`, `float`, `bool`, `text`): Direct conversion from file data
+- **Domain types** (`ChatMessage`, `SearchResult`, etc.): Validated from dict/object columns
+- **Custom types**: Your defined types are validated and instantiated from dict/object columns
+
+**Format Recommendations:**
+
+- **CSV**: Best for simple primitive types (strings, numbers, booleans)
+- **JSON/JSONL**: Recommended for nested objects, custom types, and domain types
+- **Parquet**: Best for large datasets with mixed types and efficient storage
+
+**Example with Custom Types (JSON format):**
+
+```json
+[
+  {"person": {"name": "Alice", "age": 30}, "score": 95},
+  {"person": {"name": "Bob", "age": 25}, "score": 87}
+]
+```
+
+JSON preserves nested objects, making it ideal for complex types. CSV stores everything as strings, requiring nested objects to be serialized as JSON strings within the CSV.
+
+## Complete Example
+
+```yaml
+--8<-- "../examples/data_processing/read_file.qtype.yaml"
+```
+
+## See Also
+
+- [FileSource Reference](../../components/FileSource.md)
+- [Load Multiple Inputs from Files](../Command%20Line%20Usage/load_multiple_inputs_from_files.md)
+- [Write Data to File](write_data_to_file.md)
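The extension-based format detection and row-per-message streaming described in the Read Data from Files page above can be sketched with the Python standard library. This is not FileSource's implementation: `detect_format` and `read_rows` are hypothetical names, rows are plain dicts rather than FlowMessage objects, only local paths are opened, no type conversion is attempted, and Parquet is left out because it needs a third-party reader.

```python
import csv
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Iterator

# Extensions named on the page, mapped to a format label.
FORMATS = {".csv": "csv", ".json": "json", ".jsonl": "jsonl", ".parquet": "parquet"}


def detect_format(path: str) -> str:
    """Pick a format from the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"Unsupported file extension: {suffix!r}")
    return FORMATS[suffix]


def read_rows(path: str, outputs: list[str]) -> Iterator[dict[str, Any]]:
    """Yield one dict per row, keeping only the requested output columns."""
    fmt = detect_format(path)
    if fmt == "csv":
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                # CSV values arrive as strings.
                yield {name: row[name] for name in outputs}
    elif fmt == "json":
        with open(path, encoding="utf-8") as f:
            for row in json.load(f):
                yield {name: row[name] for name in outputs}
    elif fmt == "jsonl":
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    row = json.loads(line)
                    yield {name: row[name] for name in outputs}
    else:
        raise NotImplementedError("Parquet is outside this stdlib-only sketch")


with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "batch_inputs.csv")
    with open(csv_path, "w", encoding="utf-8") as f:
        f.write("query,topic,extra\nWhat is QType?,docs,ignored\n")
    csv_rows = list(read_rows(csv_path, ["query", "topic"]))

    json_path = os.path.join(tmp, "people.json")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump([{"person": {"name": "Alice", "age": 30}, "score": 95}], f)
    json_rows = list(read_rows(json_path, ["person", "score"]))

print(csv_rows)
print(json_rows)
```

The JSON row keeps `person` as a nested object, while every CSV value comes back as a string, which mirrors the page's recommendation to prefer JSON or JSONL for nested and custom types.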
@@ -42,6 +42,5 @@ steps:
 ## See Also
 
 - [SQLSource Reference](../../components/SQLSource.md)
-- [
-- [
-- [Example: Dataflow Pipeline](../../Gallery/Data%20Processing/dataflow_pipelines.md)
+- [Load Data from Athena](load_data_from_athena.md)
+- [Read Data from Files](read_data_from_files.md)
@@ -36,5 +36,4 @@ See the [LLM Processing Pipelines](../../Gallery/dataflow_pipelines.md) gallery
 
 - [FileWriter Reference](../../components/FileWriter.md)
 - [Read Data from Files](read_data_from_files.md)
-- [
-- [LLM Processing Pipelines](../../Gallery/dataflow_pipelines.md)
+- [Gallery: Dataflow Pipelines](../../Gallery/dataflow_pipelines.md)
@@ -48,4 +48,4 @@ qtype run simple_llm_call.qtype.yaml --input '{"text": "What is the capital of F
 
 - [LLMInference Reference](../../components/LLMInference.md)
 - [Model Reference](../../components/Model.md)
-- [Tutorial:
+- [Tutorial: Conversational Chatbot](../../Tutorials/02-conversational-chatbot.md)
@@ -46,4 +46,4 @@ qtype run examples/invoke_models/create_embeddings.qtype.yaml \
 
 - [InvokeEmbedding Reference](../../components/InvokeEmbedding.md)
 - [EmbeddingModel Reference](../../components/EmbeddingModel.md)
-- [
+- [Embedding Reference](../../components/Embedding.md)