npm - howone - Versions diffs - 0.1.23 → 0.1.25 - Mend

howone 0.1.23 → 0.1.25

Files changed (24)

package/templates/vite/.howone/skills/howone-sdk/04-ai/03-ai-sdk-handoff.md CHANGED

@@ -1,80 +1,219 @@
 # AI SDK Handoff
-Use this reference after AI capability artifacts have been synced and app code needs to call the
-workflow.
+Use this reference after AI capability artifacts have been synced and app code must call the
+workflow through `@howone/sdk`.
+This file answers: **how does `.howone/ai/manifest.json` become `src/lib/sdk.ts`, and how should the
+UI call it?**
 ## Binding Source
-Generate `src/lib/sdk.ts` from `.howone/ai/manifest.json`. Do not write AI bindings from memory or
-from the original user prompt.
+Generate `src/lib/sdk.ts` from `.howone/ai/manifest.json`. Do not write AI bindings from memory,
+from the original prompt, or from the workflow service response.
-For each manifest capability:
+For each manifest capability/action:
-1. Read `name`.
+1. Read the stable action name/ID.
 2. Read `workflowId`.
 3. Read `inputSchema`.
 4. Read `outputSchema`.
-5. Generate zod input and output schemas from the JSON Schema fields.
+5. Generate Zod input and output schemas.
 6. Bind with `defineAiAction(name, { workflowId, inputSchema, outputSchema })`.
+7. Compose with `withAiActions(client, ai)`.
 `workflowId` is mandatory. Without it, the SDK falls back to the action name as the execution URL
-segment, which is not a workflow UUID.
+segment, and the workflow service will reject it because the segment is not a UUID.
+## Generated Binding Example
+```ts
+import {
+  createClient,
+  defineAiAction,
+  defineAiActions,
+  withAiActions,
+} from '@howone/sdk'
+import { z } from 'zod'
+const client = createClient({
+  projectId: import.meta.env.VITE_HOWONE_PROJECT_ID,
+  env: import.meta.env.VITE_HOWONE_ENV,
+})
+export const summarizeDocumentInputSchema = z.object({
+  document_url: z.string().url(),
+  summary_length: z.string().optional(),
+})
+export type SummarizeDocumentInput = z.infer<typeof summarizeDocumentInputSchema>
+export const summarizeDocumentOutputSchema = z.object({
+  summary: z.string(),
+})
+export type SummarizeDocumentOutput = z.infer<typeof summarizeDocumentOutputSchema>
+export const ai = defineAiActions({
+  summarizeDocument: defineAiAction('summarizeDocument', {
+    workflowId: '550e8400-e29b-41d4-a716-446655440000',
+    inputSchema: summarizeDocumentInputSchema,
+    outputSchema: summarizeDocumentOutputSchema,
+  }),
+})
+const howone = withAiActions(client, ai)
+export default howone
+```
+## JSON Schema To Zod
+| JSON Schema | Zod |
+|---|---|
+| `string` | `z.string()` |
+| `string` + `format: "uri"` | `z.string().url()` |
+| `number` | `z.number()` |
+| `integer` | `z.number().int()` |
+| `boolean` | `z.boolean()` |
+| `array` of strings | `z.array(z.string())` |
+| `array` of objects | `z.array(z.object({ ... }))` |
+| `object` | `z.object({ ... })` |
+| string enum | `z.enum([...])` |
+| field not in `required` | `.optional()` |
+| nullable | `.nullable()` |
+Rules:
+- Required manifest fields must stay required in Zod.
+- Do not add `.passthrough()` to hide execution envelope problems.
+- Do not make outputs optional to silence validation failures.
+- If the workflow returns a different shape, fix the workflow/capability contract.
+## Calling Actions
+For typed one-shot actions:
+```ts
+const output = await howone.ai.summarizeDocument.run({
+  document_url,
+  summary_length: 'short',
+})
+setSummary(output.summary)
+```
+When `outputSchema` exists, `.run()` returns the validated `finalResult` payload directly.
-## Output Handling
+Do not read:
-For a typed action with `outputSchema`, `.run()` returns the validated workflow output payload.
-The SDK unwraps the EAX execution envelope and validates `finalResult` internally.
+```ts
+result.finalResult.summary
+result.data.summary
+result.raw.finalResult
+```
+Those are execution-envelope paths, not the typed SDK action contract.
+## Streaming And Events
-Use this pattern:
+Use `.stream()` when the UI needs live output or cancellation:
 ```ts
-const output = await howone.ai.generateSummary.run(input)
-// output is GenerateSummaryOutput when outputSchema is configured.
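+// Start the streaming run; the callbacks push live updates into app-owned state.
+const session = howone.ai.generateStory.stream(input, {
+  onStreamChunk: (chunk) => setDraft((prev) => prev + chunk),
+  onProgress: (progress) => setProgress(progress),
+  onError: (error) => setError(error.message),
+  onComplete: (result) => setRawResult(result),
+})
+// Let the user abort the run from the UI.
+cancelButton.onclick = () => session.cancel()
+// Await the final result once streaming finishes.
+const final = await session.result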
 ```
-Do not read `raw.finalResult`, `raw.result`, or `raw.data.result` from a typed action result. Those
-paths are execution/SSE internals. App code should use the value returned by `.run()` directly.
+Use `.events()` when code wants an async iterable:
-Do not make every output field `.optional()` or add `.passthrough()` to hide validation errors.
-Required fields in the manifest must stay required in Zod. If validation fails, inspect
-`AiSchemaValidationError.issues` and fix the capability contract or workflow output mapping.
+```ts
+for await (const event of howone.ai.generateStory.events(input)) {
+  if (event.type === 'stream_content') {
+    appendText(String(event.data?.delta ?? ''))
+  }
+}
+```
 ## UI State
-AI calls should run in event handlers, effects, or explicit async actions. Never call
-`howone.ai.*.run()` inside JSX render.
+The SDK returns data and exposes callbacks. The app owns all visible UI.
-Recommended UI states:
+Recommended states:
-- idle
-- running
-- succeeded
-- failed
-- cancelled when using streaming
+```ts
+type AiUiState<T> =
+  | { status: 'idle' }
+  | { status: 'running'; progress?: number }
+  | { status: 'succeeded'; output: T }
+  | { status: 'failed'; message: string }
+  | { status: 'cancelled' }
+```
-For streaming, keep the `AiSession` in a ref and call `cancel()` from the UI.
+Do not add or import SDK toast APIs. Do not show SDK-owned overlays.
 ## Persistence Handoff
-If the app stores generated output:
+If AI output should survive refresh, use entity persistence after the action returns.
-1. Run the AI workflow.
-2. Use the typed output returned by `.run()`.
-3. Write the resulting data through `howone.entities.*`.
+For history-style products, prefer `runAiActionAndPersist()`:
-Do not ask the workflow to write to the database. Do not pass owner fields for authenticated
-user-owned entities; HowOne derives ownership from the JWT.
+```ts
+const result = await runAiActionAndPersist({
+  entity: howone.entities.Generation,
+  input: { prompt },
+  createPending: (input) => ({
+    prompt: input.prompt,
+    status: 'pending',
+    requestedAt: new Date().toISOString(),
+  }),
+  run: (input) => howone.ai.generateImage.run(input),
+  mapCompleted: ({ output }) => ({
+    status: 'completed',
+    resultUrl: output.generated_image_url,
+    completedAt: new Date().toISOString(),
+  }),
+  mapFailed: ({ error }) => ({
+    status: 'failed',
+    errorMessage: error instanceof Error ? error.message : 'Generation failed',
+  }),
+})
+```
+For simple save-after-success:
+```ts
+const output = await howone.ai.summarizeDocument.run(input)
+await howone.entities.DocumentSummary.create({
+  documentUrl: input.document_url,
+  summary: output.summary,
+  status: 'completed',
+})
+```
+Do not ask the workflow to write records. Do not pass owner fields for authenticated user-owned
+entities.
 ## Workflow Edit Handoff
-When editing an external workflow implementation later, use `external-ai-capability` with:
+When changing external workflow behavior later:
+1. If the schema changes, update the AI capability contract first and sync the manifest.
+2. Use the `workflowConfigID` from a completed status result.
+3. Submit the update with `capabilityName`, `workflowConfigID`, and `updatePrompt`.
+4. Preserve the new status result, and the new config ID if it changes.
+5. Regenerate the SDK bindings only if the manifest contract changed.
-- `mode: "update"`
-- `capabilityName`
-- `workflowConfigID`
-- `updatePrompt`
+`workflowConfigID` is not `workflowId`.
-`workflowConfigID` is not `workflowId`. It comes from a confirmed workflow status result:
-`payload.workflow_details.new_workflow_config_id`.
+## Handoff Checklist
-If the schema changed, update and sync the capability contract before submitting the workflow edit.
+- `.howone/ai/manifest.json` exists and is current.
+- Each action has `workflowId`.
+- Zod input/output schemas match manifest required fields.
+- `defineAiAction` uses action name + exact workflow UUID.
+- UI uses returned typed output, not raw execution envelope.
+- Streaming session is cancellable when UI exposes cancel.
+- Persistence goes through `howone.entities.*`.
+- Visible status/error UI is app-owned.
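The `workflowId` items in the handoff checklist can be checked mechanically before `src/lib/sdk.ts` is generated. The following is a minimal sketch, not the SDK's own validation: the `ManifestAction` shape and the `checkManifestActions` helper are hypothetical, and the real layout of `.howone/ai/manifest.json` may differ.

```typescript
// Hypothetical manifest entry shape; the real manifest.json layout may differ.
interface ManifestAction {
  name: string
  workflowId?: string
  inputSchema?: { required?: string[] }
  outputSchema?: { required?: string[] }
}

// Canonical 8-4-4-4-12 hexadecimal UUID layout.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

// Returns one readable problem per failed check; an empty array means the
// checked items pass for every action.
function checkManifestActions(actions: ManifestAction[]): string[] {
  const problems: string[] = []
  for (const action of actions) {
    if (!action.workflowId) {
      problems.push(`${action.name}: missing workflowId`)
    } else if (!UUID_PATTERN.test(action.workflowId)) {
      problems.push(`${action.name}: workflowId is not a UUID`)
    }
    if (!action.inputSchema) problems.push(`${action.name}: missing inputSchema`)
    if (!action.outputSchema) problems.push(`${action.name}: missing outputSchema`)
  }
  return problems
}
```

Running a check like this before code generation surfaces a missing or malformed `workflowId` early, rather than as a rejected request at execution time.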

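Because the handoff notes make the app, not the SDK, the owner of visible status, one way to drive the `AiUiState` union is a small reducer fed by the SDK callbacks. This is a sketch under assumptions: the `AiUiEvent` names and `reduceAiUiState` are hypothetical app-side helpers, not part of `@howone/sdk`.

```typescript
// Mirrors the AiUiState union from the handoff notes.
type AiUiState<T> =
  | { status: 'idle' }
  | { status: 'running'; progress?: number }
  | { status: 'succeeded'; output: T }
  | { status: 'failed'; message: string }
  | { status: 'cancelled' }

// Hypothetical app-defined events; the SDK does not emit these names.
type AiUiEvent<T> =
  | { type: 'start' }
  | { type: 'progress'; value: number }
  | { type: 'success'; output: T }
  | { type: 'failure'; message: string }
  | { type: 'cancel' }

// Pure transition function: events that arrive after a run has ended
// (for example a late chunk after cancel) leave the state unchanged.
function reduceAiUiState<T>(state: AiUiState<T>, event: AiUiEvent<T>): AiUiState<T> {
  switch (event.type) {
    case 'start':
      return { status: 'running' }
    case 'progress':
      return state.status === 'running' ? { status: 'running', progress: event.value } : state
    case 'success':
      return state.status === 'running' ? { status: 'succeeded', output: event.output } : state
    case 'failure':
      return state.status === 'running' ? { status: 'failed', message: event.message } : state
    case 'cancel':
      return state.status === 'running' ? { status: 'cancelled' } : state
  }
}
```

In a React app this could back `useReducer`, with `onProgress`, `onError`, and the cancel button each dispatching one event.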
package/templates/vite/.howone/skills/howone-sdk/04-ai/04-service-capability-catalog.md ADDED

@@ -0,0 +1,281 @@
+# Service Capability Catalog
+Use this reference before designing an AI workflow. It tells the agent what the current workflow
+service can actually do and what input/output shapes are expected.
+Source: `docs/ai-capability.md`.
+## Quick Selection Table
+| User asks for | Use capability family | Typical inputs | Typical outputs |
+|---|---|---|---|
+| Latest info, research, source-backed answer | Web search / crawling | `topic`, `prompt`, `url`, `search_level` | `answer`, `sources`, `page_content` |
+| Generate artwork/photo/logo/mockup | Image generation | `image_description`, `style_preference`, optional references | `generated_image_url` |
+| Edit an image | Image editing | `source_image_url`, `edit_instruction` | `edited_image_url` |
+| OCR or visual analysis | Image analysis / OCR | `image_urls`, `analysis_prompt` | `analysis_result` or `extracted_text` |
+| Generate short video | Video generation | `video_prompt`, aspect/duration/frame URLs | `video_url` |
+| Join clips / extract frames | Video editing | `video_urls` or `source_video_url` | `merged_video_url` or `frame_image_url` |
+| Text to speech | Audio generation | `text_to_generate`, `language`, `audio_hint` | `audio_url` |
+| Speech to text | Audio recognition | `source_audio_url`, `language` | `transcript_text`, optional `utterances` |
+| Merge audio | Audio merging | `audio_urls` | `merged_audio_url` |
+| Stock/index history | Financial data retrieval | `trading_symbol`, `unit`, `start`, `end` | `price_history` |
+| Literature search/citations | Academic research | `query` | `papers`, `bibtex` |
+| Save generated file | File storage | `file_type`, `content` | `file_url` |
+If the requested behavior is not in this table or the detailed sections below, do not invent it.
+## Web Search And Crawling
+Use for latest information, news, market context, source-backed answers, web page extraction.
+Inputs:
+- `prompt`: query or detailed research instruction;
+- `search_level`: `low`, `medium`, or `high`; default to `medium`;
+- `offset`: pagination offset for `low`-level search;
+- page crawl input should be a URL.
+Outputs:
+- synthesized answer or raw search result;
+- `sources` array of URLs;
+- crawled page text/markdown when crawling.
+Rules:
+- Use web search when the user asks for current/latest information.
+- Use page crawling when the product needs content from a specific URL.
+- Do not use search as an outbound API caller.
+- Include source links in output when the product promises research.
+## Image Generation
+Use for new images from prompts or prompt + reference URLs.
+Inputs:
+- `image_description`: detailed prompt;
+- `style_preference`: optional;
+- `reference_image_urls`: optional URL array;
+- size/format options only when product exposes them.
+Outputs:
+- `generated_image_url` or `image_urls`;
+- avoid metadata unless product needs it.
+Rules:
+- One image per request is usually more reliable.
+- Do not put resolution text into the prompt when a size parameter exists.
+- Reference images must be URLs and should be described by position/content.
+- Subject to moderation; do not promise forbidden content.
+## Image Editing
+Use for modifying existing images.
+Inputs:
+- `source_image_url` or `source_image_urls`;
+- `edit_instruction`;
+- optional output size/format.
+Outputs:
+- `edited_image_url`.
+Supported edits include resize/crop/rotate, background removal/replacement, object removal/addition,
+style transfer, enhancement, merge/composite, lighting/color changes.
+Rules:
+- At least one image URL is required.
+- Keep edit instructions focused.
+- For complex multi-step edits, describe the final desired result.
+## Image Analysis And OCR
+Use for visual understanding, image comparison, text extraction, quality review.
+Inputs:
+- `image_urls`;
+- `analysis_prompt` or `ocr_instruction`.
+Outputs:
+- `analysis_result` for semantic analysis;
+- `extracted_text` for OCR.
+Rules:
+- Ask for the exact information needed.
+- Do not include confidence/bounding boxes unless user asks.
+- OCR quality depends on image quality.
+## Video Generation
+Use for short video clips from text or image frames.
+Inputs:
+- `video_prompt`;
+- optional `first_frame_url`, `last_frame_url`, `reference_image_urls`;
+- optional `aspect_ratio`, `duration`, `negative_prompt`, `generate_audio`.
+Outputs:
+- `video_url`.
+Rules:
+- Keep individual clips short, generally 5-10 seconds.
+- For consistency, generate/use a first-frame image.
+- For longer videos, generate clips and concatenate via video editing.
+- Audio in video works best with one speaker per clip.
+## Video Editing
+Use for concatenating clips or extracting first/last frames.
+Inputs:
+- concatenate: `video_urls` with at least two URLs;
+- frame extraction: `source_video_url`.
+Outputs:
+- `merged_video_url` or `frame_image_url`.
+Rules:
+- Inputs must be accessible URLs.
+- Best results when clips share resolution/aspect ratio.
+## Audio Generation
+Use for text-to-speech.
+Inputs:
+- `text_to_generate`;
+- `language` or `languages`;
+- `gender`;
+- `audio_hint`;
+- optional output format/name.
+Outputs:
+- `audio_url`.
+Rules:
+- Single speaker per call.
+- For dialogue, generate each speaker line and merge audio.
+- `audio_hint` should describe voice in English.
+## Audio Recognition
+Use for speech-to-text.
+Inputs:
+- `source_audio_url`;
+- optional `language`;
+- optional speaker diarization setting.
+Outputs:
+- `transcript_text`;
+- optional `utterances` when speaker info is requested.
+Rules:
+- Audio must be URL-accessible.
+- Silent or low-quality audio can produce empty/poor transcript.
+## Financial Data Retrieval
+Use for historical stock/index price data.
+Inputs:
+- `trading_symbol`;
+- `unit`: `daily` or `minute`;
+- `start`;
+- `end`.
+Outputs:
+- `price_history` array;
+- `trading_symbol`;
+- optional warning.
+Rules:
+- Historical data only; no real-time streaming.
+- Indices usually support daily data only.
+- Ask for exact tickers when possible.
+- Does not provide fundamentals, earnings, or live news unless combined with web search.
+## Academic Research
+Use for literature search, paper metadata, BibTeX.
+Inputs:
+- `query`.
+Outputs:
+- `papers`;
+- `bibtex` when citation output is requested.
+Rules:
+- Search quality depends on query specificity.
+- Availability varies by academic source.
+- PDF assets should be handled as URLs.
+## File Storage
+Use when workflow needs to save generated content into a file.
+Inputs:
+- `file_type`: `json`, `yaml`, `csv`, `pdf`, `md`, or `txt`;
+- `content`: string content to save.
+Outputs:
+- `file_url`.
+Rules:
+- Structured content must be serialized to string before saving.
+- Do not use file storage as a database.
+- If app needs records/history, persist file URL through entities.
+## Composition Patterns
+| Pattern | Workflow design |
+|---|---|
+| Image -> Video | generate first-frame image, pass as `first_frame_url` to video generation |
+| Multi-clip video | generate short clips, concatenate via video editing |
+| Dialogue audio | generate each speaker line, merge audio |
+| Search -> Report | web search/crawl, synthesize structured report, optionally save file |
+| Video -> Image edit -> Video | extract frame, edit frame, use as next reference |
+| RAG document chat | indexing workflow + query workflow |
+## Capability Rejection Checklist
+Stop or narrow scope if user requires:
+- real-time streaming market data;
+- arbitrary external API calls not listed;
+- raw file bytes/base64 in workflow;
+- database CRUD inside workflow;
+- unsupported provider-specific model guarantees;
+- content disallowed by moderation;
+- long video generation in one call beyond service limits.
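The last rejection item and the multi-clip composition pattern together imply that a long video request should be planned as several short clips before any workflow is designed. A minimal sketch of that planning step follows; `planClipDurations` is a hypothetical helper, and the 10-second ceiling is an assumption taken from the "generally 5-10 seconds" guidance, not a documented service limit.

```typescript
// Assumed per-clip ceiling in seconds, based on the "5-10 seconds" guidance.
const MAX_CLIP_SECONDS = 10

// Splits a target duration (whole seconds) into near-equal clip lengths,
// none longer than the ceiling. The clips are meant to be generated one by
// one and then concatenated via video editing.
function planClipDurations(totalSeconds: number, maxClipSeconds: number = MAX_CLIP_SECONDS): number[] {
  if (!Number.isInteger(totalSeconds) || totalSeconds <= 0) {
    throw new RangeError('totalSeconds must be a positive integer')
  }
  if (!Number.isInteger(maxClipSeconds) || maxClipSeconds <= 0) {
    throw new RangeError('maxClipSeconds must be a positive integer')
  }
  const clipCount = Math.ceil(totalSeconds / maxClipSeconds)
  const base = Math.floor(totalSeconds / clipCount)
  const extra = totalSeconds % clipCount
  // The first `extra` clips take one additional second so the sum is exact.
  return Array.from({ length: clipCount }, (_, index) => base + (index < extra ? 1 : 0))
}
```

For example, a 25-second request becomes three clips of 9, 8, and 8 seconds. Short totals can still yield clips under 5 seconds, so a product may prefer to enforce its own minimum.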