@sambitcreate/parsely-cli 2.0.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2024
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,151 @@
1
+ # Parsely CLI
2
+
3
+ A smart, interactive recipe scraper for the terminal. Parsely extracts structured recipe data (ingredients, instructions, cook times) from any recipe URL using headless Chrome with an intelligent AI fallback.
4
+
5
+ Built with [Ink](https://github.com/vadimdemedes/ink) (React for CLIs) for a rich, responsive terminal UI.
6
+
7
+ ## Features
8
+
9
+ - **Interactive TUI** — Full terminal interface built with Ink and React, featuring bordered panels, spinners, and color-coded output.
10
+ - **Browser Scraping** — Headless Chrome via Puppeteer extracts Schema.org JSON-LD data from recipe pages, handling JavaScript-rendered content.
11
+ - **AI Fallback** — Automatically switches to OpenAI `gpt-4o-mini` when browser scraping can't find recipe data.
12
+ - **Structured Output** — Displays prep time, cook time, total time, ingredients, and step-by-step instructions in a clean card layout.
13
+ - **Keyboard-Driven** — Context-aware keybind hints in the footer; press `n` for a new recipe or `q` to quit.
14
+
15
+ ## Preview
16
+
17
+ ![Parsely CLI Screenshot](screenshot.png)
18
+
19
+ ## Project Structure
20
+
21
+ ```
22
+ parsely-cli/
23
+ ├── src/
24
+ │ ├── cli.tsx # Entry point — arg parsing, renders <App>
25
+ │ ├── app.tsx # Root component — state machine (idle → scraping → display)
26
+ │ ├── theme.ts # Color palette and symbols
27
+ │ ├── components/
28
+ │ │ ├── Banner.tsx # ASCII art header
29
+ │ │ ├── URLInput.tsx # URL text input with validation
30
+ │ │ ├── RecipeCard.tsx # Recipe display card (times, ingredients, instructions)
31
+ │ │ ├── ScrapingStatus.tsx # Spinner with phase updates
32
+ │ │ ├── Footer.tsx # Context-aware keybind hints
33
+ │ │ ├── Welcome.tsx # Welcome message
34
+ │ │ └── ErrorDisplay.tsx # Error panel
35
+ │ ├── services/
36
+ │ │ └── scraper.ts # Puppeteer + OpenAI scraping logic
37
+ │ └── utils/
38
+ │ └── helpers.ts # ISO duration parser, config, URL validation
39
+ ├── package.json
40
+ ├── tsconfig.json
41
+ ├── run.sh # Quick-start launcher script
42
+ ├── .env.local # Your OpenAI API key (create this)
43
+ ├── CLAUDE.md # AI assistant context
44
+ ├── CODE_OF_CONDUCT.md
45
+ └── LICENSE
46
+ ```
47
+
48
+ ## Setup
49
+
50
+ ### Prerequisites
51
+
52
+ - **Node.js** v18 or later
53
+ - **npm** v9 or later
54
+
55
+ ### Installation
56
+
57
+ 1. **Clone the repository:**
58
+
59
+ ```bash
60
+ git clone <your-repository-url>
61
+ cd parsely-cli
62
+ ```
63
+
64
+ 2. **Install dependencies:**
65
+
66
+ ```bash
67
+ npm install
68
+ ```
69
+
70
+ Uses `puppeteer-core` — no Chromium download. The CLI auto-detects system Chrome/Chromium. If none is found, browser scraping is skipped and the AI fallback is used.
71
+
72
+ 3. **Configure AI fallback (optional but recommended):**
73
+
74
+ Create a `.env.local` file in the project root:
75
+
76
+ ```
77
+ OPENAI_API_KEY="your_openai_api_key_here"
78
+ ```
79
+
80
+ Without this, the AI fallback will not function — browser scraping will still work for most recipe sites.
81
+
82
+ ## Usage
83
+
84
+ ### Quick Start
85
+
86
+ ```bash
87
+ ./run.sh
88
+ ```
89
+
90
+ The launcher script installs dependencies automatically on first run, then starts the TUI.
91
+
92
+ ### With a URL Argument
93
+
94
+ ```bash
95
+ npm start -- https://www.simplyrecipes.com/recipes/perfect_guacamole/
96
+ ```
97
+
98
+ Or via the run script:
99
+
100
+ ```bash
101
+ ./run.sh https://www.simplyrecipes.com/recipes/perfect_guacamole/
102
+ ```
103
+
104
+ ### Interactive Mode
105
+
106
+ Run without arguments and enter a URL when prompted:
107
+
108
+ ```bash
109
+ npm start
110
+ ```
111
+
112
+ ### Keyboard Shortcuts
113
+
114
+ | Key | Context | Action |
115
+ | -------- | -------- | ----------------- |
116
+ | `Enter` | Input | Submit URL |
117
+ | `n` | Display | Scrape new recipe |
118
+ | `q` | Display | Quit |
119
+ | `Ctrl+C` | Anywhere | Exit |
120
+
121
+ ## How It Works
122
+
123
+ 1. **Browser Scraping** — Puppeteer launches headless Chrome, navigates to the URL, and extracts `<script type="application/ld+json">` blocks. The first Schema.org `Recipe` object found is parsed and displayed.
124
+
125
+ 2. **AI Fallback** — If the browser fails or no JSON-LD recipe is found, the URL is sent to OpenAI's `gpt-4o-mini` with a structured extraction prompt. The model returns recipe data as JSON.
126
+
127
+ 3. **Display** — Recipe data is rendered in a bordered card with color-coded sections for times, ingredients, and instructions.
128
+
129
+ ## Architecture
130
+
131
+ The TUI is built with **Ink** (React for the terminal) following patterns inspired by [OpenCode](https://github.com/anomalyco/opencode):
132
+
133
+ - **Component-based architecture** — Each UI element is an isolated React component.
134
+ - **State machine** — The app cycles through phases: `idle` → `scraping` → `display` (or `error`).
135
+ - **Theme system** — Centralized color palette in `theme.ts` for consistent styling.
136
+ - **Context-aware footer** — Keybind hints update based on the current phase.
137
+ - **Callback-driven progress** — The scraper reports phase changes to the TUI via callbacks so the spinner updates in real time.
138
+
139
+ ## Troubleshooting
140
+
141
+ - **`Error: OpenAI API key not found`** — Create a `.env.local` file with your API key. The AI fallback requires this, but browser scraping works without it.
142
+ - **Browser scraping skipped** — The CLI auto-detects system Chrome/Chromium. If none is found, it skips browser scraping and uses the AI fallback. Install Chrome or Chromium to enable browser scraping.
143
+ - **No recipe found** — Some sites use non-standard recipe markup. The AI fallback handles most of these, but results depend on the OpenAI model's ability to extract the recipe.
144
+
145
+ ## License
146
+
147
+ MIT — see [LICENSE](LICENSE).
148
+
149
+ ## Code of Conduct
150
+
151
+ See [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).
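The browser-first, AI-fallback flow described under "How It Works" in the README can be sketched as a small orchestrator with injected strategies. This is a minimal illustration under stated assumptions, not the package's code: `Strategy`, `runWithFallback`, and the trimmed-down `Status` and `Result` types are hypothetical names, and real strategies would wrap Puppeteer and the OpenAI client.

```typescript
// Hypothetical sketch: try a browser strategy first, then an AI strategy,
// reporting each phase through a callback so a UI can update a spinner.
type Phase = "browser" | "ai" | "done" | "error";
interface Status { phase: Phase; message: string }
interface Result { name: string; source: "browser" | "ai" }

// A strategy resolves to a result, or to null when it finds nothing.
type Strategy = (url: string) => Promise<Result | null>;

async function runWithFallback(
  url: string,
  browser: Strategy,
  ai: Strategy,
  onStatus: (status: Status) => void,
): Promise<Result> {
  onStatus({ phase: "browser", message: "Launching browser" });
  const fromBrowser = await browser(url);
  if (fromBrowser) {
    onStatus({ phase: "done", message: "Recipe found" });
    return fromBrowser;
  }

  onStatus({ phase: "ai", message: "Falling back to AI" });
  try {
    const fromAi = await ai(url);
    if (!fromAi) throw new Error("No recipe found");
    onStatus({ phase: "done", message: "Recipe extracted via AI" });
    return fromAi;
  } catch (err) {
    // Report the failure, then rethrow so the caller can show an error panel.
    const message = err instanceof Error ? err.message : "Unknown error";
    onStatus({ phase: "error", message });
    throw err;
  }
}
```

Passing the strategies in as arguments keeps the sketch runnable without a browser or an API key; stub functions can stand in for both.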
package/dist/app.d.ts ADDED
@@ -0,0 +1,5 @@
1
+ interface AppProps {
2
+ initialUrl?: string;
3
+ }
4
+ export declare function App({ initialUrl }: AppProps): import("react/jsx-runtime").JSX.Element;
5
+ export {};
package/dist/app.js ADDED
@@ -0,0 +1,63 @@
1
+ import { jsx as _jsx, Fragment as _Fragment, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import { useState, useCallback, useEffect } from 'react';
3
+ import { Box, Text, useApp, useInput } from 'ink';
4
+ import { Banner } from './components/Banner.js';
5
+ import { URLInput } from './components/URLInput.js';
6
+ import { RecipeCard } from './components/RecipeCard.js';
7
+ import { ScrapingStatus } from './components/ScrapingStatus.js';
8
+ import { Footer } from './components/Footer.js';
9
+ import { Welcome } from './components/Welcome.js';
10
+ import { ErrorDisplay } from './components/ErrorDisplay.js';
11
+ import { scrapeRecipe } from './services/scraper.js';
12
+ import { theme } from './theme.js';
13
+ export function App({ initialUrl }) {
14
+ const { exit } = useApp();
15
+ const [phase, setPhase] = useState(initialUrl ? 'scraping' : 'idle');
16
+ const [recipe, setRecipe] = useState(null);
17
+ const [scrapeStatus, setScrapeStatus] = useState(null);
18
+ const [error, setError] = useState('');
19
+ const handleScrape = useCallback(async (url) => {
20
+ setPhase('scraping');
21
+ setError('');
22
+ setScrapeStatus({ phase: 'browser', message: 'Starting\u2026' });
23
+ try {
24
+ const result = await scrapeRecipe(url, (status) => {
25
+ setScrapeStatus(status);
26
+ });
27
+ setRecipe(result);
28
+ setPhase('display');
29
+ }
30
+ catch (err) {
31
+ setError(err instanceof Error ? err.message : 'Failed to scrape recipe');
32
+ setPhase('error');
33
+ }
34
+ }, []);
35
+ const handleNewRecipe = useCallback(() => {
36
+ setPhase('idle');
37
+ setRecipe(null);
38
+ setError('');
39
+ setScrapeStatus(null);
40
+ }, []);
41
+ // Scrape the initial URL if provided via CLI argument
42
+ useEffect(() => {
43
+ if (initialUrl) {
44
+ handleScrape(initialUrl);
45
+ }
46
+ }, []); // eslint-disable-line react-hooks/exhaustive-deps
47
+ // Global keybinds – only active during the display phase so they
48
+ // do not interfere with the text input in idle/error phases.
49
+ useInput((input, key) => {
50
+ if (phase === 'display') {
51
+ if (input === 'n')
52
+ handleNewRecipe();
53
+ if (input === 'q')
54
+ exit();
55
+ }
56
+ // Ctrl+C is handled by Ink automatically
57
+ if (key.escape) {
58
+ if (phase === 'display')
59
+ exit();
60
+ }
61
+ });
62
+ return (_jsxs(Box, { flexDirection: "column", children: [_jsx(Banner, {}), _jsxs(Box, { flexDirection: "column", paddingX: 1, children: [phase === 'idle' && (_jsxs(_Fragment, { children: [_jsx(Welcome, {}), _jsx(URLInput, { onSubmit: handleScrape })] })), phase === 'scraping' && scrapeStatus && (_jsx(ScrapingStatus, { status: scrapeStatus })), phase === 'display' && recipe && (_jsxs(_Fragment, { children: [_jsx(RecipeCard, { recipe: recipe }), _jsx(Box, { marginTop: 1, marginLeft: 1, children: _jsxs(Text, { bold: true, color: theme.colors.success, children: [theme.symbols.check, " Recipe parsed successfully!"] }) })] })), phase === 'error' && (_jsxs(_Fragment, { children: [_jsx(ErrorDisplay, { message: error }), _jsx(URLInput, { onSubmit: handleScrape })] }))] }), _jsx(Footer, { phase: phase })] }));
63
+ }
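The phase handling in `App` (`idle` → `scraping` → `display` or `error`) can be read as a small state machine. The sketch below restates it as a pure transition function. `PhaseEvent` and `nextPhase` are hypothetical names used only for illustration; the compiled component above uses separate `useState` calls rather than a reducer.

```typescript
type AppPhase = "idle" | "scraping" | "display" | "error";
// Hypothetical event names for the things that move the app between phases.
type PhaseEvent = "submit" | "success" | "failure" | "new";

function nextPhase(phase: AppPhase, event: PhaseEvent): AppPhase {
  switch (event) {
    case "submit":
      // The URL input is rendered only in the idle and error phases.
      return phase === "idle" || phase === "error" ? "scraping" : phase;
    case "success":
      return phase === "scraping" ? "display" : phase;
    case "failure":
      return phase === "scraping" ? "error" : phase;
    case "new":
      // The `n` keybind is honoured only while a recipe is displayed.
      return phase === "display" ? "idle" : phase;
    default:
      return phase;
  }
}

console.log(nextPhase("idle", "submit"));      // "scraping"
console.log(nextPhase("scraping", "failure")); // "error"
console.log(nextPhase("idle", "new"));         // "idle" (ignored outside display)
```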
package/dist/cli.d.ts ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env node
2
+ export {};
package/dist/cli.js ADDED
@@ -0,0 +1,34 @@
1
+ #!/usr/bin/env node
2
+ import { jsx as _jsx } from "react/jsx-runtime";
3
+ import { render } from 'ink';
4
+ import { App } from './app.js';
5
+ // Simple arg parsing – accept an optional recipe URL as the first positional arg
6
+ const args = process.argv.slice(2);
7
+ const url = args.find((a) => !a.startsWith('-'));
8
+ // Handle --help / -h
9
+ if (args.includes('--help') || args.includes('-h')) {
10
+ console.log(`
11
+ Parsely CLI — Smart recipe scraper
12
+
13
+ USAGE
14
+ parsely [url]
15
+
16
+ ARGUMENTS
17
+ url Optional recipe URL to scrape immediately
18
+
19
+ EXAMPLES
20
+ parsely
21
+ parsely https://www.simplyrecipes.com/recipes/perfect_guacamole/
22
+
23
+ The CLI scrapes recipe data using headless Chrome with an
24
+ AI fallback (OpenAI gpt-4o-mini). Create a .env.local file
25
+ with OPENAI_API_KEY=your_key to enable the AI fallback.
26
+ `);
27
+ process.exit(0);
28
+ }
29
+ // Handle --version / -v
30
+ if (args.includes('--version') || args.includes('-v')) {
31
+ console.log('parsely-cli v2.0.0');
32
+ process.exit(0);
33
+ }
34
+ render(_jsx(App, { initialUrl: url }));
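The argument handling in `cli.js` can be summarized as a pure function, which makes its behavior easy to check in isolation. This is a sketch only: `CliOptions` and `parseCliArgs` are hypothetical names, and the real entry point acts on the flags immediately (help first, then version) instead of returning an object.

```typescript
interface CliOptions {
  help: boolean;
  version: boolean;
  url: string | undefined;
}

// Flags are matched exactly; the first argument that does not start
// with "-" is treated as the recipe URL, as in cli.js.
function parseCliArgs(argv: string[]): CliOptions {
  return {
    help: argv.includes("--help") || argv.includes("-h"),
    version: argv.includes("--version") || argv.includes("-v"),
    url: argv.find((arg) => !arg.startsWith("-")),
  };
}

console.log(parseCliArgs(["https://www.simplyrecipes.com/recipes/perfect_guacamole/"]));
console.log(parseCliArgs(["-v"]));
```

Note that the URL is not validated at this stage; any non-flag argument is passed through as `initialUrl`.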
package/dist/components/Banner.d.ts ADDED
@@ -0,0 +1 @@
1
+ export declare function Banner(): import("react/jsx-runtime").JSX.Element;
package/dist/components/Banner.js ADDED
@@ -0,0 +1,12 @@
1
+ import { jsx as _jsx, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import { Box, Text } from 'ink';
3
+ import { theme } from '../theme.js';
4
+ const LOGO = `\
5
+ ____ _ ____ ____ _____ _ __ __ ____ _ ___
6
+ | _ \\ / \\ | _ \\/ ___|| ____| | \\ \\ / / / ___| | |_ _|
7
+ | |_) / _ \\ | |_) \\___ \\| _| | | \\ V / | | | | | |
8
+ | __/ ___ \\| _ < ___) | |___| |___| | | |___| |___ | |
9
+ |_| /_/ \\_\\_| \\_\\____/|_____|_____|_| \\____|_____|___|`;
10
+ export function Banner() {
11
+ return (_jsxs(Box, { flexDirection: "column", marginBottom: 1, children: [_jsx(Text, { color: theme.colors.banner, bold: true, children: LOGO }), _jsxs(Text, { color: theme.colors.muted, children: [' ', "Smart recipe scraper ", theme.symbols.dot, " v2.0"] })] }));
12
+ }
package/dist/components/ErrorDisplay.d.ts ADDED
@@ -0,0 +1,5 @@
1
+ interface ErrorDisplayProps {
2
+ message: string;
3
+ }
4
+ export declare function ErrorDisplay({ message }: ErrorDisplayProps): import("react/jsx-runtime").JSX.Element;
5
+ export {};
package/dist/components/ErrorDisplay.js ADDED
@@ -0,0 +1,6 @@
1
+ import { jsxs as _jsxs, jsx as _jsx } from "react/jsx-runtime";
2
+ import { Box, Text } from 'ink';
3
+ import { theme } from '../theme.js';
4
+ export function ErrorDisplay({ message }) {
5
+ return (_jsxs(Box, { flexDirection: "column", borderStyle: "round", borderColor: theme.colors.error, paddingX: 1, paddingY: 1, marginBottom: 1, children: [_jsxs(Text, { bold: true, color: theme.colors.error, children: [theme.symbols.cross, " Scraping Failed"] }), _jsx(Box, { marginTop: 1, marginLeft: 2, children: _jsx(Text, { color: theme.colors.text, wrap: "wrap", children: message }) }), _jsx(Box, { marginTop: 1, marginLeft: 2, children: _jsx(Text, { color: theme.colors.muted, children: "Check the URL and try again, or ensure your .env.local has a valid OPENAI_API_KEY." }) })] }));
6
+ }
package/dist/components/Footer.d.ts ADDED
@@ -0,0 +1,6 @@
1
+ export type AppPhase = 'idle' | 'scraping' | 'display' | 'error';
2
+ interface FooterProps {
3
+ phase: AppPhase;
4
+ }
5
+ export declare function Footer({ phase }: FooterProps): import("react/jsx-runtime").JSX.Element;
6
+ export {};
package/dist/components/Footer.js ADDED
@@ -0,0 +1,25 @@
1
+ import { jsx as _jsx, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import React from 'react';
3
+ import { Box, Text } from 'ink';
4
+ import { theme } from '../theme.js';
5
+ const keybinds = {
6
+ idle: [
7
+ { key: 'enter', label: 'submit' },
8
+ { key: 'ctrl+c', label: 'exit' },
9
+ ],
10
+ scraping: [
11
+ { key: 'ctrl+c', label: 'exit' },
12
+ ],
13
+ display: [
14
+ { key: 'n', label: 'new recipe' },
15
+ { key: 'q', label: 'quit' },
16
+ ],
17
+ error: [
18
+ { key: 'enter', label: 'submit' },
19
+ { key: 'ctrl+c', label: 'exit' },
20
+ ],
21
+ };
22
+ export function Footer({ phase }) {
23
+ const hints = keybinds[phase];
24
+ return (_jsx(Box, { marginTop: 1, paddingX: 1, gap: 1, children: hints.map((hint, i) => (_jsxs(React.Fragment, { children: [i > 0 && _jsx(Text, { color: theme.colors.muted, children: theme.symbols.dot }), _jsxs(Text, { children: [_jsx(Text, { color: theme.colors.primary, bold: true, children: hint.key }), _jsxs(Text, { color: theme.colors.muted, children: [" ", hint.label] })] })] }, hint.key))) }));
25
+ }
package/dist/components/RecipeCard.d.ts ADDED
@@ -0,0 +1,6 @@
1
+ import type { Recipe } from '../services/scraper.js';
2
+ interface RecipeCardProps {
3
+ recipe: Recipe;
4
+ }
5
+ export declare function RecipeCard({ recipe }: RecipeCardProps): import("react/jsx-runtime").JSX.Element;
6
+ export {};
package/dist/components/RecipeCard.js ADDED
@@ -0,0 +1,46 @@
1
+ import { jsx as _jsx, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import { Box, Text } from 'ink';
3
+ import { theme } from '../theme.js';
4
+ import { isoToMinutes, formatMinutes } from '../utils/helpers.js';
5
+ /**
6
+ * Extract instruction text from the various formats recipes use.
7
+ */
8
+ function extractInstructions(recipe) {
9
+ const raw = recipe.recipeInstructions;
10
+ if (!raw)
11
+ return [];
12
+ const steps = [];
13
+ if (Array.isArray(raw)) {
14
+ for (const step of raw) {
15
+ if (typeof step === 'string') {
16
+ steps.push(step);
17
+ }
18
+ else if (typeof step === 'object' && step !== null) {
19
+ if ('text' in step && step.text) {
20
+ steps.push(step.text);
21
+ }
22
+ else if ('itemListElement' in step && Array.isArray(step.itemListElement)) {
23
+ for (const sub of step.itemListElement) {
24
+ if (sub.text)
25
+ steps.push(sub.text);
26
+ }
27
+ }
28
+ }
29
+ }
30
+ }
31
+ else {
32
+ steps.push(String(raw));
33
+ }
34
+ return steps;
35
+ }
36
+ function TimeField({ label, iso }) {
37
+ if (!iso)
38
+ return null;
39
+ const mins = isoToMinutes(iso);
40
+ return (_jsxs(Box, { gap: 1, children: [_jsx(Text, { color: theme.colors.success, bold: true, children: label }), _jsx(Text, { color: theme.colors.text, children: formatMinutes(mins) }), _jsxs(Text, { color: theme.colors.muted, dimColor: true, children: ["(", iso, ")"] })] }));
41
+ }
42
+ export function RecipeCard({ recipe }) {
43
+ const instructions = extractInstructions(recipe);
44
+ const sourceLabel = recipe.source === 'browser' ? 'JSON-LD' : 'AI Fallback';
45
+ return (_jsxs(Box, { flexDirection: "column", borderStyle: "round", borderColor: theme.colors.secondary, paddingX: 1, paddingY: 1, children: [_jsxs(Box, { marginBottom: 1, children: [_jsx(Text, { bold: true, color: theme.colors.secondary, children: "Recipe Extract" }), _jsxs(Text, { color: theme.colors.muted, children: [' ', "(", sourceLabel, ")"] })] }), recipe.name && (_jsx(Box, { marginBottom: 1, children: _jsx(Text, { bold: true, color: theme.colors.text, children: recipe.name }) })), (recipe.prepTime || recipe.cookTime || recipe.totalTime) && (_jsxs(Box, { flexDirection: "column", marginBottom: 1, children: [_jsx(TimeField, { label: "Prep Time ", iso: recipe.prepTime }), _jsx(TimeField, { label: "Cook Time ", iso: recipe.cookTime }), _jsx(TimeField, { label: "Total Time", iso: recipe.totalTime })] })), recipe.recipeIngredient && recipe.recipeIngredient.length > 0 && (_jsxs(Box, { flexDirection: "column", marginBottom: 1, children: [_jsx(Text, { bold: true, color: theme.colors.accent, children: "Ingredients" }), recipe.recipeIngredient.map((item, i) => (_jsxs(Text, { color: theme.colors.text, children: [' ', theme.symbols.bullet, " ", item] }, i)))] })), instructions.length > 0 && (_jsxs(Box, { flexDirection: "column", children: [_jsx(Text, { bold: true, color: theme.colors.info, children: "Instructions" }), instructions.map((step, i) => (_jsxs(Text, { color: theme.colors.text, wrap: "wrap", children: [' ', i + 1, ". ", step] }, i)))] }))] }));
46
+ }
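`extractInstructions` accepts the three shapes that `recipeInstructions` commonly takes in Schema.org data: plain strings, step objects with a `text` field, and section objects with an `itemListElement` array. The typed restatement below is a sketch for illustration: `InstructionStep` and `flattenInstructions` are hypothetical names, and the sample steps are invented.

```typescript
type InstructionStep =
  | string
  | { text?: string; itemListElement?: Array<{ text?: string }> };

// Flatten mixed instruction shapes into an ordered list of step strings.
function flattenInstructions(raw: InstructionStep[] | string | undefined): string[] {
  if (!raw) return [];
  if (!Array.isArray(raw)) return [String(raw)];

  const steps: string[] = [];
  for (const step of raw) {
    if (typeof step === "string") {
      steps.push(step);
    } else if (step !== null && typeof step === "object") {
      if (step.text) {
        steps.push(step.text);
      } else if (Array.isArray(step.itemListElement)) {
        // Section object: collect the text of each nested step.
        for (const sub of step.itemListElement) {
          if (sub.text) steps.push(sub.text);
        }
      }
    }
  }
  return steps;
}

const steps = flattenInstructions([
  "Mash the avocados.",
  { text: "Add lime juice and salt." },
  { itemListElement: [{ text: "Fold in the onion." }, { text: "Serve." }] },
]);
console.log(steps.length); // 4
```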
package/dist/components/ScrapingStatus.d.ts ADDED
@@ -0,0 +1,6 @@
1
+ import type { ScrapeStatus } from '../services/scraper.js';
2
+ interface ScrapingStatusProps {
3
+ status: ScrapeStatus;
4
+ }
5
+ export declare function ScrapingStatus({ status }: ScrapingStatusProps): import("react/jsx-runtime").JSX.Element;
6
+ export {};
package/dist/components/ScrapingStatus.js ADDED
@@ -0,0 +1,16 @@
1
+ import { jsx as _jsx, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import { Box, Text } from 'ink';
3
+ import Spinner from 'ink-spinner';
4
+ import { theme } from '../theme.js';
5
+ const phaseLabel = {
6
+ browser: 'Browser Scraping',
7
+ parsing: 'Parsing HTML',
8
+ ai: 'AI Extraction',
9
+ done: 'Complete',
10
+ error: 'Error',
11
+ };
12
+ export function ScrapingStatus({ status }) {
13
+ const isActive = status.phase !== 'done' && status.phase !== 'error';
14
+ const label = phaseLabel[status.phase] ?? status.phase;
15
+ return (_jsxs(Box, { flexDirection: "column", borderStyle: "round", borderColor: theme.colors.border, paddingX: 1, paddingY: 1, children: [_jsxs(Box, { gap: 1, children: [isActive && (_jsx(Text, { color: theme.colors.primary, children: _jsx(Spinner, { type: "dots" }) })), _jsx(Text, { bold: true, color: theme.colors.label, children: label })] }), _jsx(Box, { marginTop: 1, marginLeft: 2, children: _jsx(Text, { color: theme.colors.muted, children: status.message }) })] }));
16
+ }
package/dist/components/URLInput.d.ts ADDED
@@ -0,0 +1,5 @@
1
+ interface URLInputProps {
2
+ onSubmit: (url: string) => void;
3
+ }
4
+ export declare function URLInput({ onSubmit }: URLInputProps): import("react/jsx-runtime").JSX.Element;
5
+ export {};
package/dist/components/URLInput.js ADDED
@@ -0,0 +1,31 @@
1
+ import { jsx as _jsx, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import { useState } from 'react';
3
+ import { Box, Text } from 'ink';
4
+ import TextInput from 'ink-text-input';
5
+ import { theme } from '../theme.js';
6
+ import { isValidUrl } from '../utils/helpers.js';
7
+ export function URLInput({ onSubmit }) {
8
+ const [value, setValue] = useState('');
9
+ const [error, setError] = useState('');
10
+ const handleSubmit = (input) => {
11
+ const trimmed = input.trim();
12
+ if (!trimmed) {
13
+ setError('Please enter a URL');
14
+ return;
15
+ }
16
+ // Auto-prepend https:// if missing protocol
17
+ const url = /^https?:\/\//.test(trimmed) ? trimmed : `https://${trimmed}`;
18
+ if (!isValidUrl(url)) {
19
+ setError('Invalid URL. Please enter a valid recipe URL.');
20
+ return;
21
+ }
22
+ setError('');
23
+ setValue('');
24
+ onSubmit(url);
25
+ };
26
+ return (_jsxs(Box, { flexDirection: "column", children: [_jsxs(Box, { borderStyle: "round", borderColor: theme.colors.borderFocus, paddingX: 1, children: [_jsx(Text, { color: theme.colors.primary, bold: true, children: '\u276F ' }), _jsx(TextInput, { value: value, onChange: (v) => {
27
+ setValue(v);
28
+ if (error)
29
+ setError('');
30
+ }, onSubmit: handleSubmit, placeholder: "Enter recipe URL..." })] }), error && (_jsx(Box, { marginLeft: 2, marginTop: 0, children: _jsxs(Text, { color: theme.colors.error, children: [theme.symbols.cross, " ", error] }) }))] }));
31
+ }
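The submit handler in `URLInput` trims the input, prepends `https://` when no `http://` or `https://` prefix is present, and then validates the result. The same steps can be sketched as a standalone function; `normalizeRecipeUrl` is a hypothetical name, and `isValidUrl` is restated from the helpers so the block is self-contained.

```typescript
// Restated from utils/helpers: accept only http(s) URLs that parse.
function isValidUrl(url: string): boolean {
  try {
    const parsed = new URL(url);
    return parsed.protocol === "http:" || parsed.protocol === "https:";
  } catch {
    return false;
  }
}

// Hypothetical helper: returns the normalized URL, or null when the input is rejected.
function normalizeRecipeUrl(input: string): string | null {
  const trimmed = input.trim();
  if (!trimmed) return null;
  // The prefix test is case-sensitive, matching the regex in URLInput.
  const url = /^https?:\/\//.test(trimmed) ? trimmed : `https://${trimmed}`;
  return isValidUrl(url) ? url : null;
}

console.log(normalizeRecipeUrl("  example.com/recipe ")); // "https://example.com/recipe"
console.log(normalizeRecipeUrl("http://example.com"));    // "http://example.com"
console.log(normalizeRecipeUrl("   "));                   // null
```

One consequence of this design is that any input lacking a lowercase `http://` or `https://` prefix gets `https://` prepended, including inputs that already carry a different scheme.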
package/dist/components/Welcome.d.ts ADDED
@@ -0,0 +1 @@
1
+ export declare function Welcome(): import("react/jsx-runtime").JSX.Element;
package/dist/components/Welcome.js ADDED
@@ -0,0 +1,6 @@
1
+ import { jsx as _jsx, jsxs as _jsxs } from "react/jsx-runtime";
2
+ import { Box, Text } from 'ink';
3
+ import { theme } from '../theme.js';
4
+ export function Welcome() {
5
+ return (_jsxs(Box, { flexDirection: "column", marginBottom: 1, paddingX: 1, children: [_jsx(Text, { color: theme.colors.text, children: "Paste a recipe URL below to extract ingredients and instructions." }), _jsx(Text, { color: theme.colors.muted, children: "Uses browser scraping with AI fallback for best results." })] }));
6
+ }
package/dist/services/scraper.d.ts ADDED
@@ -0,0 +1,26 @@
1
+ export interface Recipe {
2
+ name?: string;
3
+ prepTime?: string;
4
+ cookTime?: string;
5
+ totalTime?: string;
6
+ recipeIngredient?: string[];
7
+ recipeInstructions?: Array<string | {
8
+ text?: string;
9
+ itemListElement?: Array<{
10
+ text?: string;
11
+ }>;
12
+ }>;
13
+ source: 'browser' | 'ai';
14
+ }
15
+ export type ScrapePhase = 'browser' | 'parsing' | 'ai' | 'done' | 'error';
16
+ export interface ScrapeStatus {
17
+ phase: ScrapePhase;
18
+ message: string;
19
+ recipe?: Recipe;
20
+ }
21
+ /**
22
+ * Scrape a recipe from the given URL.
23
+ * Tries Puppeteer-based browser scraping first, falls back to OpenAI.
24
+ * Calls `onStatus` with progress updates so the TUI can reflect each phase.
25
+ */
26
+ export declare function scrapeRecipe(url: string, onStatus: (status: ScrapeStatus) => void): Promise<Recipe>;
package/dist/services/scraper.js ADDED
@@ -0,0 +1,166 @@
1
+ import puppeteer from 'puppeteer-core';
2
+ import * as cheerio from 'cheerio';
3
+ import OpenAI from 'openai';
4
+ import { execSync } from 'child_process';
5
+ import { existsSync } from 'fs';
6
+ import { loadConfig } from '../utils/helpers.js';
7
+ /* ------------------------------------------------------------------ */
8
+ /* JSON-LD helpers */
9
+ /* ------------------------------------------------------------------ */
10
+ /**
11
+ * Walk through JSON-LD script blocks and return the first Recipe object found.
12
+ * Handles direct Recipe type, @graph arrays, and nested lists.
13
+ */
14
+ function findRecipeJson(scripts) {
15
+ for (const raw of scripts) {
16
+ let data;
17
+ try {
18
+ data = JSON.parse(raw);
19
+ }
20
+ catch {
21
+ continue;
22
+ }
23
+ const candidates = Array.isArray(data)
24
+ ? data
25
+ : [data];
26
+ // Use index-based loop because we may push into candidates as we go
27
+ for (let i = 0; i < candidates.length; i++) {
28
+ const obj = candidates[i];
29
+ // Expand @graph
30
+ if (obj['@graph']) {
31
+ const graph = obj['@graph'];
32
+ const items = Array.isArray(graph)
33
+ ? graph
34
+ : [graph];
35
+ candidates.push(...items);
36
+ }
37
+ const recipeType = obj['@type'];
38
+ if (recipeType === 'Recipe' ||
39
+ (Array.isArray(recipeType) && recipeType.includes('Recipe'))) {
40
+ return obj;
41
+ }
42
+ }
43
+ }
44
+ return null;
45
+ }
46
+ /* ------------------------------------------------------------------ */
47
+ /* Chrome detection */
48
+ /* ------------------------------------------------------------------ */
49
+ const CHROME_PATHS = [
50
+ '/usr/bin/google-chrome-stable',
51
+ '/usr/bin/google-chrome',
52
+ '/usr/bin/chromium-browser',
53
+ '/usr/bin/chromium',
54
+ '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
55
+ '/Applications/Chromium.app/Contents/MacOS/Chromium',
56
+ ];
57
+ function findChrome() {
58
+ // Check well-known paths
59
+ for (const p of CHROME_PATHS) {
60
+ if (existsSync(p))
61
+ return p;
62
+ }
63
+ // Try `which`
64
+ try {
65
+ const result = execSync('(which chromium-browser || which chromium || which google-chrome) 2>/dev/null', { encoding: 'utf-8' }).trim();
66
+ if (result)
67
+ return result;
68
+ }
69
+ catch { /* not found */ }
70
+ return null;
71
+ }
72
+ /* ------------------------------------------------------------------ */
73
+ /* Scraping strategies */
74
+ /* ------------------------------------------------------------------ */
75
+ async function scrapeWithBrowser(url) {
76
+ const chromePath = findChrome();
77
+ if (!chromePath)
78
+ return null; // No browser available – skip to AI
79
+ let browser = null;
80
+ try {
81
+ browser = await puppeteer.launch({
82
+ headless: true,
83
+ executablePath: chromePath,
84
+ args: ['--no-sandbox', '--disable-setuid-sandbox'],
85
+ });
86
+ const page = await browser.newPage();
87
+ await page.goto(url, { waitUntil: 'networkidle2', timeout: 10_000 });
88
+ const html = await page.content();
89
+ await browser.close();
90
+ browser = null;
91
+ const $ = cheerio.load(html);
92
+ const scripts = [];
93
+ $('script[type="application/ld+json"]').each((_i, el) => {
94
+ const text = $(el).text();
95
+ if (text)
96
+ scripts.push(text);
97
+ });
98
+ const recipe = findRecipeJson(scripts);
99
+ if (!recipe)
100
+ return null;
101
+ return { ...recipe, source: 'browser' };
102
+ }
103
+ catch {
104
+ if (browser) {
105
+ try {
106
+ await browser.close();
107
+ }
108
+ catch { /* noop */ }
109
+ }
110
+ return null;
111
+ }
112
+ }
113
+ async function scrapeWithAI(url) {
114
+ const { openaiApiKey } = loadConfig();
115
+ if (!openaiApiKey || openaiApiKey === 'YOUR_API_KEY_HERE') {
116
+ throw new Error('OpenAI API key not found. Create a .env.local file with OPENAI_API_KEY=your_key');
117
+ }
118
+ const client = new OpenAI({ apiKey: openaiApiKey });
119
+ const response = await client.chat.completions.create({
120
+ model: 'gpt-4o-mini',
121
+ messages: [
122
+ {
123
+ role: 'system',
124
+ content: 'You are a recipe scraper. Extract cookTime, prepTime, totalTime, ' +
125
+ 'recipeIngredient, and recipeInstructions from the provided URL. ' +
126
+ 'Return the data in a valid JSON object.',
127
+ },
128
+ { role: 'user', content: `Scrape this recipe: ${url}` },
129
+ ],
130
+ response_format: { type: 'json_object' },
131
+ });
132
+ const content = response.choices[0]?.message?.content;
133
+ if (!content)
134
+ throw new Error('AI returned empty response');
135
+ const recipe = JSON.parse(content);
136
+ return { ...recipe, source: 'ai' };
137
+ }
138
+ /* ------------------------------------------------------------------ */
139
+ /* Public orchestrator */
140
+ /* ------------------------------------------------------------------ */
141
+ /**
142
+ * Scrape a recipe from the given URL.
143
+ * Tries Puppeteer-based browser scraping first, falls back to OpenAI.
144
+ * Calls `onStatus` with progress updates so the TUI can reflect each phase.
145
+ */
146
+ export async function scrapeRecipe(url, onStatus) {
147
+ // Phase 1 – browser scraping
148
+ onStatus({ phase: 'browser', message: 'Launching browser\u2026' });
149
+ const browserResult = await scrapeWithBrowser(url);
150
+ if (browserResult) {
151
+ onStatus({ phase: 'done', message: 'Recipe found!', recipe: browserResult });
152
+ return browserResult;
153
+ }
154
+ // Phase 2 – AI fallback
155
+ onStatus({ phase: 'ai', message: 'Falling back to AI scraper\u2026' });
156
+ try {
157
+ const aiResult = await scrapeWithAI(url);
158
+ onStatus({ phase: 'done', message: 'Recipe extracted via AI!', recipe: aiResult });
159
+ return aiResult;
160
+ }
161
+ catch (error) {
162
+ const message = error instanceof Error ? error.message : 'Unknown error occurred';
163
+ onStatus({ phase: 'error', message });
164
+ throw error;
165
+ }
166
+ }
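To see how `findRecipeJson` walks a page's JSON-LD, the sketch below restates it in typed form and runs it on a made-up payload that nests the recipe inside an `@graph` array. `JsonLdNode` is a hypothetical type name, the sample data is invented, and the non-object guard is an addition for the sketch that the compiled code above does not have.

```typescript
type JsonLdNode = Record<string, unknown>;

function findRecipeJson(scripts: string[]): JsonLdNode | null {
  for (const raw of scripts) {
    let data: unknown;
    try {
      data = JSON.parse(raw);
    } catch {
      continue; // skip blocks that are not valid JSON
    }
    const candidates = (Array.isArray(data) ? data : [data]) as JsonLdNode[];
    // Index-based loop: @graph items are appended while iterating.
    for (let i = 0; i < candidates.length; i++) {
      const obj = candidates[i];
      // Added guard (not in the original): ignore null and primitive entries.
      if (obj === null || typeof obj !== "object") continue;
      const graph = obj["@graph"];
      if (graph) {
        candidates.push(...((Array.isArray(graph) ? graph : [graph]) as JsonLdNode[]));
      }
      const type = obj["@type"];
      if (type === "Recipe" || (Array.isArray(type) && type.includes("Recipe"))) {
        return obj;
      }
    }
  }
  return null;
}

const scripts = [
  "{ not valid json",
  JSON.stringify({
    "@context": "https://schema.org",
    "@graph": [
      { "@type": "WebPage", name: "Perfect Guacamole" },
      { "@type": ["Recipe", "NewsArticle"], name: "Guacamole", cookTime: "PT10M" },
    ],
  }),
];
console.log(findRecipeJson(scripts)?.["name"]); // "Guacamole"
```

The malformed first block is skipped, the wrapper object is expanded through `@graph`, and the first node whose `@type` is or includes `Recipe` is returned.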
package/dist/theme.d.ts ADDED
@@ -0,0 +1,31 @@
1
+ /**
2
+ * Parsely CLI theme - color palette and symbols for the TUI.
3
+ * Inspired by OpenCode's semantic theming approach.
4
+ */
5
+ export declare const theme: {
6
+ readonly colors: {
7
+ readonly primary: "#00d4aa";
8
+ readonly secondary: "#ff6b9d";
9
+ readonly accent: "#ffd93d";
10
+ readonly text: "#e0e0e0";
11
+ readonly muted: "#6272a4";
12
+ readonly error: "#ff5555";
13
+ readonly success: "#50fa7b";
14
+ readonly warning: "#f1fa8c";
15
+ readonly info: "#8be9fd";
16
+ readonly banner: "#50fa7b";
17
+ readonly border: "#6272a4";
18
+ readonly borderFocus: "#00d4aa";
19
+ readonly label: "#bd93f9";
20
+ };
21
+ readonly symbols: {
22
+ readonly bullet: "•";
23
+ readonly arrow: "→";
24
+ readonly check: "✓";
25
+ readonly cross: "✗";
26
+ readonly dot: "·";
27
+ readonly ellipsis: "…";
28
+ readonly line: "─";
29
+ };
30
+ };
31
+ export type Theme = typeof theme;
package/dist/theme.js ADDED
@@ -0,0 +1,30 @@
1
+ /**
2
+ * Parsely CLI theme - color palette and symbols for the TUI.
3
+ * Inspired by OpenCode's semantic theming approach.
4
+ */
5
+ export const theme = {
6
+ colors: {
7
+ primary: '#00d4aa',
8
+ secondary: '#ff6b9d',
9
+ accent: '#ffd93d',
10
+ text: '#e0e0e0',
11
+ muted: '#6272a4',
12
+ error: '#ff5555',
13
+ success: '#50fa7b',
14
+ warning: '#f1fa8c',
15
+ info: '#8be9fd',
16
+ banner: '#50fa7b',
17
+ border: '#6272a4',
18
+ borderFocus: '#00d4aa',
19
+ label: '#bd93f9',
20
+ },
21
+ symbols: {
22
+ bullet: '\u2022',
23
+ arrow: '\u2192',
24
+ check: '\u2713',
25
+ cross: '\u2717',
26
+ dot: '\u00B7',
27
+ ellipsis: '\u2026',
28
+ line: '\u2500',
29
+ },
30
+ };
package/dist/utils/helpers.d.ts ADDED
@@ -0,0 +1,19 @@
1
+ /**
2
+ * Convert an ISO 8601 duration string (e.g. "PT1H30M") to total minutes.
3
+ * Returns -1 when the input is not parseable.
4
+ */
5
+ export declare function isoToMinutes(duration: string | undefined): number;
6
+ /**
7
+ * Format minutes into a human-readable string (e.g. "1h 30m").
8
+ */
9
+ export declare function formatMinutes(mins: number): string;
10
+ /**
11
+ * Load environment configuration from .env.local.
12
+ */
13
+ export declare function loadConfig(): {
14
+ openaiApiKey?: string;
15
+ };
16
+ /**
17
+ * Basic URL validation.
18
+ */
19
+ export declare function isValidUrl(url: string): boolean;
package/dist/utils/helpers.js ADDED
@@ -0,0 +1,51 @@
1
+ import { config } from 'dotenv';
2
+ import { resolve } from 'path';
3
+ /**
4
+ * Convert an ISO 8601 duration string (e.g. "PT1H30M") to total minutes.
5
+ * Returns -1 when the input is not parseable.
6
+ */
7
+ export function isoToMinutes(duration) {
8
+ if (!duration || typeof duration !== 'string')
9
+ return -1;
10
+ const match = duration.match(/^P(?:(\d+)D)?T?(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$/);
11
+ if (!match)
12
+ return -1;
13
+ const days = parseInt(match[1] || '0', 10);
14
+ const hours = parseInt(match[2] || '0', 10);
15
+ const minutes = parseInt(match[3] || '0', 10);
16
+ const seconds = parseInt(match[4] || '0', 10);
17
+ return days * 1440 + hours * 60 + minutes + Math.ceil(seconds / 60);
18
+ }
19
+ /**
20
+ * Format minutes into a human-readable string (e.g. "1h 30m").
21
+ */
22
+ export function formatMinutes(mins) {
23
+ if (mins < 0)
24
+ return 'N/A';
25
+ if (mins < 60)
26
+ return `${mins} min`;
27
+ const h = Math.floor(mins / 60);
28
+ const m = mins % 60;
29
+ return m > 0 ? `${h}h ${m}m` : `${h}h`;
30
+ }
31
+ /**
32
+ * Load environment configuration from .env.local.
33
+ */
34
+ export function loadConfig() {
35
+ config({ path: resolve(process.cwd(), '.env.local') });
36
+ return {
37
+ openaiApiKey: process.env['OPENAI_API_KEY'],
38
+ };
39
+ }
40
+ /**
41
+ * Basic URL validation.
42
+ */
43
+ export function isValidUrl(url) {
44
+ try {
45
+ const parsed = new URL(url);
46
+ return parsed.protocol === 'http:' || parsed.protocol === 'https:';
47
+ }
48
+ catch {
49
+ return false;
50
+ }
51
+ }
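A few worked values make the duration helpers easier to reason about. The block below restates `isoToMinutes` and `formatMinutes` in typed form, using the same pattern and arithmetic as the compiled code, and runs them on sample inputs chosen for illustration.

```typescript
function isoToMinutes(duration: string | undefined): number {
  if (!duration || typeof duration !== "string") return -1;
  const match = duration.match(/^P(?:(\d+)D)?T?(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$/);
  if (!match) return -1;
  const days = parseInt(match[1] || "0", 10);
  const hours = parseInt(match[2] || "0", 10);
  const minutes = parseInt(match[3] || "0", 10);
  const seconds = parseInt(match[4] || "0", 10);
  // Seconds are rounded up to the next whole minute.
  return days * 1440 + hours * 60 + minutes + Math.ceil(seconds / 60);
}

function formatMinutes(mins: number): string {
  if (mins < 0) return "N/A";
  if (mins < 60) return `${mins} min`;
  const h = Math.floor(mins / 60);
  const m = mins % 60;
  return m > 0 ? `${h}h ${m}m` : `${h}h`;
}

console.log(formatMinutes(isoToMinutes("PT1H30M"))); // "1h 30m"
console.log(formatMinutes(isoToMinutes("PT45M")));   // "45 min"
console.log(formatMinutes(isoToMinutes("P1DT2H")));  // "26h"
console.log(formatMinutes(isoToMinutes("PT90S")));   // "2 min"
console.log(formatMinutes(isoToMinutes("soon")));    // "N/A"
```

Because the `T` separator is optional in the pattern, a date-part value such as `P1M` (one month in ISO 8601) is read as one minute, and multi-day durations are shown in hours only.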
package/package.json ADDED
@@ -0,0 +1,60 @@
1
+ {
2
+ "name": "@sambitcreate/parsely-cli",
3
+ "version": "2.0.0",
4
+ "description": "A smart recipe scraper CLI with interactive TUI built on Ink",
5
+ "type": "module",
6
+ "main": "dist/cli.js",
7
+ "bin": {
8
+ "parsely": "dist/cli.js"
9
+ },
10
+ "scripts": {
11
+ "start": "tsx src/cli.tsx",
12
+ "dev": "tsx watch src/cli.tsx",
13
+ "build": "tsc",
14
+ "prepublishOnly": "npm run build",
15
+ "test": "node dist/cli.js --version",
16
+ "typecheck": "tsc --noEmit"
17
+ },
18
+ "keywords": [
19
+ "recipe",
20
+ "scraper",
21
+ "cli",
22
+ "tui",
23
+ "ink",
24
+ "terminal"
25
+ ],
26
+ "license": "MIT",
27
+ "engines": {
28
+ "node": ">=18.0.0"
29
+ },
30
+ "files": [
31
+ "dist",
32
+ "package.json",
33
+ "README.md",
34
+ "LICENSE"
35
+ ],
36
+ "repository": {
37
+ "type": "git",
38
+ "url": "git+https://github.com/sambitcreate/parsely-cli.git"
39
+ },
40
+ "bugs": {
41
+ "url": "https://github.com/sambitcreate/parsely-cli/issues"
42
+ },
43
+ "homepage": "https://github.com/sambitcreate/parsely-cli#readme",
44
+ "dependencies": {
45
+ "ink": "^5.1.0",
46
+ "ink-spinner": "^5.0.0",
47
+ "ink-text-input": "^6.0.0",
48
+ "react": "^18.3.1",
49
+ "puppeteer-core": "^24.2.1",
50
+ "openai": "^4.82.0",
51
+ "cheerio": "^1.0.0",
52
+ "dotenv": "^16.4.7"
53
+ },
54
+ "devDependencies": {
55
+ "typescript": "^5.7.3",
56
+ "tsx": "^4.19.2",
57
+ "@types/react": "^18.3.18"
58
+ },
59
+ "packageManager": "pnpm@10.28.1+sha512.7d7dbbca9e99447b7c3bf7a73286afaaf6be99251eb9498baefa7d406892f67b879adb3a1d7e687fc4ccc1a388c7175fbaae567a26ab44d1067b54fcb0d6a316"
60
+ }