
substack-feed-api 1.1.2 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md +233 -55
package/dist/substackFeedApi.js +10051 -3372
package/dist/substackFeedApi.umd.cjs +8 -11
package/index.d.ts +55 -95
package/package.json +12 -10

package/README.md

@@ -1,103 +1,281 @@
 # Substack Feed API
-The Substack Feed API is a powerful and flexible API designed to fetch and parse Substack newsletter feeds, making it easier for developers to integrate Substack content into their applications. Built with TypeScript, this API leverages modern JavaScript practices to provide a robust solution for accessing Substack newsletter data.
+`substack-feed-api` is a small TypeScript utility for turning RSS XML into typed objects using Cheerio, with first-class support for Substack and Goodreads feeds.
 ## Features
-- **Fetch Newsletter Feeds**: Easily retrieve the latest posts from any Substack newsletter.
-- **Parse Feed Data**: Convert Substack newsletter feeds into a structured format for easy integration.
-- **TypeScript Support**: Enjoy the benefits of TypeScript, with type definitions for all the API responses.
-- **Customizable**: Flexible API that can be tailored to fit the needs of your application.
+- **Type-safe** mapping from RSS XML to your own TypeScript types via generic selector maps.
+- Built-in helpers for Substack posts and Goodreads bookshelf RSS feeds (including shelves / reading status).
+- Uses Cheerio in XML mode, which works well with namespaced tags like `content:encoded`.
+- Graceful error handling: failures are logged to `console.error` and a configurable fallback is returned instead of throwing.
-## Getting Started
+## Installation
-### Prerequisites
+```bash
+npm install substack-feed-api
+# or
+yarn add substack-feed-api
+# or
+pnpm add substack-feed-api
+```
-- Node.js (version 20.x or higher)
-- npm (latest version)
+## Quick Start
-### Installation
+### Parsing Substack RSS
-To use the Substack Feed API in your project, follow these steps:
+Substack exposes a standard RSS 2.0 feed with a `<channel>` and multiple `<item>` entries; each item contains fields like `<title>`, `<description>`, `<link>`, `<pubDate>`, and `<content:encoded>` for the HTML body.
-1. Install the package via npm:
+```ts
+import { parseSubstackRss, SubstackItem } from 'substack-feed-api';
-```bash
-npm install substack-feed-api
+const xml = await fetch('https://example.substack.com/feed').then(r => r.text());
+const posts: SubstackItem[] = parseSubstackRss(xml);
+// Example item
+// {
+//   title: 'Both Not Half by Jassa Ahluwalia',
+//   description: 'A Humorous Journey Through Identity, Yet Lacking Cohesion',
+//   link: 'https://…',
+//   pubDate: 'Sun, 06 Oct 2024 15:35:17 GMT',
+//   content: '<p>Jassa Ahluwalia\'s <strong>Both Not Half</strong>…'
+// }
 ```
-2. Import the API into your project:
+You can override any selector if your feed schema differs:
-```typescript
-import { getSubstackFeed } from 'substack-feed-api';
+```ts
+const postsCustom = parseSubstackRss(xml, {
+  selectors: {
+    // use <description> as content
+    content: 'description',
+  },
+});
 ```
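Since `pubDate` comes back as a plain string (for example `'Sun, 06 Oct 2024 15:35:17 GMT'`), callers that need chronological order have to parse it themselves. The following is a minimal sketch of one way to do that; `sortNewestFirst` is a hypothetical helper and not part of the package, and it assumes every `pubDate` is in the RFC 1123 style shown in the example item above.

```ts
// Hypothetical helper (not exported by substack-feed-api): returns a new
// array ordered from newest to oldest, leaving the input untouched.
function sortNewestFirst<T extends { pubDate: string }>(items: T[]): T[] {
  // Date.parse understands the "Sun, 06 Oct 2024 15:35:17 GMT" style used
  // in RSS 2.0 feeds. Unparseable dates yield NaN and are not reordered
  // in any meaningful way, so validate them first if that matters.
  return items
    .slice()
    .sort((a, b) => Date.parse(b.pubDate) - Date.parse(a.pubDate));
}

// Usage with two stand-in items shaped like a subset of SubstackItem.
const sample = [
  { title: 'Older post', pubDate: 'Sun, 06 Oct 2024 15:35:17 GMT' },
  { title: 'Newer post', pubDate: 'Mon, 07 Oct 2024 09:00:00 GMT' },
];
const ordered = sortNewestFirst(sample);
```

Copying before sorting keeps the parser's output unchanged, which is convenient if the same array is also cached or rendered elsewhere.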
-### Usage
+### Parsing Goodreads Bookshelf RSS
-Here's a quick example to get you started:
+Goodreads’ “bookshelf” RSS feed exposes many book-related tags per `<item>` (e.g. `<title>`, `<book_description>`, `<book_large_image_url>`, `<author_name>`, `<user_shelves>`).
-```typescript
-getSubstackFeed('your-substack-newsletter-name')
-  .then((feed) => {
-    console.log(feed);
-  })
-  .catch((error) => {
-    console.error(error);
-  });
+The library exposes a Goodreads-specific helper that returns a higher-level `GoodreadsReadingState`:
+```ts
+import {
+  parseGoodreadsRss,
+  GoodreadsReadingState,
+} from 'substack-feed-api';
+const xml = await fetch('<goodreads-list-rss-url>').then(r => r.text());
+const states: GoodreadsReadingState[] = parseGoodreadsRss(xml);
+// Example shape:
+// {
+//   status: 'WANTS_TO_READ' | 'IS_READING' | 'FINISHED',
+//   book: {
+//     title: 'Malice (Detective Kaga, #1)',
+//     description: 'Acclaimed bestselling novelist Kunihiko Hidaka is found brutally murdered…',
+//     cover: 'https://i.gr-assets.com/.../20613611._SY475_.jpg',
+//     authors: [{ name: 'Keigo Higashino' }]
+//   }
+// }
 ```
-There is also a possibility to pass your own callback function to the `getSubstackFeed` function.
-This function will be called with the parsed feed data as an argument:
+By default, the Goodreads parser derives status from `user_shelves` (e.g. `to-read`, `currently-reading`, `read`).
-```typescript
-getSubstackFeed('your-substack-newsletter-name', (err, result) => {
-  console.log(result);
+You can still adjust selectors if Goodreads ever changes tag names:
+```ts
+const customStates = parseGoodreadsRss(xml, {
+  selectors: {
+    // Example: use medium image instead of large
+    cover: 'book_medium_image_url',
+  },
 });
 ```
-You can also use the other exported functions from the package:
-- `getFeedByLink`: Fetch a specific feed by its link.
-- `getPosts`: Get all posts from a feed.
+## API
-Replace `'your-substack-newsletter-name'` with the name of the Substack newsletter you wish to fetch.
+### `parseRssItems` – Generic Core
-## Development
+```ts
+function parseRssItems<TRaw extends Record<string, string>>(
+  xml: string,
+  options?: {
+    itemSelector?: string;
+    selectors?: Partial<Record<keyof TRaw, string>>;
+    fallback?: TRaw[];
+  }
+): TRaw[];
+```
-To contribute to the Substack Feed API or run it locally for development, you will need to follow these steps:
+- `xml`: Full RSS XML string.
+- `itemSelector`: CSS selector for each RSS item node, default `'channel > item'`.
+- `selectors`: Map from property name → CSS selector **relative to each item node**.
+- `fallback`: Array to return if parsing fails (e.g., malformed XML), default `[]`; the error is logged to `console.error` rather than thrown.
-1. Clone the repository:
+Example of minimal generic usage:
-```bash
-git clone https://github.com/rohit1901/substack-feed-api.git
+```ts
+type MinimalItem = {
+  title: string;
+  link: string;
+};
+const items = parseRssItems<MinimalItem>(xml, {
+  selectors: {
+    title: 'title',
+    link: 'link',
+  },
+});
+```
+### `parseSubstackRss`
+```ts
+type SubstackItem = {
+  title: string;
+  description: string;
+  link: string;
+  pubDate: string;
+  content: string;
+};
+function parseSubstackRss(
+  xml: string,
+  options?: {
+    itemSelector?: string;
+    selectors?: Partial<Record<keyof SubstackItem, string>>;
+    fallback?: SubstackItem[];
+  }
+): SubstackItem[];
 ```
-2. Install dependencies:
+Default selectors (overridable):
-```bash
-npm install
+```ts
+{
+  title: 'title',
+  description: 'description',
+  link: 'link',
+  pubDate: 'pubDate',
+  content: 'content\\:encoded',
+}
 ```
-3. Run the development server:
+This matches typical Substack feeds, which use `content:encoded` for the full HTML article body.
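The default `content` selector is written as `'content\\:encoded'` because an unescaped `:` in a CSS selector starts a pseudo-class, so the colon in a namespaced tag name has to be escaped. As an illustration only, the sketch below shows how such a selector string could be produced; `toCssTagSelector` is a hypothetical helper and is not documented as part of the package.

```ts
// Hypothetical helper (an assumption for illustration, not a package export):
// escapes every ':' so a namespaced tag name can be used as a CSS type selector.
function toCssTagSelector(tagName: string): string {
  // In the source text '\\:' is a single backslash followed by a colon.
  return tagName.replace(/:/g, '\\:');
}

const contentSelector = toCssTagSelector('content:encoded');
const titleSelector = toCssTagSelector('title');
```

Tag names without a namespace prefix pass through unchanged, so the helper could be applied uniformly to a whole selector map.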
+### `parseGoodreadsRss`
+```ts
+type BookAuthor = { name: string };
+type GoodreadsBook = {
+  title: string;
+  description: string;
+  cover: string;
+  authors?: BookAuthor[];
+};
+type GoodreadsReadingStatus = 'IS_READING' | 'FINISHED' | 'WANTS_TO_READ';
+type GoodreadsReadingState = {
+  book: GoodreadsBook;
+  status: GoodreadsReadingStatus;
+};
+function parseGoodreadsRss(
+  xml: string,
+  options?: {
+    itemSelector?: string;
+    selectors?: Partial<{
+      title: string;
+      description: string;
+      cover: string;
+      author: string;
+      shelves: string;
+    }>;
+    fallback?: GoodreadsReadingState[]; // via raw fallback mapping
+  }
+): GoodreadsReadingState[];
+```
-```bash
-npm run dev
+Default Goodreads selectors map RSS tags to an internal flat type:
+```ts
+{
+  title: 'title',
+  description: 'book_description',
+  cover: 'book_large_image_url',
+  author: 'author_name',
+  shelves: 'user_shelves',
+}
 ```
-## Building
+The parser then:
-To build the project for production, run:
+- Builds a flat raw record from each `<item>`.
+- Maps `shelves` to a `GoodreadsReadingStatus` (e.g., `currently-reading` → `IS_READING`, `read` → `FINISHED`, otherwise `WANTS_TO_READ`).
+- Wraps book information into `GoodreadsBook` and `BookAuthor`.
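The shelf-to-status step described above can be sketched as follows. This is not the package's actual implementation: `shelvesToStatus` is a hypothetical name, and the sketch assumes `user_shelves` holds a comma-separated list of shelf names. Only the mapping itself (`currently-reading` → `IS_READING`, `read` → `FINISHED`, anything else → `WANTS_TO_READ`) is taken from the README.

```ts
type GoodreadsReadingStatus = 'IS_READING' | 'FINISHED' | 'WANTS_TO_READ';

// Hypothetical helper; assumes shelves arrive as "shelf-a, shelf-b".
function shelvesToStatus(shelves: string): GoodreadsReadingStatus {
  const names = shelves
    .split(',')
    .map(s => s.trim().toLowerCase())
    .filter(s => s.length > 0);

  // Check 'currently-reading' first so a book on both shelves counts as in progress.
  if (names.some(n => n === 'currently-reading')) return 'IS_READING';
  if (names.some(n => n === 'read')) return 'FINISHED';
  // Everything else, including an empty value, falls back to WANTS_TO_READ.
  return 'WANTS_TO_READ';
}

const inProgress = shelvesToStatus('currently-reading');
const finished = shelvesToStatus('favorites, read');
const wanted = shelvesToStatus('to-read');
```

Matching whole shelf names rather than substrings avoids treating `to-read` as `read`.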
-```bash
-npm run build
+## Error Handling
+All parsing functions follow the same pattern:
+- Wrap parsing and traversal in a `try/catch`.
+- On error, log a concise entry to `console.error` with context (selectors, item selector).
+- Return the provided `fallback` (default `[]`) instead of throwing.
+Example:
+```ts
+const items = parseSubstackRss('<invalid-xml>', {
+  fallback: [],
+}); // returns [], logs an error, does not crash your app
 ```
-This will generate the production-ready files in the `dist` directory.
+This makes the library safe to use in background jobs, CLI tools, or edge handlers where a single bad feed should not bring down the entire process.
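The try/catch, log, and fallback pattern described in this section can be sketched generically. The wrapper below is an assumption made for illustration: `withFallback`, its parameters, and the log message are hypothetical and are not taken from the package source.

```ts
// Hypothetical wrapper showing the pattern; not the package's own code.
function withFallback<T>(
  parse: () => T[],
  fallback: T[] = [],
  context: Record<string, unknown> = {}
): T[] {
  try {
    return parse();
  } catch (error) {
    // Log with context (e.g. selectors, item selector) instead of throwing.
    console.error('RSS parsing failed', context, error);
    return fallback;
  }
}

// A parser that throws yields the fallback instead of crashing the caller.
const recovered = withFallback<string>(
  (): string[] => {
    throw new Error('malformed feed');
  },
  ['cached item'],
  { itemSelector: 'channel > item' }
);

// A parser that succeeds passes its result through untouched.
const parsed = withFallback<string>(() => ['fresh item']);
```

Passing a previously cached result as `fallback`, rather than an empty array, is one way to keep a page populated while a feed is temporarily broken.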
-## Contributing
+## Extending for Other Feeds
-Contributions are welcome! Feel free to open an issue or submit a pull request if you have any improvements or bug fixes to suggest.
+To support another RSS feed type, you generally:
+1. Define a flat `TRaw` type that contains only string fields.
+2. Call `parseRssItems<TRaw>` with a selector map that matches the feed’s tags.
+3. Map `TRaw` to your domain model in a small wrapper, similar to `parseGoodreadsRss`.
+Example skeleton:
+```ts
+type MyFeedRaw = {
+  title: string;
+  summary: string;
+  link: string;
+};
+type MyFeedItem = {
+  title: string;
+  summary: string;
+  url: string;
+};
+function parseMyFeed(xml: string): MyFeedItem[] {
+  const raw = parseRssItems<MyFeedRaw>(xml, {
+    selectors: {
+      title: 'title',
+      summary: 'summary',
+      link: 'link',
+    },
+  });
+  return raw.map(r => ({
+    title: r.title,
+    summary: r.summary,
+    url: r.link,
+  }));
+}
+```
 ## License
 This project is licensed under the MIT License.
-```