npm - @wargas/crawler - Versions diffs - 0.0.4 → 0.0.5 - Mend

@wargas/crawler 0.0.4 → 0.0.5

Files changed (2)

package/README.md +100 -6
package/package.json +5 -1

package/README.md CHANGED

@@ -1,15 +1,109 @@
-# crawler
+# Crawler
-To install dependencies:
+Utility class for making HTTP requests with:
+* Automatic cookie persistence
+* Automatic HTML parsing
+* DOM manipulation using `linkedom`
+* HTTP client based on got
+---
+# Installation
 ```bash
-bun install
+bun add @wargas/crawler
 ```
-To run:
+or using npm:
 ```bash
-bun run index.ts
+npm install @wargas/crawler
+```
+---
+# Basic usage
+```ts
+import { Crawler } from "@wargas/crawler";
+const crawler = Crawler.factory();
+await crawler.client.get("https://example.com");
+console.log(crawler.html);
+console.log(
+    crawler.document.querySelector("title")?.textContent
+);
+```
+---
+# Persistent cookies
+Cookies are stored automatically in the file:
+```txt
+cookies.json
 ```
-This project was created using `bun init` in bun v1.3.14. [Bun](https://bun.com) is a fast all-in-one JavaScript runtime.
+This allows the session to be kept across runs.
+---
+# Clearing cookies
+```ts
+await crawler.removeAllCookies();
+```
+---
+# Accessing the DOM
+Because the HTML is converted automatically using `linkedom`, browser-like APIs are available:
+```ts
+const links = crawler.document.querySelectorAll("a");
+for (const link of links) {
+    console.log(link.getAttribute("href"));
+}
+```
+---
+# Current settings
+The `got` instance is created with:
+```ts
+followRedirect: false
+```
+In other words:
+* redirects are not followed automatically
+* cookies are persisted
+* HTML is parsed automatically after each response
+---
+# Possible improvements
+* Proxy support
+* Automatic retry
+* Configurable timeout
+* Custom User-Agent
+* Support for digital certificates
+* HTTP2 support
+* Helper methods (`get`, `post`, `login`, etc.)
+* Page caching
+---
+# License
+MIT
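
The new README states that the `got` instance is created with `followRedirect: false`, so a caller that wants to follow a redirect would have to read the `Location` header and issue the next request itself. The sketch below shows one way that step might look. `nextRedirectUrl` is a hypothetical helper and is not part of `@wargas/crawler`; it only assumes that the caller already has the requested URL, the response status code and the raw `Location` header value.

```typescript
// Hypothetical helper, NOT part of @wargas/crawler.
// Given the URL that was requested, the response status code and the raw
// Location header, return the absolute URL to request next, or null when
// there is nothing to follow.
function nextRedirectUrl(
    requestUrl: string,
    statusCode: number,
    location: string | undefined,
): string | null {
    // Status codes that carry a Location header meant to be followed.
    const isRedirect = [301, 302, 303, 307, 308].includes(statusCode);
    if (!isRedirect || !location) {
        return null;
    }
    // Location may be relative, so resolve it against the request URL.
    return new URL(location, requestUrl).toString();
}
```

Assuming `crawler.client.get(url)` resolves to a `got` response exposing `statusCode` and `headers.location`, the result could be passed to `nextRedirectUrl(url, response.statusCode, response.headers.location)` inside a loop with a hop limit, requesting the returned URL until the helper yields `null`.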

package/package.json CHANGED

@@ -1,10 +1,14 @@
 {
   "name": "@wargas/crawler",
-  "version": "0.0.4",
+  "version": "0.0.5",
   "module": "index.ts",
   "main": "dist/index.js",
   "type": "module",
   "private": false,
+  "repository": {
+    "type": "git",
+    "url": "https://github.com/wargas/crawler.git"
+  },
   "scripts": {
     "build": "bun build index.ts --target node --outdir dist",
     "publish": "npm publish --tag latest --access public"