PyPI - hipda - Versions diffs - 0.1.11__tar.gz - Mend

hipda 0.1.11__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

hipda-0.1.11/.github/workflows/python-publish.yml +63 -0
hipda-0.1.11/.gitignore +7 -0
hipda-0.1.11/PKG-INFO +54 -0
hipda-0.1.11/README.md +45 -0
hipda-0.1.11/docs/pypi-release.md +33 -0
hipda-0.1.11/pyproject.toml +20 -0
hipda-0.1.11/src/hipda_cli/__init__.py +5 -0
hipda-0.1.11/src/hipda_cli/auth.py +115 -0
hipda-0.1.11/src/hipda_cli/cli.py +236 -0
hipda-0.1.11/src/hipda_cli/client.py +90 -0
hipda-0.1.11/src/hipda_cli/models.py +24 -0
hipda-0.1.11/src/hipda_cli/parser.py +131 -0
hipda-0.1.11/tests/test_cli.py +180 -0
hipda-0.1.11/tests/test_client.py +99 -0
hipda-0.1.11/tests/test_parser.py +124 -0

hipda-0.1.11/.github/workflows/python-publish.yml ADDED

@@ -0,0 +1,63 @@
+---
+name: Publish Python Package
+"on":
+  release:
+    types: [published]
+permissions:
+  contents: read
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+      - name: Install uv
+        uses: astral-sh/setup-uv@v8.1.0
+        with:
+          enable-cache: true
+      - name: Set up Python
+        uses: actions/setup-python@v6
+        with:
+          python-version: "3.13"
+      - name: Run tests
+        run: uvx --from pytest --with . pytest
+      - name: Build release distributions
+        run: uvx --from build pyproject-build
+      - name: Check release distributions
+        run: uvx --from twine twine check dist/*
+      - name: Upload distributions
+        uses: actions/upload-artifact@v7.0.1
+        with:
+          name: release-dists
+          path: dist/
+  pypi-publish:
+    runs-on: ubuntu-latest
+    needs: build
+    permissions:
+      id-token: write
+    environment:
+      name: pypi
+      url: https://pypi.org/project/hipda/
+    steps:
+      - name: Retrieve release distributions
+        uses: actions/download-artifact@v8.0.1
+        with:
+          name: release-dists
+          path: dist/
+      - name: Publish release distributions to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: dist/

hipda-0.1.11/.gitignore ADDED

@@ -0,0 +1,7 @@
+.env
+.DS_Store
+.pytest_cache/
+__pycache__/
+*.py[cod]
+dist/
+references/

hipda-0.1.11/PKG-INFO ADDED

@@ -0,0 +1,54 @@
+Metadata-Version: 2.4
+Name: hipda
+Version: 0.1.11
+Summary: CLI reader for 4D4Y/HiPDA Discovery forum posts.
+Requires-Python: >=3.11
+Requires-Dist: beautifulsoup4>=4.12
+Requires-Dist: browser-cookie3>=0.20
+Description-Content-Type: text/markdown
+# hipda
+CLI reader for the 4D4Y/HiPDA Discovery channel (`fid=2`).
+The site uses browser/session checks, so direct unauthenticated requests may return a Cloudflare challenge. Log in once through Chrome:
+```bash
+uvx --from . hipda login
+```
+From PyPI:
+```bash
+uvx hipda login
+```
+That opens 4D4Y in Google Chrome. After you finish logging in, return to the terminal and press Enter. Then read Discovery:
+```bash
+uvx --from . hipda list --limit 20
+uvx --from . hipda read 3446553
+```
+From PyPI:
+```bash
+uvx hipda list --limit 20
+uvx hipda read 3446553
+```
+`hipda list` also tries to import automatically if Chrome is already logged in, so most of the time you can skip straight to reading. The old `hipda discovery list` and `hipda discovery read` commands still work.
+The cookie is stored at `~/.config/hipda/cookie` and the user agent is stored at `~/.config/hipda/user-agent`, both with `0600` permissions. You can override them per command with `HIPDA_COOKIE` / `--cookie` and `HIPDA_USER_AGENT` / `--user-agent`.
+You can also pass a browser user agent:
+```bash
+HIPDA_USER_AGENT='Mozilla/5.0 ...' uvx --from . hipda list
+```
+The CLI disables HTTPS certificate verification by default because 4D4Y often fails from Python environments where Chrome still works. To verify certificates, pass a trusted root certificate and `--verify-tls`:
+```bash
+uvx --from . hipda --verify-tls --ca-file /path/to/root-ca.pem list
+```

hipda-0.1.11/README.md ADDED

@@ -0,0 +1,45 @@
+# hipda
+CLI reader for the 4D4Y/HiPDA Discovery channel (`fid=2`).
+The site uses browser/session checks, so direct unauthenticated requests may return a Cloudflare challenge. Log in once through Chrome:
+```bash
+uvx --from . hipda login
+```
+From PyPI:
+```bash
+uvx hipda login
+```
+That opens 4D4Y in Google Chrome. After you finish logging in, return to the terminal and press Enter. Then read Discovery:
+```bash
+uvx --from . hipda list --limit 20
+uvx --from . hipda read 3446553
+```
+From PyPI:
+```bash
+uvx hipda list --limit 20
+uvx hipda read 3446553
+```
+`hipda list` also tries to import automatically if Chrome is already logged in, so most of the time you can skip straight to reading. The old `hipda discovery list` and `hipda discovery read` commands still work.
+The cookie is stored at `~/.config/hipda/cookie` and the user agent is stored at `~/.config/hipda/user-agent`, both with `0600` permissions. You can override them per command with `HIPDA_COOKIE` / `--cookie` and `HIPDA_USER_AGENT` / `--user-agent`.
+You can also pass a browser user agent:
+```bash
+HIPDA_USER_AGENT='Mozilla/5.0 ...' uvx --from . hipda list
+```
+The CLI disables HTTPS certificate verification by default because 4D4Y often fails from Python environments where Chrome still works. To verify certificates, pass a trusted root certificate and `--verify-tls`:
+```bash
+uvx --from . hipda --verify-tls --ca-file /path/to/root-ca.pem list
+```
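
Reviewer note on the override order described in the README: `HipdaClient.from_env` in `client.py` takes the first non-empty value among the command-line flag, the environment variable, and the saved file. The sketch below restates that lookup as a standalone function. `resolve_setting` and its `environ` parameter are hypothetical names introduced for illustration, and unlike `load_cookie` it does not strip a leading `Cookie:` prefix.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_setting(
    flag_value: Optional[str],
    env_name: str,
    saved_path: Path,
    default: str = "",
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first non-empty value: flag, environment, saved file, default."""
    env = os.environ if environ is None else environ
    if flag_value:
        return flag_value
    env_value = env.get(env_name, "")
    if env_value:
        return env_value
    if saved_path.exists():
        saved = saved_path.read_text(encoding="utf-8").strip()
        if saved:
            return saved
    return default


# Demonstration with throwaway values (not real cookies).
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "cookie"
    saved.write_text("sid=from-file\n", encoding="utf-8")
    fake_env = {"HIPDA_COOKIE": "sid=from-env"}
    from_flag = resolve_setting("sid=from-flag", "HIPDA_COOKIE", saved, environ=fake_env)
    from_env = resolve_setting(None, "HIPDA_COOKIE", saved, environ=fake_env)
    from_file = resolve_setting(None, "HIPDA_COOKIE", saved, environ={})
    from_default = resolve_setting(None, "HIPDA_COOKIE", Path(tmp) / "missing", environ={})
```

Passing the environment as a mapping keeps the demonstration from touching the real process environment.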

hipda-0.1.11/docs/pypi-release.md ADDED

@@ -0,0 +1,33 @@
+# PyPI Release
+This project publishes with GitHub Actions trusted publishing. No PyPI token is stored in GitHub secrets.
+## One-time PyPI setup
+Create trusted publishers for repository `cdpath/hipda`.
+If the project already exists on the index, add the publisher from that project's publishing settings. If the project does not exist yet, create a pending trusted publisher from the account publishing page; the first successful workflow run will create the project.
+PyPI:
+- Project: `hipda`
+- Owner: `cdpath`
+- Repository name: `hipda`
+- Workflow name: `python-publish.yml`
+- Environment name: `pypi`
+The workflow uses a GitHub environment named `pypi`. Configure environment protection rules in GitHub if releases should require manual approval.
+## PyPI release
+1. Bump `version` in `pyproject.toml`.
+2. Push the branch to GitHub.
+3. Create and publish a GitHub release.
+4. The release event publishes the package version to PyPI.
+5. Verify the package:
+```bash
+uvx --refresh --from hipda==<version> hipda --help
+```
+Package versions are immutable on PyPI. If a publish partially succeeds, bump the version before retrying.

hipda-0.1.11/pyproject.toml ADDED

@@ -0,0 +1,20 @@
+[project]
+name = "hipda"
+version = "0.1.11"
+description = "CLI reader for 4D4Y/HiPDA Discovery forum posts."
+readme = "README.md"
+requires-python = ">=3.11"
+dependencies = [
+  "beautifulsoup4>=4.12",
+  "browser-cookie3>=0.20",
+]
+[project.scripts]
+hipda = "hipda_cli.cli:main"
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+[tool.hatch.build.targets.wheel]
+packages = ["src/hipda_cli"]

hipda-0.1.11/src/hipda_cli/__init__.py ADDED

@@ -0,0 +1,5 @@
+"""Command-line tools for reading 4D4Y/HiPDA forum content."""
+__all__ = ["__version__"]
+__version__ = "0.1.9"

hipda-0.1.11/src/hipda_cli/auth.py ADDED

@@ -0,0 +1,115 @@
+from __future__ import annotations
+import os
+import plistlib
+import subprocess
+from pathlib import Path
+import browser_cookie3
+LOGIN_URL = "https://www.4d4y.com/forum/forumdisplay.php?fid=2"
+CHROME_INFO_PLIST_PATHS = (
+    Path("/Applications/Google Chrome.app/Contents/Info.plist"),
+    Path.home() / "Applications/Google Chrome.app/Contents/Info.plist",
+)
+def default_cookie_path() -> Path:
+    return _config_path("cookie")
+def default_user_agent_path() -> Path:
+    return _config_path("user-agent")
+def _config_path(name: str) -> Path:
+    config_home = os.environ.get("XDG_CONFIG_HOME")
+    if config_home:
+        return Path(config_home) / "hipda" / name
+    return Path.home() / ".config" / "hipda" / name
+def normalize_cookie(cookie: str) -> str:
+    cookie = cookie.strip()
+    if cookie.lower().startswith("cookie:"):
+        cookie = cookie.split(":", 1)[1].strip()
+    return cookie
+def load_cookie(path: Path | None = None) -> str:
+    cookie_path = path or default_cookie_path()
+    if not cookie_path.exists():
+        return ""
+    return normalize_cookie(cookie_path.read_text(encoding="utf-8"))
+def save_cookie(cookie: str, path: Path | None = None) -> Path:
+    normalized = normalize_cookie(cookie)
+    if not normalized:
+        raise ValueError("cookie is empty")
+    cookie_path = path or default_cookie_path()
+    cookie_path.parent.mkdir(parents=True, exist_ok=True)
+    cookie_path.write_text(normalized + "\n", encoding="utf-8")
+    cookie_path.chmod(0o600)
+    return cookie_path
+def load_user_agent(path: Path | None = None) -> str:
+    user_agent_path = path or default_user_agent_path()
+    if not user_agent_path.exists():
+        return ""
+    return user_agent_path.read_text(encoding="utf-8").strip()
+def save_user_agent(user_agent: str, path: Path | None = None) -> Path:
+    normalized = user_agent.strip()
+    if not normalized:
+        raise ValueError("user-agent is empty")
+    user_agent_path = path or default_user_agent_path()
+    user_agent_path.parent.mkdir(parents=True, exist_ok=True)
+    user_agent_path.write_text(normalized + "\n", encoding="utf-8")
+    user_agent_path.chmod(0o600)
+    return user_agent_path
+def cookie_header_from_browser(domain: str = "4d4y.com") -> str:
+    jar = browser_cookie3.chrome(domain_name=domain)
+    cookies = []
+    for cookie in jar:
+        if cookie.domain.lstrip(".") == domain or cookie.domain.endswith("." + domain):
+            cookies.append(f"{cookie.name}={cookie.value}")
+    return "; ".join(cookies)
+def chrome_user_agent() -> str:
+    major = "147"
+    for plist_path in CHROME_INFO_PLIST_PATHS:
+        if not plist_path.exists():
+            continue
+        with plist_path.open("rb") as file:
+            version = str(plistlib.load(file).get("CFBundleShortVersionString", ""))
+        if version:
+            major = version.split(".", 1)[0]
+            break
+    return (
+        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+        f"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/{major}.0.0.0 Safari/537.36"
+    )
+def import_browser_auth(domain: str = "4d4y.com") -> tuple[str, str]:
+    cookie = cookie_header_from_browser(domain)
+    if not cookie:
+        raise ValueError(f"no {domain} cookies found in Chrome")
+    user_agent = chrome_user_agent()
+    save_cookie(cookie)
+    save_user_agent(user_agent)
+    return cookie, user_agent
+def open_login_page() -> None:
+    subprocess.run(["open", "-a", "Google Chrome", LOGIN_URL], check=False)
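
Reviewer note on `save_cookie` and `save_user_agent`: both write the file first and apply `chmod(0o600)` afterwards, so a newly created file briefly exists with whatever permissions the process umask allows. One possible alternative, shown here as a sketch and not as the package's behavior, is to create the file with mode `0600` from the start. `write_private_text` is a hypothetical helper, and `os.fchmod` assumes a POSIX system, which matches the macOS-specific code in this module.

```python
import os
import tempfile
from pathlib import Path


def write_private_text(path: Path, text: str) -> Path:
    """Write text to path, creating the file with 0600 permissions."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The mode argument only applies when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Tighten a pre-existing file before any secret is written to it.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(text)
    return path


# Demonstration in a temporary directory with a throwaway value.
with tempfile.TemporaryDirectory() as tmp:
    target = write_private_text(Path(tmp) / "hipda" / "cookie", "foo=bar\n")
    mode = target.stat().st_mode & 0o777
    content = target.read_text(encoding="utf-8")
```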

hipda-0.1.11/src/hipda_cli/cli.py ADDED

@@ -0,0 +1,236 @@
+from __future__ import annotations
+import argparse
+import sys
+from .auth import import_browser_auth, open_login_page, save_cookie, save_user_agent
+from .client import BASE_URL, HipdaClient, HipdaClientError
+from .parser import is_login_required_page, parse_forum_listing, parse_thread
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(prog="hipda", description="Read 4D4Y/HiPDA forum posts from the terminal.")
+    parser.add_argument("--cookie", help="Logged-in Cookie header. Defaults to HIPDA_COOKIE.")
+    parser.add_argument("--user-agent", help="User-Agent header. Defaults to HIPDA_USER_AGENT or Chrome-like UA.")
+    parser.add_argument("--ca-file", help="PEM CA bundle to trust for HTTPS. Defaults to HIPDA_CA_FILE.")
+    parser.add_argument(
+        "--insecure-tls",
+        action="store_true",
+        help="Disable HTTPS certificate verification. This is the default for 4D4Y.",
+    )
+    parser.add_argument(
+        "--verify-tls",
+        action="store_true",
+        help="Enable HTTPS certificate verification.",
+    )
+    subparsers = parser.add_subparsers(dest="command", metavar="{login,list,read}")
+    subparsers.add_parser("login", help="Import 4D4Y login cookies from Chrome.")
+    list_parser = subparsers.add_parser("list", help="List Discovery threads.")
+    list_parser.add_argument("--page", type=int, default=1, help="Forum page number.")
+    list_parser.add_argument("--limit", type=int, default=30, help="Maximum number of threads to print.")
+    read_parser = subparsers.add_parser("read", help="Read a thread by tid or URL.")
+    read_parser.add_argument("thread", help="Thread id, or a full viewthread.php URL.")
+    read_parser.add_argument("--page", type=int, default=1, help="Thread page number.")
+    auth = subparsers.add_parser("auth", help=argparse.SUPPRESS)
+    auth_subparsers = auth.add_subparsers(dest="auth_command", required=True)
+    save_cookie_parser = auth_subparsers.add_parser("save-cookie", help="Save a pasted 4D4Y Cookie header.")
+    save_cookie_parser.add_argument("cookie", nargs="?", help="Cookie header value. Reads stdin if omitted.")
+    save_user_agent_parser = auth_subparsers.add_parser("save-user-agent", help="Save the Chrome User-Agent used with the cookie.")
+    save_user_agent_parser.add_argument("user_agent", nargs="?", help="User-Agent value. Reads stdin if omitted.")
+    discovery = subparsers.add_parser("discovery", help=argparse.SUPPRESS)
+    discovery_subparsers = discovery.add_subparsers(dest="discovery_command", required=True)
+    list_parser = discovery_subparsers.add_parser("list", help="List Discovery threads.")
+    list_parser.add_argument("--page", type=int, default=1, help="Forum page number.")
+    list_parser.add_argument("--limit", type=int, default=30, help="Maximum number of threads to print.")
+    read_parser = discovery_subparsers.add_parser("read", help="Read a thread by tid or URL.")
+    read_parser.add_argument("thread", help="Thread id, or a full viewthread.php URL.")
+    read_parser.add_argument("--page", type=int, default=1, help="Thread page number.")
+    subparsers._choices_actions = [
+        action for action in subparsers._choices_actions if action.dest not in {"auth", "discovery"}
+    ]
+    return parser
+def _thread_params(thread: str, page: int) -> dict[str, str | int]:
+    if "tid=" in thread:
+        tid = thread.split("tid=", 1)[1].split("&", 1)[0]
+    else:
+        tid = thread
+    return {"tid": tid, "page": page}
+def load_discovery_page(
+    *,
+    page: int,
+    path: str,
+    params: dict[str, str | int],
+    cookie: str | None,
+    user_agent: str | None,
+    ca_file: str | None,
+    insecure_tls: bool,
+    verify_tls: bool = False,
+) -> tuple[str, HipdaClient]:
+    client = HipdaClient.from_env(
+        cookie=cookie,
+        user_agent=user_agent,
+        ca_file=ca_file,
+        insecure_tls=insecure_tls or not verify_tls,
+        verify_tls=verify_tls,
+    )
+    html = client.get(path, params)
+    if not is_login_required_page(html):
+        return html, client
+    if cookie:
+        return html, client
+    try:
+        imported_cookie, imported_user_agent = import_browser_auth()
+    except Exception:
+        return html, client
+    client = HipdaClient.from_env(
+        cookie=imported_cookie,
+        user_agent=user_agent or imported_user_agent,
+        ca_file=ca_file,
+        insecure_tls=insecure_tls or not verify_tls,
+        verify_tls=verify_tls,
+    )
+    return client.get(path, params), client
+def wait_for_login_confirmation() -> None:
+    if sys.stdin.isatty():
+        input("Log in to 4D4Y in Chrome, then press Enter here...")
+def run(args: argparse.Namespace) -> int:
+    if args.command == "login":
+        try:
+            open_login_page()
+            wait_for_login_confirmation()
+            import_browser_auth()
+        except Exception as exc:
+            print(
+                "hipda: could not import 4D4Y cookies from Chrome. "
+                "Open Chrome, log in to https://www.4d4y.com/forum/forumdisplay.php?fid=2, then run `hipda login` again.",
+                file=sys.stderr,
+            )
+            print(f"hipda: {exc}", file=sys.stderr)
+            return 2
+        print("Imported 4D4Y login from Chrome.")
+        return 0
+    if args.command == "auth" and args.auth_command == "save-cookie":
+        cookie = args.cookie if args.cookie is not None else sys.stdin.read()
+        try:
+            path = save_cookie(cookie)
+        except ValueError as exc:
+            print(f"hipda: {exc}", file=sys.stderr)
+            return 2
+        print(f"Saved cookie to {path}")
+        return 0
+    if args.command == "auth" and args.auth_command == "save-user-agent":
+        user_agent = args.user_agent if args.user_agent is not None else sys.stdin.read()
+        try:
+            path = save_user_agent(user_agent)
+        except ValueError as exc:
+            print(f"hipda: {exc}", file=sys.stderr)
+            return 2
+        print(f"Saved user-agent to {path}")
+        return 0
+    client = HipdaClient.from_env(
+        cookie=args.cookie,
+        user_agent=args.user_agent,
+        ca_file=args.ca_file,
+        insecure_tls=args.insecure_tls or not args.verify_tls,
+        verify_tls=args.verify_tls,
+    )
+    try:
+        command = args.discovery_command if args.command == "discovery" else args.command
+        if command == "list":
+            html, client = load_discovery_page(
+                page=args.page,
+                path="forumdisplay.php",
+                params={"fid": 2, "page": args.page},
+                cookie=args.cookie,
+                user_agent=args.user_agent,
+                ca_file=args.ca_file,
+                insecure_tls=args.insecure_tls or not args.verify_tls,
+                verify_tls=args.verify_tls,
+            )
+            if is_login_required_page(html):
+                print(
+                    "hipda: 4D4Y says this request is not logged in. "
+                    "Open Chrome, log in to 4D4Y, then run `hipda login`.",
+                    file=sys.stderr,
+                )
+                return 2
+            threads = parse_forum_listing(html, base_url=BASE_URL)[: args.limit]
+            for thread in threads:
+                stats = ""
+                if thread.replies is not None and thread.views is not None:
+                    stats = f" {thread.replies}/{thread.views}"
+                last = f" last: {thread.last_author} {thread.last_at}".rstrip() if thread.last_author else ""
+                print(f"{thread.tid}\t{thread.title}\t{thread.author} {thread.created_at}{stats}{last}")
+            return 0
+        if command == "read":
+            html, client = load_discovery_page(
+                page=args.page,
+                path="viewthread.php",
+                params=_thread_params(args.thread, args.page),
+                cookie=args.cookie,
+                user_agent=args.user_agent,
+                ca_file=args.ca_file,
+                insecure_tls=args.insecure_tls or not args.verify_tls,
+                verify_tls=args.verify_tls,
+            )
+            if is_login_required_page(html):
+                print(
+                    "hipda: 4D4Y says this request is not logged in. "
+                    "Open Chrome, log in to 4D4Y, then run `hipda login`.",
+                    file=sys.stderr,
+                )
+                return 2
+            posts = parse_thread(html)
+            for index, post in enumerate(posts, start=1):
+                print(f"#{index} {post.author} {post.published_at}".rstrip())
+                print(post.content)
+                print()
+            return 0
+    except HipdaClientError as exc:
+        print(f"hipda: {exc}", file=sys.stderr)
+        if not client.cookie:
+            print("hipda: set HIPDA_COOKIE or pass --cookie with a logged-in 4D4Y Cookie header.", file=sys.stderr)
+        return 2
+    raise AssertionError(f"Unhandled command: {args}")
+def main(argv: list[str] | None = None) -> int:
+    parser = build_parser()
+    if argv is None:
+        argv = sys.argv[1:]
+    if not argv:
+        parser.print_help()
+        return 0
+    return run(parser.parse_args(argv))
+if __name__ == "__main__":
+    raise SystemExit(main())
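
Reviewer note on `_thread_params`: it extracts the thread id with string splitting on `tid=` and `&`, so a URL that ends in a fragment (for example `...?tid=3446553#pid1`) yields `3446553#pid1` as the id. The sketch below shows a query-string based alternative in the spirit of `_tid_from_href` in `parser.py`. `thread_id` is a hypothetical name, and the URL is an illustrative input that was not taken from the forum.

```python
from urllib.parse import parse_qs, urlparse


def thread_id(thread: str) -> str:
    """Return the tid from a bare id, a 'tid=...' fragment, or a viewthread.php URL."""
    parsed = urlparse(thread)
    # A bare "tid=123&page=2" string has no "?" and lands in the path component.
    query = parsed.query or parsed.path
    values = parse_qs(query).get("tid")
    if values:
        return values[0]
    return thread.strip()


bare = thread_id("3446553")
from_url = thread_id(
    "https://www.4d4y.com/forum/viewthread.php?tid=3446553&extra=page%3D1#pid1"
)
from_query = thread_id("tid=3446553&page=2")
```

Because `urlparse` separates the fragment from the query, the `#pid1` suffix never reaches the extracted id.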

hipda-0.1.11/src/hipda_cli/client.py ADDED

@@ -0,0 +1,90 @@
+from __future__ import annotations
+import os
+import ssl
+from dataclasses import dataclass
+from urllib.error import HTTPError, URLError
+from urllib.parse import urlencode
+from urllib.request import Request, urlopen
+from .auth import load_cookie, load_user_agent
+BASE_URL = "https://www.4d4y.com/forum/"
+DEFAULT_USER_AGENT = (
+    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36"
+)
+class HipdaClientError(RuntimeError):
+    pass
+@dataclass(frozen=True)
+class HipdaClient:
+    cookie: str = ""
+    user_agent: str = DEFAULT_USER_AGENT
+    ca_file: str | None = None
+    insecure_tls: bool = True
+    base_url: str = BASE_URL
+    timeout: float = 20.0
+    @classmethod
+    def from_env(
+        cls,
+        cookie: str | None = None,
+        user_agent: str | None = None,
+        ca_file: str | None = None,
+        insecure_tls: bool = True,
+        verify_tls: bool = False,
+    ) -> "HipdaClient":
+        return cls(
+            cookie=cookie or os.environ.get("HIPDA_COOKIE", "") or load_cookie(),
+            user_agent=user_agent or os.environ.get("HIPDA_USER_AGENT", "") or load_user_agent() or DEFAULT_USER_AGENT,
+            ca_file=ca_file or os.environ.get("HIPDA_CA_FILE"),
+            insecure_tls=(
+                not verify_tls
+                and (insecure_tls or os.environ.get("HIPDA_INSECURE_TLS", "").lower() in {"1", "true", "yes"})
+            ),
+        )
+    def ssl_context(self) -> ssl.SSLContext | None:
+        if self.insecure_tls:
+            context = ssl.create_default_context()
+            context.check_hostname = False
+            context.verify_mode = ssl.CERT_NONE
+            return context
+        if self.ca_file:
+            return ssl.create_default_context(cafile=self.ca_file)
+        return None
+    def get(self, path: str, params: dict[str, str | int] | None = None) -> str:
+        url = self.base_url + path
+        if params:
+            url = f"{url}?{urlencode(params)}"
+        headers = {
+            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
+            "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
+            "User-Agent": self.user_agent,
+        }
+        if self.cookie:
+            headers["Cookie"] = self.cookie
+        try:
+            with urlopen(Request(url, headers=headers), timeout=self.timeout, context=self.ssl_context()) as response:
+                body = response.read()
+                encoding = response.headers.get_content_charset() or "utf-8"
+                return body.decode(encoding, errors="replace")
+        except HTTPError as exc:
+            body = exc.read().decode("utf-8", errors="replace")[:300]
+            raise HipdaClientError(f"HTTP {exc.code} fetching {url}: {body}") from exc
+        except URLError as exc:
+            if isinstance(exc.reason, ssl.SSLCertVerificationError):
+                raise HipdaClientError(
+                    f"Could not verify TLS certificate for {url}: {exc.reason}. "
+                    "If you use a trusted local proxy, pass --ca-file /path/to/root.pem. "
+                    "As a last resort, pass --insecure-tls."
+                ) from exc
+            raise HipdaClientError(f"Could not fetch {url}: {exc.reason}") from exc
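
Reviewer note on the TLS flags: `cli.py` passes `insecure_tls=args.insecure_tls or not args.verify_tls`, and `from_env` then computes `not verify_tls and (insecure_tls or HIPDA_INSECURE_TLS is truthy)`. The sketch below composes those two expressions and enumerates every combination. `effective_insecure_tls` is a hypothetical function that mirrors the logic shown in the diff and is not code from the package.

```python
from itertools import product


def effective_insecure_tls(insecure_flag: bool, verify_flag: bool, env_value: str = "") -> bool:
    """Mirror how the CLI path decides whether certificate checks are disabled."""
    # Value that cli.py passes as insecure_tls.
    cli_value = insecure_flag or not verify_flag
    # Environment check performed inside HipdaClient.from_env.
    env_insecure = env_value.lower() in {"1", "true", "yes"}
    return (not verify_flag) and (cli_value or env_insecure)


# Keys are (--insecure-tls, --verify-tls, HIPDA_INSECURE_TLS).
table = {
    (insecure, verify, env): effective_insecure_tls(insecure, verify, env)
    for insecure, verify, env in product([False, True], [False, True], ["", "1"])
}
for key, value in table.items():
    print(key, "->", "verification disabled" if value else "verification enabled")
```

On the CLI path the outcome depends only on `--verify-tls`: verification is off unless that flag is given, and neither `--insecure-tls` nor `HIPDA_INSECURE_TLS` changes the result.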

hipda-0.1.11/src/hipda_cli/models.py ADDED

@@ -0,0 +1,24 @@
+from __future__ import annotations
+from dataclasses import dataclass
+@dataclass(frozen=True)
+class ThreadSummary:
+    tid: str
+    title: str
+    url: str
+    author: str = ""
+    created_at: str = ""
+    replies: int | None = None
+    views: int | None = None
+    last_author: str = ""
+    last_at: str = ""
+@dataclass(frozen=True)
+class Post:
+    author: str
+    published_at: str
+    content: str

hipda-0.1.11/src/hipda_cli/parser.py ADDED

@@ -0,0 +1,131 @@
+from __future__ import annotations
+import re
+from urllib.parse import parse_qs, urljoin, urlparse
+from bs4 import BeautifulSoup
+from .models import Post, ThreadSummary
+THREAD_RE = re.compile(r"(?:^|[?&])tid=(\d+)")
+WHITESPACE_RE = re.compile(r"[ \t\r\f\v]+")
+def clean_text(value: str) -> str:
+    lines = []
+    for line in value.replace("\xa0", " ").splitlines():
+        cleaned = WHITESPACE_RE.sub(" ", line).strip()
+        if cleaned:
+            lines.append(cleaned)
+    return "\n".join(lines)
+def _tid_from_href(href: str) -> str | None:
+    parsed = urlparse(href)
+    tid = parse_qs(parsed.query).get("tid", [None])[0]
+    if tid:
+        return tid
+    match = THREAD_RE.search(href)
+    return match.group(1) if match else None
+def _split_cell_lines(cell) -> list[str]:
+    return clean_text(cell.get_text("\n")).splitlines()
+def _parse_counts(value: str) -> tuple[int | None, int | None]:
+    match = re.search(r"(\d+)\s*/\s*(\d+)", value)
+    if not match:
+        return None, None
+    return int(match.group(1)), int(match.group(2))
+def parse_forum_listing(html: str, base_url: str) -> list[ThreadSummary]:
+    soup = BeautifulSoup(html, "html.parser")
+    threads: list[ThreadSummary] = []
+    seen: set[str] = set()
+    for anchor in soup.find_all("a", href=True):
+        href = anchor["href"]
+        if "viewthread.php" not in href:
+            continue
+        tid = _tid_from_href(href)
+        title = clean_text(anchor.get_text())
+        if not tid or not title or tid in seen:
+            continue
+        row = anchor.find_parent("tr")
+        author = created_at = last_author = last_at = ""
+        replies = views = None
+        if row:
+            cells = row.find_all(["td", "th"], recursive=False)
+            anchor_cell = anchor.find_parent(["td", "th"])
+            anchor_cell_index = cells.index(anchor_cell) if anchor_cell in cells else -1
+            trailing_cells = cells[anchor_cell_index + 1 :] if anchor_cell_index >= 0 else []
+            if trailing_cells:
+                author_lines = _split_cell_lines(trailing_cells[0])
+                author = author_lines[0] if author_lines else ""
+                created_at = author_lines[1] if len(author_lines) > 1 else ""
+            if len(trailing_cells) > 1:
+                replies, views = _parse_counts(clean_text(trailing_cells[1].get_text(" ")))
+            if len(trailing_cells) > 2:
+                last_lines = _split_cell_lines(trailing_cells[2])
+                last_author = last_lines[0] if last_lines else ""
+                last_at = last_lines[1] if len(last_lines) > 1 else ""
+        seen.add(tid)
+        threads.append(
+            ThreadSummary(
+                tid=tid,
+                title=title,
+                url=urljoin(base_url, href),
+                author=author,
+                created_at=created_at,
+                replies=replies,
+                views=views,
+                last_author=last_author,
+                last_at=last_at,
+            )
+        )
+    return threads
+def is_login_required_page(html: str) -> bool:
+    text = clean_text(BeautifulSoup(html, "html.parser").get_text("\n"))
+    return "您还未登录" in text or "无权访问该版块" in text
+def parse_thread(html: str) -> list[Post]:
+    soup = BeautifulSoup(html, "html.parser")
+    posts: list[Post] = []
+    for container in soup.find_all(id=re.compile(r"^post_\d+")):
+        message = container.select_one(".t_msgfont") or container.select_one("[id^=postmessage_]")
+        if not message:
+            continue
+        author_node = container.select_one(".postauthor > .postinfo a") or container.select_one(".postauthor a")
+        if not author_node:
+            fallback_author_node = container.select_one(".postauthor")
+            if fallback_author_node and fallback_author_node.name != "td":
+                author_node = fallback_author_node
+        info_node = (
+            container.select_one(".authorinfo [id^=authorposton]")
+            or container.select_one(".postcontent .postinfo")
+            or container.select_one(".postinfo")
+        )
+        info_text = clean_text(info_node.get_text("\n")) if info_node else ""
+        published_at = re.sub(r"^发表于\s*", "", info_text.splitlines()[0]).strip() if info_text else ""
+        posts.append(
+            Post(
+                author=clean_text(author_node.get_text()) if author_node else "",
+                published_at=published_at,
+                content=clean_text(message.get_text("\n")),
+            )
+        )
+    return posts
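
Reviewer note on the text helpers: `clean_text` and `_parse_counts` do most of the normalization that `parse_forum_listing` relies on, and their behavior is easy to check without BeautifulSoup. The snippet below copies both functions from the diff (with `Optional` type hints) and runs them on made-up cell text; the sample strings are illustrative and were not captured from the forum.

```python
import re
from typing import Optional, Tuple

WHITESPACE_RE = re.compile(r"[ \t\r\f\v]+")


def clean_text(value: str) -> str:
    """Collapse runs of whitespace per line and drop empty lines."""
    lines = []
    for line in value.replace("\xa0", " ").splitlines():
        cleaned = WHITESPACE_RE.sub(" ", line).strip()
        if cleaned:
            lines.append(cleaned)
    return "\n".join(lines)


def _parse_counts(value: str) -> Tuple[Optional[int], Optional[int]]:
    """Parse a 'replies / views' cell; return (None, None) when it does not match."""
    match = re.search(r"(\d+)\s*/\s*(\d+)", value)
    if not match:
        return None, None
    return int(match.group(1)), int(match.group(2))


# Hypothetical author cell: name on one line, timestamp on the next.
author_cell = clean_text("  alice\xa0 \n\n 2024-1-2\t10:00 ")
counts = _parse_counts("12 / 345")
no_counts = _parse_counts("-")
```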

hipda-0.1.11/tests/test_cli.py ADDED

@@ -0,0 +1,180 @@
+from hipda_cli.auth import default_cookie_path, default_user_agent_path, load_cookie, load_user_agent, save_cookie, save_user_agent
+from hipda_cli.cli import build_parser, load_discovery_page, main
+def test_build_parser_accepts_discovery_list_options():
+    parser = build_parser()
+    args = parser.parse_args(["--ca-file", "/tmp/root.pem", "--verify-tls", "discovery", "list", "--page", "2", "--limit", "5"])
+    assert args.command == "discovery"
+    assert args.discovery_command == "list"
+    assert args.ca_file == "/tmp/root.pem"
+    assert args.verify_tls is True
+    assert args.page == 2
+    assert args.limit == 5
+def test_help_hides_legacy_commands(capsys):
+    parser = build_parser()
+    parser.print_help()
+    out = capsys.readouterr().out
+    assert "{login,list,read}" in out
+    assert "auth" not in out
+    assert "discovery" not in out
+def test_main_without_subcommand_prints_help(capsys):
+    assert main([]) == 0
+    out = capsys.readouterr().out
+    assert "usage: hipda" in out
+    assert "{login,list,read}" in out
+def test_main_without_subcommand_from_sys_argv_prints_help(monkeypatch, capsys):
+    monkeypatch.setattr("sys.argv", ["hipda"])
+    assert main() == 0
+    out = capsys.readouterr().out
+    assert "usage: hipda" in out
+    assert "{login,list,read}" in out
+def test_build_parser_accepts_top_level_list_options():
+    parser = build_parser()
+    args = parser.parse_args(["list", "--page", "2", "--limit", "5"])
+    assert args.command == "list"
+    assert args.page == 2
+    assert args.limit == 5
+def test_build_parser_accepts_discovery_read_tid():
+    parser = build_parser()
+    args = parser.parse_args(["discovery", "read", "3446553", "--page", "3"])
+    assert args.discovery_command == "read"
+    assert args.thread == "3446553"
+    assert args.page == 3
+def test_build_parser_accepts_top_level_read_tid():
+    parser = build_parser()
+    args = parser.parse_args(["read", "3446553", "--page", "3"])
+    assert args.command == "read"
+    assert args.thread == "3446553"
+    assert args.page == 3
+def test_build_parser_accepts_auth_save_cookie():
+    parser = build_parser()
+    args = parser.parse_args(["auth", "save-cookie", "foo=bar; baz=qux"])
+    assert args.command == "auth"
+    assert args.auth_command == "save-cookie"
+    assert args.cookie == "foo=bar; baz=qux"
+def test_build_parser_accepts_auth_save_user_agent():
+    parser = build_parser()
+    args = parser.parse_args(["auth", "save-user-agent", "Mozilla/5.0 Chrome/147.0.0.0"])
+    assert args.command == "auth"
+    assert args.auth_command == "save-user-agent"
+    assert args.user_agent == "Mozilla/5.0 Chrome/147.0.0.0"
+def test_build_parser_accepts_login():
+    parser = build_parser()
+    args = parser.parse_args(["login"])
+    assert args.command == "login"
+def test_run_login_opens_chrome_then_imports(monkeypatch, capsys):
+    events = []
+    parser = build_parser()
+    args = parser.parse_args(["login"])
+    monkeypatch.setattr("hipda_cli.cli.open_login_page", lambda: events.append("open"))
+    monkeypatch.setattr("hipda_cli.cli.wait_for_login_confirmation", lambda: events.append("wait"))
+    monkeypatch.setattr("hipda_cli.cli.import_browser_auth", lambda: events.append("import") or ("cookie", "ua"))
+    from hipda_cli.cli import run
+    assert run(args) == 0
+    assert events == ["open", "wait", "import"]
+    assert "Imported 4D4Y login from Chrome." in capsys.readouterr().out
+def test_save_cookie_strips_cookie_prefix_and_uses_private_permissions(tmp_path):
+    cookie_path = tmp_path / "cookie"
+    save_cookie("Cookie: foo=bar; baz=qux\n", cookie_path)
+    assert load_cookie(cookie_path) == "foo=bar; baz=qux"
+    assert oct(cookie_path.stat().st_mode & 0o777) == "0o600"
+def test_default_cookie_path_uses_xdg_config_home(monkeypatch, tmp_path):
+    monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path))
+    assert default_cookie_path() == tmp_path / "hipda" / "cookie"
+def test_save_user_agent_round_trips(tmp_path):
+    path = tmp_path / "user-agent"
+    save_user_agent("Mozilla/5.0 Chrome/147.0.0.0\n", path)
+    assert load_user_agent(path) == "Mozilla/5.0 Chrome/147.0.0.0"
+    assert oct(path.stat().st_mode & 0o777) == "0o600"
+def test_default_user_agent_path_uses_xdg_config_home(monkeypatch, tmp_path):
+    monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path))
+    assert default_user_agent_path() == tmp_path / "hipda" / "user-agent"
+def test_load_discovery_page_imports_browser_auth_when_saved_auth_is_missing(monkeypatch, tmp_path):
+    monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path))
+    calls = []
+    class FakeClient:
+        def __init__(self, cookie=""):
+            self.cookie = cookie
+        def get(self, path, params):
+            calls.append((self.cookie, path, params))
+            return "<html><a href='viewthread.php?tid=1'>ok</a></html>" if self.cookie else "您还未登录"
+    monkeypatch.setattr("hipda_cli.cli.HipdaClient.from_env", lambda **kwargs: FakeClient(kwargs.get("cookie") or load_cookie()))
+    monkeypatch.setattr("hipda_cli.cli.import_browser_auth", lambda: ("cdb_auth=abc", "Mozilla/5.0 Chrome/147.0.0.0"))
+    html, client = load_discovery_page(
+        page=1,
+        path="forumdisplay.php",
+        params={"fid": 2, "page": 1},
+        cookie=None,
+        user_agent=None,
+        ca_file=None,
+        insecure_tls=False,
+    )
+    assert "viewthread.php" in html
+    assert client.cookie == "cdb_auth=abc"
+    assert calls == [
+        ("", "forumdisplay.php", {"fid": 2, "page": 1}),
+        ("cdb_auth=abc", "forumdisplay.php", {"fid": 2, "page": 1}),
+    ]
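
Note on the last test above: it pins a fetch, detect, import, retry sequence. The sketch below is one way such a fallback could be written. It is not the package's `load_discovery_page`; the names `fetch_with_auth_fallback`, `fetch`, and `import_auth` are hypothetical, and only the marker string and the expected call order are taken from the test.

```python
from typing import Callable, Tuple

# Marker text taken from the test fixture ("you are not logged in yet").
LOGIN_REQUIRED_MARKER = "您还未登录"


def fetch_with_auth_fallback(
    fetch: Callable[[str], str],
    saved_cookie: str,
    import_auth: Callable[[], Tuple[str, str]],
) -> Tuple[str, str]:
    """Return (html, cookie_used), importing browser auth once if needed.

    `fetch` and `import_auth` are hypothetical callables standing in for
    the HTTP client and the browser-cookie importer.
    """
    html = fetch(saved_cookie)
    if LOGIN_REQUIRED_MARKER not in html:
        # The saved cookie was good enough; no browser import needed.
        return html, saved_cookie
    # First attempt hit the login wall: import auth and retry exactly once.
    cookie, _user_agent = import_auth()
    return fetch(cookie), cookie
```

Retrying only once keeps a stale or rejected browser cookie from causing a loop; the caller still has to handle a second login-required page.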

hipda-0.1.11/tests/test_client.py ADDED

@@ -0,0 +1,99 @@
+import ssl
+from http.cookiejar import Cookie, CookieJar
+from hipda_cli.auth import LOGIN_URL, chrome_user_agent, cookie_header_from_browser, open_login_page
+from hipda_cli.client import HipdaClient
+def test_from_env_accepts_ca_file_and_insecure_tls():
+    client = HipdaClient.from_env(cookie="a=b", ca_file="/tmp/root.pem", insecure_tls=True)
+    assert client.ca_file == "/tmp/root.pem"
+    assert client.insecure_tls is True
+def test_from_env_disables_tls_verification_by_default():
+    client = HipdaClient.from_env(cookie="a=b")
+    assert client.insecure_tls is True
+def test_from_env_can_verify_tls():
+    client = HipdaClient.from_env(cookie="a=b", verify_tls=True)
+    assert client.insecure_tls is False
+def test_ssl_context_uses_ca_file_and_verifies_tls(monkeypatch):
+    calls = {}
+    def fake_create_default_context(*, cafile=None):
+        calls["cafile"] = cafile
+        return "context"
+    monkeypatch.setattr(ssl, "create_default_context", fake_create_default_context)
+    assert HipdaClient(ca_file="/tmp/root.pem", insecure_tls=False).ssl_context() == "context"
+    assert calls == {"cafile": "/tmp/root.pem"}
+def test_ssl_context_disables_verification_by_default():
+    context = HipdaClient().ssl_context()
+    assert context.check_hostname is False
+    assert context.verify_mode == ssl.CERT_NONE
+def test_cookie_header_from_browser_filters_4d4y_cookies(monkeypatch):
+    jar = CookieJar()
+    jar.set_cookie(_cookie("cdb_auth", "abc", ".4d4y.com"))
+    jar.set_cookie(_cookie("cf_clearance", "def", "www.4d4y.com"))
+    jar.set_cookie(_cookie("other", "nope", ".example.com"))
+    monkeypatch.setattr("hipda_cli.auth.browser_cookie3.chrome", lambda domain_name: jar)
+    assert cookie_header_from_browser("4d4y.com") == "cdb_auth=abc; cf_clearance=def"
+def test_chrome_user_agent_uses_chrome_version_from_plist(monkeypatch, tmp_path):
+    import plistlib
+    plist = tmp_path / "Info.plist"
+    plist.write_bytes(plistlib.dumps({"CFBundleShortVersionString": "147.0.1.2"}))
+    monkeypatch.setattr("hipda_cli.auth.CHROME_INFO_PLIST_PATHS", (plist,))
+    assert chrome_user_agent() == (
+        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36"
+    )
+def test_open_login_page_uses_google_chrome_on_macos(monkeypatch):
+    calls = []
+    monkeypatch.setattr("hipda_cli.auth.subprocess.run", lambda command, check: calls.append((command, check)))
+    open_login_page()
+    assert calls == [(["open", "-a", "Google Chrome", LOGIN_URL], False)]
+def _cookie(name: str, value: str, domain: str) -> Cookie:
+    return Cookie(
+        version=0,
+        name=name,
+        value=value,
+        port=None,
+        port_specified=False,
+        domain=domain,
+        domain_specified=True,
+        domain_initial_dot=domain.startswith("."),
+        path="/",
+        path_specified=True,
+        secure=True,
+        expires=None,
+        discard=False,
+        comment=None,
+        comment_url=None,
+        rest={},
+    )
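
Note on the TLS tests above: they assert that verification is off by default (`check_hostname is False`, `verify_mode == ssl.CERT_NONE`) and that a CA file is passed to `ssl.create_default_context` when verification is on. A minimal sketch of a context builder with that behaviour follows. The function name `build_ssl_context` is hypothetical and this is not the package's `HipdaClient.ssl_context`; only the standard-library `ssl` calls are real.

```python
import ssl
from typing import Optional


def build_ssl_context(
    ca_file: Optional[str] = None,
    insecure_tls: bool = True,
) -> ssl.SSLContext:
    """Build an SSL context matching the behaviour the tests describe."""
    if not insecure_tls:
        # Verifying path: default trust store, or the given CA bundle.
        return ssl.create_default_context(cafile=ca_file)
    context = ssl.create_default_context()
    # check_hostname must be cleared before verify_mode is set to
    # CERT_NONE; the reverse order raises ValueError.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

An unverified default exposes the session cookie to interception on untrusted networks, so callers who can verify the certificate would likely want the `verify_tls=True` path exercised in `test_from_env_can_verify_tls`.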

hipda-0.1.11/tests/test_parser.py ADDED

@@ -0,0 +1,124 @@
+from hipda_cli.parser import is_login_required_page, parse_forum_listing, parse_thread
+LISTING_HTML = """
+<html><body>
+<table>
+  <tr class="thread">
+    <th><a href="viewthread.php?tid=3446553&extra=page%3D1">《放开那个女巫》动画做得还不错。</a></th>
+    <td class="author">老兵-猫族<br>2026-5-7</td>
+    <td class="nums">11/777</td>
+    <td class="lastpost">leeice<br>2026-5-8 11:56</td>
+  </tr>
+  <tr>
+    <th><a href="viewthread.php?tid=3447001">iPhone13PM手机更换电池，现在最好的姿势是啥？</a></th>
+    <td>死老妖<br>2026-5-8</td>
+    <td>0/4</td>
+    <td>死老妖<br>2026-5-8 11:55</td>
+  </tr>
+</table>
+</body></html>
+"""
+REALISTIC_LISTING_HTML = """
+<html><body>
+<tbody id="normalthread_3057651">
+<tr>
+  <td class="folder"><a href="viewthread.php?tid=3057651&amp;extra=page%3D1"><img src="folder.gif"></a></td>
+  <td class="icon">&nbsp;</td>
+  <th class="subject lock">
+    <span id="thread_3057651"><a href="viewthread.php?tid=3057651&amp;extra=page%3D1">hipda已迁移到新域名4d4y</a></span>
+  </th>
+  <td class="author"><cite><a href="space.php?uid=29">4d4y</a></cite><em>2022-6-13</em></td>
+  <td class="nums"><strong>0</strong>/<em>90106</em></td>
+  <td class="lastpost"><cite><a href="space.php?username=4d4y">4d4y</a></cite><em><a>2022-6-13 22:57</a></em></td>
+</tr>
+</tbody>
+</body></html>
+"""
+THREAD_HTML = """
+<html><body>
+<div id="post_1">
+  <div class="postauthor">老兵-猫族</div>
+  <div class="postinfo">发表于 2026-5-7 19:12</div>
+  <td class="t_msgfont">动画做得还不错。<br>节奏可以。</td>
+</div>
+<div id="post_2">
+  <div class="postauthor">leeice</div>
+  <div class="postinfo">发表于 2026-5-8 11:56</div>
+  <td class="t_msgfont">谢谢推荐</td>
+</div>
+</body></html>
+"""
+REALISTIC_THREAD_HTML = """
+<html><body>
+<div id="post_74215032">
+  <table><tr>
+    <td class="postauthor">
+      <div class="postinfo"><a href="space.php?uid=277860">死老妖</a></div>
+      <dl class="profile"><dt>UID</dt><dd>277860</dd></dl>
+    </td>
+    <td class="postcontent">
+      <div class="postinfo">
+        <div class="authorinfo"><em id="authorposton74215032">发表于 2026-5-8 11:55</em></div>
+      </div>
+      <div class="postmessage firstpost">
+        <td class="t_msgfont" id="postmessage_74215032">不要弹窗、要大容量</td>
+      </div>
+    </td>
+  </tr></table>
+</div>
+</body></html>
+"""
+def test_parse_forum_listing_extracts_threads_and_stats():
+    threads = parse_forum_listing(LISTING_HTML, base_url="https://www.4d4y.com/forum/")
+    assert [thread.tid for thread in threads] == ["3446553", "3447001"]
+    assert threads[0].title == "《放开那个女巫》动画做得还不错。"
+    assert threads[0].author == "老兵-猫族"
+    assert threads[0].replies == 11
+    assert threads[0].views == 777
+    assert threads[0].last_author == "leeice"
+    assert threads[0].url == "https://www.4d4y.com/forum/viewthread.php?tid=3446553&extra=page%3D1"
+def test_parse_forum_listing_uses_subject_link_when_icon_link_shares_tid():
+    threads = parse_forum_listing(REALISTIC_LISTING_HTML, base_url="https://www.4d4y.com/forum/")
+    assert len(threads) == 1
+    assert threads[0].tid == "3057651"
+    assert threads[0].title == "hipda已迁移到新域名4d4y"
+    assert threads[0].author == "4d4y"
+    assert threads[0].created_at == "2022-6-13"
+    assert threads[0].replies == 0
+    assert threads[0].views == 90106
+    assert threads[0].last_author == "4d4y"
+    assert threads[0].last_at == "2022-6-13 22:57"
+def test_parse_thread_extracts_posts():
+    posts = parse_thread(THREAD_HTML)
+    assert [post.author for post in posts] == ["老兵-猫族", "leeice"]
+    assert posts[0].published_at == "2026-5-7 19:12"
+    assert posts[0].content == "动画做得还不错。\n节奏可以。"
+def test_parse_thread_uses_discuz_author_and_date_without_sidebar_noise():
+    posts = parse_thread(REALISTIC_THREAD_HTML)
+    assert len(posts) == 1
+    assert posts[0].author == "死老妖"
+    assert posts[0].published_at == "2026-5-8 11:55"
+    assert posts[0].content == "不要弹窗、要大容量"
+def test_detects_login_required_page():
+    html = "<html><title>提示信息</title><body>对不起，您还未登录，无权访问该版块。</body></html>"
+    assert is_login_required_page(html) is True
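
Note on the listing tests above: they expect a thread id taken from the `tid` query parameter, an absolute URL built from `base_url`, and integer reply/view counts from a cell such as `11/777`. The sketch below shows how those two pieces could be derived with the standard library alone. The helpers `thread_ref` and `split_nums` are hypothetical and are not taken from `hipda_cli.parser`, which this diff section does not show.

```python
from typing import Tuple
from urllib.parse import parse_qs, urljoin, urlsplit


def thread_ref(href: str, base_url: str) -> Tuple[str, str]:
    """Return (tid, absolute_url) for a relative viewthread.php link."""
    url = urljoin(base_url, href)
    # parse_qs maps each key to a list of values; take the first tid, if any.
    tid = parse_qs(urlsplit(url).query).get("tid", [""])[0]
    return tid, url


def split_nums(text: str) -> Tuple[int, int]:
    """Split a 'replies/views' cell such as '11/777' into two integers."""
    replies, _, views = text.partition("/")
    return int(replies.strip()), int(views.strip())
```

Reading `tid` from the parsed query rather than with a regular expression means that extra parameters such as `extra=page%3D1` need no special handling.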