hipda 0.1.11__tar.gz

@@ -0,0 +1,63 @@
+ ---
+ name: Publish Python Package
+
+ "on":
+   release:
+     types: [published]
+
+ permissions:
+   contents: read
+
+ jobs:
+   build:
+     runs-on: ubuntu-latest
+
+     steps:
+       - uses: actions/checkout@v6
+
+       - name: Install uv
+         uses: astral-sh/setup-uv@v8.1.0
+         with:
+           enable-cache: true
+
+       - name: Set up Python
+         uses: actions/setup-python@v6
+         with:
+           python-version: "3.13"
+
+       - name: Run tests
+         run: uvx --from pytest --with . pytest
+
+       - name: Build release distributions
+         run: uvx --from build pyproject-build
+
+       - name: Check release distributions
+         run: uvx --from twine twine check dist/*
+
+       - name: Upload distributions
+         uses: actions/upload-artifact@v7.0.1
+         with:
+           name: release-dists
+           path: dist/
+
+   pypi-publish:
+     runs-on: ubuntu-latest
+     needs: build
+     permissions:
+       id-token: write
+
+     environment:
+       name: pypi
+       url: https://pypi.org/project/hipda/
+
+     steps:
+       - name: Retrieve release distributions
+         uses: actions/download-artifact@v8.0.1
+         with:
+           name: release-dists
+           path: dist/
+
+       - name: Publish release distributions to PyPI
+         uses: pypa/gh-action-pypi-publish@release/v1
+         with:
+           packages-dir: dist/
@@ -0,0 +1,7 @@
+ .env
+ .DS_Store
+ .pytest_cache/
+ __pycache__/
+ *.py[cod]
+ dist/
+ references/
hipda-0.1.11/PKG-INFO ADDED
@@ -0,0 +1,54 @@
+ Metadata-Version: 2.4
+ Name: hipda
+ Version: 0.1.11
+ Summary: CLI reader for 4D4Y/HiPDA Discovery forum posts.
+ Requires-Python: >=3.11
+ Requires-Dist: beautifulsoup4>=4.12
+ Requires-Dist: browser-cookie3>=0.20
+ Description-Content-Type: text/markdown
+
+ # hipda
+
+ CLI reader for the 4D4Y/HiPDA Discovery channel (`fid=2`).
+
+ The site uses browser/session checks, so direct unauthenticated requests may return a Cloudflare challenge. Log in once through Chrome:
+
+ ```bash
+ uvx --from . hipda login
+ ```
+
+ From PyPI:
+
+ ```bash
+ uvx hipda login
+ ```
+
+ That opens 4D4Y in Google Chrome. After you finish logging in, return to the terminal and press Enter. Then read Discovery:
+
+ ```bash
+ uvx --from . hipda list --limit 20
+ uvx --from . hipda read 3446553
+ ```
+
+ From PyPI:
+
+ ```bash
+ uvx hipda list --limit 20
+ uvx hipda read 3446553
+ ```
+
+ `hipda list` and `hipda read` also try to import the login cookies from Chrome automatically if Chrome is already logged in, so most of the time you can skip straight to reading. The old `hipda discovery list` and `hipda discovery read` commands still work.
+
+ The cookie is stored at `~/.config/hipda/cookie` and the user agent at `~/.config/hipda/user-agent` (under `$XDG_CONFIG_HOME/hipda/` instead when `XDG_CONFIG_HOME` is set), both with `0600` permissions. You can override them per command with `HIPDA_COOKIE` / `--cookie` and `HIPDA_USER_AGENT` / `--user-agent`.
+
+ For example, to pass a browser user agent through the environment:
+
+ ```bash
+ HIPDA_USER_AGENT='Mozilla/5.0 ...' uvx --from . hipda list
+ ```
+
+ The CLI disables HTTPS certificate verification by default because certificate verification for 4D4Y often fails in Python environments where Chrome still works. To verify certificates, pass `--verify-tls` along with a trusted root certificate via `--ca-file`:
+
+ ```bash
+ uvx --from . hipda --verify-tls --ca-file /path/to/root-ca.pem list
+ ```
hipda-0.1.11/README.md ADDED
@@ -0,0 +1,45 @@
+ # hipda
+
+ CLI reader for the 4D4Y/HiPDA Discovery channel (`fid=2`).
+
+ The site uses browser/session checks, so direct unauthenticated requests may return a Cloudflare challenge. Log in once through Chrome:
+
+ ```bash
+ uvx --from . hipda login
+ ```
+
+ From PyPI:
+
+ ```bash
+ uvx hipda login
+ ```
+
+ That opens 4D4Y in Google Chrome. After you finish logging in, return to the terminal and press Enter. Then read Discovery:
+
+ ```bash
+ uvx --from . hipda list --limit 20
+ uvx --from . hipda read 3446553
+ ```
+
+ From PyPI:
+
+ ```bash
+ uvx hipda list --limit 20
+ uvx hipda read 3446553
+ ```
+
+ `hipda list` and `hipda read` also try to import the login cookies from Chrome automatically if Chrome is already logged in, so most of the time you can skip straight to reading. The old `hipda discovery list` and `hipda discovery read` commands still work.
+
+ The cookie is stored at `~/.config/hipda/cookie` and the user agent at `~/.config/hipda/user-agent` (under `$XDG_CONFIG_HOME/hipda/` instead when `XDG_CONFIG_HOME` is set), both with `0600` permissions. You can override them per command with `HIPDA_COOKIE` / `--cookie` and `HIPDA_USER_AGENT` / `--user-agent`.
+
+ For example, to pass a browser user agent through the environment:
+
+ ```bash
+ HIPDA_USER_AGENT='Mozilla/5.0 ...' uvx --from . hipda list
+ ```
+
+ The CLI disables HTTPS certificate verification by default because certificate verification for 4D4Y often fails in Python environments where Chrome still works. To verify certificates, pass `--verify-tls` along with a trusted root certificate via `--ca-file`:
+
+ ```bash
+ uvx --from . hipda --verify-tls --ca-file /path/to/root-ca.pem list
+ ```
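The override order the README describes (command-line flag first, then environment variable, then the saved file) can be sketched as a small helper. This is a minimal illustration rather than the package's own code: `resolve_setting` and its parameters are hypothetical names, and the environment is passed in as a plain mapping so the example runs without touching real files.

```python
from __future__ import annotations

from collections.abc import Mapping


def resolve_setting(
    flag_value: str | None,
    env: Mapping[str, str],
    env_name: str,
    saved_value: str = "",
    default: str = "",
) -> str:
    """Return the first non-empty value: flag, environment, saved file, default."""
    # `or` skips both None and empty strings, so an empty flag falls through.
    return flag_value or env.get(env_name, "") or saved_value or default


if __name__ == "__main__":
    env = {"HIPDA_COOKIE": "from-env"}
    print(resolve_setting("from-flag", env, "HIPDA_COOKIE", "from-file"))
    print(resolve_setting(None, env, "HIPDA_COOKIE", "from-file"))
    print(resolve_setting(None, {}, "HIPDA_COOKIE", "from-file"))
```

Passing the environment as an argument keeps the helper easy to test; a real caller would presumably hand in `os.environ`.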
@@ -0,0 +1,33 @@
+ # PyPI Release
+
+ This project publishes with GitHub Actions trusted publishing. No PyPI token is stored in GitHub secrets.
+
+ ## One-time PyPI setup
+
+ Create trusted publishers for repository `cdpath/hipda`.
+
+ If the project already exists on the index, add the publisher from that project's publishing settings. If the project does not exist yet, create a pending trusted publisher from the account publishing page; the first successful workflow run will create the project.
+
+ PyPI:
+
+ - Project: `hipda`
+ - Owner: `cdpath`
+ - Repository name: `hipda`
+ - Workflow name: `python-publish.yml`
+ - Environment name: `pypi`
+
+ The workflow uses a GitHub environment named `pypi`. Configure environment protection rules in GitHub if releases should require manual approval.
+
+ ## PyPI release
+
+ 1. Bump `version` in `pyproject.toml`.
+ 2. Push the branch to GitHub.
+ 3. Create and publish a GitHub release.
+ 4. The release event publishes the package version to PyPI.
+ 5. Verify the package:
+
+ ```bash
+ uvx --refresh --from hipda==<version> hipda --help
+ ```
+
+ Package versions are immutable on PyPI. If a publish partially succeeds, bump the version before retrying.
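The release steps bump `version` in `pyproject.toml`; because the package also carries a `__version__` string, a pre-release check could confirm the two agree. The following is a minimal sketch under the assumption that each file keeps its version on one line in double quotes. The function names are hypothetical and not part of the project.

```python
from __future__ import annotations

import re

# Both patterns assume a double-quoted value at the start of a line.
PYPROJECT_VERSION_RE = re.compile(r'(?m)^version\s*=\s*"([^"]+)"')
MODULE_VERSION_RE = re.compile(r'(?m)^__version__\s*=\s*"([^"]+)"')


def find_version(pattern: re.Pattern[str], text: str) -> str | None:
    """Return the first captured version string, or None when absent."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_match(pyproject_text: str, init_text: str) -> bool:
    """True only when both files declare a version and the values are equal."""
    declared = find_version(PYPROJECT_VERSION_RE, pyproject_text)
    exported = find_version(MODULE_VERSION_RE, init_text)
    return declared is not None and declared == exported


if __name__ == "__main__":
    pyproject = '[project]\nname = "hipda"\nversion = "0.1.11"\n'
    print(versions_match(pyproject, '__version__ = "0.1.11"\n'))
    print(versions_match(pyproject, '__version__ = "0.1.9"\n'))
```

A check like this could run as an extra workflow step before the build, though wiring it in is left as an assumption here.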
@@ -0,0 +1,20 @@
+ [project]
+ name = "hipda"
+ version = "0.1.11"
+ description = "CLI reader for 4D4Y/HiPDA Discovery forum posts."
+ readme = "README.md"
+ requires-python = ">=3.11"
+ dependencies = [
+     "beautifulsoup4>=4.12",
+     "browser-cookie3>=0.20",
+ ]
+
+ [project.scripts]
+ hipda = "hipda_cli.cli:main"
+
+ [build-system]
+ requires = ["hatchling"]
+ build-backend = "hatchling.build"
+
+ [tool.hatch.build.targets.wheel]
+ packages = ["src/hipda_cli"]
@@ -0,0 +1,5 @@
+ """Command-line tools for reading 4D4Y/HiPDA forum content."""
+
+ __all__ = ["__version__"]
+
+ __version__ = "0.1.11"
@@ -0,0 +1,115 @@
+ from __future__ import annotations
+
+ import os
+ import plistlib
+ import subprocess
+ from pathlib import Path
+
+ import browser_cookie3
+
+
+ LOGIN_URL = "https://www.4d4y.com/forum/forumdisplay.php?fid=2"
+
+ CHROME_INFO_PLIST_PATHS = (
+     Path("/Applications/Google Chrome.app/Contents/Info.plist"),
+     Path.home() / "Applications/Google Chrome.app/Contents/Info.plist",
+ )
+
+
+ def default_cookie_path() -> Path:
+     return _config_path("cookie")
+
+
+ def default_user_agent_path() -> Path:
+     return _config_path("user-agent")
+
+
+ def _config_path(name: str) -> Path:
+     config_home = os.environ.get("XDG_CONFIG_HOME")
+     if config_home:
+         return Path(config_home) / "hipda" / name
+     return Path.home() / ".config" / "hipda" / name
+
+
+ def normalize_cookie(cookie: str) -> str:
+     cookie = cookie.strip()
+     if cookie.lower().startswith("cookie:"):
+         cookie = cookie.split(":", 1)[1].strip()
+     return cookie
+
+
+ def load_cookie(path: Path | None = None) -> str:
+     cookie_path = path or default_cookie_path()
+     if not cookie_path.exists():
+         return ""
+     return normalize_cookie(cookie_path.read_text(encoding="utf-8"))
+
+
+ def save_cookie(cookie: str, path: Path | None = None) -> Path:
+     normalized = normalize_cookie(cookie)
+     if not normalized:
+         raise ValueError("cookie is empty")
+
+     cookie_path = path or default_cookie_path()
+     cookie_path.parent.mkdir(parents=True, exist_ok=True)
+     cookie_path.write_text(normalized + "\n", encoding="utf-8")
+     cookie_path.chmod(0o600)
+     return cookie_path
+
+
+ def load_user_agent(path: Path | None = None) -> str:
+     user_agent_path = path or default_user_agent_path()
+     if not user_agent_path.exists():
+         return ""
+     return user_agent_path.read_text(encoding="utf-8").strip()
+
+
+ def save_user_agent(user_agent: str, path: Path | None = None) -> Path:
+     normalized = user_agent.strip()
+     if not normalized:
+         raise ValueError("user-agent is empty")
+
+     user_agent_path = path or default_user_agent_path()
+     user_agent_path.parent.mkdir(parents=True, exist_ok=True)
+     user_agent_path.write_text(normalized + "\n", encoding="utf-8")
+     user_agent_path.chmod(0o600)
+     return user_agent_path
+
+
+ def cookie_header_from_browser(domain: str = "4d4y.com") -> str:
+     jar = browser_cookie3.chrome(domain_name=domain)
+     cookies = []
+     for cookie in jar:
+         if cookie.domain.lstrip(".") == domain or cookie.domain.endswith("." + domain):
+             cookies.append(f"{cookie.name}={cookie.value}")
+     return "; ".join(cookies)
+
+
+ def chrome_user_agent() -> str:
+     major = "147"
+     for plist_path in CHROME_INFO_PLIST_PATHS:
+         if not plist_path.exists():
+             continue
+         with plist_path.open("rb") as file:
+             version = str(plistlib.load(file).get("CFBundleShortVersionString", ""))
+         if version:
+             major = version.split(".", 1)[0]
+             break
+     return (
+         "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+         f"AppleWebKit/537.36 (KHTML, like Gecko) Chrome/{major}.0.0.0 Safari/537.36"
+     )
+
+
+ def import_browser_auth(domain: str = "4d4y.com") -> tuple[str, str]:
+     cookie = cookie_header_from_browser(domain)
+     if not cookie:
+         raise ValueError(f"no {domain} cookies found in Chrome")
+     user_agent = chrome_user_agent()
+     save_cookie(cookie)
+     save_user_agent(user_agent)
+     return cookie, user_agent
+
+
+ def open_login_page() -> None:
+     subprocess.run(["open", "-a", "Google Chrome", LOGIN_URL], check=False)
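The domain filter in `cookie_header_from_browser` accepts the bare domain and any subdomain, but not a lookalike host that merely ends with the same characters. The sketch below mirrors that check with plain tuples so it runs without Chrome or `browser_cookie3`. `BrowserCookie`, `domain_matches`, and `cookie_header` are hypothetical names introduced for illustration only.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class BrowserCookie(NamedTuple):
    """Stand-in for the cookie objects a browser cookie jar would yield."""

    domain: str
    name: str
    value: str


def domain_matches(cookie_domain: str, domain: str) -> bool:
    """Accept the exact domain (with or without a leading dot) and its subdomains."""
    return cookie_domain.lstrip(".") == domain or cookie_domain.endswith("." + domain)


def cookie_header(cookies: Iterable[BrowserCookie], domain: str = "4d4y.com") -> str:
    """Join matching cookies into a single Cookie header value."""
    return "; ".join(f"{c.name}={c.value}" for c in cookies if domain_matches(c.domain, domain))


if __name__ == "__main__":
    jar = [
        BrowserCookie(".4d4y.com", "cdb_auth", "abc"),
        BrowserCookie("www.4d4y.com", "cdb_sid", "xyz"),
        BrowserCookie("evil4d4y.com", "tracker", "1"),
    ]
    print(cookie_header(jar))
```

Requiring the `"." + domain` suffix is what keeps `evil4d4y.com` out of the header.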
@@ -0,0 +1,236 @@
+ from __future__ import annotations
+
+ import argparse
+ import sys
+
+ from .auth import import_browser_auth, open_login_page, save_cookie, save_user_agent
+ from .client import BASE_URL, HipdaClient, HipdaClientError
+ from .parser import is_login_required_page, parse_forum_listing, parse_thread
+
+
+ def build_parser() -> argparse.ArgumentParser:
+     parser = argparse.ArgumentParser(prog="hipda", description="Read 4D4Y/HiPDA forum posts from the terminal.")
+     parser.add_argument("--cookie", help="Logged-in Cookie header. Defaults to HIPDA_COOKIE.")
+     parser.add_argument("--user-agent", help="User-Agent header. Defaults to HIPDA_USER_AGENT or Chrome-like UA.")
+     parser.add_argument("--ca-file", help="PEM CA bundle to trust for HTTPS. Defaults to HIPDA_CA_FILE.")
+     parser.add_argument(
+         "--insecure-tls",
+         action="store_true",
+         help="Disable HTTPS certificate verification. This is the default for 4D4Y.",
+     )
+     parser.add_argument(
+         "--verify-tls",
+         action="store_true",
+         help="Enable HTTPS certificate verification.",
+     )
+
+     subparsers = parser.add_subparsers(dest="command", metavar="{login,list,read}")
+
+     subparsers.add_parser("login", help="Import 4D4Y login cookies from Chrome.")
+
+     list_parser = subparsers.add_parser("list", help="List Discovery threads.")
+     list_parser.add_argument("--page", type=int, default=1, help="Forum page number.")
+     list_parser.add_argument("--limit", type=int, default=30, help="Maximum number of threads to print.")
+
+     read_parser = subparsers.add_parser("read", help="Read a thread by tid or URL.")
+     read_parser.add_argument("thread", help="Thread id, or a full viewthread.php URL.")
+     read_parser.add_argument("--page", type=int, default=1, help="Thread page number.")
+
+     auth = subparsers.add_parser("auth", help=argparse.SUPPRESS)
+     auth_subparsers = auth.add_subparsers(dest="auth_command", required=True)
+     save_cookie_parser = auth_subparsers.add_parser("save-cookie", help="Save a pasted 4D4Y Cookie header.")
+     save_cookie_parser.add_argument("cookie", nargs="?", help="Cookie header value. Reads stdin if omitted.")
+     save_user_agent_parser = auth_subparsers.add_parser("save-user-agent", help="Save the Chrome User-Agent used with the cookie.")
+     save_user_agent_parser.add_argument("user_agent", nargs="?", help="User-Agent value. Reads stdin if omitted.")
+
+     discovery = subparsers.add_parser("discovery", help=argparse.SUPPRESS)
+     discovery_subparsers = discovery.add_subparsers(dest="discovery_command", required=True)
+
+     list_parser = discovery_subparsers.add_parser("list", help="List Discovery threads.")
+     list_parser.add_argument("--page", type=int, default=1, help="Forum page number.")
+     list_parser.add_argument("--limit", type=int, default=30, help="Maximum number of threads to print.")
+
+     read_parser = discovery_subparsers.add_parser("read", help="Read a thread by tid or URL.")
+     read_parser.add_argument("thread", help="Thread id, or a full viewthread.php URL.")
+     read_parser.add_argument("--page", type=int, default=1, help="Thread page number.")
+
+     subparsers._choices_actions = [
+         action for action in subparsers._choices_actions if action.dest not in {"auth", "discovery"}
+     ]
+
+     return parser
+
+
+ def _thread_params(thread: str, page: int) -> dict[str, str | int]:
+     if "tid=" in thread:
+         tid = thread.split("tid=", 1)[1].split("&", 1)[0]
+     else:
+         tid = thread
+     return {"tid": tid, "page": page}
+
+
+ def load_discovery_page(
+     *,
+     page: int,
+     path: str,
+     params: dict[str, str | int],
+     cookie: str | None,
+     user_agent: str | None,
+     ca_file: str | None,
+     insecure_tls: bool,
+     verify_tls: bool = False,
+ ) -> tuple[str, HipdaClient]:
+     client = HipdaClient.from_env(
+         cookie=cookie,
+         user_agent=user_agent,
+         ca_file=ca_file,
+         insecure_tls=insecure_tls or not verify_tls,
+         verify_tls=verify_tls,
+     )
+     html = client.get(path, params)
+     if not is_login_required_page(html):
+         return html, client
+
+     if cookie:
+         return html, client
+
+     try:
+         imported_cookie, imported_user_agent = import_browser_auth()
+     except Exception:
+         return html, client
+
+     client = HipdaClient.from_env(
+         cookie=imported_cookie,
+         user_agent=user_agent or imported_user_agent,
+         ca_file=ca_file,
+         insecure_tls=insecure_tls or not verify_tls,
+         verify_tls=verify_tls,
+     )
+     return client.get(path, params), client
+
+
+ def wait_for_login_confirmation() -> None:
+     if sys.stdin.isatty():
+         input("Log in to 4D4Y in Chrome, then press Enter here...")
+
+
+ def run(args: argparse.Namespace) -> int:
+     if args.command == "login":
+         try:
+             open_login_page()
+             wait_for_login_confirmation()
+             import_browser_auth()
+         except Exception as exc:
+             print(
+                 "hipda: could not import 4D4Y cookies from Chrome. "
+                 "Open Chrome, log in to https://www.4d4y.com/forum/forumdisplay.php?fid=2, then run `hipda login` again.",
+                 file=sys.stderr,
+             )
+             print(f"hipda: {exc}", file=sys.stderr)
+             return 2
+         print("Imported 4D4Y login from Chrome.")
+         return 0
+
+     if args.command == "auth" and args.auth_command == "save-cookie":
+         cookie = args.cookie if args.cookie is not None else sys.stdin.read()
+         try:
+             path = save_cookie(cookie)
+         except ValueError as exc:
+             print(f"hipda: {exc}", file=sys.stderr)
+             return 2
+         print(f"Saved cookie to {path}")
+         return 0
+
+     if args.command == "auth" and args.auth_command == "save-user-agent":
+         user_agent = args.user_agent if args.user_agent is not None else sys.stdin.read()
+         try:
+             path = save_user_agent(user_agent)
+         except ValueError as exc:
+             print(f"hipda: {exc}", file=sys.stderr)
+             return 2
+         print(f"Saved user-agent to {path}")
+         return 0
+
+     client = HipdaClient.from_env(
+         cookie=args.cookie,
+         user_agent=args.user_agent,
+         ca_file=args.ca_file,
+         insecure_tls=args.insecure_tls or not args.verify_tls,
+         verify_tls=args.verify_tls,
+     )
+
+     try:
+         command = args.discovery_command if args.command == "discovery" else args.command
+
+         if command == "list":
+             html, client = load_discovery_page(
+                 page=args.page,
+                 path="forumdisplay.php",
+                 params={"fid": 2, "page": args.page},
+                 cookie=args.cookie,
+                 user_agent=args.user_agent,
+                 ca_file=args.ca_file,
+                 insecure_tls=args.insecure_tls or not args.verify_tls,
+                 verify_tls=args.verify_tls,
+             )
+             if is_login_required_page(html):
+                 print(
+                     "hipda: 4D4Y says this request is not logged in. "
+                     "Open Chrome, log in to 4D4Y, then run `hipda login`.",
+                     file=sys.stderr,
+                 )
+                 return 2
+             threads = parse_forum_listing(html, base_url=BASE_URL)[: args.limit]
+             for thread in threads:
+                 stats = ""
+                 if thread.replies is not None and thread.views is not None:
+                     stats = f" {thread.replies}/{thread.views}"
+                 last = f" last: {thread.last_author} {thread.last_at}".rstrip() if thread.last_author else ""
+                 print(f"{thread.tid}\t{thread.title}\t{thread.author} {thread.created_at}{stats}{last}")
+             return 0
+
+         if command == "read":
+             html, client = load_discovery_page(
+                 page=args.page,
+                 path="viewthread.php",
+                 params=_thread_params(args.thread, args.page),
+                 cookie=args.cookie,
+                 user_agent=args.user_agent,
+                 ca_file=args.ca_file,
+                 insecure_tls=args.insecure_tls or not args.verify_tls,
+                 verify_tls=args.verify_tls,
+             )
+             if is_login_required_page(html):
+                 print(
+                     "hipda: 4D4Y says this request is not logged in. "
+                     "Open Chrome, log in to 4D4Y, then run `hipda login`.",
+                     file=sys.stderr,
+                 )
+                 return 2
+             posts = parse_thread(html)
+             for index, post in enumerate(posts, start=1):
+                 print(f"#{index} {post.author} {post.published_at}".rstrip())
+                 print(post.content)
+                 print()
+             return 0
+     except HipdaClientError as exc:
+         print(f"hipda: {exc}", file=sys.stderr)
+         if not client.cookie:
+             print("hipda: set HIPDA_COOKIE or pass --cookie with a logged-in 4D4Y Cookie header.", file=sys.stderr)
+         return 2
+
+     raise AssertionError(f"Unhandled command: {args}")
+
+
+ def main(argv: list[str] | None = None) -> int:
+     parser = build_parser()
+     if argv is None:
+         argv = sys.argv[1:]
+     if not argv:
+         parser.print_help()
+         return 0
+     return run(parser.parse_args(argv))
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
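`_thread_params` pulls the thread id out of its argument by splitting on the literal `tid=`, which would also match a parameter whose name merely ends in `tid`. A stricter variant can parse the query string instead. The following is a sketch of that alternative, not the package's code, and `thread_id` is a hypothetical name. It assumes the argument is either a bare id or a URL with a `?` query.

```python
from urllib.parse import parse_qs, urlparse


def thread_id(thread: str) -> str:
    """Return the `tid` query parameter of a URL, or the input itself for a bare id."""
    # A bare id such as "3446553" has an empty query string, so parse_qs yields {}.
    values = parse_qs(urlparse(thread).query).get("tid")
    return values[0] if values else thread


if __name__ == "__main__":
    print(thread_id("3446553"))
    print(thread_id("https://www.4d4y.com/forum/viewthread.php?tid=3446553&extra=page%3D1"))
```

Unlike the string split, this only honours a parameter named exactly `tid`.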
@@ -0,0 +1,90 @@
+ from __future__ import annotations
+
+ import os
+ import ssl
+ from dataclasses import dataclass
+ from urllib.error import HTTPError, URLError
+ from urllib.parse import urlencode
+ from urllib.request import Request, urlopen
+
+ from .auth import load_cookie, load_user_agent
+
+
+ BASE_URL = "https://www.4d4y.com/forum/"
+ DEFAULT_USER_AGENT = (
+     "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+     "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36"
+ )
+
+
+ class HipdaClientError(RuntimeError):
+     pass
+
+
+ @dataclass(frozen=True)
+ class HipdaClient:
+     cookie: str = ""
+     user_agent: str = DEFAULT_USER_AGENT
+     ca_file: str | None = None
+     insecure_tls: bool = True
+     base_url: str = BASE_URL
+     timeout: float = 20.0
+
+     @classmethod
+     def from_env(
+         cls,
+         cookie: str | None = None,
+         user_agent: str | None = None,
+         ca_file: str | None = None,
+         insecure_tls: bool = True,
+         verify_tls: bool = False,
+     ) -> "HipdaClient":
+         return cls(
+             cookie=cookie or os.environ.get("HIPDA_COOKIE", "") or load_cookie(),
+             user_agent=user_agent or os.environ.get("HIPDA_USER_AGENT", "") or load_user_agent() or DEFAULT_USER_AGENT,
+             ca_file=ca_file or os.environ.get("HIPDA_CA_FILE"),
+             insecure_tls=(
+                 not verify_tls
+                 and (insecure_tls or os.environ.get("HIPDA_INSECURE_TLS", "").lower() in {"1", "true", "yes"})
+             ),
+         )
+
+     def ssl_context(self) -> ssl.SSLContext | None:
+         if self.insecure_tls:
+             context = ssl.create_default_context()
+             context.check_hostname = False
+             context.verify_mode = ssl.CERT_NONE
+             return context
+         if self.ca_file:
+             return ssl.create_default_context(cafile=self.ca_file)
+         return None
+
+     def get(self, path: str, params: dict[str, str | int] | None = None) -> str:
+         url = self.base_url + path
+         if params:
+             url = f"{url}?{urlencode(params)}"
+
+         headers = {
+             "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
+             "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
+             "User-Agent": self.user_agent,
+         }
+         if self.cookie:
+             headers["Cookie"] = self.cookie
+
+         try:
+             with urlopen(Request(url, headers=headers), timeout=self.timeout, context=self.ssl_context()) as response:
+                 body = response.read()
+                 encoding = response.headers.get_content_charset() or "utf-8"
+                 return body.decode(encoding, errors="replace")
+         except HTTPError as exc:
+             body = exc.read().decode("utf-8", errors="replace")[:300]
+             raise HipdaClientError(f"HTTP {exc.code} fetching {url}: {body}") from exc
+         except URLError as exc:
+             if isinstance(exc.reason, ssl.SSLCertVerificationError):
+                 raise HipdaClientError(
+                     f"Could not verify TLS certificate for {url}: {exc.reason}. "
+                     "If you use a trusted local proxy, pass --ca-file /path/to/root.pem. "
+                     "As a last resort, pass --insecure-tls."
+                 ) from exc
+             raise HipdaClientError(f"Could not fetch {url}: {exc.reason}") from exc
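The TLS handling combines three inputs: `--verify-tls`, `--insecure-tls`, and the `HIPDA_INSECURE_TLS` environment variable, with `--verify-tls` always winning. The sketch below restates that decision as two standalone functions so the outcomes can be checked in isolation. `use_insecure_tls` and `build_context` are hypothetical names, and `build_context` returns an explicit default context where the client above returns `None` and lets `urlopen` pick its default.

```python
from __future__ import annotations

import ssl


def use_insecure_tls(verify_tls: bool, insecure_tls: bool = True, env_value: str = "") -> bool:
    """Verification is skipped only when it was not requested and a flag or env var allows it."""
    env_insecure = env_value.lower() in {"1", "true", "yes"}
    return not verify_tls and (insecure_tls or env_insecure)


def build_context(insecure: bool, ca_file: str | None = None) -> ssl.SSLContext:
    """Build an SSL context for the chosen mode."""
    if insecure:
        context = ssl.create_default_context()
        # check_hostname must be turned off before verify_mode can be CERT_NONE.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    if ca_file:
        return ssl.create_default_context(cafile=ca_file)
    return ssl.create_default_context()


if __name__ == "__main__":
    for verify in (False, True):
        insecure = use_insecure_tls(verify_tls=verify)
        print(verify, insecure, build_context(insecure).verify_mode)
```

Because `verify_tls` is checked first, passing both flags together still results in verification being enabled.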
@@ -0,0 +1,24 @@
+ from __future__ import annotations
+
+ from dataclasses import dataclass
+
+
+ @dataclass(frozen=True)
+ class ThreadSummary:
+     tid: str
+     title: str
+     url: str
+     author: str = ""
+     created_at: str = ""
+     replies: int | None = None
+     views: int | None = None
+     last_author: str = ""
+     last_at: str = ""
+
+
+ @dataclass(frozen=True)
+ class Post:
+     author: str
+     published_at: str
+     content: str
+
@@ -0,0 +1,131 @@
+ from __future__ import annotations
+
+ import re
+ from urllib.parse import parse_qs, urljoin, urlparse
+
+ from bs4 import BeautifulSoup
+
+ from .models import Post, ThreadSummary
+
+
+ THREAD_RE = re.compile(r"(?:^|[?&])tid=(\d+)")
+ WHITESPACE_RE = re.compile(r"[ \t\r\f\v]+")
+
+
+ def clean_text(value: str) -> str:
+     lines = []
+     for line in value.replace("\xa0", " ").splitlines():
+         cleaned = WHITESPACE_RE.sub(" ", line).strip()
+         if cleaned:
+             lines.append(cleaned)
+     return "\n".join(lines)
+
+
+ def _tid_from_href(href: str) -> str | None:
+     parsed = urlparse(href)
+     tid = parse_qs(parsed.query).get("tid", [None])[0]
+     if tid:
+         return tid
+     match = THREAD_RE.search(href)
+     return match.group(1) if match else None
+
+
+ def _split_cell_lines(cell) -> list[str]:
+     return clean_text(cell.get_text("\n")).splitlines()
+
+
+ def _parse_counts(value: str) -> tuple[int | None, int | None]:
+     match = re.search(r"(\d+)\s*/\s*(\d+)", value)
+     if not match:
+         return None, None
+     return int(match.group(1)), int(match.group(2))
+
+
+ def parse_forum_listing(html: str, base_url: str) -> list[ThreadSummary]:
+     soup = BeautifulSoup(html, "html.parser")
+     threads: list[ThreadSummary] = []
+     seen: set[str] = set()
+
+     for anchor in soup.find_all("a", href=True):
+         href = anchor["href"]
+         if "viewthread.php" not in href:
+             continue
+
+         tid = _tid_from_href(href)
+         title = clean_text(anchor.get_text())
+         if not tid or not title or tid in seen:
+             continue
+
+         row = anchor.find_parent("tr")
+         author = created_at = last_author = last_at = ""
+         replies = views = None
+         if row:
+             cells = row.find_all(["td", "th"], recursive=False)
+             anchor_cell = anchor.find_parent(["td", "th"])
+             anchor_cell_index = cells.index(anchor_cell) if anchor_cell in cells else -1
+             trailing_cells = cells[anchor_cell_index + 1 :] if anchor_cell_index >= 0 else []
+             if trailing_cells:
+                 author_lines = _split_cell_lines(trailing_cells[0])
+                 author = author_lines[0] if author_lines else ""
+                 created_at = author_lines[1] if len(author_lines) > 1 else ""
+             if len(trailing_cells) > 1:
+                 replies, views = _parse_counts(clean_text(trailing_cells[1].get_text(" ")))
+             if len(trailing_cells) > 2:
+                 last_lines = _split_cell_lines(trailing_cells[2])
+                 last_author = last_lines[0] if last_lines else ""
+                 last_at = last_lines[1] if len(last_lines) > 1 else ""
+
+         seen.add(tid)
+         threads.append(
+             ThreadSummary(
+                 tid=tid,
+                 title=title,
+                 url=urljoin(base_url, href),
+                 author=author,
+                 created_at=created_at,
+                 replies=replies,
+                 views=views,
+                 last_author=last_author,
+                 last_at=last_at,
+             )
+         )
+
+     return threads
+
+
+ def is_login_required_page(html: str) -> bool:
+     text = clean_text(BeautifulSoup(html, "html.parser").get_text("\n"))
+     return "您还未登录" in text or "无权访问该版块" in text
+
+
+ def parse_thread(html: str) -> list[Post]:
+     soup = BeautifulSoup(html, "html.parser")
+     posts: list[Post] = []
+
+     for container in soup.find_all(id=re.compile(r"^post_\d+")):
+         message = container.select_one(".t_msgfont") or container.select_one("[id^=postmessage_]")
+         if not message:
+             continue
+
+         author_node = container.select_one(".postauthor > .postinfo a") or container.select_one(".postauthor a")
+         if not author_node:
+             fallback_author_node = container.select_one(".postauthor")
+             if fallback_author_node and fallback_author_node.name != "td":
+                 author_node = fallback_author_node
+         info_node = (
+             container.select_one(".authorinfo [id^=authorposton]")
+             or container.select_one(".postcontent .postinfo")
+             or container.select_one(".postinfo")
+         )
+         info_text = clean_text(info_node.get_text("\n")) if info_node else ""
+         published_at = re.sub(r"^发表于\s*", "", info_text.splitlines()[0]).strip() if info_text else ""
+
+         posts.append(
+             Post(
+                 author=clean_text(author_node.get_text()) if author_node else "",
+                 published_at=published_at,
+                 content=clean_text(message.get_text("\n")),
+             )
+         )
+
+     return posts
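Two small helpers carry most of the text handling in the parser: whitespace normalisation and the `replies / views` split. The sketch below reproduces both with the standard library only, so their behaviour on sample strings can be checked without BeautifulSoup. The sample inputs are invented for illustration and `parse_counts` is named here without the leading underscore.

```python
from __future__ import annotations

import re

# Horizontal whitespace only; newlines are handled by splitlines().
WHITESPACE_RE = re.compile(r"[ \t\r\f\v]+")


def clean_text(value: str) -> str:
    """Collapse runs of spaces, replace non-breaking spaces, and drop empty lines."""
    lines = []
    for line in value.replace("\xa0", " ").splitlines():
        cleaned = WHITESPACE_RE.sub(" ", line).strip()
        if cleaned:
            lines.append(cleaned)
    return "\n".join(lines)


def parse_counts(value: str) -> tuple[int | None, int | None]:
    """Read a 'replies / views' pair; return (None, None) when the cell has no such pair."""
    match = re.search(r"(\d+)\s*/\s*(\d+)", value)
    if not match:
        return None, None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(repr(clean_text("  foo\xa0 bar \n\n\tbaz  ")))
    print(parse_counts("12 / 345"))
    print(parse_counts("n/a"))
```

Returning `(None, None)` rather than raising lets the listing fall back to printing a thread without its statistics.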
@@ -0,0 +1,180 @@
+ from hipda_cli.auth import default_cookie_path, default_user_agent_path, load_cookie, load_user_agent, save_cookie, save_user_agent
+ from hipda_cli.cli import build_parser, load_discovery_page, main
+
+
+ def test_build_parser_accepts_discovery_list_options():
+     parser = build_parser()
+
+     args = parser.parse_args(["--ca-file", "/tmp/root.pem", "--verify-tls", "discovery", "list", "--page", "2", "--limit", "5"])
+
+     assert args.command == "discovery"
+     assert args.discovery_command == "list"
+     assert args.ca_file == "/tmp/root.pem"
+     assert args.verify_tls is True
+     assert args.page == 2
+     assert args.limit == 5
+
+
+ def test_help_hides_legacy_commands(capsys):
+     parser = build_parser()
+
+     parser.print_help()
+
+     out = capsys.readouterr().out
+     assert "{login,list,read}" in out
+     assert "auth" not in out
+     assert "discovery" not in out
+
+
+ def test_main_without_subcommand_prints_help(capsys):
+     assert main([]) == 0
+
+     out = capsys.readouterr().out
+     assert "usage: hipda" in out
+     assert "{login,list,read}" in out
+
+
+ def test_main_without_subcommand_from_sys_argv_prints_help(monkeypatch, capsys):
+     monkeypatch.setattr("sys.argv", ["hipda"])
+
+     assert main() == 0
+
+     out = capsys.readouterr().out
+     assert "usage: hipda" in out
+     assert "{login,list,read}" in out
+
+
+ def test_build_parser_accepts_top_level_list_options():
+     parser = build_parser()
+
+     args = parser.parse_args(["list", "--page", "2", "--limit", "5"])
+
+     assert args.command == "list"
+     assert args.page == 2
+     assert args.limit == 5
+
+
+ def test_build_parser_accepts_discovery_read_tid():
+     parser = build_parser()
+
+     args = parser.parse_args(["discovery", "read", "3446553", "--page", "3"])
+
+     assert args.discovery_command == "read"
+     assert args.thread == "3446553"
+     assert args.page == 3
+
+
+ def test_build_parser_accepts_top_level_read_tid():
+     parser = build_parser()
+
+     args = parser.parse_args(["read", "3446553", "--page", "3"])
+
+     assert args.command == "read"
+     assert args.thread == "3446553"
+     assert args.page == 3
+
+
+ def test_build_parser_accepts_auth_save_cookie():
+     parser = build_parser()
+
+     args = parser.parse_args(["auth", "save-cookie", "foo=bar; baz=qux"])
+
+     assert args.command == "auth"
+     assert args.auth_command == "save-cookie"
+     assert args.cookie == "foo=bar; baz=qux"
+
+
+ def test_build_parser_accepts_auth_save_user_agent():
+     parser = build_parser()
+
+     args = parser.parse_args(["auth", "save-user-agent", "Mozilla/5.0 Chrome/147.0.0.0"])
+
+     assert args.command == "auth"
+     assert args.auth_command == "save-user-agent"
+     assert args.user_agent == "Mozilla/5.0 Chrome/147.0.0.0"
+
+
+ def test_build_parser_accepts_login():
+     parser = build_parser()
+
+     args = parser.parse_args(["login"])
+
+     assert args.command == "login"
+
+
+ def test_run_login_opens_chrome_then_imports(monkeypatch, capsys):
+     events = []
+     parser = build_parser()
+     args = parser.parse_args(["login"])
+     monkeypatch.setattr("hipda_cli.cli.open_login_page", lambda: events.append("open"))
+     monkeypatch.setattr("hipda_cli.cli.wait_for_login_confirmation", lambda: events.append("wait"))
+     monkeypatch.setattr("hipda_cli.cli.import_browser_auth", lambda: events.append("import") or ("cookie", "ua"))
+
+     from hipda_cli.cli import run
+
+     assert run(args) == 0
+     assert events == ["open", "wait", "import"]
+     assert "Imported 4D4Y login from Chrome." in capsys.readouterr().out
+
+
+ def test_save_cookie_strips_cookie_prefix_and_uses_private_permissions(tmp_path):
+     cookie_path = tmp_path / "cookie"
+
+     save_cookie("Cookie: foo=bar; baz=qux\n", cookie_path)
+
+     assert load_cookie(cookie_path) == "foo=bar; baz=qux"
+     assert oct(cookie_path.stat().st_mode & 0o777) == "0o600"
+
+
+ def test_default_cookie_path_uses_xdg_config_home(monkeypatch, tmp_path):
+     monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path))
+
+     assert default_cookie_path() == tmp_path / "hipda" / "cookie"
+
+
+ def test_save_user_agent_round_trips(tmp_path):
+     path = tmp_path / "user-agent"
+
+     save_user_agent("Mozilla/5.0 Chrome/147.0.0.0\n", path)
+
+     assert load_user_agent(path) == "Mozilla/5.0 Chrome/147.0.0.0"
+     assert oct(path.stat().st_mode & 0o777) == "0o600"
+
+
+ def test_default_user_agent_path_uses_xdg_config_home(monkeypatch, tmp_path):
+     monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path))
+
+     assert default_user_agent_path() == tmp_path / "hipda" / "user-agent"
+
+
+ def test_load_discovery_page_imports_browser_auth_when_saved_auth_is_missing(monkeypatch, tmp_path):
+     monkeypatch.setenv("XDG_CONFIG_HOME", str(tmp_path))
+     calls = []
+
+     class FakeClient:
+         def __init__(self, cookie=""):
+             self.cookie = cookie
+
+         def get(self, path, params):
+             calls.append((self.cookie, path, params))
+             return "<html><a href='viewthread.php?tid=1'>ok</a></html>" if self.cookie else "您还未登录"
+
+     monkeypatch.setattr("hipda_cli.cli.HipdaClient.from_env", lambda **kwargs: FakeClient(kwargs.get("cookie") or load_cookie()))
+     monkeypatch.setattr("hipda_cli.cli.import_browser_auth", lambda: ("cdb_auth=abc", "Mozilla/5.0 Chrome/147.0.0.0"))
+
+     html, client = load_discovery_page(
+         page=1,
+         path="forumdisplay.php",
+         params={"fid": 2, "page": 1},
+         cookie=None,
+         user_agent=None,
+         ca_file=None,
+         insecure_tls=False,
+     )
+
+     assert "viewthread.php" in html
+     assert client.cookie == "cdb_auth=abc"
+     assert calls == [
+         ("", "forumdisplay.php", {"fid": 2, "page": 1}),
+         ("cdb_auth=abc", "forumdisplay.php", {"fid": 2, "page": 1}),
+     ]
@@ -0,0 +1,99 @@
+ import ssl
+
+ from http.cookiejar import Cookie, CookieJar
+
+ from hipda_cli.auth import LOGIN_URL, chrome_user_agent, cookie_header_from_browser, open_login_page
+ from hipda_cli.client import HipdaClient
+
+
+ def test_from_env_accepts_ca_file_and_insecure_tls():
+     client = HipdaClient.from_env(cookie="a=b", ca_file="/tmp/root.pem", insecure_tls=True)
+
+     assert client.ca_file == "/tmp/root.pem"
+     assert client.insecure_tls is True
+
+
+ def test_from_env_disables_tls_verification_by_default():
+     client = HipdaClient.from_env(cookie="a=b")
+
+     assert client.insecure_tls is True
+
+
+ def test_from_env_can_verify_tls():
+     client = HipdaClient.from_env(cookie="a=b", verify_tls=True)
+
+     assert client.insecure_tls is False
+
+
+ def test_ssl_context_uses_ca_file_and_verifies_tls(monkeypatch):
+     calls = {}
+
+     def fake_create_default_context(*, cafile=None):
+         calls["cafile"] = cafile
+         return "context"
+
+     monkeypatch.setattr(ssl, "create_default_context", fake_create_default_context)
+
+     assert HipdaClient(ca_file="/tmp/root.pem", insecure_tls=False).ssl_context() == "context"
+     assert calls == {"cafile": "/tmp/root.pem"}
+
+
+ def test_ssl_context_disables_verification_by_default():
+     context = HipdaClient().ssl_context()
+
+     assert context.check_hostname is False
+     assert context.verify_mode == ssl.CERT_NONE
+
+
+ def test_cookie_header_from_browser_filters_4d4y_cookies(monkeypatch):
+     jar = CookieJar()
+     jar.set_cookie(_cookie("cdb_auth", "abc", ".4d4y.com"))
+     jar.set_cookie(_cookie("cf_clearance", "def", "www.4d4y.com"))
+     jar.set_cookie(_cookie("other", "nope", ".example.com"))
+
+     monkeypatch.setattr("hipda_cli.auth.browser_cookie3.chrome", lambda domain_name: jar)
+
+     assert cookie_header_from_browser("4d4y.com") == "cdb_auth=abc; cf_clearance=def"
+
+
+ def test_chrome_user_agent_uses_chrome_version_from_plist(monkeypatch, tmp_path):
+     import plistlib
+
+     plist = tmp_path / "Info.plist"
+     plist.write_bytes(plistlib.dumps({"CFBundleShortVersionString": "147.0.1.2"}))
+     monkeypatch.setattr("hipda_cli.auth.CHROME_INFO_PLIST_PATHS", (plist,))
+
+     assert chrome_user_agent() == (
+         "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+         "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/147.0.0.0 Safari/537.36"
+     )
+
+
+ def test_open_login_page_uses_google_chrome_on_macos(monkeypatch):
+     calls = []
+     monkeypatch.setattr("hipda_cli.auth.subprocess.run", lambda command, check: calls.append((command, check)))
+
+     open_login_page()
+
+     assert calls == [(["open", "-a", "Google Chrome", LOGIN_URL], False)]
+
+
+ def _cookie(name: str, value: str, domain: str) -> Cookie:
+     return Cookie(
+         version=0,
+         name=name,
+         value=value,
+         port=None,
+         port_specified=False,
+         domain=domain,
+         domain_specified=True,
+         domain_initial_dot=domain.startswith("."),
+         path="/",
+         path_specified=True,
+         secure=True,
+         expires=None,
+         discard=False,
+         comment=None,
+         comment_url=None,
+         rest={},
+     )
@@ -0,0 +1,124 @@
+ from hipda_cli.parser import is_login_required_page, parse_forum_listing, parse_thread
+
+
+ LISTING_HTML = """
+ <html><body>
+ <table>
+ <tr class="thread">
+ <th><a href="viewthread.php?tid=3446553&extra=page%3D1">《放开那个女巫》动画做得还不错。</a></th>
+ <td class="author">老兵-猫族<br>2026-5-7</td>
+ <td class="nums">11/777</td>
+ <td class="lastpost">leeice<br>2026-5-8 11:56</td>
+ </tr>
+ <tr>
+ <th><a href="viewthread.php?tid=3447001">iPhone13PM手机更换电池,现在最好的姿势是啥?</a></th>
+ <td>死老妖<br>2026-5-8</td>
+ <td>0/4</td>
+ <td>死老妖<br>2026-5-8 11:55</td>
+ </tr>
+ </table>
+ </body></html>
+ """
+
+ REALISTIC_LISTING_HTML = """
+ <html><body>
+ <tbody id="normalthread_3057651">
+ <tr>
+ <td class="folder"><a href="viewthread.php?tid=3057651&amp;extra=page%3D1"><img src="folder.gif"></a></td>
+ <td class="icon">&nbsp;</td>
+ <th class="subject lock">
+ <span id="thread_3057651"><a href="viewthread.php?tid=3057651&amp;extra=page%3D1">hipda已迁移到新域名4d4y</a></span>
+ </th>
+ <td class="author"><cite><a href="space.php?uid=29">4d4y</a></cite><em>2022-6-13</em></td>
+ <td class="nums"><strong>0</strong>/<em>90106</em></td>
+ <td class="lastpost"><cite><a href="space.php?username=4d4y">4d4y</a></cite><em><a>2022-6-13 22:57</a></em></td>
+ </tr>
+ </tbody>
+ </body></html>
+ """
+
+
+ THREAD_HTML = """
+ <html><body>
+ <div id="post_1">
+ <div class="postauthor">老兵-猫族</div>
+ <div class="postinfo">发表于 2026-5-7 19:12</div>
+ <td class="t_msgfont">动画做得还不错。<br>节奏可以。</td>
+ </div>
+ <div id="post_2">
+ <div class="postauthor">leeice</div>
+ <div class="postinfo">发表于 2026-5-8 11:56</div>
+ <td class="t_msgfont">谢谢推荐</td>
+ </div>
+ </body></html>
+ """
+
+ REALISTIC_THREAD_HTML = """
+ <html><body>
+ <div id="post_74215032">
+ <table><tr>
+ <td class="postauthor">
+ <div class="postinfo"><a href="space.php?uid=277860">死老妖</a></div>
+ <dl class="profile"><dt>UID</dt><dd>277860</dd></dl>
+ </td>
+ <td class="postcontent">
+ <div class="postinfo">
+ <div class="authorinfo"><em id="authorposton74215032">发表于 2026-5-8 11:55</em></div>
+ </div>
+ <div class="postmessage firstpost">
+ <td class="t_msgfont" id="postmessage_74215032">不要弹窗、要大容量</td>
+ </div>
+ </td>
+ </tr></table>
+ </div>
+ </body></html>
+ """
+
+
+ def test_parse_forum_listing_extracts_threads_and_stats():
+     threads = parse_forum_listing(LISTING_HTML, base_url="https://www.4d4y.com/forum/")
+
+     assert [thread.tid for thread in threads] == ["3446553", "3447001"]
+     assert threads[0].title == "《放开那个女巫》动画做得还不错。"
+     assert threads[0].author == "老兵-猫族"
+     assert threads[0].replies == 11
+     assert threads[0].views == 777
+     assert threads[0].last_author == "leeice"
+     assert threads[0].url == "https://www.4d4y.com/forum/viewthread.php?tid=3446553&extra=page%3D1"
+
+
+ def test_parse_forum_listing_uses_subject_link_when_icon_link_shares_tid():
+     threads = parse_forum_listing(REALISTIC_LISTING_HTML, base_url="https://www.4d4y.com/forum/")
+
+     assert len(threads) == 1
+     assert threads[0].tid == "3057651"
+     assert threads[0].title == "hipda已迁移到新域名4d4y"
+     assert threads[0].author == "4d4y"
+     assert threads[0].created_at == "2022-6-13"
+     assert threads[0].replies == 0
+     assert threads[0].views == 90106
+     assert threads[0].last_author == "4d4y"
+     assert threads[0].last_at == "2022-6-13 22:57"
+
+
+ def test_parse_thread_extracts_posts():
+     posts = parse_thread(THREAD_HTML)
+
+     assert [post.author for post in posts] == ["老兵-猫族", "leeice"]
+     assert posts[0].published_at == "2026-5-7 19:12"
+     assert posts[0].content == "动画做得还不错。\n节奏可以。"
+
+
+ def test_parse_thread_uses_discuz_author_and_date_without_sidebar_noise():
+     posts = parse_thread(REALISTIC_THREAD_HTML)
+
+     assert len(posts) == 1
+     assert posts[0].author == "死老妖"
+     assert posts[0].published_at == "2026-5-8 11:55"
+     assert posts[0].content == "不要弹窗、要大容量"
+
+
+ def test_detects_login_required_page():
+     html = "<html><title>提示信息</title><body>对不起,您还未登录,无权访问该版块。</body></html>"
+
+     assert is_login_required_page(html) is True