simplesitesearch 0.0.5__tar.gz → 0.0.7__tar.gz
- {simplesitesearch-0.0.5/simplesitesearch.egg-info → simplesitesearch-0.0.7}/PKG-INFO +51 -1
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/README.md +50 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/pyproject.toml +1 -1
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/setup.cfg +1 -1
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/setup.py +1 -1
- simplesitesearch-0.0.7/simplesitesearch/__init__.py +1 -0
- simplesitesearch-0.0.7/simplesitesearch/indexing.py +156 -0
- simplesitesearch-0.0.7/simplesitesearch/indexing_middleware.py +46 -0
- simplesitesearch-0.0.7/simplesitesearch/indexing_views.py +70 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/templates/simplesitesearch/pagination.html +1 -1
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/templates/simplesitesearch/search_results.html +19 -3
- simplesitesearch-0.0.7/simplesitesearch/urls.py +17 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/views.py +46 -3
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7/simplesitesearch.egg-info}/PKG-INFO +51 -1
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/SOURCES.txt +3 -0
- simplesitesearch-0.0.5/simplesitesearch/__init__.py +0 -1
- simplesitesearch-0.0.5/simplesitesearch/urls.py +0 -7
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/LICENSE +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/MANIFEST.in +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/cms_apps.py +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/utils.py +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/dependency_links.txt +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/not-zip-safe +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/requires.txt +0 -0
- {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/top_level.txt +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: simplesitesearch
|
|
3
|
-
Version: 0.0.
|
|
3
|
+
Version: 0.0.7
|
|
4
4
|
Summary: Reptile Simple Site Search django app
|
|
5
5
|
Home-page: https://github.com/FlavienLouis/simplesitesearch
|
|
6
6
|
Author: Reptile Tech
|
|
@@ -160,6 +160,49 @@ Pagination links preserve the tag filter.
|
|
|
160
160
|
|
|
161
161
|
The search includes basic honeypot protection. If a `message` parameter is present, the search will not execute.
|
|
162
162
|
|
|
163
|
+
## Indexing authentication (optional)
|
|
164
|
+
|
|
165
|
+
When crawling or indexing protected CMS content, enable opaque bearer-token auth so indexers can act as a configured Django user.
|
|
166
|
+
|
|
167
|
+
### Settings
|
|
168
|
+
|
|
169
|
+
```python
|
|
170
|
+
SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
|
|
171
|
+
SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer" # username, user id, or pk string
|
|
172
|
+
SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
|
|
173
|
+
|
|
174
|
+
# Optional (defaults shown)
|
|
175
|
+
SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL = 3600 # seconds
|
|
176
|
+
SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL = 86400 # seconds
|
|
177
|
+
SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX = "sss_idx"
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
Use a shared Django cache backend in multi-worker deployments so tokens are visible across processes.
|
|
181
|
+
|
|
182
|
+
### Middleware
|
|
183
|
+
|
|
184
|
+
Add after `AuthenticationMiddleware`:
|
|
185
|
+
|
|
186
|
+
```python
|
|
187
|
+
MIDDLEWARE = [
|
|
188
|
+
# ...
|
|
189
|
+
"django.contrib.auth.middleware.AuthenticationMiddleware",
|
|
190
|
+
"simplesitesearch.indexing_middleware.IndexingAccessTokenMiddleware",
|
|
191
|
+
# ...
|
|
192
|
+
]
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
### Token endpoints
|
|
196
|
+
|
|
197
|
+
Both endpoints are under the same URL prefix as search (e.g. `/search/` when included at `path('search/', include('simplesitesearch.urls'))`):
|
|
198
|
+
|
|
199
|
+
| Endpoint | Auth header | Response |
|
|
200
|
+
|----------|-------------|----------|
|
|
201
|
+
| `POST …/internal/indexing/token/` | `Authorization: Bearer <bootstrap_token>` | `access_token`, `refresh_token`, `expires_in`, `refresh_expires_in` |
|
|
202
|
+
| `POST …/internal/indexing/refresh/` | `Authorization: Bearer <refresh_token>` | New token pair (refresh rotation) |
|
|
203
|
+
|
|
204
|
+
Send the access token on subsequent requests; the middleware logs in the configured user for that request.
|
|
205
|
+
|
|
163
206
|
## Utility functions (QOL)
|
|
164
207
|
|
|
165
208
|
The app provides helpers in `simplesitesearch.utils` for use in views, management commands, or other code.
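The Settings block in the hunk above uses a literal placeholder for `SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN`. A minimal sketch of one way to generate such a secret and keep it out of source control follows. The environment variable name `SSS_INDEXING_BOOTSTRAP_TOKEN` and both helper functions are hypothetical and are not part of the package.

```python
import os
import secrets


def generate_bootstrap_token(nbytes: int = 48) -> str:
    """Return a URL-safe random secret suitable for use as a bearer token."""
    return secrets.token_urlsafe(nbytes)


def load_bootstrap_token(environ=os.environ) -> str:
    """Read the secret from a (hypothetical) environment variable.

    An empty string means "not configured"; bootstrap_token_valid() in
    indexing.py returns False whenever the expected token is empty.
    """
    return environ.get("SSS_INDEXING_BOOTSTRAP_TOKEN", "")


# settings.py fragment (setting name taken from the README hunk above)
SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = load_bootstrap_token()
```

Run `generate_bootstrap_token()` once, store the value in the deployment's secret store, and give the same value to the indexer.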
|
|
@@ -285,6 +328,13 @@ For support and questions, please open an issue on the [GitHub repository](https
|
|
|
285
328
|
|
|
286
329
|
## Changelog
|
|
287
330
|
|
|
331
|
+
### 0.0.7
|
|
332
|
+
- **Added** optional indexing authentication: bootstrap/refresh token endpoints, cache-backed opaque tokens, and `IndexingAccessTokenMiddleware` for indexer login via Bearer access tokens.
|
|
333
|
+
|
|
334
|
+
### 0.0.6
|
|
335
|
+
- **Changed** search results: API hits normalized for templates (`display_title`, `snippet`, domain/type/language/tags/date metadata); highlighted title when available.
|
|
336
|
+
- **Changed** templates: `{% load static %}` instead of deprecated `staticfiles` (Django 4+).
|
|
337
|
+
|
|
288
338
|
### 0.0.5
|
|
289
339
|
- **Fixed** tag parsing: single tag string (e.g. `Hometag`) no longer sent as `H,o,m,e,t,a,g`; string is treated as one tag. API URL keeps commas unencoded so multiple tags parse correctly.
|
|
290
340
|
|
|
@@ -114,6 +114,49 @@ Pagination links preserve the tag filter.
|
|
|
114
114
|
|
|
115
115
|
The search includes basic honeypot protection. If a `message` parameter is present, the search will not execute.
|
|
116
116
|
|
|
117
|
+
## Indexing authentication (optional)
|
|
118
|
+
|
|
119
|
+
When crawling or indexing protected CMS content, enable opaque bearer-token auth so indexers can act as a configured Django user.
|
|
120
|
+
|
|
121
|
+
### Settings
|
|
122
|
+
|
|
123
|
+
```python
|
|
124
|
+
SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
|
|
125
|
+
SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer" # username, user id, or pk string
|
|
126
|
+
SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
|
|
127
|
+
|
|
128
|
+
# Optional (defaults shown)
|
|
129
|
+
SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL = 3600 # seconds
|
|
130
|
+
SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL = 86400 # seconds
|
|
131
|
+
SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX = "sss_idx"
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
Use a shared Django cache backend in multi-worker deployments so tokens are visible across processes.
|
|
135
|
+
|
|
136
|
+
### Middleware
|
|
137
|
+
|
|
138
|
+
Add after `AuthenticationMiddleware`:
|
|
139
|
+
|
|
140
|
+
```python
|
|
141
|
+
MIDDLEWARE = [
|
|
142
|
+
# ...
|
|
143
|
+
"django.contrib.auth.middleware.AuthenticationMiddleware",
|
|
144
|
+
"simplesitesearch.indexing_middleware.IndexingAccessTokenMiddleware",
|
|
145
|
+
# ...
|
|
146
|
+
]
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
### Token endpoints
|
|
150
|
+
|
|
151
|
+
Both endpoints are under the same URL prefix as search (e.g. `/search/` when included at `path('search/', include('simplesitesearch.urls'))`):
|
|
152
|
+
|
|
153
|
+
| Endpoint | Auth header | Response |
|
|
154
|
+
|----------|-------------|----------|
|
|
155
|
+
| `POST …/internal/indexing/token/` | `Authorization: Bearer <bootstrap_token>` | `access_token`, `refresh_token`, `expires_in`, `refresh_expires_in` |
|
|
156
|
+
| `POST …/internal/indexing/refresh/` | `Authorization: Bearer <refresh_token>` | New token pair (refresh rotation) |
|
|
157
|
+
|
|
158
|
+
Send the access token on subsequent requests; the middleware logs in the configured user for that request.
|
|
159
|
+
|
|
117
160
|
## Utility functions (QOL)
|
|
118
161
|
|
|
119
162
|
The app provides helpers in `simplesitesearch.utils` for use in views, management commands, or other code.
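The token endpoints table in the README hunk above can be exercised from an indexer with the standard library alone. The sketch below only builds the request and parses a response body. The base URL `https://example.com/search` is an assumption matching the `path('search/', ...)` example, and both function names are hypothetical.

```python
import json
import urllib.request


def build_token_request(base_url: str, bearer: str, endpoint: str) -> urllib.request.Request:
    """Build (but do not send) a POST to one of the indexing endpoints.

    endpoint is "token" (send the bootstrap secret) or "refresh"
    (send the current refresh token).
    """
    url = "%s/internal/indexing/%s/" % (base_url.rstrip("/"), endpoint)
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the views only read the Authorization header
        method="POST",
        headers={"Authorization": "Bearer %s" % bearer},
    )


def parse_token_response(body: bytes) -> dict:
    """Keep only the four fields documented in the README table."""
    data = json.loads(body)
    fields = ("access_token", "refresh_token", "expires_in", "refresh_expires_in")
    return {name: data[name] for name in fields}
```

Passing the request to `urllib.request.urlopen` would send it. The returned `access_token` then goes in the `Authorization: Bearer` header of later crawl requests.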
|
|
@@ -239,6 +282,13 @@ For support and questions, please open an issue on the [GitHub repository](https
|
|
|
239
282
|
|
|
240
283
|
## Changelog
|
|
241
284
|
|
|
285
|
+
### 0.0.7
|
|
286
|
+
- **Added** optional indexing authentication: bootstrap/refresh token endpoints, cache-backed opaque tokens, and `IndexingAccessTokenMiddleware` for indexer login via Bearer access tokens.
|
|
287
|
+
|
|
288
|
+
### 0.0.6
|
|
289
|
+
- **Changed** search results: API hits normalized for templates (`display_title`, `snippet`, domain/type/language/tags/date metadata); highlighted title when available.
|
|
290
|
+
- **Changed** templates: `{% load static %}` instead of deprecated `staticfiles` (Django 4+).
|
|
291
|
+
|
|
242
292
|
### 0.0.5
|
|
243
293
|
- **Fixed** tag parsing: single tag string (e.g. `Hometag`) no longer sent as `H,o,m,e,t,a,g`; string is treated as one tag. API URL keeps commas unencoded so multiple tags parse correctly.
|
|
244
294
|
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
__version__ = "0.0.7"
|
|
@@ -0,0 +1,156 @@
|
|
|
1
|
+
"""
|
|
2
|
+
Opaque indexing tokens stored in Django cache (shared backend required in multi-worker setups).
|
|
3
|
+
"""
|
|
4
|
+
from __future__ import annotations
|
|
5
|
+
|
|
6
|
+
import hashlib
|
|
7
|
+
import hmac
|
|
8
|
+
import json
|
|
9
|
+
import secrets
|
|
10
|
+
from typing import Optional, Tuple
|
|
11
|
+
|
|
12
|
+
from django.conf import settings
|
|
13
|
+
from django.core.cache import cache
|
|
14
|
+
|
|
15
|
+
|
|
16
|
+
def indexing_cache_prefix() -> str:
|
|
17
|
+
return getattr(
|
|
18
|
+
settings,
|
|
19
|
+
"SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX",
|
|
20
|
+
"sss_idx",
|
|
21
|
+
)
|
|
22
|
+
|
|
23
|
+
|
|
24
|
+
def access_ttl_seconds() -> int:
|
|
25
|
+
return int(
|
|
26
|
+
getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL", 3600)
|
|
27
|
+
)
|
|
28
|
+
|
|
29
|
+
|
|
30
|
+
def refresh_ttl_seconds() -> int:
|
|
31
|
+
return int(
|
|
32
|
+
getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL", 86400)
|
|
33
|
+
)
|
|
34
|
+
|
|
35
|
+
|
|
36
|
+
def _hash_token(raw: str) -> str:
|
|
37
|
+
return hashlib.sha256(raw.encode("utf-8")).hexdigest()
|
|
38
|
+
|
|
39
|
+
|
|
40
|
+
def access_cache_key(raw_access_token: str) -> str:
|
|
41
|
+
return "%s:a:%s" % (indexing_cache_prefix(), _hash_token(raw_access_token))
|
|
42
|
+
|
|
43
|
+
|
|
44
|
+
def refresh_cache_key(raw_refresh_token: str) -> str:
|
|
45
|
+
return "%s:r:%s" % (indexing_cache_prefix(), _hash_token(raw_refresh_token))
|
|
46
|
+
|
|
47
|
+
|
|
48
|
+
def parse_bearer_authorization(header_value: str) -> Optional[str]:
|
|
49
|
+
if not header_value:
|
|
50
|
+
return None
|
|
51
|
+
parts = header_value.split(None, 1)
|
|
52
|
+
if len(parts) != 2 or parts[0].lower() != "bearer":
|
|
53
|
+
return None
|
|
54
|
+
token = parts[1].strip()
|
|
55
|
+
return token or None
|
|
56
|
+
|
|
57
|
+
|
|
58
|
+
def bootstrap_token_valid(given: str) -> bool:
|
|
59
|
+
expected = getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN", "") or ""
|
|
60
|
+
if not expected or not given:
|
|
61
|
+
return False
|
|
62
|
+
try:
|
|
63
|
+
eb = expected.encode("utf-8")
|
|
64
|
+
gb = given.encode("utf-8")
|
|
65
|
+
except UnicodeEncodeError:
|
|
66
|
+
return False
|
|
67
|
+
if len(eb) != len(gb):
|
|
68
|
+
return False
|
|
69
|
+
return hmac.compare_digest(eb, gb)
|
|
70
|
+
|
|
71
|
+
|
|
72
|
+
def issue_token_pair(user_id: int) -> Tuple[str, str, int, int]:
|
|
73
|
+
access = secrets.token_urlsafe(32)
|
|
74
|
+
refresh = secrets.token_urlsafe(32)
|
|
75
|
+
at = access_ttl_seconds()
|
|
76
|
+
rt = refresh_ttl_seconds()
|
|
77
|
+
cache.set(
|
|
78
|
+
access_cache_key(access),
|
|
79
|
+
json.dumps({"user_id": user_id}),
|
|
80
|
+
at,
|
|
81
|
+
)
|
|
82
|
+
cache.set(
|
|
83
|
+
refresh_cache_key(refresh),
|
|
84
|
+
json.dumps({"user_id": user_id}),
|
|
85
|
+
rt,
|
|
86
|
+
)
|
|
87
|
+
return access, refresh, at, rt
|
|
88
|
+
|
|
89
|
+
|
|
90
|
+
def lookup_access_user_id(raw_access_token: str) -> Optional[int]:
|
|
91
|
+
raw = cache.get(access_cache_key(raw_access_token))
|
|
92
|
+
if raw is None:
|
|
93
|
+
return None
|
|
94
|
+
if isinstance(raw, bytes):
|
|
95
|
+
raw = raw.decode("utf-8")
|
|
96
|
+
try:
|
|
97
|
+
data = json.loads(raw)
|
|
98
|
+
uid = data.get("user_id")
|
|
99
|
+
return int(uid) if uid is not None else None
|
|
100
|
+
except (TypeError, ValueError, json.JSONDecodeError):
|
|
101
|
+
return None
|
|
102
|
+
|
|
103
|
+
|
|
104
|
+
def consume_refresh_and_rotate(user_id: int, old_refresh: str) -> Optional[Tuple[str, str, int, int]]:
|
|
105
|
+
"""
|
|
106
|
+
Validate refresh token, delete it, issue new access + refresh (rotation).
|
|
107
|
+
Returns (access, refresh, access_ttl, refresh_ttl) or None if invalid.
|
|
108
|
+
"""
|
|
109
|
+
rkey = refresh_cache_key(old_refresh)
|
|
110
|
+
raw = cache.get(rkey)
|
|
111
|
+
if raw is None:
|
|
112
|
+
return None
|
|
113
|
+
if isinstance(raw, bytes):
|
|
114
|
+
raw = raw.decode("utf-8")
|
|
115
|
+
try:
|
|
116
|
+
data = json.loads(raw)
|
|
117
|
+
if int(data.get("user_id")) != int(user_id):
|
|
118
|
+
return None
|
|
119
|
+
except (TypeError, ValueError, json.JSONDecodeError):
|
|
120
|
+
return None
|
|
121
|
+
cache.delete(rkey)
|
|
122
|
+
return issue_token_pair(user_id)
|
|
123
|
+
|
|
124
|
+
|
|
125
|
+
def lookup_refresh_user_id(raw_refresh_token: str) -> Optional[int]:
|
|
126
|
+
rkey = refresh_cache_key(raw_refresh_token)
|
|
127
|
+
raw = cache.get(rkey)
|
|
128
|
+
if raw is None:
|
|
129
|
+
return None
|
|
130
|
+
if isinstance(raw, bytes):
|
|
131
|
+
raw = raw.decode("utf-8")
|
|
132
|
+
try:
|
|
133
|
+
data = json.loads(raw)
|
|
134
|
+
return int(data["user_id"])
|
|
135
|
+
except (TypeError, ValueError, KeyError, json.JSONDecodeError):
|
|
136
|
+
return None
|
|
137
|
+
|
|
138
|
+
|
|
139
|
+
def resolve_indexing_user_id() -> Optional[int]:
|
|
140
|
+
spec = getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_USER", None)
|
|
141
|
+
if spec is None:
|
|
142
|
+
return None
|
|
143
|
+
from django.contrib.auth import get_user_model
|
|
144
|
+
|
|
145
|
+
User = get_user_model()
|
|
146
|
+
if isinstance(spec, int):
|
|
147
|
+
pk = spec
|
|
148
|
+
else:
|
|
149
|
+
s = str(spec).strip()
|
|
150
|
+
if s.isdigit():
|
|
151
|
+
pk = int(s)
|
|
152
|
+
else:
|
|
153
|
+
lookup = {User.USERNAME_FIELD: s}
|
|
154
|
+
u = User.objects.filter(**lookup).values_list("pk", flat=True).first()
|
|
155
|
+
return int(u) if u is not None else None
|
|
156
|
+
return pk if User.objects.filter(pk=pk).exists() else None
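Two behaviours of `indexing.py` above are easy to miss when reading the diff: the `Bearer` scheme is matched case-insensitively, and cache keys contain only a SHA-256 digest of the token, never the raw token. The standalone sketch below restates both without Django. The function names `parse_bearer` and `access_key` are illustrative renames, not package API.

```python
import hashlib
from typing import Optional


def parse_bearer(header_value: str) -> Optional[str]:
    """Same logic as parse_bearer_authorization in the diff above."""
    if not header_value:
        return None
    parts = header_value.split(None, 1)  # split once on any whitespace
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    return parts[1].strip() or None


def access_key(raw_token: str, prefix: str = "sss_idx") -> str:
    """Cache key for an access token: '<prefix>:a:<sha256 hex digest>'."""
    digest = hashlib.sha256(raw_token.encode("utf-8")).hexdigest()
    return "%s:a:%s" % (prefix, digest)


print(parse_bearer("bearer abc "))   # scheme case and trailing space are tolerated
print(parse_bearer("Basic abc"))     # other schemes are ignored
print(access_key("abc"))
```

Because the key is derived from a digest, someone who can list cache keys cannot read usable tokens from them.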
|
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
from __future__ import annotations
|
|
2
|
+
|
|
3
|
+
from django.conf import settings
|
|
4
|
+
from django.contrib.auth import get_user_model, login
|
|
5
|
+
|
|
6
|
+
from .indexing import lookup_access_user_id, parse_bearer_authorization
|
|
7
|
+
|
|
8
|
+
|
|
9
|
+
def _skip_for_path(path: str) -> bool:
|
|
10
|
+
if "internal/indexing/token" in path or "internal/indexing/refresh" in path:
|
|
11
|
+
return True
|
|
12
|
+
return False
|
|
13
|
+
|
|
14
|
+
|
|
15
|
+
class IndexingAccessTokenMiddleware:
|
|
16
|
+
"""
|
|
17
|
+
After AuthenticationMiddleware: if Authorization Bearer matches a valid indexing
|
|
18
|
+
access token, log in as the configured indexer user for this request.
|
|
19
|
+
"""
|
|
20
|
+
|
|
21
|
+
def __init__(self, get_response):
|
|
22
|
+
self.get_response = get_response
|
|
23
|
+
|
|
24
|
+
def __call__(self, request):
|
|
25
|
+
if not getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ENABLED", False):
|
|
26
|
+
return self.get_response(request)
|
|
27
|
+
if _skip_for_path(getattr(request, "path", "") or ""):
|
|
28
|
+
return self.get_response(request)
|
|
29
|
+
authz = request.META.get("HTTP_AUTHORIZATION", "")
|
|
30
|
+
token = parse_bearer_authorization(authz)
|
|
31
|
+
if not token:
|
|
32
|
+
return self.get_response(request)
|
|
33
|
+
user_id = lookup_access_user_id(token)
|
|
34
|
+
if user_id is None:
|
|
35
|
+
return self.get_response(request)
|
|
36
|
+
User = get_user_model()
|
|
37
|
+
try:
|
|
38
|
+
user = User.objects.get(pk=user_id, is_active=True)
|
|
39
|
+
except User.DoesNotExist:
|
|
40
|
+
return self.get_response(request)
|
|
41
|
+
login(
|
|
42
|
+
request,
|
|
43
|
+
user,
|
|
44
|
+
backend="django.contrib.auth.backends.ModelBackend",
|
|
45
|
+
)
|
|
46
|
+
return self.get_response(request)
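The early-return chain in `IndexingAccessTokenMiddleware.__call__` above can be read as one decision: which user id, if any, should this request run as? The sketch below expresses that decision as a pure function so it can be followed without Django. The function name and the `lookup` callable (a stand-in for `lookup_access_user_id`) are hypothetical, and the real middleware additionally requires the user to be active before calling `login()`.

```python
from typing import Callable, Optional

# Paths containing these fragments are skipped, as in _skip_for_path above.
SKIPPED_FRAGMENTS = ("internal/indexing/token", "internal/indexing/refresh")


def resolve_indexer_user_id(
    enabled: bool,
    path: str,
    authorization: str,
    lookup: Callable[[str], Optional[int]],
) -> Optional[int]:
    """Return the user id to log in as, or None to leave the request untouched."""
    if not enabled:
        return None
    if any(fragment in path for fragment in SKIPPED_FRAGMENTS):
        return None
    parts = authorization.split(None, 1)
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    token = parts[1].strip()
    if not token:
        return None
    return lookup(token)


# Usage with a dict standing in for the token cache.
tokens = {"good-token": 7}
print(resolve_indexer_user_id(True, "/en/protected/", "Bearer good-token", tokens.get))
```

Every branch that returns `None` corresponds to a `return self.get_response(request)` in the middleware, so an invalid or missing token never produces an error response; the request simply continues unauthenticated.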
|
|
@@ -0,0 +1,70 @@
|
|
|
1
|
+
from __future__ import annotations
|
|
2
|
+
|
|
3
|
+
from django.conf import settings
|
|
4
|
+
from django.http import JsonResponse
|
|
5
|
+
from django.views.decorators.csrf import csrf_exempt
|
|
6
|
+
from django.views.decorators.http import require_POST
|
|
7
|
+
|
|
8
|
+
from .indexing import (
|
|
9
|
+
bootstrap_token_valid,
|
|
10
|
+
consume_refresh_and_rotate,
|
|
11
|
+
issue_token_pair,
|
|
12
|
+
lookup_refresh_user_id,
|
|
13
|
+
parse_bearer_authorization,
|
|
14
|
+
resolve_indexing_user_id,
|
|
15
|
+
)
|
|
16
|
+
|
|
17
|
+
|
|
18
|
+
def _indexing_disabled_response():
|
|
19
|
+
return JsonResponse({"detail": "indexing_disabled"}, status=403)
|
|
20
|
+
|
|
21
|
+
|
|
22
|
+
def _json_error(detail: str, status: int):
|
|
23
|
+
return JsonResponse({"detail": detail}, status=status)
|
|
24
|
+
|
|
25
|
+
|
|
26
|
+
@csrf_exempt
|
|
27
|
+
@require_POST
|
|
28
|
+
def obtain_indexing_token(request):
|
|
29
|
+
if not getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ENABLED", False):
|
|
30
|
+
return _indexing_disabled_response()
|
|
31
|
+
user_id = resolve_indexing_user_id()
|
|
32
|
+
if user_id is None:
|
|
33
|
+
return _json_error("indexing_user_not_configured", 503)
|
|
34
|
+
token = parse_bearer_authorization(request.META.get("HTTP_AUTHORIZATION", ""))
|
|
35
|
+
if not token or not bootstrap_token_valid(token):
|
|
36
|
+
return _json_error("invalid_bootstrap", 401)
|
|
37
|
+
access, refresh, at, rt = issue_token_pair(user_id)
|
|
38
|
+
return JsonResponse(
|
|
39
|
+
{
|
|
40
|
+
"access_token": access,
|
|
41
|
+
"refresh_token": refresh,
|
|
42
|
+
"expires_in": at,
|
|
43
|
+
"refresh_expires_in": rt,
|
|
44
|
+
}
|
|
45
|
+
)
|
|
46
|
+
|
|
47
|
+
|
|
48
|
+
@csrf_exempt
|
|
49
|
+
@require_POST
|
|
50
|
+
def refresh_indexing_token(request):
|
|
51
|
+
if not getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ENABLED", False):
|
|
52
|
+
return _indexing_disabled_response()
|
|
53
|
+
token = parse_bearer_authorization(request.META.get("HTTP_AUTHORIZATION", ""))
|
|
54
|
+
if not token:
|
|
55
|
+
return _json_error("missing_refresh", 401)
|
|
56
|
+
user_id = lookup_refresh_user_id(token)
|
|
57
|
+
if user_id is None:
|
|
58
|
+
return _json_error("invalid_refresh", 401)
|
|
59
|
+
pair = consume_refresh_and_rotate(user_id, token)
|
|
60
|
+
if pair is None:
|
|
61
|
+
return _json_error("invalid_refresh", 401)
|
|
62
|
+
access, refresh, at, rt = pair
|
|
63
|
+
return JsonResponse(
|
|
64
|
+
{
|
|
65
|
+
"access_token": access,
|
|
66
|
+
"refresh_token": refresh,
|
|
67
|
+
"expires_in": at,
|
|
68
|
+
"refresh_expires_in": rt,
|
|
69
|
+
}
|
|
70
|
+
)
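`refresh_indexing_token` above relies on refresh rotation: a refresh token is deleted when used, so it works at most once. The sketch below models that with an in-memory dict in place of the Django cache. It is a simplification under stated assumptions: no TTLs, no key hashing, access tokens are not stored, and the class name `TokenStore` is hypothetical.

```python
import secrets
from typing import Dict, Optional, Tuple


class TokenStore:
    """In-memory stand-in for the cache-backed refresh tokens."""

    def __init__(self) -> None:
        self._refresh: Dict[str, int] = {}  # refresh token -> user id

    def issue(self, user_id: int) -> Tuple[str, str]:
        """Create a new (access, refresh) pair for user_id."""
        access = secrets.token_urlsafe(32)
        refresh = secrets.token_urlsafe(32)
        self._refresh[refresh] = user_id
        return access, refresh

    def rotate(self, old_refresh: str) -> Optional[Tuple[str, str]]:
        """Consume old_refresh and issue a new pair, or return None if unknown."""
        user_id = self._refresh.pop(old_refresh, None)  # single use
        if user_id is None:
            return None
        return self.issue(user_id)


store = TokenStore()
_, refresh = store.issue(user_id=1)
first = store.rotate(refresh)    # succeeds and returns a new pair
replay = store.rotate(refresh)   # the same token again is rejected
print(first is not None, replay is None)
```

In the view, the rejected case corresponds to the `invalid_refresh` response with status 401.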
|
|
@@ -1,5 +1,5 @@
|
|
|
1
1
|
{% extends "base.html" %}
|
|
2
|
-
{% load i18n
|
|
2
|
+
{% load i18n static cms_tags %}
|
|
3
3
|
|
|
4
4
|
{% block template_class %}template_search{% endblock %}
|
|
5
5
|
|
|
@@ -27,13 +27,29 @@
|
|
|
27
27
|
{% for result in results %}
|
|
28
28
|
<div class="search__content__result">
|
|
29
29
|
<div class="headings__third-container">
|
|
30
|
-
<a class="headings headings__third headings--blue" href="{{result.url}}">{{ result.
|
|
30
|
+
<a class="headings headings__third headings--blue" href="{{ result.url }}">{{ result.display_title|safe }}</a>
|
|
31
31
|
</div>
|
|
32
|
+
{% if result.snippet %}
|
|
32
33
|
<div class="text-container">
|
|
33
34
|
<div class="text text--graydark">
|
|
34
|
-
<p>{{result.
|
|
35
|
+
<p>{{ result.snippet|safe }}…</p>
|
|
35
36
|
</div>
|
|
36
37
|
</div>
|
|
38
|
+
{% endif %}
|
|
39
|
+
{% if result.domain or result.type or result.language or result.last_modified %}
|
|
40
|
+
<div class="text-container">
|
|
41
|
+
<div class="text text--graydark">
|
|
42
|
+
<p>{% if result.domain %}{{ result.domain }}{% endif %}{% if result.type %}{% if result.domain %} · {% endif %}{{ result.type }}{% endif %}{% if result.language %}{% if result.domain or result.type %} · {% endif %}{{ result.language }}{% endif %}{% if result.last_modified %}{% if result.domain or result.type or result.language %} · {% endif %}{{ result.last_modified|date:"SHORT_DATETIME_FORMAT" }}{% endif %}</p>
|
|
43
|
+
</div>
|
|
44
|
+
</div>
|
|
45
|
+
{% endif %}
|
|
46
|
+
{% if result.tags %}
|
|
47
|
+
<div class="text-container">
|
|
48
|
+
<div class="text text--graydark">
|
|
49
|
+
<p>{% for t in result.tags %}{{ t }}{% if not forloop.last %}, {% endif %}{% endfor %}</p>
|
|
50
|
+
</div>
|
|
51
|
+
</div>
|
|
52
|
+
{% endif %}
|
|
37
53
|
</div>
|
|
38
54
|
{% empty %}
|
|
39
55
|
<div class="search__content--empty">
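The long metadata line added to the template above nests `{% if %}` tags so that a ` · ` separator appears only between fields that are present. In Python terms this amounts to joining the non-empty parts. The helper below is a hypothetical illustration of that logic, not code from the package, and it takes `last_modified` as an already formatted string.

```python
def join_metadata(domain: str = "", type_: str = "", language: str = "", last_modified: str = "") -> str:
    """Join the non-empty metadata fields with ' · ', in template order."""
    parts = [domain, type_, language, last_modified]
    return " · ".join(part for part in parts if part)


print(join_metadata("example.com", "", "en"))
print(join_metadata(type_="page"))
```

When all four fields are empty the function returns an empty string, which matches the template skipping the whole metadata `<div>` in that case.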
|
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
from django.urls import re_path
|
|
2
|
+
|
|
3
|
+
from . import indexing_views, views
|
|
4
|
+
|
|
5
|
+
urlpatterns = [
|
|
6
|
+
re_path(r'^$', views.SearchResult.as_view(), name='search'),
|
|
7
|
+
re_path(
|
|
8
|
+
r'^internal/indexing/token/?$',
|
|
9
|
+
indexing_views.obtain_indexing_token,
|
|
10
|
+
name='sss_indexing_token',
|
|
11
|
+
),
|
|
12
|
+
re_path(
|
|
13
|
+
r'^internal/indexing/refresh/?$',
|
|
14
|
+
indexing_views.refresh_indexing_token,
|
|
15
|
+
name='sss_indexing_refresh',
|
|
16
|
+
),
|
|
17
|
+
]
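The two new patterns in `urls.py` above end in `/?$`, so each endpoint matches with or without a trailing slash, presumably so that a POST to the slashless URL is not redirected. The check below applies the same regular expressions with Python's `re` module to the path remainder that Django would pass after stripping the include prefix. The `matches` helper is hypothetical.

```python
import re

# Patterns copied from the urls.py diff above.
TOKEN_RE = re.compile(r'^internal/indexing/token/?$')
REFRESH_RE = re.compile(r'^internal/indexing/refresh/?$')


def matches(pattern: "re.Pattern[str]", remainder: str) -> bool:
    """True if the path remainder (after the include prefix) matches."""
    return pattern.search(remainder) is not None


for remainder in ("internal/indexing/token/", "internal/indexing/token", "internal/indexing/token/extra"):
    print(remainder, matches(TOKEN_RE, remainder))
```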
|
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
from math import floor
|
|
2
2
|
|
|
3
|
+
from django.utils.dateparse import parse_datetime
|
|
3
4
|
from django.views.generic import TemplateView
|
|
4
5
|
|
|
5
6
|
from .utils import (
|
|
@@ -59,6 +60,46 @@ def get_total_pages(total_hits):
|
|
|
59
60
|
return pages_count
|
|
60
61
|
|
|
61
62
|
|
|
63
|
+
def _optional_datetime(value):
|
|
64
|
+
if not value or not isinstance(value, str):
|
|
65
|
+
return None
|
|
66
|
+
return parse_datetime(value)
|
|
67
|
+
|
|
68
|
+
|
|
69
|
+
def normalize_search_hit(hit):
|
|
70
|
+
"""
|
|
71
|
+
Reduce each API hit to fields useful for templates (omit bulky nested blobs).
|
|
72
|
+
"""
|
|
73
|
+
if not isinstance(hit, dict):
|
|
74
|
+
return hit
|
|
75
|
+
|
|
76
|
+
highlights = hit.get("highlights") or {}
|
|
77
|
+
title_highlight = None
|
|
78
|
+
title_snippets = highlights.get("title")
|
|
79
|
+
if title_snippets and isinstance(title_snippets, list) and title_snippets[0]:
|
|
80
|
+
title_highlight = title_snippets[0]
|
|
81
|
+
|
|
82
|
+
snippet = (
|
|
83
|
+
hit.get("highlight")
|
|
84
|
+
or hit.get("description")
|
|
85
|
+
or hit.get("content_preview")
|
|
86
|
+
or ""
|
|
87
|
+
)
|
|
88
|
+
|
|
89
|
+
modified_raw = hit.get("last_modified") or hit.get("indexed_at") or ""
|
|
90
|
+
|
|
91
|
+
return {
|
|
92
|
+
"url": hit.get("url") or "",
|
|
93
|
+
"display_title": title_highlight or hit.get("title") or "",
|
|
94
|
+
"snippet": snippet,
|
|
95
|
+
"domain": hit.get("domain") or "",
|
|
96
|
+
"type": hit.get("type") or "",
|
|
97
|
+
"tags": hit.get("tags") if isinstance(hit.get("tags"), list) else [],
|
|
98
|
+
"language": hit.get("language") or "",
|
|
99
|
+
"last_modified": _optional_datetime(modified_raw),
|
|
100
|
+
}
|
|
101
|
+
|
|
102
|
+
|
|
62
103
|
def get_api_re_path(term, current_page, tags=None):
|
|
63
104
|
"""Build the search API URL (delegates to utils for consistency)."""
|
|
64
105
|
return get_search_api_url(term, current_page, tags=tags)
|
|
@@ -84,21 +125,23 @@ class SearchResult(TemplateView):
|
|
|
84
125
|
response_data = get_search_results(
|
|
85
126
|
term, current_page, tags=tags_list or None
|
|
86
127
|
)
|
|
87
|
-
|
|
128
|
+
total_hits = response_data.get("total_hits", 0)
|
|
129
|
+
pages_count = get_total_pages(total_hits)
|
|
88
130
|
prev_page_number, next_page_number = get_prev_next_page_number(pages_count, current_page)
|
|
89
131
|
prev_link, next_link = get_prev_next_links(
|
|
90
132
|
next_page_number, prev_page_number, term, tags=tags_list or None
|
|
91
133
|
)
|
|
92
134
|
page_links = get_page_links(pages_count, current_page, term, tags=tags_list or None)
|
|
93
135
|
|
|
136
|
+
raw_hits = response_data.get("hits") or []
|
|
94
137
|
context.update({
|
|
95
138
|
"pages_count": pages_count,
|
|
96
139
|
"current_page": current_page,
|
|
97
|
-
"results_count":
|
|
140
|
+
"results_count": total_hits,
|
|
98
141
|
"prev_link": prev_link,
|
|
99
142
|
"next_link": next_link,
|
|
100
143
|
"page_links": page_links,
|
|
101
|
-
"results":
|
|
144
|
+
"results": [normalize_search_hit(h) for h in raw_hits],
|
|
102
145
|
})
|
|
103
146
|
else:
|
|
104
147
|
context.update({"results": None})
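A worked example makes the fallback order in `normalize_search_hit` above concrete: a highlighted title wins over the plain title, and the snippet falls back from `highlight` to `description` to `content_preview`. The sketch below reproduces three of the output fields without Django. It swaps `parse_datetime` for the standard library's `datetime.fromisoformat`, which accepts a different set of formats, and the sample hit is invented for illustration.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def optional_datetime(value: Any) -> Optional[datetime]:
    """Stand-in for _optional_datetime, using the stdlib ISO parser."""
    if not value or not isinstance(value, str):
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def normalize(hit: Dict[str, Any]) -> Dict[str, Any]:
    """Subset of normalize_search_hit: title, snippet, and date fallbacks."""
    title_snippets = (hit.get("highlights") or {}).get("title")
    title_highlight = None
    if title_snippets and isinstance(title_snippets, list) and title_snippets[0]:
        title_highlight = title_snippets[0]
    return {
        "display_title": title_highlight or hit.get("title") or "",
        "snippet": hit.get("highlight") or hit.get("description") or hit.get("content_preview") or "",
        "last_modified": optional_datetime(hit.get("last_modified") or hit.get("indexed_at") or ""),
    }


# Hypothetical API hit; the field names come from the diff, the values do not.
hit = {
    "title": "Home",
    "highlights": {"title": ["<em>Home</em> page"]},
    "description": "Welcome",
    "indexed_at": "2024-05-01T12:30:00+00:00",
}
result = normalize(hit)
print(result["display_title"], "|", result["snippet"], "|", result["last_modified"])
```

Here `display_title` carries the highlight markup, which is why the template renders it with the `safe` filter.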
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: simplesitesearch
|
|
3
|
-
Version: 0.0.
|
|
3
|
+
Version: 0.0.7
|
|
4
4
|
Summary: Reptile Simple Site Search django app
|
|
5
5
|
Home-page: https://github.com/FlavienLouis/simplesitesearch
|
|
6
6
|
Author: Reptile Tech
|
|
@@ -160,6 +160,49 @@ Pagination links preserve the tag filter.
|
|
|
160
160
|
|
|
161
161
|
The search includes basic honeypot protection. If a `message` parameter is present, the search will not execute.
|
|
162
162
|
|
|
163
|
+
## Indexing authentication (optional)
|
|
164
|
+
|
|
165
|
+
When crawling or indexing protected CMS content, enable opaque bearer-token auth so indexers can act as a configured Django user.
|
|
166
|
+
|
|
167
|
+
### Settings
|
|
168
|
+
|
|
169
|
+
```python
|
|
170
|
+
SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
|
|
171
|
+
SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer" # username, user id, or pk string
|
|
172
|
+
SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
|
|
173
|
+
|
|
174
|
+
# Optional (defaults shown)
|
|
175
|
+
SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL = 3600 # seconds
|
|
176
|
+
SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL = 86400 # seconds
|
|
177
|
+
SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX = "sss_idx"
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
Use a shared Django cache backend in multi-worker deployments so tokens are visible across processes.
|
|
181
|
+
|
|
182
|
+
### Middleware
|
|
183
|
+
|
|
184
|
+
Add after `AuthenticationMiddleware`:
|
|
185
|
+
|
|
186
|
+
```python
|
|
187
|
+
MIDDLEWARE = [
|
|
188
|
+
# ...
|
|
189
|
+
"django.contrib.auth.middleware.AuthenticationMiddleware",
|
|
190
|
+
"simplesitesearch.indexing_middleware.IndexingAccessTokenMiddleware",
|
|
191
|
+
# ...
|
|
192
|
+
]
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
### Token endpoints
|
|
196
|
+
|
|
197
|
+
Both endpoints are under the same URL prefix as search (e.g. `/search/` when included at `path('search/', include('simplesitesearch.urls'))`):
|
|
198
|
+
|
|
199
|
+
| Endpoint | Auth header | Response |
|
|
200
|
+
|----------|-------------|----------|
|
|
201
|
+
| `POST …/internal/indexing/token/` | `Authorization: Bearer <bootstrap_token>` | `access_token`, `refresh_token`, `expires_in`, `refresh_expires_in` |
|
|
202
|
+
| `POST …/internal/indexing/refresh/` | `Authorization: Bearer <refresh_token>` | New token pair (refresh rotation) |
|
|
203
|
+
|
|
204
|
+
Send the access token on subsequent requests; the middleware logs in the configured user for that request.
|
|
205
|
+
|
|
163
206
|
## Utility functions (QOL)
|
|
164
207
|
|
|
165
208
|
The app provides helpers in `simplesitesearch.utils` for use in views, management commands, or other code.
|
|
@@ -285,6 +328,13 @@ For support and questions, please open an issue on the [GitHub repository](https
|
|
|
285
328
|
|
|
286
329
|
## Changelog
|
|
287
330
|
|
|
331
|
+
### 0.0.7
|
|
332
|
+
- **Added** optional indexing authentication: bootstrap/refresh token endpoints, cache-backed opaque tokens, and `IndexingAccessTokenMiddleware` for indexer login via Bearer access tokens.
|
|
333
|
+
|
|
334
|
+
### 0.0.6
|
|
335
|
+
- **Changed** search results: API hits normalized for templates (`display_title`, `snippet`, domain/type/language/tags/date metadata); highlighted title when available.
|
|
336
|
+
- **Changed** templates: `{% load static %}` instead of deprecated `staticfiles` (Django 4+).
|
|
337
|
+
|
|
288
338
|
### 0.0.5
|
|
289
339
|
- **Fixed** tag parsing: single tag string (e.g. `Hometag`) no longer sent as `H,o,m,e,t,a,g`; string is treated as one tag. API URL keeps commas unencoded so multiple tags parse correctly.
|
|
290
340
|
|
|
@@ -6,6 +6,9 @@ setup.cfg
|
|
|
6
6
|
setup.py
|
|
7
7
|
simplesitesearch/__init__.py
|
|
8
8
|
simplesitesearch/cms_apps.py
|
|
9
|
+
simplesitesearch/indexing.py
|
|
10
|
+
simplesitesearch/indexing_middleware.py
|
|
11
|
+
simplesitesearch/indexing_views.py
|
|
9
12
|
simplesitesearch/urls.py
|
|
10
13
|
simplesitesearch/utils.py
|
|
11
14
|
simplesitesearch/views.py
|
|
@@ -1 +0,0 @@
|
|
|
1
|
-
__version__ = "0.0.5"
|
|
File without changes: LICENSE, MANIFEST.in, simplesitesearch/cms_apps.py, simplesitesearch/utils.py, simplesitesearch.egg-info/dependency_links.txt, simplesitesearch.egg-info/not-zip-safe, simplesitesearch.egg-info/requires.txt, simplesitesearch.egg-info/top_level.txt