simplesitesearch 0.0.5__tar.gz → 0.0.7__tar.gz

Files changed (25)
  1. {simplesitesearch-0.0.5/simplesitesearch.egg-info → simplesitesearch-0.0.7}/PKG-INFO +51 -1
  2. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/README.md +50 -0
  3. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/pyproject.toml +1 -1
  4. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/setup.cfg +1 -1
  5. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/setup.py +1 -1
  6. simplesitesearch-0.0.7/simplesitesearch/__init__.py +1 -0
  7. simplesitesearch-0.0.7/simplesitesearch/indexing.py +156 -0
  8. simplesitesearch-0.0.7/simplesitesearch/indexing_middleware.py +46 -0
  9. simplesitesearch-0.0.7/simplesitesearch/indexing_views.py +70 -0
  10. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/templates/simplesitesearch/pagination.html +1 -1
  11. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/templates/simplesitesearch/search_results.html +19 -3
  12. simplesitesearch-0.0.7/simplesitesearch/urls.py +17 -0
  13. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/views.py +46 -3
  14. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7/simplesitesearch.egg-info}/PKG-INFO +51 -1
  15. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/SOURCES.txt +3 -0
  16. simplesitesearch-0.0.5/simplesitesearch/__init__.py +0 -1
  17. simplesitesearch-0.0.5/simplesitesearch/urls.py +0 -7
  18. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/LICENSE +0 -0
  19. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/MANIFEST.in +0 -0
  20. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/cms_apps.py +0 -0
  21. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch/utils.py +0 -0
  22. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/dependency_links.txt +0 -0
  23. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/not-zip-safe +0 -0
  24. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/requires.txt +0 -0
  25. {simplesitesearch-0.0.5 → simplesitesearch-0.0.7}/simplesitesearch.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: simplesitesearch
3
- Version: 0.0.5
3
+ Version: 0.0.7
4
4
  Summary: Reptile Simple Site Search django app
5
5
  Home-page: https://github.com/FlavienLouis/simplesitesearch
6
6
  Author: Reptile Tech
@@ -160,6 +160,49 @@ Pagination links preserve the tag filter.
160
160
 
161
161
  The search includes basic honeypot protection. If a `message` parameter is present, the search will not execute.
162
162
 
163
+ ## Indexing authentication (optional)
164
+
165
+ When crawling or indexing protected CMS content, enable opaque bearer-token auth so indexers can act as a configured Django user.
166
+
167
+ ### Settings
168
+
169
+ ```python
170
+ SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
171
+ SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer" # username, user id, or pk string
172
+ SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
173
+
174
+ # Optional (defaults shown)
175
+ SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL = 3600 # seconds
176
+ SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL = 86400 # seconds
177
+ SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX = "sss_idx"
178
+ ```
179
+
180
+ Use a shared Django cache backend in multi-worker deployments so tokens are visible across processes.
181
+
182
+ ### Middleware
183
+
184
+ Add after `AuthenticationMiddleware`:
185
+
186
+ ```python
187
+ MIDDLEWARE = [
188
+ # ...
189
+ "django.contrib.auth.middleware.AuthenticationMiddleware",
190
+ "simplesitesearch.indexing_middleware.IndexingAccessTokenMiddleware",
191
+ # ...
192
+ ]
193
+ ```
194
+
195
+ ### Token endpoints
196
+
197
+ Both endpoints are under the same URL prefix as search (e.g. `/search/` when included at `path('search/', include('simplesitesearch.urls'))`):
198
+
199
+ | Endpoint | Auth header | Response |
200
+ |----------|-------------|----------|
201
+ | `POST …/internal/indexing/token/` | `Authorization: Bearer <bootstrap_token>` | `access_token`, `refresh_token`, `expires_in`, `refresh_expires_in` |
202
+ | `POST …/internal/indexing/refresh/` | `Authorization: Bearer <refresh_token>` | New token pair (refresh rotation) |
203
+
204
+ Send the access token on subsequent requests; the middleware logs in the configured user for that request.
205
+
163
206
  ## Utility functions (QOL)
164
207
 
165
208
  The app provides helpers in `simplesitesearch.utils` for use in views, management commands, or other code.
@@ -285,6 +328,13 @@ For support and questions, please open an issue on the [GitHub repository](https
285
328
 
286
329
  ## Changelog
287
330
 
331
+ ### 0.0.7
332
+ - **Added** optional indexing authentication: bootstrap/refresh token endpoints, cache-backed opaque tokens, and `IndexingAccessTokenMiddleware` for indexer login via Bearer access tokens.
333
+
334
+ ### 0.0.6
335
+ - **Changed** search results: API hits normalized for templates (`display_title`, `snippet`, domain/type/language/tags/date metadata); highlighted title when available.
336
+ - **Changed** templates: `{% load static %}` instead of deprecated `staticfiles` (Django 4+).
337
+
288
338
  ### 0.0.5
289
339
  - **Fixed** tag parsing: single tag string (e.g. `Hometag`) no longer sent as `H,o,m,e,t,a,g`; string is treated as one tag. API URL keeps commas unencoded so multiple tags parse correctly.
290
340
 
@@ -114,6 +114,49 @@ Pagination links preserve the tag filter.
114
114
 
115
115
  The search includes basic honeypot protection. If a `message` parameter is present, the search will not execute.
116
116
 
117
+ ## Indexing authentication (optional)
118
+
119
+ When crawling or indexing protected CMS content, enable opaque bearer-token auth so indexers can act as a configured Django user.
120
+
121
+ ### Settings
122
+
123
+ ```python
124
+ SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
125
+ SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer" # username, user id, or pk string
126
+ SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
127
+
128
+ # Optional (defaults shown)
129
+ SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL = 3600 # seconds
130
+ SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL = 86400 # seconds
131
+ SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX = "sss_idx"
132
+ ```
133
+
134
+ Use a shared Django cache backend in multi-worker deployments so tokens are visible across processes.
135
+
136
+ ### Middleware
137
+
138
+ Add after `AuthenticationMiddleware`:
139
+
140
+ ```python
141
+ MIDDLEWARE = [
142
+ # ...
143
+ "django.contrib.auth.middleware.AuthenticationMiddleware",
144
+ "simplesitesearch.indexing_middleware.IndexingAccessTokenMiddleware",
145
+ # ...
146
+ ]
147
+ ```
148
+
149
+ ### Token endpoints
150
+
151
+ Both endpoints are under the same URL prefix as search (e.g. `/search/` when included at `path('search/', include('simplesitesearch.urls'))`):
152
+
153
+ | Endpoint | Auth header | Response |
154
+ |----------|-------------|----------|
155
+ | `POST …/internal/indexing/token/` | `Authorization: Bearer <bootstrap_token>` | `access_token`, `refresh_token`, `expires_in`, `refresh_expires_in` |
156
+ | `POST …/internal/indexing/refresh/` | `Authorization: Bearer <refresh_token>` | New token pair (refresh rotation) |
157
+
158
+ Send the access token on subsequent requests; the middleware logs in the configured user for that request.
159
+
117
160
  ## Utility functions (QOL)
118
161
 
119
162
  The app provides helpers in `simplesitesearch.utils` for use in views, management commands, or other code.
@@ -239,6 +282,13 @@ For support and questions, please open an issue on the [GitHub repository](https
239
282
 
240
283
  ## Changelog
241
284
 
285
+ ### 0.0.7
286
+ - **Added** optional indexing authentication: bootstrap/refresh token endpoints, cache-backed opaque tokens, and `IndexingAccessTokenMiddleware` for indexer login via Bearer access tokens.
287
+
288
+ ### 0.0.6
289
+ - **Changed** search results: API hits normalized for templates (`display_title`, `snippet`, domain/type/language/tags/date metadata); highlighted title when available.
290
+ - **Changed** templates: `{% load static %}` instead of deprecated `staticfiles` (Django 4+).
291
+
242
292
  ### 0.0.5
243
293
  - **Fixed** tag parsing: single tag string (e.g. `Hometag`) no longer sent as `H,o,m,e,t,a,g`; string is treated as one tag. API URL keeps commas unencoded so multiple tags parse correctly.
244
294
 
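Reviewer note on the "shared Django cache backend" line in the README hunk above: Django's default cache backend is the per-process `LocMemCache`, so a token issued by one worker would not be found by another. A minimal settings sketch, assuming Django 4.0+ (for the built-in Redis backend) and a hypothetical Redis instance on localhost, could look like this. The `LOCATION` value is an illustration, not something the package prescribes.

```python
# settings.py fragment (sketch). Assumes Django 4.0+ and a reachable Redis
# server; the LOCATION below is a hypothetical local instance.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
    }
}

# Settings named in the README hunk; the secret is a placeholder.
SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer"
SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
```

Any backend shared between workers (Memcached, database cache) should serve the same purpose; Redis is used here only as an example.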
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "simplesitesearch"
7
- version = "0.0.5"
7
+ version = "0.0.7"
8
8
  description = "Reptile Simple Site Search django app"
9
9
  readme = "README.md"
10
10
  license = {text = "MIT"}
@@ -1,6 +1,6 @@
1
1
  [metadata]
2
2
  name = simplesitesearch
3
- version = 0.0.4
3
+ version = 0.0.7
4
4
  author = Reptile Tech
5
5
  author_email = flouis@reptile.tech
6
6
  description = Reptile Simple Site Search django app
@@ -8,7 +8,7 @@ with open("README.md", "r", encoding="utf-8") as fh:
8
8
 
9
9
  setup(
10
10
  name="simplesitesearch",
11
- version="0.0.5",
11
+ version="0.0.7",
12
12
  author="Reptile Tech",
13
13
  author_email="flouis@reptile.tech",
14
14
  description="Reptile Simple Site Search django app",
@@ -0,0 +1 @@
+ __version__ = "0.0.7"
@@ -0,0 +1,156 @@
+ """
+ Opaque indexing tokens stored in Django cache (shared backend required in multi-worker setups).
+ """
+ from __future__ import annotations
+
+ import hashlib
+ import hmac
+ import json
+ import secrets
+ from typing import Optional, Tuple
+
+ from django.conf import settings
+ from django.core.cache import cache
+
+
+ def indexing_cache_prefix() -> str:
+     return getattr(
+         settings,
+         "SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX",
+         "sss_idx",
+     )
+
+
+ def access_ttl_seconds() -> int:
+     return int(
+         getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL", 3600)
+     )
+
+
+ def refresh_ttl_seconds() -> int:
+     return int(
+         getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL", 86400)
+     )
+
+
+ def _hash_token(raw: str) -> str:
+     return hashlib.sha256(raw.encode("utf-8")).hexdigest()
+
+
+ def access_cache_key(raw_access_token: str) -> str:
+     return "%s:a:%s" % (indexing_cache_prefix(), _hash_token(raw_access_token))
+
+
+ def refresh_cache_key(raw_refresh_token: str) -> str:
+     return "%s:r:%s" % (indexing_cache_prefix(), _hash_token(raw_refresh_token))
+
+
+ def parse_bearer_authorization(header_value: str) -> Optional[str]:
+     if not header_value:
+         return None
+     parts = header_value.split(None, 1)
+     if len(parts) != 2 or parts[0].lower() != "bearer":
+         return None
+     token = parts[1].strip()
+     return token or None
+
+
+ def bootstrap_token_valid(given: str) -> bool:
+     expected = getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN", "") or ""
+     if not expected or not given:
+         return False
+     try:
+         eb = expected.encode("utf-8")
+         gb = given.encode("utf-8")
+     except UnicodeEncodeError:
+         return False
+     if len(eb) != len(gb):
+         return False
+     return hmac.compare_digest(eb, gb)
+
+
+ def issue_token_pair(user_id: int) -> Tuple[str, str, int, int]:
+     access = secrets.token_urlsafe(32)
+     refresh = secrets.token_urlsafe(32)
+     at = access_ttl_seconds()
+     rt = refresh_ttl_seconds()
+     cache.set(
+         access_cache_key(access),
+         json.dumps({"user_id": user_id}),
+         at,
+     )
+     cache.set(
+         refresh_cache_key(refresh),
+         json.dumps({"user_id": user_id}),
+         rt,
+     )
+     return access, refresh, at, rt
+
+
+ def lookup_access_user_id(raw_access_token: str) -> Optional[int]:
+     raw = cache.get(access_cache_key(raw_access_token))
+     if raw is None:
+         return None
+     if isinstance(raw, bytes):
+         raw = raw.decode("utf-8")
+     try:
+         data = json.loads(raw)
+         uid = data.get("user_id")
+         return int(uid) if uid is not None else None
+     except (TypeError, ValueError, json.JSONDecodeError):
+         return None
+
+
+ def consume_refresh_and_rotate(user_id: int, old_refresh: str) -> Optional[Tuple[str, str, int, int]]:
+     """
+     Validate refresh token, delete it, issue new access + refresh (rotation).
+     Returns (access, refresh, access_ttl, refresh_ttl) or None if invalid.
+     """
+     rkey = refresh_cache_key(old_refresh)
+     raw = cache.get(rkey)
+     if raw is None:
+         return None
+     if isinstance(raw, bytes):
+         raw = raw.decode("utf-8")
+     try:
+         data = json.loads(raw)
+         if int(data.get("user_id")) != int(user_id):
+             return None
+     except (TypeError, ValueError, json.JSONDecodeError):
+         return None
+     cache.delete(rkey)
+     return issue_token_pair(user_id)
+
+
+ def lookup_refresh_user_id(raw_refresh_token: str) -> Optional[int]:
+     rkey = refresh_cache_key(raw_refresh_token)
+     raw = cache.get(rkey)
+     if raw is None:
+         return None
+     if isinstance(raw, bytes):
+         raw = raw.decode("utf-8")
+     try:
+         data = json.loads(raw)
+         return int(data["user_id"])
+     except (TypeError, ValueError, KeyError, json.JSONDecodeError):
+         return None
+
+
+ def resolve_indexing_user_id() -> Optional[int]:
+     spec = getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_USER", None)
+     if spec is None:
+         return None
+     from django.contrib.auth import get_user_model
+
+     User = get_user_model()
+     if isinstance(spec, int):
+         pk = spec
+     else:
+         s = str(spec).strip()
+         if s.isdigit():
+             pk = int(s)
+         else:
+             lookup = {User.USERNAME_FIELD: s}
+             u = User.objects.filter(**lookup).values_list("pk", flat=True).first()
+             return int(u) if u is not None else None
+     return pk if User.objects.filter(pk=pk).exists() else None
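Reviewer note: the issue, lookup, and rotate flow added in `indexing.py` can be tried without Django. The sketch below is an illustration only. It assumes a plain dict (`FAKE_CACHE`, hypothetical) in place of `django.core.cache.cache`, ignores TTL expiry, and collapses the separate access/refresh lookups into one hypothetical `lookup_user_id` helper, so it is not the package's code.

```python
import hashlib
import json
import secrets
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for django.core.cache.cache; no TTL handling here.
FAKE_CACHE: Dict[str, str] = {}
PREFIX = "sss_idx"  # default cache prefix shown in the diff


def _hash_token(raw: str) -> str:
    # Only the SHA-256 digest appears in the key, so the raw token is never stored.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def _key(kind: str, raw: str) -> str:
    # kind is "a" for access tokens and "r" for refresh tokens, as in the diff.
    return "%s:%s:%s" % (PREFIX, kind, _hash_token(raw))


def issue_token_pair(user_id: int) -> Tuple[str, str]:
    access = secrets.token_urlsafe(32)
    refresh = secrets.token_urlsafe(32)
    payload = json.dumps({"user_id": user_id})
    FAKE_CACHE[_key("a", access)] = payload
    FAKE_CACHE[_key("r", refresh)] = payload
    return access, refresh


def lookup_user_id(kind: str, raw: str) -> Optional[int]:
    stored = FAKE_CACHE.get(_key(kind, raw))
    if stored is None:
        return None
    return int(json.loads(stored)["user_id"])


def rotate(old_refresh: str) -> Optional[Tuple[str, str]]:
    user_id = lookup_user_id("r", old_refresh)
    if user_id is None:
        return None
    # The old refresh token is deleted before a new pair is issued (single use).
    del FAKE_CACHE[_key("r", old_refresh)]
    return issue_token_pair(user_id)


access, refresh = issue_token_pair(7)
new_pair = rotate(refresh)        # succeeds and returns a fresh pair
second_attempt = rotate(refresh)  # the old refresh token no longer exists
```

Two properties of the diff are visible here. Rotation deletes only the refresh key, so the previous access token stays valid until its TTL runs out. In the real module the `cache.get` and `cache.delete` calls are separate operations, so two concurrent refresh requests with the same token could presumably both succeed.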
@@ -0,0 +1,46 @@
+ from __future__ import annotations
+
+ from django.conf import settings
+ from django.contrib.auth import get_user_model, login
+
+ from .indexing import lookup_access_user_id, parse_bearer_authorization
+
+
+ def _skip_for_path(path: str) -> bool:
+     if "internal/indexing/token" in path or "internal/indexing/refresh" in path:
+         return True
+     return False
+
+
+ class IndexingAccessTokenMiddleware:
+     """
+     After AuthenticationMiddleware: if Authorization Bearer matches a valid indexing
+     access token, log in as the configured indexer user for this request.
+     """
+
+     def __init__(self, get_response):
+         self.get_response = get_response
+
+     def __call__(self, request):
+         if not getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ENABLED", False):
+             return self.get_response(request)
+         if _skip_for_path(getattr(request, "path", "") or ""):
+             return self.get_response(request)
+         authz = request.META.get("HTTP_AUTHORIZATION", "")
+         token = parse_bearer_authorization(authz)
+         if not token:
+             return self.get_response(request)
+         user_id = lookup_access_user_id(token)
+         if user_id is None:
+             return self.get_response(request)
+         User = get_user_model()
+         try:
+             user = User.objects.get(pk=user_id, is_active=True)
+         except User.DoesNotExist:
+             return self.get_response(request)
+         login(
+             request,
+             user,
+             backend="django.contrib.auth.backends.ModelBackend",
+         )
+         return self.get_response(request)
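Reviewer note: the middleware falls through to the normal request path whenever `parse_bearer_authorization` returns `None`, so it helps to see which header values that covers. The function body below is copied from the `indexing.py` hunk (it is pure Python). The sample header values are made up for illustration.

```python
from typing import Optional


def parse_bearer_authorization(header_value: str) -> Optional[str]:
    # Same logic as in the diff: split the scheme from the credential once,
    # accept the scheme case-insensitively, and reject empty tokens.
    if not header_value:
        return None
    parts = header_value.split(None, 1)
    if len(parts) != 2 or parts[0].lower() != "bearer":
        return None
    token = parts[1].strip()
    return token or None


# Illustrative header values (not taken from the package).
examples = {
    "Bearer abc123": parse_bearer_authorization("Bearer abc123"),
    "bearer   abc123 ": parse_bearer_authorization("bearer   abc123 "),
    "Basic dXNlcjpwdw==": parse_bearer_authorization("Basic dXNlcjpwdw=="),
    "Bearer": parse_bearer_authorization("Bearer"),
    "": parse_bearer_authorization(""),
}
```

Only the first two values yield a token (`"abc123"` in both cases). A non-Bearer scheme, a bare scheme, or an empty header all return `None`, and the request continues unauthenticated by this middleware. Since the middleware calls Django's `login()`, which stores the user in the session, a successful match may also create a session rather than affecting only the current request.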
@@ -0,0 +1,70 @@
+ from __future__ import annotations
+
+ from django.conf import settings
+ from django.http import JsonResponse
+ from django.views.decorators.csrf import csrf_exempt
+ from django.views.decorators.http import require_POST
+
+ from .indexing import (
+     bootstrap_token_valid,
+     consume_refresh_and_rotate,
+     issue_token_pair,
+     lookup_refresh_user_id,
+     parse_bearer_authorization,
+     resolve_indexing_user_id,
+ )
+
+
+ def _indexing_disabled_response():
+     return JsonResponse({"detail": "indexing_disabled"}, status=403)
+
+
+ def _json_error(detail: str, status: int):
+     return JsonResponse({"detail": detail}, status=status)
+
+
+ @csrf_exempt
+ @require_POST
+ def obtain_indexing_token(request):
+     if not getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ENABLED", False):
+         return _indexing_disabled_response()
+     user_id = resolve_indexing_user_id()
+     if user_id is None:
+         return _json_error("indexing_user_not_configured", 503)
+     token = parse_bearer_authorization(request.META.get("HTTP_AUTHORIZATION", ""))
+     if not token or not bootstrap_token_valid(token):
+         return _json_error("invalid_bootstrap", 401)
+     access, refresh, at, rt = issue_token_pair(user_id)
+     return JsonResponse(
+         {
+             "access_token": access,
+             "refresh_token": refresh,
+             "expires_in": at,
+             "refresh_expires_in": rt,
+         }
+     )
+
+
+ @csrf_exempt
+ @require_POST
+ def refresh_indexing_token(request):
+     if not getattr(settings, "SIMPLE_SITE_SEARCH_INDEXING_ENABLED", False):
+         return _indexing_disabled_response()
+     token = parse_bearer_authorization(request.META.get("HTTP_AUTHORIZATION", ""))
+     if not token:
+         return _json_error("missing_refresh", 401)
+     user_id = lookup_refresh_user_id(token)
+     if user_id is None:
+         return _json_error("invalid_refresh", 401)
+     pair = consume_refresh_and_rotate(user_id, token)
+     if pair is None:
+         return _json_error("invalid_refresh", 401)
+     access, refresh, at, rt = pair
+     return JsonResponse(
+         {
+             "access_token": access,
+             "refresh_token": refresh,
+             "expires_in": at,
+             "refresh_expires_in": rt,
+         }
+     )
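Reviewer note: from the indexer's side, both views expect a `POST` with the credential in the `Authorization` header and no body. The sketch below builds such requests with the standard library only. It assumes the app is mounted at a hypothetical `https://example.com/search/`, and the helper names are invented for this example. Nothing is sent over the network until `urllib.request.urlopen` is called.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "https://example.com/search/"  # hypothetical mount point


def _bearer_post(path: str, credential: str) -> urllib.request.Request:
    # The views only read the Authorization header, so the body stays empty.
    return urllib.request.Request(
        BASE_URL + path,
        data=b"",
        method="POST",
        headers={"Authorization": "Bearer " + credential},
    )


def build_token_request(bootstrap_token: str) -> urllib.request.Request:
    return _bearer_post("internal/indexing/token/", bootstrap_token)


def build_refresh_request(refresh_token: str) -> urllib.request.Request:
    return _bearer_post("internal/indexing/refresh/", refresh_token)


def parse_token_response(body: bytes) -> Dict[str, Any]:
    # Keep only the four fields both views return on success.
    data = json.loads(body)
    fields = ("access_token", "refresh_token", "expires_in", "refresh_expires_in")
    return {name: data[name] for name in fields}


token_request = build_token_request("your-long-random-secret")
sample_body = (
    b'{"access_token": "a", "refresh_token": "r", '
    b'"expires_in": 3600, "refresh_expires_in": 86400}'
)
tokens = parse_token_response(sample_body)
```

In real use the body would come from `urllib.request.urlopen(token_request).read()`. Error responses from the views carry a `detail` field with status 401, 403, or 503, which `urlopen` surfaces as `urllib.error.HTTPError`.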
@@ -1,4 +1,4 @@
1
- {% load i18n staticfiles cms_tags %}
1
+ {% load i18n static cms_tags %}
2
2
 
3
3
  <div class="search__pagination d-flex justify-content-between justify-sm-content-center align-items-center">
4
4
  <div class="search__pagination__nav search__pagination__nav--prev">
@@ -1,5 +1,5 @@
1
1
  {% extends "base.html" %}
2
- {% load i18n staticfiles cms_tags %}
2
+ {% load i18n static cms_tags %}
3
3
 
4
4
  {% block template_class %}template_search{% endblock %}
5
5
 
@@ -27,13 +27,29 @@
27
27
  {% for result in results %}
28
28
  <div class="search__content__result">
29
29
  <div class="headings__third-container">
30
- <a class="headings headings__third headings--blue" href="{{result.url}}">{{ result.title|safe }}</a>
30
+ <a class="headings headings__third headings--blue" href="{{ result.url }}">{{ result.display_title|safe }}</a>
31
31
  </div>
32
+ {% if result.snippet %}
32
33
  <div class="text-container">
33
34
  <div class="text text--graydark">
34
- <p>{{result.highlight|safe}}...</p>
35
+ <p>{{ result.snippet|safe }}…</p>
35
36
  </div>
36
37
  </div>
38
+ {% endif %}
39
+ {% if result.domain or result.type or result.language or result.last_modified %}
40
+ <div class="text-container">
41
+ <div class="text text--graydark">
42
+ <p>{% if result.domain %}{{ result.domain }}{% endif %}{% if result.type %}{% if result.domain %} · {% endif %}{{ result.type }}{% endif %}{% if result.language %}{% if result.domain or result.type %} · {% endif %}{{ result.language }}{% endif %}{% if result.last_modified %}{% if result.domain or result.type or result.language %} · {% endif %}{{ result.last_modified|date:"SHORT_DATETIME_FORMAT" }}{% endif %}</p>
43
+ </div>
44
+ </div>
45
+ {% endif %}
46
+ {% if result.tags %}
47
+ <div class="text-container">
48
+ <div class="text text--graydark">
49
+ <p>{% for t in result.tags %}{{ t }}{% if not forloop.last %}, {% endif %}{% endfor %}</p>
50
+ </div>
51
+ </div>
52
+ {% endif %}
37
53
  </div>
38
54
  {% empty %}
39
55
  <div class="search__content--empty">
@@ -0,0 +1,17 @@
+ from django.urls import re_path
+
+ from . import indexing_views, views
+
+ urlpatterns = [
+     re_path(r'^$', views.SearchResult.as_view(), name='search'),
+     re_path(
+         r'^internal/indexing/token/?$',
+         indexing_views.obtain_indexing_token,
+         name='sss_indexing_token',
+     ),
+     re_path(
+         r'^internal/indexing/refresh/?$',
+         indexing_views.refresh_indexing_token,
+         name='sss_indexing_refresh',
+     ),
+ ]
@@ -1,5 +1,6 @@
1
1
  from math import floor
2
2
 
3
+ from django.utils.dateparse import parse_datetime
3
4
  from django.views.generic import TemplateView
4
5
 
5
6
  from .utils import (
@@ -59,6 +60,46 @@ def get_total_pages(total_hits):
59
60
  return pages_count
60
61
 
61
62
 
63
+ def _optional_datetime(value):
64
+ if not value or not isinstance(value, str):
65
+ return None
66
+ return parse_datetime(value)
67
+
68
+
69
+ def normalize_search_hit(hit):
70
+ """
71
+ Reduce each API hit to fields useful for templates (omit bulky nested blobs).
72
+ """
73
+ if not isinstance(hit, dict):
74
+ return hit
75
+
76
+ highlights = hit.get("highlights") or {}
77
+ title_highlight = None
78
+ title_snippets = highlights.get("title")
79
+ if title_snippets and isinstance(title_snippets, list) and title_snippets[0]:
80
+ title_highlight = title_snippets[0]
81
+
82
+ snippet = (
83
+ hit.get("highlight")
84
+ or hit.get("description")
85
+ or hit.get("content_preview")
86
+ or ""
87
+ )
88
+
89
+ modified_raw = hit.get("last_modified") or hit.get("indexed_at") or ""
90
+
91
+ return {
92
+ "url": hit.get("url") or "",
93
+ "display_title": title_highlight or hit.get("title") or "",
94
+ "snippet": snippet,
95
+ "domain": hit.get("domain") or "",
96
+ "type": hit.get("type") or "",
97
+ "tags": hit.get("tags") if isinstance(hit.get("tags"), list) else [],
98
+ "language": hit.get("language") or "",
99
+ "last_modified": _optional_datetime(modified_raw),
100
+ }
101
+
102
+
62
103
  def get_api_re_path(term, current_page, tags=None):
63
104
  """Build the search API URL (delegates to utils for consistency)."""
64
105
  return get_search_api_url(term, current_page, tags=tags)
@@ -84,21 +125,23 @@ class SearchResult(TemplateView):
84
125
  response_data = get_search_results(
85
126
  term, current_page, tags=tags_list or None
86
127
  )
87
- pages_count = get_total_pages(response_data["total_hits"])
128
+ total_hits = response_data.get("total_hits", 0)
129
+ pages_count = get_total_pages(total_hits)
88
130
  prev_page_number, next_page_number = get_prev_next_page_number(pages_count, current_page)
89
131
  prev_link, next_link = get_prev_next_links(
90
132
  next_page_number, prev_page_number, term, tags=tags_list or None
91
133
  )
92
134
  page_links = get_page_links(pages_count, current_page, term, tags=tags_list or None)
93
135
 
136
+ raw_hits = response_data.get("hits") or []
94
137
  context.update({
95
138
  "pages_count": pages_count,
96
139
  "current_page": current_page,
97
- "results_count": response_data["total_hits"],
140
+ "results_count": total_hits,
98
141
  "prev_link": prev_link,
99
142
  "next_link": next_link,
100
143
  "page_links": page_links,
101
- "results": response_data["hits"],
144
+ "results": [normalize_search_hit(h) for h in raw_hits],
102
145
  })
103
146
  else:
104
147
  context.update({"results": None})
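Reviewer note: the fallback order in the new `normalize_search_hit` is easier to follow with a concrete hit. The sketch below mirrors that order in plain Python. It assumes `datetime.fromisoformat` as a stand-in for Django's `parse_datetime`, omits the `domain`, `type`, and `language` fields for brevity, and uses a made-up sample hit, so it is an approximation rather than the package's function.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def optional_datetime(value: Any) -> Optional[datetime]:
    # Stand-in for django.utils.dateparse.parse_datetime (assumption:
    # plain ISO 8601 strings without a trailing "Z").
    if not value or not isinstance(value, str):
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None


def normalize_hit_sketch(hit: Dict[str, Any]) -> Dict[str, Any]:
    highlights = hit.get("highlights") or {}
    title_snippets = highlights.get("title")
    title_highlight = None
    if title_snippets and isinstance(title_snippets, list) and title_snippets[0]:
        title_highlight = title_snippets[0]
    return {
        "url": hit.get("url") or "",
        # A highlighted title wins over the plain title.
        "display_title": title_highlight or hit.get("title") or "",
        # First non-empty of highlight, description, content_preview.
        "snippet": (
            hit.get("highlight")
            or hit.get("description")
            or hit.get("content_preview")
            or ""
        ),
        # Anything that is not a list is replaced by an empty list.
        "tags": hit.get("tags") if isinstance(hit.get("tags"), list) else [],
        "last_modified": optional_datetime(
            hit.get("last_modified") or hit.get("indexed_at") or ""
        ),
    }


# Hypothetical API hit, invented for this example.
sample_hit = {
    "url": "https://example.com/a",
    "title": "Plain title",
    "highlights": {"title": ["<em>Plain</em> title"]},
    "description": "Fallback text",
    "tags": "news",
    "indexed_at": "2024-05-01T12:30:00",
}
normalized = normalize_hit_sketch(sample_hit)
```

For this hit the highlighted title is chosen, the snippet falls back to `description`, the string-valued `tags` becomes an empty list, and `indexed_at` supplies the date because `last_modified` is absent.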
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: simplesitesearch
3
- Version: 0.0.5
3
+ Version: 0.0.7
4
4
  Summary: Reptile Simple Site Search django app
5
5
  Home-page: https://github.com/FlavienLouis/simplesitesearch
6
6
  Author: Reptile Tech
@@ -160,6 +160,49 @@ Pagination links preserve the tag filter.
160
160
 
161
161
  The search includes basic honeypot protection. If a `message` parameter is present, the search will not execute.
162
162
 
163
+ ## Indexing authentication (optional)
164
+
165
+ When crawling or indexing protected CMS content, enable opaque bearer-token auth so indexers can act as a configured Django user.
166
+
167
+ ### Settings
168
+
169
+ ```python
170
+ SIMPLE_SITE_SEARCH_INDEXING_ENABLED = True
171
+ SIMPLE_SITE_SEARCH_INDEXING_USER = "indexer" # username, user id, or pk string
172
+ SIMPLE_SITE_SEARCH_INDEXING_BOOTSTRAP_TOKEN = "your-long-random-secret"
173
+
174
+ # Optional (defaults shown)
175
+ SIMPLE_SITE_SEARCH_INDEXING_ACCESS_TTL = 3600 # seconds
176
+ SIMPLE_SITE_SEARCH_INDEXING_REFRESH_TTL = 86400 # seconds
177
+ SIMPLE_SITE_SEARCH_INDEXING_CACHE_PREFIX = "sss_idx"
178
+ ```
179
+
180
+ Use a shared Django cache backend in multi-worker deployments so tokens are visible across processes.
181
+
182
+ ### Middleware
183
+
184
+ Add after `AuthenticationMiddleware`:
185
+
186
+ ```python
187
+ MIDDLEWARE = [
188
+ # ...
189
+ "django.contrib.auth.middleware.AuthenticationMiddleware",
190
+ "simplesitesearch.indexing_middleware.IndexingAccessTokenMiddleware",
191
+ # ...
192
+ ]
193
+ ```
194
+
195
+ ### Token endpoints
196
+
197
+ Both endpoints are under the same URL prefix as search (e.g. `/search/` when included at `path('search/', include('simplesitesearch.urls'))`):
198
+
199
+ | Endpoint | Auth header | Response |
200
+ |----------|-------------|----------|
201
+ | `POST …/internal/indexing/token/` | `Authorization: Bearer <bootstrap_token>` | `access_token`, `refresh_token`, `expires_in`, `refresh_expires_in` |
202
+ | `POST …/internal/indexing/refresh/` | `Authorization: Bearer <refresh_token>` | New token pair (refresh rotation) |
203
+
204
+ Send the access token on subsequent requests; the middleware logs in the configured user for that request.
205
+
163
206
  ## Utility functions (QOL)
164
207
 
165
208
  The app provides helpers in `simplesitesearch.utils` for use in views, management commands, or other code.
@@ -285,6 +328,13 @@ For support and questions, please open an issue on the [GitHub repository](https
285
328
 
286
329
  ## Changelog
287
330
 
331
+ ### 0.0.7
332
+ - **Added** optional indexing authentication: bootstrap/refresh token endpoints, cache-backed opaque tokens, and `IndexingAccessTokenMiddleware` for indexer login via Bearer access tokens.
333
+
334
+ ### 0.0.6
335
+ - **Changed** search results: API hits normalized for templates (`display_title`, `snippet`, domain/type/language/tags/date metadata); highlighted title when available.
336
+ - **Changed** templates: `{% load static %}` instead of deprecated `staticfiles` (Django 4+).
337
+
288
338
  ### 0.0.5
289
339
  - **Fixed** tag parsing: single tag string (e.g. `Hometag`) no longer sent as `H,o,m,e,t,a,g`; string is treated as one tag. API URL keeps commas unencoded so multiple tags parse correctly.
290
340
 
@@ -6,6 +6,9 @@ setup.cfg
6
6
  setup.py
7
7
  simplesitesearch/__init__.py
8
8
  simplesitesearch/cms_apps.py
9
+ simplesitesearch/indexing.py
10
+ simplesitesearch/indexing_middleware.py
11
+ simplesitesearch/indexing_views.py
9
12
  simplesitesearch/urls.py
10
13
  simplesitesearch/utils.py
11
14
  simplesitesearch/views.py
@@ -1 +0,0 @@
- __version__ = "0.0.5"
@@ -1,7 +0,0 @@
- from django.urls import include, re_path
-
- from . import views
-
- urlpatterns = [
-     re_path(r'^$', views.SearchResult.as_view(), name='search'),
- ]