weavert-kit-common-web-research 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,80 @@
+ Metadata-Version: 2.4
+ Name: weavert-kit-common-web-research
+ Version: 0.1.0
+ Summary: Unified AI-first web_research and read-only web primitive kit surfaces for WeaveRT product kits.
+ Author: WeaveRT Maintainers
+ License-Expression: Apache-2.0
+ Project-URL: Homepage, https://github.com/xyz2b/weave-ai-runtime
+ Project-URL: Documentation, https://github.com/xyz2b/weave-ai-runtime/tree/main/docs
+ Project-URL: Repository, https://github.com/xyz2b/weave-ai-runtime
+ Project-URL: Issues, https://github.com/xyz2b/weave-ai-runtime/issues
+ Keywords: weavert,agents,ai,product-kit,shared-kit
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Developers
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Requires-Python: >=3.11
+ Description-Content-Type: text/markdown
+ Requires-Dist: weavert<0.2.0,>=0.1.0
+ Requires-Dist: weavert-kit-common-retrieval<0.2.0,>=0.1.0
+ Requires-Dist: weavert-web-research<0.2.0,>=0.1.0
+
+ # Web Research Common Kit
+
+ Canonical import root: `weavert_kit_common_web_research`
+
+ ## What this package owns
+
+ - the unified public `web_research` entrypoint for read-only public-web information retrieval
+ - low-level `web_search`, single-page `web_fetch`, and `web_find` primitives backed by `weavert-web-research`
+ - a package-owned goal-driven research loop behind `web_research` that plans queries, selects pages, evaluates evidence coverage, and stops with explicit reasons
+ - the package-owned `web-searcher` delegated worker reserved for bounded implementation-period fallback paths
+ - first-party research profiles: `general`, `coding`, `business`, `academic`, `legal_compliance`, and `product_shopping`
+ - common result envelopes with sources, evidence, conflicts, gaps, freshness, provider metadata, research trace, and profile facets
+
+ ## Canonical names
+
+ - package root: `packages/product-kits/common/web-research`
+ - install name: `weavert-kit-common-web-research`
+ - import root: `weavert_kit_common_web_research`
+ - runtime activation: `weavert-shared-web-research`
+
+ ## Boundary
+
+ Use `web_research` for goal-driven AI-first research and pass `profile="coding"` or another supported profile when the scenario needs profile-specific source ranking or facets. `web_research` is the supported path for multi-page source discovery and inspection: it derives bounded queries from the objective, ranks candidate pages, inspects ledger-verified sources, reports gaps or conflicts, and exposes loop decisions in `research_trace` and `trace_summary`. Use low-level `web_fetch` only for one explicit page at a time; callers that need manual multi-page inspection should issue repeated single-page fetches.
+
+ This package is read-only. Browser navigation, clicks, form filling, authenticated browsing, and DOM interaction remain browser-bridge responsibilities.
+
+ ## Search Provider Selection
+
+ Public tool names stay stable: callers continue to use `web_research`, `web_search`, `web_fetch`, and `web_find`. Search provider selection is handled by the shared `weavert-web-research` core.
+
+ - `google-search`: set `GOOGLE_SEARCH_API_KEY` and `GOOGLE_SEARCH_CX`; optionally set `WEAVERT_WEB_SEARCH_PROVIDER=google-search`.
+ - `brave-search`: set `BRAVE_SEARCH_API_KEY` or `WEAVERT_BRAVE_SEARCH_API_KEY`; optionally set `WEAVERT_WEB_SEARCH_PROVIDER=brave-search`.
+ - `bing-grounding`: set `FOUNDRY_PROJECT_ENDPOINT`, `FOUNDRY_MODEL_DEPLOYMENT_NAME`, `BING_PROJECT_CONNECTION_ID`, and `AGENT_TOKEN`; optionally set `WEAVERT_WEB_SEARCH_PROVIDER=bing-grounding`.
+ - `duckduckgo-html`: no-credential fallback. It does not expose a stable freshness filter through this adapter.
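
The credential requirements in the list above can be sketched as a small resolver. This is not the `weavert-web-research` implementation, which this diff does not include: the names `REQUIRED_ENV`, `has_credentials`, and `resolve_search_provider` are hypothetical, and both the auto-detection order and the error raised for an override without credentials are assumptions. Only the provider ids and environment variable names come from the README.

```python
from __future__ import annotations

from collections.abc import Mapping

# Credential requirements as listed in the README. Each provider maps to a
# list of groups; every group needs at least one of its variables set.
REQUIRED_ENV: dict[str, list[tuple[str, ...]]] = {
    "google-search": [("GOOGLE_SEARCH_API_KEY",), ("GOOGLE_SEARCH_CX",)],
    "brave-search": [("BRAVE_SEARCH_API_KEY", "WEAVERT_BRAVE_SEARCH_API_KEY")],
    "bing-grounding": [
        ("FOUNDRY_PROJECT_ENDPOINT",),
        ("FOUNDRY_MODEL_DEPLOYMENT_NAME",),
        ("BING_PROJECT_CONNECTION_ID",),
        ("AGENT_TOKEN",),
    ],
    "duckduckgo-html": [],  # no-credential fallback, always available
}


def has_credentials(provider: str, env: Mapping[str, str]) -> bool:
    """True when every credential group of the provider is satisfied."""
    return all(any(env.get(name) for name in group) for group in REQUIRED_ENV[provider])


def resolve_search_provider(env: Mapping[str, str]) -> str:
    """Pick a provider id from the environment (hypothetical helper)."""
    explicit = env.get("WEAVERT_WEB_SEARCH_PROVIDER", "").strip()
    if explicit:
        if explicit not in REQUIRED_ENV:
            raise ValueError(f"unknown search provider: {explicit}")
        # Assumption: an explicit choice without credentials is an error.
        if not has_credentials(explicit, env):
            raise ValueError(f"missing credentials for {explicit}")
        return explicit
    # Assumption: auto-detection follows the README's listing order and
    # ends at the no-credential fallback, which always matches.
    return next(p for p in REQUIRED_ENV if has_credentials(p, env))
```

In a real process the resolver would be called as `resolve_search_provider(os.environ)`; passing a plain dict keeps the sketch easy to test.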
+
+ Bing grounding uses Azure AI Foundry Responses API `bing_grounding` and normalizes stable public URL citations into the shared result shape. It is not the retired Bing Search API v7 endpoint. Google and Brave map domain constraints into provider query operators where supported, while Bing grounding and DuckDuckGo report those controls as framework-filtered. The shared core still revalidates accepted result URLs against allowed domains, blocked domains, and public-host policy. Freshness semantics are provider-specific: Google uses approximate `dateRestrict`, Brave uses its `freshness` parameter, Bing grounding maps supported 1/7/30 day freshness windows, and DuckDuckGo reports freshness as unsupported.
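
The revalidation step described above (allowed domains, blocked domains, public-host policy) might look roughly like the following. It is a minimal sketch built only on the standard library; the function names and the exact policy (blocked wins over allowed, bare hostnames without a dot are rejected) are assumptions, since the shared core's code is not part of this diff.

```python
from __future__ import annotations

import ipaddress
from collections.abc import Iterable
from urllib.parse import urlsplit


def _matches(host: str, domain: str) -> bool:
    """True when host equals the domain or is a subdomain of it."""
    domain = domain.lower().strip(".")
    return host == domain or host.endswith("." + domain)


def is_public_host(host: str) -> bool:
    """Reject loopback, private, and other non-global hosts."""
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Assumption: require a dotted hostname.
        return "." in host
    return address.is_global


def accept_result_url(
    url: str,
    allowed_domains: Iterable[str] = (),
    blocked_domains: Iterable[str] = (),
) -> bool:
    """Re-check a provider result URL after provider-side filtering."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    host = (parts.hostname or "").lower().rstrip(".")
    if parts.scheme not in {"http", "https"} or not host:
        return False
    if not is_public_host(host):
        return False
    if any(_matches(host, d) for d in blocked_domains):
        return False
    allowed = list(allowed_domains)
    # An empty allow-list means "no restriction" in this sketch.
    return not allowed or any(_matches(host, d) for d in allowed)
```

Running the same check for every provider is what makes "framework-filtered" providers such as Bing grounding and DuckDuckGo safe to use with domain constraints: the provider may ignore the constraint, but the result is still dropped afterwards.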
+
+ ## Research Profiles and Quality Signals
+
+ `web_research` applies profile strategy before inspecting pages. Coding prioritizes official documentation, release notes, changelogs, source repositories, and issue trackers, with facets for API names, versions, compatibility notes, and breaking changes. Legal compliance prioritizes statutes, regulations, standards, and official guidance, and preserves jurisdiction, authority, freshness, and effective-date gaps. Business research favors company sources, filings, announcements, credible news, reviews, competitors, timelines, comparison axes, and market claims. Academic research favors papers, publishers, institutions, preprints, methods, experiments, conclusions, and citation metadata. Product shopping favors official specs, current prices, reviews, alternatives, comparison axes, and purchase-risk signals.
+
+ Candidate sources receive traceable quality metadata before fetch: objective relevance, profile priority, provider metadata, freshness signals, preferred or allowed domains, duplicate clusters, and deterministic tie-breaking by domain and URL. After inspection, ledger evidence keeps source class and quality metadata so callers and tests can explain why a source was selected.
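
Deterministic tie-breaking by domain and URL, as mentioned above, can be illustrated with a composite sort key. The `Candidate` fields and their ordering are hypothetical; the kit's real scoring inputs and weights are not shown in this diff. The point of the sketch is only that equal scores always resolve to the same order, whatever order the provider returned them in.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Candidate:
    url: str
    relevance: float       # hypothetical objective-relevance score, higher is better
    profile_priority: int  # hypothetical profile rank, lower is preferred


def rank_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Sort by priority and relevance, then break ties by domain and URL."""

    def key(candidate: Candidate) -> tuple[int, float, str, str]:
        domain = (urlsplit(candidate.url).hostname or "").lower()
        return (candidate.profile_priority, -candidate.relevance, domain, candidate.url)

    return sorted(candidates, key=key)
```

Because the key ends in the full URL, two distinct candidates never compare equal, so the result does not depend on input order.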
+
+ ## Claims, Conflicts, Gaps, and Limits
+
+ Claim annotations are accepted only when they bind to an inspected ledger source, page, or evidence item. Unbound annotations are dropped and traced. Rule-derived dates, versions, prices, numbers, source-type hints, and duplicate signals appear as `auxiliary_signals`; they help diagnostics and facets but do not prove claim correctness.
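
The binding rule above ("accepted only when bound, otherwise dropped and traced") is sketched below. The `ClaimAnnotation` shape, the `bind_claims` helper, and the trace entry format are all hypothetical; the sketch also binds to source ids only, whereas the README allows binding to a source, page, or evidence item.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClaimAnnotation:
    text: str
    source_id: str | None = None  # hypothetical reference to a ledger source


def bind_claims(
    annotations: list[ClaimAnnotation],
    ledger_source_ids: set[str],
) -> tuple[list[ClaimAnnotation], list[dict[str, str]]]:
    """Keep annotations bound to an inspected source; trace the rest."""
    accepted: list[ClaimAnnotation] = []
    trace: list[dict[str, str]] = []
    for annotation in annotations:
        if annotation.source_id is not None and annotation.source_id in ledger_source_ids:
            accepted.append(annotation)
        else:
            # Dropped annotations are recorded rather than silently lost.
            trace.append({"event": "claim_dropped", "reason": "unbound", "text": annotation.text})
    return accepted, trace
```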
+
+ Conflicting ledger-bound claims are projected into `conflicts`. Unresolved conflicts lower confidence and produce `stop_reason="unresolved_conflict"`; resolved conflicts keep a resolution rationale when stronger evidence is identified. Gaps describe missing preferred evidence, unsupported freshness, provider fallback, policy blocks, or partial results.
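
How unresolved conflicts could feed into confidence and the stop reason is sketched below. Only the `stop_reason="unresolved_conflict"` value comes from the README; the `Conflict` shape, the linear penalty, and returning `None` when nothing is unresolved are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Conflict:
    topic: str
    claim_ids: tuple[str, ...]
    resolution_rationale: str | None = None  # set only once the conflict is resolved

    @property
    def resolved(self) -> bool:
        return self.resolution_rationale is not None


def apply_conflicts(
    confidence: float,
    conflicts: list[Conflict],
    penalty: float = 0.2,  # hypothetical per-conflict penalty
) -> tuple[float, str | None]:
    """Lower confidence for unresolved conflicts and derive a stop reason."""
    unresolved = [c for c in conflicts if not c.resolved]
    if not unresolved:
        return confidence, None
    lowered = max(0.0, confidence - penalty * len(unresolved))
    return lowered, "unresolved_conflict"
```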
+
+ Remaining limits are explicit: this kit does not drive a browser, click through pages, authenticate, inspect local workspaces, run shell-assisted searches, or guarantee truth beyond inspected public evidence. Host-level browser bridges, local workspace search, and shell tools remain separate surfaces.
+
+ ## See also
+
+ - `../README.md`
+ - `../../../framework-packs/capabilities/web-research/README.md`
@@ -0,0 +1,55 @@
+ # Web Research Common Kit
+
+ Canonical import root: `weavert_kit_common_web_research`
+
+ ## What this package owns
+
+ - the unified public `web_research` entrypoint for read-only public-web information retrieval
+ - low-level `web_search`, single-page `web_fetch`, and `web_find` primitives backed by `weavert-web-research`
+ - a package-owned goal-driven research loop behind `web_research` that plans queries, selects pages, evaluates evidence coverage, and stops with explicit reasons
+ - the package-owned `web-searcher` delegated worker reserved for bounded implementation-period fallback paths
+ - first-party research profiles: `general`, `coding`, `business`, `academic`, `legal_compliance`, and `product_shopping`
+ - common result envelopes with sources, evidence, conflicts, gaps, freshness, provider metadata, research trace, and profile facets
+
+ ## Canonical names
+
+ - package root: `packages/product-kits/common/web-research`
+ - install name: `weavert-kit-common-web-research`
+ - import root: `weavert_kit_common_web_research`
+ - runtime activation: `weavert-shared-web-research`
+
+ ## Boundary
+
+ Use `web_research` for goal-driven AI-first research and pass `profile="coding"` or another supported profile when the scenario needs profile-specific source ranking or facets. `web_research` is the supported path for multi-page source discovery and inspection: it derives bounded queries from the objective, ranks candidate pages, inspects ledger-verified sources, reports gaps or conflicts, and exposes loop decisions in `research_trace` and `trace_summary`. Use low-level `web_fetch` only for one explicit page at a time; callers that need manual multi-page inspection should issue repeated single-page fetches.
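
The "repeated single-page fetches" pattern for manual inspection can be sketched as a bounded loop. The `fetch_one` callable is a stand-in for however the host invokes the `web_fetch` tool; its signature, the `max_pages` bound, and the de-duplication are assumptions and not part of the documented tool contract.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable


def fetch_pages_one_at_a_time(
    urls: Iterable[str],
    fetch_one: Callable[[str], str],
    max_pages: int = 5,  # hypothetical caller-side bound
) -> dict[str, str]:
    """Issue one single-page fetch per distinct URL, up to max_pages."""
    results: dict[str, str] = {}
    for url in urls:
        if url in results:
            continue  # skip duplicates instead of refetching
        if len(results) >= max_pages:
            break
        results[url] = fetch_one(url)
    return results
```

A caller that finds itself writing this loop with ranking and evidence checks on top is probably better served by `web_research`, which the README names as the supported multi-page path.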
+
+ This package is read-only. Browser navigation, clicks, form filling, authenticated browsing, and DOM interaction remain browser-bridge responsibilities.
+
+ ## Search Provider Selection
+
+ Public tool names stay stable: callers continue to use `web_research`, `web_search`, `web_fetch`, and `web_find`. Search provider selection is handled by the shared `weavert-web-research` core.
+
+ - `google-search`: set `GOOGLE_SEARCH_API_KEY` and `GOOGLE_SEARCH_CX`; optionally set `WEAVERT_WEB_SEARCH_PROVIDER=google-search`.
+ - `brave-search`: set `BRAVE_SEARCH_API_KEY` or `WEAVERT_BRAVE_SEARCH_API_KEY`; optionally set `WEAVERT_WEB_SEARCH_PROVIDER=brave-search`.
+ - `bing-grounding`: set `FOUNDRY_PROJECT_ENDPOINT`, `FOUNDRY_MODEL_DEPLOYMENT_NAME`, `BING_PROJECT_CONNECTION_ID`, and `AGENT_TOKEN`; optionally set `WEAVERT_WEB_SEARCH_PROVIDER=bing-grounding`.
+ - `duckduckgo-html`: no-credential fallback. It does not expose a stable freshness filter through this adapter.
+
+ Bing grounding uses Azure AI Foundry Responses API `bing_grounding` and normalizes stable public URL citations into the shared result shape. It is not the retired Bing Search API v7 endpoint. Google and Brave map domain constraints into provider query operators where supported, while Bing grounding and DuckDuckGo report those controls as framework-filtered. The shared core still revalidates accepted result URLs against allowed domains, blocked domains, and public-host policy. Freshness semantics are provider-specific: Google uses approximate `dateRestrict`, Brave uses its `freshness` parameter, Bing grounding maps supported 1/7/30 day freshness windows, and DuckDuckGo reports freshness as unsupported.
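
The provider-specific freshness semantics above can be pictured as a lookup from a requested window in days to a provider-side value. The helper name `freshness_hint` is hypothetical and the kit's actual mapping is not shown in this diff. The Google `d[number]` form and the Brave `pd`/`pw`/`pm` presets follow those providers' public API parameters; the Bing values are an assumption.

```python
from __future__ import annotations


def freshness_hint(provider: str, days: int) -> str | None:
    """Return a provider-side freshness value, or None when unsupported."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    if provider == "google-search":
        # Google's dateRestrict accepts d[number]; README calls it approximate.
        return f"d{days}"
    if provider == "brave-search":
        # Brave presets: past day, past week, past month (pm spans 31 days).
        return {1: "pd", 7: "pw", 30: "pm"}.get(days)
    if provider == "bing-grounding":
        # Assumption: only the 1/7/30 day windows named in the README map.
        return {1: "Day", 7: "Week", 30: "Month"}.get(days)
    # duckduckgo-html and unknown providers: freshness is unsupported.
    return None
```

A `None` result corresponds to the case the README describes as a gap: the request is still served, but the response should report that freshness was unsupported for that provider.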
+
+ ## Research Profiles and Quality Signals
+
+ `web_research` applies profile strategy before inspecting pages. Coding prioritizes official documentation, release notes, changelogs, source repositories, and issue trackers, with facets for API names, versions, compatibility notes, and breaking changes. Legal compliance prioritizes statutes, regulations, standards, and official guidance, and preserves jurisdiction, authority, freshness, and effective-date gaps. Business research favors company sources, filings, announcements, credible news, reviews, competitors, timelines, comparison axes, and market claims. Academic research favors papers, publishers, institutions, preprints, methods, experiments, conclusions, and citation metadata. Product shopping favors official specs, current prices, reviews, alternatives, comparison axes, and purchase-risk signals.
+
+ Candidate sources receive traceable quality metadata before fetch: objective relevance, profile priority, provider metadata, freshness signals, preferred or allowed domains, duplicate clusters, and deterministic tie-breaking by domain and URL. After inspection, ledger evidence keeps source class and quality metadata so callers and tests can explain why a source was selected.
+
+ ## Claims, Conflicts, Gaps, and Limits
+
+ Claim annotations are accepted only when they bind to an inspected ledger source, page, or evidence item. Unbound annotations are dropped and traced. Rule-derived dates, versions, prices, numbers, source-type hints, and duplicate signals appear as `auxiliary_signals`; they help diagnostics and facets but do not prove claim correctness.
+
+ Conflicting ledger-bound claims are projected into `conflicts`. Unresolved conflicts lower confidence and produce `stop_reason="unresolved_conflict"`; resolved conflicts keep a resolution rationale when stronger evidence is identified. Gaps describe missing preferred evidence, unsupported freshness, provider fallback, policy blocks, or partial results.
+
+ Remaining limits are explicit: this kit does not drive a browser, click through pages, authenticate, inspect local workspaces, run shell-assisted searches, or guarantee truth beyond inspected public evidence. Host-level browser bridges, local workspace search, and shell tools remain separate surfaces.
+
+ ## See also
+
+ - `../README.md`
+ - `../../../framework-packs/capabilities/web-research/README.md`
@@ -0,0 +1,37 @@
+ [build-system]
+ requires = ["setuptools>=69", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "weavert-kit-common-web-research"
+ version = "0.1.0"
+ description = "Unified AI-first web_research and read-only web primitive kit surfaces for WeaveRT product kits."
+ readme = "README.md"
+ license = "Apache-2.0"
+ authors = [{ name = "WeaveRT Maintainers" }]
+ requires-python = ">=3.11"
+ dependencies = [
+     "weavert>=0.1.0,<0.2.0",
+     "weavert-kit-common-retrieval>=0.1.0,<0.2.0",
+     "weavert-web-research>=0.1.0,<0.2.0",
+ ]
+ keywords = ["weavert", "agents", "ai", "product-kit", "shared-kit"]
+ classifiers = [
+     "Development Status :: 3 - Alpha",
+     "Intended Audience :: Developers",
+     "Operating System :: OS Independent",
+     "Programming Language :: Python",
+     "Programming Language :: Python :: 3",
+     "Programming Language :: Python :: 3.11",
+     "Programming Language :: Python :: 3.12",
+     "Programming Language :: Python :: 3.13",
+ ]
+
+ [project.urls]
+ Homepage = "https://github.com/xyz2b/weave-ai-runtime"
+ Documentation = "https://github.com/xyz2b/weave-ai-runtime/tree/main/docs"
+ Repository = "https://github.com/xyz2b/weave-ai-runtime"
+ Issues = "https://github.com/xyz2b/weave-ai-runtime/issues"
+
+ [tool.setuptools.packages.find]
+ where = ["src"]
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
@@ -0,0 +1,104 @@
+ from __future__ import annotations
+
+ from weavert.package_system.protocols import RuntimePackageManifest
+ from weavert.extension_contracts.scenario_runtime_packs import (
+     ReferenceSharedPackageShape,
+     build_reference_shared_package_manifest,
+ )
+
+ from ._builtins import (
+     WEB_RESEARCH_TOOLS,
+     WEB_RESEARCH_WORKER_AGENTS,
+     web_research_builtin_tools,
+     web_research_worker_builtin_agents,
+ )
+ from ._tool_impls import (
+     web_fetch_tool,
+     web_find_tool,
+     web_search_tool,
+     validate_web_fetch,
+     validate_web_find,
+     validate_web_search,
+     validate_web_research,
+     web_research_tool,
+ )
+
+ REFERENCE_SHARED_PACKAGE_SHAPE = ReferenceSharedPackageShape(
+     package_name="weavert-shared-web-research",
+     capability_key="weavert.reference.shared.web_research",
+     description="Reference shared package for AI-first web_research plus low-level read-only web primitives.",
+     shared_surface_family="web-research",
+     intended_profiles=(
+         "chat",
+         "coding",
+         "local_assistant",
+         "business",
+         "academic",
+         "legal_compliance",
+         "product_shopping",
+     ),
+     surfaces=(
+         "AI-first bounded web_research entrypoint",
+         "read-only web search",
+         "single-page remote fetch",
+         "page-local web evidence finding",
+         "bounded multi-page inspection behind web_research",
+         "HTTP-aware web helpers",
+     ),
+     tool_ids=WEB_RESEARCH_TOOLS,
+     agent_ids=WEB_RESEARCH_WORKER_AGENTS,
+     notes=(
+         "Scenario packs should recommend web_research as the public web research entrypoint.",
+         "Scenario packs set default research profiles without changing public web tool names.",
+         "Low-level primitives remain available for explicit search, fetch, and page-local find flows.",
+         "web-searcher is a package-owned delegated worker behind web_research, not the recommended public path.",
+         "The default posture stays read-only and web research even when external web is enabled.",
+         "Browser navigation or interaction still requires a separate browser bridge package.",
+     ),
+ )
+
+
+ def reference_shared_package_shapes() -> tuple[ReferenceSharedPackageShape, ...]:
+     return (REFERENCE_SHARED_PACKAGE_SHAPE,)
+
+
+ def reference_shared_package_shape(name: str | None = None) -> ReferenceSharedPackageShape:
+     normalized = REFERENCE_SHARED_PACKAGE_SHAPE.package_name if name is None else str(name)
+     if normalized in {
+         REFERENCE_SHARED_PACKAGE_SHAPE.package_name,
+         REFERENCE_SHARED_PACKAGE_SHAPE.capability_key,
+     }:
+         return REFERENCE_SHARED_PACKAGE_SHAPE
+     raise KeyError(f"Unknown web shared package shape: {name}")
+
+
+ def reference_shared_package_manifest() -> RuntimePackageManifest:
+     return build_reference_shared_package_manifest(
+         REFERENCE_SHARED_PACKAGE_SHAPE,
+         builtin_tools=web_research_builtin_tools,
+         builtin_agents=web_research_worker_builtin_agents,
+     )
+
+
+ def reference_shared_package_manifests() -> tuple[RuntimePackageManifest, ...]:
+     return (reference_shared_package_manifest(),)
+
+
+ __all__ = [
+     "WEB_RESEARCH_TOOLS",
+     "REFERENCE_SHARED_PACKAGE_SHAPE",
+     "WEB_RESEARCH_WORKER_AGENTS",
+     "web_fetch_tool",
+     "web_find_tool",
+     "web_search_tool",
+     "reference_shared_package_manifest",
+     "reference_shared_package_manifests",
+     "reference_shared_package_shape",
+     "reference_shared_package_shapes",
+     "validate_web_fetch",
+     "validate_web_find",
+     "validate_web_search",
+     "validate_web_research",
+     "web_research_tool",
+     "web_research_worker_builtin_agents",
+ ]
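
The lookup in `reference_shared_package_shape` accepts either the runtime package name or the capability key, and nothing else. The stand-in below reproduces that logic without importing `weavert`, so it runs on its own; `ShapeStandIn` and `lookup_shape` are hypothetical names, and the dataclass carries only the two fields the lookup reads, not the full `ReferenceSharedPackageShape`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ShapeStandIn:
    """Stand-in with only the two fields the lookup compares against."""

    package_name: str
    capability_key: str


SHAPE = ShapeStandIn(
    package_name="weavert-shared-web-research",
    capability_key="weavert.reference.shared.web_research",
)


def lookup_shape(name: str | None = None) -> ShapeStandIn:
    """Mirror of the lookup above: None selects the default shape."""
    normalized = SHAPE.package_name if name is None else str(name)
    if normalized in {SHAPE.package_name, SHAPE.capability_key}:
        return SHAPE
    raise KeyError(f"Unknown web shared package shape: {name}")
```

One consequence worth noting: the install name `weavert-kit-common-web-research` is not a valid lookup key. Only the runtime activation name and the capability key resolve, so passing the install name raises `KeyError`.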