searxng 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +16 -0
- data/README.md +68 -39
- data/lib/searxng/client.rb +59 -6
- data/lib/searxng/errors.rb +56 -0
- data/lib/searxng/server.rb +574 -2
- data/lib/searxng/version.rb +1 -1
- data/sig/searxng.rbs +11 -1
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 312181892379c3ed0181f46a10f535e96803969d7f1b224b87e07fe7fe2d68d4
|
|
4
|
+
data.tar.gz: e206771ab2279dac1329a08dc1195d254e5992c4532c7e2e5afd95a67b267991
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: ce7b9acd2231550ddd32fd27956b1ef366f044a769f7b0034a6f5c5ade98efed0ac96a34856d1015456cc46132b024afb723726117dd3a9df7b82b26282d5ad7
|
|
7
|
+
data.tar.gz: 5636c48dd7221654b182793931ffb097fe017939fe8fc41781819deda7ed627a921edf508243f16943fa70ade77ee04b59b3ea6ed6be76de63eb34becbe2e13c
|
data/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,21 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## Unreleased
|
|
4
|
+
|
|
5
|
+
## 0.2.0
|
|
6
|
+
|
|
7
|
+
- MCP: Added `web_url_read` tool (URL fetch + Markdown conversion with pagination/section options).
|
|
8
|
+
- MCP: Added `gemini://` support in `web_url_read` (Gemini fetch + Gemtext to Markdown).
|
|
9
|
+
- MCP: Added `ipfs://` support in `web_url_read` via configurable `IPFS_GATEWAY`.
|
|
10
|
+
- MCP: Added `ftp://`, `sftp://`, and `smb://` support in `web_url_read`.
|
|
11
|
+
- MCP: Added resources `config://server-config` and `help://usage-guide`.
|
|
12
|
+
- MCP: Added startup environment validation for `SEARXNG_URL` and auth variable pairs.
|
|
13
|
+
- Errors: Added richer, user-facing configuration/network/HTTP messages.
|
|
14
|
+
- Client: Added auth env aliases (`AUTH_USERNAME`/`AUTH_PASSWORD`) and proxy auto-detection from `HTTP_PROXY`/`HTTPS_PROXY`/`ALL_PROXY` with `NO_PROXY`.
|
|
15
|
+
- Docs: Added protocol support notes for Gemini, IPFS, Tor, I2P, and SOCKS5-related setups.
|
|
16
|
+
- Docs: Added protocol support and environment notes for FTP/SFTP/SMB.
|
|
17
|
+
- Search tool: Improved no-results response to include actionable guidance.
|
|
18
|
+
|
|
3
19
|
## 0.1.0
|
|
4
20
|
|
|
5
21
|
- Initial release.
|
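The 0.2.0 changelog entry for `web_url_read` mentions pagination and section options, but the code that applies them is not part of the hunks shown here. The following is only a minimal sketch of how character pagination and a paragraph range such as `1-5`, `3`, or `10-` might be applied to converted Markdown. The helper names `paginate_text` and `select_paragraphs` are hypothetical and are not taken from the gem.

```ruby
# Hypothetical helpers: the gem's real option handling is not shown in this diff.

# Returns the slice of `text` starting at `start_char`, at most `max_length` characters long.
def paginate_text(text, start_char: 0, max_length: nil)
  start = [start_char.to_i, 0].max
  return "" if start >= text.length

  max_length ? text[start, [max_length.to_i, 0].max] : text[start..]
end

# Selects paragraphs by a 1-based, inclusive range written as "1-5", "3", or "10-".
# Paragraphs are assumed to be separated by blank lines.
def select_paragraphs(text, range)
  paragraphs = text.split(/\n{2,}/)
  match = range.to_s.strip.match(/\A(\d+)(-(\d*))?\z/)
  return text unless match # unrecognized range: leave the text untouched

  first = match[1].to_i
  last = if match[2].nil?
    first                 # "3"   -> a single paragraph
  elsif match[3].empty?
    paragraphs.length     # "10-" -> open-ended, through the last paragraph
  else
    match[3].to_i         # "1-5" -> explicit upper bound
  end
  return "" if first < 1 || last < first

  (paragraphs[(first - 1)..(last - 1)] || []).join("\n\n")
end
```

Treating an unrecognized range as "return everything" is one possible design choice; a stricter implementation could raise an error instead.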
data/README.md
CHANGED
|
@@ -1,15 +1,9 @@
|
|
|
1
|
-
# searxng
|
|
1
|
+
# searxng.rb
|
|
2
2
|
|
|
3
|
-
[](https://badge.fury.io/rb/searxng) [](https://github.com/amkisko/searxng.rb/actions/workflows/test.yml) [](https://codecov.io/gh/amkisko/searxng.rb)
|
|
3
|
+
[](https://badge.fury.io/rb/searxng) [](https://github.com/amkisko/searxng.rb/actions/workflows/test.yml) [](https://codecov.io/gh/amkisko/searxng.rb)
|
|
4
4
|
|
|
5
5
|
Ruby gem providing a SearXNG HTTP client, CLI (search), and MCP (Model Context Protocol) server for web search. Integrates with MCP-compatible clients like Codex, Cursor, Claude, and other MCP-enabled tools.
|
|
6
6
|
|
|
7
|
-
Sponsored by [Kisko Labs](https://www.kiskolabs.com).
|
|
8
|
-
|
|
9
|
-
<a href="https://www.kiskolabs.com">
|
|
10
|
-
<img src="kisko.svg" width="200" alt="Sponsored by Kisko Labs" />
|
|
11
|
-
</a>
|
|
12
|
-
|
|
13
7
|
## Requirements
|
|
14
8
|
|
|
15
9
|
- **Ruby 3.1 or higher** (Ruby 3.0 and earlier are not supported). For managing Ruby versions, [rbenv](https://github.com/rbenv/rbenv) or [mise](https://mise.jdx.dev/) are recommended; system Ruby may be sufficient if it meets the version requirement.
|
|
@@ -26,15 +20,61 @@ Or add to your Gemfile:
|
|
|
26
20
|
gem "searxng"
|
|
27
21
|
```
|
|
28
22
|
|
|
23
|
+
**Quick start with a local SearXNG instance:** The repo includes a ready-made [docker-compose](examples/docker-compose.yml) and [settings](examples/searxng/settings.yml) so you can run SearXNG locally (JSON format and limiter preconfigured for the gem). From the repo root: `docker compose -f examples/docker-compose.yml up -d`, then `export SEARXNG_URL="http://localhost:8080"`. See [examples/SETUP.md](examples/SETUP.md) for details.
|
|
24
|
+
|
|
25
|
+
### CLI
|
|
26
|
+
|
|
27
|
+
```bash
|
|
28
|
+
searxng search "your query"
|
|
29
|
+
searxng search "ruby" --page 2 --language en --time-range day --json
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
Options: `--url`, `--page`, `--language`, `--time-range` (day|month|year), `--safesearch` (0|1|2), `--json`.
|
|
33
|
+
|
|
34
|
+
### Ruby API
|
|
35
|
+
|
|
36
|
+
```ruby
|
|
37
|
+
require "searxng"
|
|
38
|
+
|
|
39
|
+
client = Searxng::Client.new
|
|
40
|
+
data = client.search("ruby programming", pageno: 1, language: "en")
|
|
41
|
+
|
|
42
|
+
data[:results].each do |r|
|
|
43
|
+
puts r[:title], r[:url], r[:content]
|
|
44
|
+
end
|
|
45
|
+
```
|
|
46
|
+
|
|
29
47
|
### Configuration
|
|
30
48
|
|
|
31
|
-
- **SEARXNG_URL
|
|
32
|
-
- **SEARXNG_USER** / **SEARXNG_PASSWORD** (optional): Basic auth credentials if your instance is protected.
|
|
49
|
+
- **SEARXNG_URL**: Base URL of your SearXNG instance (e.g. `http://localhost:8080` or `https://search.example.com`). The client defaults to `http://localhost:8080`; MCP server startup validates that this variable is explicitly set and well-formed.
|
|
50
|
+
- **SEARXNG_USER** / **SEARXNG_PASSWORD** (optional): Basic auth credentials if your instance is protected. `AUTH_USERNAME` / `AUTH_PASSWORD` are also accepted for compatibility. Auth vars must be provided as a pair.
|
|
33
51
|
- **SEARXNG_CA_FILE** / **SEARXNG_CA_PATH** (optional): Custom CA certificate file or directory for HTTPS. You can also pass `ca_file:`, `ca_path:`, or `verify_mode:` to `Searxng::Client.new`.
|
|
34
52
|
- **SEARXNG_USER_AGENT** (optional): Custom User-Agent string. The client sends a default identifying the gem; if your instance returns 403 Forbidden (e.g. bot detection), set a custom User-Agent or pass `user_agent:` to `Searxng::Client.new`.
|
|
53
|
+
- **HTTP_PROXY / HTTPS_PROXY / ALL_PROXY / NO_PROXY** (optional): Proxy configuration for HTTP(S) requests.
|
|
54
|
+
- **FTP_USER / FTP_PASSWORD** (optional): FTP credentials used when URL doesn’t include auth.
|
|
55
|
+
- **SFTP_USER / SFTP_PASSWORD** (optional): SFTP credentials used when URL doesn’t include auth.
|
|
56
|
+
- **SMB_USER / SMB_PASSWORD / SMB_DOMAIN** (optional): SMB credentials/domain used when URL doesn’t include auth.
|
|
57
|
+
- **IPFS_GATEWAY** (optional): Gateway used for `ipfs://` URLs (default: `https://ipfs.io`).
|
|
58
|
+
- **GEMINI_INSECURE=1** (optional): disables TLS certificate verification for Gemini fetches in `web_url_read`.
|
|
35
59
|
|
|
36
60
|
To fully customize the HTTP client (e.g. custom certs, proxy, timeouts), override `#build_http(uri)` in a subclass, or pass a `configure_http:` callable to the client; it is invoked with the `Net::HTTP` instance and the request URI before each request.
|
|
37
61
|
|
|
62
|
+
## MCP Tools
|
|
63
|
+
|
|
64
|
+
The MCP server provides the following tools:
|
|
65
|
+
|
|
66
|
+
1. **searxng_web_search** - Web search via SearXNG
|
|
67
|
+
- Parameters: `query` (required), `pageno` (optional), `max_results` (optional, default 10), `time_range` (optional), `language` (optional), `safesearch` (optional). Use `max_results` to limit how many results are returned in one response (reduces token usage); use `pageno` for the next page.
|
|
68
|
+
2. **web_url_read** - Fetch a URL and return Markdown content
|
|
69
|
+
- Parameters: `url` (required), `startChar`, `maxLength`, `section`, `paragraphRange`, `readHeadings`, `timeoutMs` (all optional).
|
|
70
|
+
- Uses `teplo_core` for HTML/XML to Markdown conversion (with workspace fallback to `textplorer/core-ruby` in development).
|
|
71
|
+
- Supports `http`, `https`, `ftp`, `sftp`, `smb`, `gemini`, and `ipfs` URLs (`ipfs://` is resolved via `IPFS_GATEWAY`).
|
|
72
|
+
|
|
73
|
+
The MCP server also exposes resources:
|
|
74
|
+
|
|
75
|
+
- `config://server-config` - current server/environment capabilities
|
|
76
|
+
- `help://usage-guide` - concise usage and configuration guide
|
|
77
|
+
|
|
38
78
|
### Cursor IDE Configuration
|
|
39
79
|
|
|
40
80
|
For Cursor IDE, create or update `.cursor/mcp.json` in your project:
|
|
@@ -120,37 +160,18 @@ Useful for testing the `searxng_web_search` tool before integrating with Cursor
|
|
|
120
160
|
- **MCP Server Integration**: Ready-to-use MCP server with web search tool, compatible with Cursor IDE, Claude Desktop, and other MCP-enabled tools
|
|
121
161
|
- **Configurable HTTP**: Optional custom CA, verify mode, and `configure_http` hook or `build_http` override for proxy/timeouts/certs
|
|
122
162
|
- **Basic Auth**: Optional Basic authentication via options or ENV
|
|
163
|
+
- **Fail-fast MCP validation**: MCP startup validates `SEARXNG_URL` and auth variable pairs
|
|
164
|
+
- **User-facing errors**: Clear network/HTTP/configuration error messages
|
|
123
165
|
|
|
124
|
-
##
|
|
125
|
-
|
|
126
|
-
### Ruby API
|
|
127
|
-
|
|
128
|
-
```ruby
|
|
129
|
-
require "searxng"
|
|
130
|
-
|
|
131
|
-
client = Searxng::Client.new
|
|
132
|
-
data = client.search("ruby programming", pageno: 1, language: "en")
|
|
133
|
-
|
|
134
|
-
data[:results].each do |r|
|
|
135
|
-
puts r[:title], r[:url], r[:content]
|
|
136
|
-
end
|
|
137
|
-
```
|
|
138
|
-
|
|
139
|
-
### CLI
|
|
140
|
-
|
|
141
|
-
```bash
|
|
142
|
-
searxng search "your query"
|
|
143
|
-
searxng search "ruby" --page 2 --language en --time-range day --json
|
|
144
|
-
```
|
|
145
|
-
|
|
146
|
-
Options: `--url`, `--page`, `--language`, `--time-range` (day|month|year), `--safesearch` (0|1|2), `--json`.
|
|
147
|
-
|
|
148
|
-
## MCP Tools
|
|
149
|
-
|
|
150
|
-
The MCP server provides the following tools:
|
|
166
|
+
## Protocol Support Notes
|
|
151
167
|
|
|
152
|
-
|
|
153
|
-
|
|
168
|
+
- `http` / `https`: fully supported.
|
|
169
|
+
- `ftp`: supported by `web_url_read` (requires `net-ftp` runtime gem).
|
|
170
|
+
- `sftp`: supported by `web_url_read` (requires `net-sftp` runtime gem).
|
|
171
|
+
- `smb`: supported by `web_url_read` via `smbclient` available in `PATH`.
|
|
172
|
+
- `gemini`: supported by `web_url_read` (Gemini fetch + Gemtext to Markdown conversion).
|
|
173
|
+
- `ipfs`: supported by `web_url_read` through gateway resolution (`ipfs://` -> `IPFS_GATEWAY`).
|
|
174
|
+
- `tor` / `i2p`: supported through `.onion` / `.i2p` hosts over HTTP(S), typically with proxy configuration.
|
|
154
175
|
|
|
155
176
|
## Examples
|
|
156
177
|
|
|
@@ -199,3 +220,11 @@ If you discover a security vulnerability, please report it responsibly. See [SEC
|
|
|
199
220
|
## License
|
|
200
221
|
|
|
201
222
|
The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
|
|
223
|
+
|
|
224
|
+
## Sponsors
|
|
225
|
+
|
|
226
|
+
Sponsored by [Kisko Labs](https://www.kiskolabs.com).
|
|
227
|
+
|
|
228
|
+
<a href="https://www.kiskolabs.com">
|
|
229
|
+
<img src="kisko.svg" width="200" alt="Sponsored by Kisko Labs" />
|
|
230
|
+
</a>
|
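The README changes above state that MCP startup validates `SEARXNG_URL` and requires auth variables as a pair. As a rough illustration of that rule, the sketch below checks an environment-like Hash and collects problems rather than raising. The function name `environment_problems` and its messages are assumptions made for this example; the gem itself raises `ConfigurationError` from `validate_environment!`.

```ruby
require "uri"

# Illustrative only: collects configuration problems instead of raising.
# `env` is any Hash-like object; pass ENV in real use.
def environment_problems(env)
  problems = []
  url = env["SEARXNG_URL"].to_s.strip

  if url.empty?
    problems << "SEARXNG_URL is not set"
  else
    begin
      scheme = URI.parse(url).scheme
      unless %w[http https].include?(scheme)
        problems << "SEARXNG_URL must use http or https, got #{scheme.inspect}"
      end
    rescue URI::InvalidURIError
      problems << "SEARXNG_URL is not a valid URL: #{url.inspect}"
    end
  end

  # Either naming scheme is accepted, but user and password must come together.
  user = env["SEARXNG_USER"] || env["AUTH_USERNAME"]
  password = env["SEARXNG_PASSWORD"] || env["AUTH_PASSWORD"]
  problems << "auth user and password must be set together" if user.nil? != password.nil?

  problems
end
```

Calling `environment_problems(ENV)` before launching the server would give a quick pre-flight check under these assumptions.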
data/lib/searxng/client.rb
CHANGED
|
@@ -2,6 +2,8 @@ require "uri"
|
|
|
2
2
|
require "net/http"
|
|
3
3
|
require "openssl"
|
|
4
4
|
require "json"
|
|
5
|
+
require "ipaddr"
|
|
6
|
+
require "timeout"
|
|
5
7
|
|
|
6
8
|
module Searxng
|
|
7
9
|
class Client
|
|
@@ -23,8 +25,8 @@ module Searxng
|
|
|
23
25
|
configure_http: nil
|
|
24
26
|
)
|
|
25
27
|
@base_url = (base_url || ENV["SEARXNG_URL"] || DEFAULT_BASE_URL).to_s.chomp("/")
|
|
26
|
-
@user = user || ENV["SEARXNG_USER"]
|
|
27
|
-
@password = password || ENV["SEARXNG_PASSWORD"]
|
|
28
|
+
@user = user || ENV["SEARXNG_USER"] || ENV["AUTH_USERNAME"]
|
|
29
|
+
@password = password || ENV["SEARXNG_PASSWORD"] || ENV["AUTH_PASSWORD"]
|
|
28
30
|
@open_timeout = open_timeout
|
|
29
31
|
@read_timeout = read_timeout
|
|
30
32
|
@ca_file = ca_file || ENV["SEARXNG_CA_FILE"]
|
|
@@ -35,7 +37,7 @@ module Searxng
|
|
|
35
37
|
end
|
|
36
38
|
|
|
37
39
|
def search(query, pageno: 1, time_range: nil, language: "all", safesearch: nil)
|
|
38
|
-
raise ConfigurationError,
|
|
40
|
+
raise ConfigurationError, ErrorMessages.configuration_missing_url if @base_url.nil? || @base_url.empty?
|
|
39
41
|
|
|
40
42
|
uri = build_search_uri(query, pageno: pageno, time_range: time_range, language: language, safesearch: safesearch)
|
|
41
43
|
response = perform_request(uri)
|
|
@@ -65,13 +67,19 @@ module Searxng
|
|
|
65
67
|
|
|
66
68
|
http.request(request)
|
|
67
69
|
rescue SocketError, Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EPIPE, Timeout::Error, OpenSSL::SSL::SSLError, EOFError => e
|
|
68
|
-
raise NetworkError,
|
|
70
|
+
raise NetworkError, ErrorMessages.network_error(e, uri: uri)
|
|
69
71
|
end
|
|
70
72
|
|
|
71
73
|
# Builds and returns a configured Net::HTTP instance. Override in a subclass to customize
|
|
72
74
|
# timeouts, SSL, proxy, or other options. Called once per request.
|
|
73
75
|
def build_http(uri)
|
|
74
|
-
|
|
76
|
+
proxy_uri = proxy_uri_for(uri)
|
|
77
|
+
http_klass = if proxy_uri
|
|
78
|
+
Net::HTTP::Proxy(proxy_uri.host, proxy_uri.port, proxy_uri.user, proxy_uri.password)
|
|
79
|
+
else
|
|
80
|
+
Net::HTTP
|
|
81
|
+
end
|
|
82
|
+
http = http_klass.new(uri.host, uri.port)
|
|
75
83
|
http.use_ssl = (uri.scheme == "https")
|
|
76
84
|
http.open_timeout = @open_timeout
|
|
77
85
|
http.read_timeout = @read_timeout
|
|
@@ -103,7 +111,7 @@ module Searxng
|
|
|
103
111
|
}
|
|
104
112
|
else
|
|
105
113
|
raise APIError.new(
|
|
106
|
-
|
|
114
|
+
ErrorMessages.api_error(response.code.to_i, response.message),
|
|
107
115
|
status_code: response.code.to_i,
|
|
108
116
|
response_data: response.body,
|
|
109
117
|
uri: uri.to_s
|
|
@@ -117,6 +125,51 @@ module Searxng
|
|
|
117
125
|
)
|
|
118
126
|
end
|
|
119
127
|
|
|
128
|
+
def proxy_uri_for(uri)
|
|
129
|
+
return nil if no_proxy_match?(uri.host)
|
|
130
|
+
|
|
131
|
+
raw = if uri.scheme == "https"
|
|
132
|
+
ENV["HTTPS_PROXY"] || ENV["https_proxy"] || ENV["HTTP_PROXY"] || ENV["http_proxy"] || ENV["ALL_PROXY"] || ENV["all_proxy"]
|
|
133
|
+
else
|
|
134
|
+
ENV["HTTP_PROXY"] || ENV["http_proxy"] || ENV["ALL_PROXY"] || ENV["all_proxy"]
|
|
135
|
+
end
|
|
136
|
+
return nil if raw.nil? || raw.strip.empty?
|
|
137
|
+
|
|
138
|
+
URI(raw.include?("://") ? raw : "http://#{raw}")
|
|
139
|
+
rescue
|
|
140
|
+
nil
|
|
141
|
+
end
|
|
142
|
+
|
|
143
|
+
def no_proxy_match?(host)
|
|
144
|
+
return false if host.nil? || host.empty?
|
|
145
|
+
|
|
146
|
+
raw = ENV["NO_PROXY"] || ENV["no_proxy"]
|
|
147
|
+
return false if raw.nil? || raw.strip.empty?
|
|
148
|
+
|
|
149
|
+
host_ip = begin
|
|
150
|
+
IPAddr.new(host)
|
|
151
|
+
rescue
|
|
152
|
+
nil
|
|
153
|
+
end
|
|
154
|
+
raw.split(",").any? do |entry|
|
|
155
|
+
token = entry.to_s.strip
|
|
156
|
+
next false if token.empty?
|
|
157
|
+
return true if token == "*"
|
|
158
|
+
|
|
159
|
+
cidr = begin
|
|
160
|
+
IPAddr.new(token)
|
|
161
|
+
rescue
|
|
162
|
+
nil
|
|
163
|
+
end
|
|
164
|
+
if cidr && host_ip
|
|
165
|
+
next cidr.include?(host_ip)
|
|
166
|
+
end
|
|
167
|
+
|
|
168
|
+
normalized = token.sub(/\A\./, "")
|
|
169
|
+
host == normalized || host.end_with?(".#{normalized}")
|
|
170
|
+
end
|
|
171
|
+
end
|
|
172
|
+
|
|
120
173
|
def normalize_results(results)
|
|
121
174
|
Array(results).map do |r|
|
|
122
175
|
{
|
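To make the proxy selection in this hunk easier to follow, here is a standalone sketch of the same idea that reads from a plain Hash instead of `ENV`. The function names `parse_ip`, `bypass_proxy?`, and `pick_proxy` are invented for this example; the behavior is intended to mirror `proxy_uri_for` and `no_proxy_match?` as shown above, not to replace them.

```ruby
require "ipaddr"
require "uri"

# Returns an IPAddr for an address or CIDR string, or nil when it is not one.
def parse_ip(text)
  IPAddr.new(text)
rescue StandardError
  nil
end

# True when `host` is covered by a comma-separated NO_PROXY-style list.
def bypass_proxy?(host, no_proxy)
  return false if host.to_s.empty?

  host_ip = parse_ip(host)
  no_proxy.to_s.split(",").map(&:strip).reject(&:empty?).any? do |token|
    next true if token == "*"

    network = parse_ip(token)
    next network.include?(host_ip) if network && host_ip

    # Domain entries match the host itself and any subdomain.
    suffix = token.sub(/\A\./, "")
    host == suffix || host.end_with?(".#{suffix}")
  end
end

# Picks a proxy URI for `target` from an env-like Hash, or nil for a direct connection.
def pick_proxy(target, env)
  uri = URI(target)
  return nil if bypass_proxy?(uri.host, env["NO_PROXY"] || env["no_proxy"])

  keys = uri.scheme == "https" ? %w[HTTPS_PROXY https_proxy] : []
  keys += %w[HTTP_PROXY http_proxy ALL_PROXY all_proxy]
  raw = keys.map { |key| env[key] }.compact.first
  return nil if raw.nil? || raw.strip.empty?

  # A bare "host:port" value is treated as an HTTP proxy.
  URI(raw.include?("://") ? raw : "http://#{raw}")
end
```

Passing the environment in as an argument keeps the sketch easy to test; the gem reads `ENV` directly.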
data/lib/searxng/errors.rb
CHANGED
|
@@ -15,4 +15,60 @@ module Searxng
|
|
|
15
15
|
@uri = uri
|
|
16
16
|
end
|
|
17
17
|
end
|
|
18
|
+
|
|
19
|
+
module ErrorMessages
|
|
20
|
+
module_function
|
|
21
|
+
|
|
22
|
+
def configuration_missing_url
|
|
23
|
+
"SEARXNG_URL not set. Set it to your SearXNG instance (e.g., http://localhost:8080 or https://search.example.com)"
|
|
24
|
+
end
|
|
25
|
+
|
|
26
|
+
def configuration_invalid_url(url)
|
|
27
|
+
"SEARXNG_URL has invalid format: #{url.inspect}. Use format: http://localhost:8080 or https://search.example.com"
|
|
28
|
+
end
|
|
29
|
+
|
|
30
|
+
def configuration_invalid_protocol(protocol)
|
|
31
|
+
"SEARXNG_URL must use http or https protocol, got: #{protocol.inspect}"
|
|
32
|
+
end
|
|
33
|
+
|
|
34
|
+
def configuration_auth_pair
|
|
35
|
+
"Authentication variables must be set together: SEARXNG_USER with SEARXNG_PASSWORD (or AUTH_USERNAME with AUTH_PASSWORD)"
|
|
36
|
+
end
|
|
37
|
+
|
|
38
|
+
def no_results(query)
|
|
39
|
+
%(No results found for "#{query}". Try different search terms or check if SearXNG search engines are working.)
|
|
40
|
+
end
|
|
41
|
+
|
|
42
|
+
def network_error(exception, uri: nil, target: "SearXNG server")
|
|
43
|
+
case exception
|
|
44
|
+
when SocketError
|
|
45
|
+
host = uri&.host || "unknown host"
|
|
46
|
+
%(DNS Error: Cannot resolve hostname "#{host}".)
|
|
47
|
+
when Errno::ECONNREFUSED
|
|
48
|
+
"Connection Error: #{target} is not responding."
|
|
49
|
+
when Errno::ETIMEDOUT, Timeout::Error
|
|
50
|
+
"Timeout Error: #{target} is too slow to respond."
|
|
51
|
+
when OpenSSL::SSL::SSLError
|
|
52
|
+
"SSL Error: Certificate or TLS problem while connecting to #{target}."
|
|
53
|
+
else
|
|
54
|
+
message = exception&.message.to_s.strip
|
|
55
|
+
"Network Error: #{message.empty? ? "Connection failed" : message}"
|
|
56
|
+
end
|
|
57
|
+
end
|
|
58
|
+
|
|
59
|
+
def api_error(status_code, status_message)
|
|
60
|
+
case status_code.to_i
|
|
61
|
+
when 403
|
|
62
|
+
"SearXNG Error (403): Authentication required or IP blocked."
|
|
63
|
+
when 404
|
|
64
|
+
"SearXNG Error (404): Search endpoint not found."
|
|
65
|
+
when 429
|
|
66
|
+
"SearXNG Error (429): Rate limit exceeded."
|
|
67
|
+
when 500..599
|
|
68
|
+
"SearXNG Error (#{status_code}): Internal server error."
|
|
69
|
+
else
|
|
70
|
+
"SearXNG returned #{status_code} #{status_message}".strip
|
|
71
|
+
end
|
|
72
|
+
end
|
|
73
|
+
end
|
|
18
74
|
end
|
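The `ErrorMessages` module above combines `module_function` with a `case` over exact status codes and a range. For readers unfamiliar with that pattern, a tiny standalone sketch follows. `DemoMessages` and its strings are hypothetical and exist only to show the technique.

```ruby
# Hypothetical module illustrating the pattern; not part of the gem.
module DemoMessages
  # Makes the methods below callable as DemoMessages.for_status(...)
  # without instantiating or including anything.
  module_function

  def for_status(code)
    case code.to_i
    when 429 then "rate limited"
    when 500..599 then "server error (#{code})" # a Range matches any value it covers
    else "unexpected status #{code}"
    end
  end
end
```

Because `case` compares with `===`, an integer matches either an exact value or a `Range` that covers it, which is what lets a single branch handle all 5xx codes.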
data/lib/searxng/server.rb
CHANGED
|
@@ -1,5 +1,13 @@
|
|
|
1
1
|
require "fast_mcp"
|
|
2
2
|
require "searxng"
|
|
3
|
+
require "json"
|
|
4
|
+
require "net/http"
|
|
5
|
+
require "openssl"
|
|
6
|
+
require "open3"
|
|
7
|
+
require "socket"
|
|
8
|
+
require "time"
|
|
9
|
+
require "timeout"
|
|
10
|
+
require "uri"
|
|
3
11
|
|
|
4
12
|
FastMcp = MCP unless defined?(FastMcp)
|
|
5
13
|
|
|
@@ -53,15 +61,47 @@ module Searxng
|
|
|
53
61
|
end
|
|
54
62
|
|
|
55
63
|
def self.start
|
|
64
|
+
validate_environment!
|
|
56
65
|
server = FastMcp::Server.new(
|
|
57
66
|
name: "searxng",
|
|
58
67
|
version: Searxng::VERSION,
|
|
59
68
|
logger: NullLogger.new
|
|
60
69
|
)
|
|
61
|
-
server
|
|
70
|
+
register_tools(server)
|
|
71
|
+
register_resources(server)
|
|
62
72
|
server.start
|
|
63
73
|
end
|
|
64
74
|
|
|
75
|
+
def self.register_tools(server)
|
|
76
|
+
server.register_tool(SearxngWebSearchTool)
|
|
77
|
+
server.register_tool(WebUrlReadTool)
|
|
78
|
+
end
|
|
79
|
+
|
|
80
|
+
def self.register_resources(server)
|
|
81
|
+
server.register_resource(ServerConfigResource)
|
|
82
|
+
server.register_resource(UsageGuideResource)
|
|
83
|
+
end
|
|
84
|
+
|
|
85
|
+
def self.validate_environment!
|
|
86
|
+
searxng_url = ENV["SEARXNG_URL"]
|
|
87
|
+
if searxng_url.nil? || searxng_url.strip.empty?
|
|
88
|
+
raise ConfigurationError, ErrorMessages.configuration_missing_url
|
|
89
|
+
end
|
|
90
|
+
|
|
91
|
+
uri = URI.parse(searxng_url)
|
|
92
|
+
unless %w[http https].include?(uri.scheme)
|
|
93
|
+
raise ConfigurationError, ErrorMessages.configuration_invalid_protocol(uri.scheme)
|
|
94
|
+
end
|
|
95
|
+
|
|
96
|
+
user = ENV["SEARXNG_USER"] || ENV["AUTH_USERNAME"]
|
|
97
|
+
password = ENV["SEARXNG_PASSWORD"] || ENV["AUTH_PASSWORD"]
|
|
98
|
+
if (user && !password) || (!user && password)
|
|
99
|
+
raise ConfigurationError, ErrorMessages.configuration_auth_pair
|
|
100
|
+
end
|
|
101
|
+
rescue URI::InvalidURIError
|
|
102
|
+
raise ConfigurationError, ErrorMessages.configuration_invalid_url(searxng_url)
|
|
103
|
+
end
|
|
104
|
+
|
|
65
105
|
class BaseTool < FastMcp::Tool
|
|
66
106
|
protected
|
|
67
107
|
|
|
@@ -73,6 +113,7 @@ module Searxng
|
|
|
73
113
|
class SearxngWebSearchTool < BaseTool
|
|
74
114
|
tool_name "searxng_web_search"
|
|
75
115
|
description "Performs a web search using the SearXNG API. Use this to find information on the web. Aggregates results from multiple search engines."
|
|
116
|
+
annotations(readOnlyHint: true, openWorldHint: true) if respond_to?(:annotations)
|
|
76
117
|
|
|
77
118
|
arguments do
|
|
78
119
|
required(:query).filled(:string).description("The search query")
|
|
@@ -98,6 +139,7 @@ module Searxng
|
|
|
98
139
|
|
|
99
140
|
def format_search_result(data, max_results: 10, pageno: 1)
|
|
100
141
|
out = []
|
|
142
|
+
query = data[:query].to_s
|
|
101
143
|
data[:infoboxes]&.each do |ib|
|
|
102
144
|
out << "Infobox: #{ib[:infobox]}"
|
|
103
145
|
out << "ID: #{ib[:id]}"
|
|
@@ -107,7 +149,7 @@ module Searxng
|
|
|
107
149
|
results = data[:results] || []
|
|
108
150
|
total = data[:number_of_results]
|
|
109
151
|
if results.empty?
|
|
110
|
-
out << "
|
|
152
|
+
out << ErrorMessages.no_results(query.empty? ? "your query" : query)
|
|
111
153
|
else
|
|
112
154
|
limit = [max_results.to_i, 1].max
|
|
113
155
|
shown = results.first(limit)
|
|
@@ -127,5 +169,535 @@ module Searxng
|
|
|
127
169
|
out.join("\n").strip
|
|
128
170
|
end
|
|
129
171
|
end
|
|
172
|
+
|
|
173
|
+
class WebUrlReadTool < BaseTool
|
|
174
|
+
tool_name "web_url_read"
|
|
175
|
+
description "Reads a URL and converts HTML/XML content into Markdown. Supports character pagination, section extraction, paragraph ranges, and heading-only mode."
|
|
176
|
+
annotations(readOnlyHint: true, openWorldHint: true) if respond_to?(:annotations)
|
|
177
|
+
|
|
178
|
+
CACHE_TTL_SECONDS = 300
|
|
179
|
+
@@url_cache = {} # rubocop:disable Style/ClassVars
|
|
180
|
+
|
|
181
|
+
arguments do
|
|
182
|
+
required(:url).filled(:string).description("URL to fetch and read")
|
|
183
|
+
optional(:startChar).filled(:integer).description("Starting character position (default: 0)")
|
|
184
|
+
optional(:maxLength).filled(:integer).description("Maximum characters to return")
|
|
185
|
+
optional(:section).filled(:string).description("Extract by heading text")
|
|
186
|
+
optional(:paragraphRange).filled(:string).description("Paragraph range, e.g. '1-5', '3', '10-'")
|
|
187
|
+
optional(:readHeadings).filled(:bool).description("Return only headings when true")
|
|
188
|
+
optional(:timeoutMs).filled(:integer).description("Request timeout in ms (default: 10000)")
|
|
189
|
+
end
|
|
190
|
+
|
|
191
|
+
def call(url:, **kwargs)
|
|
192
|
+
start_char = kwargs.fetch(:startChar, 0)
|
|
193
|
+
max_length = kwargs[:maxLength]
|
|
194
|
+
section = kwargs[:section]
|
|
195
|
+
paragraph_range = kwargs[:paragraphRange]
|
|
196
|
+
read_headings = kwargs.fetch(:readHeadings, false)
|
|
197
|
+
timeout_ms = kwargs.fetch(:timeoutMs, 10_000)
|
|
198
|
+
|
|
199
|
+
normalized = normalize_url(url)
|
|
200
|
+
uri = URI.parse(normalized)
|
|
201
|
+
markdown = case uri.scheme
|
|
202
|
+
when "http", "https"
|
|
203
|
+
html = fetch_html(normalized, timeout_ms: timeout_ms.to_i)
|
|
204
|
+
convert_to_markdown(html, normalized)
|
|
205
|
+
when "ftp"
|
|
206
|
+
content = fetch_ftp(uri, timeout_ms: timeout_ms.to_i)
|
|
207
|
+
convert_fetched_content(content, normalized)
|
|
208
|
+
when "sftp"
|
|
209
|
+
content = fetch_sftp(uri, timeout_ms: timeout_ms.to_i)
|
|
210
|
+
convert_fetched_content(content, normalized)
|
|
211
|
+
when "smb"
|
|
212
|
+
content = fetch_smb(uri, timeout_ms: timeout_ms.to_i)
|
|
213
|
+
convert_fetched_content(content, normalized)
|
|
214
|
+
when "gemini"
|
|
215
|
+
gemtext = fetch_gemini(uri, timeout_ms: timeout_ms.to_i)
|
|
216
|
+
gemtext_to_markdown(gemtext, uri)
|
|
217
|
+
when "ipfs"
|
|
218
|
+
gateway_url = resolve_ipfs_url(uri)
|
|
219
|
+
html = fetch_html(gateway_url, timeout_ms: timeout_ms.to_i)
|
|
220
|
+
convert_to_markdown(html, gateway_url)
|
|
221
|
+
else
|
|
222
|
+
raise ConfigurationError, %(Unsupported URL scheme "#{uri.scheme}". Supported schemes: http, https, ftp, sftp, smb, gemini, ipfs.)
|
|
223
|
+
end
|
|
224
|
+
apply_options(markdown, start_char: start_char, max_length: max_length, section: section, paragraph_range: paragraph_range, read_headings: read_headings)
|
|
225
|
+
end
|
|
226
|
+
|
|
227
|
+
private
|
|
228
|
+
|
|
229
|
+
def normalize_url(url)
|
|
230
|
+
raw = url.to_s.strip
|
|
231
|
+
uri = URI.parse(raw)
|
|
232
|
+
if uri.scheme
|
|
233
|
+
scheme = uri.scheme.downcase
|
|
234
|
+
return raw if uri.host && %w[http https ftp sftp smb gemini ipfs].include?(scheme)
|
|
235
|
+
|
|
236
|
+
raise ConfigurationError, %(Unsupported URL scheme "#{uri.scheme}". Supported schemes: http, https, ftp, sftp, smb, gemini, ipfs.)
|
|
237
|
+
end
|
|
238
|
+
|
|
239
|
+
candidate = "https://#{raw}"
|
|
240
|
+
parsed = URI.parse(candidate)
|
|
241
|
+
return candidate if parsed.host
|
|
242
|
+
|
|
243
|
+
raise ArgumentError
|
|
244
|
+
rescue URI::InvalidURIError, ArgumentError
|
|
245
|
+
raise ConfigurationError, %(URL Format Error: Invalid URL "#{url}")
|
|
246
|
+
end
|
|
247
|
+
|
|
248
|
+
def fetch_html(url, timeout_ms:)
|
|
249
|
+
now = Time.now.to_i
|
|
250
|
+
cached = @@url_cache[url]
|
|
251
|
+
if cached && (now - cached[:at] <= CACHE_TTL_SECONDS)
|
|
252
|
+
return cached[:html]
|
|
253
|
+
end
|
|
254
|
+
|
|
255
|
+
uri = URI.parse(url)
|
|
256
|
+
client = get_client
|
|
257
|
+
http = client.send(:build_http, uri)
|
|
258
|
+
request = Net::HTTP::Get.new(uri)
|
|
259
|
+
request["User-Agent"] = "searxng-ruby/#{Searxng::VERSION} web_url_read"
|
|
260
|
+
response = nil
|
|
261
|
+
Timeout.timeout(timeout_ms / 1000.0) { response = http.request(request) }
|
|
262
|
+
unless response.is_a?(Net::HTTPSuccess)
|
|
263
|
+
raise APIError.new(
|
|
264
|
+
ErrorMessages.api_error(response.code.to_i, response.message),
|
|
265
|
+
status_code: response.code.to_i,
|
|
266
|
+
response_data: response.body,
|
|
267
|
+
uri: url
|
|
268
|
+
)
|
|
269
|
+
end
|
|
270
|
+
|
|
271
|
+
body = response.body.to_s
|
|
272
|
+
if body.strip.empty?
|
|
273
|
+
raise APIError.new("Content Error: Website returned empty content.", status_code: response.code.to_i, uri: url)
|
|
274
|
+
end
|
|
275
|
+
|
|
276
|
+
@@url_cache[url] = {at: now, html: body}
|
|
277
|
+
body
|
|
278
|
+
rescue Timeout::Error
|
|
279
|
+
raise NetworkError, "Timeout Error: #{URI.parse(url).host} took longer than #{timeout_ms}ms to respond"
|
|
280
|
+
rescue SocketError, Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EPIPE, OpenSSL::SSL::SSLError, EOFError => e
|
|
281
|
+
raise NetworkError, ErrorMessages.network_error(e, uri: URI.parse(url), target: "website")
|
|
282
|
+
end
|
|
283
|
+
|
|
284
|
+
def convert_to_markdown(html, base_url)
|
|
285
|
+
load_teplo_core!
|
|
286
|
+
ast = TeploCore.parse_html(html)
|
|
287
|
+
markdown = TeploCore.ast_to_markdown(ast, base_url)
|
|
288
|
+
if markdown.to_s.strip.empty?
|
|
289
|
+
"Content Warning: Page fetched but appears empty after conversion (#{base_url}). May contain only media or require JavaScript."
|
|
290
|
+
else
|
|
291
|
+
markdown
|
|
292
|
+
end
|
|
293
|
+
rescue => e
|
|
294
|
+
raise APIError.new("Conversion Error: Cannot convert HTML to Markdown (#{base_url}) - #{e.message}", uri: base_url)
|
|
295
|
+
end
|
|
296
|
+
|
|
297
|
+
def convert_fetched_content(content, base_url)
|
|
298
|
+
text = content.to_s.encode("UTF-8", invalid: :replace, undef: :replace, replace: "")
|
|
299
|
+
return text if text.strip.empty?
|
|
300
|
+
|
|
301
|
+
looks_like_markup = text.match?(/\A\s*<(?:!doctype|html|body|xml|rss|feed|svg|\?xml)/i)
|
|
302
|
+
looks_like_markup ? convert_to_markdown(text, base_url) : text
|
|
303
|
+
end
|
|
304
|
+
|
|
305
|
+
def fetch_ftp(uri, timeout_ms:)
|
|
306
|
+
begin
|
|
307
|
+
require "net/ftp"
|
|
308
|
+
rescue LoadError
|
|
309
|
+
raise ConfigurationError, "FTP support requires the 'net-ftp' gem."
|
|
310
|
+
end
|
|
311
|
+
|
|
312
|
+
body = +""
|
|
313
|
+
Timeout.timeout(timeout_ms / 1000.0) do
|
|
314
|
+
ftp = Net::FTP.new
|
|
315
|
+
begin
|
|
316
|
+
ftp.connect(uri.host, uri.port || 21)
|
|
317
|
+
ftp.read_timeout = [timeout_ms / 1000.0, 1.0].max
|
|
318
|
+
ftp.open_timeout = [timeout_ms / 1000.0, 1.0].max
|
|
319
|
+
user = uri.user || ENV["FTP_USER"] || "anonymous"
|
|
320
|
+
pass = uri.password || ENV["FTP_PASSWORD"] || "anonymous@"
|
|
321
|
+
ftp.login(user, pass)
|
|
322
|
+
path = uri.path.to_s
|
|
323
|
+
raise APIError.new("FTP path is required", uri: uri.to_s) if path.empty? || path == "/"
|
|
324
|
+
|
|
325
|
+
ftp.retrbinary("RETR #{path}", 16_384) { |chunk| body << chunk }
|
|
326
|
+
ensure
|
|
327
|
+
ftp.close unless ftp.closed?
|
|
328
|
+
end
|
|
329
|
+
end
|
|
330
|
+
body
|
|
331
|
+
rescue Timeout::Error
|
|
332
|
+
raise NetworkError, "Timeout Error: #{uri.host} took longer than #{timeout_ms}ms to respond"
|
|
333
|
+
rescue SocketError, Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EPIPE, EOFError => e
|
|
334
|
+
raise NetworkError, ErrorMessages.network_error(e, uri: uri, target: "FTP server")
|
|
335
|
+
rescue => e
|
|
336
|
+
if e.class.name.start_with?("Net::FTP")
|
|
337
|
+
raise APIError.new("FTP Error: #{e.message}", uri: uri.to_s)
|
|
338
|
+
end
|
|
339
|
+
raise e
|
|
340
|
+
end
|
|
341
|
+
|
|
342
|
+
def fetch_sftp(uri, timeout_ms:)
|
|
343
|
+
begin
|
|
344
|
+
require "net/sftp"
|
|
345
|
+
rescue LoadError
|
|
346
|
+
raise ConfigurationError, "SFTP support requires the 'net-sftp' gem."
|
|
347
|
+
end
|
|
348
|
+
|
|
349
|
+
user = uri.user || ENV["SFTP_USER"]
|
|
350
|
+
password = uri.password || ENV["SFTP_PASSWORD"]
|
|
351
|
+
raise ConfigurationError, "SFTP requires username (sftp://user@host/path or SFTP_USER)." unless user
|
|
352
|
+
|
|
353
|
+
path = uri.path.to_s
|
|
354
|
+
raise APIError.new("SFTP path is required", uri: uri.to_s) if path.empty? || path == "/"
|
|
355
|
+
|
|
356
|
+
result = nil
|
|
357
|
+
Timeout.timeout(timeout_ms / 1000.0) do
|
|
358
|
+
Net::SFTP.start(uri.host, user, password: password, port: uri.port || 22, non_interactive: true, verify_host_key: :never) do |sftp|
|
|
359
|
+
result = sftp.download!(path)
|
|
360
|
+
end
|
|
361
|
+
end
|
|
362
|
+
result.to_s
|
|
363
|
+
rescue Timeout::Error
|
|
364
|
+
raise NetworkError, "Timeout Error: #{uri.host} took longer than #{timeout_ms}ms to respond"
|
|
365
|
+
rescue SocketError, Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EPIPE, EOFError => e
|
|
366
|
+
raise NetworkError, ErrorMessages.network_error(e, uri: uri, target: "SFTP server")
|
|
367
|
+
rescue => e
|
|
368
|
+
raise e if e.is_a?(ConfigurationError)
|
|
369
|
+
if e.class.name.include?("StatusException")
|
|
370
|
+
description = e.respond_to?(:description) ? e.description : e.message
|
|
371
|
+
raise APIError.new("SFTP Error: #{description}", uri: uri.to_s)
|
|
372
|
+
end
|
|
373
|
+
if e.class.name.include?("AuthenticationFailed")
|
|
374
|
+
raise APIError.new("SFTP authentication failed", uri: uri.to_s)
|
|
375
|
+
end
|
|
376
|
+
raise APIError.new("SFTP Error: #{e.message}", uri: uri.to_s)
|
|
377
|
+
end
|
|
378
|
+
|
|
379
|
+
def fetch_smb(uri, timeout_ms:)
|
|
380
|
+
share, remote_path = parse_smb_path(uri)
|
|
381
|
+
user = uri.user || ENV["SMB_USER"]
|
|
382
|
+
pass = uri.password || ENV["SMB_PASSWORD"]
|
|
383
|
+
domain = ENV["SMB_DOMAIN"]
|
|
384
|
+
|
|
385
|
+
auth = if user
|
|
386
|
+
full_user = (domain && !domain.empty?) ? "#{domain}\\#{user}" : user
|
|
387
|
+
"#{full_user}%#{pass}"
|
|
388
|
+
end
|
|
389
|
+
|
|
390
|
+
escaped_remote = remote_path.gsub('"', '\"')
|
|
391
|
+
cmd = ["smbclient", "//#{uri.host}/#{share}", "-c", %(get "#{escaped_remote}" -)]
|
|
392
|
+
if auth
|
|
393
|
+
cmd << "-U" << auth
|
|
394
|
+
else
|
|
395
|
+
cmd << "-N"
|
|
396
|
+
end
|
|
397
|
+
|
|
398
|
+
stdout = +""
|
|
399
|
+
stderr = +""
|
|
400
|
+
status = nil
|
|
401
|
+
Timeout.timeout(timeout_ms / 1000.0) do
|
|
402
|
+
stdout, stderr, status = Open3.capture3(*cmd)
|
|
403
|
+
end
|
|
404
|
+
|
|
405
|
+
raise APIError.new("SMB Error: #{stderr.strip.empty? ? "request failed" : stderr.strip}", uri: uri.to_s) unless status.success?
|
|
406
|
+
|
|
407
|
+
stdout
|
|
408
|
+
rescue Errno::ENOENT
|
|
409
|
+
raise ConfigurationError, "SMB support requires 'smbclient' to be installed and available in PATH."
|
|
410
|
+
rescue Timeout::Error
|
|
411
|
+
raise NetworkError, "Timeout Error: #{uri.host} took longer than #{timeout_ms}ms to respond"
|
|
412
|
+
rescue SocketError, Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EPIPE, EOFError => e
|
|
413
|
+
raise NetworkError, ErrorMessages.network_error(e, uri: uri, target: "SMB server")
|
|
414
|
+
end
|
|
415
|
+
|
|
416
|
+
def parse_smb_path(uri)
|
|
417
|
+
segments = uri.path.to_s.split("/").reject(&:empty?)
|
|
418
|
+
share = segments.shift
|
|
419
|
+
remote = segments.join("/")
|
|
420
|
+
raise APIError.new("SMB path must include share and file (smb://host/share/path/file)", uri: uri.to_s) if share.to_s.empty? || remote.empty?
|
|
421
|
+
|
|
422
|
+
[share, remote]
|
|
423
|
+
end
|
|
424
|
+
|
|
425
|
+
def fetch_gemini(uri, timeout_ms:)
|
|
426
|
+
response_header = nil
|
|
427
|
+
body = +""
|
|
428
|
+
Timeout.timeout(timeout_ms / 1000.0) do
|
|
429
|
+
tcp = TCPSocket.new(uri.host, uri.port || 1965)
|
|
430
|
+
begin
|
|
431
|
+
ctx = OpenSSL::SSL::SSLContext.new
|
|
432
|
+
ctx.verify_mode = (ENV["GEMINI_INSECURE"] == "1") ? OpenSSL::SSL::VERIFY_NONE : OpenSSL::SSL::VERIFY_PEER
|
|
433
|
+
ssl = OpenSSL::SSL::SSLSocket.new(tcp, ctx)
|
|
434
|
+
ssl.hostname = uri.host if ssl.respond_to?(:hostname=)
|
|
435
|
+
ssl.connect
|
|
436
|
+
ssl.write("#{uri}\r\n")
|
|
437
|
+
response_header = ssl.gets("\r\n")
|
|
438
|
+
body = ssl.read.to_s
|
|
439
|
+
ssl.close
|
|
440
|
+
ensure
|
|
441
|
+
tcp.close unless tcp.closed?
|
|
442
|
+
end
|
|
443
|
+
end
|
|
444
|
+
|
|
445
|
+
unless response_header
|
|
446
|
+
raise APIError.new("Gemini response error: empty response header", uri: uri.to_s)
|
|
447
|
+
end
|
|
448
|
+
|
|
449
|
+
status, meta = response_header.strip.split(/\s+/, 2)
|
|
450
|
+
status_code = status.to_i
|
|
451
|
+
case status_code
|
|
452
|
+
when 20..29
|
|
453
|
+
body
|
|
454
|
+
when 30..39
|
|
455
|
+
raise APIError.new("Gemini redirect (#{status_code}): #{meta}", status_code: status_code, uri: uri.to_s)
|
|
456
|
+
when 40..49
|
|
457
|
+
raise APIError.new("Gemini temporary failure (#{status_code}): #{meta}", status_code: status_code, uri: uri.to_s)
|
|
458
|
+
when 50..59
|
|
459
|
+
raise APIError.new("Gemini permanent failure (#{status_code}): #{meta}", status_code: status_code, uri: uri.to_s)
|
|
460
|
+
when 60..69
|
|
461
|
+
raise APIError.new("Gemini certificate required (#{status_code}): #{meta}", status_code: status_code, uri: uri.to_s)
|
|
462
|
+
else
|
|
463
|
+
raise APIError.new("Gemini response error (#{status_code}): #{meta}", status_code: status_code, uri: uri.to_s)
|
|
464
|
+
end
|
|
465
|
+
rescue Timeout::Error
|
|
466
|
+
raise NetworkError, "Timeout Error: #{uri.host} took longer than #{timeout_ms}ms to respond"
|
|
467
|
+
rescue SocketError, Errno::ECONNREFUSED, Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EPIPE, OpenSSL::SSL::SSLError, EOFError => e
|
|
468
|
+
raise NetworkError, ErrorMessages.network_error(e, uri: uri, target: "Gemini server")
|
|
469
|
+
end
|
|
470
|
+
|
|
471
|
+
def gemtext_to_markdown(gemtext, base_uri)
|
|
472
|
+
lines = gemtext.to_s.split("\n")
|
|
473
|
+
out = lines.map do |line|
|
|
474
|
+
if line.start_with?("=>")
|
|
475
|
+
target, *label_parts = line.sub(/\A=>\s*/, "").split(/\s+/)
|
|
476
|
+
label = label_parts.join(" ").strip
|
|
477
|
+
next "" if target.to_s.empty?
|
|
478
|
+
resolved = resolve_relative_uri(base_uri, target)
|
|
479
|
+
display = label.empty? ? resolved : label
|
|
480
|
+
"- [#{display}](#{resolved})"
|
|
481
|
+
elsif line.start_with?("### ")
|
|
482
|
+
"### #{line[4..].to_s.strip}"
|
|
483
|
+
elsif line.start_with?("## ")
|
|
484
|
+
"## #{line[3..].to_s.strip}"
|
|
485
|
+
elsif line.start_with?("# ")
|
|
486
|
+
"# #{line[2..].to_s.strip}"
|
|
487
|
+
elsif line.start_with?("> ")
|
|
488
|
+
"> #{line[2..]}"
|
|
489
|
+
else
|
|
490
|
+
line
|
|
491
|
+
end
|
|
492
|
+
end
|
|
493
|
+
out.join("\n")
|
|
494
|
+
end
|
|
495
|
+
|
|
496
|
+
def resolve_relative_uri(base_uri, target)
|
|
497
|
+
target_uri = URI.parse(target)
|
|
498
|
+
return target if target_uri.scheme
|
|
499
|
+
|
|
500
|
+
URI.join(base_uri.to_s, target).to_s
|
|
501
|
+
rescue URI::InvalidURIError
|
|
502
|
+
target
|
|
503
|
+
end
|
|
504
|
+
|
|
505
|
+
def resolve_ipfs_url(uri)
|
|
506
|
+
gateway = ENV["IPFS_GATEWAY"] || "https://ipfs.io"
|
|
507
|
+
gateway = gateway.sub(%r{/+\z}, "")
|
|
508
|
+
parsed_gateway = URI.parse(gateway)
|
|
509
|
+
unless %w[http https].include?(parsed_gateway.scheme)
|
|
510
|
+
raise ConfigurationError, "IPFS_GATEWAY must use http or https, got: #{parsed_gateway.scheme.inspect}"
|
|
511
|
+
end
|
|
512
|
+
|
|
513
|
+
cid = uri.host.to_s
|
|
514
|
+
path = uri.path.to_s
|
|
515
|
+
query = uri.query ? "?#{uri.query}" : ""
|
|
516
|
+
"#{gateway}/ipfs/#{cid}#{path}#{query}"
|
|
517
|
+
rescue URI::InvalidURIError
|
|
518
|
+
raise ConfigurationError, "IPFS_GATEWAY has invalid format: #{gateway.inspect}"
|
|
519
|
+
end
|
|
520
|
+
|
|
521
|
+
def load_teplo_core!
|
|
522
|
+
return if defined?(TeploCore)
|
|
523
|
+
|
|
524
|
+
require "teplo_core"
|
|
525
|
+
rescue LoadError
|
|
526
|
+
local = File.expand_path("../../../textplorer/core-ruby/lib/teplo_core", __dir__)
|
|
527
|
+
if File.exist?("#{local}.rb")
|
|
528
|
+
require local
|
|
529
|
+
else
|
|
530
|
+
raise ConfigurationError, "web_url_read requires teplo_core. Install the gem or provide textplorer/core-ruby in the expected workspace location."
|
|
531
|
+
end
|
|
532
|
+
end
|
|
533
|
+
|
|
534
|
+
def apply_options(markdown, start_char:, max_length:, section:, paragraph_range:, read_headings:)
|
|
535
|
+
result = markdown.to_s
|
|
536
|
+
return extract_headings(result) if read_headings
|
|
537
|
+
|
|
538
|
+
if section && !section.strip.empty?
|
|
539
|
+
section_text = extract_section(result, section)
|
|
540
|
+
result = section_text.empty? ? %(Section "#{section}" not found in the content.) : section_text
|
|
541
|
+
end
|
|
542
|
+
|
|
543
|
+
if paragraph_range && !paragraph_range.strip.empty?
|
|
544
|
+
paragraph_text = extract_paragraph_range(result, paragraph_range)
|
|
545
|
+
result = paragraph_text.empty? ? %(Paragraph range "#{paragraph_range}" is invalid or out of bounds.) : paragraph_text
|
|
546
|
+
end
|
|
547
|
+
|
|
548
|
+
start = [start_char.to_i, 0].max
|
|
549
|
+
result = (start >= result.length) ? "" : result[start..]
|
|
550
|
+
|
|
551
|
+
if max_length
|
|
552
|
+
max = max_length.to_i
|
|
553
|
+
result = result[0, max] if max.positive?
|
|
554
|
+
end
|
|
555
|
+
result
|
|
556
|
+
end
|
|
557
|
+
|
|
558
|
+
def extract_section(markdown, heading)
|
|
559
|
+
lines = markdown.split("\n")
|
|
560
|
+
section_regex = /^\#{1,6}\s*.*#{Regexp.escape(heading)}.*$/i
|
|
561
|
+
|
|
562
|
+
start_index = -1
|
|
563
|
+
current_level = 0
|
|
564
|
+
lines.each_with_index do |line, idx|
|
|
565
|
+
next unless line.match?(section_regex)
|
|
566
|
+
|
|
567
|
+
start_index = idx
|
|
568
|
+
current_level = line[/^#+/].to_s.length
|
|
569
|
+
break
|
|
570
|
+
end
|
|
571
|
+
return "" if start_index < 0
|
|
572
|
+
|
|
573
|
+
end_index = lines.length
|
|
574
|
+
lines[(start_index + 1)..]&.each_with_index do |line, offset|
|
|
575
|
+
level = line[/^#+/].to_s.length
|
|
576
|
+
next if level.zero? || level > current_level
|
|
577
|
+
|
|
578
|
+
end_index = start_index + 1 + offset
|
|
579
|
+
break
|
|
580
|
+
end
|
|
581
|
+
|
|
582
|
+
lines[start_index...end_index].join("\n")
|
|
583
|
+
end
|
|
584
|
+
|
|
585
|
+
def extract_paragraph_range(markdown, range)
|
|
586
|
+
paragraphs = markdown.split(/\n{2,}/).map(&:strip).reject(&:empty?)
|
|
587
|
+
match = range.to_s.strip.match(/\A(\d+)(?:-(\d*))?\z/)
|
|
588
|
+
return "" unless match
|
|
589
|
+
|
|
590
|
+
start = match[1].to_i - 1
|
|
591
|
+
return "" if start.negative? || start >= paragraphs.length
|
|
592
|
+
|
|
593
|
+
if match[2].nil?
|
|
594
|
+
paragraphs[start].to_s
|
|
595
|
+
elsif match[2].empty?
|
|
596
|
+
paragraphs[start..].join("\n\n")
|
|
597
|
+
else
|
|
598
|
+
ending = match[2].to_i
|
|
599
|
+
paragraphs[start...ending].join("\n\n")
|
|
600
|
+
end
|
|
601
|
+
end
|
|
602
|
+
|
|
603
|
+
def extract_headings(markdown)
|
|
604
|
+
headings = markdown.split("\n").select { |line| line.match?(/^\#{1,6}\s+/) }
|
|
605
|
+
headings.empty? ? "No headings found in the content." : headings.join("\n")
|
|
606
|
+
end
|
|
607
|
+
end
|
|
608
|
+
|
|
609
|
+
class ServerConfigResource < FastMcp::Resource
|
|
610
|
+
uri "config://server-config"
|
|
611
|
+
resource_name "Server Configuration"
|
|
612
|
+
description "Current SearXNG MCP server configuration and capabilities"
|
|
613
|
+
mime_type "application/json"
|
|
614
|
+
|
|
615
|
+
def content
|
|
616
|
+
searxng_url = ENV["SEARXNG_URL"].to_s
|
|
617
|
+
user = ENV["SEARXNG_USER"] || ENV["AUTH_USERNAME"]
|
|
618
|
+
password = ENV["SEARXNG_PASSWORD"] || ENV["AUTH_PASSWORD"]
|
|
619
|
+
config = {
|
|
620
|
+
server_info: {
|
|
621
|
+
name: "searxng",
|
|
622
|
+
version: Searxng::VERSION
|
|
623
|
+
},
|
|
624
|
+
environment: {
|
|
625
|
+
searxng_url: safe_url(searxng_url),
|
|
626
|
+
has_auth: !!(user && password),
|
|
627
|
+
has_proxy: !!(ENV["HTTP_PROXY"] || ENV["HTTPS_PROXY"] || ENV["http_proxy"] || ENV["https_proxy"]),
|
|
628
|
+
has_no_proxy: !!(ENV["NO_PROXY"] || ENV["no_proxy"])
|
|
629
|
+
},
|
|
630
|
+
capabilities: {
|
|
631
|
+
tools: %w[searxng_web_search web_url_read],
|
|
632
|
+
resources: %w[config://server-config help://usage-guide],
|
|
633
|
+
transport: ["stdio"]
|
|
634
|
+
}
|
|
635
|
+
}
|
|
636
|
+
JSON.pretty_generate(config)
|
|
637
|
+
end
|
|
638
|
+
|
|
639
|
+
private
|
|
640
|
+
|
|
641
|
+
def safe_url(raw)
|
|
642
|
+
return "(not configured)" if raw.nil? || raw.strip.empty?
|
|
643
|
+
|
|
644
|
+
uri = URI.parse(raw)
|
|
645
|
+
uri.user = nil
|
|
646
|
+
uri.password = nil
|
|
647
|
+
uri.to_s
|
|
648
|
+
rescue URI::InvalidURIError
|
|
649
|
+
"(invalid URL)"
|
|
650
|
+
end
|
|
651
|
+
end
|
|
652
|
+
|
|
653
|
+
class UsageGuideResource < FastMcp::Resource
|
|
654
|
+
uri "help://usage-guide"
|
|
655
|
+
resource_name "Usage Guide"
|
|
656
|
+
description "Short guide for using SearXNG MCP tools and environment variables"
|
|
657
|
+
mime_type "text/markdown"
|
|
658
|
+
|
|
659
|
+
def content
|
|
660
|
+
<<~MD
|
|
661
|
+
# SearXNG MCP Server Help
|
|
662
|
+
|
|
663
|
+
## Tools
|
|
664
|
+
1. `searxng_web_search` - Search the web via SearXNG.
|
|
665
|
+
2. `web_url_read` - Fetch a URL and return Markdown-converted content.
|
|
666
|
+
|
|
667
|
+
## Required Environment
|
|
668
|
+
- `SEARXNG_URL` (must be `http://` or `https://`)
|
|
669
|
+
|
|
670
|
+
## Optional Environment
|
|
671
|
+
- `SEARXNG_USER` and `SEARXNG_PASSWORD` (or `AUTH_USERNAME` and `AUTH_PASSWORD`)
|
|
672
|
+
- `HTTP_PROXY` / `HTTPS_PROXY`
|
|
673
|
+
- `ALL_PROXY` (for shared proxy config)
|
|
674
|
+
- `NO_PROXY`
|
|
675
|
+
- `FTP_USER` / `FTP_PASSWORD` (optional FTP credentials)
|
|
676
|
+
- `SFTP_USER` / `SFTP_PASSWORD` (optional SFTP credentials)
|
|
677
|
+
- `SMB_USER` / `SMB_PASSWORD` / `SMB_DOMAIN` (optional SMB credentials)
|
|
678
|
+
- `IPFS_GATEWAY` (default: `https://ipfs.io`)
|
|
679
|
+
- `GEMINI_INSECURE=1` (optional, disables TLS verification for Gemini)
|
|
680
|
+
|
|
681
|
+
## Protocol Notes
|
|
682
|
+
- `http` / `https`: fully supported (`searxng_web_search` + `web_url_read`)
|
|
683
|
+
- `ftp`: supported in `web_url_read` (file retrieval via FTP)
|
|
684
|
+
- `sftp`: supported in `web_url_read` (requires `net-sftp` gem)
|
|
685
|
+
- `smb`: supported in `web_url_read` via `smbclient` command
|
|
686
|
+
- `gemini`: supported in `web_url_read` (Gemini fetch + Gemtext conversion)
|
|
687
|
+
- `ipfs`: supported via HTTP gateway mapping (`ipfs://...` -> `IPFS_GATEWAY`)
|
|
688
|
+
- `tor` / `i2p`: supported via `.onion` / `.i2p` hosts over HTTP(S) with proxy configuration
|
|
689
|
+
- `socks5`: configure via proxy environment and run through a local bridge/proxy compatible with your client setup
|
|
690
|
+
|
|
691
|
+
## Examples
|
|
692
|
+
- Search: `{"query":"latest ruby news","time_range":"day"}`
|
|
693
|
+
- Read URL: `{"url":"https://example.com","section":"Introduction","maxLength":2000}`
|
|
694
|
+
- Read FTP: `{"url":"ftp://ftp.example.com/path/file.txt"}`
|
|
695
|
+
- Read SFTP: `{"url":"sftp://user@example.com/path/file.txt"}`
|
|
696
|
+
- Read SMB: `{"url":"smb://fileserver/share/path/file.txt"}`
|
|
697
|
+
- Read Gemini: `{"url":"gemini://geminiprotocol.net"}`
|
|
698
|
+
- Read IPFS: `{"url":"ipfs://bafybeigdyrzt.../index.html"}`
|
|
699
|
+
MD
|
|
700
|
+
end
|
|
701
|
+
end
|
|
130
702
|
end
|
|
131
703
|
end
|
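Note on `paragraph_range` in `data/lib/searxng/server.rb`: judging from `extract_paragraph_range` in the diff above, the option appears to accept three forms, `N`, `N-`, and `N-M`, counted from 1 with an inclusive upper bound. The following is a minimal standalone sketch of that behaviour, not the gem's public API: `select_paragraphs` and `SAMPLE` are hypothetical names introduced only for illustration, and the logic is copied from the added lines rather than taken from any documentation.

```ruby
# Hypothetical standalone helper (not part of the gem) that mirrors the
# logic of extract_paragraph_range as it appears in the diff above.
def select_paragraphs(markdown, range)
  # Paragraphs are runs of text separated by one or more blank lines.
  paragraphs = markdown.split(/\n{2,}/).map(&:strip).reject(&:empty?)

  # Accepts "N", "N-" (open-ended), or "N-M"; anything else is rejected.
  match = range.to_s.strip.match(/\A(\d+)(?:-(\d*))?\z/)
  return "" unless match

  # Ranges are 1-based, so "1" maps to index 0.
  start = match[1].to_i - 1
  return "" if start.negative? || start >= paragraphs.length

  if match[2].nil?
    paragraphs[start].to_s                           # single paragraph: "N"
  elsif match[2].empty?
    paragraphs[start..].join("\n\n")                 # to the end: "N-"
  else
    paragraphs[start...match[2].to_i].join("\n\n")   # inclusive upper bound: "N-M"
  end
end

# Sample input assumed for illustration only.
SAMPLE = "one\n\ntwo\n\nthree\n\nfour"

puts select_paragraphs(SAMPLE, "2-3")
```

An empty string is what `apply_options` turns into the "invalid or out of bounds" message, so a reversed range such as `3-2` and a start of `0` both end up reported as invalid rather than raising.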
data/lib/searxng/version.rb
CHANGED
data/sig/searxng.rbs
CHANGED
|
@@ -12,6 +12,16 @@ module Searxng
|
|
|
12
12
|
attr_reader uri: String?
|
|
13
13
|
end
|
|
14
14
|
|
|
15
|
+
module ErrorMessages
|
|
16
|
+
def self.configuration_missing_url: () -> String
|
|
17
|
+
def self.configuration_invalid_url: (untyped url) -> String
|
|
18
|
+
def self.configuration_invalid_protocol: (untyped protocol) -> String
|
|
19
|
+
def self.configuration_auth_pair: () -> String
|
|
20
|
+
def self.no_results: (String query) -> String
|
|
21
|
+
def self.network_error: (Exception exception, ?uri: URI::Generic?, ?target: String) -> String
|
|
22
|
+
def self.api_error: (Integer status_code, String status_message) -> String
|
|
23
|
+
end
|
|
24
|
+
|
|
15
25
|
class Client
|
|
16
26
|
def initialize: (
|
|
17
27
|
?base_url: String?,
|
|
@@ -23,7 +33,7 @@ module Searxng
|
|
|
23
33
|
?ca_path: String?,
|
|
24
34
|
?verify_mode: Integer?,
|
|
25
35
|
?user_agent: String?,
|
|
26
|
-
?configure_http: (Net::HTTP, URI::Generic) -> void
|
|
36
|
+
?configure_http: ((Net::HTTP, URI::Generic) -> void)?
|
|
27
37
|
) -> void
|
|
28
38
|
def search: (
|
|
29
39
|
String query,
|