gitlab-secret_detection 0.1.0 → 0.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d72d76a2e018b5e645b7e3e2e7c4277ad1b6d464300a450ea118a8335bc3f962
- data.tar.gz: e43b97d92391548323a25ae88a0814cc1ee9770373ea9646bb52982e9085d770
+ metadata.gz: ff032818993f54b85b958ad43293df4e9112ad20b9d68ab654abf74ebeb778a7
+ data.tar.gz: e0556ec3bc97e6973bbb6c8ba4e041897eccfbb37e23d4af16ff4fccd4850eb0
  SHA512:
- metadata.gz: 29e37b4f36f013350f2ce86c3dba13c9af689e5091608b2c259ba64ff0d3cd8bf200f07d6ee381dc1effd15e2eb657a1a5e56c8d4de23163e2452149001b265d
- data.tar.gz: 0ac708cee73ec8f5b3b32545c31481cffab6cc648e5963f50115d20495cca0d8ce8f07e29a9285da53b0e55f7ee30ad9a8fcfe9b2925f07dd16c0a337b336864
+ metadata.gz: ae963ac226ddab5d154879c078b44581f859232175ecf69269818a0e880368855469f10265a63d408089b5da7d56e37f51a865bbd9efc66ddd5474d1b718af0a
+ data.tar.gz: ff4c2bff936d88972a1e51563e2f6a0fb61f3c431213058fccae5e7abd18ac72636e37099a0dece9f083424795078bf2b69dd224aa88b27e6752bed1ae9d37ee
data/LICENSE ADDED
@@ -0,0 +1,19 @@
1
+ Copyright GitLab B.V.
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to deal
5
+ in the Software without restriction, including without limitation the rights
6
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7
+ copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in all
11
+ copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
16
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
17
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
18
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
19
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,337 @@
1
+ # Secret Detection Service
2
+
3
+ Secret Detection service is primarily responsible for detecting secrets in the given input payloads with RPC methods as the communication interface served via gRPC.
4
+ This service will initially be invoked by the Rails monolith when performing access checks for Git push events, and will eventually be extended to other use cases.
5
+
6
+ Reference Issue: https://gitlab.com/groups/gitlab-org/-/epics/13792
7
+
8
+ #### Tools and Framework
9
+
10
+ - Ruby `3.2.5`
11
+ - gRPC framework for serving RPC requests
12
+
13
+ ## Feature Distribution
14
+
15
+ In addition to offering the feature as a gRPC service, this project also includes a provision for distributing the same feature as a Ruby gem.
16
+ This provision was added to address [certain limitations](https://gitlab.com/gitlab-org/gitlab/-/issues/462359#note_1915874628). Here's the illustration representing
17
+ the approach:
18
+
19
+ ![Feature Distribution](./doc/rpc_service.png "Feature Distribution")
20
+
21
+ ## Project Layout
22
+
23
+ ```
24
+ ├── .runway
25
+ │ ├── runway.yml # Runway configuration file for Production environment
26
+ │ ├── runway-staging.yml # Runway configuration file for Staging environment
27
+ │ └── env-*.yml # Environment vars for the respective environments. Uses vault for secrets
28
+ ├── bin
29
+ │ └── start_server # gRPC server initiator that loads the server configuration file
30
+ ├── config
31
+ │ └── log.rb # Logger configuration
32
+ ├── ci
33
+ │ └── scripts/.. # CI scripts used in the CI templates
34
+ │ └── templates
35
+ │ ├── build.yml # CI jobs for building container image and ruby gems
36
+ │ ├── test.yml # CI jobs for running tests
37
+ │ └── release.yml # CI jobs for releases related to Ruby gem, GitLab releases
38
+ ├── lib
39
+ │ └── gitlab
40
+ │ └── secret_detection
41
+ │ ├── version.rb # Secret Detection Gem release version
42
+ │ ├── core/.. # Secret detection logic (most of it pulled from existing gem)
43
+ │ └── grpc
44
+ │ ├── generated/.. # gRPC generated files and secret detection gRPC service
45
+ │ ├── client/.. # gRPC client to invoke secret detection service's RPC endpoints
46
+ │ └── scanner_service.rb # Secret detection gRPC service implementation
47
+ ├── examples
48
+ │ └── sample-client/.. # Sample Ruby RPC client that connects with gRPC server and calls RPC scan
49
+ ├── proto
50
+ │ └── secret_detection.proto # Service Definition file containing gRPC request and response interface
51
+ ├── server
52
+ │ ├── interceptors/.. # gRPC server-side interceptors like Auth, Log etc.
53
+ │ └── server.rb # gRPC server file with configuration
54
+ ├── spec/.. # Rspec tests and related test helpers
55
+ ├── gitlab-secret_detection.gemspec # Gemspec file for Ruby Gem
56
+ ├── Dockerfile # Dockerfile for running gRPC server
57
+ └── Makefile # All the CLI commands placed here
58
+ ```
59
+
60
+ ### Makefile commands
61
+
62
+ Usage `make <command>`
63
+
64
+ | Command | Description |
65
+ |---------------------|---------------------------------------------------------------------------------------------------------------------------------|
66
+ | `install` | Installs ruby gems in the project using Ruby bundler |
67
+ | `lint_fix` | Fixes all the fixable Rubocop lint offenses |
68
+ | `gem_clean` | Cleans existing gem file(if any) generated through gem build process |
69
+ | `gem_build` | Builds Ruby gem file wrapping secret detection logic (lib directory) |
70
+ | `generate_proto` | Generates ruby(.rb) files for the Protobuf Service Definition files(.proto) |
71
+ | `grpc_docker_build` | Builds a docker container image for gRPC server |
72
+ | `grpc_docker_serve` | Runs gRPC server via docker container listening on port 8080. Run `grpc_docker_build` make command before running this command. |
73
+ | `grpc_serve` | Runs gRPC server on the CLI listening on port 50001. Run `install` make command before running this command. |
74
+ | `run_core_tests` | Runs RSpec tests for Secret Detection core logic |
75
+ | `run_grpc_tests` | Runs RSpec tests for Secret Detection gRPC endpoints |
76
+ | `run_all_tests` | Runs all the RSpec tests in the project |
77
+
78
+ ## Generating a ruby gem
79
+
80
+ In the project directory, run the `make gem` command in the terminal. It builds a Ruby gem (e.g. `secret_detection-0.1.0.gem`) in the root of
81
+ the project directory.
82
+
83
+ ## Server
84
+
85
+ This project currently runs on a **gRPC** server.
86
+
87
+ ### Service Definitions
88
+
89
+ - Scanner Service: [ProtoBuf Service Definition](proto/secret_detection.proto)
90
+
91
+ - Health Check Service: [ProtoBuf Service Definition](https://github.com/grpc/grpc/blob/v1.64.x/doc/health-checking.md#service-definition)
92
+
93
+
94
+ ### Running the server locally
95
+
96
+ Pre-requisite: gRPC installed on your system (`brew install grpc`)
97
+
98
+ | server | mode | command | Listening port |
99
+ |--------|--------|----------------------------------------------------|----------------|
100
+ | gRPC | CLI | `make grpc_serve` | `50001` |
101
+ | gRPC | Docker | `make grpc_docker_build && make grpc_docker_serve` | `8080` |
102
+
103
+ The gRPC server port can be configured via the `RPC_SERVER_PORT` environment variable.
104
+
105
+ ### Calling gRPC endpoints from terminal
106
+
107
+ Pre-requisite:
108
+ - gRPC installed on your system (via `brew install grpc`)
109
+ - A gRPC client such as Postman or `grpcurl` (install via `brew install grpcurl`)
110
+
111
+ **Authentication:**
112
+
113
+ The RPC service uses a basic form of token-based authentication. When invoking an RPC request
114
+ on the client side, we need to append an `x-sd-auth` RPC Header whose value is dependent
115
+ on the environment where the server is running.
116
+
117
+ The auth token value when the server is running in:
118
+
119
+ - Localhost: `12345`
120
+ - Staging: `env/staging/service/secret-detection/API_AUTH_TOKEN` path in [Vault](https://vault.gitlab.net/)
121
+ - Production: `env/production/service/secret-detection/API_AUTH_TOKEN` path in [Vault](https://vault.gitlab.net/)
122
+
123
+ Note that only Secret Detection RPC requests are guarded with authentication.
124
+
125
+ More details can be found in this [issue](https://gitlab.com/gitlab-org/gitlab/-/issues/467531?work_item_iid=477700).
126
+
127
+ #### Health Check
128
+
129
+ **RPC Method:** `grpc.health.v1.Health/Check`
130
+
131
+ <details><summary>Example</summary>
132
+
133
+ ```shell
134
+ $ grpcurl -plaintext -d '{"service":"gitlab.secret_detection.Scanner"}' localhost:50001 grpc.health.v1.Health/Check
135
+ ```
136
+
137
+ You should see the following response as a result:
138
+
139
+ ```shell
140
+ {
141
+ "status": "SERVING"
142
+ }
143
+ ```
144
+
145
+ </details>
146
+
147
+ #### Secret Detection Scan (Unary RPC Call)
148
+
149
+ **RPC Method:** `gitlab.secret_detection.Scanner/Scan`
150
+ **Default Timeout(configurable):** per-request: `180 seconds`, per-payload: `30 seconds`
151
+
152
+
153
+ <details><summary>Example: Basic</summary>
154
+
155
+
156
+ ```shell
157
+ $ grpcurl -d @ \
158
+ localhost:50001 \
159
+ -rpc-header 'x-sd-auth:12345' \
160
+ gitlab.secret_detection.Scanner/Scan <<EOM
161
+ {
162
+ "payloads": [
163
+ {
164
+ "id": "94283",
165
+ "data": "glpat-12345123451234512345"
166
+ }
167
+ ]
168
+ }
169
+ EOM
170
+ ```
171
+
172
+ You should see the following response as a result:
173
+
174
+ ```json
175
+ {
176
+ "results": [
177
+ {
178
+ "payload_id": "94283",
179
+ "status": "FOUND",
180
+ "type": "gitlab_personal_access_token",
181
+ "description": "GitLab Personal Access Token",
182
+ "line_number": 1
183
+ }
184
+ ],
185
+ "status": "FOUND"
186
+ }
187
+ ```
188
+
189
+ </details>
190
+
191
+
192
+ <details><summary>Example: Using Exclusions and Tags</summary>
193
+
194
+ ```shell
195
+ $ grpcurl -d @ \
196
+ localhost:50001 \
197
+ -rpc-header 'x-sd-auth:12345' \
198
+ gitlab.secret_detection.Scanner/Scan <<EOM
199
+ {
200
+ "payloads": [
201
+ {
202
+ "id": "94283",
203
+ "data": "glpat-12345123451234512345"
204
+ },
205
+ {
206
+ "id": "37405",
207
+ "data": "glrt-12345123451234512345"
208
+ },
209
+ {
210
+ "id": "14954",
211
+ "data": "GR1348941__DUMMY_RUNNER_TOKEN"
212
+ }
213
+ ],
214
+ "exclusions": [
215
+ {
216
+ "exclusion_type": "EXCLUSION_TYPE_RULE",
217
+ "value": "gitlab_personal_access_token"
218
+ },
219
+ {
220
+ "exclusion_type": "EXCLUSION_TYPE_RAW_VALUE",
221
+ "value": "glrt-12345123451234512345"
222
+ }
223
+ ],
224
+ "tags": [
225
+ "gitlab_blocking"
226
+ ]
227
+ }
228
+ EOM
229
+ ```
230
+
231
+ You should see the following response as a result:
232
+
233
+ ```json
234
+ {
235
+ "results": [
236
+ {
237
+ "payload_id": "14954",
238
+ "status": "STATUS_FOUND",
239
+ "type": "gitlab_pipeline_trigger_token",
240
+ "description": "GitLab Pipeline Trigger Token",
241
+ "line_number": 1
242
+ }
243
+ ],
244
+ "status": "STATUS_FOUND"
245
+ }
246
+ ```
247
+
248
+ </details>
249
+
250
+
251
+ #### Secret Detection Scan (Bi-directional RPC Streaming)
252
+
253
+ **RPC Method:** `gitlab.secret_detection.Scanner/ScanStream`
254
+ **Default Timeout(configurable):** per-request: `180 seconds`, per-payload: `30 seconds`
255
+
256
+ Bi-directional RPC streaming allows the server to receive a continuous stream of requests and respond with a
257
+ corresponding stream of responses as each request is processed. Unlike unary RPC calls, where requests are sent once and
258
+ the channel closes after a single response, bi-directional streaming keeps the RPC channel open to handle ongoing
259
+ requests and responses, enabling continuous communication between the client and server.
260
+
261
+
262
+ <details><summary>Example</summary>
263
+
264
+ To try this out, we will accept the request input from STDIN. As and when you provide request input JSON in the terminal,
265
+ you should see a corresponding server response. The connection will continue to remain open for accepting
266
+ requests unless you explicitly close the connection with `Ctrl+C`.
267
+
268
+ Here's the command to start an RPC streaming channel on the terminal:
269
+
270
+ ```shell
271
+ $ grpcurl -d @ localhost:50001 -rpc-header 'x-sd-auth:12345' gitlab.secret_detection.Scanner/ScanStream
272
+ ```
273
+
274
+ Once the channel is open, enter the following sample request input in the terminal:
275
+
276
+ ```json
277
+ {
278
+ "payloads": [
279
+ {
280
+ "id": "94283",
281
+ "data": "glpat-12345123451234512345"
282
+ }
283
+ ]
284
+ }
285
+ ```
286
+
287
+ You should immediately see the server's response, as shown below:
288
+
289
+ ```json
290
+ {
291
+ "results": [
292
+ {
293
+ "payload_id": "94283",
294
+ "status": "STATUS_FOUND",
295
+ "type": "gitlab_personal_access_token",
296
+ "description": "GitLab Personal Access Token",
297
+ "line_number": 1
298
+ }
299
+ ],
300
+ "status": "STATUS_FOUND"
301
+ }
302
+ ```
303
+
304
+ You may continue to provide more request inputs to the opened RPC channel to receive corresponding server responses. Features like Exclusions and Tag Filtering work the same way as outlined under the Unary call section.
305
+
306
+ </details>
307
+
308
+ **NOTE**:
309
+ - In case you're running the local server via Docker, replace `localhost:50001` with `localhost:8080` in the example `grpcurl` commands.
310
+ - In case you want to access **staging** server, replace `localhost:50001` with `secret-detection.staging.runway.gitlab.net:443`
311
+
312
+
313
+ ## How to invoke the server from the code (as an RPC Client)
314
+
315
+ There is a sample [Ruby RPC Client](examples/sample-client/sample_client.rb) in the project that contains the following code for reference:
316
+ - Setting up the connection with the server (both local and remote server with SSL)
317
+ - Invoke Secret Detection Unary call
318
+ - Invoke Secret Detection bi-directional stream call
319
+
320
+ Run `ruby examples/sample-client/sample_client.rb` on your terminal to run the sample RPC client calling both unary and streaming RPC Scan methods.
321
+
322
+ ## Benchmark
323
+
324
+ The RPC service is benchmarked using [`ghz`](https://ghz.sh), a CLI-based tool for load testing and benchmarking gRPC services. More details are available [here](https://gitlab.com/gitlab-org/gitlab/-/work_items/468107).
325
+
326
+ ## Project Status
327
+
328
+ Secret Detection service's status can be tracked here: https://gitlab.com/gitlab-org/gitlab/-/issues/467531
329
+
330
+ #### Changes made in the secret detection logic that were previously not present in the Gem
331
+
332
+ - [Gitlab::SecretDetection::Core::Scanner#initialize(...)](lib/gitlab/secret_detection/core/scanner.rb): To reuse the logic of ruleset parsing from a file source, we parse the ruleset file once and pass the parsed rules around. So,
333
+ the `initialize()` method now accepts parsed rules instead of ruleset file path
334
+ - [Gitlab::SecretDetection::Core::Status](lib/gitlab/secret_detection/core/status.rb): `NOT_FOUND` status moved from `0` to `7` since
335
+ gRPC reserves `0` for enums. We need to reflect this change on the Rails side too
336
+ - [Gitlab::SecretDetection::Core::Scanner#scan(...)](lib/gitlab/secret_detection/core/scanner.rb): Introduced `rule_exclusions`, `raw_value_exclusions` and `tags` args to `scan(..)`
337
+ method to support the [exclusions](https://gitlab.com/groups/gitlab-org/-/epics/14315) feature.
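
The two exclusion types named in the README can be pictured with a small standalone sketch. Everything in it is an assumption made for illustration: the `scan_with_exclusions` helper, the two rules and their patterns are placeholders, not the gem's ruleset or the real `Scanner#scan` signature. It only shows the idea that a rule exclusion removes a whole rule before matching, while a raw value exclusion drops one specific matched value.

```ruby
# Illustrative only: these two rules and their patterns are placeholders,
# not the ruleset shipped with the gem.
RULES = [
  { id: 'gitlab_personal_access_token', pattern: /glpat-[0-9A-Za-z_-]{20}/ },
  { id: 'gitlab_runner_auth_token', pattern: /glrt-[0-9A-Za-z_-]{20}/ }
].freeze

# Hypothetical stand-in for a scan with exclusions; the real
# Scanner#scan signature and return type may differ.
def scan_with_exclusions(payloads, rule_exclusions: [], raw_value_exclusions: [])
  # Rule exclusions drop whole rules before any matching happens.
  active_rules = RULES.reject { |rule| rule_exclusions.include?(rule[:id]) }

  payloads.flat_map do |payload|
    active_rules.filter_map do |rule|
      match = payload[:data][rule[:pattern]]
      # Raw value exclusions drop individual matches after detection.
      next if match.nil? || raw_value_exclusions.include?(match)

      { payload_id: payload[:id], type: rule[:id] }
    end
  end
end

# Sample payloads taken from the README's request examples.
PAYLOADS = [
  { id: '94283', data: 'glpat-12345123451234512345' },
  { id: '37405', data: 'glrt-12345123451234512345' }
].freeze

all_findings = scan_with_exclusions(PAYLOADS)
filtered = scan_with_exclusions(
  PAYLOADS,
  rule_exclusions: ['gitlab_personal_access_token'],
  raw_value_exclusions: ['glrt-12345123451234512345']
)

puts "without exclusions: #{all_findings.length} finding(s)"
puts "with exclusions: #{filtered.length} finding(s)"
```

Tag filtering is left out of the sketch because the diff does not show how rules are tagged.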
data/config/log.rb ADDED
@@ -0,0 +1,23 @@
+ # frozen_string_literal: true
+
+ require 'grpc'
+ require 'logger'
+
+ # SD_ENV env var is used to determine which environment the
+ # server is running. This var is defined in `.runway/env-<env>.yml` files.
+ def local_env?
+   ENV.fetch('SD_ENV', 'localhost') == 'localhost'
+ end
+
+ module SDLogger
+   LOGGER = Logger.new $stderr, level: local_env? ? Logger::DEBUG : Logger::INFO
+
+   def logger
+     LOGGER
+   end
+ end
+
+ # Configure logger for GRPC
+ module GRPC
+   extend SDLogger
+ end
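
As a rough illustration of the level selection in `config/log.rb`, the sketch below restates the same check with the environment passed in as an argument, so it can be tried without changing the real `ENV`. The `log_level_for` helper and the `'staging'` value are assumptions for the example; the diff only shows that an unset `SD_ENV` falls back to `localhost`.

```ruby
require 'logger'

# Same check as local_env? in config/log.rb, but parameterised on the
# environment hash (an assumption made here for easier experimentation).
def local_env?(env = ENV)
  env.fetch('SD_ENV', 'localhost') == 'localhost'
end

# Hypothetical helper: DEBUG for local runs, INFO everywhere else.
def log_level_for(env = ENV)
  local_env?(env) ? Logger::DEBUG : Logger::INFO
end

# SD_ENV unset: treated as localhost.
local_logger = Logger.new($stderr, level: log_level_for({}))
# 'staging' is an assumed SD_ENV value; any value other than 'localhost' behaves the same.
remote_logger = Logger.new($stderr, level: log_level_for({ 'SD_ENV' => 'staging' }))

puts "local logger emits debug messages: #{local_logger.debug?}"
puts "remote logger emits debug messages: #{remote_logger.debug?}"
```

Because `LOGGER` in the original file is a constant built at load time, `SD_ENV` has to be set before `config/log.rb` is required for the level to reflect it.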
@@ -0,0 +1,40 @@
+ # frozen_string_literal: true
+
+ module Gitlab
+   module SecretDetection
+     module Core
+       # Finding is a data object representing a secret finding identified within a payload
+       class Finding
+         attr_reader :payload_id, :status, :line_number, :type, :description
+
+         def initialize(payload_id, status, line_number = nil, type = nil, description = nil)
+           @payload_id = payload_id
+           @status = status
+           @line_number = line_number
+           @type = type
+           @description = description
+         end
+
+         def ==(other)
+           self.class == other.class && other.state == state
+         end
+
+         def to_h
+           {
+             payload_id:,
+             status:,
+             line_number:,
+             type:,
+             description:
+           }
+         end
+
+         protected
+
+         def state
+           [payload_id, status, line_number, type, description]
+         end
+       end
+     end
+   end
+ end
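
A short usage sketch of the `Finding` value object follows. To keep it standalone, it uses a condensed top-level copy of the class without the `Gitlab::SecretDetection::Core` namespace and with the hash written in longhand. The status value `1` is a placeholder, since the real `Status` constants are not part of this diff.

```ruby
# Condensed copy of the Finding class from the diff so the example runs standalone.
class Finding
  attr_reader :payload_id, :status, :line_number, :type, :description

  def initialize(payload_id, status, line_number = nil, type = nil, description = nil)
    @payload_id = payload_id
    @status = status
    @line_number = line_number
    @type = type
    @description = description
  end

  # Two findings are equal when they share a class and all five fields.
  def ==(other)
    self.class == other.class && other.state == state
  end

  def to_h
    { payload_id: payload_id, status: status, line_number: line_number,
      type: type, description: description }
  end

  protected

  # Protected rather than private so that == can call other.state.
  def state
    [payload_id, status, line_number, type, description]
  end
end

# Status 1 is a placeholder value, not a constant taken from the gem.
a = Finding.new('94283', 1, 1, 'gitlab_personal_access_token', 'GitLab Personal Access Token')
b = Finding.new('94283', 1, 1, 'gitlab_personal_access_token', 'GitLab Personal Access Token')
c = Finding.new('94283', 1, 2, 'gitlab_personal_access_token', 'GitLab Personal Access Token')

puts "a == b: #{a == b}"
puts "a == c: #{a == c}"
puts a.to_h
```

Note that only `==` is overridden. `eql?` and `hash` keep their default identity-based behaviour, so equal findings would still count as distinct hash keys or in `Array#uniq`.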