gitlab-secret_detection 0.1.0 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d72d76a2e018b5e645b7e3e2e7c4277ad1b6d464300a450ea118a8335bc3f962
- data.tar.gz: e43b97d92391548323a25ae88a0814cc1ee9770373ea9646bb52982e9085d770
+ metadata.gz: ac680c8a2cba1e9e00ac740981dd8fef0dcbd4e77ef24c6b127179070bec8750
+ data.tar.gz: 8ba958d266f9a42d40e2e4854282efaa2ee9b2c1ac01e139487c526a6dd58f95
  SHA512:
- metadata.gz: 29e37b4f36f013350f2ce86c3dba13c9af689e5091608b2c259ba64ff0d3cd8bf200f07d6ee381dc1effd15e2eb657a1a5e56c8d4de23163e2452149001b265d
- data.tar.gz: 0ac708cee73ec8f5b3b32545c31481cffab6cc648e5963f50115d20495cca0d8ce8f07e29a9285da53b0e55f7ee30ad9a8fcfe9b2925f07dd16c0a337b336864
+ metadata.gz: bfcee35fa4f88c8de6d7a9a3338e4c45e88786ea0eda8d531f554c55711e6b2d63c4f21ab481bc8c1ce647a96bc43058a9da945d269537ff1a0c69de43e74549
+ data.tar.gz: 9783adeb2b7718a23352635794938b33e04a0bb0ac9c13ee02a92335b9ddeab2cd8eb4b6968f6082174119677a667c447134fe4bcd8f1502fc3648ca105b4160
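For context, these digests cover the `metadata.gz` and `data.tar.gz` members of the packaged gem. A minimal sketch of checking one of them with Ruby's standard `digest` library follows. The helper name `digest_matches?` and the `ALGORITHMS` table are hypothetical, and this is not how RubyGems itself performs verification.

```ruby
require 'digest'

# Hypothetical lookup from the section names used in checksums.yaml
# to the matching stdlib digest classes.
ALGORITHMS = {
  'SHA256' => Digest::SHA256,
  'SHA512' => Digest::SHA512
}.freeze

# Hypothetical helper: returns true when `bytes` hashes to `expected_hex`
# under the named algorithm. Hex case is ignored.
def digest_matches?(bytes, algorithm, expected_hex)
  ALGORITHMS.fetch(algorithm).hexdigest(bytes) == expected_hex.downcase
end

# Usage (the path is a placeholder for wherever the .gem was unpacked):
#   bytes = File.binread('data.tar.gz')
#   digest_matches?(bytes, 'SHA256', expected_value_from_checksums_yaml)
```

An unknown algorithm name raises `KeyError` through `fetch`, which seems preferable to silently treating an unsupported section as a match.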
data/LICENSE ADDED
@@ -0,0 +1,19 @@
+ Copyright GitLab B.V.
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,334 @@
+ # Secret Detection Service
+
+ The Secret Detection service detects secrets in the given input payloads. It exposes RPC methods as its communication interface, served via gRPC.
+ This service will initially be invoked by the Rails monolith when performing access checks for Git push events, and will eventually be extended to other use cases.
+
+ Reference Issue: https://gitlab.com/groups/gitlab-org/-/epics/13792
+
+ #### Tools and Framework
+
+ - Ruby `3.3.3`
+ - gRPC framework for serving RPC requests
+
+ ## Feature Distribution
+
+ In addition to offering the feature as a gRPC service, this project can also distribute the same feature as a Ruby gem.
+ This provision was added to address [certain limitations](https://gitlab.com/gitlab-org/gitlab/-/issues/462359#note_1915874628). The following illustration shows
+ the approach:
+
+ ![Feature Distribution](./doc/rpc_service.png "Feature Distribution")
+
+ ## Project Layout
+
+ ```
+ ├── .runway
+ │   ├── runway.yml                       # Runway configuration file for Production environment
+ │   ├── runway-staging.yml               # Runway configuration file for Staging environment
+ │   └── env-*.yml                        # Environment vars for the respective environments. Uses vault for secrets
+ ├── bin
+ │   └── start_server                     # gRPC server initiator that loads the server configuration file
+ ├── config
+ │   └── log.rb                           # Logger configuration
+ ├── ci-templates
+ │   ├── build.yml                        # CI jobs for building container image and ruby gems
+ │   ├── test.yml                         # CI jobs for running tests
+ │   └── runway.yml                       # Runway-downstream trigger and overriding some of runway's downstream CI jobs
+ ├── lib
+ │   └── gitlab
+ │       └── secret_detection
+ │           ├── version.rb               # Secret Detection Gem release version
+ │           ├── core/..                  # Secret detection logic (most of it pulled from existing gem)
+ │           └── grpc
+ │               ├── generated/..         # gRPC generated files and secret detection gRPC service
+ │               └── scanner_service.rb   # Secret detection gRPC service implementation
+ ├── examples
+ │   └── sample-client/..                 # Sample Ruby RPC client that connects with gRPC server and calls RPC scan
+ ├── proto
+ │   └── secret_detection.proto           # Service Definition file containing gRPC request and response interface
+ ├── server
+ │   ├── interceptors/..                  # gRPC server-side interceptors like Auth, Log etc.
+ │   └── server.rb                        # gRPC server file with configuration
+ ├── spec/..                              # Rspec tests and related test helpers
+ ├── gitlab-secret_detection.gemspec      # Gemspec file for Ruby Gem
+ ├── Dockerfile                           # Dockerfile for running gRPC server
+ └── Makefile                             # All the CLI commands placed here
+ ```
+
+ ### Makefile commands
+
+ Usage: `make <command>`
+
+ | Command | Description |
+ |---------------------|---------------------------------------------------------------------------------------------------------------------------------|
+ | `install` | Installs Ruby gems in the project using Ruby Bundler |
+ | `lint_fix` | Fixes all the fixable Rubocop lint offenses |
+ | `gem_clean` | Cleans the existing gem file (if any) generated through the gem build process |
+ | `gem_build` | Builds the Ruby gem file wrapping the secret detection logic (lib directory) |
+ | `generate_proto` | Generates Ruby (`.rb`) files for the Protobuf service definition files (`.proto`) |
+ | `grpc_docker_build` | Builds a Docker container image for the gRPC server |
+ | `grpc_docker_serve` | Runs the gRPC server via a Docker container listening on port 8080. Run the `grpc_docker_build` make command before running this command. |
+ | `grpc_serve` | Runs the gRPC server on the CLI listening on port 50001. Run the `install` make command before running this command. |
+ | `run_core_tests` | Runs RSpec tests for the Secret Detection core logic |
+ | `run_grpc_tests` | Runs RSpec tests for the Secret Detection gRPC endpoints |
+ | `run_all_tests` | Runs all the RSpec tests in the project |
+
+ ## Generating a ruby gem
+
+ In the project directory, run the `make gem` command in the terminal to build a Ruby gem (e.g. `secret_detection-0.1.0.gem`) in the root of
+ the project directory.
+
+ ## Server
+
+ This project currently runs on a **gRPC** server.
+
+ ### Service Definitions
+
+ - Scanner Service: [ProtoBuf Service Definition](proto/secret_detection.proto)
+
+ - Health Check Service: [ProtoBuf Service Definition](https://github.com/grpc/grpc/blob/v1.64.x/doc/health-checking.md#service-definition)
+
+
+ ### Running the server locally
+
+ Pre-requisite: gRPC installed on your system (`brew install grpc`)
+
+ | server | mode | command | Listening port |
+ |--------|--------|----------------------------------------------------|----------------|
+ | gRPC | CLI | `make grpc_serve` | `50001` |
+ | gRPC | Docker | `make grpc_docker_build && make grpc_docker_serve` | `8080` |
+
+ The gRPC server port can be configured via the `RPC_SERVER_PORT` environment variable.
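As a rough illustration of the port override described above, the following stdlib-only sketch resolves the listening port from `RPC_SERVER_PORT` and falls back to the CLI default of `50001` from the table. The helper names `rpc_server_port` and `rpc_bind_address`, and the `0.0.0.0` bind host, are assumptions made for this example. The project's actual `server/server.rb` may resolve the port differently.

```ruby
# Assumed fallback: the CLI port listed in the table above.
DEFAULT_RPC_PORT = 50001

# Hypothetical helper: reads RPC_SERVER_PORT from the given environment
# (ENV by default) and converts it to an Integer. Non-numeric values raise
# ArgumentError instead of being silently ignored.
def rpc_server_port(env = ENV)
  Integer(env.fetch('RPC_SERVER_PORT', DEFAULT_RPC_PORT))
end

# Hypothetical helper: builds a host:port string to bind a server to.
# The 0.0.0.0 host is an assumption for illustration.
def rpc_bind_address(env = ENV)
  "0.0.0.0:#{rpc_server_port(env)}"
end
```

Passing the environment as an argument keeps the helper easy to exercise with a plain Hash, for example `rpc_server_port('RPC_SERVER_PORT' => '8080')`.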
+
+ ### Calling gRPC endpoints from terminal
+
+ Pre-requisite:
+ - gRPC installed on your system (via `brew install grpc`)
+ - A gRPC client such as Postman, or `grpcurl` (via `brew install grpcurl`)
+
+ **Authentication:**
+
+ The RPC service uses a basic form of token-based authentication. When invoking an RPC request,
+ the client must add an `x-sd-auth` RPC header whose value depends
+ on the environment in which the server is running.
+
+ The auth token value when the server is running in:
+
+ - Localhost: `12345`
+ - Staging: `env/staging/service/secret-detection/API_AUTH_TOKEN` path in [Vault](https://vault.gitlab.net/)
+ - Production: `env/production/service/secret-detection/API_AUTH_TOKEN` path in [Vault](https://vault.gitlab.net/)
+
+ Note that only Secret Detection RPC requests are guarded with authentication.
+
+ More details can be found in this [issue](https://gitlab.com/gitlab-org/gitlab/-/issues/467531?work_item_iid=477700).
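To make the authentication rule above concrete, here is a minimal stdlib-only sketch of the decision a server-side check has to make: Scanner methods require a matching `x-sd-auth` value, while other services such as the health check pass through. The helper `request_allowed?` and both constants are hypothetical. The project's real interceptor lives under `server/interceptors/` and is not reproduced here.

```ruby
AUTH_HEADER = 'x-sd-auth'

# gRPC full method names have the form "/<package>.<Service>/<Method>".
SCANNER_METHOD_PREFIX = '/gitlab.secret_detection.Scanner/'

# Hypothetical helper: decides whether a call may proceed.
#   full_method    - e.g. "/gitlab.secret_detection.Scanner/Scan"
#   metadata       - Hash of request headers
#   expected_token - token configured for the current environment
# A production check would likely prefer a constant-time comparison.
def request_allowed?(full_method, metadata, expected_token)
  # Only Secret Detection RPCs are guarded; everything else passes through.
  return true unless full_method.start_with?(SCANNER_METHOD_PREFIX)

  token = metadata[AUTH_HEADER]
  !token.nil? && token == expected_token
end
```

With the localhost token from the list above, `request_allowed?('/gitlab.secret_detection.Scanner/Scan', { 'x-sd-auth' => '12345' }, '12345')` passes, while the same call without the header is rejected.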
+
+ #### Health Check
+
+ **RPC Method:** `grpc.health.v1.Health/Check`
+
+ <details><summary>Example</summary>
+
+ ```shell
+ $ grpcurl -plaintext -d '{"service":"gitlab.secret_detection.Scanner"}' localhost:50001 grpc.health.v1.Health/Check
+ ```
+
+ You should see the following response as a result:
+
+ ```json
+ {
+   "status": "SERVING"
+ }
+ ```
+
+ </details>
+
+ #### Secret Detection Scan (Unary RPC Call)
+
+ **RPC Method:** `gitlab.secret_detection.Scanner/Scan`
+ **Default Timeout (configurable):** per-request: `180 seconds`, per-payload: `30 seconds`
+
+
+ <details><summary>Example: Basic</summary>
+
+
+ ```shell
+ $ grpcurl -d @ \
+     -rpc-header 'x-sd-auth:12345' \
+     localhost:50001 \
+     gitlab.secret_detection.Scanner/Scan <<EOM
+ {
+   "payloads": [
+     {
+       "id": "94283",
+       "data": "glpat-12345123451234512345"
+     }
+   ]
+ }
+ EOM
+ ```
+
+ You should see the following response as a result:
+
+ ```json
+ {
+   "results": [
+     {
+       "payload_id": "94283",
+       "status": "FOUND",
+       "type": "gitlab_personal_access_token",
+       "description": "GitLab Personal Access Token",
+       "line_number": 1
+     }
+   ],
+   "status": "FOUND"
+ }
+ ```
+
+ </details>
+
+
+ <details><summary>Example: Using Allowlist and Tags</summary>
+
+ ```shell
+ $ grpcurl -d @ \
+     -rpc-header 'x-sd-auth:12345' \
+     localhost:50001 \
+     gitlab.secret_detection.Scanner/Scan <<EOM
+ {
+   "payloads": [
+     {
+       "id": "94283",
+       "data": "glpat-12345123451234512345"
+     },
+     {
+       "id": "37405",
+       "data": "glrt-12345123451234512345"
+     },
+     {
+       "id": "14954",
+       "data": "GR1348941__DUMMY_RUNNER_TOKEN"
+     }
+   ],
+   "allowlist": [
+     {
+       "allow_type": "ALLOW_RULE_TYPE",
+       "value": "gitlab_personal_access_token"
+     },
+     {
+       "allow_type": "ALLOW_RAW_VALUE",
+       "value": "glrt-12345123451234512345"
+     }
+   ],
+   "tags": [
+     "gitlab_blocking"
+   ]
+ }
+ EOM
+ ```
+
+ You should see the following response as a result:
+
+ ```json
+ {
+   "results": [
+     {
+       "payload_id": "14954",
+       "status": "STATUS_FOUND",
+       "type": "gitlab_pipeline_trigger_token",
+       "description": "GitLab Pipeline Trigger Token",
+       "line_number": 1
+     }
+   ],
+   "status": "STATUS_FOUND"
+ }
+ ```
+
+ </details>
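The allowlist and tag behaviour in the example above can be approximated with a small toy scanner. This is a sketch under stated assumptions, not the gem's scanner: the `Rule` struct, `TOY_RULES`, their regular expressions, the `toy_runner_token` rule id, and the "keep a rule if it shares any requested tag" semantics are all invented for illustration. Only the keyword names `rule_exclusions`, `allow_values` and `tags` mirror the arguments this README mentions for `scan(..)`.

```ruby
# Toy rule definition; the real ruleset ships with the gem and is not shown here.
Rule = Struct.new(:id, :pattern, :tags)

TOY_RULES = [
  Rule.new('gitlab_personal_access_token', /glpat-[0-9a-zA-Z_-]{20}/, %w[gitlab_blocking]),
  Rule.new('toy_runner_token', /glrt-[0-9a-zA-Z_-]{20}/, %w[toy_only])
].freeze

# Returns one { payload_id:, type: } hash per match that survives filtering.
def toy_scan(payloads, rules: TOY_RULES, rule_exclusions: [], allow_values: [], tags: [])
  # Drop rules that were allowlisted by rule type.
  active = rules.reject { |rule| rule_exclusions.include?(rule.id) }
  # Assumed tag semantics: keep rules sharing at least one requested tag.
  active = active.select { |rule| (rule.tags & tags).any? } unless tags.empty?

  payloads.flat_map do |payload|
    active.filter_map do |rule|
      match = payload[:data][rule.pattern]
      # Skip non-matches and raw values that were explicitly allowlisted.
      next if match.nil? || allow_values.include?(match)

      { payload_id: payload[:id], type: rule.id }
    end
  end
end
```

Filtering the rules before scanning, rather than filtering findings afterwards, means excluded rule types never run against the payload at all. Whether the real scanner does the same is not stated in this document.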
+
+
+ #### Secret Detection Scan (Bi-directional RPC Streaming)
+
+ **RPC Method:** `gitlab.secret_detection.Scanner/ScanStream`
+ **Default Timeout (configurable):** per-request: `180 seconds`, per-payload: `30 seconds`
+
+ Bi-directional RPC streaming allows the server to receive a continuous stream of requests and respond with a
+ corresponding stream of responses as each request is processed. Unlike unary RPC calls, where requests are sent once and
+ the channel closes after a single response, bi-directional streaming keeps the RPC channel open to handle ongoing
+ requests and responses, enabling continuous communication between the client and server.
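The request-in, response-out pairing described above can be modelled with plain Ruby enumerators. The sketch below is a toy, stdlib-only illustration and not the service's handler: `toy_scan_request`, `scan_stream`, and the `:found` / `:not_found` symbols are made up for this example.

```ruby
# Hypothetical stand-in for a real scan: flags any payload containing "glpat-".
def toy_scan_request(request)
  found = request[:payloads].any? { |payload| payload[:data].include?('glpat-') }
  { status: found ? :found : :not_found }
end

# Wraps a stream of requests in a stream of responses. Nothing is read from
# `requests` until a consumer asks for the next response, so each response is
# produced as soon as its request arrives instead of after the whole stream ends.
def scan_stream(requests)
  Enumerator.new do |responses|
    requests.each { |request| responses << toy_scan_request(request) }
  end
end
```

A consumer can pull responses one at a time with `scan_stream(requests).next`, which mirrors how the terminal example prints a server response after each JSON request is entered.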
+
+
+ <details><summary>Example</summary>
+
+ To try this out, we will accept the request input from STDIN. As and when you provide request input JSON in the terminal,
+ you should see a corresponding server response. The connection will remain open for accepting
+ requests unless you explicitly close the connection with `Ctrl+C`.
+
+ Here's the command to start an RPC streaming channel on the terminal:
+
+ ```shell
+ $ grpcurl -d @ -rpc-header 'x-sd-auth:12345' localhost:50001 gitlab.secret_detection.Scanner/ScanStream
+ ```
+
+ Once the channel is open, enter the following sample request input in the terminal:
+
+ ```json
+ {
+   "payloads": [
+     {
+       "id": "94283",
+       "data": "glpat-12345123451234512345"
+     }
+   ]
+ }
+ ```
+
+ You should immediately see the server's response, like below:
+
+ ```json
+ {
+   "results": [
+     {
+       "payload_id": "94283",
+       "status": "STATUS_FOUND",
+       "type": "gitlab_personal_access_token",
+       "description": "GitLab Personal Access Token",
+       "line_number": 1
+     }
+   ],
+   "status": "STATUS_FOUND"
+ }
+ ```
+
+ You may continue to provide more request inputs to the opened RPC channel to receive corresponding server responses. Features like allowlisting and tag filtering work the same as outlined under the Unary call section.
+
+ </details>
+
+ **NOTE**:
+ - If you're running the local server via Docker, replace `localhost:50001` with `localhost:8080` in the example `grpcurl` commands.
+ - If you want to access the **staging** server, replace `localhost:50001` with `secret-detection.staging.runway.gitlab.net:443`
+
+
+ ## How to invoke the server from the code (as an RPC Client)
+
+ There is a sample [Ruby RPC Client](examples/sample-client/sample_client.rb) in the project that contains the following code for reference:
+ - Setting up the connection with the server (both local and remote server with SSL)
+ - Invoking the Secret Detection unary call
+ - Invoking the Secret Detection bi-directional stream call
+
+ Run `ruby examples/sample-client/sample_client.rb` on your terminal to run the sample RPC client calling both unary and streaming RPC Scan methods.
+
+ ## Benchmark
+
+ The RPC service is benchmarked using [`ghz`](https://ghz.sh), a powerful CLI-based tool for load testing and benchmarking gRPC services. More details are available [here](benchmark/README.md).
+
+ ## Project Status
+
+ Secret Detection service's status can be tracked here: https://gitlab.com/gitlab-org/gitlab/-/issues/467531
+
+ #### Changes made in the secret detection logic that were previously not present in the Gem
+
+ - [GitLab::SecretDetection::Core::Scanner#initialize(...)](lib/gitlab/secret_detection/core/scanner.rb): To reuse the logic of ruleset parsing from a file source, we parse the ruleset file once and pass the parsed rules around. So,
+ the `initialize()` method now accepts parsed rules instead of the ruleset file path
+ - [GitLab::SecretDetection::Core::Status](lib/gitlab/secret_detection/core/status.rb): `NOT_FOUND` status moved from `0` to `7` since
+ gRPC reserves `0` for enums. We need to reflect this change on the Rails side too
+ - [GitLab::SecretDetection::Core::Scanner#scan(...)](lib/gitlab/secret_detection/core/scanner.rb): Introduced `rule_exclusions`, `allow_values` and `tags` args to the `scan(..)`
+ method to support the allowlist feature
data/config/log.rb ADDED
@@ -0,0 +1,23 @@
+ # frozen_string_literal: true
+
+ require 'grpc'
+ require 'logger'
+
+ # The SD_ENV env var determines which environment the server is
+ # running in. This var is defined in `.runway/env-<env>.yml` files.
+ def local_env?
+   ENV.fetch('SD_ENV', 'localhost') == 'localhost'
+ end
+
+ module SDLogger
+   LOGGER = Logger.new $stderr, level: local_env? ? Logger::DEBUG : Logger::INFO
+
+   def logger
+     LOGGER
+   end
+ end
+
+ # Configure logger for GRPC
+ module GRPC
+   extend SDLogger
+ end
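The file above wires the logger into the `GRPC` module with `extend`. A class in the project could pick up the same logger with `include` instead. The stdlib-only sketch below shows that pattern. `ToyLogger`, `ToyScanner` and `log_level_for` are hypothetical names, and `'staging'` is only an assumed example of a non-local `SD_ENV` value.

```ruby
require 'logger'

# Hypothetical helper mirroring the level choice in config/log.rb:
# DEBUG for the local environment, INFO for anything else.
def log_level_for(sd_env)
  sd_env == 'localhost' ? Logger::DEBUG : Logger::INFO
end

# Stand-in for SDLogger so the snippet runs without the grpc gem.
module ToyLogger
  LOGGER = Logger.new($stderr, level: log_level_for(ENV.fetch('SD_ENV', 'localhost')))

  def logger
    LOGGER
  end
end

# `include` exposes #logger on instances, whereas `extend` (as used for the
# GRPC module above) exposes it on the module or class itself.
class ToyScanner
  include ToyLogger

  def run
    logger.debug('scan started')
    :done
  end
end
```

Because the logger is a constant, every includer shares one `Logger` instance and therefore one log level.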
@@ -0,0 +1,40 @@
+ # frozen_string_literal: true
+
+ module GitLab
+   module SecretDetection
+     module Core
+       # Finding is a data object representing a secret finding identified within a payload
+       class Finding
+         attr_reader :payload_id, :status, :line_number, :type, :description
+
+         def initialize(payload_id, status, line_number = nil, type = nil, description = nil)
+           @payload_id = payload_id
+           @status = status
+           @line_number = line_number
+           @type = type
+           @description = description
+         end
+
+         def ==(other)
+           self.class == other.class && other.state == state
+         end
+
+         def to_h
+           {
+             payload_id:,
+             status:,
+             line_number:,
+             type:,
+             description:
+           }
+         end
+
+         protected
+
+         def state
+           [payload_id, status, line_number, type, description]
+         end
+       end
+     end
+   end
+ end
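A short usage sketch of the `Finding` value object follows. The class is repeated in condensed form, without the module nesting and with explicit hash keys, so that the snippet runs on its own. The status value `1` is a placeholder, since the real values are defined in `Core::Status`, which this diff does not show.

```ruby
# Condensed copy of the class above, repeated only to keep the snippet self-contained.
class Finding
  attr_reader :payload_id, :status, :line_number, :type, :description

  def initialize(payload_id, status, line_number = nil, type = nil, description = nil)
    @payload_id = payload_id
    @status = status
    @line_number = line_number
    @type = type
    @description = description
  end

  def ==(other)
    self.class == other.class && other.state == state
  end

  def to_h
    { payload_id: payload_id, status: status, line_number: line_number,
      type: type, description: description }
  end

  protected

  def state
    [payload_id, status, line_number, type, description]
  end
end

a = Finding.new('94283', 1, 1, 'gitlab_personal_access_token', 'GitLab Personal Access Token')
b = Finding.new('94283', 1, 1, 'gitlab_personal_access_token', 'GitLab Personal Access Token')

puts a == b            # true: same class and same field values
puts a.equal?(b)       # false: two distinct objects
puts [a, b].uniq.size  # 2: uniq relies on #hash and #eql?, which are not overridden
puts a.to_h[:type]     # gitlab_personal_access_token
```

Since only `==` is overridden, findings compare equal in specs and direct comparisons but are still treated as distinct by `uniq` and as Hash keys.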