gitlab-secret_detection 0.19.0 → 0.21.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 24a3afdfa8519bd53576f9fe18ffffc12d9ed32d80e27610d03300666423e8a7
-   data.tar.gz: f7b000df1c5c6e712e528f388a44506f3cfce79bd6044de44ad106087f59d52f
+   metadata.gz: edc523ab4e978a6870d4ec9d8055390cfe54b96bf5d2db9c5165dc860b8068a3
+   data.tar.gz: 55c39c2c1862db5f17a183eb77bebd00bb7faebcb10bfc83ea5956d69f718031
  SHA512:
-   metadata.gz: 4ca94bc1d02d099c7f7404b321abbc1bf7be811112e564dc44c1256b42e5aeea6d4ab207280d0ccb88ef2c62f7f17eb3fbb47f6eadc717ef5cd33f4e66cf5e26
-   data.tar.gz: f8815666aa5ff2f129c40eb42814449d9201755a91ee2b4f3bb33b48bbe3364a9e3d78b36ec8adf2fdd94e51a03d448d9d36fbf12326f4312a14cb31843bdb96
+   metadata.gz: afa785c41b6b0af5f7f462a80b7618874b8d6b408e0fd8921b06bf4a58ae9cf75bd2e23352fd67fc840475d047c9a760172de8278152a44c4daa99a33c584e9a
+   data.tar.gz: 71a2de7afb06997658c98cd14b2167b05c1a4ca191bb854aef7f309a5d40c0353de1feeba462f35012cd0fed47f3e2a41a95a527a8dfe5e172b03d28f2d532e3
data/README.md CHANGED
@@ -62,20 +62,21 @@ the approach:
 
  Usage `make <command>`
 
- | Command | Description |
- |---------------------|---------------------------------------------------------------------------------------------------------------------------------|
- | `install_secret_detection_rules` | Downloads secret-detection-rules based on package version defined in RULES_VERSION |
- | `install` | Installs ruby gems in the project using Ruby bundler |
- | `lint_fix` | Fixes all the fixable Rubocop lint offenses |
- | `gem_clean` | Cleans existing gem file (if any) generated through gem build process |
- | `gem_build` | Builds Ruby gem file wrapping secret detection logic (lib directory) |
- | `generate_proto` | Generates Ruby (.rb) files for the Protobuf Service Definition files (.proto) |
- | `grpc_docker_build` | Builds a docker container image for gRPC server |
- | `grpc_docker_serve` | Runs gRPC server via docker container listening on port 8080. Run `grpc_docker_build` make command before running this command. |
- | `grpc_serve` | Runs gRPC server on the CLI listening on port 50001. Run `install` make command before running this command. |
- | `run_core_tests` | Runs RSpec tests for Secret Detection core logic |
- | `run_grpc_tests` | Runs RSpec tests for Secret Detection gRPC endpoints |
- | `run_all_tests` | Runs all the RSpec tests in the project |
+ | Command | Description |
+ |----------------------------------|---------------------------------------------------------------------------------------------------------------------------------|
+ | `install_secret_detection_rules` | Downloads secret-detection-rules based on package version defined in RULES_VERSION |
+ | `install` | Installs ruby gems in the project using Ruby bundler |
+ | `lint_fix` | Fixes all the fixable Rubocop lint offenses |
+ | `gem_clean` | Cleans existing gem file (if any) generated through gem build process |
+ | `gem_build` | Builds Ruby gem file wrapping secret detection logic (lib directory) |
+ | `generate_proto` | Generates Ruby (.rb) files for the Protobuf Service Definition files (.proto) |
+ | `grpc_docker_build` | Builds a docker container image for gRPC server |
+ | `grpc_docker_serve` | Runs gRPC server via docker container listening on port 8080. Run `grpc_docker_build` make command before running this command. |
+ | `grpc_serve` | Runs gRPC server on the CLI listening on port 50001. Run `install` make command before running this command. |
+ | `run_core_tests` | Runs RSpec tests for Secret Detection core logic |
+ | `run_grpc_tests` | Runs RSpec tests for Secret Detection gRPC endpoints |
+ | `run_utils_tests` | Runs RSpec tests for Secret Detection utilities |
+ | `run_all_tests` | Runs all the RSpec tests in the project |
 
 
  ## Secret Detection Rules
@@ -166,9 +167,9 @@ You should see the following response as a result:
 
 
  ```shell
- $ grpcurl -d @ \
- localhost:50001 \
+ grpcurl -plaintext -d @ \
  -rpc-header 'x-sd-auth:12345' \
+ localhost:50001 \
  gitlab.secret_detection.Scanner/Scan <<EOM
  {
  "payloads": [
@@ -335,15 +336,22 @@ Run `ruby examples/sample-client/sample_client.rb` on your terminal to run the s
 
  RPC service is benchmarked using [`ghz`](https://ghz.sh), a powerful CLI-based tool for load testing and benchmarking gRPC services. More details added [here](https://gitlab.com/gitlab-org/gitlab/-/work_items/468107).
 
- ## Project Status
+ ## Release Process
+
+ We do three primary actions for every merge to the `main` branch:
+
+ - **Build and Publish SD ruby gem to RubyGems.org**:
+   - The latest version for releasing Secret Detection gem is pulled from `Gitlab::SecretDetection::Gem::VERSION` (located at `lib/gitlab/secret_detection/version.rb`).
+   - We build a ruby gem for the code snapshot and tag it with the extracted release version.
+   - The script for publishing the gem to RubyGems.org is available [here](ci/scripts/publish_ruby_gem.sh).
 
- Secret Detection service's status can be tracked here: https://gitlab.com/gitlab-org/gitlab/-/issues/467531
+ - **Deploy SD gRPC server to GCP using Runway**:
+   - We build a docker container for the current code snapshot and tag it under `$CI_REGISTRY_IMAGE/image:$CI_COMMIT_SHORT_SHA` container registry path.
+   - The same container registry path is given as input to the Runway CI downstream which takes it forward for deploying in Staging and Production environments.
 
- #### Changes made in the secret detection logic that were previously not present in the Gem
+ - **Make a GitLab Release**:
+   - We use a modified version of [`upsert git tag`](https://gitlab.com/gitlab-org/security-products/ci-templates/-/blob/master/includes-dev/upsert-git-tag.ym) job where instead of fetching the version from the first changelog entry, we fetch it from `Gitlab::SecretDetection::Gem::VERSION`. The rest of the behaviour is retained i.e., creating a tag from the version and then creating a new GitLab release against that tag.
+   - The job pulls the description of the latest version entry from the [`CHANGELOG.md`](CHANGELOG.md) and uses it for the Release description.
+   - The script for creating a git tag and making GitLab release is available [here](ci/scripts/make_gitlab_release.sh).
 
- - [Gitlab::SecretDetection::Core::Scanner#initialize(...)](lib/gitlab/secret_detection/core/scanner.rb): To reuse the logic of ruleset parsing from a file source, we parse the ruleset file at once and pass the parsed rules around. So,
-   the `initialize()` method now accepts parsed rules instead of ruleset file path
- - [Gitlab::SecretDetection::Core::Status](lib/gitlab/secret_detection/core/status.rb): `NOT_FOUND` status moved from `0` to `7` since
-   gRPC reserves `0` for enums. We need to reflect this change on the Rails side too
- - [Gitlab::SecretDetection::Core::Scanner#scan(...)](lib/gitlab/secret_detection/core/scanner.rb): Introduced `rule_exclusions`, `raw_value_exclusions` and `tags` args to `scan(..)`
-   method to support [exclusions](https://gitlab.com/groups/gitlab-org/-/epics/14315) feature.
+ *NOTE: There is no logical requirement for the versions defined in `Gitlab::SecretDetection::Gem::VERSION` and latest entry of `CHANGELOG.md` to be the same. However, we expect them to be the same to keep it consistent. We've added a CI job ([`validate version sync`](ci/templates/validate.yml)) that ensures the version sync between them.*
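
The version-sync check mentioned in the README note can be pictured with a small sketch. This is not the CI job from `ci/templates/validate.yml`, whose contents are not part of this diff. The helper names `latest_changelog_version` and `versions_in_sync?` are hypothetical, and the sketch assumes changelog entries start with headings such as `## v0.21.0`.

```ruby
# Hypothetical helpers; the real validation job may work differently.
# Assumption: each changelog entry starts with a heading like "## v0.21.0".

# Returns the version of the first (latest) changelog entry, or nil if none is found.
def latest_changelog_version(changelog_text)
  changelog_text.each_line do |line|
    match = line.match(/\A##\s+v?(\d+\.\d+\.\d+)/)
    return match[1] if match
  end
  nil
end

# True when the gem version equals the latest changelog entry version.
def versions_in_sync?(gem_version, changelog_text)
  latest_changelog_version(changelog_text) == gem_version
end

sample_changelog = <<~MD
  # Changelog

  ## v0.21.0
  - Example entry

  ## v0.19.0
  - Older entry
MD

puts versions_in_sync?("0.21.0", sample_changelog)
puts versions_in_sync?("0.22.0", sample_changelog)
```

In a pipeline, a check along these lines would fail the job when the two versions differ, which matches the intent described in the note.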
@@ -7,6 +7,14 @@ module Gitlab
    module SecretDetection
      module Core
        class Ruleset
+         # RulesetParseError is thrown when the code fails to parse the
+         # ruleset file from the given path
+         RulesetParseError = Class.new(StandardError)
+
+         # RulesetCompilationError is thrown when the code fails to compile
+         # the predefined rulesets
+         RulesetCompilationError = Class.new(StandardError)
+
          # file path where the secrets ruleset file is located
          RULESET_FILE_PATH = File.expand_path('secret_push_protection_rules.toml', __dir__)
 
@@ -21,18 +29,37 @@ module Gitlab
            @rule_data = parse_ruleset
          end
 
+         def extract_ruleset_version
+           @ruleset_version ||= if File.readable?(RULESET_FILE_PATH)
+                                  first_line = File.open(RULESET_FILE_PATH, &:gets)
+                                  first_line&.split(":")&.[](1)&.strip
+                                end
+         rescue StandardError => e
+           logger.error(message: "Failed to extract Secret Detection Ruleset version from ruleset file: #{e.message}")
+         end
+
          private
 
          attr_reader :path, :logger
 
          # parses given ruleset file and returns the parsed rules
          def parse_ruleset
-           # rule_file_content = File.read(path)
+           logger.info(
+             message: "Parsing local ruleset file",
+             ruleset_path: RULESET_FILE_PATH
+           )
            rules_data = TomlRB.load_file(path, symbolize_keys: true).freeze
+           ruleset_version = extract_ruleset_version
+
+           logger.info(
+             message: "Ruleset details fetched for running Secret Detection scan",
+             total_rules: rules_data[:rules]&.length,
+             ruleset_version:
+           )
            rules_data[:rules].freeze
          rescue StandardError => e
-           logger.error "Failed to parse secret detection ruleset from '#{path}' path: #{e}"
-           raise Core::Scanner::RulesetParseError, e
+           logger.error(message: "Failed to parse local secret detection ruleset: #{e.message}")
+           raise RulesetParseError, e
          end
        end
      end
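
The new `extract_ruleset_version` method reads only the first line of the ruleset file and takes whatever follows the first colon. A minimal sketch of that parsing step is below. The helper name `version_from_first_line` and the sample header `# version: 0.5.0` are assumptions, since the diff does not show the contents of the ruleset file.

```ruby
# Hypothetical helper mirroring the safe-navigation chain used in the diff.
# Assumption: the first line of the ruleset file looks like "# version: 0.5.0".
def version_from_first_line(first_line)
  # split(":") yields ["# version", " 0.5.0\n"]; index 1 is the part after the first colon
  first_line&.split(":")&.[](1)&.strip
end

puts version_from_first_line("# version: 0.5.0\n").inspect
puts version_from_first_line("no separator here").inspect
puts version_from_first_line(nil).inspect
```

Because every step uses `&.`, an empty file (where `gets` returns `nil`) or a first line without a colon produces `nil` rather than raising.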
@@ -11,14 +11,6 @@ module Gitlab
      module Core
        # Scan is responsible for running Secret Detection scan operation
        class Scanner
-         # RulesetParseError is thrown when the code fails to parse the
-         # ruleset file from the given path
-         RulesetParseError = Class.new(StandardError)
-
-         # RulesetCompilationError is thrown when the code fails to compile
-         # the predefined rulesets
-         RulesetCompilationError = Class.new(StandardError)
-
          # default time limit(in seconds) for running the scan operation per invocation
          DEFAULT_SCAN_TIMEOUT_SECS = 180 # 3 minutes
          # default time limit(in seconds) for running the scan operation on a single payload
@@ -46,7 +38,7 @@ module Gitlab
              tags: DEFAULT_PATTERN_MATCHER_TAGS,
              include_missing_tags: false
            )
-           @default_pattern_matcher = build_pattern_matcher(
+           @default_pattern_matcher, @default_rules = build_pattern_matcher(
              tags: DEFAULT_PATTERN_MATCHER_TAGS,
              include_missing_tags: false
            ) # includes only gitlab_blocking rules
@@ -91,7 +83,6 @@ module Gitlab
            tags: DEFAULT_PATTERN_MATCHER_TAGS,
            subprocess: RUN_IN_SUBPROCESS
          )
-
            return Core::Response.new(status: Core::Status::INPUT_ERROR) unless validate_scan_input(payloads)
 
            # assign defaults since grpc passing zero timeout value to `Timeout.timeout(..)` makes it effectively useless.
@@ -106,17 +97,38 @@ module Gitlab
 
              next Core::Response.new(status: Core::Status::NOT_FOUND) if matched_payloads.empty?
 
+             # the pattern matcher will filter rules by tags so we use the filtered rule list
+             pattern_matcher, active_rules = build_pattern_matcher(tags:)
+
              scan_args = {
                payloads: matched_payloads,
                payload_timeout:,
-               pattern_matcher: build_pattern_matcher(tags:),
-               exclusions:
-             }
+               pattern_matcher:,
+               exclusions:,
+               rules: active_rules
+             }.freeze
+
+             logger.info(
+               message: "Scan input parameters for running Secret Detection scan",
+               timeout:,
+               payload_timeout:,
+               given_total_payloads: payloads.length,
+               scannable_payloads_post_keyword_filter: matched_payloads.length,
+               tags:,
+               run_in_subprocess: subprocess,
+               given_exclusions: format_exclusions_hash(exclusions)
+             )
 
              secrets, applied_exclusions = subprocess ? run_scan_within_subprocess(**scan_args) : run_scan(**scan_args)
 
              scan_status = overall_scan_status(secrets)
 
+             logger.info(
+               message: "Secret Detection scan completed with #{secrets.length} secrets detected in the given payloads",
+               detected_secrets_metadata: format_detected_secrets_metadata(secrets),
+               applied_exclusions: format_exclusions_arr(applied_exclusions)
+             )
+
              Core::Response.new(status: scan_status, results: secrets, applied_exclusions:)
            end
          rescue Timeout::Error => e
@@ -127,7 +139,7 @@ module Gitlab
 
          private
 
-         attr_reader :logger, :rules, :keywords, :default_pattern_matcher, :default_keyword_matcher
+         attr_reader :logger, :rules, :keywords, :default_pattern_matcher, :default_keyword_matcher, :default_rules
 
          # Builds RE2::Set pattern matcher for the given combination of rules
          # and tags. It also allows a choice(via `include_missing_tags`) to consider rules
@@ -135,31 +147,49 @@ module Gitlab
          # are same as +DEFAULT_PATTERN_MATCHER_TAGS+ then returns the eagerly loaded default
          # pattern matcher created during initialization.
          def build_pattern_matcher(tags:, include_missing_tags: false)
-           return default_pattern_matcher if tags.eql?(DEFAULT_PATTERN_MATCHER_TAGS) && !default_pattern_matcher.nil?
+           if tags.eql?(DEFAULT_PATTERN_MATCHER_TAGS) && !default_pattern_matcher.nil?
+             logger.info(
+               message: "Given tags input matches default matcher tags, using pre-defined RE2 Pattern Matcher"
+             )
+             return [default_pattern_matcher, default_rules]
+           end
+
+           logger.info(
+             message: "Creating a new RE2 Pattern Matcher with given tags",
+             tags:,
+             include_missing_tags:
+           )
+           active_rules = []
 
            matcher = RE2::Set.new
 
-           rules.each do |rule|
-             rule_tags = rule[:tags]
+           begin
+             rules.each do |rule|
+               rule_tags = rule[:tags]
 
-             include_rule = if tags.empty?
-                              true
-                            elsif rule_tags
-                              tags.intersect?(rule_tags)
-                            else
-                              include_missing_tags
-                            end
+               include_rule = if tags.empty?
+                                true
+                              elsif rule_tags
+                                tags.intersect?(rule_tags)
+                              else
+                                include_missing_tags
+                              end
 
-             matcher.add(rule[:regex]) if include_rule
+               active_rules << rule if include_rule
+               matcher.add(rule[:regex]) if include_rule
+             end
+           rescue StandardError => e
+             logger.error "Failed to add regex secret detection ruleset in RE::Set: #{e.message}"
+             raise Core::Ruleset::RulesetCompilationError, cause: e
            end
 
            unless matcher.compile
-             logger.error "Failed to compile secret detection rulesets in RE::Set"
+             logger.error "Failed to compile secret detection ruleset in RE::Set"
 
-             raise RulesetCompilationError
+             raise Core::Ruleset::RulesetCompilationError
            end
 
-           matcher
+           [matcher, active_rules]
          end
 
          # Creates and returns the unique set of rule matching keywords
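
The tag-selection logic in `build_pattern_matcher` now also collects the rules that pass the filter, so the caller gets the matcher and its rule list together. A minimal sketch of the selection step only is shown here, without RE2. The helper name `select_active_rules` and the sample rules are hypothetical, and `gitlab_blocking` is used as a sample tag taken from the comment in the diff.

```ruby
# Hypothetical helper: reproduces only the rule selection, not the RE2::Set compilation.
def select_active_rules(rules, tags:, include_missing_tags: false)
  rules.select do |rule|
    rule_tags = rule[:tags]

    if tags.empty?
      true                          # no tag filter: every rule is active
    elsif rule_tags
      !(tags & rule_tags).empty?    # at least one requested tag is present on the rule
    else
      include_missing_tags          # untagged rules are included only on request
    end
  end
end

sample_rules = [
  { id: "rule_a", tags: ["gitlab_blocking"] },
  { id: "rule_b", tags: ["other"] },
  { id: "rule_c" } # rule without tags
]

puts select_active_rules(sample_rules, tags: ["gitlab_blocking"]).map { |r| r[:id] }.inspect
puts select_active_rules(sample_rules, tags: ["gitlab_blocking"], include_missing_tags: true).map { |r| r[:id] }.inspect
puts select_active_rules(sample_rules, tags: []).map { |r| r[:id] }.inspect
```

Returning the filtered list alongside the matcher keeps match indices from the set aligned with the rules that were actually added to it, which appears to be the motivation for the `[matcher, active_rules]` return value.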
@@ -174,7 +204,18 @@ module Gitlab
          end
 
          def build_keyword_matcher(tags:, include_missing_tags: false)
-           return default_keyword_matcher if tags.eql?(DEFAULT_PATTERN_MATCHER_TAGS) && !default_keyword_matcher.nil?
+           if tags.eql?(DEFAULT_PATTERN_MATCHER_TAGS) && !default_keyword_matcher.nil?
+             logger.info(
+               message: "Given tags input matches default tags, using pre-defined RE2 Keyword Matcher"
+             )
+             return default_keyword_matcher
+           end
+
+           logger.info(
+             message: "Creating a new RE2 Keyword Matcher..",
+             tags:,
+             include_missing_tags:
+           )
 
            include_keywords = Set.new
 
@@ -187,15 +228,28 @@ module Gitlab
              include_keywords.merge(rule[:keywords]) unless rule[:keywords].nil?
            end
 
-           return nil if include_keywords.empty?
+           if include_keywords.empty?
+             logger.error(
+               message: "No rule keywords found a match with given rule tags, returning empty RE2 Keyword Matcher"
+             )
+             return nil
+           end
 
            keywords_regex = include_keywords.join('|')
 
+           logger.debug(
+             message: "Creating RE2 Keyword Matcher with set of rule keywords",
+             keywords: include_keywords.to_a
+           )
+
            RE2("\\b(#{keywords_regex})")
          end
 
          def filter_by_keywords(keyword_matcher, payloads)
-           return [] if keyword_matcher.nil?
+           if keyword_matcher.nil?
+             logger.warn "No RE2 Keyword Matcher instance available, skipping payload filter by rule keywords step.."
+             return payloads
+           end
 
            matched_payloads = []
            payloads.each do |payload|
@@ -204,6 +258,20 @@ module Gitlab
              matched_payloads << payload
            end
 
+           total_payloads_retained = matched_payloads.length == payloads.length ? 'all' : matched_payloads.length
+           log_message = if matched_payloads.empty?
+                           "No payloads available to scan further after keyword-matching, exiting Secret Detection scan"
+                         else
+                           "Retained #{total_payloads_retained} payloads to scan further after keyword-matching step"
+                         end
+
+           logger.info(
+             message: log_message,
+             given_total_payloads: payloads.length,
+             matched_payloads: matched_payloads.length,
+             payloads_to_scan_further: matched_payloads.map(&:id)
+           )
+
            matched_payloads
          end
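
The keyword pre-filter changes behaviour in one notable way: when no keyword matcher is available, `filter_by_keywords` now returns every payload instead of an empty list. A minimal sketch of that flow follows, using Ruby's built-in `Regexp` as a stand-in for RE2. The names `SamplePayload`, `build_keyword_regex` and `prefilter_payloads` are hypothetical, and the sketch escapes keywords with `Regexp.escape`, which the gem's code does not do.

```ruby
# Stand-in for the gem's payload objects; only `id` and `data` are used here.
SamplePayload = Struct.new(:id, :data)

# Builds a single alternation regex from rule keywords, or nil when there are none.
# Assumption: Ruby's Regexp is used instead of RE2 to keep the sketch dependency-free.
def build_keyword_regex(keywords)
  return nil if keywords.empty?

  Regexp.new("\\b(#{keywords.map { |k| Regexp.escape(k) }.join('|')})")
end

# Keeps only payloads containing at least one keyword.
# With no matcher, all payloads are passed through to the full scan.
def prefilter_payloads(keyword_regex, payloads)
  return payloads if keyword_regex.nil?

  payloads.select { |payload| keyword_regex.match?(payload.data) }
end

payloads = [
  SamplePayload.new("1", "token = glpat-example"),
  SamplePayload.new("2", "nothing interesting here")
]

regex = build_keyword_regex(["glpat", "AKIA"])
puts prefilter_payloads(regex, payloads).map(&:id).inspect
puts prefilter_payloads(nil, payloads).map(&:id).inspect
```

Passing all payloads through when the matcher is missing trades some scan time for not silently skipping payloads.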
209
277
 
@@ -214,22 +282,29 @@ module Gitlab
            payloads:,
            payload_timeout:,
            pattern_matcher:,
-           exclusions: {}
+           exclusions: {},
+           rules: []
          )
            all_applied_exclusions = Set.new
 
+           logger.info(
+             message: "Running Secret Detection scan sequentially",
+             payload_timeout:
+           )
+
            all_findings = payloads.flat_map do |payload|
              Timeout.timeout(payload_timeout) do
                findings, applied_exclusions = find_secrets_in_payload(
                  payload:,
                  pattern_matcher:,
-                 exclusions:
+                 exclusions:,
+                 rules:
                )
                all_applied_exclusions.merge(applied_exclusions)
                findings
              end
            rescue Timeout::Error => e
-             logger.error "Secret Detection scan timed out on the payload(id:#{payload.id}): #{e}"
+             logger.warn "Secret Detection scan timed out on the payload(id:#{payload.id}): #{e}"
 
              Core::Finding.new(payload.id,
                Core::Status::PAYLOAD_TIMEOUT)
@@ -241,14 +316,22 @@ module Gitlab
            payloads:,
            payload_timeout:,
            pattern_matcher:,
-           exclusions: {}
+           exclusions: {},
+           rules: []
          )
            all_applied_exclusions = Set.new
+
            payload_sizes = payloads.map(&:size)
            grouped_payload_indices = group_by_chunk_size(payload_sizes)
 
            grouped_payloads = grouped_payload_indices.map { |idx_arr| idx_arr.map { |i| payloads[i] } }
 
+           logger.info(
+             message: "Running Secret Detection scan within a subprocess",
+             grouped_payloads: grouped_payloads.length,
+             payload_timeout:
+           )
+
            found_secrets = Parallel.flat_map(
              grouped_payloads,
              in_processes: MAX_PROCS_PER_REQUEST,
@@ -259,13 +342,14 @@ module Gitlab
                  findings, applied_exclusions = find_secrets_in_payload(
                    payload:,
                    pattern_matcher:,
-                   exclusions:
+                   exclusions:,
+                   rules:
                  )
                  all_applied_exclusions.merge(applied_exclusions)
                  findings
                end
              rescue Timeout::Error => e
-               logger.error "Secret Detection scan timed out on the payload(id:#{payload.id}): #{e}"
+               logger.warn "Secret Detection scan timed out on the payload(id:#{payload.id}): #{e}"
 
                Core::Finding.new(payload.id, Core::Status::PAYLOAD_TIMEOUT)
              end
@@ -277,7 +361,7 @@ module Gitlab
          # Finds secrets in the given payload guarded with a timeout as a circuit breaker. It accepts
          # literal values to exclude from the input before the scan, also SD rules to exclude during
          # the scan.
-         def find_secrets_in_payload(payload:, pattern_matcher:, exclusions: {})
+         def find_secrets_in_payload(payload:, pattern_matcher:, exclusions: {}, rules: @default_rules)
            findings = []
            applied_exclusions = Set.new
 
@@ -291,8 +375,10 @@ module Gitlab
              .each_with_index do |line, index|
              unless raw_value_exclusions.empty?
                raw_value_exclusions.each do |exclusion|
-                 line.gsub!(exclusion.value, '') # replace input that doesn't contain allowed value in it
-                 applied_exclusions << exclusion
+                 # replace input that doesn't contain allowed value in it
+                 # replace exclusion value, `.gsub!` returns 'self' if replaced otherwise 'nil'
+                 excl_replaced = !!line.gsub!(exclusion.value, '')
+                 applied_exclusions << exclusion if excl_replaced
                end
              end
 
@@ -323,6 +409,13 @@ module Gitlab
              end
            end
 
+           logger.info(
+             message: "Secret Detection scan found #{findings.length} secret leaks in the payload(id:#{payload.id})",
+             payload_id: payload.id,
+             detected_rules: findings.map { |f| "#{f.type}:#{f.line_number}" },
+             applied_exclusions: format_exclusions_arr(applied_exclusions)
+           )
+
            [findings, applied_exclusions]
          rescue StandardError => e
            logger.error "Secret Detection scan failed on the payload(id:#{payload.id}): #{e}"
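
The raw-value exclusion change in `find_secrets_in_payload` relies on the return value of `String#gsub!`, which is the string itself when at least one substitution happened and `nil` otherwise. A minimal sketch of that behaviour is below. The helper name `strip_exclusion!` and the sample values are hypothetical.

```ruby
# Hypothetical helper: removes an excluded value from the line in place and
# reports whether anything was actually removed.
def strip_exclusion!(line, excluded_value)
  # gsub! returns nil when nothing matched, so !! turns the result into true/false
  !!line.gsub!(excluded_value, '')
end

line = +"key=ABC123XYZ other=ABC123XYZ" # unary plus gives a mutable string

puts strip_exclusion!(line, "ABC123XYZ")
puts line
puts strip_exclusion!(line, "not-present")
```

With this check, an exclusion is recorded as applied only for lines where it removed something, whereas the previous code recorded every raw-value exclusion regardless of whether it matched.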
@@ -338,10 +431,20 @@ module Gitlab
          # Validates the given payloads by verifying the type and
          # presence of `id` and `data` fields necessary for the scan
          def validate_scan_input(payloads)
-           return false if payloads.nil? || !payloads.instance_of?(Array)
+           if payloads.nil? || !payloads.instance_of?(Array)
+             logger.debug(message: "Scan input validation error: payloads arg is empty or not instance of array")
+             return false
+           end
 
            payloads.all? do |payload|
-             payload.respond_to?(:id) && payload.respond_to?(:data)
+             has_valid_fields = payload.respond_to?(:id) && payload.respond_to?(:data) && payload.data.is_a?(String)
+             unless has_valid_fields
+               logger.debug(
+                 message: "Scan input validation error: one of the payloads does not respond to `id` or `data`"
+               )
+             end
+
+             has_valid_fields
            end
          end
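
Input validation now also requires `data` to be a `String`, so a payload that responds to `data` but carries `nil` or another type is rejected. A minimal sketch of the check, without logging, is shown here. The names `InputPayload` and `valid_scan_input?` are hypothetical stand-ins for the gem's payload type and its `validate_scan_input` method.

```ruby
# Stand-in for the gem's payload objects.
InputPayload = Struct.new(:id, :data)

# Hypothetical helper mirroring the validation in the diff, minus the logger calls.
def valid_scan_input?(payloads)
  # nil and non-Array inputs are rejected up front
  return false unless payloads.instance_of?(Array)

  payloads.all? do |payload|
    payload.respond_to?(:id) && payload.respond_to?(:data) && payload.data.is_a?(String)
  end
end

puts valid_scan_input?([InputPayload.new("1", "some text")])
puts valid_scan_input?([InputPayload.new("1", nil)])
puts valid_scan_input?(nil)
```

Note that an empty array passes this check, because `all?` on an empty collection returns `true`; this matches the structure of the original method.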
 
@@ -390,6 +493,75 @@ module Gitlab
 
            chunk_indexes
          end
+
+         # Returns array of strings with each representing a masked exclusion
+         #
+         # Example: For given arg exclusions = {
+         #   rule: ["gitlab_personal_access_token", "aws_key"],
+         #   path: ["test.py"],
+         #   raw_value: ["ABC123XYZ"]
+         # }
+         #
+         # The output will look like the following:
+         # [
+         #   "rules=gitlab_personal_access_token, aws_key",
+         #   "raw_values=AB*****YZ",
+         #   "paths=test.py"
+         # ]
+         def format_exclusions_hash(exclusions = {})
+           masked_raw_values = exclusions.fetch(:raw_value, []).map do |exclusion|
+             Gitlab::SecretDetection::Utils::Masker.mask_secret(exclusion.value)
+           end.join(", ")
+           paths = exclusions.fetch(:path, []).map(&:value).join(", ")
+           rules = exclusions.fetch(:rule, []).map(&:value).join(", ")
+
+           out = []
+
+           out << "rules=#{rules}" unless rules.empty?
+           out << "raw_values=#{masked_raw_values}" unless masked_raw_values.empty?
+           out << "paths=#{paths}" unless paths.empty?
+
+           out
+         end
+
+         def format_exclusions_arr(exclusions = [])
+           return [] if exclusions.empty?
+
+           masked_raw_values = Set.new
+           paths = Set.new
+           rules = Set.new
+
+           exclusions.each do |exclusion|
+             case exclusion.exclusion_type
+             when :EXCLUSION_TYPE_RAW_VALUE
+               masked_raw_values << Gitlab::SecretDetection::Utils::Masker.mask_secret(exclusion.value)
+             when :EXCLUSION_TYPE_RULE
+               rules << exclusion.value
+             when :EXCLUSION_TYPE_PATH
+               paths << exclusion.value
+             else
+               logger.warn("Unknown exclusion type #{exclusion.exclusion_type}")
+             end
+           end
+
+           out = []
+
+           out << "rules=#{rules.join(',')}" unless rules.empty?
+           out << "raw_values=#{masked_raw_values.join(',')}" unless masked_raw_values.empty?
+           out << "paths=#{paths.join(',')}" unless paths.empty?
+
+           out
+         end
+
+         def format_detected_secrets_metadata(findings = [])
+           return [] if findings.empty?
+
+           found_secrets = findings.filter do |f|
+             f.status == Core::Status::FOUND
+           end
+
+           found_secrets.map { |f| "#{f.payload_id}=>#{f.type}:#{f.line_number}" }
+         end
        end
      end
    end
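
The new log-formatting helpers group exclusions by type and mask raw values before they reach the logs. A minimal sketch of that grouping is below. `SampleExclusion`, `mask_value` and `format_exclusions` are hypothetical names. The masking rule (keep the first two and last two characters) is an assumption inferred from the `AB*****YZ` example in the code comment, not the behaviour of `Gitlab::SecretDetection::Utils::Masker.mask_secret`, which this diff does not show.

```ruby
# Stand-in for the gem's exclusion objects.
SampleExclusion = Struct.new(:exclusion_type, :value)

# Hypothetical masking rule, assumed from the "AB*****YZ" example in the diff's comment.
def mask_value(value)
  return "*" * value.length if value.length <= 4

  "#{value[0, 2]}#{'*' * (value.length - 4)}#{value[-2, 2]}"
end

# Groups exclusions by type and renders one "key=values" string per non-empty group.
def format_exclusions(exclusions)
  grouped = exclusions.group_by(&:exclusion_type)

  rules = grouped.fetch(:EXCLUSION_TYPE_RULE, []).map(&:value).uniq
  raw_values = grouped.fetch(:EXCLUSION_TYPE_RAW_VALUE, []).map { |e| mask_value(e.value) }.uniq
  paths = grouped.fetch(:EXCLUSION_TYPE_PATH, []).map(&:value).uniq

  out = []
  out << "rules=#{rules.join(',')}" unless rules.empty?
  out << "raw_values=#{raw_values.join(',')}" unless raw_values.empty?
  out << "paths=#{paths.join(',')}" unless paths.empty?
  out
end

sample = [
  SampleExclusion.new(:EXCLUSION_TYPE_RULE, "aws_key"),
  SampleExclusion.new(:EXCLUSION_TYPE_RAW_VALUE, "ABC123XYZ"),
  SampleExclusion.new(:EXCLUSION_TYPE_PATH, "test.py")
]

puts format_exclusions(sample).inspect
```

Masking before logging keeps excluded raw values, which may themselves be secrets, out of structured log output while still letting operators tell which exclusion was applied.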