star-dlp 0.1.0 → 0.1.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 0f53a06472e77562560428a8f65cce8a670f26d0d430cd2c947dc8843cd5c697
-   data.tar.gz: 5e9ca4e9a2fc62a854fc1f8b2f8473c5379f8e03c6f8772e9f4eeb1055cbc4dd
+   metadata.gz: 25b1201d34bb3a3e4d219f9faadde2b8d34aba5747c1ceb926482699107b44e2
+   data.tar.gz: f431fbe097ef52988772208ac1021d4e1dbd11f59fe16c4b8ab512e84ad19906
  SHA512:
-   metadata.gz: 290b940570744dc5ed74fbdfeda58794312109c5412a7751b15d40d3eb857418c676414e0ee692a1db31bd2458ff566e1e4437a41786102386563df065cd4b59
-   data.tar.gz: f186110f59e875bd99aa586544ad8a2d69243172d45ee4b76bfa458e373808d9dd6ee8fa9e952251c06e9264a009bf6b3408556334445ed88c4cd4dded8ac8c5
+   metadata.gz: a21e8d101a58153efd911c4974e25683291d9b1ae52dc1f519b8b01f93dbb679c23f0d442eaad8dbf5c1b941e9db6dfa9abea42bc917c1f2a6d65984590dd195
+   data.tar.gz: 6783258316126bcdd95f5ebedd283d9ffa2ce38c6f24acfdd124e7e4ff584549cf1dc1ed9caa0d3118a1bd210ba2f5de64b51ac456777d6225e61d979480a4fc
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     star-dlp (0.1.0)
+     star-dlp (0.1.1)
        fileutils (~> 1.6)
        github_api (~> 0.19.0)
        json (~> 2.6)
data/README.md CHANGED
@@ -48,6 +48,44 @@ $ star-dlp download your_github_username

  This will download all your starred repositories and save them as JSON and Markdown files. If you've previously downloaded some repositories, it will only download newly starred repositories.

+ Available options:
+ - `--token`: GitHub API token
+ - `--output_dir`: Output directory
+ - `--json_dir`: JSON files directory
+ - `--markdown_dir`: Markdown files directory
+ - `--threads`: Number of download threads (default: 16)
+ - `--skip_readme`: Skip downloading README files
+ - `--retry_count`: Number of retry attempts for failed downloads (default: 5)
+ - `--retry_delay`: Delay in seconds between retry attempts (default: 1)
+
+ Example with options:
+
+ ```bash
+ $ star-dlp download your_github_username --threads=8 --skip_readme --retry_count=3
+ ```
+
+ ### Downloading READMEs
+
+ If you've already downloaded your starred repositories but want to download or update their README files separately:
+
+ ```bash
+ $ star-dlp download_readme
+ ```
+
+ This command will scan your JSON files directory, extract repository information, and download README files for repositories that don't already have them.
+
+ Available options:
+ - `--threads`: Number of download threads (default: 16)
+ - `--retry_count`: Number of retry attempts for failed downloads (default: 5)
+ - `--retry_delay`: Delay in seconds between retry attempts (default: 1)
+ - `--force`: Force download even if README was already downloaded
+
+ Example with options:
+
+ ```bash
+ $ star-dlp download_readme --threads=8 --force
+ ```
+
  ### View Version

  ```bash
@@ -60,8 +98,10 @@ Star-DLP saves files in the following locations:

  - Configuration file: `~/.star-dlp/config.json`
  - Starred repositories: `~/.star-dlp/stars/`
- - JSON files: `~/.star-dlp/stars/json/`
- - Markdown files: `~/.star-dlp/stars/markdown/`
+ - JSON files: `~/.star-dlp/stars/json/YYYY/MM/YYYYMMDD.owner.repo.json`
+ - Markdown files: `~/.star-dlp/stars/markdown/YYYY/MM/YYYYMMDD.owner.repo.md`
+ - Last downloaded repository: `~/.star-dlp/stars/last_downloaded_repo.txt`
+ - Downloaded READMEs list: `~/.star-dlp/stars/downloaded_readmes.txt`

  ## Development

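Note on the storage layout documented in the README hunk above: the `YYYY/MM/YYYYMMDD.owner.repo.<ext>` pattern can be illustrated with a small sketch. The helper name `star_file_path` and the relative `stars` root are hypothetical and not part of star-dlp; unlike the gem's own path helpers, this version only builds the string and does not create directories.

```ruby
require "time"

# Hypothetical helper (not part of star-dlp): builds
# <root>/<kind>/YYYY/MM/YYYYMMDD.owner.repo.<ext> without touching the disk.
def star_file_path(root, kind, repo_full_name, starred_at)
  ext = kind == "json" ? "json" : "md"
  File.join(
    root, kind,
    starred_at.strftime("%Y"), starred_at.strftime("%m"),
    "#{starred_at.strftime('%Y%m%d')}.#{repo_full_name.gsub('/', '.')}.#{ext}"
  )
end

starred_at = Time.parse("2024-03-15T08:30:00Z")
puts star_file_path("stars", "json", "octocat/hello-world", starred_at)
puts star_file_path("stars", "markdown", "octocat/hello-world", starred_at)
```

The date used for the directory and the file name prefix is the star date, so a repository starred in March 2024 lands under `2024/03/` regardless of when it is downloaded.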
data/README_zh.md CHANGED
@@ -48,6 +48,44 @@ $ star-dlp download your_github_username

  这将下载您所有的星标仓库,并将它们保存为 JSON 和 Markdown 文件。如果您之前已经下载过一些仓库,它只会下载新的星标仓库。

+ 可用选项:
+ - `--token`: GitHub API 令牌
+ - `--output_dir`: 输出目录
+ - `--json_dir`: JSON 文件目录
+ - `--markdown_dir`: Markdown 文件目录
+ - `--threads`: 下载线程数 (默认: 16)
+ - `--skip_readme`: 跳过下载 README 文件
+ - `--retry_count`: 下载失败时的重试次数 (默认: 5)
+ - `--retry_delay`: 重试之间的延迟秒数 (默认: 1)
+
+ 带选项的示例:
+
+ ```bash
+ $ star-dlp download your_github_username --threads=8 --skip_readme --retry_count=3
+ ```
+
+ ### 下载 README 文件
+
+ 如果您已经下载了星标仓库,但想单独下载或更新它们的 README 文件:
+
+ ```bash
+ $ star-dlp download_readme
+ ```
+
+ 此命令将扫描您的 JSON 文件目录,提取仓库信息,并为尚未下载 README 的仓库下载 README 文件。
+
+ 可用选项:
+ - `--threads`: 下载线程数 (默认: 16)
+ - `--retry_count`: 下载失败时的重试次数 (默认: 5)
+ - `--retry_delay`: 重试之间的延迟秒数 (默认: 1)
+ - `--force`: 强制下载,即使 README 已经下载过
+
+ 带选项的示例:
+
+ ```bash
+ $ star-dlp download_readme --threads=8 --force
+ ```
+
  ### 查看版本

  ```bash
@@ -60,8 +98,10 @@ Star-DLP 将文件保存在以下位置:

  - 配置文件: `~/.star-dlp/config.json`
  - 星标仓库: `~/.star-dlp/stars/`
- - JSON 文件: `~/.star-dlp/stars/json/`
- - Markdown 文件: `~/.star-dlp/stars/markdown/`
+ - JSON 文件: `~/.star-dlp/stars/json/YYYY/MM/YYYYMMDD.owner.repo.json`
+ - Markdown 文件: `~/.star-dlp/stars/markdown/YYYY/MM/YYYYMMDD.owner.repo.md`
+ - 最后下载的仓库: `~/.star-dlp/stars/last_downloaded_repo.txt`
+ - 已下载 README 列表: `~/.star-dlp/stars/downloaded_readmes.txt`

  ## 开发

data/lib/star/dlp/cli.rb CHANGED
@@ -1,6 +1,9 @@
  # frozen_string_literal: true

  require "thor"
+ require "fileutils"
+ require "json"
+ require "time"
  require_relative "config"
  require_relative "downloader"

@@ -12,6 +15,10 @@ module Star
        option :output_dir, type: :string, desc: "Output directory for stars"
        option :json_dir, type: :string, desc: "Directory for JSON files"
        option :markdown_dir, type: :string, desc: "Directory for Markdown files"
+       option :threads, type: :numeric, default: 16, desc: "Number of download threads"
+       option :skip_readme, type: :boolean, default: false, desc: "Skip downloading README files"
+       option :retry_count, type: :numeric, default: 5, desc: "Number of retry attempts for failed downloads"
+       option :retry_delay, type: :numeric, default: 1, desc: "Delay in seconds between retry attempts"
        def download(username)
          config = Config.load

@@ -24,10 +31,42 @@ module Star
          # Save config for future use
          config.save

-         downloader = Downloader.new(config, username)
+         downloader = Downloader.new(
+           config,
+           username,
+           thread_count: options[:threads],
+           skip_readme: options[:skip_readme],
+           retry_count: options[:retry_count],
+           retry_delay: options[:retry_delay]
+         )
          downloader.download
        end

+       desc "download_readme", "Download READMEs for all repositories from JSON files"
+       option :threads, type: :numeric, default: 16, desc: "Number of download threads"
+       option :retry_count, type: :numeric, default: 5, desc: "Number of retry attempts for failed downloads"
+       option :retry_delay, type: :numeric, default: 1, desc: "Delay in seconds between retry attempts"
+       option :force, type: :boolean, default: false, desc: "Force download even if README was already downloaded"
+       def download_readme
+         config = Config.load
+
+         # Create a downloader instance
+         downloader = Downloader.new(
+           config,
+           "readme_downloader", # Placeholder username
+           thread_count: options[:threads],
+           retry_count: options[:retry_count],
+           retry_delay: options[:retry_delay]
+         )
+
+         # Call the download_readmes method in the Downloader class
+         result = downloader.download_readmes(force: options[:force])
+
+         puts "README download completed!"
+         puts "Successfully downloaded: #{result[:success]}"
+         puts "Failed or not found: #{result[:failed]}"
+       end
+
        desc "config", "Configure star-dlp"
        option :token, type: :string, desc: "GitHub API token"
        option :output_dir, type: :string, desc: "Output directory for stars"
data/lib/star/dlp/downloader.rb CHANGED
@@ -5,6 +5,7 @@ require "json"
  require "fileutils"
  require "time"
  require "base64"
+ require "thread"

  module Star
    module Dlp
@@ -12,10 +13,18 @@ module Star
        attr_reader :config, :github, :username

        LAST_REPO_FILE = "last_downloaded_repo.txt"
+       DOWNLOADED_READMES_FILE = "downloaded_readmes.txt"
+       DEFAULT_THREAD_COUNT = 16
+       DEFAULT_RETRY_COUNT = 5
+       DEFAULT_RETRY_DELAY = 1 # seconds

-       def initialize(config, username)
+       def initialize(config, username, thread_count: DEFAULT_THREAD_COUNT, skip_readme: false, retry_count: DEFAULT_RETRY_COUNT, retry_delay: DEFAULT_RETRY_DELAY)
          @config = config
          @username = username
+         @thread_count = thread_count
+         @skip_readme = skip_readme
+         @retry_count = retry_count
+         @retry_delay = retry_delay

          # Initialize GitHub API client with the special Accept header for starred_at field
          options = {
@@ -98,14 +107,19 @@ module Star

            puts "Found #{new_stars.size} new starred repositories to download"

-           # Save new stars
+           # Save new stars using multiple threads
            if new_stars.any?
-             puts "Downloading new repositories:"
-             new_stars.each_with_index do |star, index|
-               puts " [#{index + 1}/#{new_stars.size}] Downloading: #{get_repo_full_name(star)}"
-               save_star_as_json(star)
-               save_star_as_markdown(star)
-             end
+             puts "Downloading new repositories using #{@thread_count} threads:"
+
+             # Process stars with multithreading
+             process_items_with_threads(
+               new_stars,
+               ->(star) { get_repo_full_name(star) },
+               ->(star) {
+                 save_star_as_json(star)
+                 save_star_as_markdown(star)
+               }
+             )

              puts "Download completed successfully!"
            else
@@ -119,8 +133,320 @@ module Star
          end
        end

+       # Download READMEs for all repositories from JSON files
+       def download_readmes(force: false)
+         puts "Downloading READMEs for repositories from JSON files"
+
+         # File to track repositories with downloaded READMEs
+         downloaded_readmes_file = File.join(config.output_dir, DOWNLOADED_READMES_FILE)
+
+         # Load list of repositories with already downloaded READMEs
+         downloaded_repos = Set.new
+         if File.exist?(downloaded_readmes_file) && !force
+           File.readlines(downloaded_readmes_file).each do |line|
+             downloaded_repos.add(line.strip)
+           end
+           puts "Found #{downloaded_repos.size} repositories with already downloaded READMEs"
+         end
+
+         # Find all JSON files in the json directory
+         json_files = Dir.glob(File.join(config.json_dir, "**", "*.json"))
+         puts "Found #{json_files.size} JSON files"
+
+         # Extract repository names from JSON files
+         repos_to_process = []
+         repo_dates = {} # Store starred_at dates for repositories
+
+         json_files.each do |json_file|
+           begin
+             data = JSON.parse(File.read(json_file))
+
+             # Extract repository full name from JSON data
+             repo_full_name = nil
+             starred_at = nil
+
+             if data.is_a?(Hash) && data["repo"] && data["repo"]["full_name"]
+               repo_full_name = data["repo"]["full_name"]
+               starred_at = data["starred_at"] if data.key?("starred_at")
+             elsif data.is_a?(Hash) && data["full_name"]
+               repo_full_name = data["full_name"]
+               starred_at = data["starred_at"] if data.key?("starred_at")
+             elsif File.basename(json_file) =~ /(\d{8})\.(.+)\.json$/
+               # Try to extract from filename (format: YYYYMMDD.owner.repo.json)
+               date_str = $1
+               parts = $2.split('.')
+               if parts.size >= 2
+                 repo_full_name = "#{parts[0]}/#{parts[1]}"
+                 # Convert YYYYMMDD to ISO date format
+                 if date_str =~ /^(\d{4})(\d{2})(\d{2})$/
+                   starred_at = "#{$1}-#{$2}-#{$3}T00:00:00Z"
+                 end
+               end
+             end
+
+             # Skip if we couldn't determine the repository name or if README was already downloaded
+             next if repo_full_name.nil?
+             next if downloaded_repos.include?(repo_full_name) && !force
+
+             repos_to_process << repo_full_name
+             # Store the starred_at date if available
+             repo_dates[repo_full_name] = starred_at if starred_at
+           rescue JSON::ParserError => e
+             puts "Error parsing JSON file #{json_file}: #{e.message}"
+           end
+         end
+
+         puts "Found #{repos_to_process.size} repositories that need README downloads"
+
+         # Create a mutex for thread-safe file writing
+         mutex = Mutex.new
+         success_count = 0
+         failed_count = 0
+
+         # Process repositories with multithreading
+         result = process_items_with_threads(
+           repos_to_process,
+           ->(repo) { repo }, # Item name is the repo name itself
+           ->(repo_full_name) {
+             # Try to download README
+             readme_content = fetch_readme(repo_full_name)
+
+             if readme_content
+               # Get starred_at date if available, or use current date as fallback
+               date = nil
+               if repo_dates.key?(repo_full_name) && repo_dates[repo_full_name]
+                 begin
+                   date = Time.parse(repo_dates[repo_full_name])
+                 rescue
+                   date = Time.now
+                 end
+               else
+                 date = Time.now
+               end
+
+               # Create markdown file path
+               md_filepath = get_markdown_filepath(repo_full_name, date)
+
+               mutex.synchronize do
+                 # Check if file exists
+                 if File.exist?(md_filepath)
+                   # Append README content to existing file
+                   File.open(md_filepath, 'a') do |file|
+                     file.puts "\n\n## README\n\n#{readme_content}\n"
+                   end
+                 else
+                   # Create new file with repository information and README
+                   content = <<~MARKDOWN
+                     # #{repo_full_name}
+
+                     - **Downloaded at**: #{Time.now.iso8601}
+                     - **Starred at**: #{date.iso8601}
+
+                     [View on GitHub](https://github.com/#{repo_full_name})
+
+                     ## README
+
+                     #{readme_content}
+                   MARKDOWN
+
+                   File.write(md_filepath, content)
+                 end
+
+                 # Add to downloaded repositories list
+                 File.open(downloaded_readmes_file, 'a') do |file|
+                   file.puts repo_full_name
+                 end
+
+                 success_count += 1
+               end
+
+               true
+             else
+               mutex.synchronize do
+                 puts "No README found for #{repo_full_name}"
+                 failed_count += 1
+               end
+               true # Mark as success even if README not found to avoid retries
+             end
+           }
+         )
+
+         puts "README download completed!"
+         puts "Successfully downloaded: #{success_count}"
+         puts "Failed or not found: #{failed_count}"
+
+         return {
+           total: repos_to_process.size,
+           success: success_count,
+           failed: failed_count
+         }
+       end
+
+       # Fetch README.md content from GitHub
+       def fetch_readme(repo_full_name)
+         begin
+           # Get README content using GitHub API
+           response = github.repos.contents.get(
+             user: repo_full_name.split('/').first,
+             repo: repo_full_name.split('/').last,
+             path: 'README.md'
+           )
+
+           # Decode content from Base64
+           if response.content && response.encoding == 'base64'
+             return Base64.decode64(response.content).force_encoding('UTF-8')
+           end
+         rescue Github::Error::NotFound
+           # Try README.markdown if README.md not found
+           begin
+             response = github.repos.contents.get(
+               user: repo_full_name.split('/').first,
+               repo: repo_full_name.split('/').last,
+               path: 'README.markdown'
+             )
+
+             if response.content && response.encoding == 'base64'
+               return Base64.decode64(response.content).force_encoding('UTF-8')
+             end
+           rescue Github::Error::NotFound
+             # Try readme.md (lowercase) if previous attempts failed
+             begin
+               response = github.repos.contents.get(
+                 user: repo_full_name.split('/').first,
+                 repo: repo_full_name.split('/').last,
+                 path: 'readme.md'
+               )
+
+               if response.content && response.encoding == 'base64'
+                 return Base64.decode64(response.content).force_encoding('UTF-8')
+               end
+             rescue Github::Error::NotFound
+               # README not found
+               return nil
+             rescue => e
+               puts "Error fetching lowercase readme.md for #{repo_full_name}: #{e.message}"
+               raise e
+             end
+           rescue => e
+             puts "Error fetching README.markdown for #{repo_full_name}: #{e.message}"
+             raise e
+           end
+         rescue => e
+           puts "Error fetching README.md for #{repo_full_name}: #{e.message}"
+           raise e
+         end
+
+         nil
+       end
+
        private

+       # Process a list of items using multiple threads
+       # items: Array of items to process
+       # name_proc: Proc to get item name for logging
+       # process_proc: Proc to process each item
+       def process_items_with_threads(items, name_proc, process_proc)
+         return if items.empty?
+
+         # Create a thread-safe queue for the items
+         queue = Queue.new
+         items.each { |item| queue << item }
+
+         # Create a mutex for thread-safe output
+         mutex = Mutex.new
+
+         # Create a progress counter
+         total = items.size
+         completed = 0
+
+         # Create and start the worker threads
+         threads = Array.new(@thread_count) do
+           Thread.new do
+             until queue.empty?
+               # Try to get an item from the queue (non-blocking)
+               item = queue.pop(true) rescue nil
+               break unless item
+
+               # Get the item name for logging
+               item_name = name_proc.call(item)
+
+               # Process the item with retry mechanism
+               success = false
+               retry_count = 0
+
+               until success || retry_count >= @retry_count
+                 begin
+                   # Process the item
+                   process_proc.call(item)
+                   success = true
+                 rescue => e
+                   retry_count += 1
+
+                   # Log the error and retry information
+                   mutex.synchronize do
+                     puts " Error processing #{item_name}: #{e.message}"
+                     if retry_count < @retry_count
+                       puts " Retrying in #{@retry_delay} seconds (attempt #{retry_count + 1}/#{@retry_count})..."
+                     else
+                       puts " Failed to process after #{@retry_count} attempts."
+                     end
+                   end
+
+                   # Wait before retrying
+                   sleep(@retry_delay)
+                 end
+               end
+
+               # Update progress
+               mutex.synchronize do
+                 completed += 1
+                 puts " [#{completed}/#{total}] Processed: #{item_name} (#{(completed.to_f / total * 100).round(1)}%)"
+               end
+             end
+           end
+         end
+
+         # Wait for all threads to complete
+         threads.each(&:join)
+
+         return {
+           total: total,
+           completed: completed
+         }
+       end
+
+       # Get the markdown file path for a repository
+       def get_markdown_filepath(repo_full_name, date = Time.now)
+         # Create directory structure based on date: markdown/YYYY/MM/
+         year_dir = date.strftime("%Y")
+         month_dir = date.strftime("%m")
+         target_dir = File.join(config.markdown_dir, year_dir, month_dir)
+         FileUtils.mkdir_p(target_dir) unless Dir.exist?(target_dir)
+
+         # Format filename: YYYYMMDD.repo_owner.repo_name.md
+         date_str = date.strftime("%Y%m%d")
+         repo_name = repo_full_name.gsub('/', '.')
+         filename = "#{date_str}.#{repo_name}.md"
+
+         File.join(target_dir, filename)
+       end
+
+       # Get the JSON file path for a repository
+       def get_json_filepath(repo_full_name, date = Time.now)
+         # Create directory structure based on date: json/YYYY/MM/
+         year_dir = date.strftime("%Y")
+         month_dir = date.strftime("%m")
+         target_dir = File.join(config.json_dir, year_dir, month_dir)
+         FileUtils.mkdir_p(target_dir) unless Dir.exist?(target_dir)
+
+         # Format filename: YYYYMMDD.repo_owner.repo_name.json
+         date_str = date.strftime("%Y%m%d")
+         repo_name = repo_full_name.gsub('/', '.')
+         filename = "#{date_str}.#{repo_name}.json"
+
+         File.join(target_dir, filename)
+       end
+
        def get_last_repo_name
          last_repo_file = File.join(config.output_dir, LAST_REPO_FILE)
          return nil unless File.exist?(last_repo_file)
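Note on the worker pattern introduced by `process_items_with_threads` in the hunk above: the same idea (a shared queue drained by a fixed number of threads, with a bounded retry per item) can be sketched in a few lines. This is an illustrative variant, not the gem's code: the method name `process_with_workers` and its keyword arguments are hypothetical, it closes the queue and uses a blocking `pop` instead of the non-blocking `pop(true)` seen in the diff, and it reports permanently failed items separately rather than counting them as processed.

```ruby
# Hypothetical sketch (names are not from star-dlp): drain a queue with a
# fixed number of worker threads, retrying each failing item a few times.
def process_with_workers(items, thread_count: 4, retry_count: 3, retry_delay: 0, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once closed and drained, pop returns nil and workers stop

  mutex = Mutex.new
  results = { completed: [], failed: [] }

  workers = Array.new(thread_count) do
    Thread.new do
      while (item = queue.pop)
        attempts = 0
        begin
          attempts += 1
          block.call(item)
          mutex.synchronize { results[:completed] << item }
        rescue StandardError
          if attempts < retry_count
            sleep(retry_delay)
            retry # re-runs the begin block for the same item
          end
          mutex.synchronize { results[:failed] << item }
        end
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage: "b" fails once and then succeeds, "d" never succeeds.
attempts_seen = Hash.new(0)
lock = Mutex.new
results = process_with_workers(%w[a b c d], thread_count: 2) do |item|
  n = lock.synchronize { attempts_seen[item] += 1 }
  raise "transient failure" if item == "b" && n < 2
  raise "permanent failure" if item == "d"
end
puts results[:completed].sort.inspect
puts results[:failed].inspect
```

Because the loop stops on a `nil` item, this sketch assumes the work list contains no `nil` or `false` entries. All shared state (`results`, the attempt counter) is only touched inside a mutex, which mirrors how the diff guards its progress counter and log output.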
@@ -133,25 +459,19 @@ module Star
          File.write(last_repo_file, repo_name)
        end

-
        def save_star_as_json(star)
          star_data = star.to_hash

          # Get starred_at date or use current date as fallback
          starred_at = star.respond_to?(:starred_at) ? Time.parse(star.starred_at) : Time.now

-         # Create directory structure based on starred_at date: json/YYYY/MM/
-         year_dir = starred_at.strftime("%Y")
-         month_dir = starred_at.strftime("%m")
-         target_dir = File.join(config.json_dir, year_dir, month_dir)
-         FileUtils.mkdir_p(target_dir) unless Dir.exist?(target_dir)
+         # Get the repository name
+         repo_full_name = get_repo_full_name(star)

-         # Format filename: YYYYMMDD.username.repo_name.json
-         date_str = starred_at.strftime("%Y%m%d")
-         repo_name = get_repo_full_name(star).gsub('/', '.')
-         filename = "#{date_str}.#{repo_name}.json"
+         # Get the JSON file path
+         filepath = get_json_filepath(repo_full_name, starred_at)

-         filepath = File.join(target_dir, filename)
+         # Write the JSON file
          File.write(filepath, JSON.pretty_generate(star_data))
        end

@@ -159,19 +479,14 @@ module Star
          # Get starred_at date or use current date as fallback
          starred_at = star.respond_to?(:starred_at) ? Time.parse(star.starred_at) : Time.now

-         # Create directory structure based on starred_at date: markdown/YYYY/MM/
-         year_dir = starred_at.strftime("%Y")
-         month_dir = starred_at.strftime("%m")
-         target_dir = File.join(config.markdown_dir, year_dir, month_dir)
-         FileUtils.mkdir_p(target_dir) unless Dir.exist?(target_dir)
-
-         # Format filename: YYYYMMDD.username.repo_name.md
-         date_str = starred_at.strftime("%Y%m%d")
+         # Get the repository name
          repo_full_name = get_repo_full_name(star)
-         repo_name = repo_full_name.gsub('/', '.')
-         filename = "#{date_str}.#{repo_name}.md"

-         filepath = File.join(target_dir, filename)
+         # Get the markdown file path
+         filepath = get_markdown_filepath(repo_full_name, starred_at)
+
+         # Skip if file already exists
+         return if File.exist?(filepath)

          # Include starred_at in the markdown
          starred_at_str = star.respond_to?(:starred_at) ? star.starred_at : "N/A"
@@ -196,10 +511,14 @@ module Star
            #{(get_topics(star) || []).map { |topic| "- #{topic}" }.join("\n")}
          MARKDOWN

-         # Try to fetch README.md content
-         readme_content = fetch_readme(repo_full_name)
-         if readme_content
-           content += "\n\n## README\n\n#{readme_content}\n"
+         # Try to fetch README.md content if not skipped
+         unless @skip_readme
+           readme_content = fetch_readme(repo_full_name)
+           if readme_content
+             content += "\n\n## README\n\n#{readme_content}\n"
+           else
+             content += "\n\n## Description\n\n#{get_description(star)}\n"
+           end
          else
            content += "\n\n## Description\n\n#{get_description(star)}\n"
          end
@@ -297,63 +616,6 @@ module Star
            []
          end
        end
-
-       # Fetch README.md content from GitHub
-       def fetch_readme(repo_full_name)
-         begin
-           # Get README content using GitHub API
-           response = github.repos.contents.get(
-             user: repo_full_name.split('/').first,
-             repo: repo_full_name.split('/').last,
-             path: 'README.md'
-           )
-
-           # Decode content from Base64
-           if response.content && response.encoding == 'base64'
-             return Base64.decode64(response.content).force_encoding('UTF-8')
-           end
-         rescue Github::Error::NotFound
-           # Try README.markdown if README.md not found
-           begin
-             response = github.repos.contents.get(
-               user: repo_full_name.split('/').first,
-               repo: repo_full_name.split('/').last,
-               path: 'README.markdown'
-             )
-
-             if response.content && response.encoding == 'base64'
-               return Base64.decode64(response.content).force_encoding('UTF-8')
-             end
-           rescue Github::Error::NotFound
-             # Try readme.md (lowercase) if previous attempts failed
-             begin
-               response = github.repos.contents.get(
-                 user: repo_full_name.split('/').first,
-                 repo: repo_full_name.split('/').last,
-                 path: 'readme.md'
-               )
-
-               if response.content && response.encoding == 'base64'
-                 return Base64.decode64(response.content).force_encoding('UTF-8')
-               end
-             rescue Github::Error::NotFound
-               # README not found
-               return nil
-             rescue => e
-               puts "Error fetching lowercase readme.md for #{repo_full_name}: #{e.message}"
-               return nil
-             end
-           rescue => e
-             puts "Error fetching README.markdown for #{repo_full_name}: #{e.message}"
-             return nil
-           end
-         rescue => e
-           puts "Error fetching README.md for #{repo_full_name}: #{e.message}"
-           return nil
-         end
-
-         nil
-       end
      end
    end
  end
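Note on `fetch_readme`: the version removed in the hunk above returned `nil` on any error, while the version added earlier in this file re-raises everything except "not found". One plausible reading is that this lets the retry loop in `process_items_with_threads` see transient failures. The sketch below illustrates that fallback-chain shape in a flattened form. It is an assumption-laden illustration, not the gem's code: `NotFound` is a local stand-in for `Github::Error::NotFound`, and `first_readme` and the lambda-based fetcher are hypothetical names.

```ruby
# Hypothetical stand-in for Github::Error::NotFound so the sketch runs
# without the github_api gem.
class NotFound < StandardError; end

# Try each candidate path in order. A NotFound moves on to the next name;
# any other error propagates so a caller's retry loop can react to it.
def first_readme(candidates, fetcher)
  candidates.each do |path|
    begin
      return fetcher.call(path)
    rescue NotFound
      next
    end
  end
  nil
end

CANDIDATES = ["README.md", "README.markdown", "readme.md"].freeze

# Usage with an in-memory fake repository instead of the GitHub API.
fake_repo = { "README.markdown" => "# Hello" }
fetcher = ->(path) { fake_repo.fetch(path) { raise NotFound, path } }
puts first_readme(CANDIDATES, fetcher)
```

Returning `nil` only when every candidate is missing keeps "no README exists" distinct from "the request failed", which is the distinction the diff introduces by replacing `return nil` with `raise e`.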
data/lib/star/dlp/version.rb CHANGED
@@ -2,6 +2,6 @@

  module Star
    module Dlp
-     VERSION = "0.1.0"
+     VERSION = "0.1.1"
    end
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: star-dlp
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.1.1
  platform: ruby
  authors:
  - Liu Xiang