kaggle 0.0.2 → 0.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 9dd4c5fcd2e2e0f7841b00ba3b27d9dc75e898dc77bbc25b9ddbf10aad9f7561
-   data.tar.gz: e44814e2269cf74aafc044e60653c36523fa0868b4c1548da1083ece0866bf90
+   metadata.gz: '009830e3354d63a31d7f75ca303651389d53d99eb190d89dbd6f3a81765842ed'
+   data.tar.gz: 4de17ec0780676292396440832dbfd6c474ba2687b52d36e6c4e150fd9a6c989
  SHA512:
-   metadata.gz: 9f81008f7591da868c6eea42414b36159aa6d6c4e3be1ffd18f89f85a0e6e2c120310e4c56f71d32bd406d2eaced158d96972ccd653c3da215f6cfb256eac7d2
-   data.tar.gz: f3215a4798e90fa5e3496f2364f5da4e5bad5adf982e470c54118c8a5afea3c9ebdca99251e3ba4bf76a9c8f64dbe90664f91d7bff1d6ecec600708cf12c34ec
+   metadata.gz: 85b83dd390d80fe152b1214b729b757c6a118d06c89016f259927e998b65d7e9cd31bf8464fc986729c3fc48b2a5f90469e8c78324ded6e753222892f781a7d1
+   data.tar.gz: 7cdf4a9655d9b98d93a7081ddaad14ff19979646101e4f9bb45af4aa34bd955f2ea6dafe5fb4cc5b1d68e622c5e0e41851eac33b87a18560a296b2d53ff9e521
data/CLAUDE.md CHANGED
@@ -6,6 +6,15 @@ This file documents how Claude helped develop this Ruby gem and provides guidanc
 
  This Kaggle Ruby gem was created with assistance from Claude (Sonnet 4) on 2025-08-23. The development process followed established Ruby gem conventions and best practices.
 
+ ### Version 0.0.3 Updates (2025-08-26)
+ - **Added cache-only mode**: New `cache_only: true` initialization option
+ - **Enhanced error handling**: Added `CacheNotFoundError` for cache-only scenarios
+ - **Flexible cache behavior**: Support for graceful degradation vs strict cache requirements
+ - **Comprehensive testing**: Added 6 new test cases for cache-only functionality
+ - **Updated documentation**: README and CLAUDE.md updated with new features
+
+ This update enables offline usage and integration with systems where credentials aren't available but cached data exists.
+
  ## Architecture Decisions
 
  ### 1. Gem Structure
@@ -31,14 +40,18 @@ This Kaggle Ruby gem was created with assistance from Claude (Sonnet 4) on 2025-
  ## Key Implementation Notes
 
  ### Authentication
- The gem supports two authentication methods:
- 1. Environment variables (`KAGGLE_USERNAME`, `KAGGLE_KEY`)
- 2. Explicit parameters during client initialization
+ The gem supports multiple authentication methods:
+ 1. JSON credentials file (`kaggle.json`) - automatic detection or custom path
+ 2. Environment variables (`KAGGLE_USERNAME`, `KAGGLE_KEY`)
+ 3. Explicit parameters during client initialization
+ 4. **Cache-only mode** - no credentials required, access cached data only
 
  ### Caching Strategy
  - Simple file-based caching for parsed CSV data
  - Cache keys generated from dataset paths
  - Optional cache usage controlled by method parameters
+ - **Cache-only mode** allows accessing datasets without valid credentials
+ - Support for both graceful degradation (return nil) and strict mode (raise error)
 
  ### CSV Parsing
  - Uses Ruby's built-in CSV library for reliability
data/README.md CHANGED
@@ -8,6 +8,7 @@ This is an unofficial project and still a work in progress (WIP) ... more to com
  - 📊 Download Kaggle datasets programmatically
  - 📄 Parse CSV datasets to JSON format
  - 💾 Configurable caching to avoid re-downloading
+ - 🔐 Cache-only mode for accessing datasets without credentials
  - 🔧 Flexible download and cache paths
  - ⚡ Built-in error handling and validation
  - 🛠️ Command-line interface for quick operations
@@ -46,6 +47,9 @@ export KAGGLE_KEY="your_api_key"
  ### Option 3: Direct Credentials
  Pass credentials directly when initializing the client.
 
+ ### Option 4: Cache-Only Mode
+ Access only cached datasets without providing credentials (useful for offline access or when credentials are unavailable).
+
  ### Kaggle JSON File Format
  The `kaggle.json` file downloaded from Kaggle should have this format:
  ```json
@@ -76,6 +80,40 @@ client = Kaggle::Client.new(
    username: 'your_username',
    api_key: 'your_api_key'
  )
+
+ # Option 4: Cache-only mode (no credentials needed)
+ client = Kaggle::Client.new(cache_only: true)
+ ```
+
+ ### Cache-Only Mode
+
+ Cache-only mode allows you to access previously downloaded and cached datasets without requiring valid Kaggle credentials. This is useful for:
+
+ - Offline development environments
+ - CI/CD pipelines where credentials aren't available
+ - Production systems that should only use pre-cached data
+
+ ```ruby
+ # Initialize in cache-only mode
+ client = Kaggle::Client.new(
+   cache_only: true,
+   cache_path: '/path/to/cache'
+ )
+
+ # This will return cached data if available, nil if not cached
+ data = client.download_dataset('zillow', 'zecon',
+                                parse_csv: true,
+                                use_cache: true)
+
+ # Force cache mode - raises CacheNotFoundError if not cached
+ begin
+   data = client.download_dataset('zillow', 'zecon',
+                                  parse_csv: true,
+                                  use_cache: true,
+                                  force_cache: true)
+ rescue Kaggle::CacheNotFoundError
+   puts "Dataset not found in cache"
+ end
  ```
 
  ### Download Datasets
@@ -145,8 +183,10 @@ kaggle --version
  | `download_path` | `./downloads` | Where to save downloaded files |
  | `cache_path` | `./cache` | Where to cache parsed data |
  | `timeout` | `30` | HTTP request timeout in seconds |
+ | `cache_only` | `false` | Enable cache-only mode (no credentials required) |
  | `use_cache` | `false` | Use cached parsed data when available |
  | `parse_csv` | `false` | Automatically parse CSV files to JSON |
+ | `force_cache` | `false` | Raise error if cached data not found (cache-only mode) |
 
  ## Error Handling
 
@@ -163,6 +203,8 @@ rescue Kaggle::DownloadError
    puts "Download failed"
  rescue Kaggle::ParseError
    puts "Failed to parse data"
+ rescue Kaggle::CacheNotFoundError
+   puts "Dataset not found in cache (cache-only mode)"
  end
  ```
 
data/lib/kaggle/client.rb CHANGED
@@ -4,22 +4,23 @@ module Kaggle
 
      base_uri Constants::BASE_URL
 
-     attr_reader :username, :api_key, :download_path, :cache_path, :timeout
+     attr_reader :username, :api_key, :download_path, :cache_path, :timeout, :cache_only
 
      def initialize(username: nil, api_key: nil, credentials_file: nil, download_path: nil, cache_path: nil,
-                    timeout: nil)
+                    timeout: nil, cache_only: false)
        load_credentials(username, api_key, credentials_file)
        @download_path = download_path || Constants::DEFAULT_DOWNLOAD_PATH
        @cache_path = cache_path || Constants::DEFAULT_CACHE_PATH
        @timeout = timeout || Constants::DEFAULT_TIMEOUT
+       @cache_only = cache_only
 
-       unless valid_credential?(@username) && valid_credential?(@api_key)
+       unless cache_only || (valid_credential?(@username) && valid_credential?(@api_key))
          raise AuthenticationError,
-               'Username and API key are required'
+               'Username and API key are required (or set cache_only: true for cache-only access)'
        end
 
        ensure_directories_exist
-       setup_httparty_options
+       setup_httparty_options unless cache_only
      end
 
      def download_dataset(dataset_owner, dataset_name, options = {})
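The credential gate changed in the hunk above can be read in isolation as the following minimal sketch. `CredentialGateSketch` and its `valid_credential?` rule (a non-blank string) are assumptions made for illustration; the gem's own `load_credentials` step, which may also read `kaggle.json` or environment variables, is not modelled here.

```ruby
# Hypothetical stand-in for the gem's client; only the credential gate is modelled.
module CredentialGateSketch
  class AuthenticationError < StandardError; end

  class Client
    attr_reader :username, :api_key, :cache_only

    def initialize(username: nil, api_key: nil, cache_only: false)
      @username = username
      @api_key = api_key
      @cache_only = cache_only

      # Same shape as the condition in the hunk: cache-only mode skips the check.
      return if cache_only || (valid_credential?(@username) && valid_credential?(@api_key))

      raise AuthenticationError,
            'Username and API key are required (or set cache_only: true for cache-only access)'
    end

    private

    # Assumption: a credential counts as valid when it is a non-blank string.
    def valid_credential?(value)
      value.is_a?(String) && !value.strip.empty?
    end
  end
end
```

Under these assumptions, `CredentialGateSketch::Client.new(cache_only: true)` succeeds without credentials, while `CredentialGateSketch::Client.new` raises `AuthenticationError`.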
@@ -37,6 +38,15 @@ module Kaggle
          return handle_existing_dataset(extracted_dir, options)
        end
 
+       # If cache_only mode and no cached data found, return nil or raise based on force_cache option
+       if @cache_only
+         if options[:force_cache]
+           raise CacheNotFoundError, "Dataset '#{dataset_path}' not found in cache and force_cache is enabled"
+         else
+           return nil # Gracefully return nil when cache_only but not forced
+         end
+       end
+
        # Download the zip file
        response = authenticated_request(:get, "#{Constants::DATASET_ENDPOINTS[:download]}/#{dataset_path}")
 
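The cache-only branch added to `download_dataset` distinguishes a lenient miss (return `nil`) from a strict miss (raise `CacheNotFoundError`). A rough, self-contained sketch of that behavior follows. The `Reader` class, the `fetch` method, and the one-JSON-file-per-dataset cache layout are all hypothetical; the gem's actual cache key scheme is not shown in this diff.

```ruby
require 'json'
require 'tmpdir'

module CacheOnlySketch
  class CacheNotFoundError < StandardError; end

  # Hypothetical cache layout: one JSON file per dataset under cache_path.
  class Reader
    def initialize(cache_path)
      @cache_path = cache_path
    end

    def fetch(owner, name, force_cache: false)
      dataset_path = "#{owner}/#{name}"
      file = File.join(@cache_path, "#{owner}_#{name}.json")
      return JSON.parse(File.read(file)) if File.exist?(file)

      # Mirrors the branch in the hunk: strict mode raises, lenient mode returns nil.
      raise CacheNotFoundError, "Dataset '#{dataset_path}' not found in cache" if force_cache

      nil
    end
  end
end

# Usage: seed a temporary cache, then read a hit and a lenient miss.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'zillow_zecon.json'), JSON.generate([{ 'region' => 'Seattle' }]))
  reader = CacheOnlySketch::Reader.new(dir)
  p reader.fetch('zillow', 'zecon')   # the cached rows
  p reader.fetch('zillow', 'missing') # nil, because force_cache is off
end
```

Returning `nil` by default lets callers fall back to other data sources, while `force_cache: true` suits pipelines where a missing cache entry should fail loudly.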
@@ -1,3 +1,3 @@
  module Kaggle
-   VERSION = '0.0.2'
+   VERSION = '0.0.5'
  end
data/lib/kaggle.rb CHANGED
@@ -16,4 +16,5 @@ module Kaggle
    class DatasetNotFoundError < Error; end
    class DownloadError < Error; end
    class ParseError < Error; end
+   class CacheNotFoundError < Error; end
  end
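Because `CacheNotFoundError` subclasses the gem's `Error`, a caller can rescue it specifically or let a broader `rescue` of the base class catch it. The sketch below illustrates that ordering with stand-in classes. `ErrorHierarchySketch` and its `describe` helper are hypothetical, and it assumes the gem's base `Error` descends from `StandardError`, which this diff does not show.

```ruby
module ErrorHierarchySketch
  # Assumption: the gem's base Error descends from StandardError.
  class Error < StandardError; end
  class CacheNotFoundError < Error; end

  # Raises the given error class and reports which rescue clause handled it.
  def self.describe(error_class)
    raise error_class, 'boom'
  rescue CacheNotFoundError
    :cache_miss
  rescue Error
    :other_gem_error
  end
end
```

The specific clause has to come first: Ruby takes the first matching `rescue`, so listing `Error` ahead of `CacheNotFoundError` would swallow cache misses into the generic branch.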
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: kaggle
  version: !ruby/object:Gem::Version
-   version: 0.0.2
+   version: 0.0.5
  platform: ruby
  authors:
  - Your Name