kaggle 0.0.3 → 0.0.5
- checksums.yaml +4 -4
- data/CLAUDE.md +16 -3
- data/README.md +42 -0
- data/lib/kaggle/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: '009830e3354d63a31d7f75ca303651389d53d99eb190d89dbd6f3a81765842ed'
+  data.tar.gz: 4de17ec0780676292396440832dbfd6c474ba2687b52d36e6c4e150fd9a6c989
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 85b83dd390d80fe152b1214b729b757c6a118d06c89016f259927e998b65d7e9cd31bf8464fc986729c3fc48b2a5f90469e8c78324ded6e753222892f781a7d1
+  data.tar.gz: 7cdf4a9655d9b98d93a7081ddaad14ff19979646101e4f9bb45af4aa34bd955f2ea6dafe5fb4cc5b1d68e622c5e0e41851eac33b87a18560a296b2d53ff9e521
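The SHA256 and SHA512 entries in checksums.yaml are hex digests of the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. As a rough sketch, an entry like these could be checked locally with Ruby's standard `digest` library. The helper name `digest_matches?` is illustrative and not part of the gem.

```ruby
require 'digest'

# Hypothetical helper: compares a file's digest with an expected hex string.
# algorithm is :sha256 or :sha512, mirroring the two sections of checksums.yaml.
def digest_matches?(path, expected_hex, algorithm: :sha256)
  digest_class = { sha256: Digest::SHA256, sha512: Digest::SHA512 }.fetch(algorithm)
  # Digest::Class.file streams the file, so large archives are not read into memory at once.
  digest_class.file(path).hexdigest == expected_hex.downcase
end
```

With an unpacked gem, this could be called as `digest_matches?('data.tar.gz', expected_sha256)`, where `expected_sha256` is the value copied from the SHA256 section.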
data/CLAUDE.md
CHANGED
@@ -6,6 +6,15 @@ This file documents how Claude helped develop this Ruby gem and provides guidanc
 
 This Kaggle Ruby gem was created with assistance from Claude (Sonnet 4) on 2025-08-23. The development process followed established Ruby gem conventions and best practices.
 
+### Version 0.0.3 Updates (2025-08-26)
+- **Added cache-only mode**: New `cache_only: true` initialization option
+- **Enhanced error handling**: Added `CacheNotFoundError` for cache-only scenarios
+- **Flexible cache behavior**: Support for graceful degradation vs strict cache requirements
+- **Comprehensive testing**: Added 6 new test cases for cache-only functionality
+- **Updated documentation**: README and CLAUDE.md updated with new features
+
+This update enables offline usage and integration with systems where credentials aren't available but cached data exists.
+
 ## Architecture Decisions
 
 ### 1. Gem Structure
@@ -31,14 +40,18 @@ This Kaggle Ruby gem was created with assistance from Claude (Sonnet 4) on 2025-
 ## Key Implementation Notes
 
 ### Authentication
-The gem supports
-1.
-2.
+The gem supports multiple authentication methods:
+1. JSON credentials file (`kaggle.json`) - automatic detection or custom path
+2. Environment variables (`KAGGLE_USERNAME`, `KAGGLE_KEY`)
+3. Explicit parameters during client initialization
+4. **Cache-only mode** - no credentials required, access cached data only
 
 ### Caching Strategy
 - Simple file-based caching for parsed CSV data
 - Cache keys generated from dataset paths
 - Optional cache usage controlled by method parameters
+- **Cache-only mode** allows accessing datasets without valid credentials
+- Support for both graceful degradation (return nil) and strict mode (raise error)
 
 ### CSV Parsing
 - Uses Ruby's built-in CSV library for reliability
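The caching notes in the CLAUDE.md diff describe two behaviours on a cache miss: returning nil (graceful degradation) or raising an error (strict mode). The following is a minimal sketch of that distinction, not the gem's implementation. The `CacheSketch` module, the `Reader` class, and the digest-based file naming are assumptions made for illustration, and the error class here is a stand-in for `Kaggle::CacheNotFoundError`.

```ruby
require 'json'
require 'digest'

module CacheSketch
  # Stand-in for the gem's Kaggle::CacheNotFoundError.
  class CacheNotFoundError < StandardError; end

  class Reader
    def initialize(cache_path)
      @cache_path = cache_path
    end

    # Assumed key scheme: a SHA256 digest of the "owner/dataset" path.
    # The gem may derive its cache keys differently.
    def cache_file(dataset_path)
      File.join(@cache_path, "#{Digest::SHA256.hexdigest(dataset_path)}.json")
    end

    # Returns the parsed data on a cache hit. On a miss, returns nil
    # unless force_cache is true, in which case it raises.
    def read(dataset_path, force_cache: false)
      file = cache_file(dataset_path)
      return JSON.parse(File.read(file)) if File.exist?(file)
      raise CacheNotFoundError, "no cached data for #{dataset_path}" if force_cache

      nil
    end
  end
end
```

Under these assumptions, `CacheSketch::Reader.new('./cache').read('zillow/zecon')` yields nil when nothing is cached, while passing `force_cache: true` turns the same miss into an exception the caller must rescue.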
data/README.md
CHANGED
@@ -8,6 +8,7 @@ This is an unofficial project and still a work in progress (WIP) ... more to com
 - 📊 Download Kaggle datasets programmatically
 - 📄 Parse CSV datasets to JSON format
 - 💾 Configurable caching to avoid re-downloading
+- 🔐 Cache-only mode for accessing datasets without credentials
 - 🔧 Flexible download and cache paths
 - ⚡ Built-in error handling and validation
 - 🛠️ Command-line interface for quick operations
@@ -46,6 +47,9 @@ export KAGGLE_KEY="your_api_key"
 ### Option 3: Direct Credentials
 Pass credentials directly when initializing the client.
 
+### Option 4: Cache-Only Mode
+Access only cached datasets without providing credentials (useful for offline access or when credentials are unavailable).
+
 ### Kaggle JSON File Format
 The `kaggle.json` file downloaded from Kaggle should have this format:
 ```json
@@ -76,6 +80,40 @@ client = Kaggle::Client.new(
   username: 'your_username',
   api_key: 'your_api_key'
 )
+
+# Option 4: Cache-only mode (no credentials needed)
+client = Kaggle::Client.new(cache_only: true)
+```
+
+### Cache-Only Mode
+
+Cache-only mode allows you to access previously downloaded and cached datasets without requiring valid Kaggle credentials. This is useful for:
+
+- Offline development environments
+- CI/CD pipelines where credentials aren't available
+- Production systems that should only use pre-cached data
+
+```ruby
+# Initialize in cache-only mode
+client = Kaggle::Client.new(
+  cache_only: true,
+  cache_path: '/path/to/cache'
+)
+
+# This will return cached data if available, nil if not cached
+data = client.download_dataset('zillow', 'zecon',
+                               parse_csv: true,
+                               use_cache: true)
+
+# Force cache mode - raises CacheNotFoundError if not cached
+begin
+  data = client.download_dataset('zillow', 'zecon',
+                                 parse_csv: true,
+                                 use_cache: true,
+                                 force_cache: true)
+rescue Kaggle::CacheNotFoundError
+  puts "Dataset not found in cache"
+end
 ```
 
 ### Download Datasets
@@ -145,8 +183,10 @@ kaggle --version
 | `download_path` | `./downloads` | Where to save downloaded files |
 | `cache_path` | `./cache` | Where to cache parsed data |
 | `timeout` | `30` | HTTP request timeout in seconds |
+| `cache_only` | `false` | Enable cache-only mode (no credentials required) |
 | `use_cache` | `false` | Use cached parsed data when available |
 | `parse_csv` | `false` | Automatically parse CSV files to JSON |
+| `force_cache` | `false` | Raise error if cached data not found (cache-only mode) |
 
 ## Error Handling
 
@@ -163,6 +203,8 @@ rescue Kaggle::DownloadError
   puts "Download failed"
 rescue Kaggle::ParseError
   puts "Failed to parse data"
+rescue Kaggle::CacheNotFoundError
+  puts "Dataset not found in cache (cache-only mode)"
 end
 ```
 
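The documentation changes in this diff list four ways to set up a client: explicit parameters, environment variables, a `kaggle.json` file, and cache-only mode. The diff does not show how the gem chooses between them, so the following is only a sketch of one plausible resolution order. The function `resolve_credentials`, its parameters, and the precedence (explicit, then environment, then file, then cache-only) are assumptions. The `username` and `key` field names follow the standard `kaggle.json` file issued by Kaggle.

```ruby
require 'json'

# Hypothetical resolver, not the gem's code. The precedence shown here is assumed.
def resolve_credentials(username: nil, api_key: nil, credentials_file: nil,
                        cache_only: false, env: ENV)
  # 1. Explicit parameters passed by the caller.
  return { username: username, api_key: api_key, source: :explicit } if username && api_key

  # 2. Environment variables named in the README.
  if env['KAGGLE_USERNAME'] && env['KAGGLE_KEY']
    return { username: env['KAGGLE_USERNAME'], api_key: env['KAGGLE_KEY'], source: :env }
  end

  # 3. A kaggle.json credentials file, if one was given and exists.
  if credentials_file && File.exist?(credentials_file)
    data = JSON.parse(File.read(credentials_file))
    return { username: data['username'], api_key: data['key'], source: :file }
  end

  # 4. Cache-only mode: proceed without credentials.
  return { username: nil, api_key: nil, source: :cache_only } if cache_only

  raise ArgumentError, 'no credentials found and cache_only is false'
end
```

In this sketch, cache-only mode is the last fallback, so a client configured with `cache_only: true` still picks up credentials when they happen to be present. Whether the gem behaves this way is not visible in the diff.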
data/lib/kaggle/version.rb
CHANGED