yf_as_dataframe 0.4.0 → 0.4.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 13496d1eaf3e5ce09c9477de83957acb3d2183214eec2b0dedabdf8b612bd80b
4
- data.tar.gz: 604c01e5bba073e9350146c6636c6ce29f1ac62a7d2a9a4e6e9338a33f1a65e0
3
+ metadata.gz: ef9b7d363312088694d19ce68cd538579f9b3f2b93f57cf5e111f3247cd87b00
4
+ data.tar.gz: 5bab2fa2aa65c4b6496025aa13f86c15d5df49deeeff2f6b60db07818d6fe1c8
5
5
  SHA512:
6
- metadata.gz: e758fafbc0396c7582a9ad0654c94d7721afd2457a403408376be07fa7ace874887a83499c4507a122f1d027a0126da31c9447cb5b61a3d9fc3ab190e7e36f96
7
- data.tar.gz: 74a1263cd148add7c56417b4912052471ae6c49fc4163daacc5d5ce3988e53135802f4ec7912aeef372e735fec470072921f15f536d628e5449315fb31ea3142
6
+ metadata.gz: '0019c0c2f4293f3c1b3684bff81f2e0a78eb614c505031aaa77c4ba541c88def5189ed8d96310dc6b7217f3651f37f7ea9dbb64c05009a2286d12856027368b2'
7
+ data.tar.gz: 39b3d46681bbc24409315878a8722e563266553e9b81592051404ef53df6aef397c0248ab1fc724ea6edfc5f6d647e2fb89de6bf675b975aeeb592ae7244f66c
data/README.md CHANGED
@@ -12,7 +12,7 @@
12
12
  Yahoo, Inc.**
13
13
 
14
14
  yf_as_dataframe is **not** affiliated, endorsed, or vetted by Yahoo, Inc. It is
15
- an open-source tool that uses Yahoo's publicly available APIs, and is
15
+ an open-source tool that uses Yahoo's publicly available APIs, and is **only**
16
16
  intended for research and educational purposes.
17
17
 
18
18
  **You should refer to Yahoo!'s terms of use**
@@ -216,7 +216,9 @@ ls -la /usr/local/bin/curl_*
216
216
 
217
217
  ### Custom Installation Directory
218
218
 
219
- If you have curl-impersonate installed in a different directory, you can set the `CURL_IMPERSONATE_DIR` environment variable:
219
+ The codebase will look for the location of the curl-impersonate binaries per the `CURL_IMPERSONATE_DIR` environment variable;
220
+ if it is not assigned, the default location of the binaries is `/usr/local/bin`.
221
+ The code will randomly select one of the binaries (expected to be named "curl_chrome*", "curl_ff*", "curl_edge*", etc.) for its communications with the servers.
220
222
 
221
223
  ```bash
222
224
  # Set custom directory
@@ -226,8 +228,6 @@ export CURL_IMPERSONATE_DIR="/opt/curl-impersonate/bin"
226
228
  CURL_IMPERSONATE_DIR="/opt/curl-impersonate/bin" ruby your_script.rb
227
229
  ```
228
230
 
229
- The default directory is `/usr/local/bin` if the environment variable is not set.
230
-
231
231
  ### Configuration (Optional)
232
232
 
233
233
  You can configure the curl-impersonate behavior if needed:
@@ -250,14 +250,6 @@ puts "Available: #{executables.length} executables"
250
250
  puts "Using directory: #{YfAsDataframe::CurlImpersonateIntegration.executable_directory}"
251
251
  ```
252
252
 
253
- ### How It Works
254
-
255
- 1. **Automatic Detection**: Dynamically finds curl-impersonate executables in the configured directory
256
- 2. **Default Behavior**: Uses curl-impersonate for all requests by default
257
- 3. **Seamless Fallback**: Falls back to HTTParty if curl-impersonate fails
258
- 4. **Browser Rotation**: Randomly selects from Chrome, Firefox, Edge, and Safari configurations
259
- 5. **Zero Interface Changes**: All existing method signatures remain the same
260
-
261
253
  For more detailed information, see [MINIMAL_INTEGRATION.md](MINIMAL_INTEGRATION.md).
262
254
 
263
255
  ---
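Reviewer note on the README change: the new wording describes a lookup order (`CURL_IMPERSONATE_DIR`, then `/usr/local/bin`) followed by a random pick among binaries named `curl_*`. A minimal sketch of that behaviour follows. The helper names `impersonate_dir` and `pick_executable` are hypothetical and not part of the gem's API, and matching on a plain `curl_*` glob is an assumption about how the gem filters candidates.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: CURL_IMPERSONATE_DIR if set, else /usr/local/bin.
def impersonate_dir(env = ENV)
  env.fetch('CURL_IMPERSONATE_DIR', '/usr/local/bin')
end

# Hypothetical helper: one executable named curl_* chosen at random,
# or nil when the directory holds none.
def pick_executable(dir)
  Dir.glob(File.join(dir, 'curl_*'))
     .select { |path| File.file?(path) && File.executable?(path) }
     .sample
end

# Demo against a throwaway directory so nothing needs to be installed.
Dir.mktmpdir do |dir|
  %w[curl_chrome116 curl_ff109 README].each do |name|
    path = File.join(dir, name)
    FileUtils.touch(path)
    File.chmod(0o755, path)
  end
  puts "directory: #{impersonate_dir({ 'CURL_IMPERSONATE_DIR' => dir })}"
  puts "selected:  #{File.basename(pick_executable(dir))}"
end
```

Passing the environment in as an argument keeps the lookup testable without mutating `ENV`.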
data/lib/yf_as_dataframe/curl_impersonate_integration.rb CHANGED
@@ -1,20 +1,24 @@
1
1
  require 'open3'
2
2
  require 'json'
3
3
  require 'ostruct'
4
+ require 'timeout'
4
5
 
5
6
  class YfAsDataframe
6
7
  module CurlImpersonateIntegration
7
8
  # Configuration
8
9
  @curl_impersonate_enabled = true
9
10
  @curl_impersonate_fallback = true
10
- @curl_impersonate_timeout = 5
11
+ @curl_impersonate_timeout = 30 # Increased from 5 to 30 seconds
12
+ @curl_impersonate_connect_timeout = 10 # New: connection timeout
11
13
  @curl_impersonate_retries = 2
12
14
  @curl_impersonate_retry_delay = 1
15
+ @curl_impersonate_process_timeout = 60 # New: process timeout protection
13
16
 
14
17
  class << self
15
18
  attr_accessor :curl_impersonate_enabled, :curl_impersonate_fallback,
16
- :curl_impersonate_timeout, :curl_impersonate_retries,
17
- :curl_impersonate_retry_delay
19
+ :curl_impersonate_timeout, :curl_impersonate_connect_timeout,
20
+ :curl_impersonate_retries, :curl_impersonate_retry_delay,
21
+ :curl_impersonate_process_timeout
18
22
  end
19
23
 
20
24
  # Get the curl-impersonate executable directory from environment variable or default
@@ -50,51 +54,76 @@ class YfAsDataframe
50
54
  available.sample
51
55
  end
52
56
 
53
- # Make a curl-impersonate request
54
- def self.make_request(url, headers: {}, params: {}, timeout: nil)
57
+ # Make a curl-impersonate request with improved timeout handling
58
+ def self.make_request(url, headers: {}, params: {}, timeout: nil, retries: nil)
55
59
  executable_info = get_random_executable
56
60
  return nil unless executable_info
57
61
 
58
62
  timeout ||= @curl_impersonate_timeout
63
+ retries ||= @curl_impersonate_retries
59
64
 
60
- # Build command
61
- cmd = [executable_info[:path], "--max-time", timeout.to_s]
62
-
63
- # Add headers
64
- headers.each do |key, value|
65
- cmd.concat(["-H", "#{key}: #{value}"])
66
- end
67
-
68
- # Add query parameters
65
+ cmd = [
66
+ executable_info[:path],
67
+ "--max-time", timeout.to_s,
68
+ "--connect-timeout", @curl_impersonate_connect_timeout.to_s,
69
+ "--retry", retries.to_s,
70
+ "--retry-delay", @curl_impersonate_retry_delay.to_s,
71
+ "--retry-max-time", (timeout * 2).to_s,
72
+ "--fail",
73
+ "--silent",
74
+ "--show-error"
75
+ ]
76
+ headers.each { |key, value| cmd.concat(["-H", "#{key}: #{value}"]) }
69
77
  unless params.empty?
70
78
  query_string = params.map { |k, v| "#{k}=#{v}" }.join('&')
71
79
  separator = url.include?('?') ? '&' : '?'
72
80
  url = "#{url}#{separator}#{query_string}"
73
81
  end
74
-
75
- # Add URL
76
82
  cmd << url
77
83
 
78
- # Debug output
79
84
  puts "DEBUG: curl-impersonate command: #{cmd.join(' ')}"
80
85
  puts "DEBUG: curl-impersonate timeout: #{timeout} seconds"
81
86
 
82
- # Execute
83
- stdout, stderr, status = Open3.capture3(*cmd)
84
-
85
- puts "DEBUG: curl-impersonate stdout: #{stdout[0..200]}..." if stdout && !stdout.empty?
86
- puts "DEBUG: curl-impersonate stderr: #{stderr}" if stderr && !stderr.empty?
87
+ begin
88
+ stdout_str = ''
89
+ stderr_str = ''
90
+ status = nil
91
+ Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
92
+ stdin.close
93
+ pid = wait_thr.pid
94
+ done = false
95
+ monitor = Thread.new do
96
+ sleep(timeout + 10)
97
+ unless done
98
+ puts "DEBUG: Killing curl-impersonate PID #{pid} after timeout"
99
+ Process.kill('TERM', pid) rescue nil
100
+ sleep(1)
101
+ Process.kill('KILL', pid) rescue nil if wait_thr.alive?
102
+ end
103
+ end
104
+ stdout_str = stdout.read
105
+ stderr_str = stderr.read
106
+ status = wait_thr.value
107
+ done = true
108
+ monitor.kill
109
+ end
110
+ puts "DEBUG: curl-impersonate stdout: #{stdout_str[0..200]}..." if stdout_str && !stdout_str.empty?
111
+ puts "DEBUG: curl-impersonate stderr: #{stderr_str}" if stderr_str && !stderr_str.empty?
87
112
  puts "DEBUG: curl-impersonate status: #{status.exitstatus}"
88
-
89
113
  if status.success?
90
- # Create a response object similar to HTTParty
91
114
  response = OpenStruct.new
92
- response.body = stdout
115
+ response.body = stdout_str
93
116
  response.code = 200
94
117
  response.define_singleton_method(:success?) { true }
95
- response.parsed_response = parse_json_if_possible(stdout)
118
+ response.parsed_response = parse_json_if_possible(stdout_str)
96
119
  response
97
120
  else
121
+ error_message = "curl failed with code #{status.exitstatus}: #{stderr_str}"
122
+ puts "DEBUG: curl-impersonate failed with error: #{error_message}"
123
+ nil
124
+ end
125
+ rescue => e
126
+ puts "DEBUG: curl-impersonate exception: #{e.message}"
98
127
  nil
99
128
  end
100
129
  end
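Reviewer note on the hunk above: `Open3.capture3` is replaced by `Open3.popen3` plus a monitor thread that kills the child if it outlives the request timeout. The pattern can be sketched in isolation as below. `run_with_deadline` is a hypothetical name, the child process is the Ruby interpreter itself rather than curl-impersonate, and the signal handling assumes a POSIX system. Unlike the hunk, this sketch drains stderr on a separate thread, which is a design choice of the sketch and not something the diff does.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run cmd and kill it if it is still alive after
# `deadline` seconds. Returns [stdout, stderr, status, timed_out].
def run_with_deadline(cmd, deadline:)
  timed_out = false
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    watchdog = Thread.new do
      # Thread#join(limit) returns nil when the child is still running.
      unless wait_thr.join(deadline)
        timed_out = true
        begin
          Process.kill('TERM', wait_thr.pid)
          Process.kill('KILL', wait_thr.pid) unless wait_thr.join(1)
        rescue Errno::ESRCH
          # The child exited between the check and the signal.
        end
      end
    end
    # Read stderr concurrently so a chatty child cannot fill one pipe
    # while this thread is blocked reading the other.
    err_reader = Thread.new { stderr.read }
    out = stdout.read
    err = err_reader.value
    status = wait_thr.value
    watchdog.join
    [out, err, status, timed_out]
  end
end

out, _err, status, timed_out =
  run_with_deadline([RbConfig.ruby, '-e', 'print "ok"'], deadline: 20)
puts "stdout=#{out.inspect} success=#{status.success?} timed_out=#{timed_out}"
```

Because the watchdog waits on the child's own wait thread, it finishes as soon as the child exits and never needs to be killed separately.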
@@ -1,4 +1,5 @@
1
1
  require 'logger'
2
+ require 'open-uri'
2
3
 
3
4
  class YfAsDataframe
4
5
  module Holders
@@ -119,7 +119,7 @@ class YfAsDataframe
119
119
  @shared::_PROGRESS_BAR.completed if progress
120
120
 
121
121
  unless @shared::_ERRORS.empty?
122
- logger.error("\n#{@shared::_ERRORS.length} Failed download#{@shared::_ERRORS.length > 1 ? 's' : ''}:")
122
+ # logger.error("\n#{@shared::_ERRORS.length} Failed download#{@shared::_ERRORS.length > 1 ? 's' : ''}:")
123
123
 
124
124
  errors = {}
125
125
  @shared::_ERRORS.each do |ticker, err|
@@ -127,9 +127,9 @@ class YfAsDataframe
127
127
  errors[err] ||= []
128
128
  errors[err] << ticker
129
129
  end
130
- errors.each do |err, tickers|
131
- logger.error("#{tickers.join(', ')}: #{err}")
132
- end
130
+ # errors.each do |err, tickers|
131
+ # logger.error("#{tickers.join(', ')}: #{err}")
132
+ # end
133
133
 
134
134
  tbs = {}
135
135
  @shared::_TRACEBACKS.each do |ticker, tb|
@@ -137,9 +137,9 @@ class YfAsDataframe
137
137
  tbs[tb] ||= []
138
138
  tbs[tb] << ticker
139
139
  end
140
- tbs.each do |tb, tickers|
141
- logger.debug("#{tickers.join(', ')}: #{tb}")
142
- end
140
+ # tbs.each do |tb, tickers|
141
+ # logger.debug("#{tickers.join(', ')}: #{tb}")
142
+ # end
143
143
  end
144
144
 
145
145
  if ignore_tz
@@ -719,7 +719,7 @@ class YfAsDataframe
719
719
  # startDt = quotes.index[0].floor('D')
720
720
  startDt = quotes['Timestamps'].to_a.map(&:to_date).min
721
721
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} startDt = #{startDt.inspect}" }
722
- endDt = !fin.nil? && !(fin.respond_to?(:empty?) && fin.empty?) ? fin.to_date : Time.at((Time.now + 1.day).to_i).to_i
722
+ endDt = !fin.nil? && !(fin.respond_to?(:empty?) && fin.empty?) ? fin.to_date : (Time.now + 86400).to_date
723
723
 
724
724
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} @history[events][dividends] = #{@history['events']["dividends"].inspect}" }
725
725
  # divi = {}
@@ -731,32 +731,32 @@ class YfAsDataframe
731
731
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} ts = #{ts.inspect}" }
732
732
  @history['events']["dividends"].select{|k,v|
733
733
  Time.at(k.to_i).utc.to_date >= startDt && Time.at(k.to_i).utc.to_date <= endDt }.each{|k,v|
734
- d[ts.index(Time.at(k.to_i).utc)] = v['amount'].to_f} unless @history.try(:[],'events').try(:[],"dividends").nil?
734
+ d[ts.index(Time.at(k.to_i).utc)] = v['amount'].to_f} unless @history.dig('events', 'dividends').nil?
735
735
  df['Dividends'] = Polars::Series.new(d)
736
736
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} df = #{df.inspect}" }
737
737
 
738
738
  # caga = {}
739
739
  # @history['events']["capital gains"].select{|k,v|
740
740
  # Time.at(k.to_i).utc.to_date >= startDt && Time.at(k.to_i).utc.to_date <= endDt }.each{|k,v|
741
- # caga['date'] = v['amount']} unless @history.try(:[],'events').try(:[],"capital gains").nil?
741
+ # caga['date'] = v['amount']} unless @history.dig('events', 'capital gains').nil?
742
742
  # capital_gains = capital_gains.loc[startDt:] if capital_gains.shape.first > 0
743
743
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} caga = #{caga.inspect}" }
744
744
  d = [0.0] * df.length
745
745
  @history['events']["capital gains"].select{|k,v|
746
746
  Time.at(k.to_i).utc.to_date >= startDt && Time.at(k.to_i).utc.to_date <= endDt }.each{|k,v|
747
- d[ts.index(Time.at(k.to_i).utc)] = v['amount'].to_f} unless @history.try(:[],'events').try(:[],"capital gains").nil?
747
+ d[ts.index(Time.at(k.to_i).utc)] = v['amount'].to_f} unless @history.dig('events', 'capital gains').nil?
748
748
  df['Capital Gains'] = Polars::Series.new(d)
749
749
 
750
750
  # splits = splits.loc[startDt:] if splits.shape[0] > 0
751
751
  # stspl = {}
752
752
  # @history['events']['stock splits'].select{|k,v|
753
753
  # Time.at(k.to_i).utc.to_date >= startDt && Time.at(k.to_i).utc.to_date <= endDt }.each{|k,v|
754
- # stspl['date'] = v['numerator'].to_f/v['denominator'].to_f} unless @history.try(:[],'events').try(:[],"stock splits").nil?
754
+ # stspl['date'] = v['numerator'].to_f/v['denominator'].to_f} unless @history.dig('events', 'capital gains').nil?
755
755
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} stspl = #{stspl.inspect}" }
756
756
  d = [0.0] * df.length
757
757
  @history['events']["capital gains"].select{|k,v|
758
758
  Time.at(k.to_i).utc.to_date >= startDt && Time.at(k.to_i).utc.to_date <= endDt }.each{|k,v|
759
- d[ts.index(Time.at(k.to_i).utc)] = v['numerator'].to_f/v['denominator'].to_f} unless @history.try(:[],'events').try(:[],"capital gains").nil?
759
+ d[ts.index(Time.at(k.to_i).utc)] = v['numerator'].to_f/v['denominator'].to_f} unless @history.dig('events', 'capital gains').nil?
760
760
  df['Stock Splits'] = Polars::Series.new(d)
761
761
  end
762
762
 
@@ -54,21 +54,21 @@ class YfAsDataframe
54
54
  if start
55
55
  start_ts = YfAsDataframe::Utils.parse_user_dt(start, tz)
56
56
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} start_ts = #{start_ts}" }
57
- start = Time.at(start_ts).in_time_zone(tz)
57
+ start = Time.at(start_ts)
58
58
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} start = #{start.inspect}, fin = #{fin.inspect}" }
59
59
  end
60
60
  if fin
61
61
  end_ts = YfAsDataframe::Utils.parse_user_dt(fin, tz)
62
62
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} end_ts = #{end_ts}" }
63
- fin = Time.at(end_ts).in_time_zone(tz)
63
+ fin = Time.at(end_ts)
64
64
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} start = #{start.inspect}, fin = #{fin.inspect}" }
65
65
  end
66
66
 
67
67
  # Rails.logger.info { "#{__FILE__}:#{__LINE__} start = #{start.inspect}, fin = #{fin.inspect}" }
68
68
 
69
- dt_now = Time.now.in_time_zone(tz)
69
+ dt_now = Time.now
70
70
  fin ||= dt_now
71
- start ||= (fin - 548.days).midnight
71
+ start ||= Time.new(fin.year, fin.month, fin.day) - 548*24*60*60
72
72
 
73
73
  if start >= fin
74
74
  logger.error("Start date (#{start}) must be before end (#{fin})")
@@ -76,7 +76,7 @@ class YfAsDataframe
76
76
  end
77
77
 
78
78
  ts_url_base = "https://query2.finance.yahoo.com/ws/fundamentals-timeseries/v1/finance/timeseries/#{@ticker}?symbol=#{@ticker}"
79
- shares_url = "#{ts_url_base}&period1=#{start.to_i}&period2=#{fin.tomorrow.midnight.to_i}"
79
+ shares_url = "#{ts_url_base}&period1=#{Time.new(start.year, start.month, start.day).to_i}&period2=#{Time.new((fin + 86400).year, (fin + 86400).month, (fin + 86400).day).to_i}"
80
80
 
81
81
  begin
82
82
  json_data = get(shares_url).parsed_response
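Reviewer note on the hunks above: the ActiveSupport helpers (`548.days`, `#midnight`, `#tomorrow`, `#in_time_zone`) are replaced with plain `Time` arithmetic. A minimal sketch of the replacement expressions, with hypothetical helper names, is shown below. Note that adding a fixed 86,400 seconds is not calendar-aware, so results may differ from the ActiveSupport versions around daylight-saving transitions, and dropping `in_time_zone(tz)` means the process-local zone is used.

```ruby
# Local midnight of the day containing `time` (stand-in for #midnight).
def midnight(time)
  Time.new(time.year, time.month, time.day)
end

# Local midnight of the following day (stand-in for #tomorrow.midnight).
# Adding 86_400 seconds may be off near a daylight-saving change.
def next_midnight(time)
  midnight(time + 86_400)
end

# Default look-back when no start is given: 548 days before midnight of `fin`.
def default_start(fin)
  midnight(fin) - 548 * 86_400
end

fin = Time.new(2024, 6, 15, 12, 30, 0)
puts "period1 = #{default_start(fin).to_i}"
puts "period2 = #{next_midnight(fin).to_i}"
```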
@@ -95,7 +95,7 @@ class YfAsDataframe
95
95
 
96
96
  return nil if !shares_data[0].key?("shares_out")
97
97
 
98
- timestamps = shares_data[0]["timestamp"].map{|t| Time.at(t).to_datetime }
98
+ timestamps = shares_data[0]["timestamp"].map{|t| Time.at(t) }
99
99
 
100
100
  df = Polars::DataFrame.new(
101
101
  {
@@ -247,13 +247,17 @@ class YfAsDataframe
247
247
 
248
248
  def self.parse_user_dt(dt, exchange_tz)
249
249
  if dt.is_a?(Integer)
250
- Time.at(dt)
250
+ return Time.at(dt)
251
251
  elsif dt.is_a?(String)
252
- dt = DateTime.strptime(dt.to_s, '%Y-%m-%d')
252
+ dt = DateTime.strptime(dt.to_s, '%Y-%m-%d')
253
253
  elsif dt.is_a?(Date)
254
- dt = dt.to_datetime
255
- elsif dt.is_a?(DateTime) && dt.zone.nil?
256
- dt = dt.in_time_zone(exchange_tz)
254
+ dt = dt.to_datetime
255
+ end
256
+ # If it's a DateTime, convert to Time
257
+ if dt.is_a?(DateTime)
258
+ # If zone is nil, try to set it, else just convert
259
+ dt = dt.in_time_zone(exchange_tz) if dt.zone.nil? && dt.respond_to?(:in_time_zone)
260
+ dt = dt.to_time
257
261
  end
258
262
  dt.to_i
259
263
  end
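Reviewer note on `parse_user_dt`: after this change the `Integer` branch returns a `Time` (via `return Time.at(dt)`), while every other branch falls through to `dt.to_i` and returns an Integer. The sketch below shows one way to normalise all inputs to a Unix timestamp without ActiveSupport. `to_unix_seconds` is a hypothetical name, and returning an Integer in every case is an assumption about the intended contract, not what the hunk does.

```ruby
require 'date'

# Hypothetical helper: turn any supported input into an Integer Unix timestamp.
# DateTime is tested before Date because DateTime is a subclass of Date.
def to_unix_seconds(dt)
  case dt
  when Integer  then dt
  when String   then DateTime.strptime(dt, '%Y-%m-%d').to_time.to_i
  when DateTime then dt.to_time.to_i
  when Date     then dt.to_datetime.to_time.to_i
  when Time     then dt.to_i
  else raise ArgumentError, "unsupported date input: #{dt.inspect}"
  end
end

puts to_unix_seconds('2022-01-01')
puts to_unix_seconds(Date.new(2022, 1, 1))
```

Strings and `Date` values are interpreted as midnight UTC here, because `DateTime.strptime` and `Date#to_datetime` both produce a zero offset.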
@@ -334,7 +338,7 @@ class YfAsDataframe
334
338
  when '4wk'
335
339
  28.days
336
340
  else
337
- Logger.new(STDOUT).warn { "#{__FILE__}:#{__LINE__} #{interval} not a recognized interval" }
341
+ # Logger.new(STDOUT).warn { "#{__FILE__}:#{__LINE__} #{interval} not a recognized interval" }
338
342
  interval
339
343
  end
340
344
  end
@@ -1,3 +1,3 @@
1
1
  class YfAsDataframe
2
- VERSION = "0.4.0"
2
+ VERSION = "0.4.1"
3
3
  end
@@ -265,7 +265,7 @@ class YfAsDataframe
265
265
  @@cookie = nil
266
266
  # Clear curl-impersonate executables cache to force re-selection
267
267
  CurlImpersonateIntegration.instance_variable_set(:@available_executables, nil)
268
- warn "[yf_as_dataframe] Retrying crumb fetch (attempt #{attempt + 1}/3)"
268
+ # warn "[yf_as_dataframe] Retrying crumb fetch (attempt #{attempt + 1}/3)"
269
269
  # Add delay between retries to be respectful of rate limits
270
270
  sleep(2 ** attempt) # Exponential backoff: 2s, 4s, 8s
271
271
  end
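Reviewer note on the crumb retry loop: the surrounding code retries up to three times and sleeps `2 ** attempt` seconds between tries. The retry-with-backoff pattern can be sketched as below. `with_backoff` and its `sleeper` parameter are hypothetical, and the exponent used here (`attempt + 1`, giving 2s then 4s) is an assumption, since the hunk does not show where `attempt` starts.

```ruby
# Hypothetical helper: call the block up to `attempts` times and return its
# first truthy result, sleeping base**n seconds (n = 1, 2, ...) between tries.
# `sleeper` is injectable so delays can be observed without waiting.
def with_backoff(attempts: 3, base: 2, sleeper: ->(seconds) { sleep(seconds) })
  attempts.times do |attempt|
    result = yield(attempt)
    return result if result
    sleeper.call(base**(attempt + 1)) if attempt < attempts - 1
  end
  raise "no valid result after #{attempts} attempts"
end

delays = []
value = with_backoff(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  attempt == 2 ? 'crumb123' : nil # succeed on the third try
end
puts "value=#{value} delays=#{delays.inspect}"
```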
@@ -283,20 +283,20 @@ class YfAsDataframe
283
283
 
284
284
  # Validate crumb: must be short, alphanumeric, no spaces, not an error message
285
285
  if crumb_valid?(@@crumb)
286
- warn "[yf_as_dataframe] Successfully fetched valid crumb on attempt #{attempt + 1}"
286
+ # warn "[yf_as_dataframe] Successfully fetched valid crumb on attempt #{attempt + 1}"
287
287
  return @@crumb
288
288
  else
289
- warn "[yf_as_dataframe] Invalid crumb received on attempt #{attempt + 1}: '#{@@crumb.inspect}'"
289
+ # warn "[yf_as_dataframe] Invalid crumb received on attempt #{attempt + 1}: '#{@@crumb.inspect}'"
290
290
  @@crumb = nil
291
291
  end
292
292
  rescue => e
293
- warn "[yf_as_dataframe] Error fetching crumb on attempt #{attempt + 1}: #{e.message}"
293
+ # warn "[yf_as_dataframe] Error fetching crumb on attempt #{attempt + 1}: #{e.message}"
294
294
  @@crumb = nil
295
295
  end
296
296
  end
297
297
 
298
298
  # All attempts failed
299
- warn "[yf_as_dataframe] Failed to fetch valid crumb after 3 attempts"
299
+ # warn "[yf_as_dataframe] Failed to fetch valid crumb after 3 attempts"
300
300
  raise "Could not fetch a valid Yahoo Finance crumb after 3 attempts"
301
301
  end
302
302
 
@@ -441,7 +441,7 @@ class YfAsDataframe
441
441
  def refresh_session_if_needed
442
442
  return unless session_needs_refresh?
443
443
 
444
- warn "[yf_as_dataframe] Refreshing session (age: #{session_age} seconds, requests: #{@@request_count})"
444
+ # warn "[yf_as_dataframe] Refreshing session (age: #{session_age} seconds, requests: #{@@request_count})"
445
445
  refresh_session
446
446
  end
447
447
 
@@ -461,7 +461,7 @@ class YfAsDataframe
461
461
  @@crumb = nil
462
462
  @@session_created_at = Time.now
463
463
  @@request_count = 0
464
- warn "[yf_as_dataframe] Session refreshed"
464
+ # warn "[yf_as_dataframe] Session refreshed"
465
465
  end
466
466
 
467
467
  # Circuit breaker methods
@@ -472,7 +472,7 @@ class YfAsDataframe
472
472
  when :open
473
473
  if Time.now - @@last_failure_time > @@circuit_breaker_timeout
474
474
  @@circuit_breaker_state = :half_open
475
- warn "[yf_as_dataframe] Circuit breaker transitioning to half-open"
475
+ # warn "[yf_as_dataframe] Circuit breaker transitioning to half-open"
476
476
  true
477
477
  else
478
478
  false
@@ -490,7 +490,7 @@ class YfAsDataframe
490
490
  @@circuit_breaker_state = :open
491
491
  # Exponential backoff: 60s, 120s, 240s, 480s, etc.
492
492
  @@circuit_breaker_timeout = @@circuit_breaker_base_timeout * (2 ** (@@failure_count - @@circuit_breaker_threshold))
493
- warn "[yf_as_dataframe] Circuit breaker opened after #{@@failure_count} failures (timeout: #{@@circuit_breaker_timeout}s)"
493
+ # warn "[yf_as_dataframe] Circuit breaker opened after #{@@failure_count} failures (timeout: #{@@circuit_breaker_timeout}s)"
494
494
  end
495
495
  end
496
496
 
@@ -499,7 +499,7 @@ class YfAsDataframe
499
499
  @@circuit_breaker_state = :closed
500
500
  @@failure_count = 0
501
501
  @@circuit_breaker_timeout = @@circuit_breaker_base_timeout
502
- warn "[yf_as_dataframe] Circuit breaker closed after successful request"
502
+ # warn "[yf_as_dataframe] Circuit breaker closed after successful request"
503
503
  elsif @@circuit_breaker_state == :closed
504
504
  # Reset failure count on success
505
505
  @@failure_count = 0
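Reviewer note on the circuit breaker hunks: the code moves between `:closed`, `:open`, and `:half_open`, and grows the open timeout as `base * 2 ** (failure_count - threshold)`. A self-contained sketch of that state machine follows. The class name, the threshold of 3, and the injectable clock are assumptions made for illustration; only the 60-second base is implied by the comment in the diff.

```ruby
# Minimal circuit breaker sketch with an exponentially growing open timeout.
class CircuitBreaker
  attr_reader :state, :timeout

  def initialize(threshold: 3, base_timeout: 60, clock: -> { Time.now })
    @threshold = threshold
    @base_timeout = base_timeout
    @clock = clock
    @state = :closed
    @failures = 0
    @timeout = base_timeout
    @last_failure_at = nil
  end

  # May a request be attempted right now?
  def allow?
    return true unless @state == :open
    return false unless @clock.call - @last_failure_at > @timeout

    @state = :half_open # let one trial request through
    true
  end

  def record_failure
    @failures += 1
    @last_failure_at = @clock.call
    return if @failures < @threshold

    @state = :open
    @timeout = @base_timeout * (2**(@failures - @threshold))
  end

  def record_success
    @state = :closed
    @failures = 0
    @timeout = @base_timeout
  end
end

breaker = CircuitBreaker.new
3.times { breaker.record_failure }
puts "state=#{breaker.state} timeout=#{breaker.timeout}s allow=#{breaker.allow?}"
```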
@@ -538,10 +538,10 @@ class YfAsDataframe
538
538
  begin
539
539
  data = JSON.parse(json_blob)
540
540
  crumb = data.dig('context', 'dispatcher', 'stores', 'CrumbStore', 'crumb')
541
- warn "[yf_as_dataframe] Scraped crumb from quote page: #{crumb.inspect}"
541
+ # warn "[yf_as_dataframe] Scraped crumb from quote page: #{crumb.inspect}"
542
542
  return crumb
543
543
  rescue => e
544
- warn "[yf_as_dataframe] Failed to parse crumb from quote page: #{e.message}"
544
+ # warn "[yf_as_dataframe] Failed to parse crumb from quote page: #{e.message}"
545
545
  return nil
546
546
  end
547
547
  end
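Reviewer note on the crumb scrape: the crumb is read from a nested JSON structure with `Hash#dig`, which returns `nil` as soon as a key is missing instead of raising. A minimal sketch of that lookup is below. `crumb_from_blob` is a hypothetical name, and rescuing `TypeError` (raised by `dig` when an intermediate value is not a container) is an addition of the sketch.

```ruby
require 'json'

# Hypothetical helper mirroring the lookup path shown in the hunk.
# Returns the crumb string, or nil when the blob is invalid or incomplete.
def crumb_from_blob(json_blob)
  data = JSON.parse(json_blob)
  data.dig('context', 'dispatcher', 'stores', 'CrumbStore', 'crumb')
rescue JSON::ParserError, TypeError
  nil
end

blob = JSON.generate(
  'context' => { 'dispatcher' => { 'stores' => { 'CrumbStore' => { 'crumb' => 'abc123' } } } }
)
puts crumb_from_blob(blob).inspect
puts crumb_from_blob('{"context": {}}').inspect
```

The same `dig` behaviour is what lets the history code replace the chained `try(:[], ...)` calls without ActiveSupport.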
@@ -22,9 +22,16 @@ class YfAsDataframe
22
22
  # Prepare headers and params as in original method
23
23
  headers ||= {}
24
24
  params ||= {}
25
- params.merge!(crumb: @@crumb) unless @@crumb.nil?
26
- cookie, crumb, strategy = _get_cookie_and_crumb()
27
- crumbs = !crumb.nil? ? {'crumb' => crumb} : {}
25
+
26
+ # Only fetch crumb for /v7/finance/download endpoint
27
+ crumb_needed = url.include?('/v7/finance/download')
28
+ if crumb_needed
29
+ crumb = get_crumb_scrape_quote_page(params[:symbol] || params['symbol'])
30
+ params.merge!(crumb: crumb) unless crumb.nil?
31
+ end
32
+
33
+ cookie, _, strategy = _get_cookie_and_crumb(crumb_needed)
34
+ crumbs = {} # crumb logic handled above if needed
28
35
 
29
36
  # Prepare headers for curl-impersonate
30
37
  curl_headers = headers.dup.merge(@@user_agent_headers)
@@ -39,12 +46,13 @@ class YfAsDataframe
39
46
  # Add crumb if available
40
47
  curl_headers['crumb'] = crumb if crumb
41
48
 
42
- # Make curl-impersonate request
49
+ # Make curl-impersonate request with improved timeout handling
43
50
  response = CurlImpersonateIntegration.make_request(
44
51
  url,
45
52
  headers: curl_headers,
46
53
  params: params.merge(crumbs),
47
- timeout: CurlImpersonateIntegration.curl_impersonate_timeout
54
+ timeout: CurlImpersonateIntegration.curl_impersonate_timeout,
55
+ retries: CurlImpersonateIntegration.curl_impersonate_retries
48
56
  )
49
57
 
50
58
  if response && response.success?
@@ -56,7 +64,7 @@ class YfAsDataframe
56
64
  rescue => e
57
65
  # Log error but continue to fallback
58
66
  puts "DEBUG: curl-impersonate exception: #{e.message}"
59
- warn "curl-impersonate request failed: #{e.message}" if $VERBOSE
67
+ # warn "curl-impersonate request failed: #{e.message}" if $VERBOSE
60
68
  end
61
69
  else
62
70
  puts "DEBUG: curl-impersonate is disabled, skipping to fallback"
@@ -89,9 +97,33 @@ class YfAsDataframe
89
97
  CurlImpersonateIntegration.curl_impersonate_timeout = timeout
90
98
  end
91
99
 
100
+ def set_curl_impersonate_connect_timeout(timeout)
101
+ CurlImpersonateIntegration.curl_impersonate_connect_timeout = timeout
102
+ end
103
+
104
+ def set_curl_impersonate_process_timeout(timeout)
105
+ CurlImpersonateIntegration.curl_impersonate_process_timeout = timeout
106
+ end
107
+
108
+ def set_curl_impersonate_retries(retries)
109
+ CurlImpersonateIntegration.curl_impersonate_retries = retries
110
+ end
111
+
92
112
  def get_available_curl_impersonate_executables
93
113
  CurlImpersonateIntegration.available_executables
94
114
  end
115
+
116
+ def get_curl_impersonate_config
117
+ {
118
+ enabled: CurlImpersonateIntegration.curl_impersonate_enabled,
119
+ fallback: CurlImpersonateIntegration.curl_impersonate_fallback,
120
+ timeout: CurlImpersonateIntegration.curl_impersonate_timeout,
121
+ connect_timeout: CurlImpersonateIntegration.curl_impersonate_connect_timeout,
122
+ process_timeout: CurlImpersonateIntegration.curl_impersonate_process_timeout,
123
+ retries: CurlImpersonateIntegration.curl_impersonate_retries,
124
+ retry_delay: CurlImpersonateIntegration.curl_impersonate_retry_delay
125
+ }
126
+ end
95
127
  end
96
128
  end
97
129
  end
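Reviewer note on the new configuration helpers: the settings live in module-level instance variables exposed through `class << self; attr_accessor`, and `get_curl_impersonate_config` returns them as a Hash. The sketch below reproduces that pattern on a stand-in module. `DemoCurlConfig`, `to_h`, and `curl_flags` are hypothetical names; the default values and the curl options are the ones visible in the diff.

```ruby
# Hypothetical stand-in for a module that keeps its settings in
# module-level instance variables.
module DemoCurlConfig
  @timeout = 30
  @connect_timeout = 10
  @retries = 2
  @retry_delay = 1

  class << self
    attr_accessor :timeout, :connect_timeout, :retries, :retry_delay

    # Snapshot of the current settings as a plain Hash.
    def to_h
      { timeout: timeout, connect_timeout: connect_timeout,
        retries: retries, retry_delay: retry_delay }
    end

    # curl options corresponding to the settings above.
    def curl_flags
      ['--max-time', timeout.to_s,
       '--connect-timeout', connect_timeout.to_s,
       '--retry', retries.to_s,
       '--retry-delay', retry_delay.to_s,
       '--retry-max-time', (timeout * 2).to_s]
    end
  end
end

DemoCurlConfig.timeout = 45
puts DemoCurlConfig.to_h.inspect
puts DemoCurlConfig.curl_flags.join(' ')
```

Because the state is held on the module itself, any change applies process-wide to every later request.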
@@ -11,7 +11,7 @@ class YfAsDataframe
11
11
  class YFNotImplementedError < NotImplementedError
12
12
  def initialize(str)
13
13
  @msg = "Have not implemented fetching \"#{str}\" from Yahoo API"
14
- Logger.new(STDOUT).warn { @msg }
14
+ # Logger.new(STDOUT).warn { @msg }
15
15
  end
16
16
  end
17
17
  end
data/smoke_test.rb ADDED
@@ -0,0 +1,64 @@
1
+ # require "bundler/setup"
2
+ require "yf_as_dataframe"
3
+
4
+ def print_section(title)
5
+ puts "\n=== #{title} ==="
6
+ end
7
+
8
+ begin
9
+ print_section("Ticker Creation")
10
+ msft = YfAsDataframe::Ticker.new("MSFT")
11
+ puts "Ticker created: #{msft.ticker}"
12
+
13
+ print_section("Price History")
14
+ hist = msft.history(period: "1mo")
15
+ puts "History DataFrame shape: #{hist.shape}" if hist
16
+
17
+ print_section("Meta Information")
18
+ meta = msft.history_metadata
19
+ puts "Meta: #{meta.inspect}"
20
+
21
+ print_section("Actions")
22
+ puts "Dividends: #{msft.dividends.inspect}"
23
+ puts "Splits: #{msft.splits.inspect}"
24
+
25
+ print_section("Share Count")
26
+ shares = msft.shares_full(start: "2022-01-01", fin: nil)
27
+ puts "Shares DataFrame shape: #{shares.shape}" if shares
28
+
29
+ print_section("Financials")
30
+ puts "Income Statement: #{msft.income_stmt.inspect}"
31
+ puts "Balance Sheet: #{msft.balance_sheet.inspect}"
32
+ puts "Cash Flow: #{msft.cashflow.inspect}"
33
+
34
+ print_section("Holders")
35
+ puts "Major Holders: #{msft.major_holders.inspect}"
36
+ puts "Institutional Holders: #{msft.institutional_holders.inspect}"
37
+
38
+ print_section("Recommendations")
39
+ puts "Recommendations: #{msft.recommendations.inspect}"
40
+
41
+ print_section("Earnings Dates")
42
+ puts "Earnings Dates: #{msft.earnings_dates.inspect}"
43
+
44
+ print_section("ISIN")
45
+ puts "ISIN: #{msft.isin.inspect}"
46
+
47
+ print_section("Options")
48
+ puts "Options: #{msft.options.inspect}"
49
+
50
+ print_section("News")
51
+ puts "News: #{msft.news.inspect}"
52
+
53
+ print_section("Technical Indicator Example")
54
+ if hist
55
+ ad = YfAsDataframe.ad(hist)
56
+ puts "AD indicator: #{ad.inspect}"
57
+ end
58
+
59
+ puts "\nAll tests completed successfully!"
60
+
61
+ rescue => e
62
+ puts "\nTest failed: #{e.class} - #{e.message}"
63
+ puts e.backtrace.first(10)
64
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: yf_as_dataframe
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.0
4
+ version: 0.4.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Bill McKinnon
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2025-06-25 00:00:00.000000000 Z
11
+ date: 2025-07-07 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: tzinfo
@@ -156,8 +156,7 @@ files:
156
156
  - lib/yf_as_dataframe/yf_connection.rb
157
157
  - lib/yf_as_dataframe/yf_connection_minimal_patch.rb
158
158
  - lib/yf_as_dataframe/yfinance_exception.rb
159
- - quick_test.rb
160
- - test_minimal_integration.rb
159
+ - smoke_test.rb
161
160
  homepage: https://www.github.com/bmck/yf_as_dataframe
162
161
  licenses:
163
162
  - MIT
data/quick_test.rb DELETED
@@ -1,143 +0,0 @@
1
- #!/usr/bin/env ruby
2
-
3
- # Quick test for minimal curl-impersonate integration
4
- # This test verifies the integration without making actual HTTP requests
5
-
6
- puts "=== Quick Curl-Impersonate Integration Test ==="
7
- puts
8
-
9
- # Test 1: Check curl-impersonate integration module
10
- puts "1. Testing curl-impersonate integration module..."
11
- begin
12
- require_relative 'lib/yf_as_dataframe/curl_impersonate_integration'
13
-
14
- executables = YfAsDataframe::CurlImpersonateIntegration.available_executables
15
- if executables.empty?
16
- puts " ❌ No curl-impersonate executables found!"
17
- exit 1
18
- else
19
- puts " ✅ Found #{executables.length} curl-impersonate executables"
20
- puts " Sample: #{executables.first[:executable]} (#{executables.first[:browser]})"
21
- end
22
- rescue => e
23
- puts " ❌ Error loading integration module: #{e.message}"
24
- exit 1
25
- end
26
-
27
- puts
28
-
29
- # Test 2: Test executable selection
30
- puts "2. Testing executable selection..."
31
- begin
32
- executable = YfAsDataframe::CurlImpersonateIntegration.get_random_executable
33
- if executable
34
- puts " ✅ Random executable selected: #{executable[:executable]} (#{executable[:browser]})"
35
- else
36
- puts " ❌ No executable selected"
37
- end
38
- rescue => e
39
- puts " ❌ Error selecting executable: #{e.message}"
40
- end
41
-
42
- puts
43
-
44
- # Test 3: Test environment variable functionality
45
- puts "3. Testing environment variable functionality..."
46
- begin
47
- default_dir = YfAsDataframe::CurlImpersonateIntegration.executable_directory
48
- puts " ✅ Default directory: #{default_dir}"
49
-
50
- # Test with a custom directory (should still use default if not set)
51
- old_env = ENV['CURL_IMPERSONATE_DIR']
52
- ENV['CURL_IMPERSONATE_DIR'] = '/nonexistent/path'
53
-
54
- # Clear the cached executables to force re-discovery
55
- YfAsDataframe::CurlImpersonateIntegration.instance_variable_set(:@available_executables, nil)
56
-
57
- custom_dir = YfAsDataframe::CurlImpersonateIntegration.executable_directory
58
- puts " ✅ Custom directory (set): #{custom_dir}"
59
-
60
- # Restore original environment
61
- if old_env
62
- ENV['CURL_IMPERSONATE_DIR'] = old_env
63
- else
64
- ENV.delete('CURL_IMPERSONATE_DIR')
65
- end
66
-
67
- # Clear cache again
68
- YfAsDataframe::CurlImpersonateIntegration.instance_variable_set(:@available_executables, nil)
69
-
70
- restored_dir = YfAsDataframe::CurlImpersonateIntegration.executable_directory
71
- puts " ✅ Restored directory: #{restored_dir}"
72
-
73
- rescue => e
74
- puts " ❌ Error testing environment variable: #{e.message}"
75
- end
76
-
77
- puts
78
-
79
- # Test 4: Test minimal patch loading
80
- puts "4. Testing minimal patch structure..."
81
- begin
82
- # This would normally require the full YfConnection class
83
- # For this test, we'll just verify the patch file loads
84
- require_relative 'lib/yf_as_dataframe/curl_impersonate_integration'
85
- require_relative 'lib/yf_as_dataframe/yf_connection_minimal_patch'
86
-
87
- puts " ✅ Minimal patch files load successfully"
88
- puts " ✅ Integration module is available"
89
- rescue => e
90
- puts " ❌ Error loading minimal patch: #{e.message}"
91
- end
92
-
93
- puts
94
-
95
- # Test 5: Test configuration
96
- puts "5. Testing configuration..."
97
- begin
98
- puts " ✅ Configuration methods available:"
99
- puts " - enable_curl_impersonate"
100
- puts " - enable_curl_impersonate_fallback"
101
- puts " - set_curl_impersonate_timeout"
102
- puts " - get_available_curl_impersonate_executables"
103
-
104
- # Test setting configuration
105
- YfAsDataframe::CurlImpersonateIntegration.curl_impersonate_timeout = 20
106
- puts " ✅ Configuration can be modified"
107
- rescue => e
108
- puts " ❌ Error with configuration: #{e.message}"
109
- end
110
-
111
- puts
112
-
113
- # Test 6: Test command building (without execution)
114
- puts "6. Testing command building..."
115
- begin
116
- executable = YfAsDataframe::CurlImpersonateIntegration.get_random_executable
117
- if executable
118
- # Build a command without executing it
119
- cmd = [executable[:path], "--max-time", "5", "https://httpbin.org/get"]
120
- puts " ✅ Command built: #{cmd.join(' ')}"
121
- else
122
- puts " ❌ Could not build command"
123
- end
124
- rescue => e
125
- puts " ❌ Error building command: #{e.message}"
126
- end
127
-
128
- puts
129
- puts "=== Quick Test Summary ==="
130
- puts "✅ Integration module loads successfully"
131
- puts "✅ Executables are detected"
132
- puts "✅ Environment variable functionality works"
133
- puts "✅ Configuration works"
134
- puts "✅ Patch files load without errors"
135
- puts
136
- puts "The minimal curl-impersonate integration is ready for use!"
137
- puts
138
- puts "To integrate with your code:"
139
- puts "require 'yf_as_dataframe/curl_impersonate_integration'"
140
- puts "require 'yf_as_dataframe/yf_connection_minimal_patch'"
141
- puts
142
- puts "Environment variable support:"
143
- puts "export CURL_IMPERSONATE_DIR='/custom/path' # Optional"
@@ -1,121 +0,0 @@
- #!/usr/bin/env ruby
-
- # Test script for minimal curl-impersonate integration
- # This tests the approach where curl-impersonate is the default behavior
-
- puts "=== Minimal Curl-Impersonate Integration Test ==="
- puts
-
- # Test 1: Check curl-impersonate integration module
- puts "1. Testing curl-impersonate integration module..."
- begin
-   require_relative 'lib/yf_as_dataframe/curl_impersonate_integration'
-
-   executables = YfAsDataframe::CurlImpersonateIntegration.available_executables
-   if executables.empty?
-     puts " ❌ No curl-impersonate executables found!"
-     exit 1
-   else
-     puts " ✅ Found #{executables.length} curl-impersonate executables"
-     puts " Sample: #{executables.first[:executable]} (#{executables.first[:browser]})"
-   end
- rescue => e
-   puts " ❌ Error loading integration module: #{e.message}"
-   exit 1
- end
-
- puts
-
- # Test 2: Test direct curl-impersonate request with short timeout
- puts "2. Testing direct curl-impersonate request..."
- begin
-   response = YfAsDataframe::CurlImpersonateIntegration.make_request(
-     "https://httpbin.org/get",
-     headers: { "User-Agent" => "Test-Agent" },
-     timeout: 10 # 10 second timeout
-   )
-
-   if response && response.success?
-     puts " ✅ Direct curl-impersonate request successful"
-     puts " Response length: #{response.body.length} characters"
-   else
-     puts " ❌ Direct curl-impersonate request failed"
-   end
- rescue => e
-   puts " ❌ Error with direct request: #{e.message}"
- end
-
- puts
-
- # Test 3: Test minimal patch (without full gem)
- puts "3. Testing minimal patch structure..."
- begin
-   # This would normally require the full YfConnection class
-   # For this test, we'll just verify the patch file loads
-   require_relative 'lib/yf_as_dataframe/curl_impersonate_integration'
-   require_relative 'lib/yf_as_dataframe/yf_connection_minimal_patch'
-
-   puts " ✅ Minimal patch files load successfully"
-   puts " ✅ Integration module is available"
- rescue => e
-   puts " ❌ Error loading minimal patch: #{e.message}"
- end
-
- puts
-
- # Test 4: Test configuration methods
- puts "4. Testing configuration methods..."
- begin
-   # Test configuration (these would work with the full YfConnection class)
-   puts " ✅ Configuration methods available:"
-   puts " - enable_curl_impersonate"
-   puts " - enable_curl_impersonate_fallback"
-   puts " - set_curl_impersonate_timeout"
-   puts " - get_available_curl_impersonate_executables"
- rescue => e
-   puts " ❌ Error with configuration: #{e.message}"
- end
-
- puts
-
- # Test 5: Test with Yahoo Finance endpoint with short timeout
- puts "5. Testing Yahoo Finance endpoint..."
- begin
-   response = YfAsDataframe::CurlImpersonateIntegration.make_request(
-     "https://query1.finance.yahoo.com/v8/finance/chart/MSFT",
-     params: { "interval" => "1d", "range" => "1d" },
-     timeout: 15 # 15 second timeout
-   )
-
-   if response && response.success?
-     puts " ✅ Yahoo Finance request successful"
-     puts " Response length: #{response.body.length} characters"
-
-     if response.body.strip.start_with?('{') && response.body.include?('"chart"')
-       puts " ✅ Response appears to be valid Yahoo Finance JSON"
-     else
-       puts " ⚠️ Response format unexpected"
-     end
-   else
-     puts " ❌ Yahoo Finance request failed"
-   end
- rescue => e
-   puts " ❌ Error with Yahoo Finance: #{e.message}"
- end
-
- puts
- puts "=== Test Summary ==="
- puts "The minimal curl-impersonate integration is ready."
- puts
- puts "To use with the full gem:"
- puts "1. Add the two integration files to lib/yf_as_dataframe/"
- puts "2. Add require statements to your code"
- puts "3. Your existing code will automatically use curl-impersonate"
- puts
- puts "Files needed:"
- puts "- lib/yf_as_dataframe/curl_impersonate_integration.rb"
- puts "- lib/yf_as_dataframe/yf_connection_minimal_patch.rb"
- puts
- puts "Integration code:"
- puts "require 'yf_as_dataframe/curl_impersonate_integration'"
- puts "require 'yf_as_dataframe/yf_connection_minimal_patch'"