zipline_polygon_bundle 0.2.0.dev1__py3-none-any.whl → 0.2.3__py3-none-any.whl

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.3
2
2
  Name: zipline_polygon_bundle
3
- Version: 0.2.0.dev1
3
+ Version: 0.2.3
4
4
  Summary: A zipline-reloaded data provider bundle for Polygon.io
5
5
  License: GNU AFFERO GENERAL PUBLIC LICENSE
6
6
  Version 3, 19 November 2007
@@ -666,31 +666,36 @@ License: GNU AFFERO GENERAL PUBLIC LICENSE
666
666
  Keywords: zipline,data-bundle,finance
667
667
  Author: Jim White
668
668
  Author-email: jim@fovi.com
669
- Requires-Python: >=3.9,<4.0
669
+ Requires-Python: >= 3.10,<4.0
670
670
  Classifier: Programming Language :: Python :: 3
671
671
  Classifier: License :: OSI Approved :: GNU Affero General Public License v3
672
672
  Classifier: Operating System :: OS Independent
673
673
  Requires-Dist: bcolz-zipline (>=1.2.11)
674
+ Requires-Dist: filelock (>=3.16.0)
674
675
  Requires-Dist: fsspec (>=2024.10)
675
676
  Requires-Dist: numpy (<2)
676
677
  Requires-Dist: pandas (>=2.2,<3)
677
- Requires-Dist: pandas-market-calendars (>=4.4.2)
678
- Requires-Dist: pandas_ta (>=0.3)
679
678
  Requires-Dist: polygon-api-client (>=1.14.2)
680
679
  Requires-Dist: pyarrow (>=18.1.0,<19)
681
680
  Requires-Dist: pytz (>=2018.5)
682
681
  Requires-Dist: requests (>=2.9.1)
683
682
  Requires-Dist: toolz (>=0.8.2)
684
- Requires-Dist: zipline-reloaded (>=3.1)
683
+ Requires-Dist: zipline-arrow (>=3.2.2)
685
684
  Project-URL: Repository, https://github.com/fovi-llc/zipline-polygon-bundle
686
685
  Description-Content-Type: text/markdown
687
686
 
688
687
  # zipline-polygon-bundle
689
- `zipline-polygon-bundle` is a `zipline-reloaded` (https://github.com/stefan-jansen/zipline-reloaded) data ingestion bundle for [Polygon.io](https://polygon.io/).
688
+ `zipline-polygon-bundle` is a `zipline-arrow` (https://github.com/fovi-llc/zipline-arrow) data ingestion bundle for [Polygon.io](https://polygon.io/).
689
+
690
+ Zipline Arrow is a fork of Zipline Reloaded (`zipline-reloaded`, https://github.com/stefan-jansen/zipline-reloaded). The fork is only required if you want to use Polygon.io trades flatfiles. If you only need Polygon daily or minute aggregate flatfiles, you may prefer `zipline-polygon-bundle<0.2`, which depends on `zipline-reloaded>=3.1`.
690
691
 
691
692
  ## GitHub
692
693
  https://github.com/fovi-llc/zipline-polygon-bundle
693
694
 
695
+ ## PyPI
696
+
697
+ https://pypi.org/project/zipline_polygon_bundle
698
+
694
699
  ## Resources
695
700
 
696
701
  Get a subscription to https://polygon.io/ for an API key and access to flat files.
@@ -707,7 +712,25 @@ Code from *Trading Evolved* with some small updates for convenience: https://git
707
712
 
708
713
  One of the modifications I've made to that code is so that some of the notebooks can be run on Colab with a minimum of fuss: https://github.com/fovi-llc/trading_evolved/blob/main/Chapter%207%20-%20Backtesting%20Trading%20Strategies/First%20Zipline%20Backtest.ipynb
709
714
 
710
- # Ingest data from Polygon.io into Zipline
715
+ # Zipline Reloaded (`zipline-reloaded`) or Zipline Arrow (`zipline-arrow`)?
716
+
717
+ This bundle supports Polygon daily and minute aggregates and now trades too (quotes are coming). Trades are converted to minute and daily aggregates covering all trading hours: premarket, regular market, and after hours. To support those extended hours I needed to change how Zipline handles `get_calendar` for Exchange Calendars (`exchange-calendars`) initialization, so I forked `zipline-reloaded` as `zipline-arrow`. Versions of this package before 0.2 depend on `zipline-reloaded>=3.1` and only support daily and minute flatfiles. Versions >= 0.2 of `zipline-polygon-bundle` depend on `zipline-arrow` and work with daily and minute flatfiles as well as trades flatfiles.
718
+
719
+ # Ingest data from Polygon.io into Zipline using `aws s3` CLI
720
+ Get AWS S3 CLI in the usual way: https://docs.aws.amazon.com/cli/latest/reference/s3/
721
+
722
+ This will download everything, which is currently around 12TB.
723
+ ```bash
724
+ aws s3 sync s3://flatfiles/us_stocks_sip $POLYGON_DATA_DIR/flatfiles/us_stocks_sip --checksum-mode ENABLED --endpoint-url https://files.polygon.io
725
+ ```
726
+
727
+ If you don't need quotes yet (and this bundle doesn't use them yet), then this will be faster, since quotes are about twice as big as trades. Replace `{subdir}` with each subdirectory you need (for example `trades_v1`):
728
+ ```bash
729
+ aws s3 sync s3://flatfiles/us_stocks_sip/{subdir} $POLYGON_DATA_DIR/flatfiles/us_stocks_sip/{subdir} --checksum-mode ENABLED --endpoint-url https://files.polygon.io
730
+ ```
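The per-subdirectory sync can be scripted. The following is a minimal sketch that only builds and prints the `aws s3 sync` command lines, one per subdirectory, using the same flags as the commands above. The helper name `build_sync_commands` is hypothetical, and the subdirectory names other than `trades_v1` (`day_aggs_v1`, `minute_aggs_v1`) are assumptions about the Polygon flatfiles layout that should be checked against your own listing.

```python
import os
import shlex

ENDPOINT_URL = "https://files.polygon.io"


def build_sync_commands(data_dir, subdirs):
    """Return one `aws s3 sync` argument list per flatfiles subdirectory."""
    commands = []
    for subdir in subdirs:
        source = f"s3://flatfiles/us_stocks_sip/{subdir}"
        target = os.path.join(data_dir, "flatfiles", "us_stocks_sip", subdir)
        commands.append(
            ["aws", "s3", "sync", source, target,
             "--checksum-mode", "ENABLED",
             "--endpoint-url", ENDPOINT_URL]
        )
    return commands


# Assumed subdirectory names; only trades_v1 appears in this README.
wanted = ["day_aggs_v1", "minute_aggs_v1", "trades_v1"]
for command in build_sync_commands(os.environ.get("POLYGON_DATA_DIR", "."), wanted):
    # Print rather than run, so the commands can be reviewed first.
    print(shlex.join(command))
```

Each printed line can be pasted into a shell, or the argument lists can be passed to `subprocess.run` once they look right.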
731
+
732
+ # Alternative: Ingest data using `rclone`.
733
+ I've had problems with `rclone` on the larger files for trades and quotes, so I recommend using the `aws s3` CLI instead.
711
734
 
712
735
  ## Set up your `rclone` (https://rclone.org/) configuration
713
736
  ```bash
@@ -742,9 +765,23 @@ register_polygon_equities_bundle(
742
765
  )
743
766
  ```
744
767
 
768
+ ## Cython build setup
769
+
770
+ ```bash
771
+ sudo apt-get update
772
+ sudo apt-get install python3-dev python3-poetry
773
+
774
+ CFLAGS=$(python3-config --includes) pip install git+https://github.com/fovi-llc/zipline-arrow.git
775
+ ```
776
+
777
+
745
778
  ## Install the Zipline Polygon.io Bundle PyPI package and check that it works.
746
779
  Listing bundles will show if everything is working correctly.
747
780
  ```bash
781
+
782
+ pip install -U git+https://github.com/fovi-llc/zipline-reloaded.git@calendar
783
+ pip install -U git+https://github.com/fovi-llc/zipline-polygon-bundle.git
784
+
748
785
  pip install zipline_polygon_bundle
749
786
  zipline -e extension.py bundles
750
787
  ```
@@ -759,7 +796,7 @@ quantopian-quandl <no ingestions>
759
796
 
760
797
  ## Ingest the Polygon.io data. The API key is needed for the split and dividend data.
761
798
 
762
- Note that ingest currently stores cached API data and shuffled agg data in the `POLYGON_DATA_DIR` directory (`flatfiles/us_stocks_sip/api_cache` and `flatfiles/us_stocks_sip/day_by_ticker_v1` respectively) so write access is needed at this stage. After ingestion the data in `POLYGON_DATA_DIR` is not accessed.
799
+ Note that ingest currently stores cached API data and shuffled agg ("by ticker") data in the `$CUSTOM_ASSET_FILES_DIR` directory, which is `$ZIPLINE_ROOT/data/polygon_custom_assets` by default.
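To check where that data will land before ingesting, the default can be resolved by hand. This is a sketch only: it assumes `CUSTOM_ASSET_FILES_DIR` and `ZIPLINE_ROOT` are read as environment variables and that `ZIPLINE_ROOT` falls back to `~/.zipline`. The function name `custom_asset_files_dir` is hypothetical and is not the bundle's own config code.

```python
import os
from pathlib import Path


def custom_asset_files_dir(environ=None):
    """Resolve the custom assets directory from environment-style settings."""
    env = os.environ if environ is None else environ
    # Assumed fallback: Zipline's usual default root of ~/.zipline.
    zipline_root = Path(env.get("ZIPLINE_ROOT", "~/.zipline")).expanduser()
    default_dir = zipline_root / "data" / "polygon_custom_assets"
    # An explicit CUSTOM_ASSET_FILES_DIR (assumed variable) overrides the default.
    return Path(env.get("CUSTOM_ASSET_FILES_DIR", default_dir)).expanduser()


print(custom_asset_files_dir())
```

Passing a dict instead of relying on `os.environ` makes it easy to try out different settings without exporting anything.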
763
800
 
764
801
  ```bash
765
802
  export POLYGON_API_KEY=<your API key here>
@@ -793,6 +830,51 @@ This ingestion for 10 years of minute bars took around 10 hours on my Mac using
793
830
  zipline ingest -b polygon-minute
794
831
  ```
795
832
 
833
+ ## Using trades flat files.
834
+ The trades flatfiles take a lot of space (currently the 22 years of trades take around 4TB) and converting them to minute aggregates takes a fair bit of time. The benefit is that the whole trading day is covered, from premarket open to after hours close. The current conversion logic ignores trade corrections, official close updates, and the TRF "dark pool" trades (because they are not reported when they occurred, nor were they offered on the exchanges). This makes the aggregates as close a simulation of real-time data as we can manage for algo training and backtesting. Details are in the `trades_to_custom_aggs` function in `zipline_polygon_bundle/trades.py`.
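To illustrate the general idea of turning trades into minute aggregates, here is a small standard-library sketch. It is not the bundle's `trades_to_custom_aggs` implementation. The record fields (`sip_timestamp` in nanoseconds, `price`, `size`, `correction`) and the single correction filter are simplifying assumptions, and the function name `trades_to_minute_bars` is hypothetical.

```python
NS_PER_MINUTE = 60 * 1_000_000_000


def trades_to_minute_bars(trades):
    """Aggregate trade dicts into OHLCV bars keyed by minute start (ns)."""
    bars = {}
    # Sort by timestamp so open/close are the first/last trade in each minute.
    for trade in sorted(trades, key=lambda t: t["sip_timestamp"]):
        if trade.get("correction", 0) != 0:
            continue  # Illustrative filter: skip corrected trades.
        window_start = trade["sip_timestamp"] // NS_PER_MINUTE * NS_PER_MINUTE
        price, size = trade["price"], trade["size"]
        bar = bars.get(window_start)
        if bar is None:
            bars[window_start] = {
                "open": price, "high": price, "low": price,
                "close": price, "volume": size, "transactions": 1,
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price
            bar["volume"] += size
            bar["transactions"] += 1
    return bars
```

The real conversion also has to deal with official close updates and TRF trades as described above, which this sketch leaves out.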
835
+
836
+ The conversion process creates `.csv.gz` files in the same format as Polygon flatfiles in the custom assets dir, which is `$ZIPLINE_ROOT/data/polygon_custom_assets` by default. So while `$ZIPLINE_ROOT` needs to be writable, the Polygon flatfiles (`$POLYGON_DATA_DIR`) can be read-only.
837
+
838
+ Get AWS S3 CLI in the usual way: https://docs.aws.amazon.com/cli/latest/reference/s3/
839
+
840
+ ```bash
841
+ aws s3 sync s3://flatfiles/us_stocks_sip/trades_v1 $POLYGON_DATA_DIR/flatfiles/us_stocks_sip/trades_v1 --checksum-mode ENABLED --endpoint-url https://files.polygon.io
842
+ ```
843
+
844
+ ## `extension.py`
845
+
846
+ If you set the `ZIPLINE_ROOT` environment variable (recommended and likely necessary because the default of `~/.zipline` is probably not what you'll want) and copy your `extension.py` config there then you don't need to put `-e extension.py` on the `zipline` command line.
847
+
848
+ If you leave out the `start_date` and/or `end_date` args then `register_polygon_equities_bundle` will scan for the dates of the first and last trade file in `$POLYGON_DATA_DIR` and use them respectively.
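The date scan can be pictured roughly as follows. This is a sketch under assumptions, not the bundle's own code: it supposes the flatfiles are named `YYYY-MM-DD.csv.gz` somewhere under the directory being scanned, and the function name `scan_date_range` is hypothetical.

```python
import datetime
import re
from pathlib import Path

# Assumed file naming: one gzip CSV per trading day, e.g. 2020-01-03.csv.gz.
DATE_FILE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2})\.csv\.gz$")


def scan_date_range(root):
    """Return (first_date, last_date) of dated files under root, or (None, None)."""
    dates = []
    for path in Path(root).rglob("*.csv.gz"):
        match = DATE_FILE_PATTERN.match(path.name)
        if match:
            dates.append(datetime.date.fromisoformat(match.group(1)))
    if not dates:
        return None, None
    return min(dates), max(dates)
```

Files whose names do not match the dated pattern are ignored, so stray files in the tree do not affect the result.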
849
+
850
+ The `NYSE_ALL_HOURS` calendar (defined in `zipline_polygon_bundle/nyse_all_hours_calendar.py`) uses open and close times for the entire trading day from premarket open to after hours close.
851
+
852
+ Right now `agg_time="1min"` is the only supported aggregate duration because Zipline can only deal with day or minute duration aggregates.
853
+
854
+ ```python
855
+ from zipline_polygon_bundle import register_polygon_equities_bundle, register_nyse_all_hours_calendar, NYSE_ALL_HOURS
856
+ from exchange_calendars.calendar_helpers import parse_date
857
+ # from zipline.utils.calendar_utils import get_calendar
858
+
859
+ # Register the NYSE_ALL_HOURS ExchangeCalendar.
860
+ register_nyse_all_hours_calendar()
861
+
862
+ register_polygon_equities_bundle(
863
+ "polygon-trades",
864
+ calendar_name=NYSE_ALL_HOURS,
865
+ # start_date=parse_date("2020-01-03", raise_oob=False),
866
+ # end_date=parse_date("2021-01-29", raise_oob=False),
867
+ agg_time="1min",
868
+ minutes_per_day=16 * 60,
869
+ )
870
+ ```
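The `minutes_per_day=16 * 60` value matches a 16-hour session. As a quick sanity check, and assuming the all-hours session runs from 04:00 to 20:00 Eastern (an assumption here; the authoritative times are in `zipline_polygon_bundle/nyse_all_hours_calendar.py`), the arithmetic looks like this. The helper `session_minutes` is hypothetical.

```python
from datetime import datetime, time


def session_minutes(open_time, close_time):
    """Number of whole minutes between two times on the same day."""
    day = datetime(2000, 1, 3)  # Any date works; only the times matter.
    delta = datetime.combine(day, close_time) - datetime.combine(day, open_time)
    return int(delta.total_seconds() // 60)


# Assumed extended session: premarket open 04:00 to after hours close 20:00.
ALL_HOURS_MINUTES = session_minutes(time(4, 0), time(20, 0))
# Regular NYSE session for comparison: 09:30 to 16:00.
REGULAR_MINUTES = session_minutes(time(9, 30), time(16, 0))

print(ALL_HOURS_MINUTES, REGULAR_MINUTES)
```

The extended session therefore has roughly two and a half times as many minute bars per day as the regular session, which is worth keeping in mind for ingestion time and disk use.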
871
+
872
+ As with the daily and minute aggs, `POLYGON_API_KEY` is needed for the split and dividend data. SID assignment across ticker changes, using the Polygon tickers API data, is also coming.
873
+
874
+ ```bash
875
+ zipline ingest -b polygon-trades
876
+ ```
877
+
796
878
  # License is Affero General Public License v3 (AGPL v3)
797
879
  The content of this project is Copyright (C) 2024 Fovi LLC and authored by James P. White (https://www.linkedin.com/in/jamespaulwhite/). It is distributed under the terms of the GNU AFFERO GENERAL PUBLIC LICENSE (AGPL) Version 3 (See LICENSE file).
798
880
 
@@ -0,0 +1,18 @@
1
+ zipline_polygon_bundle/__init__.py,sha256=KGN5kBi021Eiz_GDtxVRTUdXgYWe6loG_C8XcrVNHrY,1765
2
+ zipline_polygon_bundle/adjustments.py,sha256=4garYK7RUrYyCIhCm0ZqHsk3y2bCt9vHUkWoHvVniTA,8233
3
+ zipline_polygon_bundle/bundle.py,sha256=7f_rpVBhR1XyOJ1e7Lulq1Uh4DWJmHxFQKZNfz9OSgQ,19805
4
+ zipline_polygon_bundle/compute_signals.py,sha256=FxcMuwMmxuvyy45y1avdL_uFEn0B4_2ekcv_B4AyPo0,10115
5
+ zipline_polygon_bundle/concat_all_aggs.py,sha256=Nuj0pytQAVoK8OK7qx5m3jWCV8uJIPsa0XHnmicgSmg,12066
6
+ zipline_polygon_bundle/concat_all_aggs_partitioned.py,sha256=AQq4ai5u5GyclWzQq2C8zIvHl_zjvLiDtxarNejwCQ4,6325
7
+ zipline_polygon_bundle/config.py,sha256=VdgwvnLKeb_WppQI6Rr97GqulEfufjDVww4ulkmlbdU,10474
8
+ zipline_polygon_bundle/nyse_all_hours_calendar.py,sha256=QrwWHm3_sfwrtt1tN5u6rqjTQcwN3qxyhjNGeHdyqcI,698
9
+ zipline_polygon_bundle/polygon_file_reader.py,sha256=TCq6hKlxixwtL57xLxs9GnvH3MMa6aWBI9mi1-PBNHw,3749
10
+ zipline_polygon_bundle/process_all_aggs.py,sha256=MVhb8xn9-DngSNSrRIpMG4XAgHjMXktoqYrxuM9ph-c,3069
11
+ zipline_polygon_bundle/quotes.py,sha256=yFjlPiQXPp0t6w2Bo96VLtYSqITP7WCLwMp5CH3zx1E,4260
12
+ zipline_polygon_bundle/split_aggs_by_ticker.py,sha256=HI_3nuN6E_VCq7LfOj4Dib_qm8wYME-jdXXX4rt-9YI,2150
13
+ zipline_polygon_bundle/tickers_and_names.py,sha256=BjYquIlSBQGd1yDW3m3cGuXKVvUfh_waYwdMR7eAhuM,15402
14
+ zipline_polygon_bundle/trades.py,sha256=XK2ed06ekByAVCimCDtJUIQ3HZaQbfKc0BXC9orHoJg,20192
15
+ zipline_polygon_bundle-0.2.3.dist-info/LICENSE,sha256=hIahDEOTzuHCU5J2nd07LWwkLW7Hko4UFO__ffsvB-8,34523
16
+ zipline_polygon_bundle-0.2.3.dist-info/METADATA,sha256=a7tw9uwGWQ-cmRk3xzc5r8WYZ03276_3NYajuuxcQR4,51908
17
+ zipline_polygon_bundle-0.2.3.dist-info/WHEEL,sha256=fGIA9gx4Qxk2KDKeNJCbOEwSrmLtjWCwzBz351GyrPQ,88
18
+ zipline_polygon_bundle-0.2.3.dist-info/RECORD,,
@@ -1,4 +1,4 @@
1
1
  Wheel-Version: 1.0
2
- Generator: poetry-core 2.1.1
2
+ Generator: poetry-core 2.1.2
3
3
  Root-Is-Purelib: true
4
4
  Tag: py3-none-any
@@ -1,17 +0,0 @@
1
- zipline_polygon_bundle/__init__.py,sha256=CSJfLks6OWBJy7Tur3nzb9Dr0B9erlCl9BejCKl5fZE,1850
2
- zipline_polygon_bundle/adjustments.py,sha256=9RsLzWSDBcoYRfHPTGb5K1czxycDyr7VPR5WCp2XlvA,8622
3
- zipline_polygon_bundle/bundle.py,sha256=7BpaoSt6SxwKhy4ygzJ-LM9W9VKF90RL7y8pAF18VQ0,24112
4
- zipline_polygon_bundle/concat_all_aggs.py,sha256=Yi4fguBSxPeNVMVP7JrA6pGegnN5UVYyGPokllsW26c,7906
5
- zipline_polygon_bundle/concat_all_aggs_partitioned.py,sha256=AQq4ai5u5GyclWzQq2C8zIvHl_zjvLiDtxarNejwCQ4,6325
6
- zipline_polygon_bundle/config.py,sha256=rO-0apuGvZDhrjxIzgJ0tLttU_2ds5EmZasMrxpnubU,9654
7
- zipline_polygon_bundle/nyse_all_hours_calendar.py,sha256=QrwWHm3_sfwrtt1tN5u6rqjTQcwN3qxyhjNGeHdyqcI,698
8
- zipline_polygon_bundle/polygon_file_reader.py,sha256=TCq6hKlxixwtL57xLxs9GnvH3MMa6aWBI9mi1-PBNHw,3749
9
- zipline_polygon_bundle/process_all_aggs.py,sha256=MVhb8xn9-DngSNSrRIpMG4XAgHjMXktoqYrxuM9ph-c,3069
10
- zipline_polygon_bundle/quotes.py,sha256=yFjlPiQXPp0t6w2Bo96VLtYSqITP7WCLwMp5CH3zx1E,4260
11
- zipline_polygon_bundle/split_aggs_by_ticker.py,sha256=HI_3nuN6E_VCq7LfOj4Dib_qm8wYME-jdXXX4rt-9YI,2150
12
- zipline_polygon_bundle/tickers_and_names.py,sha256=BjYquIlSBQGd1yDW3m3cGuXKVvUfh_waYwdMR7eAhuM,15402
13
- zipline_polygon_bundle/trades.py,sha256=rbLjIHwBR26Ke9ptequiBZrXLzoXjNvNgMZJyS8R_OM,38660
14
- zipline_polygon_bundle-0.2.0.dev1.dist-info/LICENSE,sha256=hIahDEOTzuHCU5J2nd07LWwkLW7Hko4UFO__ffsvB-8,34523
15
- zipline_polygon_bundle-0.2.0.dev1.dist-info/METADATA,sha256=sOBzjG30cFIdm7Ps3ChpNHqgr0uVUVyRl-dpM4WqkuE,46814
16
- zipline_polygon_bundle-0.2.0.dev1.dist-info/WHEEL,sha256=XbeZDeTWKc1w7CSIyre5aMDU_-PohRwTQceYnisIYYY,88
17
- zipline_polygon_bundle-0.2.0.dev1.dist-info/RECORD,,