crawlee 0.6.13b41__tar.gz → 1.1.0__tar.gz
This diff compares publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between the two versions as they appear in their respective public registries.
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/build_and_deploy_docs.yaml +3 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/pre_release.yaml +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/release.yaml +3 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/run_code_checks.yaml +4 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/templates_e2e_tests.yaml +3 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.gitignore +1 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/CHANGELOG.md +74 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/CONTRIBUTING.md +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/Makefile +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/PKG-INFO +16 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/README.md +0 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/apify_platform.mdx +1 -1
- crawlee-1.1.0/docs/examples/code_examples/using_browser_profiles_chrome.py +54 -0
- crawlee-1.1.0/docs/examples/code_examples/using_browser_profiles_firefox.py +42 -0
- crawlee-1.1.0/docs/examples/code_examples/using_sitemap_request_loader.py +101 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/playwright_crawler_with_fingerprint_generator.mdx +1 -1
- crawlee-1.1.0/docs/examples/using_browser_profile.mdx +39 -0
- crawlee-1.1.0/docs/examples/using_sitemap_request_loader.mdx +22 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/architecture_overview.mdx +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/avoid_blocking.mdx +1 -1
- crawlee-1.1.0/docs/guides/code_examples/storage_clients/redis_storage_client_basic_example.py +10 -0
- crawlee-1.1.0/docs/guides/code_examples/storage_clients/redis_storage_client_configuration_example.py +27 -0
- crawlee-1.1.0/docs/guides/code_examples/storage_clients/sql_storage_client_basic_example.py +12 -0
- crawlee-1.1.0/docs/guides/code_examples/storage_clients/sql_storage_client_configuration_example.py +33 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/request_loaders.mdx +8 -2
- crawlee-1.1.0/docs/guides/storage_clients.mdx +516 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/trace_and_monitor_crawlers.mdx +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/03_adding_more_urls.mdx +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/09_running_in_cloud.mdx +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/pyproject.toml +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/upgrading/upgrading_to_v1.md +4 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/pyproject.toml +33 -22
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_request.py +32 -20
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_service_locator.py +4 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_types.py +67 -24
- crawlee-1.1.0/src/crawlee/_utils/raise_if_too_many_kwargs.py +12 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/recoverable_state.py +32 -8
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/recurring_task.py +15 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/robots.py +17 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/sitemap.py +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/urls.py +9 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_browser_pool.py +4 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_playwright_browser_controller.py +21 -15
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_playwright_browser_plugin.py +17 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_types.py +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/configuration.py +3 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_abstract_http/_abstract_http_crawler.py +3 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawler.py +38 -14
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_basic/_basic_crawler.py +51 -14
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/_playwright_crawler.py +16 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/events/_event_manager.py +3 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/_header_generator.py +2 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/otel/crawler_instrumentor.py +3 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/pyproject.toml +2 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/requirements.txt +3 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/request_loaders/_sitemap_request_loader.py +22 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/sessions/_session_pool.py +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/statistics/_error_snapshotter.py +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/statistics/_models.py +32 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/statistics/_statistics.py +24 -33
- crawlee-1.1.0/src/crawlee/storage_clients/__init__.py +25 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_base/_storage_client.py +5 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/_dataset_client.py +4 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/_key_value_store_client.py +4 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/_request_queue_client.py +28 -12
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/_storage_client.py +2 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_memory/_dataset_client.py +4 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_memory/_key_value_store_client.py +4 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_memory/_request_queue_client.py +4 -5
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/__init__.py +6 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/_client_mixin.py +295 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/_dataset_client.py +325 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/_key_value_store_client.py +264 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/_request_queue_client.py +586 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/_storage_client.py +146 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/_utils.py +23 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/lua_scripts/atomic_bloom_add_requests.lua +36 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/lua_scripts/atomic_fetch_request.lua +49 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/lua_scripts/atomic_set_add_requests.lua +37 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_redis/lua_scripts/reclaim_stale_requests.lua +34 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/__init__.py +6 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/_client_mixin.py +385 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/_dataset_client.py +310 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/_db_models.py +268 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/_key_value_store_client.py +300 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/_request_queue_client.py +720 -0
- crawlee-1.1.0/src/crawlee/storage_clients/_sql/_storage_client.py +291 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/models.py +10 -10
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storages/_base.py +3 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storages/_dataset.py +5 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storages/_key_value_store.py +11 -6
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storages/_request_queue.py +5 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storages/_storage_instance_manager.py +62 -72
- crawlee-1.1.0/src/crawlee/storages/_utils.py +11 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/e2e/project_template/utils.py +3 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_autoscaling/test_autoscaled_pool.py +2 -0
- crawlee-1.1.0/tests/unit/_autoscaling/test_snapshotter.py +353 -0
- crawlee-1.1.0/tests/unit/_utils/test_raise_if_too_many_kwargs.py +38 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_system.py +8 -6
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/browsers/test_playwright_browser_controller.py +28 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/browsers/test_playwright_browser_plugin.py +10 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/conftest.py +37 -10
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_adaptive_playwright/test_adaptive_playwright_crawler.py +108 -8
- crawlee-1.1.0/tests/unit/crawlers/_adaptive_playwright/test_adaptive_playwright_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_basic/test_basic_crawler.py +192 -14
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_beautifulsoup/test_beautifulsoup_crawler.py +147 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_http/test_http_crawler.py +9 -5
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_parsel/test_parsel_crawler.py +145 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_playwright/test_playwright_crawler.py +143 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/events/test_event_manager.py +12 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/otel/test_crawler_instrumentor.py +9 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/request_loaders/test_sitemap_request_loader.py +35 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/server.py +10 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/server_endpoints.py +11 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/sessions/test_session_pool.py +1 -1
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storage_clients/_file_system/test_fs_dataset_client.py +4 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storage_clients/_file_system/test_fs_kvs_client.py +4 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storage_clients/_file_system/test_fs_rq_client.py +14 -6
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storage_clients/_memory/test_memory_dataset_client.py +3 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storage_clients/_memory/test_memory_kvs_client.py +3 -3
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storage_clients/_memory/test_memory_rq_client.py +3 -3
- crawlee-1.1.0/tests/unit/storage_clients/_redis/test_redis_dataset_client.py +146 -0
- crawlee-1.1.0/tests/unit/storage_clients/_redis/test_redis_kvs_client.py +217 -0
- crawlee-1.1.0/tests/unit/storage_clients/_redis/test_redis_rq_client.py +257 -0
- crawlee-1.1.0/tests/unit/storage_clients/_sql/test_sql_dataset_client.py +236 -0
- crawlee-1.1.0/tests/unit/storage_clients/_sql/test_sql_kvs_client.py +287 -0
- crawlee-1.1.0/tests/unit/storage_clients/_sql/test_sql_rq_client.py +239 -0
- crawlee-1.1.0/tests/unit/storages/conftest.py +39 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storages/test_dataset.py +87 -42
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storages/test_key_value_store.py +120 -39
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storages/test_request_queue.py +141 -40
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storages/test_storage_instance_manager.py +20 -20
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/test_configuration.py +30 -13
- crawlee-1.1.0/uv.lock +4382 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/docusaurus.config.js +8 -4
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/package.json +20 -15
- crawlee-1.1.0/website/src/components/RunnableCodeBlock.jsx +42 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/css/custom.css +4 -2
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/pages/home_page_example.py +14 -9
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/pages/index.js +1 -1
- crawlee-1.1.0/website/static/.nojekyll +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/yarn.lock +1992 -1647
- crawlee-0.6.13b41/docs/guides/storage_clients.mdx +0 -156
- crawlee-0.6.13b41/src/crawlee/storage_clients/__init__.py +0 -9
- crawlee-0.6.13b41/tests/unit/_autoscaling/test_snapshotter.py +0 -333
- crawlee-0.6.13b41/tests/unit/storages/conftest.py +0 -13
- crawlee-0.6.13b41/uv.lock +0 -3652
- crawlee-0.6.13b41/website/src/components/RunnableCodeBlock.jsx +0 -40
- crawlee-0.6.13b41/website/static/img/apify_logo.svg +0 -5
- crawlee-0.6.13b41/website/static/img/apify_og_SDK.png +0 -0
- crawlee-0.6.13b41/website/static/img/apify_sdk.svg +0 -13
- crawlee-0.6.13b41/website/static/img/apify_sdk_white.svg +0 -13
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.editorconfig +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/CODEOWNERS +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/pull_request_template.md +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/check_pr_title.yaml +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.github/workflows/update_new_issue.yaml +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.markdownlint.yaml +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/.pre-commit-config.yaml +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/LICENSE +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/apify/crawler_as_actor_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/apify/get_public_url.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/apify/log_with_config_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/apify/proxy_advanced_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/apify/proxy_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/google/cloud_run_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/code_examples/google/google_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/google_cloud.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/deployment/google_cloud_run.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/add_data_to_dataset.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/beautifulsoup_crawler.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/capture_screenshot_using_playwright.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/capturing_page_snapshots_with_error_snapshotter.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/adaptive_playwright_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/add_data_to_dataset_bs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/add_data_to_dataset_dataset.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/add_data_to_dataset_pw.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/beautifulsoup_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/beautifulsoup_crawler_keep_alive.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/beautifulsoup_crawler_stop.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/capture_screenshot_using_playwright.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/configure_json_logging.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_all_links_on_website_bs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_all_links_on_website_pw.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_multiple_urls_bs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_multiple_urls_pw.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_specific_links_on_website_bs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_specific_links_on_website_pw.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_website_with_relative_links_all_links.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_website_with_relative_links_same_domain.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_website_with_relative_links_same_hostname.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/crawl_website_with_relative_links_same_origin.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/export_entire_dataset_to_file_csv.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/export_entire_dataset_to_file_json.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/extract_and_add_specific_links_on_website_bs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/extract_and_add_specific_links_on_website_pw.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/fill_and_submit_web_form_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/fill_and_submit_web_form_request.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/parsel_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/parsel_crawler_with_error_snapshotter.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/playwright_block_requests.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/playwright_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/playwright_crawler_with_camoufox.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/playwright_crawler_with_error_snapshotter.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/playwright_crawler_with_fingerprint_generator.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/respect_robots_on_skipped_request.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/respect_robots_txt_file.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/code_examples/resuming_paused_crawl.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/crawl_all_links_on_website.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/crawl_multiple_urls.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/crawl_specific_links_on_website.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/crawl_website_with_relative_links.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/crawler_keep_alive.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/crawler_stop.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/export_entire_dataset_to_file.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/fill_and_submit_web_form.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/json_logging.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/parsel_crawler.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/playwright_crawler.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/playwright_crawler_adaptive.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/playwright_crawler_with_block_requests.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/playwright_crawler_with_camoufox.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/respect_robots_txt_file.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/examples/resuming_paused_crawl.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/avoid_blocking/default_fingerprint_generator_with_args.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/avoid_blocking/playwright_with_fingerprint_generator.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/creating_web_archive/manual_archiving_parsel_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/creating_web_archive/manual_archiving_playwright_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/creating_web_archive/simple_pw_through_proxy_pywb_server.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/error_handling/change_handle_error_status.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/error_handling/disable_retry.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/error_handling/handle_proxy_error.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_clients/parsel_curl_impersonate_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_clients/parsel_httpx_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_clients/parsel_impit_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_crawlers/beautifulsoup_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_crawlers/custom_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_crawlers/http_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/http_crawlers/parsel_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/login_crawler/http_login.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/login_crawler/playwright_login.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler/browser_configuration_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler/multiple_launch_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler/plugin_browser_configuration_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler/pre_navigation_hook_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_adaptive/handler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_adaptive/init_beautifulsoup.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_adaptive/init_parsel.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_adaptive/init_prediction.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_adaptive/pre_nav_hooks.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_stagehand/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_stagehand/browser_classes.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_stagehand/stagehand_run.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/playwright_crawler_stagehand/support_classes.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/inspecting_bs_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/inspecting_pw_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/integration_bs_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/integration_pw_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/quick_start_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/session_bs_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/session_pw_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/tiers_bs_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/proxy_management/tiers_pw_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/rl_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/rl_basic_example_with_persist.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/rl_tandem_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/rl_tandem_example_explicit.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/sitemap_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/sitemap_example_with_persist.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/sitemap_tandem_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_loaders/sitemap_tandem_example_explicit.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/adaptive_crawler_handlers.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/basic_request_handlers.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/custom_router_default_only.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/error_handler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/failed_request_handler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/http_pre_navigation.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/playwright_pre_navigation.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/request_router/simple_default_handler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/running_in_web_server/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/running_in_web_server/crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/running_in_web_server/server.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/scaling_crawlers/max_tasks_per_minute_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/scaling_crawlers/min_and_max_concurrency_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_conflicts.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_crawler_configuration.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_crawler_event_manager.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_crawler_storage_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_locator_configuration.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_locator_event_manager.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_locator_storage_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_storage_configuration.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/service_locator/service_storage_storage_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/multi_sessions_http.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/one_session_http.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/sm_basic.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/sm_beautifulsoup.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/sm_http.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/sm_parsel.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/sm_playwright.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/session_management/sm_standalone.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storage_clients/custom_storage_client_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storage_clients/file_system_storage_client_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storage_clients/file_system_storage_client_configuration_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storage_clients/memory_storage_client_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storage_clients/registering_storage_clients_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/cleaning_do_not_purge_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/cleaning_purge_explicitly_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/dataset_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/dataset_with_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/dataset_with_crawler_explicit_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/helper_add_requests_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/helper_enqueue_links_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/kvs_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/kvs_with_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/kvs_with_crawler_explicit_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/opening.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/rq_basic_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/rq_with_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/storages/rq_with_crawler_explicit_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/code_examples/trace_and_monitor_crawlers/instrument_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/crawler_login.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/creating_web_archive.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/error_handling.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/http_clients.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/http_crawlers.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/playwright_crawler.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/playwright_crawler_adaptive.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/playwright_crawler_stagehand.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/proxy_management.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/request_router.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/running_in_web_server.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/scaling_crawlers.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/service_locator.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/session_management.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/guides/storages.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/01_setting_up.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/02_first_crawler.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/04_real_world_project.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/05_crawling.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/06_scraping.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/07_saving_data.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/08_refactoring.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/02_bs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/02_bs_better.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/02_request_queue.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/03_enqueue_strategy.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/03_finding_new_links.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/03_globs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/03_original_code.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/03_transform_request.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/04_sanity_check.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/05_crawling_detail.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/05_crawling_listing.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/06_scraping.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/07_final_code.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/07_first_code.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/08_main.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/08_routes.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/09_apify_sdk.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/code_examples/routes.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/introduction/index.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/quick-start/code_examples/beautifulsoup_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/quick-start/code_examples/parsel_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/quick-start/code_examples/playwright_crawler_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/quick-start/code_examples/playwright_crawler_headful_example.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/quick-start/index.mdx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/docs/upgrading/upgrading_to_v0x.md +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/renovate.json +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_autoscaling/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_autoscaling/_types.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_autoscaling/autoscaled_pool.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_autoscaling/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_autoscaling/snapshotter.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_autoscaling/system_status.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_browserforge_workaround.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_cli.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_consts.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_log_config.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/blocked.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/byte_size.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/console.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/crypto.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/docs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/file.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/globs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/html_to_text.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/models.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/requests.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/system.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/time.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/try_import.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/wait.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/_utils/web.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_browser_controller.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_browser_plugin.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/_playwright_browser.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/browsers/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_abstract_http/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_abstract_http/_abstract_http_parser.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_abstract_http/_http_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_abstract_http/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawler_statistics.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/_rendering_type_predictor.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/_result_comparator.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_adaptive_playwright/_utils.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_basic/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_basic/_basic_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_basic/_context_pipeline.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_basic/_logging_utils.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_basic/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_beautifulsoup/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_beautifulsoup/_beautifulsoup_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_beautifulsoup/_beautifulsoup_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_beautifulsoup/_beautifulsoup_parser.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_beautifulsoup/_utils.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_beautifulsoup/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_http/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_http/_http_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_http/_http_parser.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_parsel/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_parsel/_parsel_crawler.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_parsel/_parsel_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_parsel/_parsel_parser.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_parsel/_utils.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/_playwright_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/_playwright_http_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/_playwright_pre_nav_crawling_context.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/_types.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_playwright/_utils.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/_types.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/crawlers/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/errors.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/events/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/events/_local_event_manager.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/events/_types.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/events/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/_browserforge_adapter.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/_consts.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/_fingerprint_generator.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/_types.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/fingerprint_suite/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/http_clients/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/http_clients/_base.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/http_clients/_curl_impersonate.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/http_clients/_httpx.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/http_clients/_impit.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/otel/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/cookiecutter.json +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/hooks/post_gen_project.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/hooks/pre_gen_project.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/main.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/main_beautifulsoup.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/main_parsel.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/main_playwright.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/main_playwright_camoufox.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/routes_beautifulsoup.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/routes_camoufox.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/routes_parsel.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/routes_playwright.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/templates/routes_playwright_camoufox.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/.dockerignore +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/Dockerfile +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/README.md +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/{{cookiecutter.__package_name}}/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/{{cookiecutter.__package_name}}/__main__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/{{cookiecutter.__package_name}}/main.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/project_template/{{cookiecutter.project_name}}/{{cookiecutter.__package_name}}/routes.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/proxy_configuration.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/request_loaders/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/request_loaders/_request_list.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/request_loaders/_request_loader.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/request_loaders/_request_manager.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/request_loaders/_request_manager_tandem.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/router.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/sessions/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/sessions/_cookies.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/sessions/_models.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/sessions/_session.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/sessions/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/statistics/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/statistics/_error_tracker.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_base/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_base/_dataset_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_base/_key_value_store_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_base/_request_queue_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_base/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/_utils.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_file_system/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_memory/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_memory/_storage_client.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storage_clients/_memory/py.typed +0 -0
- {crawlee-0.6.13b41/src/crawlee/storage_clients → crawlee-1.1.0/src/crawlee/storage_clients/_redis}/py.typed +0 -0
- {crawlee-0.6.13b41/src/crawlee/storages → crawlee-1.1.0/src/crawlee/storage_clients/_sql}/py.typed +0 -0
- /crawlee-0.6.13b41/tests/__init__.py → /crawlee-1.1.0/src/crawlee/storage_clients/py.typed +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/src/crawlee/storages/__init__.py +0 -0
- /crawlee-0.6.13b41/tests/e2e/__init__.py → /crawlee-1.1.0/src/crawlee/storages/py.typed +0 -0
- {crawlee-0.6.13b41/tests/unit → crawlee-1.1.0/tests}/__init__.py +0 -0
- /crawlee-0.6.13b41/tests/unit/crawlers/_adaptive_playwright/test_adaptive_playwright_crawling_context.py → /crawlee-1.1.0/tests/e2e/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/e2e/conftest.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/e2e/project_template/test_static_crawlers_templates.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/README.md +0 -0
- /crawlee-0.6.13b41/website/static/.nojekyll → /crawlee-1.1.0/tests/unit/__init__.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_autoscaling/test_system_status.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_statistics/test_error_tracker.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_statistics/test_periodic_logging.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_statistics/test_persistence.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_statistics/test_request_processing_record.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_byte_size.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_console.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_crypto.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_file.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_globs.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_html_to_text.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_measure_time.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_recurring_task.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_requests.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_robots.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_sitemap.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_timedelata_ms.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/_utils/test_urls.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/browsers/test_browser_pool.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/browsers/test_playwright_browser.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_adaptive_playwright/test_adaptive_playwright_crawler_statistics.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_adaptive_playwright/test_predictor.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/crawlers/_basic/test_context_pipeline.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/events/test_local_event_manager.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/fingerprint_suite/test_adapters.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/fingerprint_suite/test_header_generator.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/http_clients/test_http_clients.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/http_clients/test_httpx.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/proxy_configuration/test_new_proxy_info.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/proxy_configuration/test_tiers.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/request_loaders/test_request_list.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/sessions/test_cookies.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/sessions/test_models.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/sessions/test_session.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/storages/test_request_manager_tandem.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/test_cli.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/test_log_config.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/test_router.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/tests/unit/test_service_locator.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/.eslintrc.json +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/.yarnrc.yml +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/babel.config.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/build_api_reference.sh +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/generate_module_shortcuts.py +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/patches/@docusaurus+core+3.4.0.patch +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/patches/@docusaurus+core+3.5.2.patch +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/roa-loader/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/roa-loader/package.json +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/sidebars.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/ApiLink.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Button.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Button.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/CopyButton.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/CopyButton.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Gradients.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Highlights.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Highlights.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/HomepageCliExample.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/HomepageCliExample.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/HomepageCtaSection.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/HomepageCtaSection.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/HomepageHeroSection.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/HomepageHeroSection.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/LanguageInfoWidget.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/LanguageInfoWidget.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/LanguageSwitch.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/LanguageSwitch.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/RiverSection.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/RiverSection.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/ThreeCardsWithIcon.jsx +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/ThreeCardsWithIcon.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/animated-crawlee-logo-dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/Homepage/animated-crawlee-logo-light.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/components/RunnableCodeBlock.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/pages/index.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/ColorModeToggle/dark-mode-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/ColorModeToggle/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/ColorModeToggle/light-mode-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/ColorModeToggle/styles.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/DocItem/Layout/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/DocItem/Layout/styles.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Footer/LinkItem/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Footer/LinkItem/index.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Footer/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Footer/index.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/MDXComponents/A.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/Content/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/Content/styles.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/Logo/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/Logo/index.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/MobileSidebar/Header/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/MobileSidebar/Header/index.module.css +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/MobileSidebar/Layout/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/MobileSidebar/PrimaryMenu/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/Navbar/MobileSidebar/index.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/src/theme/NavbarItem/ComponentTypes.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/font/lota.woff +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/font/lota.woff2 +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/API.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/arrow_right.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/auto-scaling-dark.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/auto-scaling-light.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/check.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/chrome-scrape-dark.gif +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/chrome-scrape-light.gif +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/cloud_icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/community-dark-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/community-light-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-dark-new.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-javascript-dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-javascript-light.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-light-new.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-light.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-logo-monocolor.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-logo.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-python-dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-python-light.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/crawlee-python-og.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/defaults-dark-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/defaults-light-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/discord-brand-dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/discord-brand.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/docusaurus.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/external-link.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/favicon.ico +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/favorite-tools-dark.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/favorite-tools-light.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/auto-scaling.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/automate-everything.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/fingerprints.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/node-requests.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/runs-on-py.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/storage.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/features/works-everywhere.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/fill-and-submit-web-form/00.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/fill-and-submit-web-form/01.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/fill-and-submit-web-form/02.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/fill-and-submit-web-form/03.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/getting-started/current-price.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/getting-started/scraping-practice.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/getting-started/select-an-element.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/getting-started/selected-element.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/getting-started/sku.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/getting-started/title.jpg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/github-brand-dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/github-brand.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/guides/jaeger_otel_search_view_example.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/guides/jaeger_otel_trace_example.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/hearth copy.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/hearth.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/javascript_logo.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/js_file.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/logo-big.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/logo-blur.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/logo-blur.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/logo-zoom.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/menu-arrows.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/oss_logo.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/puppeteer-live-view-dashboard.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/puppeteer-live-view-detail.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/queue-dark-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/queue-light-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/resuming-paused-crawl/00.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/resuming-paused-crawl/01.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/robot.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/routing-dark-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/routing-light-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/scraping-utils-dark-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/scraping-utils-light-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/smart-proxy-dark.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/smart-proxy-light.webp +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/source_code.png +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/system.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/triangles_dark.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/triangles_light.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/workflow.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/zero-setup-dark-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/img/zero-setup-light-icon.svg +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/js/custom.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/static/robots.txt +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/docs-prettier.config.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/utils/externalLink.js +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/website_gif/chrome-scrape-dark.gif +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/website_gif/chrome-scrape-dark.mp4 +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/website_gif/chrome-scrape-light.gif +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/website_gif/chrome-scrape-light.mp4 +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tools/website_gif/website_gif.mjs +0 -0
- {crawlee-0.6.13b41 → crawlee-1.1.0}/website/tsconfig.eslint.json +0 -0
`.github/workflows/build_and_deploy_docs.yaml`

```diff
@@ -10,7 +10,7 @@ on:
 
 env:
   NODE_VERSION: 20
-  PYTHON_VERSION: 3.
+  PYTHON_VERSION: 3.14
 
 jobs:
   build_and_deploy_docs:
@@ -30,7 +30,7 @@ jobs:
           ref: ${{ github.event_name == 'workflow_call' && inputs.ref || github.ref }}
 
       - name: Set up Node
-        uses: actions/setup-node@
+        uses: actions/setup-node@v6
        with:
          node-version: ${{ env.NODE_VERSION }}
 
@@ -40,7 +40,7 @@ jobs:
          python-version: ${{ env.PYTHON_VERSION }}
 
       - name: Set up uv package manager
-        uses: astral-sh/setup-uv@
+        uses: astral-sh/setup-uv@v7
        with:
          python-version: ${{ env.PYTHON_VERSION }}
 
```
`.github/workflows/release.yaml`

```diff
@@ -47,13 +47,13 @@ jobs:
     name: Lint check
     uses: apify/workflows/.github/workflows/python_lint_check.yaml@main
     with:
-      python-versions: '["3.10", "3.11", "3.12", "3.13"]'
+      python-versions: '["3.10", "3.11", "3.12", "3.13", "3.14"]'
 
   type_check:
     name: Type check
     uses: apify/workflows/.github/workflows/python_type_check.yaml@main
     with:
-      python-versions: '["3.10", "3.11", "3.12", "3.13"]'
+      python-versions: '["3.10", "3.11", "3.12", "3.13", "3.14"]'
 
   unit_tests:
     name: Unit tests
@@ -61,7 +61,7 @@ jobs:
     secrets:
       httpbin_url: ${{ secrets.APIFY_HTTPBIN_TOKEN && format('https://httpbin.apify.actor?token={0}', secrets.APIFY_HTTPBIN_TOKEN) || 'https://httpbin.org'}}
     with:
-      python-versions: '["3.10", "3.11", "3.12", "3.13"]'
+      python-versions: '["3.10", "3.11", "3.12", "3.13", "3.14"]'
 
   update_changelog:
     name: Update changelog
```
`.github/workflows/run_code_checks.yaml`

```diff
@@ -21,13 +21,13 @@ jobs:
     name: Lint check
     uses: apify/workflows/.github/workflows/python_lint_check.yaml@main
     with:
-      python-versions: '["3.10", "3.11", "3.12", "3.13"]'
+      python-versions: '["3.10", "3.11", "3.12", "3.13", "3.14"]'
 
   type_check:
     name: Type check
     uses: apify/workflows/.github/workflows/python_type_check.yaml@main
     with:
-      python-versions: '["3.10", "3.11", "3.12", "3.13"]'
+      python-versions: '["3.10", "3.11", "3.12", "3.13", "3.14"]'
 
   unit_tests:
     name: Unit tests
@@ -35,8 +35,9 @@ jobs:
     secrets:
       httpbin_url: ${{ secrets.APIFY_HTTPBIN_TOKEN && format('https://httpbin.apify.actor?token={0}', secrets.APIFY_HTTPBIN_TOKEN) || 'https://httpbin.org'}}
     with:
-      python-versions: '["3.10", "3.11", "3.12", "3.13"]'
+      python-versions: '["3.10", "3.11", "3.12", "3.13", "3.14"]'
 
   docs_check:
     name: Docs check
     uses: apify/workflows/.github/workflows/python_docs_check.yaml@main
+    secrets: inherit
```
`.github/workflows/templates_e2e_tests.yaml`

```diff
@@ -7,7 +7,7 @@ on:
 
 env:
   NODE_VERSION: 22
-  PYTHON_VERSION: 3.
+  PYTHON_VERSION: 3.14
 
 jobs:
   end_to_end_tests:
@@ -27,7 +27,7 @@ jobs:
         uses: actions/checkout@v5
 
       - name: Setup node
-        uses: actions/setup-node@
+        uses: actions/setup-node@v6
        with:
          node-version: ${{ env.NODE_VERSION }}
 
@@ -44,7 +44,7 @@ jobs:
        run: pipx install poetry
 
       - name: Set up uv package manager
-        uses: astral-sh/setup-uv@
+        uses: astral-sh/setup-uv@v7
        with:
          python-version: ${{ env.PYTHON_VERSION }}
 
```
`CHANGELOG.md`

```diff
@@ -2,8 +2,65 @@
 
 All notable changes to this project will be documented in this file.
 
-
-
+## [1.1.0](https://github.com/apify/crawlee-python/releases/tag/v1.1.0) (2025-11-18)
+
+### 🚀 Features
+
+- Add `chrome` `BrowserType` for `PlaywrightCrawler` to use the Chrome browser ([#1487](https://github.com/apify/crawlee-python/pull/1487)) ([b06937b](https://github.com/apify/crawlee-python/commit/b06937bbc3afe3c936b554bfc503365c1b2c526b)) by [@Mantisus](https://github.com/Mantisus), closes [#1071](https://github.com/apify/crawlee-python/issues/1071)
+- Add `RedisStorageClient` based on Redis v8.0+ ([#1406](https://github.com/apify/crawlee-python/pull/1406)) ([d08d13d](https://github.com/apify/crawlee-python/commit/d08d13d39203c24ab61fe254b0956d6744db3b5f)) by [@Mantisus](https://github.com/Mantisus)
+- Add support for Python 3.14 ([#1553](https://github.com/apify/crawlee-python/pull/1553)) ([89e9130](https://github.com/apify/crawlee-python/commit/89e9130cabee0fbc974b29c26483b7fa0edf627c)) by [@Mantisus](https://github.com/Mantisus)
+- Add `transform_request_function` parameter for `SitemapRequestLoader` ([#1525](https://github.com/apify/crawlee-python/pull/1525)) ([dc90127](https://github.com/apify/crawlee-python/commit/dc901271849b239ba2a947e8ebff8e1815e8c4fb)) by [@Mantisus](https://github.com/Mantisus)
+
+### 🐛 Bug Fixes
+
+- Improve indexing of the `request_queue_records` table for `SqlRequestQueueClient` ([#1527](https://github.com/apify/crawlee-python/pull/1527)) ([6509534](https://github.com/apify/crawlee-python/commit/65095346a9d8b703b10c91e0510154c3c48a4176)) by [@Mantisus](https://github.com/Mantisus), closes [#1526](https://github.com/apify/crawlee-python/issues/1526)
+- Improve error handling for `RobotsTxtFile.load` ([#1524](https://github.com/apify/crawlee-python/pull/1524)) ([596a311](https://github.com/apify/crawlee-python/commit/596a31184914a254b3e7a81fd2f48ea8eda7db49)) by [@Mantisus](https://github.com/Mantisus)
+- Fix `crawler_runtime` not being updated during run and only in the end ([#1540](https://github.com/apify/crawlee-python/pull/1540)) ([0d6c3f6](https://github.com/apify/crawlee-python/commit/0d6c3f6d3337ddb6cab4873747c28cf95605d550)) by [@Pijukatel](https://github.com/Pijukatel), closes [#1541](https://github.com/apify/crawlee-python/issues/1541)
+- Ensure persist state event emission when exiting `EventManager` context ([#1562](https://github.com/apify/crawlee-python/pull/1562)) ([6a44f17](https://github.com/apify/crawlee-python/commit/6a44f172600cbcacebab899082d6efc9105c4e03)) by [@Pijukatel](https://github.com/Pijukatel), closes [#1560](https://github.com/apify/crawlee-python/issues/1560)
+
+
+## [1.0.4](https://github.com/apify/crawlee-python/releases/tag/v1.0.4) (2025-10-24)
+
+### 🐛 Bug Fixes
+
+- Respect `enqueue_strategy` in `enqueue_links` ([#1505](https://github.com/apify/crawlee-python/pull/1505)) ([6ee04bc](https://github.com/apify/crawlee-python/commit/6ee04bc08c50a70f2e956a79d4ce5072a726c3a8)) by [@Mantisus](https://github.com/Mantisus), closes [#1504](https://github.com/apify/crawlee-python/issues/1504)
+- Exclude incorrect links before checking `robots.txt` ([#1502](https://github.com/apify/crawlee-python/pull/1502)) ([3273da5](https://github.com/apify/crawlee-python/commit/3273da5fee62ec9254666b376f382474c3532a56)) by [@Mantisus](https://github.com/Mantisus), closes [#1499](https://github.com/apify/crawlee-python/issues/1499)
+- Resolve compatibility issue between `SqlStorageClient` and `AdaptivePlaywrightCrawler` ([#1496](https://github.com/apify/crawlee-python/pull/1496)) ([ce172c4](https://github.com/apify/crawlee-python/commit/ce172c425a8643a1d4c919db4f5e5a6e47e91deb)) by [@Mantisus](https://github.com/Mantisus), closes [#1495](https://github.com/apify/crawlee-python/issues/1495)
+- Fix `BasicCrawler` statistics persistence ([#1490](https://github.com/apify/crawlee-python/pull/1490)) ([1eb1c19](https://github.com/apify/crawlee-python/commit/1eb1c19aa6f9dda4a0e3f7eda23f77a554f95076)) by [@Pijukatel](https://github.com/Pijukatel), closes [#1501](https://github.com/apify/crawlee-python/issues/1501)
+- Save context state in result for `AdaptivePlaywrightCrawler` after isolated processing in `SubCrawler` ([#1488](https://github.com/apify/crawlee-python/pull/1488)) ([62b7c70](https://github.com/apify/crawlee-python/commit/62b7c70b54085fc65a660062028014f4502beba9)) by [@Mantisus](https://github.com/Mantisus), closes [#1483](https://github.com/apify/crawlee-python/issues/1483)
+
+
+## [1.0.3](https://github.com/apify/crawlee-python/releases/tag/v1.0.3) (2025-10-17)
+
+### 🐛 Bug Fixes
+
+- Add support for Pydantic v2.12 ([#1471](https://github.com/apify/crawlee-python/pull/1471)) ([35c1108](https://github.com/apify/crawlee-python/commit/35c110878c2f445a2866be2522ea8703e9b371dd)) by [@Mantisus](https://github.com/Mantisus), closes [#1464](https://github.com/apify/crawlee-python/issues/1464)
+- Fix database version warning message ([#1485](https://github.com/apify/crawlee-python/pull/1485)) ([18a545e](https://github.com/apify/crawlee-python/commit/18a545ee8add92e844acd0068f9cb8580a82e1c9)) by [@Mantisus](https://github.com/Mantisus)
+- Fix `reclaim_request` in `SqlRequestQueueClient` to correctly update the request state ([#1486](https://github.com/apify/crawlee-python/pull/1486)) ([1502469](https://github.com/apify/crawlee-python/commit/150246957f8f7f1ceb77bb77e3a02a903c50cae1)) by [@Mantisus](https://github.com/Mantisus), closes [#1484](https://github.com/apify/crawlee-python/issues/1484)
+- Fix `KeyValueStore.auto_saved_value` failing in some scenarios ([#1438](https://github.com/apify/crawlee-python/pull/1438)) ([b35dee7](https://github.com/apify/crawlee-python/commit/b35dee78180e57161b826641d45a61b8d8f6ef51)) by [@Pijukatel](https://github.com/Pijukatel), closes [#1354](https://github.com/apify/crawlee-python/issues/1354)
+
+
+## [1.0.2](https://github.com/apify/crawlee-python/releases/tag/v1.0.2) (2025-10-08)
+
+### 🐛 Bug Fixes
+
+- Use Self type in the open() method of storage clients ([#1462](https://github.com/apify/crawlee-python/pull/1462)) ([4ec6f6c](https://github.com/apify/crawlee-python/commit/4ec6f6c08f81632197f602ff99151338b3eba6e7)) by [@janbuchar](https://github.com/janbuchar)
+- Add storages name validation ([#1457](https://github.com/apify/crawlee-python/pull/1457)) ([84de11a](https://github.com/apify/crawlee-python/commit/84de11a3a603503076f5b7df487c9abab68a9015)) by [@Mantisus](https://github.com/Mantisus), closes [#1434](https://github.com/apify/crawlee-python/issues/1434)
+- Pin pydantic version to <2.12.0 to avoid compatibility issues ([#1467](https://github.com/apify/crawlee-python/pull/1467)) ([f11b86f](https://github.com/apify/crawlee-python/commit/f11b86f7ed57f98e83dc1b52f15f2017a919bf59)) by [@vdusek](https://github.com/vdusek)
+
+
+## [1.0.1](https://github.com/apify/crawlee-python/releases/tag/v1.0.1) (2025-10-06)
+
+### 🐛 Bug Fixes
+
+- Fix memory leak in `PlaywrightCrawler` on browser context creation ([#1446](https://github.com/apify/crawlee-python/pull/1446)) ([bb181e5](https://github.com/apify/crawlee-python/commit/bb181e58d8070fba38e62d6e57fe981a00e5f035)) by [@Pijukatel](https://github.com/Pijukatel), closes [#1443](https://github.com/apify/crawlee-python/issues/1443)
+- Update templates to handle optional httpx client ([#1440](https://github.com/apify/crawlee-python/pull/1440)) ([c087efd](https://github.com/apify/crawlee-python/commit/c087efd39baedf46ca3e5cae1ddc1acd6396e6c1)) by [@Pijukatel](https://github.com/Pijukatel)
+
+
+## [1.0.0](https://github.com/apify/crawlee-python/releases/tag/v1.0.0) (2025-09-29)
+
+- Check out the [Release blog post](https://crawlee.dev/blog/crawlee-for-python-v1) for more details.
+- Check out the [Upgrading guide](https://crawlee.dev/python/docs/upgrading/upgrading-to-v1) to ensure a smooth update.
 
 ### 🚀 Features
 
@@ -20,6 +77,8 @@ All notable changes to this project will be documented in this file.
 - Persist `DefaultRenderingTypePredictor` state ([#1340](https://github.com/apify/crawlee-python/pull/1340)) ([fad4c25](https://github.com/apify/crawlee-python/commit/fad4c25fc712915c4a45b24e3290b6f5dbd8a683)) by [@Mantisus](https://github.com/Mantisus), closes [#1272](https://github.com/apify/crawlee-python/issues/1272)
 - Persist the `SitemapRequestLoader` state ([#1347](https://github.com/apify/crawlee-python/pull/1347)) ([27ef9ad](https://github.com/apify/crawlee-python/commit/27ef9ad194552ea9f1321d91a7a52054be9a8a51)) by [@Mantisus](https://github.com/Mantisus), closes [#1269](https://github.com/apify/crawlee-python/issues/1269)
 - Add support for NDU storages ([#1401](https://github.com/apify/crawlee-python/pull/1401)) ([5dbd212](https://github.com/apify/crawlee-python/commit/5dbd212663e7abc37535713f4c6e3a5bbf30a12e)) by [@vdusek](https://github.com/vdusek), closes [#1175](https://github.com/apify/crawlee-python/issues/1175)
+- Add RQ id, name, alias args to `add_requests` and `enqueue_links` methods ([#1413](https://github.com/apify/crawlee-python/pull/1413)) ([1cae2bc](https://github.com/apify/crawlee-python/commit/1cae2bca0b1508fcb3cb419dc239caf33e20a7ef)) by [@Mantisus](https://github.com/Mantisus), closes [#1402](https://github.com/apify/crawlee-python/issues/1402)
+- Add `SqlStorageClient` based on `sqlalchemy` v2+ ([#1339](https://github.com/apify/crawlee-python/pull/1339)) ([07c75a0](https://github.com/apify/crawlee-python/commit/07c75a078b443b58bfaaeb72eb2aa1439458dc47)) by [@Mantisus](https://github.com/Mantisus), closes [#307](https://github.com/apify/crawlee-python/issues/307)
 
 ### 🐛 Bug Fixes
 
@@ -31,6 +90,8 @@ All notable changes to this project will be documented in this file.
 - Include reason in the session rotation warning logs ([#1363](https://github.com/apify/crawlee-python/pull/1363)) ([d6d7a45](https://github.com/apify/crawlee-python/commit/d6d7a45dd64a906419d9552c45062d726cbb1a0f)) by [@vdusek](https://github.com/vdusek), closes [#1318](https://github.com/apify/crawlee-python/issues/1318)
 - Improve crawler statistics logging ([#1364](https://github.com/apify/crawlee-python/pull/1364)) ([1eb6da5](https://github.com/apify/crawlee-python/commit/1eb6da5dd85870124593dcad877284ccaed9c0ce)) by [@vdusek](https://github.com/vdusek), closes [#1317](https://github.com/apify/crawlee-python/issues/1317)
 - Do not add a request that is already in progress to `MemoryRequestQueueClient` ([#1384](https://github.com/apify/crawlee-python/pull/1384)) ([3af326c](https://github.com/apify/crawlee-python/commit/3af326c9dfa8fffd56a42ca42981374613739e39)) by [@Mantisus](https://github.com/Mantisus), closes [#1383](https://github.com/apify/crawlee-python/issues/1383)
+- Save `RequestQueueState` for `FileSystemRequestQueueClient` in default KVS ([#1411](https://github.com/apify/crawlee-python/pull/1411)) ([6ee60a0](https://github.com/apify/crawlee-python/commit/6ee60a08ac1f9414e1b792f4935cc3799cb5089a)) by [@Mantisus](https://github.com/Mantisus), closes [#1410](https://github.com/apify/crawlee-python/issues/1410)
+- Set default desired concurrency for non-browser crawlers to 10 ([#1419](https://github.com/apify/crawlee-python/pull/1419)) ([1cc9401](https://github.com/apify/crawlee-python/commit/1cc940197600d2539bda967880d7f9d241eb8c3e)) by [@vdusek](https://github.com/vdusek)
 
 ### Refactor
 
@@ -43,7 +104,6 @@ All notable changes to this project will be documented in this file.
 - [**breaking**] Refactor storage creation and caching, configuration and services ([#1386](https://github.com/apify/crawlee-python/pull/1386)) ([04649bd](https://github.com/apify/crawlee-python/commit/04649bde60d46b2bc18ae4f6e3fd9667d02a9cef)) by [@Pijukatel](https://github.com/Pijukatel), closes [#1379](https://github.com/apify/crawlee-python/issues/1379)
 
 
-<!-- git-cliff-unreleased-end -->
 
 ## [0.6.12](https://github.com/apify/crawlee-python/releases/tag/v0.6.12) (2025-07-30)
 
@@ -184,6 +244,9 @@ All notable changes to this project will be documented in this file.
 
 ## [0.6.0](https://github.com/apify/crawlee-python/releases/tag/v0.6.0) (2025-03-03)
 
+- Check out the [Release blog post](https://crawlee.dev/blog/crawlee-for-python-v06) for more details.
+- Check out the [Upgrading guide](https://crawlee.dev/python/docs/upgrading/upgrading-to-v0x#upgrading-to-v06) to ensure a smooth update.
+
 ### 🚀 Features
 
 - Integrate browserforge fingerprints ([#829](https://github.com/apify/crawlee-python/pull/829)) ([2b156b4](https://github.com/apify/crawlee-python/commit/2b156b4ba688f9111195422e6058dff30eb1f782)) by [@Pijukatel](https://github.com/Pijukatel), closes [#549](https://github.com/apify/crawlee-python/issues/549)
@@ -222,7 +285,7 @@ All notable changes to this project will be documented in this file.
 
 ### 🐛 Bug Fixes
 
-- Fix session
+- Fix session management with retire ([#947](https://github.com/apify/crawlee-python/pull/947)) ([caee03f](https://github.com/apify/crawlee-python/commit/caee03fe3a43cc1d7a8d3f9e19b42df1bdb1c0aa)) by [@Mantisus](https://github.com/Mantisus)
 - Fix templates - poetry-plugin-export version and camoufox template name ([#952](https://github.com/apify/crawlee-python/pull/952)) ([7addea6](https://github.com/apify/crawlee-python/commit/7addea6605359cceba208e16ec9131724bdb3e9b)) by [@Pijukatel](https://github.com/Pijukatel), closes [#951](https://github.com/apify/crawlee-python/issues/951)
 - Fix convert relative link to absolute in `enqueue_links` for response with redirect ([#956](https://github.com/apify/crawlee-python/pull/956)) ([694102e](https://github.com/apify/crawlee-python/commit/694102e163bb9021a4830d2545d153f6f8f3de90)) by [@Mantisus](https://github.com/Mantisus), closes [#955](https://github.com/apify/crawlee-python/issues/955)
 - Fix `CurlImpersonateHttpClient` cookies handler ([#946](https://github.com/apify/crawlee-python/pull/946)) ([ed415c4](https://github.com/apify/crawlee-python/commit/ed415c433da2a40b0ee62534f0730d0737e991b8)) by [@Mantisus](https://github.com/Mantisus)
@@ -264,6 +327,9 @@ All notable changes to this project will be documented in this file.
 
 ## [0.5.0](https://github.com/apify/crawlee-python/releases/tag/v0.5.0) (2025-01-02)
 
+- Check out the [Release blog post](https://crawlee.dev/blog/crawlee-for-python-v05) for more details.
+- Check out the [Upgrading guide](https://crawlee.dev/python/docs/upgrading/upgrading-to-v0x#upgrading-to-v05) to ensure a smooth update.
+
 ### 🚀 Features
 
 - Add possibility to use None as no proxy in tiered proxies ([#760](https://github.com/apify/crawlee-python/pull/760)) ([0fbd017](https://github.com/apify/crawlee-python/commit/0fbd01723b9fe2e3410e0f358cab2f22848b08d0)) by [@Pijukatel](https://github.com/Pijukatel), closes [#687](https://github.com/apify/crawlee-python/issues/687)
@@ -355,6 +421,8 @@ All notable changes to this project will be documented in this file.
 
 ## [0.4.0](https://github.com/apify/crawlee-python/releases/tag/v0.4.0) (2024-11-01)
 
+- Check out the [Upgrading guide](https://crawlee.dev/python/docs/upgrading/upgrading-to-v0x#upgrading-to-v04) to ensure a smooth update.
+
 ### 🚀 Features
 
 - [**breaking**] Add headers in unique key computation ([#609](https://github.com/apify/crawlee-python/pull/609)) ([6c4746f](https://github.com/apify/crawlee-python/commit/6c4746fa8ff86952a812b32a1d70dc910e76b43e)) by [@Prathamesh010](https://github.com/Prathamesh010), closes [#548](https://github.com/apify/crawlee-python/issues/548)
@@ -464,6 +532,8 @@ All notable changes to this project will be documented in this file.
 
 ## [0.3.0](https://github.com/apify/crawlee-python/releases/tag/v0.3.0) (2024-08-27)
 
+- Check out the [Upgrading guide](https://crawlee.dev/python/docs/upgrading/upgrading-to-v0x#upgrading-to-v03) to ensure a smooth update.
+
 ### 🚀 Features
 
 - Implement ParselCrawler that adds support for Parsel ([#348](https://github.com/apify/crawlee-python/pull/348)) ([a3832e5](https://github.com/apify/crawlee-python/commit/a3832e527f022f32cce4a80055da3b7967b74522)) by [@asymness](https://github.com/asymness), closes [#335](https://github.com/apify/crawlee-python/issues/335)
```
`CONTRIBUTING.md`

```diff
@@ -103,7 +103,7 @@ make run-docs
 Publishing new versions to [PyPI](https://pypi.org/project/crawlee) is automated through GitHub Actions.
 
 - **Beta releases**: On each commit to the master branch, a new beta release is automatically published. The version number is determined based on the latest release and conventional commits. The beta version suffix is incremented by 1 from the last beta release on PyPI.
-- **Stable releases**: A stable version release may be created by triggering the `release` GitHub Actions workflow. The version number is determined based on the latest release and conventional commits (`auto` release type), or it may be
+- **Stable releases**: A stable version release may be created by triggering the `release` GitHub Actions workflow. The version number is determined based on the latest release and conventional commits (`auto` release type), or it may be overridden using the `custom` release type.
 
 ### Publishing to PyPI manually
 
```
`Makefile`

```diff
@@ -38,7 +38,7 @@ unit-tests-cov:
	uv run pytest --numprocesses=auto -vv --cov=src/crawlee --cov-append --cov-report=html tests/unit -m "not run_alone"
 
 e2e-templates-tests $(args):
-	uv run pytest --numprocesses=$(E2E_TESTS_CONCURRENCY) -vv tests/e2e/project_template "$(args)"
+	uv run pytest --numprocesses=$(E2E_TESTS_CONCURRENCY) -vv tests/e2e/project_template "$(args)" --timeout=600
 
 format:
	uv run ruff check --fix
```
`PKG-INFO`

```diff
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: crawlee
-Version:
+Version: 1.1.0
 Summary: Crawlee for Python
 Project-URL: Apify Homepage, https://apify.com
 Project-URL: Changelog, https://crawlee.dev/python/docs/changelog
@@ -223,15 +223,16 @@ Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
+Classifier: Programming Language :: Python :: 3.14
 Classifier: Topic :: Software Development :: Libraries
 Requires-Python: >=3.10
 Requires-Dist: cachetools>=5.5.0
 Requires-Dist: colorama>=0.4.0
-Requires-Dist: impit>=0.
+Requires-Dist: impit>=0.8.0
 Requires-Dist: more-itertools>=10.2.0
 Requires-Dist: protego>=0.5.0
 Requires-Dist: psutil>=6.0.0
-Requires-Dist: pydantic-settings
+Requires-Dist: pydantic-settings>=2.12.0
 Requires-Dist: pydantic>=2.11.0
 Requires-Dist: pyee>=9.0.0
 Requires-Dist: tldextract>=5.1.0
@@ -244,7 +245,9 @@ Requires-Dist: jaro-winkler>=2.0.3; extra == 'adaptive-crawler'
 Requires-Dist: playwright>=1.27.0; extra == 'adaptive-crawler'
 Requires-Dist: scikit-learn>=1.6.0; extra == 'adaptive-crawler'
 Provides-Extra: all
+Requires-Dist: aiosqlite>=0.21.0; extra == 'all'
 Requires-Dist: apify-fingerprint-datapoints>=0.0.2; extra == 'all'
+Requires-Dist: asyncpg>=0.24.0; (python_version < '3.14') and extra == 'all'
 Requires-Dist: beautifulsoup4[lxml]>=4.12.0; extra == 'all'
 Requires-Dist: browserforge>=1.2.3; extra == 'all'
 Requires-Dist: cookiecutter>=2.6.0; extra == 'all'
@@ -261,8 +264,10 @@ Requires-Dist: opentelemetry-sdk>=1.34.1; extra == 'all'
 Requires-Dist: opentelemetry-semantic-conventions>=0.54; extra == 'all'
 Requires-Dist: parsel>=1.10.0; extra == 'all'
 Requires-Dist: playwright>=1.27.0; extra == 'all'
+Requires-Dist: redis[hiredis]>=7.0.0; extra == 'all'
 Requires-Dist: rich>=13.9.0; extra == 'all'
 Requires-Dist: scikit-learn>=1.6.0; extra == 'all'
+Requires-Dist: sqlalchemy[asyncio]<3.0.0,>=2.0.0; extra == 'all'
 Requires-Dist: typer>=0.12.0; extra == 'all'
 Requires-Dist: wrapt>=1.17.0; extra == 'all'
 Provides-Extra: beautifulsoup
@@ -293,6 +298,14 @@ Provides-Extra: playwright
 Requires-Dist: apify-fingerprint-datapoints>=0.0.2; extra == 'playwright'
 Requires-Dist: browserforge>=1.2.3; extra == 'playwright'
 Requires-Dist: playwright>=1.27.0; extra == 'playwright'
+Provides-Extra: redis
+Requires-Dist: redis[hiredis]>=7.0.0; extra == 'redis'
+Provides-Extra: sql-postgres
+Requires-Dist: asyncpg>=0.24.0; (python_version < '3.14') and extra == 'sql-postgres'
+Requires-Dist: sqlalchemy[asyncio]<3.0.0,>=2.0.0; extra == 'sql-postgres'
+Provides-Extra: sql-sqlite
+Requires-Dist: aiosqlite>=0.21.0; extra == 'sql-sqlite'
+Requires-Dist: sqlalchemy[asyncio]<3.0.0,>=2.0.0; extra == 'sql-sqlite'
 Description-Content-Type: text/markdown
 
 <h1 align="center">
@@ -327,8 +340,6 @@ Description-Content-Type: text/markdown
 
 Crawlee covers your crawling and scraping end-to-end and **helps you build reliable scrapers. Fast.**
 
-> 🚀 Crawlee for Python is open to early adopters!
-
 Your crawlers will appear almost human-like and fly under the radar of modern bot protections even with the default configuration. Crawlee gives you the tools to crawl the web for links, scrape data and persistently store it in machine-readable formats, without having to worry about the technical details. And thanks to rich configuration options, you can tweak almost any aspect of Crawlee to suit your project's needs if the default settings don't cut it.
 
 > 👉 **View full documentation, guides and examples on the [Crawlee project website](https://crawlee.dev/python/)** 👈
```
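In the metadata above, `asyncpg` carries the PEP 508 environment marker `python_version < '3.14'`, so the dependency is skipped on Python 3.14. A minimal sketch of what that comparison means; `python_version_lt` is a hypothetical helper for illustration (real marker evaluation, e.g. in pip, is far more general):

```python
def _as_tuple(version: str) -> tuple[int, ...]:
    """Split a dotted version like '3.14' into comparable integer parts."""
    return tuple(int(part) for part in version.split('.'))


def python_version_lt(current: str, bound: str) -> bool:
    """Simplified stand-in for the `python_version < '3.14'` marker above."""
    return _as_tuple(current) < _as_tuple(bound)


# On Python 3.13 the marker holds, so asyncpg is pulled in by the extra:
print(python_version_lt('3.13', '3.14'))  # -> True
# On Python 3.14 it does not, and the dependency is skipped:
print(python_version_lt('3.14', '3.14'))  # -> False
# Note that plain string comparison would get this wrong: '3.9' > '3.14'.
print(python_version_lt('3.9', '3.14'))  # -> True
```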
`README.md`

```diff
@@ -30,8 +30,6 @@
 
 Crawlee covers your crawling and scraping end-to-end and **helps you build reliable scrapers. Fast.**
 
-> 🚀 Crawlee for Python is open to early adopters!
-
 Your crawlers will appear almost human-like and fly under the radar of modern bot protections even with the default configuration. Crawlee gives you the tools to crawl the web for links, scrape data and persistently store it in machine-readable formats, without having to worry about the technical details. And thanks to rich configuration options, you can tweak almost any aspect of Crawlee to suit your project's needs if the default settings don't cut it.
 
 > 👉 **View full documentation, guides and examples on the [Crawlee project website](https://crawlee.dev/python/)** 👈
```
`docs/deployment/apify_platform.mdx`

```diff
@@ -99,7 +99,7 @@ apify run
 For running Crawlee code as an Actor on [Apify platform](https://apify.com/actors) you need to wrap the body of the main function of your crawler with `async with Actor`.
 
 :::info NOTE
-Adding `async with Actor` is the only important thing needed to run it on Apify platform as an Actor. It is needed to initialize your Actor (e.g. to set the correct storage implementation) and to correctly handle
+Adding `async with Actor` is the only important thing needed to run it on Apify platform as an Actor. It is needed to initialize your Actor (e.g. to set the correct storage implementation) and to correctly handle exiting the process.
 :::
 
 Let's look at the `BeautifulSoupCrawler` example from the [Quick start](../quick-start) guide:
```
`docs/examples/code_examples/using_browser_profiles_chrome.py` (new file)

```diff
@@ -0,0 +1,54 @@
+import asyncio
+import shutil
+from pathlib import Path
+from tempfile import TemporaryDirectory
+
+from crawlee.crawlers import PlaywrightCrawler, PlaywrightCrawlingContext
+
+# Profile name to use (usually 'Default' for single profile setups)
+PROFILE_NAME = 'Default'
+
+# Paths to Chrome profiles in your system (example for Windows)
+# Use `chrome://version/` to find your profile path
+PROFILE_PATH = Path(Path.home(), 'AppData', 'Local', 'Google', 'Chrome', 'User Data')
+
+
+async def main() -> None:
+    # Create a temporary folder to copy the profile to
+    with TemporaryDirectory(prefix='crawlee-') as tmpdirname:
+        tmp_profile_dir = Path(tmpdirname)
+
+        # Copy the profile to a temporary folder
+        shutil.copytree(
+            PROFILE_PATH / PROFILE_NAME,
+            tmp_profile_dir / PROFILE_NAME,
+            dirs_exist_ok=True,
+        )
+
+        crawler = PlaywrightCrawler(
+            headless=False,
+            # Use the installed Chrome browser
+            browser_type='chrome',
+            # Disable fingerprints to preserve profile identity
+            fingerprint_generator=None,
+            # Set user data directory to temp folder
+            user_data_dir=tmp_profile_dir,
+            browser_launch_options={
+                # Slow down actions to mimic human behavior
+                'slow_mo': 200,
+                'args': [
+                    # Use the specified profile
+                    f'--profile-directory={PROFILE_NAME}',
+                ],
+            },
+        )
+
+        @crawler.router.default_handler
+        async def default_handler(context: PlaywrightCrawlingContext) -> None:
+            context.log.info(f'Visiting {context.request.url}')
+
+        await crawler.run(['https://crawlee.dev/'])
+
+
+if __name__ == '__main__':
+    asyncio.run(main())
```
`docs/examples/code_examples/using_browser_profiles_firefox.py` (new file)

```diff
@@ -0,0 +1,42 @@
+import asyncio
+from pathlib import Path
+
+from crawlee.crawlers import PlaywrightCrawler, PlaywrightCrawlingContext
+
+# Replace this with your actual Firefox profile name
+# Find it at about:profiles in Firefox
+PROFILE_NAME = 'your-profile-name-here'
+
+# Paths to Firefox profiles in your system (example for Windows)
+# Use `about:profiles` to find your profile path
+PROFILE_PATH = Path(
+    Path.home(), 'AppData', 'Roaming', 'Mozilla', 'Firefox', 'Profiles', PROFILE_NAME
+)
+
+
+async def main() -> None:
+    crawler = PlaywrightCrawler(
+        # Use Firefox browser type
+        browser_type='firefox',
+        # Disable fingerprints to use the profile as is
+        fingerprint_generator=None,
+        headless=False,
+        # Path to your Firefox profile
+        user_data_dir=PROFILE_PATH,
+        browser_launch_options={
+            'args': [
+                # Required to avoid version conflicts
+                '--allow-downgrade'
+            ]
+        },
+    )
+
+    @crawler.router.default_handler
+    async def default_handler(context: PlaywrightCrawlingContext) -> None:
+        context.log.info(f'Visiting {context.request.url}')
+
+    await crawler.run(['https://crawlee.dev/'])
+
+
+if __name__ == '__main__':
+    asyncio.run(main())
```
@@ -0,0 +1,101 @@
+import asyncio
+from collections.abc import Callable
+
+from yarl import URL
+
+from crawlee import RequestOptions, RequestTransformAction
+from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext
+from crawlee.http_clients import ImpitHttpClient
+from crawlee.request_loaders import SitemapRequestLoader
+
+
+# Create a transform_request_function that maps request options based on the host in
+# the URL
+def create_transform_request(
+    data_mapper: dict[str, dict],
+) -> Callable[[RequestOptions], RequestOptions | RequestTransformAction]:
+    def transform_request(
+        request_options: RequestOptions,
+    ) -> RequestOptions | RequestTransformAction:
+        # According to the Sitemap protocol, all URLs in a Sitemap must be from a single
+        # host.
+        request_host = URL(request_options['url']).host
+
+        if request_host and (mapping_data := data_mapper.get(request_host)):
+            # Set properties from the mapping data
+            if 'label' in mapping_data:
+                request_options['label'] = mapping_data['label']
+            if 'user_data' in mapping_data:
+                request_options['user_data'] = mapping_data['user_data']
+
+            return request_options
+
+        return 'unchanged'
+
+    return transform_request
+
+
+async def main() -> None:
+    # Prepare data mapping for hosts
+    apify_host = URL('https://apify.com/sitemap.xml').host
+    crawlee_host = URL('https://crawlee.dev/sitemap.xml').host
+
+    if not apify_host or not crawlee_host:
+        raise ValueError('Unable to extract host from URLs')
+
+    data_map = {
+        apify_host: {
+            'label': 'apify',
+            'user_data': {'source': 'apify'},
+        },
+        crawlee_host: {
+            'label': 'crawlee',
+            'user_data': {'source': 'crawlee'},
+        },
+    }
+
+    # Initialize the SitemapRequestLoader with the transform function
+    async with SitemapRequestLoader(
+        # Set the sitemap URLs and the HTTP client
+        sitemap_urls=['https://crawlee.dev/sitemap.xml', 'https://apify.com/sitemap.xml'],
+        http_client=ImpitHttpClient(),
+        transform_request_function=create_transform_request(data_map),
+    ) as sitemap_loader:
+        # Convert the sitemap loader to a request manager
+        request_manager = await sitemap_loader.to_tandem()
+
+        # Create and configure the crawler
+        crawler = BeautifulSoupCrawler(
+            request_manager=request_manager,
+            max_requests_per_crawl=10,
+        )
+
+        # Create default handler for requests without a specific label
+        @crawler.router.default_handler
+        async def handler(context: BeautifulSoupCrawlingContext) -> None:
+            source = context.request.user_data.get('source', 'unknown')
+            context.log.info(
+                f'Processing request: {context.request.url} from source: {source}'
+            )
+
+        # Create handler for requests labeled 'apify'
+        @crawler.router.handler('apify')
+        async def apify_handler(context: BeautifulSoupCrawlingContext) -> None:
+            source = context.request.user_data.get('source', 'unknown')
+            context.log.info(
+                f'Apify handler processing: {context.request.url} from source: {source}'
+            )
+
+        # Create handler for requests labeled 'crawlee'
+        @crawler.router.handler('crawlee')
+        async def crawlee_handler(context: BeautifulSoupCrawlingContext) -> None:
+            source = context.request.user_data.get('source', 'unknown')
+            context.log.info(
+                f'Crawlee handler processing: {context.request.url} from source: {source}'
+            )
+
+        await crawler.run()
+
+
+if __name__ == '__main__':
+    asyncio.run(main())
@@ -0,0 +1,39 @@
+---
+id: using_browser_profile
+title: Using browser profile
+---
+
+import ApiLink from '@site/src/components/ApiLink';
+
+import CodeBlock from '@theme/CodeBlock';
+
+import ChromeProfileExample from '!!raw-loader!./code_examples/using_browser_profiles_chrome.py';
+import FirefoxProfileExample from '!!raw-loader!./code_examples/using_browser_profiles_firefox.py';
+
+This example demonstrates how to run <ApiLink to="class/PlaywrightCrawler">`PlaywrightCrawler`</ApiLink> using your local browser profile from [Chrome](https://www.google.com/intl/us/chrome/) or [Firefox](https://www.firefox.com/).
+
+Using browser profiles allows you to leverage existing login sessions, saved passwords, bookmarks, and other personalized browser data during crawling. This can be particularly useful for testing scenarios or when you need to access content that requires authentication.
+
+## Chrome browser
+
+To run <ApiLink to="class/PlaywrightCrawler">`PlaywrightCrawler`</ApiLink> with your Chrome profile, you need to know the path to your profile files. You can find this information by entering `chrome://version/` as a URL in your Chrome browser. If you have multiple profiles, pay attention to the profile name - if you only have one profile, it's always `Default`.
+
+:::warning Profile access limitation
+Due to [Chrome's security policies](https://developer.chrome.com/blog/remote-debugging-port), automation cannot use your main browsing profile directly. The example copies your profile to a temporary location as a workaround.
+:::
+
+Make sure you don't have any running Chrome browser processes before running this code:
+
+<CodeBlock className="language-python" language="python">
+    {ChromeProfileExample}
+</CodeBlock>
+
+## Firefox browser
+
+To find the path to your Firefox profile, enter `about:profiles` as a URL in your Firefox browser. Unlike Chrome, you can use your standard profile path directly without copying it first.
+
+Make sure you don't have any running Firefox browser processes before running this code:
+
+<CodeBlock className="language-python" language="python">
+    {FirefoxProfileExample}
+</CodeBlock>