waymore 6.5__py3-none-any.whl → 7.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- waymore/__init__.py +1 -1
- waymore/waymore.py +1082 -1193
- {waymore-6.5.dist-info → waymore-7.0.dist-info}/METADATA +10 -8
- waymore-7.0.dist-info/RECORD +8 -0
- {waymore-6.5.dist-info → waymore-7.0.dist-info}/WHEEL +1 -1
- waymore-6.5.dist-info/RECORD +0 -8
- {waymore-6.5.dist-info → waymore-7.0.dist-info}/entry_points.txt +0 -0
- {waymore-6.5.dist-info → waymore-7.0.dist-info/licenses}/LICENSE +0 -0
- {waymore-6.5.dist-info → waymore-7.0.dist-info}/top_level.txt +0 -0
@@ -1,6 +1,6 @@
-Metadata-Version: 2.
+Metadata-Version: 2.4
 Name: waymore
-Version:
+Version: 7.0
 Summary: Find way more from the Wayback Machine, Common Crawl, Alien Vault OTX, URLScan, VirusTotal & Intelligence X!
 Home-page: https://github.com/xnl-h4ck3r/waymore
 Author: xnl-h4ck3r
@@ -15,10 +15,13 @@ Requires-Dist: termcolor
 Requires-Dist: psutil
 Requires-Dist: urlparse3
 Requires-Dist: tldextract
+Requires-Dist: aiohttp
+Dynamic: home-page
+Dynamic: license-file
 
 <center><img src="https://github.com/xnl-h4ck3r/waymore/blob/main/waymore/images/title.png"></center>
 
-## About -
+## About - v7.0
 
 The idea behind **waymore** is to find even more links from the Wayback Machine than other existing tools.
 
@@ -48,7 +51,7 @@ Now **waymore** gets URL's from ALL of those sources too (with ability to filter
 
 **NOTE: If you already have a `config.yml` file, it will not be overwritten. The file `config.yml.NEW` will be created in the same directory. If you need the new config, remove `config.yml` and rename `config.yml.NEW` back to `config.yml`.**
 
-`waymore` supports **Python 3
+`waymore` supports **Python 3.7+** (Python 3.7 or higher required for async/await support).
 
 Install `waymore` in default (global) python environment.
 
@@ -91,8 +94,8 @@ pipx install git+https://github.com/xnl-h4ck3r/waymore.git
 | -mc | | Only Match HTTP status codes for retrieved URLs and responses. Comma separated list of codes. Passing this argument overrides the config `FILTER_CODE` and `-fc`. |
 | -mt | | Only MIME Types for retrieved URLs and responses. Comma separated list of MIME types. Passing this argument overrides the config `FILTER_MIME` and `-ft`. **NOTE: This will NOT be applied to Alien Vault OTX, Virus Total and Intelligence X because they don't have the ability to filter on MIME Type. Sometimes URLScan does not have a MIME Type defined - these will always be included. Consider excluding sources if this matters to you.**. |
 | -l | --limit | How many responses will be saved (if `-mode R` or `-mode B` is passed). A positive value will get the **first N** results, a negative value will get the **last N** results. A value of 0 will get **ALL** responses (default: 5000) |
-| -from | --from-date | What date to get
-| -to | --to-date | What date to get
+| -from | --from-date | What date to get data from. If not specified it will get from the earliest possible results. A partial value can be passed, e.g. `2016`, `201805`, etc. **IMPORTANT: There are some exceptions with sources unable to get URLs within date limits: Virus Total - all known sub domains will still be returned; Intelligence X - all URLs will still be returned.** |
+| -to | --to-date | What date to get data to. If not specified it will get to the latest possible results. A partial value can be passed, e.g. `2021`, `202112`, etc. **IMPORTANT: There are some exceptions with sources unable to get URLs within date limits: Virus Total - all known sub domains will still be returned; Intelligence X - all URLs will still be returned.** |
 | -ci | --capture-interval | Filters the search on archive.org to only get at most 1 capture per hour (`h`), day (`d`) or month (`m`). This filter is used for responses only. The default is `d` but can also be set to `none` to not filter anything and get all responses. |
 | -ra | --regex-after | RegEx for filtering purposes against links found from all sources of URLs AND responses downloaded. Only positive matches will be output. |
 | -url-filename | | Set the file name of downloaded responses to the URL that generated the response, otherwise it will be set to the hash value of the response. Using the hash value means multiple URLs that generated the same response will only result in one file being saved for that response. |
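The `-l` / `--limit` row in the hunk above describes three cases: a positive value keeps the first N results, a negative value keeps the last N, and `0` keeps everything. The following is a minimal Python sketch of that selection rule only. `apply_limit` is a hypothetical helper written for illustration; it is not taken from `waymore.py`, and the real implementation may differ.

```python
def apply_limit(items, limit):
    """Select items following the documented -l semantics.

    Hypothetical helper for illustration; not part of waymore.
    """
    if limit == 0:
        # 0 means "ALL" according to the README table.
        return list(items)
    if limit > 0:
        # Positive value: the first N results.
        return list(items[:limit])
    # Negative value: the last N results (slice from the end).
    return list(items[limit:])


responses = ["r1", "r2", "r3", "r4", "r5"]
first_two = apply_limit(responses, 2)
last_two = apply_limit(responses, -2)
everything = apply_limit(responses, 0)
```

Python's negative slice index already counts from the end of the list, so the "last N" case needs no extra arithmetic.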
@@ -103,9 +106,8 @@ pipx install git+https://github.com/xnl-h4ck3r/waymore.git
 | -xvt | | Exclude checks for links from virustotal.com |
 | -xix | | Exclude checks for links from Intelligence X.com |
 | -lcc | | Limit the number of Common Crawl index collections searched, e.g. `-lcc 10` will just search the latest `10` collections (default: 1). As of November 2024 there are currently 106 collections. Setting to `0` will search **ALL** collections. If you don't want to search Common Crawl at all, use the `-xcc` option. |
-| -lcy | | Limit the number of Common Crawl index collections searched by the year of the index data. The earliest index has data from 2008. Setting to 0 (default) will search collections or any year (but in conjuction with `-lcc`). For example, if you are only interested in data from 2015 and after, pass `-lcy 2015`. This will override the value of `-lcc` if passed. If you don't want to search Common Crawl at all, use the `-xcc` option. |
 | -t | --timeout | This is for archived responses only! How many seconds to wait for the server to send data before giving up (default: 30) |
-| -p | --processes | Basic multithreading is done when getting requests for a file of URLs. This argument determines the number of processes (threads) used (default:
+| -p | --processes | Basic multithreading is done when getting requests for a file of URLs. This argument determines the number of processes (threads) used (default: 2) |
 | -r | --retries | The number of retries for requests that get connection error or rate limited (default: 1). |
 | -m | --memory-threshold | The memory threshold percentage. If the machines memory goes above the threshold, the program will be stopped and ended gracefully before running out of memory (default: 95) |
 | -ko | --keywords-only | Only return links and responses that contain keywords that you are interested in. This can reduce the time it takes to get results. If you provide the flag with no value, Keywords are taken from the comma separated list in the `config.yml` file (typically in `~/.config/waymore/`) with the `FILTER_KEYWORDS` key, otherwise you can pass a specific Regex value to use, e.g. `-ko "admin"` to only get links containing the word `admin`, or `-ko "\.js(\?\|$)"` to only get JS files. The Regex check is NOT case sensitive. |
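The `-ko` row above says the keyword check is a regex match that is not case sensitive. The `\|` in the table example appears to be a Markdown escape for the pipe character inside a table cell, so the pattern would presumably be passed as `\.js(\?|$)`. Below is a small Python sketch of such a case-insensitive match. `matches_keyword` and `JS_ONLY` are hypothetical names used for illustration and are not taken from waymore itself.

```python
import re

# Assumed form of the README example once the Markdown table escape is removed.
JS_ONLY = r"\.js(\?|$)"


def matches_keyword(url, pattern):
    """Return True if the pattern is found anywhere in the URL, ignoring case.

    Hypothetical helper for illustration; not part of waymore.
    """
    return re.search(pattern, url, flags=re.IGNORECASE) is not None


urls = [
    "https://example.com/static/app.JS?v=1",  # .js followed by a query string
    "https://example.com/static/app.js",      # .js at the end of the URL
    "https://example.com/data/app.json",      # .json must not match
    "https://example.com/Admin/login",        # matches "admin" ignoring case
]
js_urls = [u for u in urls if matches_keyword(u, JS_ONLY)]
admin_urls = [u for u in urls if matches_keyword(u, "admin")]
```

The `(\?|$)` group is what keeps `.json` out of the results: after `.js` the pattern requires either a query string or the end of the URL.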
@@ -0,0 +1,8 @@
+waymore/__init__.py,sha256=NvDQnrD09CEwUQaMbcu5QZqslCNsgpKzgt7scQ5ZgJ8,20
+waymore/waymore.py,sha256=jNYKiZf7Toqlo5gpQo-ebm70cR47QvLfsyaDHsmmS0Y,273363
+waymore-7.0.dist-info/licenses/LICENSE,sha256=o_jq62xZ1YxI8tqzQKbNtqr3RW2i5sh0rk6ixCJEroU,1068
+waymore-7.0.dist-info/METADATA,sha256=YzkVssN-6ICCcv8Cnt3gXjvLYv6UAHOyNvwR4KAihZo,51560
+waymore-7.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+waymore-7.0.dist-info/entry_points.txt,sha256=YHy5EUf3r_7OTkt9jvylLjNeg7Z5yvIVm5RUAyfNcN4,49
+waymore-7.0.dist-info/top_level.txt,sha256=RFTphkWaRu1N7lUWIPUjabgCPQ3ETmNllF7qze4JJ_s,8
+waymore-7.0.dist-info/RECORD,,
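Each RECORD line above has the form `path,sha256=<hash>,<size>`. In the wheel format the hash is the SHA-256 digest encoded as URL-safe base64 with the trailing `=` padding removed, and the size is the file length in bytes. The sketch below shows how such an entry can be computed, which is also how one could check an unpacked wheel against its RECORD. `record_entry` is a hypothetical helper name, and the example path is made up.

```python
import base64
import hashlib


def record_entry(path, data):
    """Build a RECORD-style line for a file's bytes.

    Hypothetical helper for illustration; path is used only as a label.
    """
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 with the '=' padding stripped, as used in wheel RECORD files.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# Example with an empty file; the path is illustrative only.
entry = record_entry("example/empty.py", b"")
```

A SHA-256 digest is 32 bytes, so the encoded hash is always 43 characters long once the padding is stripped, which matches the entries in the hunk above.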
waymore-6.5.dist-info/RECORD DELETED
@@ -1,8 +0,0 @@
-waymore/__init__.py,sha256=Ma2JR69c8ytm7HBygCQZ7DwJUTgf5YIb_Ik6WxGHyxU,20
-waymore/waymore.py,sha256=085oFlKeazTdX19znKLMzZlCFzDJJYG8D4FYiHWZiK8,272074
-waymore-6.5.dist-info/LICENSE,sha256=o_jq62xZ1YxI8tqzQKbNtqr3RW2i5sh0rk6ixCJEroU,1068
-waymore-6.5.dist-info/METADATA,sha256=uf9svctF3JfDXPrkz1-pb9I6MeRmN5ZYbO-bC_D6F34,52125
-waymore-6.5.dist-info/WHEEL,sha256=iAkIy5fosb7FzIOwONchHf19Qu7_1wCWyFNR5gu9nU0,91
-waymore-6.5.dist-info/entry_points.txt,sha256=YHy5EUf3r_7OTkt9jvylLjNeg7Z5yvIVm5RUAyfNcN4,49
-waymore-6.5.dist-info/top_level.txt,sha256=RFTphkWaRu1N7lUWIPUjabgCPQ3ETmNllF7qze4JJ_s,8
-waymore-6.5.dist-info/RECORD,,
entry_points.txt, licenses/LICENSE, top_level.txt: File without changes