PyPI - scrape-cli - Versions diffs - 1.1__tar.gz → 1.1.2__tar.gz - Mend

scrape-cli 1.1tar.gz → 1.1.2tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

scrape_cli-1.1.2/PKG-INFO ADDED

@@ -0,0 +1,131 @@
+Metadata-Version: 2.1
+Name: scrape-cli
+Version: 1.1.2
+Summary: It's a command-line tool to extract HTML elements using an XPath query or CSS3 selector.
+Home-page: https://github.com/aborruso/scrape-cli
+Author: Andrea Borruso
+Author-email: aborruso@gmail.com
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.6
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: cssselect
+Requires-Dist: lxml
+[![PyPI version](https://badge.fury.io/py/scrape-cli.svg)](https://badge.fury.io/py/scrape-cli)
+[![Python Versions](https://img.shields.io/pypi/pyversions/scrape-cli.svg)](https://pypi.org/project/scrape-cli/)
+# scrape cli
+It's a **command-line tool** to **extract** HTML elements using an [**XPath**](https://www.w3schools.com/xml/xpath_intro.asp) query or [**CSS3 selector**](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors).
+It's based on the great and simple [scraping tool](https://github.com/jeroenjanssens/data-science-at-the-command-line/blob/master/tools/scrape) written by [**Jeroen Janssens**](http://jeroenjanssens.com).
+- [How does it work?](#how-does-it-work)
+- [How to use it in Linux](#how-to-use-it-in-linux)
+- [Note on building it](#note-on-building-it)
+## Installation
+You can install scrape-cli using pip:
+### Using pipx (recommended for CLI tools)
+```bash
+pipx install scrape-cli
+```
+Using pip
+```bash
+pip install scrape-cli
+```
+Or install from source:
+```bash
+git clone https://github.com/aborruso/scrape-cli
+cd scrape-cli
+pip install -e .
+```
+## Requirements
+- Python >=3.6
+- requests
+- lxml
+- cssselect
+## How does it work?
+A CSS selector query like this
+```bash
+curl -L 'https://en.wikipedia.org/wiki/List_of_sovereign_states' -s \
+| scrape -be 'table.wikitable > tbody > tr > td > b > a'
+```
+or an XPATH query like this one:
+```bash
+curl -L 'https://en.wikipedia.org/wiki/List_of_sovereign_states' -s \
+| scrape -be '//table[contains(@class, 'wikitable')]/tbody/tr/td/b/a'
+```
+gives you back:
+```html
+<html>
+ <head>
+ </head>
+ <body>
+  <a href="/wiki/Afghanistan" title="Afghanistan">
+   Afghanistan
+  </a>
+  <a href="/wiki/Albania" title="Albania">
+   Albania
+  </a>
+  <a href="/wiki/Algeria" title="Algeria">
+   Algeria
+  </a>
+  <a href="/wiki/Andorra" title="Andorra">
+   Andorra
+  </a>
+  <a href="/wiki/Angola" title="Angola">
+   Angola
+  </a>
+  <a href="/wiki/Antigua_and_Barbuda" title="Antigua and Barbuda">
+   Antigua and Barbuda
+  </a>
+  <a href="/wiki/Argentina" title="Argentina">
+   Argentina
+  </a>
+  <a href="/wiki/Armenia" title="Armenia">
+   Armenia
+  </a>
+...
+...
+ </body>
+</html>
+```
+Some notes on the commands:
+- `-e` to set the query
+- `-b` to add `<html>`, `<head>` and `<body>` tags to the HTML output.
+## Linux 64 bit precompiled binary
+If you are looking for precompiled executables for Linux, please refer to the [Releases](https://github.com/aborruso/scrape-cli/releases) page on GitHub where you can find the latest precompiled binary file.
+I have built the `scrape-linux-x86_64` precompiled binary, using [pyinstaller](https://www.pyinstaller.org/) and this command: `pyinstaller --onefile scrape.py`.<br>
+Once you have built it, it's an executable, and it's possible to use it Linux 64 bit environment.
+## License
+[MIT](LICENSE)
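
Note on the XPath example in the README text above: the expression is wrapped in single quotes and also contains single quotes around `wikitable`. Under POSIX quoting rules the inner quotes close and reopen the quoted string, so they probably never reach `scrape`. The sketch below uses Python's `shlex` as a stand-in for the shell's word splitting. The double-quoted variant is a suggestion of mine, not something found in the package.

```python
import shlex

# Command as written in the README: single quotes nested inside single quotes.
as_written = "scrape -be '//table[contains(@class, 'wikitable')]/tbody/tr/td/b/a'"

# Suggested variant (an assumption, not from the package): double quotes outside,
# so the single quotes around wikitable survive word splitting.
suggested = "scrape -be \"//table[contains(@class, 'wikitable')]/tbody/tr/td/b/a\""


def xpath_argument(command: str) -> str:
    """Return the last argument after POSIX-style word splitting."""
    return shlex.split(command)[-1]


print(xpath_argument(as_written))   # inner quotes are gone
print(xpath_argument(suggested))    # inner quotes are preserved
```

With the quotes stripped, `wikitable` would be read by an XPath engine as an element name rather than a string literal, which may change what the query matches.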

{scrape_cli-1.1 → scrape_cli-1.1.2}/README.md RENAMED

@@ -1,3 +1,6 @@
+[![PyPI version](https://badge.fury.io/py/scrape-cli.svg)](https://badge.fury.io/py/scrape-cli)
+[![Python Versions](https://img.shields.io/pypi/pyversions/scrape-cli.svg)](https://pypi.org/project/scrape-cli/)
 # scrape cli
 It's a **command-line tool** to **extract** HTML elements using an [**XPath**](https://www.w3schools.com/xml/xpath_intro.asp) query or [**CSS3 selector**](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors).
@@ -8,6 +11,38 @@ It's based on the great and simple [scraping tool](https://github.com/jeroenjans
 - [How to use it in Linux](#how-to-use-it-in-linux)
 - [Note on building it](#note-on-building-it)
+## Installation
+You can install scrape-cli using pip:
+### Using pipx (recommended for CLI tools)
+```bash
+pipx install scrape-cli
+```
+Using pip
+```bash
+pip install scrape-cli
+```
+Or install from source:
+```bash
+git clone https://github.com/aborruso/scrape-cli
+cd scrape-cli
+pip install -e .
+```
+## Requirements
+- Python >=3.6
+- requests
+- lxml
+- cssselect
 ## How does it work?
 A CSS selector query like this
@@ -66,27 +101,15 @@ Some notes on the commands:
 - `-e` to set the query
 - `-b` to add `<html>`, `<head>` and `<body>` tags to the HTML output.
-## How to use it in Linux
-```bash
-# go in example to the home folder
-cd ~
-# download scrape-cli
-wget "https://github.com/aborruso/scrape-cli/releases/download/v1.0/scrape"
-# move it in a folder of your PATH as /usr/bin
-sudo mv ./scrape /usr/bin
-# give it execute permission
-sudo chmod +x /usr/bin/scrape
-# use it
-```
+## Linux 64 bit precompiled binary
-**Please note**: in OSX it seems not to work ([#8](https://github.com/aborruso/scrape-cli/issues/8)).
+If you are looking for precompiled executables for Linux, please refer to the [Releases](https://github.com/aborruso/scrape-cli/releases) page on GitHub where you can find the latest precompiled binary file.
-## Note on building it
+I have built the `scrape-linux-x86_64` precompiled binary, using [pyinstaller](https://www.pyinstaller.org/) and this command: `pyinstaller --onefile scrape.py`.<br>
-The original source is written in Python 2, then I have built it in Python 2 environment.<br>
-There are two modules requirements: install in this environment `cssselect` and then `lxml`, in this order (using pip).
+Once you have built it, it's an executable, and it's possible to use it Linux 64 bit environment.
-I have built it using [pyinstaller](https://www.pyinstaller.org/) and this command: `pyinstaller --onefile scrape.py`.<br>
+## License
-Once you have built it, it's an executable, and it's possible to use it in any environment.
+[MIT](LICENSE)

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli/__init__.py RENAMED

@@ -4,7 +4,7 @@ scrape-cli - A command-line tool to extract HTML elements using XPath or CSS3 se
 from scrape_cli.scrape import main
-__version__ = "1.1"
+__version__ = "1.1.2"
 __author__ = "Andrea Borruso"
 __author_email__ = "aborruso@gmail.com"

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli/scrape.py RENAMED

@@ -42,7 +42,7 @@ def main():
     expression = [e if e.startswith('//') else GenericTranslator().css_to_xpath(e) for e in args.expression]
-    html_parser = etree.HTMLParser(encoding='utf-8', recover=True, strip_cdata=True)
+    html_parser = etree.HTMLParser(encoding='utf-8', recover=True)
     inp = open(args.file, 'rb') if args.file else args.html
     if args.rawinput:
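
Note on the context line in this hunk: each expression is treated as XPath only when it starts with `//`, and anything else is handed to `GenericTranslator().css_to_xpath`. The sketch below isolates that check with the standard library only. The helper name `looks_like_xpath` and the sample expressions are hypothetical, chosen to show which inputs fall on each side of the test.

```python
def looks_like_xpath(expression: str) -> bool:
    """Mirror the check shown in the hunk: only a leading '//' counts as XPath."""
    return expression.startswith('//')


# Hypothetical sample expressions (not taken from the package).
samples = [
    "//table/tbody/tr",      # XPath, passed through unchanged
    "table.wikitable > a",   # CSS selector, would be translated to XPath
    "/html/body",            # absolute XPath, but treated as CSS by this check
    "(//a)[1]",              # parenthesised XPath, also treated as CSS
]

for expr in samples:
    route = "XPath" if looks_like_xpath(expr) else "CSS translation"
    print(f"{expr!r} -> {route}")
```

The last two samples suggest that XPath queries not beginning with `//` would be routed to the CSS translator. What happens to them there is not shown in this diff.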

scrape_cli-1.1.2/scrape_cli.egg-info/PKG-INFO ADDED

@@ -0,0 +1,131 @@
+Metadata-Version: 2.1
+Name: scrape-cli
+Version: 1.1.2
+Summary: It's a command-line tool to extract HTML elements using an XPath query or CSS3 selector.
+Home-page: https://github.com/aborruso/scrape-cli
+Author: Andrea Borruso
+Author-email: aborruso@gmail.com
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Requires-Python: >=3.6
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: cssselect
+Requires-Dist: lxml
+[![PyPI version](https://badge.fury.io/py/scrape-cli.svg)](https://badge.fury.io/py/scrape-cli)
+[![Python Versions](https://img.shields.io/pypi/pyversions/scrape-cli.svg)](https://pypi.org/project/scrape-cli/)
+# scrape cli
+It's a **command-line tool** to **extract** HTML elements using an [**XPath**](https://www.w3schools.com/xml/xpath_intro.asp) query or [**CSS3 selector**](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors).
+It's based on the great and simple [scraping tool](https://github.com/jeroenjanssens/data-science-at-the-command-line/blob/master/tools/scrape) written by [**Jeroen Janssens**](http://jeroenjanssens.com).
+- [How does it work?](#how-does-it-work)
+- [How to use it in Linux](#how-to-use-it-in-linux)
+- [Note on building it](#note-on-building-it)
+## Installation
+You can install scrape-cli using pip:
+### Using pipx (recommended for CLI tools)
+```bash
+pipx install scrape-cli
+```
+Using pip
+```bash
+pip install scrape-cli
+```
+Or install from source:
+```bash
+git clone https://github.com/aborruso/scrape-cli
+cd scrape-cli
+pip install -e .
+```
+## Requirements
+- Python >=3.6
+- requests
+- lxml
+- cssselect
+## How does it work?
+A CSS selector query like this
+```bash
+curl -L 'https://en.wikipedia.org/wiki/List_of_sovereign_states' -s \
+| scrape -be 'table.wikitable > tbody > tr > td > b > a'
+```
+or an XPATH query like this one:
+```bash
+curl -L 'https://en.wikipedia.org/wiki/List_of_sovereign_states' -s \
+| scrape -be '//table[contains(@class, 'wikitable')]/tbody/tr/td/b/a'
+```
+gives you back:
+```html
+<html>
+ <head>
+ </head>
+ <body>
+  <a href="/wiki/Afghanistan" title="Afghanistan">
+   Afghanistan
+  </a>
+  <a href="/wiki/Albania" title="Albania">
+   Albania
+  </a>
+  <a href="/wiki/Algeria" title="Algeria">
+   Algeria
+  </a>
+  <a href="/wiki/Andorra" title="Andorra">
+   Andorra
+  </a>
+  <a href="/wiki/Angola" title="Angola">
+   Angola
+  </a>
+  <a href="/wiki/Antigua_and_Barbuda" title="Antigua and Barbuda">
+   Antigua and Barbuda
+  </a>
+  <a href="/wiki/Argentina" title="Argentina">
+   Argentina
+  </a>
+  <a href="/wiki/Armenia" title="Armenia">
+   Armenia
+  </a>
+...
+...
+ </body>
+</html>
+```
+Some notes on the commands:
+- `-e` to set the query
+- `-b` to add `<html>`, `<head>` and `<body>` tags to the HTML output.
+## Linux 64 bit precompiled binary
+If you are looking for precompiled executables for Linux, please refer to the [Releases](https://github.com/aborruso/scrape-cli/releases) page on GitHub where you can find the latest precompiled binary file.
+I have built the `scrape-linux-x86_64` precompiled binary, using [pyinstaller](https://www.pyinstaller.org/) and this command: `pyinstaller --onefile scrape.py`.<br>
+Once you have built it, it's an executable, and it's possible to use it Linux 64 bit environment.
+## License
+[MIT](LICENSE)

{scrape_cli-1.1 → scrape_cli-1.1.2}/setup.py RENAMED

@@ -1,9 +1,17 @@
+# setup.py
 from setuptools import setup
+from pathlib import Path
+# Read the README
+this_directory = Path(__file__).parent
+long_description = (this_directory / "README.md").read_text(encoding="utf-8")
 setup(
     name="scrape-cli",
-    version="1.1",
+    version="1.1.2",
     description="It's a command-line tool to extract HTML elements using an XPath query or CSS3 selector.",
+    long_description=long_description,
+    long_description_content_type="text/markdown",  # Specifica formato Markdown
     author="Andrea Borruso",
     author_email="aborruso@gmail.com",
     url="https://github.com/aborruso/scrape-cli",
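
Note on this hunk: `setup.py` now reads `README.md` and passes it as `long_description`, which is consistent with `PKG-INFO` growing from 14 to 131 lines and gaining a `Description-Content-Type: text/markdown` field. The sketch below reproduces the read step on its own, assuming a throwaway directory and a one-line README. The helper name `read_long_description` is hypothetical.

```python
import tempfile
from pathlib import Path


def read_long_description(project_dir: Path) -> str:
    """Read README.md as UTF-8 text, as the new setup.py does."""
    return (project_dir / "README.md").read_text(encoding="utf-8")


# Assumption for illustration: a temporary project directory with a tiny README.
with tempfile.TemporaryDirectory() as tmp:
    project_dir = Path(tmp)
    (project_dir / "README.md").write_text("# scrape cli\n", encoding="utf-8")
    long_description = read_long_description(project_dir)

print(long_description)
```

Passing `encoding="utf-8"` explicitly keeps the read independent of the platform's default encoding.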

scrape_cli-1.1/PKG-INFO DELETED

@@ -1,14 +0,0 @@
-Metadata-Version: 2.1
-Name: scrape-cli
-Version: 1.1
-Summary: It's a command-line tool to extract HTML elements using an XPath query or CSS3 selector.
-Home-page: https://github.com/aborruso/scrape-cli
-Author: Andrea Borruso
-Author-email: aborruso@gmail.com
-Classifier: Programming Language :: Python :: 3
-Classifier: License :: OSI Approved :: MIT License
-Classifier: Operating System :: OS Independent
-Requires-Python: >=3.6
-License-File: LICENSE
-Requires-Dist: cssselect
-Requires-Dist: lxml

scrape_cli-1.1/scrape_cli.egg-info/PKG-INFO DELETED

@@ -1,14 +0,0 @@
-Metadata-Version: 2.1
-Name: scrape-cli
-Version: 1.1
-Summary: It's a command-line tool to extract HTML elements using an XPath query or CSS3 selector.
-Home-page: https://github.com/aborruso/scrape-cli
-Author: Andrea Borruso
-Author-email: aborruso@gmail.com
-Classifier: Programming Language :: Python :: 3
-Classifier: License :: OSI Approved :: MIT License
-Classifier: Operating System :: OS Independent
-Requires-Python: >=3.6
-License-File: LICENSE
-Requires-Dist: cssselect
-Requires-Dist: lxml

{scrape_cli-1.1 → scrape_cli-1.1.2}/LICENSE RENAMED

File without changes

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli.egg-info/SOURCES.txt RENAMED

File without changes

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli.egg-info/dependency_links.txt RENAMED

File without changes

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli.egg-info/entry_points.txt RENAMED

File without changes

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli.egg-info/requires.txt RENAMED

File without changes

{scrape_cli-1.1 → scrape_cli-1.1.2}/scrape_cli.egg-info/top_level.txt RENAMED

File without changes

{scrape_cli-1.1 → scrape_cli-1.1.2}/setup.cfg RENAMED

File without changes
