PyPI - pdfalyzer - version diff: pdfalyzer-1.16.11.tar.gz → pdfalyzer-1.16.13.tar.gz

Potentially problematic release: this version of pdfalyzer might be problematic.

Files changed (48)

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/CHANGELOG.md RENAMED

@@ -1,5 +1,11 @@
 # NEXT RELEASE
+### 1.16.13
+* Bump `yaralyzer` to v1.0.7 and fix reference to yaralyzer's renamed `prefix_with_style()` method
+### 1.16.12
+* Bump `PyPDF` to v6.0.0
 ### 1.16.11
 * Fix typo in `combine_pdfs` help
 * Add some more PyPi classifiers

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/PKG-INFO RENAMED

@@ -1,13 +1,13 @@
 Metadata-Version: 2.1
 Name: pdfalyzer
-Version: 1.16.11
+Version: 1.16.13
 Summary: PDF analysis tool. Scan a PDF with YARA rules, visualize its inner tree-like data structure in living color (lots of colors), force decodes of suspicious font binaries, and more.
 Home-page: https://github.com/michelcrypt4d4mus/pdfalyzer
 License: GPL-3.0-or-later
-Keywords: ascii art,binary,color,cybersecurity,DFIR,encoding,font,infosec,maldoc,malicious pdf,malware,malware analysis,pdf,pdfs,pdf analysis,pypdf,threat assessment,visualization,yara
+Keywords: ascii art,binary,color,cybersecurity,DFIR,encoding,font,infosec,maldoc,malicious pdf,malware,malware analysis,pdf,pdfs,pdf analysis,pypdf,threat assessment,threat hunting,threat intelligence,threat research,threatintel,visualization,yara
 Author: Michel de Cryptadamus
 Author-email: michel@cryptadamus.com
-Requires-Python: >=3.9.2,<4.0.0
+Requires-Python: >=3.9.2,<4.0
 Classifier: Development Status :: 5 - Production/Stable
 Classifier: Environment :: Console
 Classifier: Intended Audience :: Information Technology
@@ -23,8 +23,8 @@ Classifier: Topic :: Artistic Software
 Classifier: Topic :: Scientific/Engineering :: Visualization
 Classifier: Topic :: Security
 Requires-Dist: anytree (>=2.13,<3.0)
-Requires-Dist: pypdf (>=5.9.0,<6.0.0)
-Requires-Dist: yaralyzer (>=1.0.4,<2.0.0)
+Requires-Dist: pypdf (>=6.0.0,<7.0.0)
+Requires-Dist: yaralyzer (>=1.0.7,<2.0.0)
 Project-URL: Changelog, https://github.com/michelcrypt4d4mus/pdfalyzer/blob/master/CHANGELOG.md
 Project-URL: Documentation, https://github.com/michelcrypt4d4mus/pdfalyzer
 Project-URL: Repository, https://github.com/michelcrypt4d4mus/pdfalyzer
@@ -65,10 +65,12 @@ If you're looking for one of these things this may be the tool for you.
 ### What It Don't Do
 This tool is mostly for examining/working with a PDF's data and logical structure. As such it doesn't have much to offer as far as extracting text, rendering[^3], writing, etc. etc.
+If you suspect you are dealing with a malcious PDF you can safely run `pdfalyze` on it; embedded javascript etc. will not be executed. If you want to actually look at the contents of a suspect PDF you can use [`dangerzone`](https://dangerzone.rocks/) to sanitize the contents with extreme prejudice before opening it.
 -------------
 # Installation
+#### All Platforms
 Installation with [pipx](https://pypa.github.io/pipx/)[^4] is preferred though `pip3` / `pip` should also work.
 ```sh
 pipx install pdfalyzer
@@ -76,7 +78,12 @@ pipx install pdfalyzer
 See [PyPDF installation notes](https://github.com/py-pdf/pypdf#installation) about `PyCryptodome` if you plan to `pdfalyze` any files that use AES encryption.
-If you are on macOS someone out there was kind enough to make [The Pdfalyzer available via homebrew](https://formulae.brew.sh/formula/pdfalyzer) so `brew install pdfalyzer` should work.
+#### macOS Homebrew
+If you are on macOS and use `homebrew` someone out there was kind enough to make [The Pdfalyzer available via homebrew](https://formulae.brew.sh/formula/pdfalyzer) so this should work:
+```sh
+brew install pdfalyzer
+```
 ### Troubleshooting
 1. If you used `pip3` instead of `pipx` and have an issue you should try to install with `pipx`.

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/README.md RENAMED

@@ -33,10 +33,12 @@ If you're looking for one of these things this may be the tool for you.
 ### What It Don't Do
 This tool is mostly for examining/working with a PDF's data and logical structure. As such it doesn't have much to offer as far as extracting text, rendering[^3], writing, etc. etc.
+If you suspect you are dealing with a malcious PDF you can safely run `pdfalyze` on it; embedded javascript etc. will not be executed. If you want to actually look at the contents of a suspect PDF you can use [`dangerzone`](https://dangerzone.rocks/) to sanitize the contents with extreme prejudice before opening it.
 -------------
 # Installation
+#### All Platforms
 Installation with [pipx](https://pypa.github.io/pipx/)[^4] is preferred though `pip3` / `pip` should also work.
 ```sh
 pipx install pdfalyzer
@@ -44,7 +46,12 @@ pipx install pdfalyzer
 See [PyPDF installation notes](https://github.com/py-pdf/pypdf#installation) about `PyCryptodome` if you plan to `pdfalyze` any files that use AES encryption.
-If you are on macOS someone out there was kind enough to make [The Pdfalyzer available via homebrew](https://formulae.brew.sh/formula/pdfalyzer) so `brew install pdfalyzer` should work.
+#### macOS Homebrew
+If you are on macOS and use `homebrew` someone out there was kind enough to make [The Pdfalyzer available via homebrew](https://formulae.brew.sh/formula/pdfalyzer) so this should work:
+```sh
+brew install pdfalyzer
+```
 ### Troubleshooting
 1. If you used `pip3` instead of `pipx` and have an issue you should try to install with `pipx`.

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/__init__.py RENAMED

@@ -19,7 +19,7 @@ if not environ.get('INVOKED_BY_PYTEST', False):
 from rich.columns import Columns
 from rich.panel import Panel
 from rich.text import Text
-from yaralyzer.helpers.rich_text_helper import prefix_with_plain_text_obj
+from yaralyzer.helpers.rich_text_helper import prefix_with_style
 from yaralyzer.output.file_export import invoke_rich_export
 from yaralyzer.output.rich_console import console
 from yaralyzer.util.logging import log_and_print
@@ -83,7 +83,7 @@ def pdfalyzer_show_color_theme() -> None:
     console.print(Panel('The Pdfalyzer Color Theme', style='reverse'))
     colors = [
-        prefix_with_plain_text_obj(name[:MAX_THEME_COL_SIZE], style=str(style)).append(' ')
+        prefix_with_style(name[:MAX_THEME_COL_SIZE], style=str(style)).append(' ')
         for name, style in PDFALYZER_THEME_DICT.items()
         if name not in ['reset', 'repr_url']
     ]
@@ -93,7 +93,7 @@ def pdfalyzer_show_color_theme() -> None:
 def combine_pdfs():
     """
-    Utility method to combine multiple PDFs into one. Invocable with 'combine_pdfs PDF1 [PDF2...]'.
+    Script method to combine multiple PDFs into one. Invocable with 'combine_pdfs PDF1 [PDF2...]'.
     Example: https://github.com/py-pdf/pypdf/blob/main/docs/user/merging-pdfs.md
     """
     args = parse_combine_pdfs_args()
@@ -130,3 +130,9 @@ def combine_pdfs():
     txt = Text('').append(f"  -> Wrote ")
     txt.append(str(file_size_in_mb(args.output_file)), style='cyan').append(" megabytes\n")
     print_highlighted(txt)
+# TODO: migrate this functionality from clown_sort
+# def extract_pages_from_pdf() -> None:
+#     args = parse_pdf_page_extraction_args()
+#     PdfFile(args.pdf_file).extract_page_range(args.page_range, destination_dir=args.destination_dir)

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/output/tables/decoding_stats_table.py RENAMED

@@ -5,7 +5,7 @@ from numbers import Number
 from rich.table import Table
 from rich.text import Text
-from yaralyzer.helpers.rich_text_helper import CENTER, na_txt, prefix_with_plain_text_obj
+from yaralyzer.helpers.rich_text_helper import CENTER, na_txt, prefix_with_style
 from pdfalyzer.binary.binary_scanner import BinaryScanner
 from pdfalyzer.helpers.rich_text_helper import pct_txt
@@ -60,7 +60,7 @@ def build_decoding_stats_table(scanner: BinaryScanner) -> Table:
 def _new_decoding_stats_table(title) -> Table:
     """Build an empty table for displaying decoding stats"""
-    title = prefix_with_plain_text_obj(title, style='blue underline')
+    title = prefix_with_style(title, style='blue underline')
     title.append(": Decoding Attempts Summary Statistics", style='bright_white bold')
     table = Table(
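
The two hunks above swap `prefix_with_plain_text_obj` for `prefix_with_style`, matching the yaralyzer rename noted in the changelog. pdfalyzer itself handles this by requiring yaralyzer >= 1.0.7. For downstream code that has to run against both the old and the new name, one option is to resolve the name at import time. The sketch below is only an illustration: `import_first_available` is a hypothetical helper, not part of pdfalyzer or yaralyzer.

```python
from importlib import import_module
from typing import Any, Sequence


def import_first_available(module_name: str, candidate_names: Sequence[str]) -> Any:
    """Return the first attribute in candidate_names that exists on the module.

    Hypothetical helper for coping with a renamed function; list the
    preferred (new) name first and the legacy name after it.
    """
    module = import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {list(candidate_names)} found in {module_name}")


# Assumed usage against yaralyzer (requires yaralyzer to be installed):
# prefix_with_style = import_first_available(
#     "yaralyzer.helpers.rich_text_helper",
#     ["prefix_with_style", "prefix_with_plain_text_obj"],
# )
```

Raising `ImportError` when no candidate matches keeps the failure mode the same as a plain `from ... import ...` against an incompatible version.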

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "pdfalyzer"
-version = "1.16.11"
+version = "1.16.13"
 description = "PDF analysis tool. Scan a PDF with YARA rules, visualize its inner tree-like data structure in living color (lots of colors), force decodes of suspicious font binaries, and more."
 authors = ["Michel de Cryptadamus <michel@cryptadamus.com>"]
 license = "GPL-3.0-or-later"
@@ -49,6 +49,10 @@ keywords = [
     "pdf analysis",
     "pypdf",
     "threat assessment",
+    "threat hunting",
+    "threat intelligence",
+    "threat research",
+    "threatintel",
     "visualization",
     "yara"
 ]
@@ -62,10 +66,10 @@ packages = [
 #   Dependencies    #
 #####################
 [tool.poetry.dependencies]
-python = "^3.9.2"
+python = "^3.9,>=3.9.2"
 anytree = "~=2.13"
-pypdf = "^5.9.0"
-yaralyzer = "^1.0.4"
+pypdf = "^6.0.0"
+yaralyzer = "^1.0.7"
 [tool.poetry.group.dev.dependencies]
 flake8 = "^7.3.0"
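
The caret constraints in this `pyproject.toml` hunk correspond to the ranges shown in the PKG-INFO diff: `pypdf = "^6.0.0"` appears there as `>=6.0.0,<7.0.0`, and `yaralyzer = "^1.0.7"` as `>=1.0.7,<2.0.0`. The sketch below mimics that expansion for full `MAJOR.MINOR.PATCH` versions. It is not Poetry's own implementation, and `caret_to_range` is a made-up name for illustration.

```python
def caret_to_range(version: str) -> str:
    """Expand a caret constraint (given without the leading '^') into a '>=,<' range.

    Sketch only: handles three-part versions and bumps the leftmost
    non-zero component to form the exclusive upper bound.
    """
    parts = [int(p) for p in version.split(".")]
    if len(parts) != 3:
        raise ValueError("this sketch only handles MAJOR.MINOR.PATCH versions")

    upper = [0, 0, 0]
    for i, value in enumerate(parts):
        # Bump the first non-zero component (or the last one if all are zero).
        if value != 0 or i == 2:
            upper[i] = value + 1
            break

    return f">={version},<{'.'.join(str(n) for n in upper)}"


print(caret_to_range("6.0.0"))  # >=6.0.0,<7.0.0
print(caret_to_range("1.0.7"))  # >=1.0.7,<2.0.0
```

Read this way, the bump to `^6.0.0` excludes every pypdf 5.x release, which is why the changelog lists it as its own release (1.16.12).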

Files without changes:

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/.pdfalyzer.example RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/LICENSE RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/__main__.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/binary/binary_scanner.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/config.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/decorators/document_model_printer.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/decorators/indeterminate_node.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/decorators/pdf_object_properties.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/decorators/pdf_tree_node.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/decorators/pdf_tree_verifier.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/detection/constants/binary_regexes.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/detection/constants/javascript_reserved_keywords.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/detection/javascript_hunter.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/detection/yaralyzer_helper.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/font_info.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/helpers/dict_helper.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/helpers/filesystem_helper.py RENAMED
{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/helpers/number_helper.py RENAMED

{pdfalyzer-1.16.11 → pdfalyzer-1.16.13}/pdfalyzer/helpers/pdf_object_helper.py RENAMED

@@ -8,19 +8,14 @@ from pypdf.generic import IndirectObject, PdfObject
 from pdfalyzer.pdf_object_relationship import PdfObjectRelationship
-def pdf_object_id(pdf_object) -> Optional[int]:
-    """Return the ID of an IndirectObject and None for everything else"""
-    return pdf_object.idnum if isinstance(pdf_object, IndirectObject) else None
 def does_list_have_any_references(_list) -> bool:
     """Return true if any element of _list is an IndirectObject."""
     return any(isinstance(item, IndirectObject) for item in _list)
-def _sort_pdf_object_refs(refs: List[PdfObjectRelationship]) -> List[PdfObjectRelationship]:
-    """Sort a list of PdfObjectRelationship objects by their to_obj's idnum. Only used by pytest."""
-    return sorted(refs, key=lambda ref: ref.to_obj.idnum)
+def pdf_object_id(pdf_object) -> Optional[int]:
+    """Return the ID of an IndirectObject and None for everything else"""
+    return pdf_object.idnum if isinstance(pdf_object, IndirectObject) else None
 def pypdf_class_name(obj: PdfObject) -> str:
@@ -28,3 +23,8 @@ def pypdf_class_name(obj: PdfObject) -> str:
     class_pkgs = type(obj).__name__.split('.')
     class_pkgs.reverse()
     return class_pkgs[0].removesuffix('Object')
+def _sort_pdf_object_refs(refs: List[PdfObjectRelationship]) -> List[PdfObjectRelationship]:
+    """Sort a list of PdfObjectRelationship objects by their to_obj's idnum. Only used by pytest."""
+    return sorted(refs, key=lambda ref: ref.to_obj.idnum)