PyPI - epstein-files - Versions diffs - 1.0.1__tar.gz → 1.0.2__tar.gz - Mend

epstein-files 1.0.1tar.gz → 1.0.2tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (34)

{epstein_files-1.0.1 → epstein_files-1.0.2}/PKG-INFO RENAMED

@@ -1,26 +1,41 @@
 Metadata-Version: 2.1
 Name: epstein-files
-Version: 1.0.1
+Version: 1.0.2
 Summary: Tools for working with the Jeffrey Epstein documents released in November 2025.
+Home-page: https://michelcrypt4d4mus.github.io/epstein_text_messages/
+License: GPL-3.0-or-later
+Keywords: Epstein,Jeffrey Epstein
 Author: Michel de Cryptadamus
 Requires-Python: >=3.11,<4.0
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Information Technology
+Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
+Classifier: Programming Language :: Python
 Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
 Requires-Dist: datefinder (>=0.7.3,<0.8.0)
 Requires-Dist: inflection (>=0.5.1,<0.6.0)
 Requires-Dist: python-dateutil (>=2.9.0.post0,<3.0.0)
 Requires-Dist: python-dotenv (>=1.2.1,<2.0.0)
 Requires-Dist: requests (>=2.32.5,<3.0.0)
 Requires-Dist: rich (>=14.2.0,<15.0.0)
+Project-URL: Emails, https://michelcrypt4d4mus.github.io/epstein_text_messages/all_emails_epstein_files_nov_2025.html
+Project-URL: Metadata, https://michelcrypt4d4mus.github.io/epstein_text_messages/file_metadata_epstein_files_nov_2025.json
+Project-URL: TextMessages, https://michelcrypt4d4mus.github.io/epstein_text_messages
+Project-URL: WordCounts, https://michelcrypt4d4mus.github.io/epstein_text_messages/communication_word_count_epstein_files_nov_2025.html
 Description-Content-Type: text/markdown
 # I Made Epstein's Text Messages Great Again
 * [I Made Epstein's Text Messages Great Again (And You Should Read Them)](https://cryptadamus.substack.com/p/i-made-epsteins-text-messages-great) post on [Substack](https://cryptadamus.substack.com/p/i-made-epsteins-text-messages-great)
 * The Epstein text messages (and some of the emails along with summary counts of sent emails to/from Epstein) generated by this code can be viewed [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/).
-* All of His Emails can be read at another page also generated by this code [here](https://michelcrypt4d4mus.github.io/epstein_emails_house_oversight/).
-* Word counts for the emails and text messages are [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/epstein_emails_word_count.html).
-* Metadata containing what I have figured out about who sent or received the communications in a given file (and a brief explanation for how I figured it out for each file) is deployed [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/epstein_files_nov_2025_cryptadamus_metadata.json)
+* All of His Emails can be read at another page also generated by this code [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/all_emails_epstein_files_nov_2025.html).
+* Word counts for the emails and text messages are [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/communication_word_count_epstein_files_nov_2025.html).
+* Metadata containing what I have figured out about who sent or received the communications in a given file (and a brief explanation for how I figured it out for each file) is deployed [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/file_metadata_epstein_files_nov_2025.json)
 * Configuration variables assigning specific `HOUSE_OVERSIGHT_XXXXXX.txt` file IDs (the `111111` part) as being emails to or from particular people based on various research and contributions can be found in [constants.py](./epstein_files/util/constants.py). Everything in `constants.py` should also appear in the JSON metadata.

{epstein_files-1.0.1 → epstein_files-1.0.2}/README.md RENAMED

@@ -2,9 +2,9 @@
 * [I Made Epstein's Text Messages Great Again (And You Should Read Them)](https://cryptadamus.substack.com/p/i-made-epsteins-text-messages-great) post on [Substack](https://cryptadamus.substack.com/p/i-made-epsteins-text-messages-great)
 * The Epstein text messages (and some of the emails along with summary counts of sent emails to/from Epstein) generated by this code can be viewed [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/).
-* All of His Emails can be read at another page also generated by this code [here](https://michelcrypt4d4mus.github.io/epstein_emails_house_oversight/).
-* Word counts for the emails and text messages are [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/epstein_emails_word_count.html).
-* Metadata containing what I have figured out about who sent or received the communications in a given file (and a brief explanation for how I figured it out for each file) is deployed [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/epstein_files_nov_2025_cryptadamus_metadata.json)
+* All of His Emails can be read at another page also generated by this code [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/all_emails_epstein_files_nov_2025.html).
+* Word counts for the emails and text messages are [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/communication_word_count_epstein_files_nov_2025.html).
+* Metadata containing what I have figured out about who sent or received the communications in a given file (and a brief explanation for how I figured it out for each file) is deployed [here](https://michelcrypt4d4mus.github.io/epstein_text_messages/file_metadata_epstein_files_nov_2025.json)
 * Configuration variables assigning specific `HOUSE_OVERSIGHT_XXXXXX.txt` file IDs (the `111111` part) as being emails to or from particular people based on various research and contributions can be found in [constants.py](./epstein_files/util/constants.py). Everything in `constants.py` should also appear in the JSON metadata.

epstein_files-1.0.2/epstein_files/__init__.py ADDED

@@ -0,0 +1,134 @@
+#!/usr/bin/env python
+"""
+Reformat Epstein text message files for readability and count email senders.
+For use with iMessage log files from https://drive.google.com/drive/folders/1hTNH5woIRio578onLGElkTWofUSWRoH_
+Install: 'poetry install'
+    Run: 'EPSTEIN_DOCS_DIR=/path/to/TXT ./generate.py'
+"""
+from sys import exit
+from dotenv import load_dotenv
+load_dotenv()
+from rich.markup import escape
+from rich.padding import Padding
+from rich.panel import Panel
+from epstein_files.epstein_files import EpsteinFiles, document_cls
+from epstein_files.documents.document import INFO_PADDING, Document
+from epstein_files.documents.email import Email
+from epstein_files.util.constant.html import *
+from epstein_files.util.constant.names import *
+from epstein_files.util.constant.output_files import ALL_EMAILS_PATH, TEXT_MSGS_HTML_PATH, make_clean
+from epstein_files.util.env import args, specified_names
+from epstein_files.util.file_helper import coerce_file_path, extract_file_id
+from epstein_files.util.logging import logger
+from epstein_files.util.output import print_emails, print_json_metadata, print_json_stats, print_text_messages, write_urls
+from epstein_files.util.rich import build_highlighter, console, print_header, print_panel, write_html
+from epstein_files.util.timer import Timer
+def generate_html() -> None:
+    if args.make_clean:
+        make_clean()
+        exit()
+    timer = Timer()
+    epstein_files = EpsteinFiles.get_files(timer)
+    if args.json_metadata:
+        print_json_metadata(epstein_files)
+        exit()
+    print_header(epstein_files)
+    if args.colors_only:
+        exit()
+    if args.output_texts:
+        print_text_messages(epstein_files)
+        timer.print_at_checkpoint(f'Printed {len(epstein_files.imessage_logs)} text message logs')
+    if args.output_emails:
+        emails_printed = print_emails(epstein_files)
+        timer.print_at_checkpoint(f"Printed {emails_printed:,} emails")
+    if args.output_other_files:
+        files_printed = epstein_files.print_other_files_table()
+        timer.print_at_checkpoint(f"Printed {len(files_printed)} other files")
+    # Save output
+    write_html(ALL_EMAILS_PATH if args.all_emails else TEXT_MSGS_HTML_PATH)
+    logger.warning(f"Total time: {timer.seconds_since_start_str()}")
+    # JSON stats (mostly used for building pytest checks)
+    if args.json_stats:
+        print_json_stats(epstein_files)
+def epstein_diff():
+    """Diff the cleaned up text of two files."""
+    Document.diff_files(args.positional_args)
+def epstein_search():
+    """Search the cleaned up text of the files."""
+    _assert_positional_args()
+    epstein_files = EpsteinFiles.get_files(use_pickled=True)
+    for search_term in args.positional_args:
+        temp_highlighter = build_highlighter(search_term)
+        search_results = epstein_files.docs_matching(search_term, specified_names)
+        console.line(2)
+        print_panel(f"Found {len(search_results)} documents matching '{search_term}'", padding=(0, 0, 0, 3))
+        for search_result in search_results:
+            console.line()
+            if args.whole_file:
+                console.print(search_result.document)
+            else:
+                console.print(search_result.document.description_panel())
+                for matching_line in search_result.lines:
+                    line_txt = matching_line.__rich__()
+                    console.print(Padding(temp_highlighter(line_txt), INFO_PADDING), style='gray37')
+def epstein_show():
+    """Show the color highlighted file. If --raw arg is passed, show the raw text of the file as well."""
+    _assert_positional_args()
+    ids = [extract_file_id(arg) for arg in args.positional_args]
+    console.line()
+    if args.pickled:
+        epstein_files = EpsteinFiles.get_files(use_pickled=True)
+        docs = epstein_files.get_documents_by_id(ids)
+    else:
+        raw_docs = [Document(coerce_file_path(id)) for id in ids]
+        docs = [document_cls(doc)(doc.file_path) for doc in raw_docs]
+    for doc in docs:
+        console.line()
+        console.print(doc)
+        if args.raw:
+            console.line()
+            console.print(Panel(f"*** {doc.url_slug} RAW ***", expand=False, style=doc._border_style()))
+            console.print(escape(doc.raw_text()))
+            if isinstance(doc, Email):
+                console.line()
+                console.print(Panel(f"*** {doc.url_slug} actual_text ***", expand=False, style=doc._border_style()))
+                console.print(escape(doc._actual_text()))
+def epstein_dump_urls() -> None:
+    write_urls()
+def _assert_positional_args():
+    if not args.positional_args:
+        console.print(f"\n  ERROR: No positional args!\n", style='red1')
+        exit(1)

{epstein_files-1.0.1 → epstein_files-1.0.2}/epstein_files/documents/document.py RENAMED

@@ -255,7 +255,11 @@ class Document:
             txt.append(f"{timestamp_str}", style=TIMESTAMP_DIM).append(')', style=SYMBOL_STYLE)
         txt.append(' [').append(key_value_txt('size', Text(self.file_size_str(), style='aquamarine1')))
-        txt.append(", ").append(key_value_txt('lines', Text(f"{self.num_lines}", style='cyan')))
+        txt.append(", ").append(key_value_txt('lines', self.num_lines))
+        if self.config and self.config.dupe_of_id:
+            txt.append(", ").append(key_value_txt('dupe_of', Text(self.config.dupe_of_id, style='magenta')))
         return txt
     def top_lines(self, n: int = 10) -> str:

{epstein_files-1.0.1 → epstein_files-1.0.2}/epstein_files/epstein_files.py RENAMED

@@ -19,6 +19,7 @@ from epstein_files.documents.emails.email_header import AUTHOR
 from epstein_files.documents.json_file import JsonFile
 from epstein_files.documents.messenger_log import MSG_REGEX, MessengerLog
 from epstein_files.documents.other_file import OtherFile
+from epstein_files.util.constant.output_files import PICKLED_PATH
 from epstein_files.util.constant.strings import *
 from epstein_files.util.constant.urls import (EPSTEIN_WEB, JMAIL, epsteinify_name_url, epstein_web_person_url,
      search_jmail_url, search_twitter_url)
@@ -26,7 +27,7 @@ from epstein_files.util.constants import *
 from epstein_files.util.data import dict_sets_to_lists, json_safe, sort_dict
 from epstein_files.util.doc_cfg import EmailCfg
 from epstein_files.util.env import args, logger
-from epstein_files.util.file_helper import DOCS_DIR, PICKLED_PATH, file_size_str
+from epstein_files.util.file_helper import DOCS_DIR, file_size_str
 from epstein_files.util.highlighted_group import get_info_for_name, get_style_for_name
 from epstein_files.util.rich import (DEFAULT_NAME_STYLE, NA_TXT, add_cols_to_table, console, highlighter,
      link_text_obj, link_markup, print_author_header, print_centered, print_other_site_link, print_panel,
@@ -37,7 +38,7 @@ from epstein_files.util.timer import Timer
 DEVICE_SIGNATURE = 'Device Signature'
 DEVICE_SIGNATURE_PADDING = (1, 0)
 NOT_INCLUDED_EMAILERS = [e.lower() for e in (USELESS_EMAILERS + [JEFFREY_EPSTEIN])]
-SLOW_FILE_SECONDS = 0.4
+SLOW_FILE_SECONDS = 1.0
 INVALID_FOR_EPSTEIN_WEB = JUNK_EMAILERS + KRASSNER_RECIPIENTS + [
     'ACT for America',
@@ -54,6 +55,7 @@ class EpsteinFiles:
     imessage_logs: list[MessengerLog] = field(default_factory=list)
     json_files: list[JsonFile] = field(default_factory=list)
     other_files: list[OtherFile] = field(default_factory=list)
+    timer: Timer = field(default_factory=lambda: Timer())
     # Analytics / calculations
     email_author_counts: dict[str | None, int] = field(default_factory=lambda: defaultdict(int))
@@ -90,17 +92,18 @@ class EpsteinFiles:
         self._tally_email_data()
     @classmethod
-    def get_files(cls, timer: Timer | None = None) -> 'EpsteinFiles':
+    def get_files(cls, timer: Timer | None = None, use_pickled: bool = False) -> 'EpsteinFiles':
         """Alternate constructor that reads/writes a pickled version of the data ('timer' arg is for logging)."""
         timer = timer or Timer()
-        if (args.pickled and PICKLED_PATH.exists()) and not args.overwrite_pickle:
+        if ((args.pickled or use_pickled) and PICKLED_PATH.exists()) and not args.overwrite_pickle:
             with gzip.open(PICKLED_PATH, 'rb') as file:
                 epstein_files = pickle.load(file)
                 timer.print_at_checkpoint(f"Loaded {len(epstein_files.all_files):,} documents from '{PICKLED_PATH}' ({file_size_str(PICKLED_PATH)})")
+                epstein_files.timer = timer
                 return epstein_files
-        epstein_files = EpsteinFiles()
+        epstein_files = EpsteinFiles(timer=timer)
         if args.overwrite_pickle or not PICKLED_PATH.exists():
             with gzip.open(PICKLED_PATH, 'wb') as file:
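
The hunk above loads a gzip-compressed pickle when caching is requested and the file exists, and otherwise builds the object from scratch and opens the cache for writing. Below is a minimal standalone sketch of that pattern, not the package's code. `load_or_build`, `build`, and the cache path are hypothetical names, and the `pickle.dump` call is an assumption because the hunk ends before the write is shown.

```python
import gzip
import pickle
import tempfile
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(cache_path: Path, build: Callable[[], T], use_cache: bool = True,
                  overwrite: bool = False) -> T:
    """Return the cached object if allowed and present, else build it and write the cache."""
    if use_cache and cache_path.exists() and not overwrite:
        with gzip.open(cache_path, "rb") as file:
            return pickle.load(file)  # only unpickle files you created yourself

    obj = build()

    # Same condition as the diff: write when asked to overwrite or when no cache exists yet.
    if overwrite or not cache_path.exists():
        with gzip.open(cache_path, "wb") as file:
            pickle.dump(obj, file)  # assumed; the write itself is not visible in the hunk

    return obj


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example.pkl.gz"
        first = load_or_build(path, lambda: {"docs": [1, 2, 3]})  # builds and writes the cache
        second = load_or_build(path, lambda: {"docs": []})        # served from the cache
        print(first == second)
```

Passing `use_cache=True` from the caller plays the role of the new `use_pickled` argument: the caller can opt in to the cache without setting a command-line flag.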
@@ -197,37 +200,36 @@ class EpsteinFiles:
     def json_metadata(self) -> str:
         metadata = {
-            EMAIL_CLASS: [json_safe(doc.metadata()) for doc in self.emails],
-            MESSENGER_LOG_CLASS: [json_safe(doc.metadata()) for doc in self.imessage_logs],
-            OTHER_FILE_CLASS: [json_safe(doc.metadata()) for doc in self.other_files],
+            EMAIL_CLASS: [json_safe(d.metadata()) for d in self.emails],
+            JSON_FILE_CLASS: [json_safe(d.metadata()) for d in self.json_files],
+            MESSENGER_LOG_CLASS: [json_safe(d.metadata()) for d in self.imessage_logs],
+            OTHER_FILE_CLASS: [json_safe(d.metadata()) for d in self.other_files if not isinstance(d, JsonFile)],
         }
         return json.dumps(metadata, indent=4, sort_keys=True)
-    def print_files_summary(self) -> None:
-        other_files = [doc for doc in self.other_files if not isinstance(doc, JsonFile)]
-        dupes = defaultdict(int)
-        for doc in self.all_documents():
-            if doc.is_duplicate:
-                dupes[doc.class_name()] += 1
+    def non_json_other_files(self) -> list[OtherFile]:
+        return [doc for doc in self.other_files if not isinstance(doc, JsonFile)]
+    def print_files_summary(self) -> None:
         table = Table(title='Summary of Document Types')
         add_cols_to_table(table, ['File Type', 'Files', 'Author Known', 'Author Unknown', 'Duplicates'])
-        def add_row(label: str, docs: list, known: int | None = None, dupes: int | None = None):
+        def add_row(label: str, docs: list):
+            known = None if isinstance(docs[0], JsonFile) else len([d for d in docs if d.author])
             table.add_row(
                 label,
                 f"{len(docs):,}",
-                f"{known:,}" if known else NA_TXT,
-                f"{len(docs) - known:,}" if known else NA_TXT,
-                f"{dupes:,}" if dupes else NA_TXT,
+                f"{known:,}" if known is not None else NA_TXT,
+                f"{len(docs) - known:,}" if known is not None else NA_TXT,
+                f"{len([d for d in docs if d.is_duplicate])}",
             )
-        add_row('iMessage Logs', self.imessage_logs, self.identified_imessage_log_count())
-        add_row('Emails', self.emails, len([e for e in self.emails if e.author]), dupes[EMAIL_CLASS])
-        add_row('JSON Data', self.json_files, dupes=0)
-        add_row('Other', other_files, dupes=dupes[OTHER_FILE_CLASS])
+        add_row('iMessage Logs', self.imessage_logs)
+        add_row('Emails', self.emails)
+        add_row('JSON Data', self.json_files)
+        add_row('Other', self.non_json_other_files())
         console.print(Align.center(table))
         console.line()
@@ -357,6 +359,18 @@ def build_signature_table(keyed_sets: dict[str, set[str]], cols: tuple[str, str]
     return Padding(table, DEVICE_SIGNATURE_PADDING)
+def count_by_month(docs: Sequence[Document]) -> dict[str | None, int]:
+    counts: dict[str | None, int] = defaultdict(int)
+    for doc in docs:
+        if doc.timestamp:
+            counts[doc.timestamp.date().isoformat()[0:7]] += 1
+        else:
+            counts[None] += 1
+    return counts
 def document_cls(document: Document) -> Type[Document]:
     search_area = document.text[0:5000]  # Limit search area to avoid pointless scans of huge files
@@ -380,15 +394,3 @@ def is_ok_for_epstein_web(name: str | None) -> bool:
         return False
     return True
-def count_by_month(docs: Sequence[Document]) -> dict[str | None, int]:
-    counts: dict[str | None, int] = defaultdict(int)
-    for doc in docs:
-        if doc.timestamp:
-            counts[doc.timestamp.date().isoformat()[0:7]] += 1
-        else:
-            counts[None] += 1
-    return counts

{epstein_files-1.0.1 → epstein_files-1.0.2}/epstein_files/util/constant/names.py RENAMED

@@ -1,6 +1,5 @@
 from epstein_files.util.constant.strings import QUESTION_MARKS, remove_question_marks
 UNKNOWN = '(unknown)'
 # Texting Names
@@ -170,6 +169,7 @@ ZUBAIR_KHAN = 'Zubair Khan'
 # No communications but name is in the files
 BILL_GATES = 'Bill Gates'
+DONALD_TRUMP = 'Donald Trump'
 ELON_MUSK = 'Elon Musk'
 HENRY_HOLT = 'Henry Holt'  # Actually a company?
 IVANKA = 'Ivanka'
@@ -195,6 +195,7 @@ INSIGHTS_POD = f"InsightsPod"  # Zubair bots
 NEXT_MANAGEMENT = 'Next Management LLC'
 JP_MORGAN = 'JP Morgan'
 OSBORNE_LLP = f"{IAN_OSBORNE} & Partners LLP"  # Ian Osborne's PR firm
+TRUMP_ORG = 'Trump Organization'
 UBS = 'UBS'
 # Locations

epstein_files-1.0.2/epstein_files/util/constant/output_files.py ADDED

@@ -0,0 +1,29 @@
+from pathlib import Path
+PICKLED_PATH = Path("the_epstein_files.pkl.gz")
+EPSTEIN_FILES_NOV_2025 = 'epstein_files_nov_2025'
+URLS_ENV = '.urls.env'
+HTML_DIR = Path('docs')
+ALL_EMAILS_PATH = HTML_DIR.joinpath(f'all_emails_{EPSTEIN_FILES_NOV_2025}.html')
+JSON_METADATA_PATH = HTML_DIR.joinpath(f'file_metadata_{EPSTEIN_FILES_NOV_2025}.json')
+TEXT_MSGS_HTML_PATH = HTML_DIR.joinpath('index.html')
+WORD_COUNT_HTML_PATH = HTML_DIR.joinpath(f'communication_word_count_{EPSTEIN_FILES_NOV_2025}.html')
+# EPSTEIN_WORD_COUNT_HTML_PATH = HTML_DIR.joinpath('epstein_texts_and_emails_word_count.html')
+BUILD_ARTIFACTS = [
+    ALL_EMAILS_PATH,
+    # EPSTEIN_WORD_COUNT_HTML_PATH,
+    JSON_METADATA_PATH,
+    TEXT_MSGS_HTML_PATH,
+    WORD_COUNT_HTML_PATH,
+]
+def make_clean() -> None:
+    """Delete all build artifacts."""
+    for build_file in BUILD_ARTIFACTS:
+        if build_file.exists():
+            print(f"Removing build file '{build_file}'...")
+            build_file.unlink()

{epstein_files-1.0.1 → epstein_files-1.0.2}/epstein_files/util/constant/urls.py RENAMED

@@ -5,8 +5,9 @@ from typing import Literal
 from inflection import parameterize
 from rich.text import Text
+from epstein_files.util.constant.output_files import *
 from epstein_files.util.constant.strings import EMAIL, TEXT_MESSAGE, SiteType
-from epstein_files.util.file_helper import JSON_METADATA_PATH, WORD_COUNT_HTML_PATH, coerce_file_stem
+from epstein_files.util.file_helper import coerce_file_stem
 # Style stuff
 ARCHIVE_LINK_COLOR = 'slate_blue3'
@@ -21,15 +22,17 @@ EPSTEINIFY = 'epsteinify'
 JMAIL = 'Jmail'
-# Cryptadamus URLs
+# Deployment URLS
+# NOTE: don't rename these variables without changing deploy.sh!
 GH_PAGES_BASE_URL = 'https://michelcrypt4d4mus.github.io'
-TEXT_MSGS_BASE_URL = f"{GH_PAGES_BASE_URL}/epstein_text_messages"
-JSON_METADATA_URL = f'{TEXT_MSGS_BASE_URL}/{JSON_METADATA_PATH.name}'
-WORD_COUNT_URL = f'{TEXT_MSGS_BASE_URL}/{WORD_COUNT_HTML_PATH.name}'
+TEXT_MSGS_URL = f"{GH_PAGES_BASE_URL}/epstein_text_messages"
+ALL_EMAILS_URL = f'{TEXT_MSGS_URL}/{ALL_EMAILS_PATH.name}'
+JSON_METADATA_URL = f'{TEXT_MSGS_URL}/{JSON_METADATA_PATH.name}'
+WORD_COUNT_URL = f'{TEXT_MSGS_URL}/{WORD_COUNT_HTML_PATH.name}'
 SITE_URLS: dict[SiteType, str] = {
-    EMAIL: f'{GH_PAGES_BASE_URL}/epstein_emails_house_oversight/',  # TODO should just be same repo
-    TEXT_MESSAGE: TEXT_MSGS_BASE_URL,
+    EMAIL: ALL_EMAILS_URL,
+    TEXT_MESSAGE: TEXT_MSGS_URL,
 }
 GH_PROJECT_URL = 'https://github.com/michelcrypt4d4mus/epstein_text_messages'
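
The deployment URLs in this hunk are derived from the output file paths: each page's URL is the GitHub Pages base plus the file's name, so renaming an output file changes its public URL as well. Below is a sketch using values shown in this diff. `page_url` and `SITE_BASE_URL` are hypothetical names, not part of the package.

```python
from pathlib import Path

HTML_DIR = Path("docs")
SITE_BASE_URL = "https://michelcrypt4d4mus.github.io/epstein_text_messages"
ALL_EMAILS_PATH = HTML_DIR.joinpath("all_emails_epstein_files_nov_2025.html")


def page_url(path: Path, base_url: str = SITE_BASE_URL) -> str:
    """Build the public URL from the file name only; the local 'docs' directory is dropped."""
    return f"{base_url}/{path.name}"


print(page_url(ALL_EMAILS_PATH))
```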
