softhauzpy 0.0.1__tar.gz → 0.0.3__tar.gz

@@ -1,15 +1,17 @@
  Metadata-Version: 2.1
  Name: softhauzpy
- Version: 0.0.1
+ Version: 0.0.3
+ Author: Karen Urate
+ Author-email: karen.urate@softhauz.ca
  Description-Content-Type: text/markdown

  # SofthauzPy
  **SofthauzPy** is a comprehensive Python toolkit built for developers creating intelligent, data-driven web applications. It provides a powerful suite of web utilities including web scraping tools, crawling systems, content extraction pipelines, and search engine components that help developers build fully customizable in-house website search solutions.

- Designed for scalability and flexibility, Softhauz enables teams to collect, process, index, and search website content efficiently — all within a clean Python-first development ecosystem.
+ Designed for scalability and flexibility, **SofthauzPy** enables teams to collect, process, index, and search website content efficiently all within a clean Python-first development ecosystem.

- Built for developers who need scalable web data tools and intelligent search capabilities, Softhauz simplifies the process of scraping, processing, indexing, and searching website content.
- From lightweight crawlers to fully customizable in-house search engine functionality, Softhauz helps developers build smarter web applications without relying heavily on external search services.
+ Built for developers who need scalable web data tools and intelligent search capabilities, **SofthauzPy** simplifies the process of scraping, processing, indexing, and searching website content.
+ From lightweight crawlers to fully customizable in-house search engine functionality, **SofthauzPy** helps developers build smarter web applications without relying heavily on external search services.


  ## Key Features
@@ -1,10 +1,10 @@
  # SofthauzPy
  **SofthauzPy** is a comprehensive Python toolkit built for developers creating intelligent, data-driven web applications. It provides a powerful suite of web utilities including web scraping tools, crawling systems, content extraction pipelines, and search engine components that help developers build fully customizable in-house website search solutions.

- Designed for scalability and flexibility, Softhauz enables teams to collect, process, index, and search website content efficiently — all within a clean Python-first development ecosystem.
+ Designed for scalability and flexibility, **SofthauzPy** enables teams to collect, process, index, and search website content efficiently — all within a clean Python-first development ecosystem.

- Built for developers who need scalable web data tools and intelligent search capabilities, Softhauz simplifies the process of scraping, processing, indexing, and searching website content.
- From lightweight crawlers to fully customizable in-house search engine functionality, Softhauz helps developers build smarter web applications without relying heavily on external search services.
+ Built for developers who need scalable web data tools and intelligent search capabilities, **SofthauzPy** simplifies the process of scraping, processing, indexing, and searching website content.
+ From lightweight crawlers to fully customizable in-house search engine functionality, **SofthauzPy** helps developers build smarter web applications without relying heavily on external search services.


  ## Key Features
@@ -1,11 +1,13 @@
  from setuptools import setup, find_packages

- with open("README.md", "r") as f:
+ with open("README.md", "r", encoding="utf-8") as f:
      description = f.read()

  setup(
      name='softhauzpy',
-     version='0.0.1',
+     version='0.0.3',
+     author='Karen Urate',
+     author_email='karen.urate@softhauz.ca',
      packages=find_packages(),
      install_requires=[
          'requests>=2.32.3',
@@ -989,21 +989,18 @@ def incremental_update(
      fp = fingerprint_page(text)

      if fingerprints.get(url) == fp:
-         return False # No change — skip re-indexing
+         return False

      fingerprints[url] = fp

-     # Remove stale entries from index
      for token in list(index.keys()):
          index[token] = [(doc_id, freq) for doc_id, freq in index[token] if doc_id != url]
          if not index[token]:
              del index[token]

-     # Remove stale tfidf and metadata entries
      tfidf.pop(url, None)
      metadata[:] = [m for m in metadata if m.get("url") != url]

-     # Build fresh entries for this page
      token_freq = Counter(tokenize(text))
      total = len(list(token_freq.elements())) or 1
      for token, freq in token_freq.items():
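
The hunk above only removes comments, so the behaviour of `incremental_update` should be unchanged. For readers without the full source, the following is a minimal, self-contained sketch of the same skip-if-unchanged pattern. It is an illustration under stated assumptions, not the package's actual implementation: the bodies of `fingerprint_page` and `tokenize`, the simplified function signature, and the example URL are all hypothetical, and the `tfidf` and `metadata` bookkeeping is left out.

```python
import hashlib
import re
from collections import Counter


def fingerprint_page(text):
    # Assumption: a SHA-256 digest stands in for the package's real fingerprint.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def tokenize(text):
    # Assumption: lowercase alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def incremental_update(url, text, index, fingerprints):
    """Re-index one page only if its content changed; return True if re-indexed."""
    fp = fingerprint_page(text)
    if fingerprints.get(url) == fp:
        return False  # Same fingerprint, so the stored postings are still valid.
    fingerprints[url] = fp

    # Drop postings that still point at the previous version of this page.
    for token in list(index.keys()):
        index[token] = [(doc_id, freq) for doc_id, freq in index[token] if doc_id != url]
        if not index[token]:
            del index[token]

    # Add fresh postings for the new page content.
    for token, freq in Counter(tokenize(text)).items():
        index.setdefault(token, []).append((url, freq))
    return True


index, fingerprints = {}, {}
page = "https://example.com/a"  # hypothetical URL
first = incremental_update(page, "hello world hello", index, fingerprints)
second = incremental_update(page, "hello world hello", index, fingerprints)
third = incremental_update(page, "goodbye world", index, fingerprints)
```

In this sketch the second call returns early because the fingerprint matches, and the third call removes the stale `hello` posting before adding the new ones.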
@@ -1,15 +1,17 @@
  Metadata-Version: 2.1
  Name: softhauzpy
- Version: 0.0.1
+ Version: 0.0.3
+ Author: Karen Urate
+ Author-email: karen.urate@softhauz.ca
  Description-Content-Type: text/markdown

  # SofthauzPy
  **SofthauzPy** is a comprehensive Python toolkit built for developers creating intelligent, data-driven web applications. It provides a powerful suite of web utilities including web scraping tools, crawling systems, content extraction pipelines, and search engine components that help developers build fully customizable in-house website search solutions.

- Designed for scalability and flexibility, Softhauz enables teams to collect, process, index, and search website content efficiently — all within a clean Python-first development ecosystem.
+ Designed for scalability and flexibility, **SofthauzPy** enables teams to collect, process, index, and search website content efficiently all within a clean Python-first development ecosystem.

- Built for developers who need scalable web data tools and intelligent search capabilities, Softhauz simplifies the process of scraping, processing, indexing, and searching website content.
- From lightweight crawlers to fully customizable in-house search engine functionality, Softhauz helps developers build smarter web applications without relying heavily on external search services.
+ Built for developers who need scalable web data tools and intelligent search capabilities, **SofthauzPy** simplifies the process of scraping, processing, indexing, and searching website content.
+ From lightweight crawlers to fully customizable in-house search engine functionality, **SofthauzPy** helps developers build smarter web applications without relying heavily on external search services.


  ## Key Features
File without changes