waterfall 0.1.1__py3-none-any.whl → 0.1.3__py3-none-any.whl

@@ -1,23 +1,22 @@
  Metadata-Version: 2.4
  Name: waterfall
- Version: 0.1.1
+ Version: 0.1.3
  Summary: Scalable Framework for Robust Text Watermarking and Provenance for LLMs
- Author-email: Xinyuan Niu <aperture@outlook.sg>
- License-Expression: Apache-2.0
  Project-URL: Homepage, https://github.com/aoi3142/Waterfall
  Project-URL: Issues, https://github.com/aoi3142/Waterfall/issues
- Classifier: Programming Language :: Python :: 3
+ Author-email: Xinyuan Niu <aperture@outlook.sg>
+ License-Expression: Apache-2.0
+ License-File: LICENSE
  Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
  Requires-Python: >=3.10
- Description-Content-Type: text/markdown
- License-File: LICENSE
  Requires-Dist: accelerate>=0.29.0
  Requires-Dist: numpy>=2.0.0
  Requires-Dist: scipy>=1.13.0
  Requires-Dist: sentence-transformers>=3.0.0
  Requires-Dist: torch>=2.3.0
  Requires-Dist: transformers>=4.43.1
- Dynamic: license-file
+ Description-Content-Type: text/markdown

  # Waterfall: Scalable Framework for Robust Text Watermarking and Provenance for LLMs [EMNLP 2024 Main Long]
  Gregory Kang Ruey Lau*, Xinyuan Niu*, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low
@@ -53,7 +52,7 @@ Protecting intellectual property (IP) of text such as articles and code is incre

  5. A token is sampled from the perturbed logits $\check{L}$ and is appended to the watermarked text.

- 6. Append the generated token to the prompt and continue autoregressive generation (steps 1-5) until the eos token.
+ 6. Append the generated token to the prompt and continue autoregressive generation (steps 1-5) until the `eos` token.

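Reviewer note: steps 5 and 6 above describe an ordinary sample-and-append loop. The sketch below is only an illustration under stated assumptions, not the package's implementation: steps 1-4 are not visible in this diff, so they are hidden behind a hypothetical callable `perturbed_logits_fn`, and the names `sample_token` and `generate` are ours.

```python
import math
import random
from typing import Callable, List, Sequence


def sample_token(logits: Sequence[float], rng: random.Random) -> int:
    """Sample a token index from softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(
    perturbed_logits_fn: Callable[[List[int]], Sequence[float]],
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int = 50,
    seed: int = 0,
) -> List[int]:
    """Autoregressively sample tokens until `eos_id` or the token budget."""
    rng = random.Random(seed)
    context = list(prompt)
    output: List[int] = []
    for _ in range(max_new_tokens):
        # Hypothetical stand-in for steps 1-4 (producing the perturbed logits).
        logits = perturbed_logits_fn(context)
        token = sample_token(logits, rng)  # step 5: sample from perturbed logits
        output.append(token)               # step 5: append to the watermarked text
        if token == eos_id:
            break
        context.append(token)              # step 6: extend the prompt and repeat
    return output


def toy_logits(context: List[int]) -> List[float]:
    """Toy 3-token 'model': emits token 0 until the context has 5 tokens, then eos (2)."""
    if len(context) >= 5:
        return [0.0, 0.0, 1000.0]
    return [1000.0, 0.0, -1000.0]


if __name__ == "__main__":
    print(generate(toy_logits, prompt=[0, 0], eos_id=2))
```

The toy logits are deliberately extreme so that sampling is effectively deterministic; with a real model the loop is the same but the output varies with the seed.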
  # Verification of un/watermarked text

@@ -77,6 +76,13 @@ Protecting intellectual property (IP) of text such as articles and code is incre

  # Using our code

+ Install our package using `pip`
+ ```sh
+ pip install waterfall
+ ```
+
+ ## Alternative installation from source
+
  [Optional]
  If using `conda` (or other pkg managers), it is highly advisable to create a new environment

@@ -98,6 +104,8 @@ Use the command `waterfall_demo` to watermark a piece of text, and then verify t
  waterfall_demo
  ```

+ \* Ensure that your device (`cuda`/`cpu`/`mps`) has enough memory to load the model and perform inference (~18GB+ for default Llama 3.1 8B model)
+
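Reviewer note: a back-of-the-envelope check on the ~18GB figure added above. The sketch assumes 16-bit weights (2 bytes per parameter) and a round 8 billion parameters; the dtype the package actually loads with is not visible in this diff, and the helper name `weight_memory_gb` is ours.

```python
def weight_memory_gb(n_params: int, bytes_per_param: int = 2) -> float:
    """Memory needed for model weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Assumed: ~8e9 parameters stored as 16-bit floats.
    print(weight_memory_gb(8_000_000_000))  # 16.0
```

Under these assumptions the weights take about 16 GB, so the remaining headroom in the quoted figure would go to activations, the KV cache and framework overhead.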
  Additional arguments
  ```sh
  waterfall_demo \
@@ -107,9 +115,11 @@ waterfall_demo \
  --kappa 2 `# Watermark strength` \
  --model meta-llama/Llama-3.1-8B-Instruct `# Paraphrasing LLM` \
  --watermark_fn fourier `# fourier/square watermark` \
- --device cuda `# Use cuda/cpu`
+ --device cuda `# Use cuda/cpu/mps`
  ```

+ \* By default, `--device` automatically selects among `cuda`/`cpu`/`mps` if not set
+
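Reviewer note: the added line above says `--device` is chosen automatically when unset, but the selection logic itself is not part of this diff. A common pattern with PyTorch looks roughly like the sketch below. The priority order `cuda` > `mps` > `cpu` and the helper names `pick_device` and `detect_device` are assumptions, not the package's code.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device name; the cuda > mps > cpu order is an assumption."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends, falling back to cpu without it."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps may be missing on older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    return pick_device(
        torch.cuda.is_available(),
        mps is not None and mps.is_available(),
    )


if __name__ == "__main__":
    print(detect_device())
```

Keeping the decision in a pure function (`pick_device`) makes the priority order easy to test without any accelerator present.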
  ## Using our code to watermark and verify

  To watermark texts
@@ -134,7 +144,7 @@ test_texts = ["...", "..."] # Suspected texts to verify
  watermark_strength = verify_texts(test_texts, id)[0] # np array of floats
  ```

- # Code structure
+ ## Code structure

  - `watermark.py` : Sample watermarking script used by with `watermark_demo` command, includes beam search and other optimizations
  - `WatermarkerBase.py` : Underlying generation and verification code provided by `Watermarker` class
@@ -5,9 +5,8 @@ waterfall/WatermarkingFnSquare.py,sha256=2PAO05DdKT02npo7GDf_82D520nP7kGAWK6H4E4
  waterfall/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  waterfall/permute.py,sha256=RwxOHFhx_VSOhhFwy5s79YgwTUBkfW2-LCCXYR3VT2o,2582
  waterfall/watermark.py,sha256=whiNhPwWNNIZwXMH6r7QzEE3A7Niq2Ro9elA1iSRoxI,11952
- waterfall-0.1.1.dist-info/licenses/LICENSE,sha256=zAtaO-k41Q-Q4Etl4bzuh7pgNJsPH-dYfzvznRa0OvM,11341
- waterfall-0.1.1.dist-info/METADATA,sha256=Ik8I-yLPuHSWdGsrSj7YgTCDJ0uTbfbV8FDvKYlPQ6M,8392
- waterfall-0.1.1.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- waterfall-0.1.1.dist-info/entry_points.txt,sha256=XXnUzuWXu2nc9j4WAll9tq6HyodN_8WJLjeG0O4Y2Gw,60
- waterfall-0.1.1.dist-info/top_level.txt,sha256=5rTgijeT9V5GRCwIDZmhjeZ4khgH1lmfhS9ZmdUUCKQ,10
- waterfall-0.1.1.dist-info/RECORD,,
+ waterfall-0.1.3.dist-info/METADATA,sha256=hnX9Vq2zjuq_u9VxJcP9OQ5U2RHUtTGnTQu6iwUJhXM,8715
+ waterfall-0.1.3.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
+ waterfall-0.1.3.dist-info/entry_points.txt,sha256=XXnUzuWXu2nc9j4WAll9tq6HyodN_8WJLjeG0O4Y2Gw,60
+ waterfall-0.1.3.dist-info/licenses/LICENSE,sha256=zAtaO-k41Q-Q4Etl4bzuh7pgNJsPH-dYfzvznRa0OvM,11341
+ waterfall-0.1.3.dist-info/RECORD,,
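Reviewer note: each RECORD entry above has the form `path,sha256=<digest>,<size>`, where the digest is the file's SHA-256 encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below recomputes such an entry and can be used to spot-check unpacked wheel contents; the helper names `record_hash` and `record_line` are ours, not part of the package.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return the SHA-256 of `data` as unpadded URL-safe base64."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def record_line(path: str, data: bytes) -> str:
    """Format a RECORD-style entry: path, hash and size in bytes."""
    return f"{path},sha256={record_hash(data)},{len(data)}"


if __name__ == "__main__":
    # waterfall/__init__.py is listed with size 0, i.e. it is an empty file.
    print(record_line("waterfall/__init__.py", b""))
    # waterfall/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
```

The empty-file digest matches the `waterfall/__init__.py` entry in the hunk, which is a quick sanity check that the encoding is right. The unchanged hashes for `entry_points.txt` and `LICENSE` likewise indicate those files are byte-identical across the two versions.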
@@ -1,5 +1,4 @@
  Wheel-Version: 1.0
- Generator: setuptools (80.9.0)
+ Generator: hatchling 1.27.0
  Root-Is-Purelib: true
  Tag: py3-none-any
-
@@ -1 +0,0 @@
- waterfall