waterfall 0.1.1__py3-none-any.whl → 0.1.3__py3-none-any.whl
- {waterfall-0.1.1.dist-info → waterfall-0.1.3.dist-info}/METADATA +20 -10
- {waterfall-0.1.1.dist-info → waterfall-0.1.3.dist-info}/RECORD +5 -6
- {waterfall-0.1.1.dist-info → waterfall-0.1.3.dist-info}/WHEEL +1 -2
- waterfall-0.1.1.dist-info/top_level.txt +0 -1
- {waterfall-0.1.1.dist-info → waterfall-0.1.3.dist-info}/entry_points.txt +0 -0
- {waterfall-0.1.1.dist-info → waterfall-0.1.3.dist-info}/licenses/LICENSE +0 -0
@@ -1,23 +1,22 @@
 Metadata-Version: 2.4
 Name: waterfall
-Version: 0.1.
+Version: 0.1.3
 Summary: Scalable Framework for Robust Text Watermarking and Provenance for LLMs
-Author-email: Xinyuan Niu <aperture@outlook.sg>
-License-Expression: Apache-2.0
 Project-URL: Homepage, https://github.com/aoi3142/Waterfall
 Project-URL: Issues, https://github.com/aoi3142/Waterfall/issues
-
+Author-email: Xinyuan Niu <aperture@outlook.sg>
+License-Expression: Apache-2.0
+License-File: LICENSE
 Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
 Requires-Python: >=3.10
-Description-Content-Type: text/markdown
-License-File: LICENSE
 Requires-Dist: accelerate>=0.29.0
 Requires-Dist: numpy>=2.0.0
 Requires-Dist: scipy>=1.13.0
 Requires-Dist: sentence-transformers>=3.0.0
 Requires-Dist: torch>=2.3.0
 Requires-Dist: transformers>=4.43.1
-
+Description-Content-Type: text/markdown
 
 # Waterfall: Scalable Framework for Robust Text Watermarking and Provenance for LLMs [EMNLP 2024 Main Long]
 Gregory Kang Ruey Lau*, Xinyuan Niu*, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo, Bryan Kian Hsiang Low
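Most of the header changes in the METADATA hunk above are fields moving position rather than changing value. Wheel METADATA uses an email-header-style layout, so one way to compare the two versions by content instead of by line order is to parse the headers with the standard library. The sketch below is illustrative only: the `read_metadata` helper is a hypothetical name, and the header text is copied from the 0.1.3 side of the hunk with the README body left out.

```python
from email.parser import Parser

# Header block copied from the 0.1.3 side of the hunk (README body omitted).
METADATA_0_1_3 = """\
Metadata-Version: 2.4
Name: waterfall
Version: 0.1.3
Summary: Scalable Framework for Robust Text Watermarking and Provenance for LLMs
Project-URL: Homepage, https://github.com/aoi3142/Waterfall
Project-URL: Issues, https://github.com/aoi3142/Waterfall/issues
Author-email: Xinyuan Niu <aperture@outlook.sg>
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Requires-Dist: accelerate>=0.29.0
Requires-Dist: numpy>=2.0.0
Requires-Dist: scipy>=1.13.0
Requires-Dist: sentence-transformers>=3.0.0
Requires-Dist: torch>=2.3.0
Requires-Dist: transformers>=4.43.1
Description-Content-Type: text/markdown
"""


def read_metadata(text):
    """Hypothetical helper: parse METADATA headers into a plain dict."""
    msg = Parser().parsestr(text)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        "requires_python": msg["Requires-Python"],
        # Repeated fields are collected with get_all().
        "classifiers": msg.get_all("Classifier", []),
        "requires_dist": msg.get_all("Requires-Dist", []),
    }


info = read_metadata(METADATA_0_1_3)
print(info["version"], info["requires_dist"])
```

Running the same helper over both versions' METADATA and comparing the resulting dicts would separate real value changes (the version bump, the added classifier) from pure reordering.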
@@ -53,7 +52,7 @@ Protecting intellectual property (IP) of text such as articles and code is incre
 
 5. A token is sampled from the perturbed logits $\check{L}$ and is appended to the watermarked text.
 
-6. Append the generated token to the prompt and continue autoregressive generation (steps 1-5) until the eos token.
+6. Append the generated token to the prompt and continue autoregressive generation (steps 1-5) until the `eos` token.
 
 # Verification of un/watermarked text
 
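Steps 5 and 6 in the hunk above describe an ordinary autoregressive loop: sample from the perturbed logits, append the token, and repeat until `eos`. The sketch below illustrates only that control flow with a toy model in plain Python. Everything in it is an assumption made for illustration: `toy_logits` stands in for the LLM, `perturb` is a placeholder and not the Waterfall perturbation (which is defined in steps 1-4, not shown in this diff), and the `EOS` id and vocabulary size are arbitrary.

```python
import math
import random

EOS = 0  # hypothetical token id standing in for the eos token


def toy_logits(tokens, vocab_size=5):
    """Stand-in for an LLM: eos becomes more likely as the text grows."""
    return [float(len(tokens)) if t == EOS else 1.0 for t in range(vocab_size)]


def perturb(logits, kappa=2.0):
    """Placeholder perturbation scaled by kappa; NOT the Waterfall watermark."""
    return [l + kappa * math.cos(i) for i, l in enumerate(logits)]


def sample(logits, rng):
    """Sample a token id from softmax(logits)."""
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(prompt, max_new_tokens=50, seed=0):
    rng = random.Random(seed)
    tokens = list(prompt)
    generated = []
    for _ in range(max_new_tokens):
        tok = sample(perturb(toy_logits(tokens)), rng)
        generated.append(tok)  # step 5: append to the watermarked text
        tokens.append(tok)     # step 6: append to the prompt and continue
        if tok == EOS:         # stop once the eos token is produced
            break
    return generated


print(generate([1, 2, 3]))
```

The `max_new_tokens` cap is a safety bound added for the sketch so the loop always terminates even if `eos` is never sampled.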
@@ -77,6 +76,13 @@ Protecting intellectual property (IP) of text such as articles and code is incre
 
 # Using our code
 
+Install our package using `pip`
+```sh
+pip install waterfall
+```
+
+## Alternative installation from source
+
 [Optional]
 If using `conda` (or other pkg managers), it is highly advisable to create a new environment
 
@@ -98,6 +104,8 @@ Use the command `waterfall_demo` to watermark a piece of text, and then verify t
 waterfall_demo
 ```
 
+\* Ensure that your device (`cuda`/`cpu`/`mps`) has enough memory to load the model and perform inference (~18GB+ for default Llama 3.1 8B model)
+
 Additional arguments
 ```sh
 waterfall_demo \
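The memory note added in the hunk above (~18GB+ for the default Llama 3.1 8B model) can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes roughly 8 billion parameters stored as 16-bit values; both figures are assumptions for illustration, not numbers stated in the diff, and `weights_gb` is a hypothetical helper.

```python
def weights_gb(n_params, bytes_per_param):
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumed: about 8e9 parameters at 2 bytes each (fp16/bf16).
weights_only = weights_gb(8e9, 2)
print(f"weights only: {weights_only:.0f} GB")
```

Under those assumptions the weights alone take about 16 GB, which leaves the remainder of the quoted figure for activations, the generation cache, and framework overhead.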
@@ -107,9 +115,11 @@ waterfall_demo \
 --kappa 2 `# Watermark strength` \
 --model meta-llama/Llama-3.1-8B-Instruct `# Paraphrasing LLM` \
 --watermark_fn fourier `# fourier/square watermark` \
---device cuda `# Use cuda/cpu`
+--device cuda `# Use cuda/cpu/mps`
 ```
 
+\* By default, `--device` automatically selects among `cuda`/`cpu`/`mps` if not set
+
 ## Using our code to watermark and verify
 
 To watermark texts
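The new README note in the hunk above says `--device` is chosen automatically among `cuda`/`cpu`/`mps` when not set. The package's actual selection logic is not part of this diff, so the sketch below is only one plausible way to do it: the function names are hypothetical and the preference order (`cuda`, then `mps`, then `cpu`) is an assumption. The two `torch` availability checks are standard PyTorch calls.

```python
def pick_device(requested=None, cuda_available=False, mps_available=False):
    """Hypothetical selection rule: honour an explicit choice, else cuda > mps > cpu."""
    if requested is not None:
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device(requested=None):
    """Probe PyTorch for available backends; fall back to cpu if torch is missing."""
    try:
        import torch
    except ImportError:
        return pick_device(requested)
    return pick_device(
        requested,
        cuda_available=torch.cuda.is_available(),
        mps_available=torch.backends.mps.is_available(),
    )


print(detect_device())
```

Keeping the decision rule in `pick_device` separate from the hardware probing makes the rule easy to check without a GPU.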
@@ -134,7 +144,7 @@ test_texts = ["...", "..."] # Suspected texts to verify
 watermark_strength = verify_texts(test_texts, id)[0] # np array of floats
 ```
 
-
+## Code structure
 
 - `watermark.py` : Sample watermarking script used by with `watermark_demo` command, includes beam search and other optimizations
 - `WatermarkerBase.py` : Underlying generation and verification code provided by `Watermarker` class
@@ -5,9 +5,8 @@ waterfall/WatermarkingFnSquare.py,sha256=2PAO05DdKT02npo7GDf_82D520nP7kGAWK6H4E4
 waterfall/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 waterfall/permute.py,sha256=RwxOHFhx_VSOhhFwy5s79YgwTUBkfW2-LCCXYR3VT2o,2582
 waterfall/watermark.py,sha256=whiNhPwWNNIZwXMH6r7QzEE3A7Niq2Ro9elA1iSRoxI,11952
-waterfall-0.1.
-waterfall-0.1.
-waterfall-0.1.
-waterfall-0.1.
-waterfall-0.1.
-waterfall-0.1.1.dist-info/RECORD,,
+waterfall-0.1.3.dist-info/METADATA,sha256=hnX9Vq2zjuq_u9VxJcP9OQ5U2RHUtTGnTQu6iwUJhXM,8715
+waterfall-0.1.3.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
+waterfall-0.1.3.dist-info/entry_points.txt,sha256=XXnUzuWXu2nc9j4WAll9tq6HyodN_8WJLjeG0O4Y2Gw,60
+waterfall-0.1.3.dist-info/licenses/LICENSE,sha256=zAtaO-k41Q-Q4Etl4bzuh7pgNJsPH-dYfzvznRa0OvM,11341
+waterfall-0.1.3.dist-info/RECORD,,
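Each RECORD row in the hunk above has the form `path,sha256=<digest>,<size>`, where the digest is the file's SHA-256 encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below recomputes such a row from raw bytes; `record_hash` and `record_line` are hypothetical helper names. The empty `waterfall/__init__.py` (size 0) makes a convenient check because its bytes are known.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """SHA-256 digest in RECORD style: URL-safe base64 without '=' padding."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def record_line(path: str, data: bytes) -> str:
    """Build a 'path,hash,size' row for one file."""
    return f"{path},{record_hash(data)},{len(data)}"


# The empty __init__.py reproduces the row listed in the RECORD hunk.
print(record_line("waterfall/__init__.py", b""))
# → waterfall/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
```

To verify an unpacked wheel, the same function could be applied to each file's bytes and the result compared against the corresponding RECORD row; the `RECORD` file itself is listed without a hash or size.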
@@ -1 +0,0 @@
-waterfall
File without changes
File without changes