waterfall 0.1.0__py3-none-any.whl → 0.1.2__py3-none-any.whl

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: waterfall
- Version: 0.1.0
+ Version: 0.1.2
  Summary: Scalable Framework for Robust Text Watermarking and Provenance for LLMs
  Author-email: Xinyuan Niu <aperture@outlook.sg>
  License-Expression: Apache-2.0
@@ -26,7 +26,7 @@ Gregory Kang Ruey Lau*, Xinyuan Niu*, Hieu Dao, Jiangwei Chen, Chuan-Sheng Foo,

  ## TL;DR: Training-free framework for text watermarking that is scalable, robust to LLM attacks, and applicable to original text of multiple types

- ![Alt text](Images/Problem_formulation.jpg "")
+ ![Alt text](https://raw.githubusercontent.com/aoi3142/Waterfall/main/Images/Problem_formulation.jpg "")

  1. Watermark original text $T_o$ with watermark key $\mu$ → watermarked text $T_w$ with same semantic content

@@ -41,7 +41,7 @@ Protecting intellectual property (IP) of text such as articles and code is incre

  # Watermark process

- ![Alt text](Images/Watermarking_process.png "")
+ ![Alt text](https://raw.githubusercontent.com/aoi3142/Waterfall/main/Images/Watermarking_process.png "")

  1. Original text $T_o$ is fed into LLM paraphraser to produce initial logits $L$.

@@ -57,7 +57,7 @@ Protecting intellectual property (IP) of text such as articles and code is incre

  # Verification of un/watermarked text

- ![Alt text](Images/Illustration.gif "Text watermarked with a sine-watermark shows the watermark signal when verified with the correct key")
+ ![Alt text](https://raw.githubusercontent.com/aoi3142/Waterfall/main/Images/Illustration.gif "Text watermarked with a sine-watermark shows the watermark signal when verified with the correct key")

  1. For each token $\hat{w}$ in the watermarked text $T_w$ (original text is not required), use the unique ID $\mu$ and preceding $n-1$ tokens to permute the token index of $\hat{w}$ from $V_o$ space into $V_w$ space.

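Note on the three image hunks above: each one replaces a repository-relative path with an absolute `raw.githubusercontent.com` URL, presumably so the images still resolve when the README is rendered outside the repository (for example on the package index page). A minimal sketch of how such a rewrite could be automated is below. The helper name `absolutize_images` and the regular expression are hypothetical and not part of the package; the base URL is copied from the `+` lines.

```python
import re

# Base URL taken from the "+" lines in the hunks above.
BASE = "https://raw.githubusercontent.com/aoi3142/Waterfall/main/"

# Match a Markdown image opener "![alt](" followed by a target that is
# not already an absolute http(s) URL. The target ends at whitespace or ")".
IMG = re.compile(r'(!\[[^\]]*\]\()(?!https?://)([^)\s]+)')

def absolutize_images(markdown: str, base: str = BASE) -> str:
    """Prefix relative Markdown image targets with `base` (hypothetical helper)."""
    return IMG.sub(lambda m: m.group(1) + base + m.group(2), markdown)

before = '![Alt text](Images/Problem_formulation.jpg "")'
after = absolutize_images(before)
print(after)
```

Targets that are already absolute are left unchanged, so running the helper twice gives the same result as running it once.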
@@ -77,6 +77,13 @@ Protecting intellectual property (IP) of text such as articles and code is incre

  # Using our code

+ Install our package from PyPI
+ ```sh
+ pip install waterfall
+ ```
+
+ ## Alternative installation from source
+
  [Optional]
  If using `conda` (or other pkg managers), it is highly advisable to create a new environment

@@ -85,7 +92,7 @@ conda create -n waterfall python=3.11 --yes `# Compatible with python version
  conda activate waterfall
  ```

- Clone and install our package
+ Clone and install our package from source
  ```sh
  git clone https://github.com/aoi3142/Waterfall.git
  pip install -e Waterfall `# Install in 'editable' mode with '-e', can be omitted`
@@ -98,6 +105,8 @@ Use the command `waterfall_demo` to watermark a piece of text, and then verify t
  waterfall_demo
  ```

+ \* Ensure that your device (cuda/cpu/mps) has enough memory to load the model (~16GB for default Llama 3.1 8B model)
+
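Note on the memory figure added above: the ~16GB estimate is consistent with simple arithmetic on the weights alone, assuming roughly 8 billion parameters stored at 2 bytes each (a 16-bit dtype). Both the rounded parameter count and the dtype are assumptions for illustration, and the helper `weight_memory_gb` is hypothetical. Activations and the KV cache need additional memory on top of this.

```python
# Bytes needed to store one parameter in common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

def weight_memory_gb(n_params: int, dtype: str = "bfloat16") -> float:
    """Rough size of the model weights in GB (1 GB = 1e9 bytes).

    Hypothetical helper: covers weights only, not activations or KV cache.
    """
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

# Assumed round figure of 8e9 parameters for an "8B" model.
estimate = weight_memory_gb(8_000_000_000, "bfloat16")
print(f"{estimate:.0f} GB")  # 16 GB
```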
  Additional arguments
  ```sh
  waterfall_demo \
@@ -107,9 +116,11 @@ waterfall_demo \
  --kappa 2 `# Watermark strength` \
  --model meta-llama/Llama-3.1-8B-Instruct `# Paraphrasing LLM` \
  --watermark_fn fourier `# fourier/square watermark` \
- --device cuda `# Use cuda/cpu`
+ --device cuda `# Use cuda/cpu/mps`
  ```

+ \* If `--device` is not set, it is selected automatically from `cuda`/`cpu`/`mps`
+
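Note on the automatic device selection mentioned above: the diff does not show how the package chooses a device, so the following is only a sketch of one common approach, not the package's actual logic. The priority order (CUDA, then MPS, then CPU) and the helper names `pick_device` and `detect_device` are assumptions.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick a device name from availability flags (assumed priority order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

def detect_device() -> str:
    """Query PyTorch for available backends; fall back to CPU without it."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may lack the MPS backend module entirely.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)

print(detect_device())
```

Keeping the priority logic in a pure function makes it easy to check without any accelerator present.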
  ## Using our code to watermark and verify

  To watermark texts
@@ -134,7 +145,7 @@ test_texts = ["...", "..."] # Suspected texts to verify
  watermark_strength = verify_texts(test_texts, id)[0] # np array of floats
  ```

- # Code structure
+ ## Code structure

  - `watermark.py` : Sample watermarking script used by the `watermark_demo` command, includes beam search and other optimizations
  - `WatermarkerBase.py` : Underlying generation and verification code provided by `Watermarker` class
@@ -5,9 +5,9 @@ waterfall/WatermarkingFnSquare.py,sha256=2PAO05DdKT02npo7GDf_82D520nP7kGAWK6H4E4
  waterfall/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  waterfall/permute.py,sha256=RwxOHFhx_VSOhhFwy5s79YgwTUBkfW2-LCCXYR3VT2o,2582
  waterfall/watermark.py,sha256=whiNhPwWNNIZwXMH6r7QzEE3A7Niq2Ro9elA1iSRoxI,11952
- waterfall-0.1.0.dist-info/licenses/LICENSE,sha256=zAtaO-k41Q-Q4Etl4bzuh7pgNJsPH-dYfzvznRa0OvM,11341
- waterfall-0.1.0.dist-info/METADATA,sha256=ONpoos0Pyx43h4ntb9Fs4l78F3GZNxq__Ox05GMFD8A,8221
- waterfall-0.1.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- waterfall-0.1.0.dist-info/entry_points.txt,sha256=XXnUzuWXu2nc9j4WAll9tq6HyodN_8WJLjeG0O4Y2Gw,60
- waterfall-0.1.0.dist-info/top_level.txt,sha256=5rTgijeT9V5GRCwIDZmhjeZ4khgH1lmfhS9ZmdUUCKQ,10
- waterfall-0.1.0.dist-info/RECORD,,
+ waterfall-0.1.2.dist-info/licenses/LICENSE,sha256=zAtaO-k41Q-Q4Etl4bzuh7pgNJsPH-dYfzvznRa0OvM,11341
+ waterfall-0.1.2.dist-info/METADATA,sha256=zDciBMgLwmhhf8voqFzLeCOruNQBdqtohMUYKDMzmdQ,8715
+ waterfall-0.1.2.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ waterfall-0.1.2.dist-info/entry_points.txt,sha256=XXnUzuWXu2nc9j4WAll9tq6HyodN_8WJLjeG0O4Y2Gw,60
+ waterfall-0.1.2.dist-info/top_level.txt,sha256=5rTgijeT9V5GRCwIDZmhjeZ4khgH1lmfhS9ZmdUUCKQ,10
+ waterfall-0.1.2.dist-info/RECORD,,
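
Note on the `RECORD` hunk above: each row has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the SHA-256 hash with trailing `=` padding removed, and the size is in bytes. Apart from the directory rename, only the `METADATA` row changes hash and size (8221 to 8715 bytes), which matches the README additions shown earlier. The sketch below recomputes a row from file contents. The helper name `record_entry` is hypothetical, and the zero-byte `waterfall/__init__.py` row from the diff serves as a check.

```python
import base64
import hashlib

def record_entry(path: str, data: bytes) -> str:
    """Build a wheel RECORD row for `path` with contents `data` (hypothetical helper)."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 with the trailing "=" padding stripped.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"

# The empty __init__.py listed in the diff has size 0.
row = record_entry("waterfall/__init__.py", b"")
print(row)
```

Running the same helper over the other files extracted from each wheel would show which contents actually differ between the two versions.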