reait 0.0.19.tar.gz → 0.0.20.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {reait-0.0.19 → reait-0.0.20}/PKG-INFO +43 -21
- {reait-0.0.19 → reait-0.0.20}/README.md +42 -20
- {reait-0.0.19 → reait-0.0.20}/pyproject.toml +1 -1
- {reait-0.0.19 → reait-0.0.20}/setup.py +4 -2
- {reait-0.0.19 → reait-0.0.20}/src/reait/__init__.py +3 -0
- reait-0.0.20/src/reait/api.py +604 -0
- reait-0.0.20/src/reait/main.py +514 -0
- {reait-0.0.19 → reait-0.0.20}/src/reait.egg-info/PKG-INFO +43 -21
- reait-0.0.19/src/reait/api.py +0 -349
- reait-0.0.19/src/reait/main.py +0 -398
- {reait-0.0.19 → reait-0.0.20}/LICENSE +0 -0
- {reait-0.0.19 → reait-0.0.20}/setup.cfg +0 -0
- {reait-0.0.19 → reait-0.0.20}/src/reait.egg-info/SOURCES.txt +0 -0
- {reait-0.0.19 → reait-0.0.20}/src/reait.egg-info/dependency_links.txt +0 -0
- {reait-0.0.19 → reait-0.0.20}/src/reait.egg-info/entry_points.txt +0 -0
- {reait-0.0.19 → reait-0.0.20}/src/reait.egg-info/requires.txt +0 -0
- {reait-0.0.19 → reait-0.0.20}/src/reait.egg-info/top_level.txt +0 -0
- {reait-0.0.19 → reait-0.0.20}/tests/test_reait.py +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: reait
-Version: 0.0.
+Version: 0.0.20
 Home-page: https://github.com/RevEng-AI/reait
 Author: James Patrick-Evans
 Author-email: James Patrick-Evans <james@reveng.ai>
@@ -714,19 +714,23 @@ NB: We are in Alpha. We support GNU/Linux ELF and Windows PE executables for x86

 ## Installation

-Install the latest stable version using
+Install the latest stable version using `pip3`.

-
+```shell
+pip3 install reait
+```

 ### Latest development version

-
+```shell
+pip3 install -e .
+```

 or

-```
+```shell
 python3 -m build .
-
+pip3 install -U dist/reait-*.whl
 ```

 ## Using reait
@@ -734,7 +738,9 @@ pip install -U dist/reait-*.whl
 ### Analysing binaries
 To submit a binary for analysis, run `reait` with the `-a` flag:

-
+```shell
+reait -b /usr/bin/true -a
+```

 This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Depending on the size of the binary, it may take several hours. You may check an analysis jobs progress with the `-l` flag e.g. `reait -b /usr/bin/true -l`.

@@ -742,30 +748,42 @@ This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Dep
 Symbol embeddings are numerical vector representations of each component that capture their semantic understanding. Similar functions should be similar to each other in our embedded vector space. They can be thought of as *advanced* AI-based IDA FLIRT signatures or Radare2 Zignatures.
 Once an analysis is complete, you may access RevEng.AI's BinNet embeddings for all symbols extracted with the `-x` flag.

-
+```shell
+reait -b /usr/bin/true -x > embeddings.json
+```

-#### Extract embedding for symbol at vaddr
-
+#### Extract embedding for symbol at vaddr 0x19F0
+```shell
+reait -b /usr/bin/true -x | jq ".[] | select(.vaddr==$((0x19F0))).embedding" > embedding.json
+```


 ### Search for similar symbols using an embedding
 To query our database of similar symbols based on an embedding, use `-n` to search using Approximate Nearest Neighbours. The `--nns` allows you to specify the number of results returned. A list of symbols with their names, distance (similarity), RevEng.AI collection set, source code filename, source code line number, and file creation timestamp is returned.

-
+```shell
+reait --embedding embedding.json -n
+```

-The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at
+The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at _0x33E6_ in md5sum.clang.og.dynamic. You may need to pass `--image-base` to ensure virtual addresses are mapped correctly.

-
+```shell
+reait -b md5sum.gcc.og.dynamic -n --start-vaddr 0x33E6 --found-in md5sum.gcc.o2.dynamic --nns 10 --base-address 0x100000
+```

 Search NN by symbol name.
-
+```shell
+reait -b md5sum.gcc.og.dynamic -n --symbol md5_buffer --found-in md5sum.gcc.o2.dynamic --nns 5
+```

 NB: A smaller distance indicates a higher degree of similarity.

 #### Specific Search
 To search for the most similar symbols found in a specific binary, use the `--found-in` option with a path to the executable to search from.

-
+```shell
+reait -n --embedding /tmp/sha256_init.json --found-in ~/malware.exe --nns 5
+```

 This downloads embeddings from `malware.exe` and computes the cosine similarity between all symbols and `sha256_init.json`. The returned results lists the most similar symbol locations by cosine similarity score (1.0 most similar, -1.0 dissimilar).

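The cosine similarity score described in the last context line of this hunk can be illustrated with a short Python sketch. It is not taken from `reait`; the function names and the sample vectors are hypothetical, and it only assumes that an embedding is a flat list of floats.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query, candidates, top_n=5):
    """Rank hypothetical {symbol_name: embedding} pairs by similarity to query."""
    scored = [(name, cosine_similarity(query, emb)) for name, emb in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return scored[:top_n]


# Made-up two-dimensional embeddings, purely for illustration.
candidates = {"same": [2.0, 0.0], "orthogonal": [0.0, 3.0], "opposite": [-1.0, 0.0]}
ranking = most_similar([1.0, 0.0], candidates, top_n=3)
print(ranking)
```

Vectors pointing the same way score close to 1.0 regardless of their length, which is why the score is usable across embeddings of different magnitude.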
@@ -775,7 +793,9 @@ The `--from-file` option may also be used to limit the search to a custom file c
 #### Limited Search
 To search for most similar symbols from a set of RevEng.AI collections, use the `--collections` options with a RegEx to match collection names. For example:

-
+```shell
+reait -n --embedding my_func.json --collections "(libc.*|lib.*crypt.*)"
+```

 RevEng.AI collections are sets of pre-analysed executable objects. To create custom collection sets e.g., malware collections, please create a RevEng.AI account.

@@ -786,14 +806,16 @@ Find common components between binaries, RevEng.AI collections, or global search

 Example usage:

-```
+```shell
 reait -M -b 05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe --from-file 755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.embeddings.json --confidence high
 ```

 ### RevEng.AI embedding models
 To use specific RevEng.AI AI models, or for training custom models, use `-m` to specify the model. The default option is to use the latest development model. Available models are `binnet-0.1` and `dexter`.

-
+```shell
+reait -b /usr/bin/true -m dexter -a
+```

 ### Software Composition Analysis
 To identify known open source software components embedded inside a binary, use the `-C` flag.
@@ -807,7 +829,7 @@ To generate an AI functional description of an entire binary file, use the `-s`

 REAI signatures can be used to compute the binary similarity between entire executables with the `-S` flag. For example:

-```
+```shell
 reait -b d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe -S -t 00062cb01088cea245cd5f3eb03f65a0e6b11a8126ce00034d87935a451cf99c.exe,438d64bb831555caadaa92a32c9d62e255001bc8d524721c885f37d750ec3476.exe,755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.exe,05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe
 Computing Binary Similarity... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:01
 Binary Similarity to RedlineInfoStealer/d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe
@@ -826,7 +848,7 @@ Computing Binary Similarity... ━━━━━━━━━━━━━━━━

 To perform binary ANN search, pass in `-n` and `-s` flag at the same time. For example:

-```
+```shell
 reait -b /usr/bin/true -s -n
 Found /usr/bin/true:elf-x86_64
 [
@@ -858,7 +880,7 @@ Found /usr/bin/true:elf-x86_64

 `reait` reads the config file stored at `~/.reait.toml`. An example config file looks like:

-```
+```shell
 apikey = "l1br3"
 host = "https://api.reveng.ai"
 model = "binnet-0.1"
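Although the fence in this hunk is now tagged `shell`, the example config is TOML. A minimal sketch of parsing it in Python follows, assuming the standard-library `tomllib` (Python 3.11+) or the `tomli` package that `setup.py` lists as a dependency. The `parse_config` helper and its required-key check are illustrative assumptions, not `reait`'s own loader.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API

# The three keys shown in the README's example config.
EXAMPLE_CONFIG = """
apikey = "l1br3"
host = "https://api.reveng.ai"
model = "binnet-0.1"
"""


def parse_config(text):
    """Parse TOML text and check for the keys shown in the example (assumed required)."""
    config = tomllib.loads(text)
    missing = [key for key in ("apikey", "host", "model") if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {', '.join(missing)}")
    return config


config = parse_config(EXAMPLE_CONFIG)
print(config["host"], config["model"])
```

To read an actual file instead of a string, open it in binary mode, since `tomllib.load` expects a binary file object.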
@@ -10,19 +10,23 @@ NB: We are in Alpha. We support GNU/Linux ELF and Windows PE executables for x86

 ## Installation

-Install the latest stable version using
+Install the latest stable version using `pip3`.

-
+```shell
+pip3 install reait
+```

 ### Latest development version

-
+```shell
+pip3 install -e .
+```

 or

-```
+```shell
 python3 -m build .
-
+pip3 install -U dist/reait-*.whl
 ```

 ## Using reait
@@ -30,7 +34,9 @@ pip install -U dist/reait-*.whl
 ### Analysing binaries
 To submit a binary for analysis, run `reait` with the `-a` flag:

-
+```shell
+reait -b /usr/bin/true -a
+```

 This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Depending on the size of the binary, it may take several hours. You may check an analysis jobs progress with the `-l` flag e.g. `reait -b /usr/bin/true -l`.

@@ -38,30 +44,42 @@ This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Dep
 Symbol embeddings are numerical vector representations of each component that capture their semantic understanding. Similar functions should be similar to each other in our embedded vector space. They can be thought of as *advanced* AI-based IDA FLIRT signatures or Radare2 Zignatures.
 Once an analysis is complete, you may access RevEng.AI's BinNet embeddings for all symbols extracted with the `-x` flag.

-
+```shell
+reait -b /usr/bin/true -x > embeddings.json
+```

-#### Extract embedding for symbol at vaddr
-
+#### Extract embedding for symbol at vaddr 0x19F0
+```shell
+reait -b /usr/bin/true -x | jq ".[] | select(.vaddr==$((0x19F0))).embedding" > embedding.json
+```


 ### Search for similar symbols using an embedding
 To query our database of similar symbols based on an embedding, use `-n` to search using Approximate Nearest Neighbours. The `--nns` allows you to specify the number of results returned. A list of symbols with their names, distance (similarity), RevEng.AI collection set, source code filename, source code line number, and file creation timestamp is returned.

-
+```shell
+reait --embedding embedding.json -n
+```

-The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at
+The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at _0x33E6_ in md5sum.clang.og.dynamic. You may need to pass `--image-base` to ensure virtual addresses are mapped correctly.

-
+```shell
+reait -b md5sum.gcc.og.dynamic -n --start-vaddr 0x33E6 --found-in md5sum.gcc.o2.dynamic --nns 10 --base-address 0x100000
+```

 Search NN by symbol name.
-
+```shell
+reait -b md5sum.gcc.og.dynamic -n --symbol md5_buffer --found-in md5sum.gcc.o2.dynamic --nns 5
+```

 NB: A smaller distance indicates a higher degree of similarity.

 #### Specific Search
 To search for the most similar symbols found in a specific binary, use the `--found-in` option with a path to the executable to search from.

-
+```shell
+reait -n --embedding /tmp/sha256_init.json --found-in ~/malware.exe --nns 5
+```

 This downloads embeddings from `malware.exe` and computes the cosine similarity between all symbols and `sha256_init.json`. The returned results lists the most similar symbol locations by cosine similarity score (1.0 most similar, -1.0 dissimilar).

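The `jq` filter added in this hunk relies on the shell expanding `$((0x19F0))` to the decimal value 6640 before `jq` runs. A rough Python equivalent is sketched below. It assumes, as the filter implies, that the output of `-x` is a JSON array of objects with `vaddr` and `embedding` keys; the helper name and the sample data are hypothetical.

```python
import json


def embedding_at(symbols, vaddr):
    """Return the embedding of the first symbol whose vaddr matches, else None."""
    for symbol in symbols:
        if symbol.get("vaddr") == vaddr:
            return symbol.get("embedding")
    return None


# Hypothetical sample shaped like the array the jq filter iterates over.
sample = json.loads(
    '[{"vaddr": 4096, "embedding": [0.3, 0.4]},'
    ' {"vaddr": 6640, "embedding": [0.1, 0.2]}]'
)

# Python integer literals accept hex directly, so no conversion step is needed.
print(embedding_at(sample, 0x19F0))
```

Unlike the `jq` filter, which emits every matching element, this sketch stops at the first match and returns `None` when nothing matches.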
@@ -71,7 +89,9 @@ The `--from-file` option may also be used to limit the search to a custom file c
 #### Limited Search
 To search for most similar symbols from a set of RevEng.AI collections, use the `--collections` options with a RegEx to match collection names. For example:

-
+```shell
+reait -n --embedding my_func.json --collections "(libc.*|lib.*crypt.*)"
+```

 RevEng.AI collections are sets of pre-analysed executable objects. To create custom collection sets e.g., malware collections, please create a RevEng.AI account.

@@ -82,14 +102,16 @@ Find common components between binaries, RevEng.AI collections, or global search

 Example usage:

-```
+```shell
 reait -M -b 05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe --from-file 755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.embeddings.json --confidence high
 ```

 ### RevEng.AI embedding models
 To use specific RevEng.AI AI models, or for training custom models, use `-m` to specify the model. The default option is to use the latest development model. Available models are `binnet-0.1` and `dexter`.

-
+```shell
+reait -b /usr/bin/true -m dexter -a
+```

 ### Software Composition Analysis
 To identify known open source software components embedded inside a binary, use the `-C` flag.
@@ -103,7 +125,7 @@ To generate an AI functional description of an entire binary file, use the `-s`

 REAI signatures can be used to compute the binary similarity between entire executables with the `-S` flag. For example:

-```
+```shell
 reait -b d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe -S -t 00062cb01088cea245cd5f3eb03f65a0e6b11a8126ce00034d87935a451cf99c.exe,438d64bb831555caadaa92a32c9d62e255001bc8d524721c885f37d750ec3476.exe,755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.exe,05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe
 Computing Binary Similarity... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:01
 Binary Similarity to RedlineInfoStealer/d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe
@@ -122,7 +144,7 @@ Computing Binary Similarity... ━━━━━━━━━━━━━━━━

 To perform binary ANN search, pass in `-n` and `-s` flag at the same time. For example:

-```
+```shell
 reait -b /usr/bin/true -s -n
 Found /usr/bin/true:elf-x86_64
 [
@@ -154,7 +176,7 @@ Found /usr/bin/true:elf-x86_64

 `reait` reads the config file stored at `~/.reait.toml`. An example config file looks like:

-```
+```shell
 apikey = "l1br3"
 host = "https://api.reveng.ai"
 model = "binnet-0.1"
@@ -1,11 +1,14 @@
+# -*- coding: utf-8 -*-
 import setuptools

+__version__ = "0.0.20"
+
 with open("README.md", "r") as f:
     long_description = f.read()

 setuptools.setup(
     name="reait",
-    version=
+    version=__version__,
     long_description=long_description,
     long_description_content_type="text/markdown",
     url="https://github.com/RevEng-AI/reait",
@@ -21,4 +24,3 @@ setuptools.setup(
         'tqdm', 'argparse', 'requests', 'rich', 'tomli', 'pandas', 'numpy', "scipy", "scikit-learn"
     ],
 )
-