reait 0.0.18__tar.gz → 0.0.20__tar.gz

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: reait
3
- Version: 0.0.18
3
+ Version: 0.0.20
4
4
  Home-page: https://github.com/RevEng-AI/reait
5
5
  Author: James Patrick-Evans
6
6
  Author-email: James Patrick-Evans <james@reveng.ai>
@@ -704,6 +704,8 @@ Requires-Dist: scikit-learn
704
704
 
705
705
  # reait
706
706
 
707
+ [![Python package](https://github.com/RevEngAI/reait/actions/workflows/python-package.yml/badge.svg)](https://github.com/RevEngAI/reait/actions/workflows/python-package.yml)
708
+
707
709
  ## <ins>R</ins>ev<ins>E</ins>ng.<ins>AI</ins> <ins>T</ins>oolkit
708
710
 
709
711
  Analyse compiled executable binaries using the RevEng.AI API. This tool allows you to search for similar components across different compiled executable programs, identify known vulnerabilities in stripped executables, and generate "YARA++" **REAI** signatures for entire binary files. More details about the API can be found at [docs.reveng.ai](https://docs.reveng.ai).
@@ -712,19 +714,23 @@ NB: We are in Alpha. We support GNU/Linux ELF and Windows PE executables for x86
712
714
 
713
715
  ## Installation
714
716
 
715
- Install the latest stable version using pip.
717
+ Install the latest stable version using `pip3`.
716
718
 
717
- `pip install reait`
719
+ ```shell
720
+ pip3 install reait
721
+ ```
718
722
 
719
723
  ### Latest development version
720
724
 
721
- `pip install -e .`
725
+ ```shell
726
+ pip3 install -e .
727
+ ```
722
728
 
723
729
  or
724
730
 
725
- ```
731
+ ```shell
726
732
  python3 -m build .
727
- pip install -U dist/reait-*.whl
733
+ pip3 install -U dist/reait-*.whl
728
734
  ```
729
735
 
730
736
  ## Using reait
@@ -732,7 +738,9 @@ pip install -U dist/reait-*.whl
732
738
  ### Analysing binaries
733
739
  To submit a binary for analysis, run `reait` with the `-a` flag:
734
740
 
735
- `reait -b /usr/bin/true -a`
741
+ ```shell
742
+ reait -b /usr/bin/true -a
743
+ ```
736
744
 
737
745
  This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Depending on the size of the binary, it may take several hours. You may check an analysis jobs progress with the `-l` flag e.g. `reait -b /usr/bin/true -l`.
738
746
 
@@ -740,30 +748,42 @@ This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Dep
740
748
  Symbol embeddings are numerical vector representations of each component that capture their semantic understanding. Similar functions should be similar to each other in our embedded vector space. They can be thought of as *advanced* AI-based IDA FLIRT signatures or Radare2 Zignatures.
741
749
  Once an analysis is complete, you may access RevEng.AI's BinNet embeddings for all symbols extracted with the `-x` flag.
742
750
 
743
- `reait -b /usr/bin/true -x > embeddings.json`
751
+ ```shell
752
+ reait -b /usr/bin/true -x > embeddings.json
753
+ ```
744
754
 
745
- #### Extract embedding for symbol at vaddr 0x19f0
746
- `reait -b /usr/bin/true -x | jq ".[] | select(.vaddr==$((0x19f0))).embedding" > embedding.json`
755
+ #### Extract embedding for symbol at vaddr 0x19F0
756
+ ```shell
757
+ reait -b /usr/bin/true -x | jq ".[] | select(.vaddr==$((0x19F0))).embedding" > embedding.json
758
+ ```
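The jq filter above relies on the shell's `$((0x19F0))` arithmetic expansion to turn the hexadecimal address into the decimal integer that `.vaddr` is compared against. A rough Python equivalent is sketched below. It assumes `embeddings.json` holds a list of objects with `vaddr` and `embedding` keys, which is inferred from the jq expression rather than from documented output. `select_embedding` and `extract_to_file` are hypothetical helper names, and unlike jq, which emits every match, the helper returns only the first one.

```python
import json


def select_embedding(symbols, vaddr):
    """Return the embedding of the first symbol whose vaddr matches, else None."""
    for symbol in symbols:
        if symbol.get("vaddr") == vaddr:
            return symbol.get("embedding")
    return None


def extract_to_file(src_path, dst_path, vaddr):
    """Read the symbol list from src_path and write one embedding to dst_path."""
    with open(src_path, "r", encoding="utf-8") as f:
        symbols = json.load(f)
    embedding = select_embedding(symbols, vaddr)
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(embedding, f)
    return embedding


# Hypothetical usage, after `reait -b /usr/bin/true -x > embeddings.json`:
#   extract_to_file("embeddings.json", "embedding.json", int("0x19F0", 16))
# int("0x19F0", 16) is the same integer the shell computes for $((0x19F0)).
```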
747
759
 
748
760
 
749
761
  ### Search for similar symbols using an embedding
750
762
  To query our database of similar symbols based on an embedding, use `-n` to search using Approximate Nearest Neighbours. The `--nns` allows you to specify the number of results returned. A list of symbols with their names, distance (similarity), RevEng.AI collection set, source code filename, source code line number, and file creation timestamp is returned.
751
763
 
752
- `reait -e embedding.json -n`
764
+ ```shell
765
+ reait --embedding embedding.json -n
766
+ ```
753
767
 
754
- The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at 0x4037e0 in md5sum.clang.og.dynamic. You may need to pass `--image-base` to ensure virtual addresses are mapped correctly.
768
+ The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at _0x33E6_ in md5sum.clang.og.dynamic. You may need to pass `--image-base` to ensure virtual addresses are mapped correctly.
755
769
 
756
- `reait -b md5sum.gcc.og.dynamic -n --start-vaddr 0x33e6 --found-in md5sum.gcc.o2.dynamic --nns 10 --base-address 0x100000`
770
+ ```shell
771
+ reait -b md5sum.gcc.og.dynamic -n --start-vaddr 0x33E6 --found-in md5sum.gcc.o2.dynamic --nns 10 --base-address 0x100000
772
+ ```
757
773
 
758
774
  Search NN by symbol name.
759
- `reait -b md5sum.gcc.og.dynamic -n --symbol md5_buffer --found-in md5sum.gcc.o2.dynamic --nns 5`
775
+ ```shell
776
+ reait -b md5sum.gcc.og.dynamic -n --symbol md5_buffer --found-in md5sum.gcc.o2.dynamic --nns 5
777
+ ```
760
778
 
761
779
  NB: A smaller distance indicates a higher degree of similarity.
762
780
 
763
781
  #### Specific Search
764
782
  To search for the most similar symbols found in a specific binary, use the `--found-in` option with a path to the executable to search from.
765
783
 
766
- `reait -n --embedding /tmp/sha256_init.json --found-in ~/malware.exe --nns 5`
784
+ ```shell
785
+ reait -n --embedding /tmp/sha256_init.json --found-in ~/malware.exe --nns 5
786
+ ```
767
787
 
768
788
  This downloads embeddings from `malware.exe` and computes the cosine similarity between all symbols and `sha256_init.json`. The returned results lists the most similar symbol locations by cosine similarity score (1.0 most similar, -1.0 dissimilar).
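The score described above is the standard cosine similarity between two vectors. The sketch below illustrates the formula only and is not `reait`'s own implementation; `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors, in [-1.0, 1.0]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing in the same direction score close to 1.0, orthogonal vectors score 0.0, and vectors pointing in opposite directions score close to -1.0, whatever their magnitudes.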
769
789
 
@@ -773,7 +793,9 @@ The `--from-file` option may also be used to limit the search to a custom file c
773
793
  #### Limited Search
774
794
  To search for most similar symbols from a set of RevEng.AI collections, use the `--collections` options with a RegEx to match collection names. For example:
775
795
 
776
- `reait -n --embedding my_func.json --collections "(libc.*|lib.*crypt.*)"`
796
+ ```shell
797
+ reait -n --embedding my_func.json --collections "(libc.*|lib.*crypt.*)"
798
+ ```
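To preview which collection names a pattern such as `(libc.*|lib.*crypt.*)` would select, the expression can be tried locally first. How RevEng.AI applies the RegEx server-side is not stated in this diff, so the sketch assumes an unanchored `re.search`. The collection names are invented for illustration and `matching_collections` is a hypothetical helper.

```python
import re


def matching_collections(pattern, names):
    """Return the names in which the pattern matches anywhere (re.search)."""
    compiled = re.compile(pattern)
    return [name for name in names if compiled.search(name)]


# Invented collection names, for illustration only.
example_names = ["libc6", "libgcrypt20", "libcrypto", "openssl", "zlib"]
selected = matching_collections(r"(libc.*|lib.*crypt.*)", example_names)
```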
777
799
 
778
800
  RevEng.AI collections are sets of pre-analysed executable objects. To create custom collection sets e.g., malware collections, please create a RevEng.AI account.
779
801
 
@@ -784,14 +806,16 @@ Find common components between binaries, RevEng.AI collections, or global search
784
806
 
785
807
  Example usage:
786
808
 
787
- ```
809
+ ```shell
788
810
  reait -M -b 05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe --from-file 755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.embeddings.json --confidence high
789
811
  ```
790
812
 
791
813
  ### RevEng.AI embedding models
792
814
  To use specific RevEng.AI AI models, or for training custom models, use `-m` to specify the model. The default option is to use the latest development model. Available models are `binnet-0.1` and `dexter`.
793
815
 
794
- `reait -b /usr/bin/true -m dexter -a`
816
+ ```shell
817
+ reait -b /usr/bin/true -m dexter -a
818
+ ```
795
819
 
796
820
  ### Software Composition Analysis
797
821
  To identify known open source software components embedded inside a binary, use the `-C` flag.
@@ -805,7 +829,7 @@ To generate an AI functional description of an entire binary file, use the `-s`
805
829
 
806
830
  REAI signatures can be used to compute the binary similarity between entire executables with the `-S` flag. For example:
807
831
 
808
- ```
832
+ ```shell
809
833
  reait -b d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe -S -t 00062cb01088cea245cd5f3eb03f65a0e6b11a8126ce00034d87935a451cf99c.exe,438d64bb831555caadaa92a32c9d62e255001bc8d524721c885f37d750ec3476.exe,755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.exe,05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe
810
834
  Computing Binary Similarity... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:01
811
835
  Binary Similarity to RedlineInfoStealer/d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe
@@ -824,7 +848,7 @@ Computing Binary Similarity... ━━━━━━━━━━━━━━━━
824
848
 
825
849
  To perform binary ANN search, pass in `-n` and `-s` flag at the same time. For example:
826
850
 
827
- ```
851
+ ```shell
828
852
  reait -b /usr/bin/true -s -n
829
853
  Found /usr/bin/true:elf-x86_64
830
854
  [
@@ -856,7 +880,7 @@ Found /usr/bin/true:elf-x86_64
856
880
 
857
881
  `reait` reads the config file stored at `~/.reait.toml`. An example config file looks like:
858
882
 
859
- ```
883
+ ```shell
860
884
  apikey = "l1br3"
861
885
  host = "https://api.reveng.ai"
862
886
  model = "binnet-0.1"
@@ -1,5 +1,7 @@
1
1
  # reait
2
2
 
3
+ [![Python package](https://github.com/RevEngAI/reait/actions/workflows/python-package.yml/badge.svg)](https://github.com/RevEngAI/reait/actions/workflows/python-package.yml)
4
+
3
5
  ## <ins>R</ins>ev<ins>E</ins>ng.<ins>AI</ins> <ins>T</ins>oolkit
4
6
 
5
7
  Analyse compiled executable binaries using the RevEng.AI API. This tool allows you to search for similar components across different compiled executable programs, identify known vulnerabilities in stripped executables, and generate "YARA++" **REAI** signatures for entire binary files. More details about the API can be found at [docs.reveng.ai](https://docs.reveng.ai).
@@ -8,19 +10,23 @@ NB: We are in Alpha. We support GNU/Linux ELF and Windows PE executables for x86
8
10
 
9
11
  ## Installation
10
12
 
11
- Install the latest stable version using pip.
13
+ Install the latest stable version using `pip3`.
12
14
 
13
- `pip install reait`
15
+ ```shell
16
+ pip3 install reait
17
+ ```
14
18
 
15
19
  ### Latest development version
16
20
 
17
- `pip install -e .`
21
+ ```shell
22
+ pip3 install -e .
23
+ ```
18
24
 
19
25
  or
20
26
 
21
- ```
27
+ ```shell
22
28
  python3 -m build .
23
- pip install -U dist/reait-*.whl
29
+ pip3 install -U dist/reait-*.whl
24
30
  ```
25
31
 
26
32
  ## Using reait
@@ -28,7 +34,9 @@ pip install -U dist/reait-*.whl
28
34
  ### Analysing binaries
29
35
  To submit a binary for analysis, run `reait` with the `-a` flag:
30
36
 
31
- `reait -b /usr/bin/true -a`
37
+ ```shell
38
+ reait -b /usr/bin/true -a
39
+ ```
32
40
 
33
41
  This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Depending on the size of the binary, it may take several hours. You may check an analysis jobs progress with the `-l` flag e.g. `reait -b /usr/bin/true -l`.
34
42
 
@@ -36,30 +44,42 @@ This uploads the binary specified by `-b` to RevEng.AI servers for analysis. Dep
36
44
  Symbol embeddings are numerical vector representations of each component that capture their semantic understanding. Similar functions should be similar to each other in our embedded vector space. They can be thought of as *advanced* AI-based IDA FLIRT signatures or Radare2 Zignatures.
37
45
  Once an analysis is complete, you may access RevEng.AI's BinNet embeddings for all symbols extracted with the `-x` flag.
38
46
 
39
- `reait -b /usr/bin/true -x > embeddings.json`
47
+ ```shell
48
+ reait -b /usr/bin/true -x > embeddings.json
49
+ ```
40
50
 
41
- #### Extract embedding for symbol at vaddr 0x19f0
42
- `reait -b /usr/bin/true -x | jq ".[] | select(.vaddr==$((0x19f0))).embedding" > embedding.json`
51
+ #### Extract embedding for symbol at vaddr 0x19F0
52
+ ```shell
53
+ reait -b /usr/bin/true -x | jq ".[] | select(.vaddr==$((0x19F0))).embedding" > embedding.json
54
+ ```
43
55
 
44
56
 
45
57
  ### Search for similar symbols using an embedding
46
58
  To query our database of similar symbols based on an embedding, use `-n` to search using Approximate Nearest Neighbours. The `--nns` allows you to specify the number of results returned. A list of symbols with their names, distance (similarity), RevEng.AI collection set, source code filename, source code line number, and file creation timestamp is returned.
47
59
 
48
- `reait -e embedding.json -n`
60
+ ```shell
61
+ reait --embedding embedding.json -n
62
+ ```
49
63
 
50
- The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at 0x4037e0 in md5sum.clang.og.dynamic. You may need to pass `--image-base` to ensure virtual addresses are mapped correctly.
64
+ The following command searches for the top 10 most similar symbols found in md5sum.gcc.og.dynamic to the symbol starting at _0x33E6_ in md5sum.clang.og.dynamic. You may need to pass `--image-base` to ensure virtual addresses are mapped correctly.
51
65
 
52
- `reait -b md5sum.gcc.og.dynamic -n --start-vaddr 0x33e6 --found-in md5sum.gcc.o2.dynamic --nns 10 --base-address 0x100000`
66
+ ```shell
67
+ reait -b md5sum.gcc.og.dynamic -n --start-vaddr 0x33E6 --found-in md5sum.gcc.o2.dynamic --nns 10 --base-address 0x100000
68
+ ```
53
69
 
54
70
  Search NN by symbol name.
55
- `reait -b md5sum.gcc.og.dynamic -n --symbol md5_buffer --found-in md5sum.gcc.o2.dynamic --nns 5`
71
+ ```shell
72
+ reait -b md5sum.gcc.og.dynamic -n --symbol md5_buffer --found-in md5sum.gcc.o2.dynamic --nns 5
73
+ ```
56
74
 
57
75
  NB: A smaller distance indicates a higher degree of similarity.
58
76
 
59
77
  #### Specific Search
60
78
  To search for the most similar symbols found in a specific binary, use the `--found-in` option with a path to the executable to search from.
61
79
 
62
- `reait -n --embedding /tmp/sha256_init.json --found-in ~/malware.exe --nns 5`
80
+ ```shell
81
+ reait -n --embedding /tmp/sha256_init.json --found-in ~/malware.exe --nns 5
82
+ ```
63
83
 
64
84
  This downloads embeddings from `malware.exe` and computes the cosine similarity between all symbols and `sha256_init.json`. The returned results lists the most similar symbol locations by cosine similarity score (1.0 most similar, -1.0 dissimilar).
65
85
 
@@ -69,7 +89,9 @@ The `--from-file` option may also be used to limit the search to a custom file c
69
89
  #### Limited Search
70
90
  To search for most similar symbols from a set of RevEng.AI collections, use the `--collections` options with a RegEx to match collection names. For example:
71
91
 
72
- `reait -n --embedding my_func.json --collections "(libc.*|lib.*crypt.*)"`
92
+ ```shell
93
+ reait -n --embedding my_func.json --collections "(libc.*|lib.*crypt.*)"
94
+ ```
73
95
 
74
96
  RevEng.AI collections are sets of pre-analysed executable objects. To create custom collection sets e.g., malware collections, please create a RevEng.AI account.
75
97
 
@@ -80,14 +102,16 @@ Find common components between binaries, RevEng.AI collections, or global search
80
102
 
81
103
  Example usage:
82
104
 
83
- ```
105
+ ```shell
84
106
  reait -M -b 05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe --from-file 755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.embeddings.json --confidence high
85
107
  ```
86
108
 
87
109
  ### RevEng.AI embedding models
88
110
  To use specific RevEng.AI AI models, or for training custom models, use `-m` to specify the model. The default option is to use the latest development model. Available models are `binnet-0.1` and `dexter`.
89
111
 
90
- `reait -b /usr/bin/true -m dexter -a`
112
+ ```shell
113
+ reait -b /usr/bin/true -m dexter -a
114
+ ```
91
115
 
92
116
  ### Software Composition Analysis
93
117
  To identify known open source software components embedded inside a binary, use the `-C` flag.
@@ -101,7 +125,7 @@ To generate an AI functional description of an entire binary file, use the `-s`
101
125
 
102
126
  REAI signatures can be used to compute the binary similarity between entire executables with the `-S` flag. For example:
103
127
 
104
- ```
128
+ ```shell
105
129
  reait -b d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe -S -t 00062cb01088cea245cd5f3eb03f65a0e6b11a8126ce00034d87935a451cf99c.exe,438d64bb831555caadaa92a32c9d62e255001bc8d524721c885f37d750ec3476.exe,755a4b2ec15da6bb01248b2dfbad206c340ba937eae9c35f04f6cedfe5e99d63.exe,05ff897f430fec0ac17f14c89181c76961993506e5875f2987e9ead13bec58c2.exe
106
130
  Computing Binary Similarity... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:01
107
131
  Binary Similarity to RedlineInfoStealer/d24ccf73aabca4192d33a07b4a238c8d40ac97a550c2e65b8074f03455a981ca.exe
@@ -120,7 +144,7 @@ Computing Binary Similarity... ━━━━━━━━━━━━━━━━
120
144
 
121
145
  To perform binary ANN search, pass in `-n` and `-s` flag at the same time. For example:
122
146
 
123
- ```
147
+ ```shell
124
148
  reait -b /usr/bin/true -s -n
125
149
  Found /usr/bin/true:elf-x86_64
126
150
  [
@@ -152,7 +176,7 @@ Found /usr/bin/true:elf-x86_64
152
176
 
153
177
  `reait` reads the config file stored at `~/.reait.toml`. An example config file looks like:
154
178
 
155
- ```
179
+ ```shell
156
180
  apikey = "l1br3"
157
181
  host = "https://api.reveng.ai"
158
182
  model = "binnet-0.1"
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "reait"
7
- version = "0.0.18"
7
+ version = "0.0.20"
8
8
  readme = "README.md"
9
9
  classifiers=[
10
10
  "Programming Language :: Python :: 3",
@@ -1,11 +1,14 @@
1
+ # -*- coding: utf-8 -*-
1
2
  import setuptools
2
3
 
4
+ __version__ = "0.0.20"
5
+
3
6
  with open("README.md", "r") as f:
4
7
  long_description = f.read()
5
8
 
6
9
  setuptools.setup(
7
10
  name="reait",
8
- version="0.0.18",
11
+ version=__version__,
9
12
  long_description=long_description,
10
13
  long_description_content_type="text/markdown",
11
14
  url="https://github.com/RevEng-AI/reait",
@@ -21,4 +24,3 @@ setuptools.setup(
21
24
  'tqdm', 'argparse', 'requests', 'rich', 'tomli', 'pandas', 'numpy', "scipy", "scikit-learn"
22
25
  ],
23
26
  )
24
-
@@ -1,2 +1,5 @@
1
1
  from reait import api
2
2
  api.parse_config()
3
+
4
+
5
+ __version__ = "0.0.20"
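After this release the version string `0.0.20` is written out in three of the files shown above: the `[project]` table, the `__version__` variable in the setup script, and the `__version__` variable in the package module of the last hunk. A small check like the following could catch a bump that misses one of them. It is a sketch, not part of the package; `find_version` and `versions_agree` are hypothetical names, and the regular expression only handles the double-quoted assignment forms that appear in this diff.

```python
import re

# Matches `version = "0.0.20"` and `__version__ = "0.0.20"` at the start of a line.
# It deliberately does not match `version=__version__`, which has no quoted value.
VERSION_RE = re.compile(r'^\s*(?:__version__|version)\s*=\s*"([^"]+)"', re.MULTILINE)


def find_version(text):
    """Return the first quoted version assigned in text, or None if absent."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


def versions_agree(*texts):
    """True when every text declares a version and all of them are identical."""
    found = [find_version(text) for text in texts]
    return None not in found and len(set(found)) == 1
```

In practice each argument would be the contents of one of the three files, read from disk before calling `versions_agree`.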