rvasm 0.0.1a1__tar.gz

rvasm-0.0.1a1/LICENSE ADDED
@@ -0,0 +1,7 @@
+ Copyright © 2026 Will Arden
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
rvasm-0.0.1a1/PKG-INFO ADDED
@@ -0,0 +1,80 @@
+ Metadata-Version: 2.4
+ Name: rvasm
+ Version: 0.0.1a1
+ Summary: A simple RISC-V assembler written in Python
+ Author-email: Will Arden <will.ardxn@gmail.com>
+ License: Copyright © 2026 Will Arden
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ Project-URL: Homepage, https://github.com/will-arden/rvasm
+ Project-URL: Source, https://github.com/will-arden/rvasm
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Dynamic: license-file
+
+ # rvasm: A Python-Based RISC-V Assembler
+ **rvasm** aims to provide a simple and easy-to-use RISC-V assembler, as both a **command-line tool** and a **Python package**. **rvasm** makes it easy to define and use **custom instructions**, simply by providing a JSON file.
+
+ ## Getting started
+ You can install **rvasm** using `pip`, as follows:
+
+ > `pip install rvasm`
+
+ **rvasm** can be used either as a **command-line tool** or as a **Python package** imported into your own program.
+
+ The command-line tool can be used to assemble an RV32I text file, as demonstrated below:
+ > `rvasm my_file.asm`
+
+ This will produce an output file `out.mem`, containing your assembled RV32I machine code in hexadecimal. For help on other command-line options, use:
+ > `rvasm --help`
+
+ Here is an example of how you can use **rvasm** in a Python project:
+ >```python
+ >import rvasm
+ >my_assembler = rvasm.RVAsm()                  # Create an assembler object
+ >with open("my_input_file.asm", "r") as f:     # Open the assembly file
+ >    my_assembler.Assemble(f)                  # Assemble it and write the result to out.mem
+ >```
+
+ ## Assembling custom instructions
+ **rvasm** makes assembling custom instructions straightforward.
+
+ First, create a **JSON file** detailing your custom instructions ([use this reference](https://github.com/will-arden/rvasm/tree/main/src/rvasm/json/RV32I.json)). You can specify one or more RISC-V extensions in the same file, or use multiple files. For example:
+ > ```json
+ > {
+ >     "MY_CUSTOM_EXTENSION": [
+ >         {
+ >             "instr": "addi",
+ >             "format": "instr rd, rs1, imm",
+ >             "width": 32,
+ >             "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
+ >             "opcode": "0010011",
+ >             "funct3": "000",
+ >             "funct7": null
+ >         }
+ >     ]
+ > }
+ > ```
+
+ To include the new extension(s) from the command line, use the `--include` option, as below:
+ > `rvasm my_file.asm --include my_custom_extension.json`
+
+ The following example shows how you can use **rvasm** to assemble custom RISC-V instructions in your Python code:
+ > ```python
+ > import rvasm
+ > asm = rvasm.RVAsm()
+ > with open("my_custom_extension.json", "r") as f:
+ >     asm.IncludeFromJSON(f)
+ > with open("my_file.asm", "r") as f:
+ >     asm.Assemble(f)
+ > ```
+
+ This project is a work-in-progress, so please keep checking in! Feel free to create issues and suggest improvements.
+
+ ## License
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
rvasm-0.0.1a1/README.md ADDED
@@ -0,0 +1,61 @@
+ # rvasm: A Python-Based RISC-V Assembler
+ **rvasm** aims to provide a simple and easy-to-use RISC-V assembler, as both a **command-line tool** and a **Python package**. **rvasm** makes it easy to define and use **custom instructions**, simply by providing a JSON file.
+
+ ## Getting started
+ You can install **rvasm** using `pip`, as follows:
+
+ > `pip install rvasm`
+
+ **rvasm** can be used either as a **command-line tool** or as a **Python package** imported into your own program.
+
+ The command-line tool can be used to assemble an RV32I text file, as demonstrated below:
+ > `rvasm my_file.asm`
+
+ This will produce an output file `out.mem`, containing your assembled RV32I machine code in hexadecimal. For help on other command-line options, use:
+ > `rvasm --help`
+
+ Here is an example of how you can use **rvasm** in a Python project:
+ >```python
+ >import rvasm
+ >my_assembler = rvasm.RVAsm()                  # Create an assembler object
+ >with open("my_input_file.asm", "r") as f:     # Open the assembly file
+ >    my_assembler.Assemble(f)                  # Assemble it and write the result to out.mem
+ >```
+
+ ## Assembling custom instructions
+ **rvasm** makes assembling custom instructions straightforward.
+
+ First, create a **JSON file** detailing your custom instructions ([use this reference](https://github.com/will-arden/rvasm/tree/main/src/rvasm/json/RV32I.json)). You can specify one or more RISC-V extensions in the same file, or use multiple files. For example:
+ > ```json
+ > {
+ >     "MY_CUSTOM_EXTENSION": [
+ >         {
+ >             "instr": "addi",
+ >             "format": "instr rd, rs1, imm",
+ >             "width": 32,
+ >             "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
+ >             "opcode": "0010011",
+ >             "funct3": "000",
+ >             "funct7": null
+ >         }
+ >     ]
+ > }
+ > ```
+
+ To include the new extension(s) from the command line, use the `--include` option, as below:
+ > `rvasm my_file.asm --include my_custom_extension.json`
+
+ The following example shows how you can use **rvasm** to assemble custom RISC-V instructions in your Python code:
+ > ```python
+ > import rvasm
+ > asm = rvasm.RVAsm()
+ > with open("my_custom_extension.json", "r") as f:
+ >     asm.IncludeFromJSON(f)
+ > with open("my_file.asm", "r") as f:
+ >     asm.Assemble(f)
+ > ```
+
+ This project is a work-in-progress, so please keep checking in! Feel free to create issues and suggest improvements.
+
+ ## License
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
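As a worked illustration of the `encoding` string in the README's JSON example, the sketch below concatenates the fields for `addi` by hand. It does not use rvasm itself: `encode_addi` is a hypothetical helper name, and masking the immediate to a 12-bit two's-complement value is an assumption based on the standard RV32I I-type layout.

```python
def encode_addi(rd: int, rs1: int, imm: int) -> str:
    """Build the 32-bit word in the order given by the README's encoding string:
    imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]."""
    imm_bits = format(imm & 0xFFF, "012b")   # assumption: 12-bit two's-complement immediate
    rs1_bits = format(rs1 & 0x1F, "05b")     # 5-bit source register index
    rd_bits = format(rd & 0x1F, "05b")       # 5-bit destination register index
    funct3 = "000"                           # taken from the JSON example
    opcode = "0010011"                       # taken from the JSON example
    return imm_bits + rs1_bits + funct3 + rd_bits + opcode


word = encode_addi(rd=1, rs1=0, imm=5)       # addi x1, x0, 5
print(word)                                  # 00000000010100000000000010010011
print(format(int(word, 2), "08x"))           # 00500093
```

The hexadecimal form on the last line is the kind of value that would be expected on one line of `out.mem` for this instruction.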
rvasm-0.0.1a1/pyproject.toml ADDED
@@ -0,0 +1,24 @@
+ [project]
+ name = "rvasm"
+ version = "0.0.1a1"
+ description = "A simple RISC-V assembler written in Python"
+ readme = "README.md"
+ requires-python = ">=3.10"
+ license = { file = "LICENSE" }
+ authors = [
+     { name = "Will Arden", email = "will.ardxn@gmail.com" }
+ ]
+
+ [project.urls]
+ Homepage = "https://github.com/will-arden/rvasm"
+ Source = "https://github.com/will-arden/rvasm"
+
+ [build-system]
+ requires = ["setuptools>=61.0"]
+ build-backend = "setuptools.build_meta"
+
+ [tool.setuptools.packages.find]
+ where = ["src"]
+
+ [project.scripts]
+ rvasm = "rvasm.rvasm:main"
rvasm-0.0.1a1/setup.cfg ADDED
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
rvasm-0.0.1a1/src/rvasm/__init__.py ADDED
@@ -0,0 +1 @@
+ from .rvasm import RVAsm
rvasm-0.0.1a1/src/rvasm/inspector.py ADDED
@@ -0,0 +1,83 @@
+ from pathlib import Path
+ import json
+
+ class Inspector():
+
+     def __init__(self):
+
+         # Inspect all JSON files
+         json_dir = Path(__file__).parent / "json"    # Named json_dir to avoid shadowing the built-in dir()
+         for json_file in json_dir.glob("*.json"):
+             with open(json_file, "r") as f:
+                 json_data = json.load(f)
+             self.InspectJSON(json_data)
+
+     class InspectorError(Exception):
+         def __init__(self, message: str):
+             super().__init__(message)
+
+     # Method to inspect parsed JSON data containing ISA information
+     def InspectJSON(self, json_data: dict):
+
+         # Check if any illegal characters are found in the instruction formats
+         for extension_name, extension_data in json_data.items():
+             for instruction in extension_data:
+                 for attr, val in instruction.items():
+                     match attr:
+
+                         case "instr":
+                             legal_chars = ".abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
+                             if (val is None or val == ""):
+                                 raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
+                             for c in val:
+                                 if (c not in legal_chars):
+                                     raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
+
+                         case "format":
+                             legal_chars = " ,()abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
+                             if (val is None or val == ""):
+                                 raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
+                             for c in val:
+                                 if (c not in legal_chars):
+                                     raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
+
+                         case "width":
+                             if not (isinstance(val, int)) or (val < 8) or (val % 8 != 0):
+                                 raise self.InspectorError(f"Illegal width field found for instruction '{instruction['instr']}' in extension '{extension_name}'. The width must be an integer multiple of 8, and at least 8.")
+
+                         case "encoding":
+                             legal_chars = " ,[:]&abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
+                             if (val is None or val == ""):
+                                 raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
+                             for c in val:
+                                 if (c not in legal_chars):
+                                     raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
+
+                         case "opcode":
+                             legal_chars = "01"
+                             if (val is None or val == ""):
+                                 raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
+                             for c in val:
+                                 if (c not in legal_chars):
+                                     raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
+
+                         case "funct3":
+                             legal_chars = "01"
+                             if (val is None or val == ""):
+                                 continue
+                             for c in val:
+                                 if (c not in legal_chars):
+                                     raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
+
+                         case "funct7":
+                             legal_chars = "01"
+                             if (val is None or val == ""):
+                                 continue
+                             for c in val:
+                                 if (c not in legal_chars):
+                                     raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
+
+
+                         # Raise an error for unexpected attributes
+                         case _:
+                             raise self.InspectorError(f"Unexpected attribute '{attr}' found for {extension_name} instruction {instruction['instr']} in JSON file")
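The per-attribute character whitelists above can also be expressed as regular expressions. The following is a minimal standalone sketch of that idea, not the package's own code: `PATTERNS` and `find_problems` are hypothetical names, and unlike `InspectJSON` this version reports a missing required attribute as a problem and does not reject unexpected attributes.

```python
import re

# Allowed characters per required attribute, mirroring the whitelists in inspector.py
PATTERNS = {
    "instr": re.compile(r"[.A-Za-z]+"),
    "format": re.compile(r"[ ,()A-Za-z0-9]+"),
    "encoding": re.compile(r"[ ,\[:\]&A-Za-z0-9]+"),
    "opcode": re.compile(r"[01]+"),
}


def find_problems(entry: dict) -> list[str]:
    """Return the names of attributes that fail validation (hypothetical helper)."""
    problems = []
    for attr, pattern in PATTERNS.items():
        val = entry.get(attr)
        if not isinstance(val, str) or not pattern.fullmatch(val):
            problems.append(attr)
    width = entry.get("width")
    if not isinstance(width, int) or width < 8 or width % 8 != 0:
        problems.append("width")
    for attr in ("funct3", "funct7"):        # optional: None or "" is accepted
        val = entry.get(attr)
        if val in (None, ""):
            continue
        if not isinstance(val, str) or not re.fullmatch(r"[01]+", val):
            problems.append(attr)
    return problems


addi = {
    "instr": "addi",
    "format": "instr rd, rs1, imm",
    "width": 32,
    "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
    "opcode": "0010011",
    "funct3": "000",
    "funct7": None,
}
print(find_problems(addi))                            # []
print(find_problems({**addi, "opcode": "00100x1"}))   # ['opcode']
```

Collecting every failing attribute, rather than raising on the first one, is a design choice of this sketch; it can make a malformed custom-extension file quicker to correct.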
rvasm-0.0.1a1/src/rvasm/library.py ADDED
@@ -0,0 +1,58 @@
+ import json
+ from typing import TextIO
+ import importlib.resources as res
+ from pathlib import Path
+
+ class Library():
+
+     def __init__(self):
+
+         # At runtime, this will be populated with declared and included instructions
+         self.working_lib = []
+
+         # Declared ISAs (instance attributes, so separate Library objects never share state)
+         self.core_isas = {}     # These ISAs are supported by default - simply include them to use
+         self.extra_isas = {}    # Extra/custom ISAs must be declared before they can be included for use
+
+         # Declare the core ISAs (JSON files in the rvasm.core_ext package)
+         for file in res.files("rvasm.core_ext").iterdir():          # Iterate through every JSON file
+             if (file.name.endswith(".json")):
+                 with file.open("r", encoding="utf-8-sig") as f:
+                     json_data = json.load(f)
+                 for isa_name, isa_data in json_data.items():        # Add the ISA data to the dictionary
+                     self.core_isas[isa_name] = isa_data
+
+     class LibraryError(Exception):
+         def __init__(self, message: str):
+             super().__init__(message)
+
+     # Method to declare extra/custom ISA(s) from parsed JSON data
+     def DeclareFromJSONData(self, json_data: dict):
+         for isa_name, isa_data in json_data.items():    # Iterate through ISAs
+             self.extra_isas[isa_name] = isa_data        # and add the data to the dictionary
+
+         # Check for duplicate instructions added to the working library
+         seen_instructions = []
+         for entry in self.working_lib:
+             if (entry["instr"] in seen_instructions):
+                 raise self.LibraryError(f"Found multiple definitions for the same instruction ({entry['instr']}) when compiling the working library.")
+             seen_instructions.append(entry["instr"])
+
+     # Method to update the working library based on a given include list
+     def UpdateWorkingLibrary(self, include: list[str]):
+         self.working_lib = []                                           # Rebuild from scratch so repeated calls do not append duplicates
+         for isa_to_include in include:                                  # Iterate through every ISA in the include list
+             isa_found = False                                           # Track each ISA separately
+             for collection in (self.core_isas, self.extra_isas):        # Repeat for both core and extra ISA dictionaries
+                 if (collection.get(isa_to_include)):
+                     isa_found = True
+                     for instruction in collection.get(isa_to_include):
+                         self.working_lib.append(instruction.copy())     # Append a copy of the instruction to the working library
+             if (not isa_found):
+                 raise self.LibraryError(f"Could not recognise ISA: {isa_to_include}")
+
+     # Method to lookup an instruction from the working library
+     def WorkingLibraryLookUp(self, search_term: str):
+         for instruction in self.working_lib:
+             if (instruction["instr"] == search_term):
+                 return instruction
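One way to compile a working library from an include list is sketched below. It is a standalone illustration rather than the package's method: `build_working_library` is a hypothetical function, it takes a single dictionary of ISAs instead of separate core and extra collections, and checking for duplicate mnemonics while the list is being built is a design assumption (in `library.py` the duplicate check sits in `DeclareFromJSONData`).

```python
def build_working_library(isas: dict[str, list[dict]], include: list[str]) -> list[dict]:
    """Flatten the named ISAs into one list, rejecting unknown ISAs and duplicate mnemonics."""
    working: list[dict] = []
    seen: set[str] = set()
    for name in include:
        if name not in isas:
            raise KeyError(f"Could not recognise ISA: {name}")
        for instruction in isas[name]:
            mnemonic = instruction["instr"]
            if mnemonic in seen:
                raise ValueError(f"Multiple definitions for instruction '{mnemonic}'")
            seen.add(mnemonic)
            working.append(dict(instruction))   # copy, so the declared ISA data stays untouched
    return working


# Toy data for illustration only; real entries carry format, width, encoding and so on
isas = {
    "RV32I": [{"instr": "addi"}, {"instr": "add"}],
    "MY_CUSTOM_EXTENSION": [{"instr": "mac"}],
}
lib = build_working_library(isas, ["RV32I", "MY_CUSTOM_EXTENSION"])
print([entry["instr"] for entry in lib])        # ['addi', 'add', 'mac']
```

Because the list is rebuilt on every call, including the same ISA set twice yields the same result instead of a longer list.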
rvasm-0.0.1a1/src/rvasm/processor.py ADDED
@@ -0,0 +1,161 @@
+ from rvasm.tokeniser import Tokeniser
+
+ class Processor():
+
+     def __init__(self, library):
+         self.library = library                  # Shared Library object
+         self.instructions = []                  # Store tokenised instructions
+         self.labels = []                        # Store labels with a program index
+         self.index = 0                          # Program index (program counter / 4)
+         self.tokeniser = Tokeniser(library)     # Object responsible for tokenising instructions
+
+     class ProcessorError(Exception):
+         def __init__(self, message):
+             super().__init__(message)
+
+     # Method to reset the Processor, ready for another file to assemble
+     def Reset(self):
+         self.instructions = []
+         self.labels = []
+         self.index = 0                          # Restart the program index so labels in the next file begin at zero
+
+     # Method to process the next line of the assembly file
+     def ProcessLine(self, line: str):
+
+         line = line.rstrip()                    # Strip the newline character from the line
+         line = line.split("#")[0]               # Ignore comments
+         line = line.strip()                     # Remove whitespace
+
+         # Ignore empty lines
+         if (not line):
+             return
+
+         # Parse labels
+         if (":" in line):
+             label_parts = line.split(":")
+
+             if (label_parts[1].rstrip()):
+                 raise self.ProcessorError(f"Unexpected characters following label declaration: {line}")
+
+             label = {"name": label_parts[0].strip().lower(), "index": self.index}    # Lower-cased to match the tokeniser, which lower-cases operands
+             self.labels.append(label)
+             return
+
+         # Tokenise instructions
+         tokens = self.tokeniser.Tokenise(line)
+
+         # Look-up the library data for this instruction
+         lib_data = self.library.WorkingLibraryLookUp(tokens["instr"])
+         if (lib_data is None):
+             raise self.ProcessorError(f"Could not find instruction {tokens['instr']} in the working library.")
+
+         # Iterate through the tokens
+         for key, value in tokens.items():
+
+             # No need to check the instruction keyword; this must be correct
+             if (key == "instr"):
+                 continue
+
+             # Ensure any x's are removed from register fields, then convert to an integer between 0-31
+             if (key in ("rd", "rs1", "rs2")):
+                 tokens[key] = tokens[key].replace("x", "")
+                 tokens[key] = int(tokens[key])
+                 if (tokens[key] < 0 or tokens[key] > 31):
+                     raise self.ProcessorError(f"Register {str(tokens[key])} outside of range 0-31.")
+
+             # Identify keys which have plain-text values (these might be labels)
+             if (value.isalpha()):
+
+                 # TODO: some specific strings might be mappable to some other function
+
+                 # Resolve unknown plain-text values to labels
+                 for lbl in self.labels:
+                     if (lbl["name"] == tokens[key]):
+                         tokens[key] = lbl["index"] * (lib_data["width"] // 8)
+
+             # Convert immediate values to integers (if they are not already)
+             if (key == "imm"):
+                 tokens[key] = int(tokens[key])
+
+         # Finally, increment the index and append the instruction to the list of parsed instructions
+         self.index += 1
+         self.instructions.append(tokens)
+
+     # Method to generate the final machine code
+     def GenerateBinaries(self):
+         machine_code = []
+
+         # Iterate through each instruction
+         for line in self.instructions:
+             lib_data = self.library.WorkingLibraryLookUp(line["instr"])     # Fetch the relevant library data
+             binary = ""                                                     # Placeholder for binarised instruction
+
+             # Iterate through each field of the encoding
+             fields = [p.strip() for p in lib_data["encoding"].split("&")]
+             for field in fields:
+
+                 # Parse fields which use bit-slicing
+                 if ("[" in field and "]" in field and ":" in field):
+                     fname, remainder = field.split("[", 1)              # Get the name of the field
+                     fupper, remainder = remainder.split(":", 1)         # Get the upper bound
+                     flower = remainder.split("]")[0]                    # Get the lower bound
+                     (fupper, flower) = (int(fupper), int(flower))       # Convert bounds to integers
+
+                     # Try to retrieve the information from the instruction
+                     if (line.get(fname.strip()) is not None):
+                         fdata = line.get(fname.strip())
+
+                     # Failing that, try to retrieve the information from the library data
+                     elif (lib_data.get(fname.strip()) is not None):
+                         fdata = lib_data.get(fname.strip())
+
+                     # If the information can't be found, raise an error
+                     else:
+                         raise self.ProcessorError(f"Couldn't find information about '{fname.strip()}' in the instruction or in the corresponding library data.")
+
+                     # Add the bits to the binary string of the instruction (integers are masked to 64 bits so negative values become two's complement)
+                     fvalue = format(fdata & ((1 << 64) - 1), "064b") if isinstance(fdata, int) else fdata
+                     binary += fvalue[-1 - fupper : len(fvalue) - flower]
+
+                 # Parse fields which index single bits
+                 elif ("[" in field and "]" in field and not ":" in field):
+                     fname, remainder = field.split("[", 1)              # Get the name of the field
+                     findex = int(remainder.split("]", 1)[0])            # Get the index of the bit of interest
+
+                     # Try to retrieve the information from the instruction
+                     if (line.get(fname.strip()) is not None):
+                         fdata = line.get(fname.strip())
+
+                     # Failing that, try to retrieve the information from the library data
+                     elif (lib_data.get(fname.strip()) is not None):
+                         fdata = lib_data.get(fname.strip())
+
+                     # If the information can't be found, raise an error
+                     else:
+                         raise self.ProcessorError(f"Couldn't find information about '{fname.strip()}' in the instruction or in the corresponding library data.")
+
+                     # Add the bit to the binary string of the instruction (integers are masked to 64 bits so negative values become two's complement)
+                     fvalue = format(fdata & ((1 << 64) - 1), "064b") if isinstance(fdata, int) else fdata
+                     binary += fvalue[-1 - findex]
+
+                 # Where no bit-indexing or bit-slicing is required
+                 elif (not "[" in field and not "]" in field and not ":" in field):
+                     if (line.get(field.strip()) is not None):
+                         binary += line.get(field.strip())
+                     elif (lib_data.get(field.strip()) is not None):
+                         binary += lib_data.get(field.strip())
+                     else:
+                         raise self.ProcessorError(f"Couldn't find information about '{field.strip()}' in the instruction or in the corresponding library data.")
+
+                 # Encoding could not be interpreted
+                 else:
+                     raise self.ProcessorError(f"Invalid encoding syntax in JSON data: {field}")
+
+             # Check that the length of the generated binary is valid
+             if (len(binary) != lib_data.get("width")):
+                 raise self.ProcessorError(f"Expected the width of the '{lib_data['instr']}' instruction to be {lib_data['width']}; got {len(binary)}.")
+
+             # Append the binary instruction to the list of machine code instructions
+             machine_code.append(binary)
+
+         return machine_code
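The bit-slicing step in `GenerateBinaries` can be isolated into a small function. The sketch below is an illustration under stated assumptions: `slice_bits` is a hypothetical helper, and integers are masked to a fixed width first because Python's `format` renders a negative integer with a leading minus sign rather than in two's complement.

```python
def slice_bits(value, upper: int, lower: int, width: int = 64) -> str:
    """Return bits upper..lower (inclusive, bit 0 = least significant) as a binary string.

    `value` may be an int (operand such as an immediate) or an existing
    bit string (constant such as an opcode from the JSON data).
    """
    if isinstance(value, int):
        # Masking maps negative numbers onto their two's-complement bit pattern
        value = format(value & ((1 << width) - 1), f"0{width}b")
    return value[len(value) - 1 - upper : len(value) - lower]


print(slice_bits(5, 11, 0))            # 000000000101
print(slice_bits(-1, 11, 0))           # 111111111111
print(slice_bits("0010011", 6, 0))     # 0010011
print(slice_bits(0b1010, 3, 2))        # 10
```

A single-bit field such as `imm[12]` is the special case `slice_bits(value, 12, 12)`.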
rvasm-0.0.1a1/src/rvasm/rvasm.py ADDED
@@ -0,0 +1,136 @@
+ import argparse
+ import json
+ from typing import TextIO
+
+ from .inspector import Inspector
+ from .library import Library
+ from .processor import Processor
+
+ def main():
+
+     # Handle arguments
+     parser = argparse.ArgumentParser(description="A Python-based, JSON-driven RISC-V assembler.")
+     parser.add_argument("input", help="input file path")
+     parser.add_argument("-o", "--output", help="output file path")
+     parser.add_argument("-f", "--format", help="output format (binary/hex)")
+     parser.add_argument("-i", "--include", nargs="*", help="specify any number of JSON files detailing extra/custom ISA(s)")
+     args = parser.parse_args()
+
+     # Create a new RVAsm object per command
+     rvasm = RVAsm()
+
+     with open(args.input, "r", encoding="utf-8") as asm_file:
+
+         # Placeholder variables to pass to rvasm object
+         OUTPUT = None
+         OUTPUT_FORMAT = None
+         INCLUDE = None
+
+         # Overwrite placeholders with any arguments (if present)
+         if (args.output):
+             OUTPUT = args.output
+         if (args.format):
+             OUTPUT_FORMAT = args.format
+         if (args.include):
+             INCLUDE = args.include
+
+         # Include ISAs
+         if (INCLUDE and len(INCLUDE) > 0):
+             for json_path in INCLUDE:
+                 with open(json_path, "r") as json_file:
+                     rvasm.IncludeFromJSON(json_file)
+
+         # Go!
+         rvasm.Assemble(asm_file, output=OUTPUT, output_format=OUTPUT_FORMAT)
+
+ class RVAsm():
+
+     def __init__(self):
+
+         # Verify the RVAsm setup using the Inspector class
+         try:
+             self.inspector = Inspector()
+         except Exception as e:
+             print("Failed to create the RVAsm object; setup could not be verified.")
+             print(f"{e}")
+             raise SystemExit(1)                     # Exit with a non-zero status to signal the failure
+         self.library = Library()                    # Create a new Library object
+
+         self.default_includes = ["RV32I"]           # Specify ISAs to include by default
+         self.user_includes = []                     # Placeholder for user-specified inclusions
+         self._UpdateWorkingLibrary(self.default_includes + self.user_includes)  # Update and compile the working library based on the include list
+
+         self.processor = Processor(self.library)    # Create a Processor object with the shared library
+         self.bin = None                             # Placeholder for the assembled machine code
+
+     class RVAsmError(Exception):
+         def __init__(self, message: str):
+             super().__init__(message)
+
+     # Method to assemble a '.asm' file
+     def Assemble(self, file: TextIO, output: str | None = None, output_format: str | None = None):
+
+         # Internally set default arguments (easier for argparse)
+         if (not output):
+             output = "out.mem"
+         if (not output_format):
+             output_format = "hex"
+
+         # Reset the processor
+         self.processor.Reset()
+
+         # Process each line of the .asm file, catching any exceptions and reporting as debug information
+         for i, line in enumerate(file):
+             try:
+                 self.processor.ProcessLine(line)
+             except Exception as e:
+                 print(f"\n{type(e).__name__} caused the following line to fail:")
+                 print(line.rstrip("\n"))
+                 print(f"{e}")
+                 raise SystemExit(1)                 # Exit with a non-zero status to signal the failure
+
+         self.bin = self.processor.GenerateBinaries()    # Create the machine code
+         self._WriteOutput(                              # Output the file to the current directory
+             filename=output,
+             output_format=output_format
+         )
+
+     # Method to reset the assembler
+     def Reset(self):
+         self.processor.Reset()
+         self.user_includes = []
+         self._UpdateWorkingLibrary(self.default_includes + self.user_includes)  # The include list is a required argument
+
+     # Method to include ISA data from a JSON file
+     def IncludeFromJSON(self, json_file: TextIO):
+         json_data = json.load(json_file)
+         self.library.DeclareFromJSONData(json_data)
+
+         # Update the include list
+         for isa_name, isa_data in json_data.items():
+             if not (isa_name in self.user_includes):
+                 self._IncludeISA(isa_name)
+
+     # Method to include an ISA of a particular name for use
+     def _IncludeISA(self, name):
+         if not (name in self.user_includes):        # Avoid double-inclusions
+             self.user_includes.append(name)         # Append the name of the ISA to the user includes
+             self._UpdateWorkingLibrary(self.default_includes + self.user_includes)  # Update the working library
+
+     # Method to update the working library following changes to the include list
+     def _UpdateWorkingLibrary(self, total_include_list: list[str]):
+         self.library.UpdateWorkingLibrary(total_include_list)
+
+     # Method to write the machine code to an output file
+     def _WriteOutput(self, filename="out.mem", output_format="hex"):
+         write_content = self.bin
+
+         with open(filename, "w") as f:
+             for line in write_content:
+
+                 # If "hex" is selected as the format, convert each line to a hexadecimal number
+                 if (output_format.lower() == "hex"):
+                     line = format(int(line, 2), "0" + str(int(len(line) / 4)) + "x")
+
+                 # Write each line to the output file
+                 f.write(line + "\n")
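The output step converts each binary string into zero-padded hexadecimal, one instruction per line. The sketch below reproduces that conversion in isolation; `write_mem` is a hypothetical function that writes to any text stream, so it can be tried with `io.StringIO` without touching the file system.

```python
import io


def write_mem(words: list[str], stream, output_format: str = "hex") -> None:
    """Write one instruction per line, either as hex digits or as the raw bit string."""
    for bits in words:
        if output_format.lower() == "hex":
            # Four bits per hex digit, zero-padded so leading zeros are preserved
            bits = format(int(bits, 2), f"0{len(bits) // 4}x")
        stream.write(bits + "\n")


buf = io.StringIO()
write_mem(["00000000010100000000000010010011"], buf)   # addi x1, x0, 5
print(buf.getvalue(), end="")                           # 00500093
```

Padding to `len(bits) // 4` digits keeps every line the same length, which matters for words whose upper bits are zero.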
rvasm-0.0.1a1/src/rvasm/tokeniser.py ADDED
@@ -0,0 +1,41 @@
+ import re
+
+ from rvasm.library import Library
+
+ class Tokeniser():
+
+     def __init__(self, library):
+         self.library = library
+
+     class TokeniserError(Exception):
+         def __init__(self, message: str):
+             super().__init__(message)
+
+     # Method to tokenise an instruction
+     def Tokenise(self, line: str):
+         tokenised_instruction = {}
+
+         # Split the instruction into parts
+         parts = re.split(r"[,()\s]+", line)                 # Split for whitespace, commas and brackets
+         parts = [p.strip() for p in parts if p.strip()]     # Prune separators and useless parts
+         instr = parts[0]                                    # Identify the instruction keyword
+
+         # Firstly, retrieve the data about the instruction from the library
+         library_data = self.library.WorkingLibraryLookUp(instr)
+
+         # Throw an error if the instruction can't be found in the working library
+         if (library_data is None):
+             raise self.TokeniserError(f"Unknown instruction {instr} which cannot be found in the working library.")
+
+         # Tokenise the format string, then check that enough operands were written
+         fparts = re.split(r"[,()\s]+", library_data["format"])      # Split for whitespace, commas and brackets
+         fparts = [fp.strip() for fp in fparts if fp.strip()]        # Prune separators and useless parts
+         if (len(parts) < len(fparts)):
+             raise self.TokeniserError(f"Too few operands for {instr}: expected {len(fparts) - 1}, got {len(parts) - 1}.")
+
+         # Match together each field with the corresponding value in the written instruction
+         for i, fp in enumerate(fparts):
+             tokenised_instruction[fp] = parts[i].lower()
+
+         return tokenised_instruction
+
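The tokenising approach above pairs the pieces of a written instruction with the pieces of its `format` string. Here is a minimal standalone sketch of that pairing: `tokenise` is a hypothetical free function that takes the format string directly instead of looking it up in a library, and the `lw` format shown is an assumed example rather than one taken from the package's JSON data.

```python
import re

SEPARATORS = re.compile(r"[,()\s]+")    # whitespace, commas and brackets


def tokenise(line: str, fmt: str) -> dict[str, str]:
    """Pair each field name in `fmt` with the matching piece of `line`."""
    parts = [p for p in SEPARATORS.split(line) if p]
    fields = [f for f in SEPARATORS.split(fmt) if f]
    if len(parts) < len(fields):
        raise ValueError(f"Expected {len(fields) - 1} operand(s), got {len(parts) - 1}")
    return {field: part.lower() for field, part in zip(fields, parts)}


print(tokenise("addi x1, x0, 5", "instr rd, rs1, imm"))
# {'instr': 'addi', 'rd': 'x1', 'rs1': 'x0', 'imm': '5'}
print(tokenise("lw x1, 4(x2)", "instr rd, imm(rs1)"))
# {'instr': 'lw', 'rd': 'x1', 'imm': '4', 'rs1': 'x2'}
```

Because brackets are treated as separators, an operand written as `4(x2)` splits into the offset and the base register without any special handling.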
rvasm-0.0.1a1/src/rvasm.egg-info/PKG-INFO ADDED
@@ -0,0 +1,80 @@
+ Metadata-Version: 2.4
+ Name: rvasm
+ Version: 0.0.1a1
+ Summary: A simple RISC-V assembler written in Python
+ Author-email: Will Arden <will.ardxn@gmail.com>
+ License: Copyright © 2026 Will Arden
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ Project-URL: Homepage, https://github.com/will-arden/rvasm
+ Project-URL: Source, https://github.com/will-arden/rvasm
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Dynamic: license-file
+
+ # rvasm: A Python-Based RISC-V Assembler
+ **rvasm** aims to provide a simple and easy-to-use RISC-V assembler, as both a **command-line tool** and a **Python package**. **rvasm** makes it easy to define and use **custom instructions**, simply by providing a JSON file.
+
+ ## Getting started
+ You can install **rvasm** using `pip`, as follows:
+
+ > `pip install rvasm`
+
+ **rvasm** can be used either as a **command-line tool** or as a **Python package** imported into your own program.
+
+ The command-line tool can be used to assemble an RV32I text file, as demonstrated below:
+ > `rvasm my_file.asm`
+
+ This will produce an output file `out.mem`, containing your assembled RV32I machine code in hexadecimal. For help on other command-line options, use:
+ > `rvasm --help`
+
+ Here is an example of how you can use **rvasm** in a Python project:
+ >```python
+ >import rvasm
+ >my_assembler = rvasm.RVAsm()                  # Create an assembler object
+ >with open("my_input_file.asm", "r") as f:     # Open the assembly file
+ >    my_assembler.Assemble(f)                  # Assemble it and write the result to out.mem
+ >```
+
+ ## Assembling custom instructions
+ **rvasm** makes assembling custom instructions straightforward.
+
+ First, create a **JSON file** detailing your custom instructions ([use this reference](https://github.com/will-arden/rvasm/tree/main/src/rvasm/json/RV32I.json)). You can specify one or more RISC-V extensions in the same file, or use multiple files. For example:
+ > ```json
+ > {
+ >     "MY_CUSTOM_EXTENSION": [
+ >         {
+ >             "instr": "addi",
+ >             "format": "instr rd, rs1, imm",
+ >             "width": 32,
+ >             "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
+ >             "opcode": "0010011",
+ >             "funct3": "000",
+ >             "funct7": null
+ >         }
+ >     ]
+ > }
+ > ```
+
+ To include the new extension(s) from the command line, use the `--include` option, as below:
+ > `rvasm my_file.asm --include my_custom_extension.json`
+
+ The following example shows how you can use **rvasm** to assemble custom RISC-V instructions in your Python code:
+ > ```python
+ > import rvasm
+ > asm = rvasm.RVAsm()
+ > with open("my_custom_extension.json", "r") as f:
+ >     asm.IncludeFromJSON(f)
+ > with open("my_file.asm", "r") as f:
+ >     asm.Assemble(f)
+ > ```
+
+ This project is a work-in-progress, so please keep checking in! Feel free to create issues and suggest improvements.
+
+ ## License
+ This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
rvasm-0.0.1a1/src/rvasm.egg-info/SOURCES.txt ADDED
@@ -0,0 +1,14 @@
+ LICENSE
+ README.md
+ pyproject.toml
+ src/rvasm/__init__.py
+ src/rvasm/inspector.py
+ src/rvasm/library.py
+ src/rvasm/processor.py
+ src/rvasm/rvasm.py
+ src/rvasm/tokeniser.py
+ src/rvasm.egg-info/PKG-INFO
+ src/rvasm.egg-info/SOURCES.txt
+ src/rvasm.egg-info/dependency_links.txt
+ src/rvasm.egg-info/entry_points.txt
+ src/rvasm.egg-info/top_level.txt
rvasm-0.0.1a1/src/rvasm.egg-info/entry_points.txt ADDED
@@ -0,0 +1,2 @@
+ [console_scripts]
+ rvasm = rvasm.rvasm:main
rvasm-0.0.1a1/src/rvasm.egg-info/top_level.txt ADDED
@@ -0,0 +1 @@
+ rvasm