rvasm 0.0.1a1__tar.gz
- rvasm-0.0.1a1/LICENSE +7 -0
- rvasm-0.0.1a1/PKG-INFO +80 -0
- rvasm-0.0.1a1/README.md +61 -0
- rvasm-0.0.1a1/pyproject.toml +24 -0
- rvasm-0.0.1a1/setup.cfg +4 -0
- rvasm-0.0.1a1/src/rvasm/__init__.py +1 -0
- rvasm-0.0.1a1/src/rvasm/inspector.py +83 -0
- rvasm-0.0.1a1/src/rvasm/library.py +58 -0
- rvasm-0.0.1a1/src/rvasm/processor.py +161 -0
- rvasm-0.0.1a1/src/rvasm/rvasm.py +136 -0
- rvasm-0.0.1a1/src/rvasm/tokeniser.py +41 -0
- rvasm-0.0.1a1/src/rvasm.egg-info/PKG-INFO +80 -0
- rvasm-0.0.1a1/src/rvasm.egg-info/SOURCES.txt +14 -0
- rvasm-0.0.1a1/src/rvasm.egg-info/dependency_links.txt +1 -0
- rvasm-0.0.1a1/src/rvasm.egg-info/entry_points.txt +2 -0
- rvasm-0.0.1a1/src/rvasm.egg-info/top_level.txt +1 -0
rvasm-0.0.1a1/LICENSE
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
Copyright © 2026 Will Arden
|
|
2
|
+
|
|
3
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
|
4
|
+
|
|
5
|
+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
|
6
|
+
|
|
7
|
+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
rvasm-0.0.1a1/PKG-INFO
ADDED
|
@@ -0,0 +1,80 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: rvasm
|
|
3
|
+
Version: 0.0.1a1
|
|
4
|
+
Summary: A simple RISC-V assembler written in Python
|
|
5
|
+
Author-email: Will Arden <will.ardxn@gmail.com>
|
|
6
|
+
License: Copyright © 2026 Will Arden
|
|
7
|
+
|
|
8
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
|
|
9
|
+
|
|
10
|
+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
|
|
11
|
+
|
|
12
|
+
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
|
13
|
+
Project-URL: Homepage, https://github.com/will-arden/rvasm
|
|
14
|
+
Project-URL: Source, https://github.com/will-arden/rvasm
|
|
15
|
+
Requires-Python: >=3.10
|
|
16
|
+
Description-Content-Type: text/markdown
|
|
17
|
+
License-File: LICENSE
|
|
18
|
+
Dynamic: license-file
|
|
19
|
+
|
|
20
|
+
# rvasm: A Python-Based RISC-V Assembler
|
|
21
|
+
**rvasm** aims to provide a simple and easy-to-use RISC-V assembler, as both a **command-line tool** and a **Python package**. **rvasm** makes it easy to define and use **custom instructions**, simply by providing a JSON file.
|
|
22
|
+
|
|
23
|
+
## Getting started
|
|
24
|
+
You can install **rvasm** using `pip`, as follows:
|
|
25
|
+
|
|
26
|
+
> `pip install rvasm`
|
|
27
|
+
|
|
28
|
+
**rvasm** can be used either as a **command-line tool** or as a **Python package** imported into your own program.
|
|
29
|
+
|
|
30
|
+
The command-line tool can be used to assemble an RV32I text file, as demonstrated below:
|
|
31
|
+
> `rvasm my_file.asm`
|
|
32
|
+
|
|
33
|
+
This will produce an output file `out.mem`, containing your assembled RV32I in hexadecimal. For help on other command-line options, use:
|
|
34
|
+
> `rvasm --help`
|
|
35
|
+
|
|
36
|
+
Here is an example of how you can use **rvasm** in a Python project:
|
|
37
|
+
>```python
|
|
38
|
+
>import rvasm
|
|
39
|
+
>my_assembler = rvasm.RVAsm() # Create an Assembler object
|
|
40
|
+
>with open("my_input_file.asm", "r") as f: # Open the assembly file
|
|
41
|
+
> my_assembler.Assemble(f) # Generate the machine code
|
|
42
|
+
>```
|
|
43
|
+
|
|
44
|
+
## Assembling custom instructions
|
|
45
|
+
**rvasm** makes assembling custom instructions straightforward.
|
|
46
|
+
|
|
47
|
+
First, create a **JSON file** detailing your custom instructions ([use this reference](https://github.com/will-arden/rvasm/tree/main/src/rvasm/json/RV32I.json)). You can specify one or more RISC-V extensions in the same file, or use multiple files. For example:
|
|
48
|
+
> ```json
|
|
49
|
+
> {
|
|
50
|
+
> "MY_CUSTOM_EXTENSION": [
|
|
51
|
+
> {
|
|
52
|
+
> "instr": "addi",
|
|
53
|
+
> "format": "instr rd, rs1, imm",
|
|
54
|
+
> "width": 32,
|
|
55
|
+
> "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
|
|
56
|
+
> "opcode": "0010011",
|
|
57
|
+
> "funct3": "000",
|
|
58
|
+
> "funct7": null
|
|
59
|
+
> }
|
|
60
|
+
> ]
|
|
61
|
+
> }
|
|
62
|
+
> ```
|
|
63
|
+
|
|
64
|
+
To include the new extension(s) from the command line, use the `--include` option, as below:
|
|
65
|
+
> `rvasm my_file.asm --include my_custom_extension.json`
|
|
66
|
+
|
|
67
|
+
The following example shows how you can use **rvasm** to assemble custom RISC-V instructions in your Python code:
|
|
68
|
+
> ```python
|
|
69
|
+
> import rvasm
|
|
70
|
+
> asm = rvasm.RVAsm()
|
|
71
|
+
> with open("my_custom_extension.json", "r") as f:
|
|
72
|
+
> asm.IncludeFromJSON(f)
|
|
73
|
+
> with open("my_file.asm", "r") as f:
|
|
74
|
+
> asm.Assemble(f)
|
|
75
|
+
> ```
|
|
76
|
+
|
|
77
|
+
This project is a work-in-progress, so please keep checking in! Feel free to create issues and suggest improvements.
|
|
78
|
+
|
|
79
|
+
## License
|
|
80
|
+
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
|
rvasm-0.0.1a1/README.md
ADDED
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
# rvasm: A Python-Based RISC-V Assembler
|
|
2
|
+
**rvasm** aims to provide a simple and easy-to-use RISC-V assembler, as both a **command-line tool** and a **Python package**. **rvasm** makes it easy to define and use **custom instructions**, simply by providing a JSON file.
|
|
3
|
+
|
|
4
|
+
## Getting started
|
|
5
|
+
You can install **rvasm** using `pip`, as follows:
|
|
6
|
+
|
|
7
|
+
> `pip install rvasm`
|
|
8
|
+
|
|
9
|
+
**rvasm** can be used either as a **command-line tool** or as a **Python package** imported into your own program.
|
|
10
|
+
|
|
11
|
+
The command-line tool can be used to assemble an RV32I text file, as demonstrated below:
|
|
12
|
+
> `rvasm my_file.asm`
|
|
13
|
+
|
|
14
|
+
This will produce an output file `out.mem`, containing your assembled RV32I in hexadecimal. For help on other command-line options, use:
|
|
15
|
+
> `rvasm --help`
|
|
16
|
+
|
|
17
|
+
Here is an example of how you can use **rvasm** in a Python project:
|
|
18
|
+
>```python
|
|
19
|
+
>import rvasm
|
|
20
|
+
>my_assembler = rvasm.RVAsm() # Create an Assembler object
|
|
21
|
+
>with open("my_input_file.asm", "r") as f: # Open the assembly file
|
|
22
|
+
> my_assembler.Assemble(f) # Generate the machine code
|
|
23
|
+
>```
|
|
24
|
+
|
|
25
|
+
## Assembling custom instructions
|
|
26
|
+
**rvasm** makes assembling custom instructions straightforward.
|
|
27
|
+
|
|
28
|
+
First, create a **JSON file** detailing your custom instructions ([use this reference](https://github.com/will-arden/rvasm/tree/main/src/rvasm/json/RV32I.json)). You can specify one or more RISC-V extensions in the same file, or use multiple files. For example:
|
|
29
|
+
> ```json
|
|
30
|
+
> {
|
|
31
|
+
> "MY_CUSTOM_EXTENSION": [
|
|
32
|
+
> {
|
|
33
|
+
> "instr": "addi",
|
|
34
|
+
> "format": "instr rd, rs1, imm",
|
|
35
|
+
> "width": 32,
|
|
36
|
+
> "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
|
|
37
|
+
> "opcode": "0010011",
|
|
38
|
+
> "funct3": "000",
|
|
39
|
+
> "funct7": null
|
|
40
|
+
> }
|
|
41
|
+
> ]
|
|
42
|
+
> }
|
|
43
|
+
> ```
|
|
44
|
+
|
|
45
|
+
To include the new extension(s) from the command line, use the `--include` option, as below:
|
|
46
|
+
> `rvasm my_file.asm --include my_custom_extension.json`
|
|
47
|
+
|
|
48
|
+
The following example shows how you can use **rvasm** to assemble custom RISC-V instructions in your Python code:
|
|
49
|
+
> ```python
|
|
50
|
+
> import rvasm
|
|
51
|
+
> asm = rvasm.RVAsm()
|
|
52
|
+
> with open("my_custom_extension.json", "r") as f:
|
|
53
|
+
> asm.IncludeFromJSON(f)
|
|
54
|
+
> with open("my_file.asm", "r") as f:
|
|
55
|
+
> asm.Assemble(f)
|
|
56
|
+
> ```
|
|
57
|
+
|
|
58
|
+
This project is a work-in-progress, so please keep checking in! Feel free to create issues and suggest improvements.
|
|
59
|
+
|
|
60
|
+
## License
|
|
61
|
+
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
|
rvasm-0.0.1a1/pyproject.toml
ADDED
|
@@ -0,0 +1,24 @@
|
|
|
1
|
+
[project]
|
|
2
|
+
name = "rvasm"
|
|
3
|
+
version = "0.0.1a1"
|
|
4
|
+
description = "A simple RISC-V assembler written in Python"
|
|
5
|
+
readme = "README.md"
|
|
6
|
+
requires-python = ">=3.10"
|
|
7
|
+
license = { file = "LICENSE" }
|
|
8
|
+
authors = [
|
|
9
|
+
{ name = "Will Arden", email = "will.ardxn@gmail.com" }
|
|
10
|
+
]
|
|
11
|
+
|
|
12
|
+
[project.urls]
|
|
13
|
+
Homepage = "https://github.com/will-arden/rvasm"
|
|
14
|
+
Source = "https://github.com/will-arden/rvasm"
|
|
15
|
+
|
|
16
|
+
[build-system]
|
|
17
|
+
requires = ["setuptools>=61.0"]
|
|
18
|
+
build-backend = "setuptools.build_meta"
|
|
19
|
+
|
|
20
|
+
[tool.setuptools.packages.find]
|
|
21
|
+
where = ["src"]
|
|
22
|
+
|
|
23
|
+
[project.scripts]
|
|
24
|
+
rvasm = "rvasm.rvasm:main"
|
rvasm-0.0.1a1/setup.cfg
ADDED
|
rvasm-0.0.1a1/src/rvasm/__init__.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
from .rvasm import RVAsm
|
rvasm-0.0.1a1/src/rvasm/inspector.py
ADDED
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
from pathlib import Path
|
|
2
|
+
import json
|
|
3
|
+
|
|
4
|
+
class Inspector():
|
|
5
|
+
|
|
6
|
+
def __init__(self):
|
|
7
|
+
|
|
8
|
+
# Inspect all JSON files
|
|
9
|
+
dir = Path(__file__).parent / "json"
|
|
10
|
+
for json_file in dir.glob("*.json"):
|
|
11
|
+
with open(json_file, "r") as f:
|
|
12
|
+
json_data = json.load(f)
|
|
13
|
+
self.InspectJSON(json_data)
|
|
14
|
+
|
|
15
|
+
class InspectorError(Exception):
|
|
16
|
+
def __init__(self, message: str):
|
|
17
|
+
super().__init__(message)
|
|
18
|
+
|
|
19
|
+
# Method to inspect a JSON file containing ISA information
|
|
20
|
+
def InspectJSON(self, json_data: dict):
|
|
21
|
+
|
|
22
|
+
# Check if any illegal characters are found in the instruction formats
|
|
23
|
+
for extension_name, extension_data in json_data.items():
|
|
24
|
+
for instruction in extension_data:
|
|
25
|
+
for attr, val in instruction.items():
|
|
26
|
+
match attr:
|
|
27
|
+
|
|
28
|
+
case "instr":
|
|
29
|
+
legal_chars = ".abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
|
|
30
|
+
if (val == None or val == ""):
|
|
31
|
+
raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
32
|
+
for c in val:
|
|
33
|
+
if (c not in legal_chars):
|
|
34
|
+
raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
35
|
+
|
|
36
|
+
case "format":
|
|
37
|
+
legal_chars = " ,()abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
|
|
38
|
+
if (val == None or val == ""):
|
|
39
|
+
raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
40
|
+
for c in val:
|
|
41
|
+
if (c not in legal_chars):
|
|
42
|
+
raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
43
|
+
|
|
44
|
+
case "width":
|
|
45
|
+
if not (isinstance(val, int)) or (val < 8) or (val % 8 != 0):
|
|
46
|
+
raise self.InspectorError(f"Illegal width field found for instruction '{instruction['instr']}' in extension '{extension_name}'. Only positive integer multiples of 8 are permitted.")
|
|
47
|
+
|
|
48
|
+
case "encoding":
|
|
49
|
+
legal_chars = " ,[:]&abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
|
|
50
|
+
if (val == None or val == ""):
|
|
51
|
+
raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
52
|
+
for c in val:
|
|
53
|
+
if (c not in legal_chars):
|
|
54
|
+
raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
55
|
+
|
|
56
|
+
case "opcode":
|
|
57
|
+
legal_chars = "01"
|
|
58
|
+
if (val == None or val == ""):
|
|
59
|
+
raise self.InspectorError(f"Empty attribute '{attr}' found in instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
60
|
+
for c in val:
|
|
61
|
+
if (c not in legal_chars):
|
|
62
|
+
raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
63
|
+
|
|
64
|
+
case "funct3":
|
|
65
|
+
legal_chars = "01"
|
|
66
|
+
if (val == None or val == ""):
|
|
67
|
+
continue
|
|
68
|
+
for c in val:
|
|
69
|
+
if (c not in legal_chars):
|
|
70
|
+
raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
71
|
+
|
|
72
|
+
case "funct7":
|
|
73
|
+
legal_chars = "01"
|
|
74
|
+
if (val == None or val == ""):
|
|
75
|
+
continue
|
|
76
|
+
for c in val:
|
|
77
|
+
if (c not in legal_chars):
|
|
78
|
+
raise self.InspectorError(f"Illegal character '{c}' found in attribute '{attr}' of instruction '{instruction['instr']}' in extension '{extension_name}'.")
|
|
79
|
+
|
|
80
|
+
|
|
81
|
+
# Raise an error for unexpected attributes
|
|
82
|
+
case _:
|
|
83
|
+
raise self.InspectorError(f"Unexpected attribute '{attr}' found for {extension_name} instruction {instruction['instr']} in JSON file")
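The per-attribute checks in `inspector.py` all follow the same pattern, so they can be summarised as a lookup table. The following is a minimal sketch of that idea, not the package's own code: `LEGAL_CHARS`, `OPTIONAL`, `inspect_instruction` and the module-level `InspectorError` are hypothetical names, and the character sets are copied from the diff above.

```python
import string

ALNUM = string.ascii_letters + string.digits

# Allowed characters per attribute, mirroring the checks in the diff.
LEGAL_CHARS = {
    "instr": "." + string.ascii_letters,
    "format": " ,()" + ALNUM,
    "encoding": " ,[:]&" + ALNUM,
    "opcode": "01",
    "funct3": "01",
    "funct7": "01",
}
OPTIONAL = {"funct3", "funct7"}  # these may be null or empty


class InspectorError(Exception):
    """Hypothetical stand-in for the nested Inspector.InspectorError."""


def inspect_instruction(extension: str, instruction: dict) -> None:
    """Raise InspectorError if any attribute of one instruction is malformed."""
    name = instruction.get("instr")
    for attr, val in instruction.items():
        if attr == "width":
            # Width must be a positive multiple of 8 bits.
            if not isinstance(val, int) or val < 8 or val % 8 != 0:
                raise InspectorError(f"Illegal width for '{name}' in '{extension}'.")
            continue
        if attr not in LEGAL_CHARS:
            raise InspectorError(f"Unexpected attribute '{attr}' for '{name}' in '{extension}'.")
        if val is None or val == "":
            if attr in OPTIONAL:
                continue
            raise InspectorError(f"Empty attribute '{attr}' for '{name}' in '{extension}'.")
        for c in val:
            if c not in LEGAL_CHARS[attr]:
                raise InspectorError(
                    f"Illegal character '{c}' in '{attr}' of '{name}' in '{extension}'."
                )


# The 'addi' definition from the README passes the checks.
ADDI = {
    "instr": "addi",
    "format": "instr rd, rs1, imm",
    "width": 32,
    "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
    "opcode": "0010011",
    "funct3": "000",
    "funct7": None,
}
inspect_instruction("MY_CUSTOM_EXTENSION", ADDI)
```

Keeping the rules in one dictionary means a new attribute needs one new entry rather than another `case` branch.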
|
rvasm-0.0.1a1/src/rvasm/library.py
ADDED
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
import json
|
|
2
|
+
from typing import TextIO
|
|
3
|
+
import importlib.resources as res
|
|
4
|
+
from pathlib import Path
|
|
5
|
+
|
|
6
|
+
class Library():
|
|
7
|
+
|
|
8
|
+
# At runtime, this will be populated with declared and included instructions
|
|
9
|
+
working_lib = []
|
|
10
|
+
|
|
11
|
+
# Declared ISAs
|
|
12
|
+
core_isas = {} # These ISAs are supported by default - simply include them to use
|
|
13
|
+
extra_isas = {} # Extra/custom ISAs must be declared before they can be included for use
|
|
14
|
+
|
|
15
|
+
def __init__(self):
|
|
16
|
+
|
|
17
|
+
# Declare the core ISAs (within json/ directory)
|
|
18
|
+
for file in res.files("rvasm.core_ext").iterdir(): # Iterate through every JSON file
|
|
19
|
+
if (file.name.endswith(".json")):
|
|
20
|
+
with file.open("r", encoding="utf-8-sig") as f:
|
|
21
|
+
json_data = json.load(f)
|
|
22
|
+
for isa_name, isa_data in json_data.items(): # Add the ISA data to the dictionary
|
|
23
|
+
self.core_isas[isa_name] = isa_data
|
|
24
|
+
|
|
25
|
+
class LibraryError(Exception):
|
|
26
|
+
def __init__(self, message: str):
|
|
27
|
+
super().__init__(message)
|
|
28
|
+
|
|
29
|
+
# Method to declare extra/custom ISA(s) using a JSON file
|
|
30
|
+
def DeclareFromJSONData(self, json_data: dict):
|
|
31
|
+
for isa_name, isa_data in json_data.items(): # Iterate through ISAs
|
|
32
|
+
self.extra_isas[isa_name] = isa_data # and add the data to the dictionary
|
|
33
|
+
|
|
34
|
+
# Check for duplicate instructions added to the working library
|
|
35
|
+
seen_instructions = []
|
|
36
|
+
for entry in self.working_lib:
|
|
37
|
+
if (entry["instr"] in seen_instructions):
|
|
38
|
+
raise self.LibraryError(f"Found multiple definitions for the same instruction ({entry['instr']}) when compiling the working library.")
|
|
39
|
+
seen_instructions.append(entry["instr"])
|
|
40
|
+
|
|
41
|
+
# Method to update the working library based on a given include list
|
|
42
|
+
def UpdateWorkingLibrary(self, include: list[str]):
|
|
43
|
+
for isa_to_include in include: # Iterate through every ISA in the include list
|
|
44
|
+
isa_found = False # Reset for each ISA so that every unknown name is reported
|
|
45
|
+
for collection in (self.core_isas, self.extra_isas): # Repeat for both core and extra ISA dictionaries
|
|
46
|
+
if (collection.get(isa_to_include)):
|
|
47
|
+
isa_found = True
|
|
48
|
+
for instruction in collection.get(isa_to_include):
|
|
49
|
+
instr_to_add = instruction.copy()
|
|
50
|
+
self.working_lib.append(instr_to_add) # Append the instruction to the working library
|
|
51
|
+
if (not isa_found):
|
|
52
|
+
raise self.LibraryError(f"Could not recognise ISA: {isa_to_include}")
|
|
53
|
+
|
|
54
|
+
# Method to lookup an instruction from the working library
|
|
55
|
+
def WorkingLibraryLookUp(self, search_term: str):
|
|
56
|
+
for instruction in self.working_lib:
|
|
57
|
+
if (instruction["instr"] == search_term):
|
|
58
|
+
return instruction
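As the diff reads, `working_lib`, `core_isas` and `extra_isas` appear to be declared at class level, and `UpdateWorkingLibrary` appends to `working_lib` without clearing it first. If so, state would be shared between `Library` objects and the list would grow on every update. The sketch below shows one possible alternative, assuming per-instance state and a full rebuild on each call; the snake_case names and the `CORE` sample data are hypothetical and not part of the package.

```python
class LibraryError(Exception):
    """Raised for unknown ISAs or clashing instruction names."""


class Library:
    def __init__(self, core_isas: dict):
        # Instance attributes: every Library owns its own dictionaries and list.
        self.core_isas = dict(core_isas)
        self.extra_isas = {}
        self.working_lib = []

    def declare(self, isa_data: dict) -> None:
        """Register extra/custom ISAs so that they can be included later."""
        self.extra_isas.update(isa_data)

    def update_working_library(self, include: list[str]) -> None:
        """Rebuild the working library from scratch for the given ISA names."""
        rebuilt, seen = [], set()
        for isa_name in include:
            found = False  # reset for every requested ISA
            for collection in (self.core_isas, self.extra_isas):
                for instruction in collection.get(isa_name, []):
                    found = True
                    if instruction["instr"] in seen:
                        raise LibraryError(
                            f"Multiple definitions of '{instruction['instr']}'."
                        )
                    seen.add(instruction["instr"])
                    rebuilt.append(dict(instruction))
            if not found:
                raise LibraryError(f"Could not recognise ISA: {isa_name}")
        self.working_lib = rebuilt  # replace the old list instead of appending to it

    def look_up(self, mnemonic: str):
        """Return the matching instruction record, or None if it is not included."""
        for instruction in self.working_lib:
            if instruction["instr"] == mnemonic:
                return instruction
        return None


# Hypothetical sample data standing in for the bundled JSON files.
CORE = {"RV32I": [{"instr": "addi"}, {"instr": "add"}]}
```

With this shape, calling `update_working_library` twice with the same list leaves the working library unchanged, and a failed update does not disturb the previous contents.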
|
rvasm-0.0.1a1/src/rvasm/processor.py
ADDED
|
@@ -0,0 +1,161 @@
|
|
|
1
|
+
from rvasm.tokeniser import Tokeniser
|
|
2
|
+
|
|
3
|
+
class Processor():
|
|
4
|
+
|
|
5
|
+
def __init__(self, library):
|
|
6
|
+
self.library = library # Shared Library object
|
|
7
|
+
self.instructions = [] # Store tokenised instructions
|
|
8
|
+
self.labels = [] # Store labels with a program index
|
|
9
|
+
self.index = 0 # Program index (program counter / 4)
|
|
10
|
+
self.tokeniser = Tokeniser(library) # Object responsible for tokenising instructions
|
|
11
|
+
|
|
12
|
+
class ProcessorError(Exception):
|
|
13
|
+
def __init__(self, message):
|
|
14
|
+
super().__init__(message)
|
|
15
|
+
|
|
16
|
+
# Method to reset the Processor, ready for another file to assemble
|
|
17
|
+
def Reset(self):
|
|
18
|
+
self.instructions = []
|
|
19
|
+
self.labels = []
|
|
20
|
+
|
|
21
|
+
# Method to process the next line of the assembly file
|
|
22
|
+
def ProcessLine(self, line: str):
|
|
23
|
+
|
|
24
|
+
line = line.rstrip() # Strip the newline character from the line
|
|
25
|
+
line = line.split("#")[0] # Ignore comments
|
|
26
|
+
line = line.strip() # Remove whitespace
|
|
27
|
+
|
|
28
|
+
# Ignore empty lines
|
|
29
|
+
if (not line):
|
|
30
|
+
return
|
|
31
|
+
|
|
32
|
+
# Parse labels
|
|
33
|
+
if (":" in line):
|
|
34
|
+
label_parts = line.split(":")
|
|
35
|
+
|
|
36
|
+
if (label_parts[1].rstrip()):
|
|
37
|
+
raise self.ProcessorError(f"Unexpected characters following label declaration: {line}")
|
|
38
|
+
|
|
39
|
+
label = {"name": label_parts[0], "index": self.index}
|
|
40
|
+
self.labels.append(label)
|
|
41
|
+
return
|
|
42
|
+
|
|
43
|
+
# Tokenise instructions
|
|
44
|
+
tokens = self.tokeniser.Tokenise(line)
|
|
45
|
+
|
|
46
|
+
# Look-up the library data for this instruction
|
|
47
|
+
lib_data = self.library.WorkingLibraryLookUp(tokens["instr"])
|
|
48
|
+
if (lib_data == None):
|
|
49
|
+
raise self.ProcessorError(f"Could not find instruction {tokens['instr']} in the working library.")
|
|
50
|
+
|
|
51
|
+
# Iterate through the tokens
|
|
52
|
+
for key, value in tokens.items():
|
|
53
|
+
|
|
54
|
+
# No need to check the instruction keyword; this must be correct
|
|
55
|
+
if (key == "instr"):
|
|
56
|
+
continue
|
|
57
|
+
|
|
58
|
+
# Ensure any x's are removed from register fields, then convert to an integer between 0-31
|
|
59
|
+
if (key in ("rd", "rs1", "rs2")):
|
|
60
|
+
tokens[key] = tokens[key].replace("x", "")
|
|
61
|
+
tokens[key] = int(tokens[key])
|
|
62
|
+
if (tokens[key] < 0 or tokens[key] > 31):
|
|
63
|
+
raise self.ProcessorError(f"Register {str(tokens[key])} outside of range 0-31.")
|
|
64
|
+
|
|
65
|
+
# Identify keys which have plain-text values (these might be labels)
|
|
66
|
+
if (value.isalpha()):
|
|
67
|
+
|
|
68
|
+
# TODO: some specific strings might be mappable to some other function
|
|
69
|
+
|
|
70
|
+
# Resolve unknown plain-text values to labels
|
|
71
|
+
for lbl in self.labels:
|
|
72
|
+
if (lbl["name"] == tokens[key]):
|
|
73
|
+
tokens[key] = lbl["index"] * int(lib_data["width"] / 8)
|
|
74
|
+
|
|
75
|
+
# Convert immediate values to integers (if they are not already)
|
|
76
|
+
if (key == "imm"):
|
|
77
|
+
tokens[key] = int(tokens[key])
|
|
78
|
+
|
|
79
|
+
# Finally, increment the index and append the instruction to the list of parsed instructions
|
|
80
|
+
self.index += 1
|
|
81
|
+
self.instructions.append(tokens)
|
|
82
|
+
|
|
83
|
+
# Method to generate the final machine code
|
|
84
|
+
def GenerateBinaries(self):
|
|
85
|
+
machine_code = []
|
|
86
|
+
|
|
87
|
+
# Iterate through each instruction
|
|
88
|
+
for line in self.instructions:
|
|
89
|
+
lib_data = self.library.WorkingLibraryLookUp(line["instr"]) # Fetch the relevant library data
|
|
90
|
+
binary = "" # Placeholder for binarised instruction
|
|
91
|
+
|
|
92
|
+
# Iterate through each field of the encoding
|
|
93
|
+
fields = [p.strip() for p in lib_data["encoding"].split("&")]
|
|
94
|
+
for field in fields:
|
|
95
|
+
|
|
96
|
+
# Parse fields which use bit-slicing
|
|
97
|
+
if ("[" in field and "]" in field and ":" in field):
|
|
98
|
+
fname, remainder = field.split("[", 1) # Get the name of the field
|
|
99
|
+
fupper, remainder = remainder.split(":", 1) # Get the upper bound
|
|
100
|
+
flower = remainder.split("]")[0] # Get the lower bound
|
|
101
|
+
(fupper, flower) = (int(fupper), int(flower)) # Convert bounds to integers
|
|
102
|
+
|
|
103
|
+
# Try to retrieve the information from the instruction
|
|
104
|
+
if (line.get(fname.strip()) is not None):
|
|
105
|
+
fdata = line.get(fname.strip())
|
|
106
|
+
|
|
107
|
+
# Failing that, try to retrieve the information from the library data
|
|
108
|
+
elif (lib_data.get(fname.strip()) is not None):
|
|
109
|
+
fdata = lib_data.get(fname.strip())
|
|
110
|
+
|
|
111
|
+
# If the information can't be found, raise an error
|
|
112
|
+
else:
|
|
113
|
+
raise self.ProcessorError(f"Couldn't find information about '{fname.strip()}' in the instruction or in the corresponding library data.")
|
|
114
|
+
|
|
115
|
+
# Add the bits to the binary string of the instruction
|
|
116
|
+
fvalue = format(fdata, "064b") if isinstance(fdata, int) else fdata
|
|
117
|
+
binary += fvalue[-1 - fupper : len(fvalue) - flower]
|
|
118
|
+
|
|
119
|
+
# Parse fields which index single bits
|
|
120
|
+
elif ("[" in field and "]" in field and not ":" in field):
|
|
121
|
+
fname, remainder = field.split("[", 1) # Get the name of the field
|
|
122
|
+
findex = int(remainder.split("]", 1)[0]) # Get the index of the bit of interest
|
|
123
|
+
|
|
124
|
+
# Try to retrieve the information from the instruction
|
|
125
|
+
if (line.get(fname.strip()) is not None):
|
|
126
|
+
fdata = line.get(fname.strip())
|
|
127
|
+
|
|
128
|
+
# Failing that, try to retrieve the information from the library data
|
|
129
|
+
elif (lib_data.get(fname.strip()) is not None):
|
|
130
|
+
fdata = lib_data.get(fname.strip())
|
|
131
|
+
|
|
132
|
+
# If the information can't be found, raise an error
|
|
133
|
+
else:
|
|
134
|
+
raise self.ProcessorError(f"Couldn't find information about '{fname.strip()}' in the instruction or in the corresponding library data.")
|
|
135
|
+
|
|
136
|
+
# Add the bits to the binary string of the instruction
|
|
137
|
+
fvalue = format(fdata, "064b") if isinstance(fdata, int) else fdata
|
|
138
|
+
binary += fvalue[-1 - findex]
|
|
139
|
+
|
|
140
|
+
# Where no bit-indexing or bit-slicing is required
|
|
141
|
+
elif (not "[" in field and not "]" in field and not ":" in field):
|
|
142
|
+
if (line.get(field.strip()) is not None):
|
|
143
|
+
binary += line.get(field.strip())
|
|
144
|
+
elif (lib_data.get(field.strip()) is not None):
|
|
145
|
+
binary += lib_data.get(field.strip())
|
|
146
|
+
else:
|
|
147
|
+
raise self.ProcessorError(f"Couldn't find information about '{field.strip()}' in the instruction or in the corresponding library data.")
|
|
148
|
+
|
|
149
|
+
# Encoding could not be interpreted
|
|
150
|
+
else:
|
|
151
|
+
raise self.ProcessorError(f"Invalid encoding syntax in JSON data: {field}")
|
|
152
|
+
|
|
153
|
+
# Check that the length of the generated binary is valid
|
|
154
|
+
if (len(binary) != lib_data.get("width")):
|
|
155
|
+
raise self.ProcessorError(f"Expected the width of the '{lib_data['instr']}' instruction to be {lib_data['width']}; got {len(binary)}.")
|
|
156
|
+
|
|
157
|
+
# Append the binary instruction to the list of machine code instructions
|
|
158
|
+
machine_code.append(binary)
|
|
159
|
+
|
|
160
|
+
return machine_code
|
|
161
|
+
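To make the `encoding` strings concrete, here is a minimal sketch of how a string such as `imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]` can be turned into a 32-bit word, following the slicing logic of `GenerateBinaries` above. The function names `field_bits` and `encode` are hypothetical. One deliberate difference is labelled in the code: integers are masked to 64 bits before formatting so that negative immediates come out in two's complement, which is an assumption of this sketch rather than something the diff does.

```python
def field_bits(value, upper: int, lower: int) -> str:
    """Return bits upper..lower (inclusive, bit 0 is least significant) as a string."""
    if isinstance(value, int):
        # Assumption of this sketch: mask to 64 bits so that negative
        # immediates are rendered in two's complement.
        bits = format(value & (2**64 - 1), "064b")
    else:
        bits = value  # already a binary string such as "0010011"
    return bits[len(bits) - 1 - upper : len(bits) - lower]


def encode(tokens: dict, lib_data: dict) -> str:
    """Concatenate the encoded fields of one instruction, most significant first."""
    binary = ""
    for field in (part.strip() for part in lib_data["encoding"].split("&")):
        name, _, bounds = field.partition("[")
        name = name.strip()
        # Operands come from the instruction; fixed fields come from the library.
        value = tokens.get(name)
        if value is None:
            value = lib_data.get(name)
        if value is None:
            raise KeyError(f"no value for field '{name}'")
        if not bounds:
            binary += value  # plain field: expected to be a binary string
        else:
            upper, _, lower = bounds.rstrip("]").partition(":")
            lower = lower or upper  # "name[k]" selects the single bit k
            binary += field_bits(value, int(upper), int(lower))
    if len(binary) != lib_data["width"]:
        raise ValueError(f"expected {lib_data['width']} bits, got {len(binary)}")
    return binary


# The 'addi' definition from the README.
ADDI = {
    "instr": "addi",
    "width": 32,
    "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
    "opcode": "0010011",
    "funct3": "000",
}

# addi x1, x2, 5
word = encode({"instr": "addi", "rd": 1, "rs1": 2, "imm": 5}, ADDI)
print(word)  # 00000000010100010000000010010011
```

Reading the result in groups of four bits gives `00510093`, the standard RV32I encoding of `addi x1, x2, 5`.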
|
rvasm-0.0.1a1/src/rvasm/rvasm.py
ADDED
|
@@ -0,0 +1,136 @@
|
|
|
1
|
+
import argparse
|
|
2
|
+
import json
|
|
3
|
+
from typing import TextIO
|
|
4
|
+
|
|
5
|
+
from .inspector import Inspector
|
|
6
|
+
from .library import Library
|
|
7
|
+
from .processor import Processor
|
|
8
|
+
|
|
9
|
+
def main():
|
|
10
|
+
|
|
11
|
+
# Handle arguments
|
|
12
|
+
parser = argparse.ArgumentParser(description="A Python-based, JSON-driven RISC-V assembler.")
|
|
13
|
+
parser.add_argument("input", help="input file path")
|
|
14
|
+
parser.add_argument("-o", "--output", help="output file path")
|
|
15
|
+
parser.add_argument("-f", "--format", help="output format (binary/hex)")
|
|
16
|
+
parser.add_argument("-i", "--include", nargs="*", help="specify any number of JSON files detailing extra/custom ISA(s)")
|
|
17
|
+
args = parser.parse_args()
|
|
18
|
+
|
|
19
|
+
# Create a new RVAsm object per command
|
|
20
|
+
rvasm = RVAsm()
|
|
21
|
+
|
|
22
|
+
with open(args.input, "r", encoding="utf-8") as asm_file:
|
|
23
|
+
|
|
24
|
+
# Placeholder variables to pass to rvasm object
|
|
25
|
+
OUTPUT = None
|
|
26
|
+
OUTPUT_FORMAT = None
|
|
27
|
+
INCLUDE = None
|
|
28
|
+
|
|
29
|
+
# Overwrite placeholders with any arguments (if present)
|
|
30
|
+
if (args.output):
|
|
31
|
+
OUTPUT = args.output
|
|
32
|
+
if (args.format):
|
|
33
|
+
OUTPUT_FORMAT = args.format
|
|
34
|
+
if (args.include):
|
|
35
|
+
INCLUDE = args.include
|
|
36
|
+
|
|
37
|
+
# Include ISAs
|
|
38
|
+
if (INCLUDE and len(INCLUDE) > 0):
|
|
39
|
+
for json_path in INCLUDE:
|
|
40
|
+
with open(json_path, "r") as json_file:
|
|
41
|
+
rvasm.IncludeFromJSON(json_file)
|
|
42
|
+
|
|
43
|
+
# Go!
|
|
44
|
+
rvasm.Assemble(asm_file, output=OUTPUT, output_format=OUTPUT_FORMAT)
|
|
45
|
+
|
|
46
|
+
class RVAsm():
|
|
47
|
+
|
|
48
|
+
def __init__(self):
|
|
49
|
+
|
|
50
|
+
# Verify the RVAsm setup using the Inspector class
|
|
51
|
+
try:
|
|
52
|
+
self.inspector = Inspector()
|
|
53
|
+
except Exception as e:
|
|
54
|
+
print("Failed to create the RVAsm object; setup could not be verified.")
|
|
55
|
+
print(f"{e}")
|
|
56
|
+
exit()
|
|
57
|
+
self.library = Library() # Create a new Library object
|
|
58
|
+
|
|
59
|
+
self.default_includes = ["RV32I"] # Specify ISAs to include by default
|
|
60
|
+
self.user_includes = [] # Placeholder for user-specified inclusions
|
|
61
|
+
self._UpdateWorkingLibrary(self.default_includes + self.user_includes) # Update and compile the working library based on the include list
|
|
62
|
+
|
|
63
|
+
self.processor = Processor(self.library) # Create a Processor object with the shared library
|
|
64
|
+
self.bin = None # Placeholder for the assembled machine code
|
|
65
|
+
|
|
66
|
+
class RVAsmError(Exception):
|
|
67
|
+
def __init__(self, message: str):
|
|
68
|
+
super().__init__(message)
|
|
69
|
+
|
|
70
|
+
# Method to assemble a '.asm' file
|
|
71
|
+
def Assemble(self, file: TextIO, output:str=None, output_format:str=None):
|
|
72
|
+
|
|
73
|
+
# Internally set default arguments (easier for argparse)
|
|
74
|
+
if (not output):
|
|
75
|
+
output = "out.mem"
|
|
76
|
+
if (not output_format):
|
|
77
|
+
output_format = "hex"
|
|
78
|
+
|
|
79
|
+
# Reset the processor
|
|
80
|
+
self.processor.Reset()
|
|
81
|
+
|
|
82
|
+
# Process each line of the .asm file, catching any exceptions and reporting as debug information
|
|
83
|
+
for i, line in enumerate(file):
|
|
84
|
+
try:
|
|
85
|
+
self.processor.ProcessLine(line)
|
|
86
|
+
except Exception as e:
|
|
87
|
+
print(f"\n{type(e).__name__} caused the following line to fail:")
|
|
88
|
+
print(line.rstrip("\n"))
|
|
89
|
+
print(f"{e}")
|
|
90
|
+
exit()
|
|
91
|
+
|
|
92
|
+
self.bin = self.processor.GenerateBinaries() # Create the machine code
|
|
93
|
+
self._WriteOutput( # Output the file to the current directory
|
|
94
|
+
filename=output,
|
|
95
|
+
output_format=output_format
|
|
96
|
+
)
|
|
97
|
+
|
|
98
|
+
# Method to reset the assembler
|
|
99
|
+
def Reset(self):
|
|
100
|
+
self.processor.Reset()
|
|
101
|
+
self.user_includes = []
|
|
102
|
+
self._UpdateWorkingLibrary(self.default_includes + self.user_includes)
|
|
103
|
+
|
|
104
|
+
# Method to include ISA data from a JSON file
|
|
105
|
+
def IncludeFromJSON(self, json_file: TextIO):
|
|
106
|
+
json_data = json.load(json_file)
|
|
107
|
+
self.library.DeclareFromJSONData(json_data)
|
|
108
|
+
|
|
109
|
+
# Update the include list
|
|
110
|
+
for isa_name, isa_data in json_data.items():
|
|
111
|
+
if not (isa_name in self.user_includes):
|
|
112
|
+
self._IncludeISA(isa_name)
|
|
113
|
+
|
|
114
|
+
# Method to include an ISA of a particular name for use
|
|
115
|
+
def _IncludeISA(self, name):
|
|
116
|
+
if not (name in self.user_includes): # Avoid double-inclusions
|
|
117
|
+
self.user_includes.append(name) # Append the name of the ISA to the user includes
|
|
118
|
+
self._UpdateWorkingLibrary(self.default_includes + self.user_includes) # Update the working library
|
|
119
|
+
|
|
120
|
+
# Method to update the working library following changes to the include list
|
|
121
|
+
def _UpdateWorkingLibrary(self, total_include_list: list[str]):
|
|
122
|
+
self.library.UpdateWorkingLibrary(total_include_list)
|
|
123
|
+
|
|
124
|
+
# Method to write the machine code to an output file
|
|
125
|
+
def _WriteOutput(self, filename="out.mem", output_format="hex"):
|
|
126
|
+
write_content = self.bin
|
|
127
|
+
|
|
128
|
+
with open(filename, "w") as f:
|
|
129
|
+
for line in write_content:
|
|
130
|
+
|
|
131
|
+
# If "hex" is selected as the format, convert each line to a hexadecimal number
|
|
132
|
+
if (output_format.lower() == "hex"):
|
|
133
|
+
line = format(int(line, 2), "0" + str(int(len(line) / 4)) + "x")
|
|
134
|
+
|
|
135
|
+
# Write each line to the output file
|
|
136
|
+
f.write(line + "\n")
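The hexadecimal conversion in `_WriteOutput` pads each word to one hex digit per four bits. A minimal sketch of that step in isolation follows; `render_word` is a hypothetical helper name, and, as in the diff, any format other than `hex` leaves the binary string unchanged.

```python
def render_word(binary: str, output_format: str = "hex") -> str:
    """Render one assembled instruction as a line of the output file."""
    if output_format.lower() == "hex":
        digits = len(binary) // 4  # one hex digit per four bits
        return format(int(binary, 2), f"0{digits}x")
    return binary  # any other format keeps the binary string as it is


print(render_word("00000000010100010000000010010011"))  # 00510093
```

The zero padding matters: without it, a word with leading zero bits would be written with fewer than eight hex digits.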
|
rvasm-0.0.1a1/src/rvasm/tokeniser.py
ADDED
|
@@ -0,0 +1,41 @@
+import re
+
+from rvasm.library import Library
+
+class Tokeniser():
+
+    def __init__(self, library):
+        self.library = library
+
+    class TokeniserError(Exception):
+        def __init__(self, message: str):
+            super().__init__(message)
+
+    # Method to tokenise an instruction
+    def Tokenise(self, line: str):
+        tokenised_instruction = {}
+        library_data = None
+        instr = None
+
+        # Split the instruction into parts
+        parts = re.split(r"[,()\s]+", line) # Split for whitespace, commas and brackets
+        parts = [p.strip() for p in parts if p.strip()] # Prune separators and useless parts
+        instr = parts[0] # Identify the instruction keyword
+
+        # Firstly, retrieve the data about the instruction from the library
+        library_data = self.library.WorkingLibraryLookUp(instr)
+
+        # Throw an error if the instruction can't be found in the working library
+        if library_data is None:
+            raise self.TokeniserError(f"Unknown instruction {instr} which cannot be found in the working library.")
+
+        # Tokenise the format string
+        fparts = re.split(r"[,()\s]+", library_data["format"]) # Split for whitespace, commas and brackets
+        fparts = [fp.strip() for fp in fparts if fp.strip()] # Prune separators and useless parts
+
+        # Match together each field with the corresponding value in the written instruction
+        for i, fp in enumerate(fparts):
+            tokenised_instruction[fp] = parts[i].lower()
+
+        return tokenised_instruction
+
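To illustrate how `Tokenise` pairs format fields with operands, here is a self-contained sketch of the same splitting and matching. It is not the package's code: the `FORMATS` dict is a hypothetical stand-in for `Library.WorkingLibraryLookUp`, the `lw` format string is an assumed example, and the empty-line and operand-count checks are additions that the original method does not perform.

```python
import re

# Hypothetical stand-in for the working library: mnemonic -> format string.
FORMATS = {
    "addi": "instr rd, rs1, imm",
    "lw": "instr rd, imm(rs1)",  # assumed format, for illustration only
}

# Same separator set as the tokeniser: commas, brackets and whitespace.
SEPARATORS = re.compile(r"[,()\s]+")


def tokenise(line: str) -> dict[str, str]:
    """Map each field named in the format string to the operand written in `line`."""
    parts = [p for p in SEPARATORS.split(line) if p]
    if not parts:
        raise ValueError("empty instruction line")

    fmt = FORMATS.get(parts[0])
    if fmt is None:
        raise ValueError(f"unknown instruction {parts[0]!r}")

    fields = [f for f in SEPARATORS.split(fmt) if f]
    if len(parts) != len(fields):
        raise ValueError(f"{parts[0]} expects {len(fields) - 1} operands, got {len(parts) - 1}")

    # Pair fields and operands positionally, lower-casing the operands.
    return {field: part.lower() for field, part in zip(fields, parts)}


if __name__ == "__main__":
    print(tokenise("addi X1, x2, 5"))
    print(tokenise("lw x5, 8(x6)"))
```

Because brackets are treated as separators, `8(x6)` splits into `8` and `x6`, which line up with `imm` and `rs1` in the format string. The explicit length check turns a short operand list into a readable error instead of an `IndexError`.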
@@ -0,0 +1,80 @@
+Metadata-Version: 2.4
+Name: rvasm
+Version: 0.0.1a1
+Summary: A simple RISC-V assembler written in Python
+Author-email: Will Arden <will.ardxn@gmail.com>
+License: Copyright © 2026 Will Arden
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+Project-URL: Homepage, https://github.com/will-arden/rvasm
+Project-URL: Source, https://github.com/will-arden/rvasm
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Dynamic: license-file
+
+# rvasm: A Python-Based RISC-V Assembler
+**rvasm** aims to provide a simple and easy-to-use RISC-V assembler, as both a **command-line tool** and a **Python package**. **rvasm** makes it easy to define and use **custom instructions**, simply by providing a JSON file.
+
+## Getting started
+You can install **rvasm** using `pip`, as follows:
+
+> `pip install rvasm`
+
+**rvasm** can be used either as a **command-line tool** or as a **Python package** imported into your own program.
+
+The command-line tool can be used to assemble an RV32I text file, as demonstrated below:
+> `rvasm my_file.asm`
+
+This will produce an output file `out.mem`, containing your assembled RV32I machine code in hexadecimal. For help on other command-line options, use:
+> `rvasm --help`
+
+Here is an example of how you can use **rvasm** in a Python project:
+>```python
+>import rvasm
+>my_assembler = rvasm.RVAsm() # Create an Assembler object
+>with open("my_input_file.asm", "r") as f: # Open the assembly file
+>    my_assembler.Assemble(f) # Generate the machine code
+>```
+
+## Assembling custom instructions
+**rvasm** makes assembling custom instructions straightforward.
+
+Firstly, create a **JSON file** detailing your custom instructions ([use this reference](https://github.com/will-arden/rvasm/tree/main/src/rvasm/json/RV32I.json)). You can specify one or more RISC-V extensions in the same file, or use multiple files. For example:
+> ```json
+> {
+>     "MY_CUSTOM_EXTENSION": [
+>         {
+>             "instr": "addi",
+>             "format": "instr rd, rs1, imm",
+>             "width": 32,
+>             "encoding": "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]",
+>             "opcode": "0010011",
+>             "funct3": "000",
+>             "funct7": null
+>         }
+>     ]
+> }
+> ```
+
+To include the new extension(s) from the command line, use the `--include` option, as below:
+> `rvasm my_file.asm --include my_custom_extension.json`
+
+The following example shows how you can use **rvasm** to assemble custom RISC-V instructions in your Python code:
+> ```python
+> import rvasm
+> asm = rvasm.RVAsm()
+> with open("my_custom_extension.json", "r") as f:
+>     asm.IncludeFromJSON(f)
+> with open("my_file.asm", "r") as f:
+>     asm.Assemble(f)
+> ```
+
+This project is a work-in-progress, so please keep checking in! Feel free to create issues and suggest improvements.
+
+## License
+This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
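The `encoding` field in the README's JSON example reads as a concatenation of bit slices, most significant first. The sketch below shows one way such a string could be expanded into an instruction word. It is an assumption about the field's semantics, not rvasm's implementation in `processor.py`; the function `encode` and its integer-valued `fields` argument are hypothetical (the JSON itself stores `opcode` and `funct3` as bit strings).

```python
import re

# One term of the encoding string, e.g. "imm[11:0]".
SLICE = re.compile(r"^(\w+)\[(\d+):(\d+)\]$")

ADDI_ENCODING = "imm[11:0] & rs1[4:0] & funct3[2:0] & rd[4:0] & opcode[6:0]"


def encode(encoding: str, fields: dict[str, int], width: int = 32) -> str:
    """Concatenate the named bit slices, most significant first, into a bit string."""
    bits = ""
    for term in encoding.split("&"):
        match = SLICE.match(term.strip())
        if match is None:
            raise ValueError(f"malformed slice {term!r}")
        name, hi, lo = match.group(1), int(match.group(2)), int(match.group(3))
        if hi < lo:
            raise ValueError(f"slice bounds reversed in {term!r}")
        nbits = hi - lo + 1
        # Shifting then masking also yields two's complement bits for negative values.
        value = (fields[name] >> lo) & ((1 << nbits) - 1)
        bits += format(value, f"0{nbits}b")
    if len(bits) != width:
        raise ValueError(f"encoding produced {len(bits)} bits, expected {width}")
    return bits


if __name__ == "__main__":
    # addi x1, x2, 5
    word = encode(ADDI_ENCODING, {"imm": 5, "rs1": 2, "funct3": 0, "rd": 1, "opcode": 0b0010011})
    print(word)                            # prints 00000000010100010000000010010011
    print(format(int(word, 2), "08x"))     # prints 00510093
```

Checking the total width against the `width` value from the JSON catches an encoding string whose slices do not add up to a full instruction.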
@@ -0,0 +1,14 @@
+LICENSE
+README.md
+pyproject.toml
+src/rvasm/__init__.py
+src/rvasm/inspector.py
+src/rvasm/library.py
+src/rvasm/processor.py
+src/rvasm/rvasm.py
+src/rvasm/tokeniser.py
+src/rvasm.egg-info/PKG-INFO
+src/rvasm.egg-info/SOURCES.txt
+src/rvasm.egg-info/dependency_links.txt
+src/rvasm.egg-info/entry_points.txt
+src/rvasm.egg-info/top_level.txt
@@ -0,0 +1 @@
+
@@ -0,0 +1 @@
+rvasm