Tooling for the Tiny-T
When we completed the console in the last installment, I said I was unsure what I would cover next. I really want to begin implementing our audio device, but adding a GUI for the Tiny-T system felt like a much more achievable target in the limited time I have. However, before we can create a GUI for the Tiny-T, we need an assembler, loader, and disassembler for our new CPU. The GUI’s main window will provide an assembly listing of our program and display a range of memory addresses and their contents, along with the CPU status flags and registers. The GUI will give the user a platform to write a program, assemble it, and then step through it or run it to completion. Another option I want to add is the ability to load a binary file and display its corresponding assembly listing. This is why we need a toolset before we can create a GUI.
Our goal today is to build our toolset, beginning with our assembler. The assembler for the Tiny-T is very similar to the one we created back in part 5 for the Tiny-P. However, our instructions have changed as well as their encoding. Because of this, we will need to rewrite parts of our assembler. Since we covered the workings of the assembler back in part 5, I’m just going to present the code and then discuss the changes from the Tiny-P’s assembler. Here’s the code for the Tiny-T’s assembler:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
# File: assembler.py
"""
Tiny-T Assembler
...
"""
import getopt
import sys

# Opcode table relates mnemonics
# to the corresponding opcode value.
OPCODE_TABLE = {
    'htl': 0x0, 'lda': 0x1, 'sta': 0x2, 'add': 0x3,
    'sub': 0x4, 'and': 0x5, 'or': 0x6, 'xor': 0x7,
    'not': 0x8, 'shl': 0x9, 'shr': 0xA, 'bra': 0xB,
    'brp': 0xC, 'brz': 0xD, 'inp': 0xE, 'out': 0xF
}


class Lexer:
    def __init__(self):
        self.line = None
        self.tokens = []

    def set_text(self, line: str):
        self.line = line
        self.tokens = line.split()

    def next_token(self):
        if not self.tokens:
            return None
        tok = self.tokens.pop(0)
        return tok


class Assembler:
    def __init__(self, lexer: Lexer, _text: str):
        self.text = _text
        self.lines = self.text.split('\n')
        self.current_address = 0
        self.opcode = 0
        self.operand = 0
        self.lexer = lexer
        self.symbol_table = {}
        self.code = []

    def skip_spaces(self, tok: str):
        while tok.isspace():
            tok = self.lexer.next_token()

    def skip_comment(self, tok: str):
        # Consume every remaining token on the line.
        if tok == '#':
            while tok:
                tok = self.lexer.next_token()

    def is_hex(self, tok: str) -> bool:
        if tok.startswith('0x') or tok.startswith('0X'):
            try:
                int(tok[2:], 16)
            except ValueError:
                return False
            return True
        return False

    def from_hex(self, tok: str) -> str:
        if self.is_hex(tok):
            val = str(int(tok[2:], 16))
            return val
        msg = f"Can not convert {tok} to integer value"
        raise ValueError(msg)

    def fixup(self):
        # Second pass: resolve symbols and pack each instruction word.
        text_ = ''
        for line in self.code:
            parts = line.split(':')
            addr = parts[0]
            sub_parts = parts[1].split('-')
            opcode = sub_parts[0]
            operand = sub_parts[1]
            if operand.isalnum() and not operand.isnumeric() and not self.is_hex(operand):
                if operand in self.symbol_table:
                    operand = self.symbol_table[operand]
                else:
                    msg = f"Undefined Symbol: {operand}"
                    raise ValueError(msg)
            elif self.is_hex(operand):
                operand = self.from_hex(operand)
            bin_code = (int(opcode) << 12) + int(operand)
            if bin_code > 0xFFFF:
                raise ValueError(f"Illegal Machine Code Value {bin_code}")
            code_line = f'{addr.zfill(4)} {bin_code}\n'
            text_ += code_line
        return text_

    def parse(self):
        for line in self.lines:
            line = line.lower()
            self.opcode = 0
            self.operand = 0
            self.lexer.set_text(line)
            tok = self.lexer.next_token()
            code_text = ''
            while tok is not None:
                self.skip_spaces(tok)
                if tok is None or not tok:
                    break
                elif tok.endswith(':'):  # LABEL_DECL
                    key = tok[:-1]
                    self.symbol_table[key] = self.current_address
                elif tok == '#':  # COMMENT
                    self.skip_comment(tok)
                    break
                elif tok.endswith('.'):  # DIRECTIVE
                    if tok[:-1] == 'org':
                        operand = self.lexer.next_token()
                        if operand.isnumeric():
                            self.current_address = int(operand)
                        elif self.is_hex(operand):
                            # Store the converted origin (the value was
                            # previously computed but never assigned).
                            self.current_address = int(self.from_hex(operand))
                        else:
                            msg = f'Illegal Origin. Expected: integer, Found {operand}'
                            raise ValueError(msg)
                    break
                elif tok in OPCODE_TABLE.keys():  # INSTRUCTION
                    self.opcode = OPCODE_TABLE[tok]
                    operand = self.lexer.next_token()
                    if operand is None:
                        # No operand on the line: encode 0, as is
                        # done when a comment follows the mnemonic.
                        self.operand = 0
                    elif operand.isnumeric():
                        self.operand = operand
                    elif self.is_hex(operand):
                        self.operand = self.from_hex(operand)
                    elif operand.isalnum():
                        if operand in self.symbol_table:
                            self.operand = self.symbol_table[operand]
                        else:
                            # Forward reference; resolved in fixup().
                            self.operand = operand
                    elif operand.startswith('#'):
                        self.operand = 0
                        self.skip_comment(operand)
                    self.code.append(f"{self.current_address} : {self.opcode}-{self.operand}")
                    self.current_address += 1
                tok = self.lexer.next_token()
        code_text = self.fixup()
        return code_text


def main(argv):
    inputfile = ''
    outputfile = ''
    usage_message = "Usage: assembler.py -i <inputfile> -o <outputfile>"
    try:
        # "o:" (letter o) so that -o <outputfile> is accepted.
        opts, args = getopt.getopt(argv, "hi:o:", ["help", "ifile=", "ofile="])
    except getopt.GetoptError:
        print(usage_message)
        sys.exit(2)
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            print(usage_message)
            sys.exit()
        elif opt in ('-i', '--ifile'):
            inputfile = arg
        elif opt in ('-o', '--ofile'):
            outputfile = arg
    if not inputfile:
        print(usage_message)
        sys.exit(2)
    # If only input file given default output file to <inputfile>.bin
    if inputfile and not outputfile:
        outputfile = inputfile.split('.')[0] + '.bin'
    with open(inputfile, 'r') as ifh:
        program_text = ifh.read()
    # Assemble program
    assembler = Assembler(Lexer(), program_text)
    machine_text = assembler.parse()
    # Write output file
    if machine_text:
        with open(outputfile, 'w') as ofh:
            ofh.write(machine_text)
    else:
        msg = f'Unable to assemble output file {inputfile}'
        raise AssertionError(msg)
    # Exit message
    print(f"Assembled: {inputfile} and wrote machine code to {outputfile}")


if __name__ == '__main__':
    main(sys.argv[1:])
The first thing you will notice is that I have removed the MNEMONICS list. This was not needed as we can simply use the OPCODE_TABLE keys. In addition, the OPCODE_TABLE’s contents had to change to support our new instruction set.
Instead of creating the lexer in the assembler class, I did a little dependency injection and passed a fully instantiated lexer into the assembler’s __init__() method.
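To illustrate why injecting the lexer is handy, here is a minimal sketch. The Lexer mirrors the interface used in the assembler, while UppercaseLexer and TokenCollector are hypothetical names invented for this example and are not part of the Tiny-T code. The point is only that any object with set_text() and next_token() can be handed to the consumer:

```python
class Lexer:
    """Whitespace tokenizer with the same interface as the assembler's Lexer."""

    def __init__(self):
        self.tokens = []

    def set_text(self, line: str):
        self.tokens = line.split()

    def next_token(self):
        return self.tokens.pop(0) if self.tokens else None


class UppercaseLexer(Lexer):
    """Hypothetical variant: returns every token in upper case."""

    def next_token(self):
        tok = super().next_token()
        return tok.upper() if tok is not None else None


class TokenCollector:
    """Hypothetical consumer that, like Assembler, is handed its lexer."""

    def __init__(self, lexer: Lexer):
        self.lexer = lexer  # injected, not created here

    def tokens_in(self, line: str) -> list:
        self.lexer.set_text(line)
        found = []
        tok = self.lexer.next_token()
        while tok is not None:
            found.append(tok)
            tok = self.lexer.next_token()
        return found


print(TokenCollector(Lexer()).tokens_in("lda 0x10"))           # ['lda', '0x10']
print(TokenCollector(UppercaseLexer()).tokens_in("lda 0x10"))  # ['LDA', '0X10']
```

Because the consumer never constructs its own lexer, swapping in a different tokenizer (or a test double) requires no change to the consumer.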
Since our new assembly language allows hexadecimal values, we need two additional methods: is_hex() and from_hex(). The first returns True if the string passed in represents a hexadecimal value with a 0x or 0X prefix. The second converts such a hex string to the equivalent decimal string.
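The behaviour of the two helpers is easy to check in isolation. The sketch below rewrites them as free functions (an assumption made only so the example is self-contained; in the assembler they are methods) and runs them over a few sample tokens:

```python
def is_hex(tok: str) -> bool:
    """True if tok is a 0x/0X-prefixed hexadecimal literal."""
    if tok.startswith(('0x', '0X')):
        try:
            int(tok[2:], 16)
        except ValueError:
            return False
        return True
    return False


def from_hex(tok: str) -> str:
    """Convert a hex literal such as '0x0FE' to its decimal string."""
    if is_hex(tok):
        return str(int(tok[2:], 16))
    raise ValueError(f"Can not convert {tok} to integer value")


for sample in ("0x0FE", "0XFF", "254", "0xZZ"):
    print(f"{sample!r:8} is_hex -> {is_hex(sample)}")

print(from_hex("0x0FE"))  # 254
```

Note that a plain decimal token such as "254" is not treated as hex, so the assembler can still tell the two number formats apart by the prefix alone.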
Under Python 3.11 my f-strings weren’t working inside exceptions, so I moved the message composition outside the exception calls and assigned the messages to a variable. I’ll figure this out and fix it later.
Inside fixup() we need to make a few changes. The first if statement inside the for loop needs to have a new condition added to it. Change this line:
if operand.isalnum() and not operand.isnumeric():
to this:
if operand.isalnum() and not operand.isnumeric() and not self.is_hex(operand):
Then we need to add a new elif branch:
...
elif self.is_hex(operand):
    operand = self.from_hex(operand)
...
In the line that assembles our opcode and operand into a single value and stores it in bin_code, we originally multiplied our opcode value by 100 to shift its position left two decimal digits. In our new assembler, we are dealing with a 16-bit value and need to shift our opcode left three hexadecimal digits, or 12 bits. So, change:
... bin_code = (int(opcode) * 100) + int(operand) ...
to:
... bin_code = (int(opcode) << 12) + int(operand) ...
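As a quick worked example of the 12-bit shift, the sketch below packs an opcode and operand into one 16-bit word and unpacks it again. The function names encode() and decode() and the explicit range checks are assumptions for illustration; the assembler itself only checks that the final value fits in 16 bits:

```python
def encode(opcode: int, operand: int) -> int:
    """Pack a 4-bit opcode and a 12-bit operand into a 16-bit word."""
    if not 0 <= opcode <= 0xF:
        raise ValueError(f"opcode out of range: {opcode}")
    if not 0 <= operand <= 0xFFF:
        raise ValueError(f"operand out of range: {operand}")
    return (opcode << 12) + operand


def decode(word: int) -> tuple:
    """Split a 16-bit word back into (opcode, operand)."""
    return (word >> 12) & 0xF, word & 0x0FFF


word = encode(0xE, 0x0FE)          # INP 0x0FE
print(word, hex(word))             # 57598 0xe0fe
print(decode(word))                # (14, 254)
```

The top hexadecimal digit of the word is the opcode and the lower three digits are the operand, which makes assembled words easy to read when printed in hex.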
Next, to allow hexadecimal values in our assembler directives (ORG.), we need to parse them. This means making calls to is_hex() and from_hex() inside the directive branch of our parsing code.
When we create our assembler instance inside main(), we need to first create an instance of our Lexer() to pass to the Assembler() which now uses dependency injection.
Lastly, we should add some error testing and handling before we attempt to write out our outputfile.
Hopefully, I haven’t missed anything. You can pull the new code from the repo and diff the files to be sure.
Testing the Assembler
With the changes made to our new assembler, we are now ready to test it. We won’t write a full test suite but we will write a simple four-line assembly program, assemble it, and run the binary.
To begin, create a new sub-directory in the part-9 folder named asm, and save the following program in it as echo.asm:
        ORG. 0x0000
start:  INP 0x0FE    # Read console input     0xE0FE
        OUT 0x0FF    # Write back to display  0xF0FF
        BRA start    # Loop                   0xB000
As you can see, our little program simply echoes anything we type into the console back out to the console.
Now assemble the program:
> python3 assembler.py -i asm/echo.asm
This should create a new file, “asm/echo.bin” with the following contents:
000 57598
001 61695
002 45056
This is the machine code for our program. If your file doesn’t contain exactly this, then you need to check your assembler. Don’t move on until you have the assembler working correctly.
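Because the file stores each instruction word in decimal, it can be hard to tell at a glance whether the values are right. A small helper like the one below (a sketch; parse_bin() is a hypothetical name, not part of the toolset) reads the text and shows each word in hexadecimal so it can be compared with the comments in echo.asm:

```python
# Contents expected in asm/echo.bin, as listed above.
EXPECTED = "000 57598\n001 61695\n002 45056\n"


def parse_bin(text: str) -> list:
    """Return (address, word) pairs from '<address> <word>' lines."""
    words = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            words.append((int(fields[0]), int(fields[1])))
    return words


for addr, word in parse_bin(EXPECTED):
    print(f"{addr:03d}  {word:5d}  0x{word:04X}")
# 000  57598  0xE0FE
# 001  61695  0xF0FF
# 002  45056  0xB000
```

To check a real file, read its contents with open(...).read() and pass the text to parse_bin() in place of EXPECTED.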
The Loader
The loader is the next tool in our small arsenal. As you recall from Part-6, our loader is responsible for reading our machine code file and loading its contents into memory. Because we covered loaders in Part-6, I won’t cover them again here. I’ll just show the code and let you diff it with the loader from Part-6.
Most of the changes have to do with the Tiny-P having a program() method while the Tiny-T uses a write() method. The main() function had to change considerably due to the fact that the CPU is no longer a stand-alone device. Here’s the code:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Tiny-T CPU Simulator.
...
"""
# Tiny-T Machine Code Loader
# Assumes machine code is stored in
# a *.bin file and is formatted as:
# <address> <opcode>
# Where the address is a zero-padded decimal
# value and the opcode is the 16-bit
# instruction word written in decimal.
import sys
import getopt

from cpu import CPU
from bus import Bus
from memory import Memory
from console import Console


class Loader:
    def __init__(self, cpu: CPU, code_text: str):
        self.machine_code = code_text
        self.code = self.machine_code.split('\n')
        self.cpu = cpu

    def load(self):
        # Write each "<address> <opcode>" pair through the CPU.
        for line in self.code:
            code = line.split()
            if len(code) == 2:
                addr = int(code[0])
                opcode = int(code[1])
                self.cpu.write(addr, opcode)


def dump(cpu: CPU):
    print(f"ACC: {cpu.accumulator}, PC: {cpu.program_counter}, Z: {cpu.z_flag}, P: {cpu.p_flag}")
    print('\n')


def dump_mem(mem: list):
    # Print memory contents, 16 values per row.
    for i, data in enumerate(mem):
        if i % 16 == 0:
            print(f"\n{i} : ", end='')
        print(f" {data}, ", end='')
    print()


def main(argv):
    inputfile = ''
    usage_message = "Usage: loader.py -i <inputfile> "
    try:
        opts, args = getopt.getopt(argv, "hi:", ["help", "ifile="])
    except getopt.GetoptError:
        print(usage_message)
        sys.exit(2)
    for opt, arg in opts:
        if opt in ('-h', '--help'):
            print(usage_message)
            sys.exit()
        elif opt in ('-i', '--ifile'):
            inputfile = arg
    if not inputfile:
        print(usage_message)
        sys.exit(2)
    with open(inputfile, 'r') as ifh:
        program_text = ifh.read()
    # Build up Computer System
    ram = Memory(64, 16)
    con = Console()
    bus = Bus()
    bus.register_handler(ram)
    bus.register_handler(con)
    cpu = CPU(bus)
    # Load Program
    loader = Loader(cpu, program_text)
    loader.load()
    # Exit message
    print(f"Loader: {inputfile} loaded in to cpu.")
    print("Ready to run!")
    # Run the program
    cpu.run()


if __name__ == "__main__":
    main(sys.argv[1:])
As you can see there isn’t much difference between the loader presented in Part-6 and the one presented above.
You should be able to run the loader, passing it your echo.bin file with the -i option, and get an operating Tiny-T system waiting for your input.
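If you want to check the loader’s parsing without wiring up the bus, memory, and console, you can hand it a stand-in for the CPU. In the sketch below, RecordingCPU is a hypothetical test double that only models write(), and the Loader is a trimmed copy of the class above with the type hint on cpu dropped so the example has no imports:

```python
class RecordingCPU:
    """Hypothetical stand-in for the CPU; it only records write() calls."""

    def __init__(self):
        self.memory = {}

    def write(self, addr: int, value: int):
        self.memory[addr] = value


class Loader:
    """Trimmed copy of the loader class, for illustration only."""

    def __init__(self, cpu, code_text: str):
        self.code = code_text.split('\n')
        self.cpu = cpu

    def load(self):
        for line in self.code:
            fields = line.split()
            if len(fields) == 2:  # skip blank or malformed lines
                self.cpu.write(int(fields[0]), int(fields[1]))


cpu = RecordingCPU()
Loader(cpu, "000 57598\n001 61695\n002 45056\n").load()
print(cpu.memory)  # {0: 57598, 1: 61695, 2: 45056}
```

The zero-padded addresses convert cleanly because int("000") is 0, and the trailing empty line produced by the final newline is skipped by the length check.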
Disassembler
In preparation for our GUI, I want to add a disassembler to our arsenal of tools. What is a disassembler? It’s a program that takes in machine code and spits out assembly code.
Our disassembler will take each word of machine code and split it into its opcode and operand. Then we only need to look up the opcode value in a table to find the mnemonic. Most of the disassembler’s code deals with handling command-line arguments; the meat of it is contained in two static methods. The disasm() method takes in the program text read from the input file in main() and splits it into lines. It then walks over each line, splitting it into its address and instruction components. Next, it calls the static method decode() to convert the instruction code into its mnemonic and operand, format them as a line of text, and return that text to disasm(). The disasm() method collects these decoded lines into the asm_text variable and returns this text to main(). The main() function then writes this text out to our output file.
That was easy, right? A disassembler in about 20 lines of code. The rest of the program is just file handling. Since the command-line option handling is the same as for our assemblers presented earlier, I won’t discuss this part of the program.
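For readers who want something to compare their own attempt against, here is a minimal sketch of the two static methods as described above. It is not the code from the repo: the class name, the output format (address, upper-case mnemonic, hex operand), and the inverted lookup table are all assumptions. The opcode values are copied from the assembler’s OPCODE_TABLE:

```python
OPCODE_TABLE = {
    'htl': 0x0, 'lda': 0x1, 'sta': 0x2, 'add': 0x3,
    'sub': 0x4, 'and': 0x5, 'or': 0x6, 'xor': 0x7,
    'not': 0x8, 'shl': 0x9, 'shr': 0xA, 'bra': 0xB,
    'brp': 0xC, 'brz': 0xD, 'inp': 0xE, 'out': 0xF
}
# Invert the assembler's table: opcode value -> mnemonic.
MNEMONIC_TABLE = {value: name for name, value in OPCODE_TABLE.items()}


class Disassembler:
    @staticmethod
    def decode(addr: int, word: int) -> str:
        """Format one instruction word as '<addr>  <MNEMONIC> <operand>'."""
        opcode = (word >> 12) & 0xF   # top 4 bits
        operand = word & 0x0FFF       # low 12 bits
        return f"{addr:04d}  {MNEMONIC_TABLE[opcode].upper()} 0x{operand:03X}"

    @staticmethod
    def disasm(program_text: str) -> str:
        """Turn '<address> <word>' lines into an assembly listing."""
        asm_text = ''
        for line in program_text.split('\n'):
            fields = line.split()
            if len(fields) == 2:
                asm_text += Disassembler.decode(int(fields[0]), int(fields[1])) + '\n'
        return asm_text


print(Disassembler.disasm("000 57598\n001 61695\n002 45056\n"), end='')
# 0000  INP 0x0FE
# 0001  OUT 0x0FF
# 0002  BRA 0x000
```

Labels are lost during assembly, so the branch target comes back as a plain address (0x000) rather than the name start.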
Homework
Give the disassembler a try. Make sure it produces the proper output for your echo.bin file. Then make a test.bin file containing each instruction, run the disassembler on it, and inspect the output to ensure proper disassembly.
Conclusion
In this post, we have prepared ourselves for developing a GUI to support our Tiny-T computer system. We implemented an assembler, loader, and disassembler. Much of this work was familiar to us and differed only slightly from some of our previous projects, so I didn’t give a detailed explanation of the code.
I would recommend you play with the code and get familiar with it. Try to write these tools yourself, from scratch. In the future, we will create more complex tooling, and having a good foundation will help.
In our next installment, we will begin creating a graphical user interface for the Tiny-T. This GUI will most likely be built using PySimpleGUI or Qt; I haven’t quite decided yet. In the past, though, I have found PySimpleGUI easy to use and quick for developing GUI applications.
Until next time: Happy Coding!
Resources
You can find the code for this post on GitHub at: https://github.com/Monotoba/Building-Machines-In-Code