Implementing Stack Oriented Languages – Part 4

This entry is part 4 of 4 in the series Implementing Stack Oriented Languages

Implementing Stack Oriented Languages
Implementing Stack Oriented Languages — Part 2
Implementing Stack Oriented Languages – Part 3
Implementing Stack Oriented Languages – Part 4

Table of Contents

Post Stastics

This post has 4341 words.
Estimated read time is 20.67 minute(s).

Strings and Variable

So far we’ve added lots of stack operations and a couple I/O routines with the KEY and EMIT keywords. At the moment if we wanted to write a simple “Hello World” application we would need to place each character on the stack and pop them off using the EMIT keyword. This is messy and bit difficult to manage in a larger program. What we need is a way to store strings somewhere and then a way to tell our language where to find the string we wish to manipulate. So how can we do this?

Well, a simple straight-forward solution would be to place all strings in a string table. A string table is an area of memory where we store the strings. Then we need to tell our language how to find the string. In a compiled version of a Stack-Oriented language, we would use a pointer to the string, passing its address via the stack to any method that needed to manipulate the string. In an interpreted version of the language, we only need to pass an index into our list of strings. Sounds pretty easy, right? Well, it’s a bit more complicated. When the parser encounters a string we will need to store it in the string table and then place the string’s address (index) onto the stack. Since managing the string table is a bit more work than we want the parser to do, and in keeping with the Single Responsibility Principle, we will create a StringTable class to keep track of the strings and their addresses.

Our string table class will need a list to store strings into, a save method that will return the index of the string in the table. A get method to get the string by index and a length method to return the length of the string.

Our strings will need to store two peices of data. The actual string (value) and it’s length. So we have to determine if we will return the object containing this information of just the printable string contents. I opted to provide methods for both. The get(idx) method will return the string data object while the get_string(idx) method will extract the string contents and return only the printable string.

Ok, it’s time to see what this StringTable class looks like:

# string_table.py

class StringTable:

def __init__(self):

self.data = []

# Insert string and meta data into string list

def save(self, string):

entry = {'len': len(string), 'val': string}

self.data.append(entry)

return (len(self.data) - 1)

# Return string storage object (including length)

def get(self, idx):

return self.data[idx]

# Return length of string

def length(self, idx):

return int(self.data[idx]['len'])

# Return printable string only (no meta data).

def get_str(self, idx):

return str(self.data[idx]['val'])

def __str__(self):

return str(self.data)

# string_table.py class StringTable: def __init__(self): self.data = [] # Insert string and meta data into string list def save(self, string): entry = {'len': len(string), 'val': string} self.data.append(entry) return (len(self.data) - 1) # Return string storage object (including length) def get(self, idx): return self.data[idx] # Return length of string def length(self, idx): return int(self.data[idx]['len']) # Return printable string only (no meta data). def get_str(self, idx): return str(self.data[idx]['val']) def __str__(self): return str(self.data)

# string_table.py

class StringTable:

    def __init__(self):
        self.data = []

    # Insert string and meta data into string list
    def save(self, string):
        entry = {'len': len(string), 'val': string}
        self.data.append(entry)
        return (len(self.data) - 1)

    # Return string storage object (including length)
    def get(self, idx):
        return self.data[idx]

    # Return length of string
    def length(self, idx):
        return int(self.data[idx]['len'])

    # Return printable string only (no meta data).
    def get_str(self, idx):
        return str(self.data[idx]['val'])

    def __str__(self):
        return str(self.data)

As you can see the string table couldn’t be much simpler. Now we need to handle inserting strings in the parser. Our parser will need to take the string from the token passed back from the lexer, save it in the string table, and push it’s address (index) on to the stack. Then it will need to push the string length onto the stack. Once the string is encountered it is up to the next operation to handle the length-address pair.

Place the following code in the parser after the arithmetic and logic operations and before the section of code to handle keywords.

# parser.py

...

# Handle Strings and Characters

elif tok.type == TT_STRING:

string = self.current_tok.value

addr = self.string_table.save(string)

self.data.push(addr)

self.data.push(len(string))

# Handle KEYWORDS

elif tok.type == TT_KEYWORD:

...

# parser.py ... # Handle Strings and Characters elif tok.type == TT_STRING: string = self.current_tok.value addr = self.string_table.save(string) self.data.push(addr) self.data.push(len(string)) # Handle KEYWORDS elif tok.type == TT_KEYWORD: ...

# parser.py
              ...
               # Handle Strings and Characters
               elif tok.type == TT_STRING:
                string = self.current_tok.value
                addr = self.string_table.save(string)
                self.data.push(addr)
                self.data.push(len(string))

              # Handle KEYWORDS
              elif tok.type == TT_KEYWORD:
              ...

Now before we can run this code we need to first import it and add it to our parser object at the bottom of the __init__() method:

def __init__(self, lexer: Lexer, data_stack: Stack = None):

...

# Setup string table

if string_table is not None:

self.string_table = string_table

else:

self.string_table = StringTable()

def __init__(self, lexer: Lexer, data_stack: Stack = None): ... # Setup string table if string_table is not None: self.string_table = string_table else: self.string_table = StringTable()

   def __init__(self, lexer: Lexer, data_stack: Stack = None):
       ...
       # Setup string table
       if string_table is not None:
           self.string_table = string_table
       else:
           self.string_table = StringTable()

Next, we need to change the signature of the __init__() method to allow us to pass a new symbol table into the method when needed. You will see why we would want to do this later. For now, just know it’s something we may need to do.

class Parser:

def __init__(self, lexer: Lexer, data_stack: Stack = None, string_table=None):

self.lexer = lexer

# Setup data stack

...

class Parser: def __init__(self, lexer: Lexer, data_stack: Stack = None, string_table=None): self.lexer = lexer # Setup data stack ...

class Parser:

    def __init__(self, lexer: Lexer, data_stack: Stack = None, string_table=None):
        self.lexer = lexer

        # Setup data stack
   ...

With that change, any errors you were getting in the __init__() method should have disappeared. If not, then make sure you’ve imported the StringTable class at the top of the file.

We now need to change our source code in the “if” statement at the bottom of the parser file to:

if __name__ == '__main__':

source = """ 16 2 >>

"This is a string"

dump

"""

lexer = Lexer(source)

parser = Parser(lexer)

parser.parse()

if __name__ == '__main__': source = """ 16 2 >> "This is a string" dump """ lexer = Lexer(source) parser = Parser(lexer) parser.parse()

if __name__ == '__main__':
    source = """ 16 2 >> 
    "This is a string" 
    dump
    """
    lexer = Lexer(source)
    parser = Parser(lexer)
    parser.parse()

If we run this code now you should see the following:

[ '4', '0', '16' ]

[ '4', '0', '16' ]

These values come from the dump operation which displays the contents of the stack after the parser processed the string. Since we haven’t put any other strings in the string table, the index for this string is 0 and it’s length is 16.

Now that we can get the string stowed away in the string table and the address of it on the stack, what can we do with it? Well, at the moment not a lot. So let’s fix that. The primary thing we want to do with strings is display them. So let’s add some code to handle a new keyword “TYPE”. Type will pop the length and address of a string off the stack and print the string out to the console.

Add the following code to the parser between the code to handle the “emit” and the code to handle the “key” keywords:

# parser.py

...

elif tok.value == 'type':

"""<addr> <length> type

Sends a string to standard out. The string is located

in the string table at location <addr> and has a length

of <length>.

"""

length = int(self.data.pop())

addr = int(self.data.pop())

string = self.string_table.get_str(addr)

if length > len(string):

raise IndexError(f'Error: String length of {length} was given but actual length is {len(string)}.')

idx = 0

while idx < length:

print(string[idx], end='')

idx += 1

print()

...

# parser.py ... elif tok.value == 'type': """<addr> <length> type Sends a string to standard out. The string is located in the string table at location <addr> and has a length of <length>. """ length = int(self.data.pop()) addr = int(self.data.pop()) string = self.string_table.get_str(addr) if length > len(string): raise IndexError(f'Error: String length of {length} was given but actual length is {len(string)}.') idx = 0 while idx < length: print(string[idx], end='') idx += 1 print() ...

# parser.py 
               ...

                 elif tok.value == 'type':
                    """<addr> <length> type
                    Sends a string to standard out. The string is located
                    in the string table at location <addr> and has a length
                    of <length>.
                    """
                    length = int(self.data.pop())
                    addr = int(self.data.pop())

                    string = self.string_table.get_str(addr)
                    if length > len(string):
                        raise IndexError(f'Error: String length of {length} was given but actual length is {len(string)}.')

                    idx = 0
                    while idx < length:
                        print(string[idx], end='')
                        idx += 1
                    print()

                ...

Now change the code for the runner to:

if __name__ == '__main__':

source = """ 16 2 >>

"This is a string"

dump

type dump

"""

lexer = Lexer(source)

parser = Parser(lexer)

parser.parse()

if __name__ == '__main__': source = """ 16 2 >> "This is a string" dump type dump """ lexer = Lexer(source) parser = Parser(lexer) parser.parse()

if __name__ == '__main__':
    source = """ 16 2 >> 
    "This is a string" 
    dump
    type dump
    """
    lexer = Lexer(source)
    parser = Parser(lexer)
    parser.parse()

When you run this you should get the following output:

[ '4', '0', '16' ]

This is a string

[ '4' ]

[ '4', '0', '16' ] This is a string [ '4' ]

[ '4', '0', '16' ]
This is a string
[ '4' ]

When the parser encounters the string it already has 4 on the stack from the previous operation. Then it places the strings address (index), on the stack followed by the string’s length. We then call “type” which takes the length and address of the string off the stack and uses this information to display the string. Then we call “dump” again to show that only the string values were consumed, leaving 4 still on the stack when the program completes.

With this, we can now right that “Hello World” app. Give it a try!

Now that we have strings working, let’s get characters working. Characters are single-quoted ASCII characters, and when encountered by the parser, they are converted to their ASCII values and placed on the stack. I bet by now you can add the code for this without my help. So, I’ll just show you my implementation below:

...

elif tok.type == TT_CHAR:

char = self.current_tok.value

ascii_code = ord(char)

self.data.push(ascii_code)

# Handle Keywords

elif tok.type == TT_KEYWORD:

...

... elif tok.type == TT_CHAR: char = self.current_tok.value ascii_code = ord(char) self.data.push(ascii_code) # Handle Keywords elif tok.type == TT_KEYWORD: ...

            ...
            elif tok.type == TT_CHAR:
                char = self.current_tok.value
                ascii_code = ord(char)
                self.data.push(ascii_code)

            # Handle Keywords
            elif tok.type == TT_KEYWORD:
           ...

Here we simply get the character from the token value and convert it to its code-point using the ord() function from python. Then we push that value onto the stack.

I had wanted to include variable handling in this issue. However, I feel that it will make this post exceedingly long so I will postpone variables until next time.

As always, I have included the code as it now stands, below:

# keywords.py

KEYWORDS = [

# Stack Operators

'SWAP', 'SWAP2', 'DUMP', 'DROP', 'ROL', 'ROR', 'OVER',

'DUP', 'DUP2', 'DUP3', 'DUP4', 'CLEAR', 'DUMP',

# Conditional Operators

'IF', 'ELSE', 'THEN', 'NOT',

# Iterative

'DO', 'LOOP',

'WHILE', 'WEND',

# I/O Operators

'TYPE', # Outputs a string to stdout

'EMIT', # Outputs a character to stdout

'KEY', # Takes input character from stdin

# Special Character Values

# See ASCII Character Table for values

'CR', 'LF', 'NULL', 'BELL', 'BS', 'TAB',

'VT', 'FF', 'ESC',

# Logical Operators

'NEQ', 'AND', 'OR',

]

# keywords.py KEYWORDS = [ # Stack Operators 'SWAP', 'SWAP2', 'DUMP', 'DROP', 'ROL', 'ROR', 'OVER', 'DUP', 'DUP2', 'DUP3', 'DUP4', 'CLEAR', 'DUMP', # Conditional Operators 'IF', 'ELSE', 'THEN', 'NOT', # Iterative 'DO', 'LOOP', 'WHILE', 'WEND', # I/O Operators 'TYPE', # Outputs a string to stdout 'EMIT', # Outputs a character to stdout 'KEY', # Takes input character from stdin # Special Character Values # See ASCII Character Table for values 'CR', 'LF', 'NULL', 'BELL', 'BS', 'TAB', 'VT', 'FF', 'ESC', # Logical Operators 'NEQ', 'AND', 'OR', ]

# keywords.py

KEYWORDS = [
    # Stack Operators
    'SWAP', 'SWAP2', 'DUMP', 'DROP', 'ROL', 'ROR', 'OVER',
    'DUP', 'DUP2', 'DUP3', 'DUP4', 'CLEAR', 'DUMP',


    # Conditional Operators
    'IF', 'ELSE', 'THEN', 'NOT',

    # Iterative
    'DO', 'LOOP',
    'WHILE', 'WEND',

    # I/O Operators
    'TYPE',  # Outputs a string to stdout
    'EMIT',  # Outputs a character to stdout
    'KEY',   # Takes input character from stdin

    # Special Character Values
    # See ASCII Character Table for values
    'CR', 'LF', 'NULL', 'BELL', 'BS', 'TAB',
    'VT', 'FF', 'ESC',

    # Logical Operators
    'NEQ', 'AND', 'OR',

]

# lexer.py

from keywords import KEYWORDS

from tokens import *

class Lexer:

def __init__(self, text, line=None, col=None):

self.text = text

self.pos = 0

self.line = line if line else 1

self.col = col if col else 1

self.current_char = self.text[0]

#############################################

# Utility Methods #

#-------------------------------------------#

# These methods help manage the internal #

# workings of the lexer. #

#############################################

def advance(self):

"""advance():

Advances the current position in the character stream.

"""

if not self.is_eoi():

self.pos += 1

self.current_char = self.text[self.pos]

self.update_position()

else:

self.current_char = ''

def update_position(self):

"""update_position

Maintains the line and column positions for the current token scan.

"""

if self.current_char == '\n':

self.line += 1

self.col = 0

else:

self.col += 1

def is_eoi(self):

"""is_eoi():

:return:

True if current position is End Of Input stream.

Returns False otherwise.

"""

return self.pos >= (len(self.text) - 1)

def expected(self, expected, found):

"""expected

Raises error is expected and found are different.

"""

if expected == found:

return

else:

raise ValueError(f'[Error] Expected {expected} but found {found} in line: {self.line} position: {self.col}')

def peek(self):

"""peek

Returns next character in input stream after the current character.

Note: If we are at EOI (End Of Input) then a space character is returned.

"""

if not self.is_eoi():

return self.text[self.pos + 1]

else:

return ' ' # space

#############################################

# Recognizer Methods #

#-------------------------------------------#

# These methods collect the characters that #

# belong to a single multi-character token #

# into a single unit, and return that unit #

# to the caller. #

#############################################

def char(self):

if self.current_char != "'":

self.expected("'", self.current_char)

self.advance() # Consume opening single quote

char = self.current_char

self.advance() # Consume character

# Now we must have a closing single quote mark

if self.current_char != "'":

self.expected("'", self.current_char)

self.advance() # Consume closing single quote

return char

def string(self):

if self.current_char != '"':

self.expected('"', self.peek())

self.advance() # Consume leading '"'

string = ''

while self.current_char != '"':

string += self.current_char

self.advance()

self.advance() # Consume trailing '"'

return string

def number(self):

number = ''

while self.current_char.isdigit():

number += self.current_char

self.advance()

return number

def is_ident(self):

return self.current_char.isalpha() or self.current_char == '_'

def _ident(self):

_id = ''

# identifiers begin with alpha or underscores

if self.current_char.isalpha() or self.current_char == '_':

while self.current_char.isalnum() or self.current_char == '_':

_id += self.current_char

self.advance()

return _id

def is_keyword(self, kw):

return kw.upper() in KEYWORDS

def eat_comment(self):

if self.current_char == '(':

while self.current_char != ')' and not self.is_eoi():

self.advance()

self.advance() # skip trailing )

elif self.current_char == '\\':

while self.current_char != '\n' and not self.is_eoi():

self.advance()

self.advance() # skip new line character

#############################################

# Tokenization #

#-------------------------------------------#

# The method next_token() is called by the #

# client to obtain the next token in the #

# input stream. #

#############################################

def next_token(self) -> Token:

if not self.is_eoi():

if self.current_char.isspace():

while self.current_char.isspace():

self.advance()

if self.current_char.isdigit():

return Token(TT_NUMBER, self.number(), self.line, self.col)

elif self.current_char == '+':

val = self.current_char

self.advance()

return Token(TT_PLUS, val, self.line, self.col)

elif self.current_char == '-':

val = self.current_char

self.advance()

return Token(TT_MINUS, val, self.line, self.col)

elif self.current_char == '*':

val = self.current_char

if self.peek() == '*':

val += '*'

self.advance()

return Token(TT_POW, val, self.line, self.col)

self.advance()

return Token(TT_MUL, val, self.line, self.col)

elif self.current_char == '/':

val = self.current_char

self.advance()

return Token(TT_DIV, val, self.line, self.col)

elif self.current_char == '&':

val = self.current_char

self.advance()

return Token(TT_AND, val, self.line, self.col)

elif self.current_char == '|':

val = self.current_char

self.advance()

return Token(TT_OR, val, self.line, self.col)

elif self.current_char == '^':

val = self.current_char

self.advance()

return Token(TT_XOR, val, self.line, self.col)

elif self.current_char == '~':

val = self.current_char

self.advance()

return Token(TT_INV, val, self.line, self.col)

elif self.is_ident():

_id = self._ident()

if self.is_keyword(_id):

return Token(TT_KEYWORD, _id, self.line, self.col)

else:

return Token(TT_ID, _id, self.line, self.col)

elif self.current_char == '<':

val = self.current_char

if self.peek() == '<':

val += '<'

self.advance()

return Token(TT_LSL, val, self.line, self.col)

self.advance()

return Token(TT_LESS, val, self.line, self.col)

elif self.current_char == '>':

val = self.current_char

if self.peek() == '>':

val += '>'

self.advance()

return Token(TT_LSR, val, self.line, self.col)

self.advance()

return Token(TT_GREATER, val, self.line, self.col)

elif self.current_char == '=':

val = self.current_char

if self.peek() == '=':

val += '='

self.advance()

return Token(TT_EQ, val, self.line, self.col)

self.advance()

return Token(TT_ASSIGN, val, self.line, self.col)

elif self.current_char == ':':

val = self.current_char

self.advance()

return Token(TT_STARTDEF, val, self.line, self.col)

elif self.current_char == ';':

val = self.current_char

self.advance()

return Token(TT_ENDDEF, val, self.line, self.col)

elif self.current_char == '(' or self.current_char == '\\':

self.eat_comment()

return self.next_token()

elif self.current_char == "'":

return Token(TT_CHAR, self.char(), self.line, self.col)

elif self.current_char == '"':

string = self.string()

return Token(TT_STRING, string, self.line, self.col)

elif self.is_eoi():

return Token(TT_EOF, None, self.line, self.col)

else:

raise ValueError(f'Unexpected character: {self.current_char} in input stream at position: {str(self.pos)}.')

else:

return Token(TT_EOF, None, self.line, self.col)

def main(text: str):

lexer = Lexer(text)

tok = lexer.next_token()

while tok.type != TT_EOF:

print(tok)

tok = lexer.next_token()

if __name__ == "__main__":

text = """

( this is a multi-line

comment. )

\ This ia a line comment."

_myIdent """

main(text)

# lexer.py from keywords import KEYWORDS from tokens import * class Lexer: def __init__(self, text, line=None, col=None): self.text = text self.pos = 0 self.line = line if line else 1 self.col = col if col else 1 self.current_char = self.text[0] ############################################# # Utility Methods # #-------------------------------------------# # These methods help manage the internal # # workings of the lexer. # ############################################# def advance(self): """advance(): Advances the current position in the character stream. """ if not self.is_eoi(): self.pos += 1 self.current_char = self.text[self.pos] self.update_position() else: self.current_char = '' def update_position(self): """update_position Maintains the line and column positions for the current token scan. """ if self.current_char == '\n': self.line += 1 self.col = 0 else: self.col += 1 def is_eoi(self): """is_eoi(): :return: True if current position is End Of Input stream. Returns False otherwise. """ return self.pos >= (len(self.text) - 1) def expected(self, expected, found): """expected Raises error is expected and found are different. """ if expected == found: return else: raise ValueError(f'[Error] Expected {expected} but found {found} in line: {self.line} position: {self.col}') def peek(self): """peek Returns next character in input stream after the current character. Note: If we are at EOI (End Of Input) then a space character is returned. """ if not self.is_eoi(): return self.text[self.pos + 1] else: return ' ' # space ############################################# # Recognizer Methods # #-------------------------------------------# # These methods collect the characters that # # belong to a single multi-character token # # into a single unit, and return that unit # # to the caller. # ############################################# def char(self): if self.current_char != "'": self.expected("'", self.current_char) self.advance() # Consume opening single quote char = self.current_char self.advance() # Consume character # Now we must have a closing single quote mark if self.current_char != "'": self.expected("'", self.current_char) self.advance() # Consume closing single quote return char def string(self): if self.current_char != '"': self.expected('"', self.peek()) self.advance() # Consume leading '"' string = '' while self.current_char != '"': string += self.current_char self.advance() self.advance() # Consume trailing '"' return string def number(self): number = '' while self.current_char.isdigit(): number += self.current_char self.advance() return number def is_ident(self): return self.current_char.isalpha() or self.current_char == '_' def _ident(self): _id = '' # identifiers begin with alpha or underscores if self.current_char.isalpha() or self.current_char == '_': while self.current_char.isalnum() or self.current_char == '_': _id += self.current_char self.advance() return _id def is_keyword(self, kw): return kw.upper() in KEYWORDS def eat_comment(self): if self.current_char == '(': while self.current_char != ')' and not self.is_eoi(): self.advance() self.advance() # skip trailing ) elif self.current_char == '\\': while self.current_char != '\n' and not self.is_eoi(): self.advance() self.advance() # skip new line character ############################################# # Tokenization # #-------------------------------------------# # The method next_token() is called by the # # client to obtain the next token in the # # input stream. # ############################################# def next_token(self) -> Token: if not self.is_eoi(): if self.current_char.isspace(): while self.current_char.isspace(): self.advance() if self.current_char.isdigit(): return Token(TT_NUMBER, self.number(), self.line, self.col) elif self.current_char == '+': val = self.current_char self.advance() return Token(TT_PLUS, val, self.line, self.col) elif self.current_char == '-': val = self.current_char self.advance() return Token(TT_MINUS, val, self.line, self.col) elif self.current_char == '*': val = self.current_char if self.peek() == '*': val += '*' self.advance() self.advance() return Token(TT_POW, val, self.line, self.col) self.advance() return Token(TT_MUL, val, self.line, self.col) elif self.current_char == '/': val = self.current_char self.advance() return Token(TT_DIV, val, self.line, self.col) elif self.current_char == '&': val = self.current_char self.advance() return Token(TT_AND, val, self.line, self.col) elif self.current_char == '|': val = self.current_char self.advance() return Token(TT_OR, val, self.line, self.col) elif self.current_char == '^': val = self.current_char self.advance() return Token(TT_XOR, val, self.line, self.col) elif self.current_char == '~': val = self.current_char self.advance() return Token(TT_INV, val, self.line, self.col) elif self.is_ident(): _id = self._ident() if self.is_keyword(_id): return Token(TT_KEYWORD, _id, self.line, self.col) else: return Token(TT_ID, _id, self.line, self.col) elif self.current_char == '<': val = self.current_char if self.peek() == '<': val += '<' self.advance() self.advance() return Token(TT_LSL, val, self.line, self.col) self.advance() return Token(TT_LESS, val, self.line, self.col) elif self.current_char == '>': val = self.current_char if self.peek() == '>': val += '>' self.advance() self.advance() return Token(TT_LSR, val, self.line, self.col) self.advance() return Token(TT_GREATER, val, self.line, self.col) elif self.current_char == '=': val = self.current_char if self.peek() == '=': val += '=' self.advance() self.advance() return Token(TT_EQ, val, self.line, self.col) self.advance() return Token(TT_ASSIGN, val, self.line, self.col) elif self.current_char == ':': val = self.current_char self.advance() return Token(TT_STARTDEF, val, self.line, self.col) elif self.current_char == ';': val = self.current_char self.advance() return Token(TT_ENDDEF, val, self.line, self.col) elif self.current_char == '(' or self.current_char == '\\': self.eat_comment() return self.next_token() elif self.current_char == "'": return Token(TT_CHAR, self.char(), self.line, self.col) elif self.current_char == '"': string = self.string() return Token(TT_STRING, string, self.line, self.col) elif self.is_eoi(): return Token(TT_EOF, None, self.line, self.col) else: raise ValueError(f'Unexpected character: {self.current_char} in input stream at position: {str(self.pos)}.') else: return Token(TT_EOF, None, self.line, self.col) def main(text: str): lexer = Lexer(text) tok = lexer.next_token() while tok.type != TT_EOF: print(tok) tok = lexer.next_token() if __name__ == "__main__": text = """ ( this is a multi-line comment. ) \ This ia a line comment." _myIdent """ main(text)

# lexer.py

from keywords import KEYWORDS
from tokens import *

class Lexer:
    def __init__(self, text, line=None, col=None):
        self.text = text
        self.pos = 0
        self.line = line if line else 1
        self.col = col if col else 1
        self.current_char = self.text[0]


    #############################################
    # Utility Methods                           #
    #-------------------------------------------#
    # These methods help manage the internal    #
    # workings of the lexer.                    #
    #############################################
    def advance(self):
        """advance():
        Advances the current position in the character stream.
        """
        if not self.is_eoi():
            self.pos += 1
            self.current_char = self.text[self.pos]
            self.update_position()
        else:
            self.current_char = ''

    def update_position(self):
        """update_position
        Maintains the line and column positions for the current token scan.
        """
        if self.current_char == '\n':
            self.line += 1
            self.col = 0
        else:
            self.col += 1

    def is_eoi(self):
        """is_eoi():
        :return:
            True if current position is End Of Input stream.
            Returns False otherwise.
        """
        return self.pos >= (len(self.text) - 1)

    def expected(self, expected, found):
        """expected
        Raises error is expected and found are different.
        """
        if expected == found:
            return
        else:
            raise ValueError(f'[Error] Expected {expected} but found {found} in line: {self.line} position: {self.col}')

    def peek(self):
        """peek
        Returns next character in input stream after the current character.
        Note: If we are at EOI (End Of Input) then a space character is returned.
        """
        if not self.is_eoi():
            return self.text[self.pos + 1]
        else:
            return ' '  # space

    #############################################
    # Recognizer Methods                        #
    #-------------------------------------------#
    # These methods collect the characters that #
    # belong to a single multi-character token  #
    # into a single unit, and return that unit  #
    # to the caller.                            #
    #############################################
    def char(self):
        if self.current_char != "'":
            self.expected("'", self.current_char)
        self.advance()  # Consume opening single quote

        char = self.current_char
        self.advance()  # Consume character

        # Now we must have a closing single quote mark
        if self.current_char != "'":
           self.expected("'", self.current_char)
        self.advance()  # Consume closing single quote
        return char

    def string(self):
        if self.current_char != '"':
            self.expected('"', self.peek())
        self.advance()  # Consume leading '"'

        string = ''
        while self.current_char != '"':
            string += self.current_char
            self.advance()
        self.advance()  # Consume trailing '"'
        return string

    def number(self):
        number = ''
        while self.current_char.isdigit():
            number += self.current_char
            self.advance()
        return number

    def is_ident(self):
        return self.current_char.isalpha() or self.current_char == '_'

    def _ident(self):
        _id = ''
        # identifiers begin with alpha or underscores
        if self.current_char.isalpha() or self.current_char == '_':
            while self.current_char.isalnum() or self.current_char == '_':
                _id += self.current_char
                self.advance()
        return _id

    def is_keyword(self, kw):
        return kw.upper() in KEYWORDS

    def eat_comment(self):
        if self.current_char == '(':
            while self.current_char != ')' and not self.is_eoi():
                self.advance()
            self.advance()  # skip trailing )
        elif self.current_char == '\\':
            while self.current_char != '\n' and not self.is_eoi():
                self.advance()
            self.advance()  # skip new line character

    #############################################
    # Tokenization                              #
    #-------------------------------------------#
    # The method next_token() is called by the  #
    # client to obtain the next token in the    #
    # input stream.                             #
    #############################################
    def next_token(self) -> Token:
        if not self.is_eoi():
            if self.current_char.isspace():
                while self.current_char.isspace():
                    self.advance()

            if self.current_char.isdigit():
                return Token(TT_NUMBER, self.number(), self.line, self.col)

            elif self.current_char == '+':
                val = self.current_char
                self.advance()
                return Token(TT_PLUS, val, self.line, self.col)

            elif self.current_char == '-':
                val = self.current_char
                self.advance()
                return Token(TT_MINUS, val, self.line, self.col)

            elif self.current_char == '*':
                val = self.current_char
                if self.peek() == '*':
                    val += '*'
                    self.advance()
                    self.advance()
                    return Token(TT_POW, val, self.line, self.col)
                self.advance()
                return Token(TT_MUL, val, self.line, self.col)

            elif self.current_char == '/':
                val = self.current_char
                self.advance()
                return Token(TT_DIV, val, self.line, self.col)

            elif self.current_char == '&':
                val = self.current_char
                self.advance()
                return Token(TT_AND, val, self.line, self.col)

            elif self.current_char == '|':
                val = self.current_char
                self.advance()
                return Token(TT_OR, val, self.line, self.col)

            elif self.current_char == '^':
                val = self.current_char
                self.advance()
                return Token(TT_XOR, val, self.line, self.col)

            elif self.current_char == '~':
                val = self.current_char
                self.advance()
                return Token(TT_INV, val, self.line, self.col)

            elif self.is_ident():
                _id = self._ident()
                if self.is_keyword(_id):
                    return Token(TT_KEYWORD, _id, self.line, self.col)
                else:
                    return Token(TT_ID, _id, self.line, self.col)

            elif self.current_char == '<':
                val = self.current_char
                if self.peek() == '<':
                    val += '<'
                    self.advance()
                    self.advance()
                    return Token(TT_LSL, val, self.line, self.col)
                self.advance()
                return Token(TT_LESS, val, self.line, self.col)

            elif self.current_char == '>':
                val = self.current_char
                if self.peek() == '>':
                    val += '>'
                    self.advance()
                    self.advance()
                    return Token(TT_LSR, val, self.line, self.col)
                self.advance()
                return Token(TT_GREATER, val, self.line, self.col)

            elif self.current_char == '=':
                val = self.current_char
                if self.peek() == '=':
                    val += '='
                    self.advance()
                    self.advance()
                    return Token(TT_EQ, val, self.line, self.col)
                self.advance()
                return Token(TT_ASSIGN, val, self.line, self.col)




            elif self.current_char == ':':
                val = self.current_char
                self.advance()
                return Token(TT_STARTDEF, val, self.line, self.col)

            elif self.current_char == ';':
                val = self.current_char
                self.advance()
                return Token(TT_ENDDEF, val, self.line, self.col)

            elif self.current_char == '(' or self.current_char == '\\':
                self.eat_comment()
                return self.next_token()


            elif self.current_char == "'":
                return Token(TT_CHAR, self.char(), self.line, self.col)

            elif self.current_char == '"':
                string = self.string()
                return Token(TT_STRING, string, self.line, self.col)

            elif self.is_eoi():
                return Token(TT_EOF, None, self.line, self.col)

            else:
                raise ValueError(f'Unexpected character: {self.current_char} in input stream at position: {str(self.pos)}.')

        else:
            return Token(TT_EOF, None, self.line, self.col)


def main(text: str):
    lexer = Lexer(text)
    tok = lexer.next_token()
    while tok.type != TT_EOF:
        print(tok)
        tok = lexer.next_token()

if __name__ == "__main__":
    text = """ 
    ( this is a multi-line 
       comment. ) 
    \ This ia a line comment." 
    _myIdent """

    main(text)

# parser.py

import sys

from lexer import Lexer

from posts.post4.string_table import StringTable

from stack import Stack

from tokens import *

#######################################

# Parser #

#######################################

class Parser:

def __init__(self, lexer: Lexer, data_stack: Stack = None, string_table=None):

self.lexer = lexer

# Setup data stack

if data_stack:

self.data = data_stack

else:

self.data = Stack()

self.current_tok = Token(TT_START, None)

# Setup string table

if string_table is not None:

self.string_table = string_table

else:

self.string_table = StringTable()

def next_token(self):

if self.current_tok != TT_EOF:

self.current_tok = self.lexer.next_token()

def parse(self):

# get token and decide what to do with it

tok = self.current_tok

while tok.type != TT_EOF:

self.next_token()

tok = self.current_tok

if tok.type == TT_START:

continue

if tok.type == TT_NUMBER:

self.data.push(int(tok.value))

continue

# Arithmetic

elif tok.type == TT_PLUS:

right = self.data.pop()

left = self.data.pop()

result = left + right

self.data.push(result)

elif tok.type == TT_MINUS:

right = self.data.pop()

left = self.data.pop()

result = left - right

self.data.push(result)

elif tok.type == TT_MUL:

right = self.data.pop()

left = self.data.pop()

result = left * right

self.data.push(result)

elif tok.type == TT_DIV:

right = self.data.pop()

left = self.data.pop()

result = left // right

self.data.push(result)

elif tok.type == TT_POW:

right = self.data.pop()

left = self.data.pop()

result = left ** right

self.data.push(result)

# Logical Operations

elif tok.type == TT_INV:

left = self.data.pop()

left = ~left

self.data.push(left)

elif tok.type == TT_XOR:

right = self.data.pop()

left = self.data.pop()

self.data.push((left ^ right))

elif tok.type == TT_AND:

right = self.data.pop()

left = self.data.pop()

self.data.push((left & right))

elif tok.type == TT_OR:

right = self.data.pop()

left = self.data.pop()

self.data.push((left | right))

elif tok.type == TT_LSL:

right = self.data.pop()

left = self.data.pop()

self.data.push(int(left) << int(right))

elif tok.type == TT_LSR:

right = self.data.pop()

left = self.data.pop()

self.data.push((left >> right))

elif tok.type == TT_STRING:

string = self.current_tok.value

addr = self.string_table.save(string)

self.data.push(addr)

self.data.push(len(string))

elif tok.type == TT_CHAR:

char = self.current_tok.value

ascii_code = ord(char)

self.data.push(ascii_code)

# Handle Keywords

elif tok.type == TT_KEYWORD:

# Process keyword

if tok.value == 'dump':

"""dump

Display the data stack contents in the console

"""

print(str(self.data))

elif tok.value == 'drop':

"""pop

Remove the top value from the stack and discard it.

"""

self.data.pop()

elif tok.value == 'clear':

"""clear

Empty the stack.

"""

self.data.clear()

elif tok.value.lower() == 'swap':

"""swap

Swaps the position of the top two items on the stack.

Ex: x y z --> x z y

"""

if self.data.count() < 2:

raise IndexError(

f'Attempt to SWAP on stack of size {self.data.count()}!\n\t\t\tStack must contain at least 2 elements to SWAP.')

n1 = self.data.pop()

n2 = self.data.pop()

self.data.push(n1)

self.data.push(n2)

elif tok.value == 'swap2':

"""swap2

Swaps the position of the top two pairs of items on the stack.

Ex: w x y z --> y z w x

"""

if self.data.count() < 4:

raise IndexError(

f'Attempt to SWAP on stack of size {self.data.count()}!\n\t\t\tStack must contain at least 4 elements to SWAP2.')

n3 = self.data.pop()

n2 = self.data.pop()

n1 = self.data.pop()

n0 = self.data.pop()

self.data.push(n2)

self.data.push(n3)

self.data.push(n0)

self.data.push(n1)

elif tok.value == 'over':

"""over

Copies the second item on the stack to the head position.

Ex: x y z --> x y z y

"""

top = self.data.pop()

next = self.data.pop()

self.data.push(next)

self.data.push(top)

self.data.push(next)

elif tok.value == 'over2':

"""over

Copies the second item on the stack to the head position.

Ex: x y z --> x y z y

"""

top = self.data.pop()

sec = self.data.pop()

next = self.data.pop()

self.data.push(next)

self.data.push(top)

self.data.push(next)

elif tok.value == 'dup2':

"""dup2

Duplicates the top two items on the stack.

Ex: x y z --> x y z y z

"""

n1 = self.data.pop()

n2 = self.data.pop()

self.data.push(n2)

self.data.push(n1)

self.data.push(n2)

self.data.push(n1)

elif tok.value == 'dup3':

"""dup3

Duplicates the top three items on the stack.

Ex: x y z --> x y z x y z

"""

n1 = self.data.pop()

n2 = self.data.pop()

n3 = self.data.pop()

self.data.push(n3)

self.data.push(n2)

self.data.push(n1)

self.data.push(n3)

self.data.push(n2)

self.data.push(n1)

elif tok.value == 'dup4':

"""dup3

Duplicates the top three items on the stack.

Ex: w x y z --> w x y z w x y z

"""

n1 = self.data.pop()

n2 = self.data.pop()

n3 = self.data.pop()

n4 = self.data.pop()

self.data.push(n4)

self.data.push(n3)

self.data.push(n2)

self.data.push(n1)

self.data.push(n4)

self.data.push(n3)

self.data.push(n2)

self.data.push(n1)

elif tok.value == 'rol':

"""ror

Rotates the values in the stack by moving the top

element to the bottom and shifting everything else up

one position.

Ex: rot x y z -> y z x

"""

item = self.data.pop()

last = self.data.que(item)

elif tok.value == 'ror':

"""rot

Rotates the values in the stack by moving the bottom

element to the top and pushing everything else down

one position.

Ex: rot x y z -> y z x

"""

last = self.data.tail()

self.data.push(last)

# Handle I/O Routines

elif tok.value == 'emit':

"""emit

The word EMIT takes a single ASCII representation on the

stack, using the low-order byte only, and sends the

character to stdout. For example, in decimal:

65 EMIT↵ A

66 EMIT↵ B

67 EMIT↵ C

"""

ch = self.data.pop()

if ch > 0x110000 or ch < 0:

raise ValueError(f'[Error] character popped by emit: {ch} is out of printable range!')

char = chr(ch)

print(char, end='')

elif tok.value == 'type':

"""<addr> <length> type

Sends a string to standard out. The string is located

in the string table at location <addr> and has a length

of <length>.

"""

length = int(self.data.pop())

addr = int(self.data.pop())

string = self.string_table.get_str(addr)

if length > len(string):

raise IndexError(f'Error: String length of {length} was given but actual length is {len(string)}.')

idx = 0

while idx < length:

print(string[idx], end='')

idx += 1

print()

elif tok.value == 'key':

"""key

Pauses execution until a key is pressed. Once a key

is pressed, the key code is pushed onto the stack.

Ex: <A> --> 65

"""

getch = sys.stdin.read

key = None

while not key:

key = getch(1)

self.data.push(ord(key))

# Helpful Constants

elif tok.value == 'cr': # Carriage Return

self.data.push(13)

elif tok.value == 'lf': # Line Feed

self.data.push(10)

elif tok.value == 'sp':

self.data.push(32)

elif tok.value == 'null': # NUll

self.data.push(0)

elif tok.value == 'bell': # System Bell

self.data.push(7)

elif tok.value == 'bs': # Backspace

self.data.push(8)

elif tok.value == 'tab': # TAB

self.data.push(9)

elif tok.value == 'vt': # Vertical TAB

self.data.push(11)

elif tok.value == 'ff': # Form Feed

self.data.push(12)

elif tok.value == 'esc': # Escape

self.data.push(27)

if __name__ == '__main__':

source = """ 16 2 >>

"This is a string"

dump

type dump

'C' 'B' 'A'

dump

"""

lexer = Lexer(source)

parser = Parser(lexer)

parser.parse()

#print(parser.data)

# parser.py import sys from lexer import Lexer from posts.post4.string_table import StringTable from stack import Stack from tokens import * ####################################### # Parser # ####################################### class Parser: def __init__(self, lexer: Lexer, data_stack: Stack = None, string_table=None): self.lexer = lexer # Setup data stack if data_stack: self.data = data_stack else: self.data = Stack() self.current_tok = Token(TT_START, None) # Setup string table if string_table is not None: self.string_table = string_table else: self.string_table = StringTable() def next_token(self): if self.current_tok != TT_EOF: self.current_tok = self.lexer.next_token() def parse(self): # get token and decide what to do with it tok = self.current_tok while tok.type != TT_EOF: self.next_token() tok = self.current_tok if tok.type == TT_START: continue if tok.type == TT_NUMBER: self.data.push(int(tok.value)) continue # Arithmetic elif tok.type == TT_PLUS: right = self.data.pop() left = self.data.pop() result = left + right self.data.push(result) elif tok.type == TT_MINUS: right = self.data.pop() left = self.data.pop() result = left - right self.data.push(result) elif tok.type == TT_MUL: right = self.data.pop() left = self.data.pop() result = left * right self.data.push(result) elif tok.type == TT_DIV: right = self.data.pop() left = self.data.pop() result = left // right self.data.push(result) elif tok.type == TT_POW: right = self.data.pop() left = self.data.pop() result = left ** right self.data.push(result) # Logical Operations elif tok.type == TT_INV: left = self.data.pop() left = ~left self.data.push(left) elif tok.type == TT_XOR: right = self.data.pop() left = self.data.pop() self.data.push((left ^ right)) elif tok.type == TT_AND: right = self.data.pop() left = self.data.pop() self.data.push((left & right)) elif tok.type == TT_OR: right = self.data.pop() left = self.data.pop() self.data.push((left | right)) elif tok.type == TT_LSL: right = self.data.pop() left = self.data.pop() self.data.push(int(left) << int(right)) elif tok.type == TT_LSR: right = self.data.pop() left = self.data.pop() self.data.push((left >> right)) elif tok.type == TT_STRING: string = self.current_tok.value addr = self.string_table.save(string) self.data.push(addr) self.data.push(len(string)) elif tok.type == TT_CHAR: char = self.current_tok.value ascii_code = ord(char) self.data.push(ascii_code) # Handle Keywords elif tok.type == TT_KEYWORD: # Process keyword if tok.value == 'dump': """dump Display the data stack contents in the console """ print(str(self.data)) elif tok.value == 'drop': """pop Remove the top value from the stack and discard it. """ self.data.pop() elif tok.value == 'clear': """clear Empty the stack. """ self.data.clear() elif tok.value.lower() == 'swap': """swap Swaps the position of the top two items on the stack. Ex: x y z --> x z y """ if self.data.count() < 2: raise IndexError( f'Attempt to SWAP on stack of size {self.data.count()}!\n\t\t\tStack must contain at least 2 elements to SWAP.') n1 = self.data.pop() n2 = self.data.pop() self.data.push(n1) self.data.push(n2) elif tok.value == 'swap2': """swap2 Swaps the position of the top two pairs of items on the stack. Ex: w x y z --> y z w x """ if self.data.count() < 4: raise IndexError( f'Attempt to SWAP on stack of size {self.data.count()}!\n\t\t\tStack must contain at least 4 elements to SWAP2.') n3 = self.data.pop() n2 = self.data.pop() n1 = self.data.pop() n0 = self.data.pop() self.data.push(n2) self.data.push(n3) self.data.push(n0) self.data.push(n1) elif tok.value == 'over': """over Copies the second item on the stack to the head position. Ex: x y z --> x y z y """ top = self.data.pop() next = self.data.pop() self.data.push(next) self.data.push(top) self.data.push(next) elif tok.value == 'over2': """over Copies the second item on the stack to the head position. Ex: x y z --> x y z y """ top = self.data.pop() sec = self.data.pop() next = self.data.pop() self.data.push(next) self.data.push(top) self.data.push(next) elif tok.value == 'dup2': """dup2 Duplicates the top two items on the stack. Ex: x y z --> x y z y z """ n1 = self.data.pop() n2 = self.data.pop() self.data.push(n2) self.data.push(n1) self.data.push(n2) self.data.push(n1) elif tok.value == 'dup3': """dup3 Duplicates the top three items on the stack. Ex: x y z --> x y z x y z """ n1 = self.data.pop() n2 = self.data.pop() n3 = self.data.pop() self.data.push(n3) self.data.push(n2) self.data.push(n1) self.data.push(n3) self.data.push(n2) self.data.push(n1) elif tok.value == 'dup4': """dup3 Duplicates the top three items on the stack. Ex: w x y z --> w x y z w x y z """ n1 = self.data.pop() n2 = self.data.pop() n3 = self.data.pop() n4 = self.data.pop() self.data.push(n4) self.data.push(n3) self.data.push(n2) self.data.push(n1) self.data.push(n4) self.data.push(n3) self.data.push(n2) self.data.push(n1) elif tok.value == 'rol': """ror Rotates the values in the stack by moving the top element to the bottom and shifting everything else up one position. Ex: rot x y z -> y z x """ item = self.data.pop() last = self.data.que(item) elif tok.value == 'ror': """rot Rotates the values in the stack by moving the bottom element to the top and pushing everything else down one position. Ex: rot x y z -> y z x """ last = self.data.tail() self.data.push(last) # Handle I/O Routines elif tok.value == 'emit': """emit The word EMIT takes a single ASCII representation on the stack, using the low-order byte only, and sends the character to stdout. For example, in decimal: 65 EMIT↵ A 66 EMIT↵ B 67 EMIT↵ C """ ch = self.data.pop() if ch > 0x110000 or ch < 0: raise ValueError(f'[Error] character popped by emit: {ch} is out of printable range!') char = chr(ch) print(char, end='') elif tok.value == 'type': """<addr> <length> type Sends a string to standard out. The string is located in the string table at location <addr> and has a length of <length>. """ length = int(self.data.pop()) addr = int(self.data.pop()) string = self.string_table.get_str(addr) if length > len(string): raise IndexError(f'Error: String length of {length} was given but actual length is {len(string)}.') idx = 0 while idx < length: print(string[idx], end='') idx += 1 print() elif tok.value == 'key': """key Pauses execution until a key is pressed. Once a key is pressed, the key code is pushed onto the stack. Ex: <A> --> 65 """ getch = sys.stdin.read key = None while not key: key = getch(1) self.data.push(ord(key)) # Helpful Constants elif tok.value == 'cr': # Carriage Return self.data.push(13) elif tok.value == 'lf': # Line Feed self.data.push(10) elif tok.value == 'sp': self.data.push(32) elif tok.value == 'null': # NUll self.data.push(0) elif tok.value == 'bell': # System Bell self.data.push(7) elif tok.value == 'bs': # Backspace self.data.push(8) elif tok.value == 'tab': # TAB self.data.push(9) elif tok.value == 'vt': # Vertical TAB self.data.push(11) elif tok.value == 'ff': # Form Feed self.data.push(12) elif tok.value == 'esc': # Escape self.data.push(27) if __name__ == '__main__': source = """ 16 2 >> "This is a string" dump type dump 'C' 'B' 'A' dump """ lexer = Lexer(source) parser = Parser(lexer) parser.parse() #print(parser.data)

# parser.py
import sys

from lexer import Lexer
from posts.post4.string_table import StringTable
from stack import Stack
from tokens import *

#######################################
# Parser                              #
#######################################

class Parser:

    def __init__(self, lexer: Lexer, data_stack: Stack = None, string_table=None):
        self.lexer = lexer

        # Setup data stack
        if data_stack:
            self.data = data_stack
        else:
            self.data = Stack()

        self.current_tok = Token(TT_START, None)

        # Setup string table
        if string_table is not None:
            self.string_table = string_table
        else:
            self.string_table = StringTable()

    def next_token(self):
        if self.current_tok != TT_EOF:
            self.current_tok = self.lexer.next_token()

    def parse(self):
        # get token and decide what to do with it
        tok = self.current_tok

        while tok.type != TT_EOF:
            self.next_token()
            tok = self.current_tok

            if tok.type == TT_START:
                continue

            if tok.type == TT_NUMBER:
                self.data.push(int(tok.value))
                continue

            # Arithmetic
            elif tok.type == TT_PLUS:
                right = self.data.pop()
                left = self.data.pop()
                result = left + right
                self.data.push(result)

            elif tok.type == TT_MINUS:
                right = self.data.pop()
                left = self.data.pop()
                result = left - right
                self.data.push(result)

            elif tok.type == TT_MUL:
                right = self.data.pop()
                left = self.data.pop()
                result = left * right
                self.data.push(result)

            elif tok.type == TT_DIV:
                right = self.data.pop()
                left = self.data.pop()
                result = left // right
                self.data.push(result)

            elif tok.type == TT_POW:
                right = self.data.pop()
                left = self.data.pop()
                result = left ** right
                self.data.push(result)

            # Logical Operations
            elif tok.type == TT_INV:
                left = self.data.pop()
                left = ~left
                self.data.push(left)

            elif tok.type == TT_XOR:
                right = self.data.pop()
                left = self.data.pop()
                self.data.push((left ^ right))

            elif tok.type == TT_AND:
                right = self.data.pop()
                left = self.data.pop()
                self.data.push((left & right))

            elif tok.type == TT_OR:
                right = self.data.pop()
                left = self.data.pop()
                self.data.push((left | right))

            elif tok.type == TT_LSL:
                right = self.data.pop()
                left = self.data.pop()
                self.data.push(int(left) << int(right))

            elif tok.type == TT_LSR:
                right = self.data.pop()
                left = self.data.pop()
                self.data.push((left >> right))

            elif tok.type == TT_STRING:
                string = self.current_tok.value
                addr = self.string_table.save(string)
                self.data.push(addr)
                self.data.push(len(string))

            elif tok.type == TT_CHAR:
                char = self.current_tok.value
                ascii_code = ord(char)
                self.data.push(ascii_code)

            # Handle Keywords
            elif tok.type == TT_KEYWORD:
                # Process keyword
                if tok.value == 'dump':
                    """dump
                    Display the data stack contents in the console
                    """
                    print(str(self.data))

                elif tok.value == 'drop':
                    """pop
                    Remove the top value from the stack and discard it.
                    """
                    self.data.pop()

                elif tok.value == 'clear':
                    """clear
                    Empty the stack.
                    """
                    self.data.clear()

                elif tok.value.lower() == 'swap':
                    """swap
                    Swaps the position of the top two items on the stack.
                    Ex: x y z --> x z y
                    """
                    if self.data.count() < 2:
                        raise IndexError(
                            f'Attempt to SWAP on stack of size {self.data.count()}!\n\t\t\tStack must contain at least 2 elements to SWAP.')
                    n1 = self.data.pop()
                    n2 = self.data.pop()
                    self.data.push(n1)
                    self.data.push(n2)

                elif tok.value == 'swap2':
                    """swap2
                    Swaps the position of the top two pairs of items on the stack.
                    Ex: w x y z --> y z w x
                    """
                    if self.data.count() < 4:
                        raise IndexError(
                            f'Attempt to SWAP on stack of size {self.data.count()}!\n\t\t\tStack must contain at least 4 elements to SWAP2.')
                    n3 = self.data.pop()
                    n2 = self.data.pop()
                    n1 = self.data.pop()
                    n0 = self.data.pop()
                    self.data.push(n2)
                    self.data.push(n3)
                    self.data.push(n0)
                    self.data.push(n1)

                elif tok.value == 'over':
                    """over
                    Copies the second item on the stack to the head position.
                    Ex: x y z --> x y z y
                    """
                    top = self.data.pop()
                    next = self.data.pop()
                    self.data.push(next)
                    self.data.push(top)
                    self.data.push(next)

                elif tok.value == 'over2':
                    """over
                    Copies the second item on the stack to the head position.
                    Ex: x y z --> x y z y
                    """
                    top = self.data.pop()
                    sec = self.data.pop()
                    next = self.data.pop()
                    self.data.push(next)
                    self.data.push(top)
                    self.data.push(next)

                elif tok.value == 'dup2':
                    """dup2
                    Duplicates the top two items on the stack.
                    Ex: x y z --> x y z y z
                    """
                    n1 = self.data.pop()
                    n2 = self.data.pop()
                    self.data.push(n2)
                    self.data.push(n1)
                    self.data.push(n2)
                    self.data.push(n1)

                elif tok.value == 'dup3':
                    """dup3
                    Duplicates the top three items on the stack.
                    Ex: x y z --> x y z x y z
                    """
                    n1 = self.data.pop()
                    n2 = self.data.pop()
                    n3 = self.data.pop()
                    self.data.push(n3)
                    self.data.push(n2)
                    self.data.push(n1)
                    self.data.push(n3)
                    self.data.push(n2)
                    self.data.push(n1)

                elif tok.value == 'dup4':
                    """dup3
                    Duplicates the top three items on the stack.
                    Ex: w x y z --> w x y z w x y z 
                    """
                    n1 = self.data.pop()
                    n2 = self.data.pop()
                    n3 = self.data.pop()
                    n4 = self.data.pop()
                    self.data.push(n4)
                    self.data.push(n3)
                    self.data.push(n2)
                    self.data.push(n1)
                    self.data.push(n4)
                    self.data.push(n3)
                    self.data.push(n2)
                    self.data.push(n1)

                elif tok.value == 'rol':
                    """ror
                    Rotates the values in the stack by moving the top
                    element to the bottom and shifting everything else up 
                    one position. 
                    Ex: rot x y z -> y z x
                    """
                    item = self.data.pop()
                    last = self.data.que(item)

                elif tok.value == 'ror':
                    """rot
                    Rotates the values in the stack by moving the bottom
                    element to the top and pushing everything else down 
                    one position. 
                    Ex: rot x y z -> y z x
                    """
                    last = self.data.tail()
                    self.data.push(last)

                # Handle I/O Routines
                elif tok.value == 'emit':
                    """emit
                    The word EMIT takes a single ASCII representation on the 
                    stack, using the low-order byte only, and sends the 
                    character to stdout. For example, in decimal:

                        65 EMIT↵ A  
                        66 EMIT↵ B  
                        67 EMIT↵ C  
                    """
                    ch = self.data.pop()
                    if ch > 0x110000 or ch < 0:
                        raise ValueError(f'[Error] character popped by emit: {ch} is out of printable range!')
                    char = chr(ch)
                    print(char, end='')

                elif tok.value == 'type':
                    """<addr> <length> type
                    Sends a string to standard out. The string is located
                    in the string table at location <addr> and has a length
                    of <length>.
                    """
                    length = int(self.data.pop())
                    addr = int(self.data.pop())

                    string = self.string_table.get_str(addr)
                    if length > len(string):
                        raise IndexError(f'Error: String length of {length} was given but actual length is {len(string)}.')

                    idx = 0
                    while idx < length:
                        print(string[idx], end='')
                        idx += 1
                    print()

                elif tok.value == 'key':
                    """key
                    Pauses execution until a key is pressed. Once a key 
                    is pressed, the key code is pushed onto the stack.
                    Ex: <A> --> 65
                    """
                    getch = sys.stdin.read
                    key = None
                    while not key:
                        key = getch(1)
                    self.data.push(ord(key))

                # Helpful Constants
                elif tok.value == 'cr':  # Carriage Return
                    self.data.push(13)

                elif tok.value == 'lf':  # Line Feed
                    self.data.push(10)

                elif tok.value == 'sp':
                    self.data.push(32)

                elif tok.value == 'null':  # NUll
                    self.data.push(0)

                elif tok.value == 'bell':  # System Bell
                    self.data.push(7)

                elif tok.value == 'bs':  # Backspace
                    self.data.push(8)

                elif tok.value == 'tab':  # TAB
                    self.data.push(9)

                elif tok.value == 'vt':  # Vertical TAB
                    self.data.push(11)

                elif tok.value == 'ff':  # Form Feed
                    self.data.push(12)

                elif tok.value == 'esc':  # Escape
                    self.data.push(27)


if __name__ == '__main__':
    source = """ 16 2 >> 
    "This is a string" 
    dump
    type dump
    'C' 'B' 'A'
    dump
    """
    lexer = Lexer(source)
    parser = Parser(lexer)
    parser.parse()
    #print(parser.data)

#! usr/bin/env/python3

# sol.py

from lexer import Lexer

from parser import Parser

import argparse

def main(src):

_lexer = Lexer(src)

_parser = Parser(_lexer)

_parser.parse()

if __name__ == '__main__':

argparse = argparse.ArgumentParser()

argparse.add_argument("-i", "--infile",

help="Input source file", required=True)

argparse.add_argument("-v", "--verbose", help="Echo source file", action="store_true")

args = argparse.parse_args()

infile = args.infile

sorce = ''

with open(infile, 'r') as ifh:

source = ifh.read()

if args.verbose:

print(source)

main(source)

#! usr/bin/env/python3 # sol.py from lexer import Lexer from parser import Parser import argparse def main(src): _lexer = Lexer(src) _parser = Parser(_lexer) _parser.parse() if __name__ == '__main__': argparse = argparse.ArgumentParser() argparse.add_argument("-i", "--infile", help="Input source file", required=True) argparse.add_argument("-v", "--verbose", help="Echo source file", action="store_true") args = argparse.parse_args() infile = args.infile sorce = '' with open(infile, 'r') as ifh: source = ifh.read() if args.verbose: print(source) main(source)

#! usr/bin/env/python3

# sol.py

from lexer import Lexer
from parser import Parser
import argparse


def main(src):
    _lexer = Lexer(src)
    _parser = Parser(_lexer)
    _parser.parse()



if __name__ == '__main__':
    argparse = argparse.ArgumentParser()
    argparse.add_argument("-i", "--infile",
                        help="Input source file", required=True)
    argparse.add_argument("-v", "--verbose", help="Echo source file", action="store_true")
    args = argparse.parse_args()

    infile = args.infile
    sorce = ''
    with open(infile, 'r') as ifh:
        source = ifh.read()
        if args.verbose:
            print(source)

    main(source)

# stack.py

class Stack:

def __init__(self, report_errors=False):

self.store = []

self.report_error = report_errors

def push(self, item):

self.store.append(item)

def pop(self):

if self.store:

return self.store.pop()

else:

if self.report_error:

raise IndexError('Attempt to pop a value from empty stack!')

else:

return None

def tail(self):

if self.store:

return self.store.pop(0)

else:

return None

def clear(self):

self.store.clear()

def count(self):

return len(self.store)

def is_empty(self):

return not bool(self.store)

def __str__(self):

s = '[ '

for v in self.store:

s += "'" + str(v) + "', "

s = s[:-2]

s += ' ]'

return s

def __repr__(self):

return self.__str__()

# stack.py class Stack: def __init__(self, report_errors=False): self.store = [] self.report_error = report_errors def push(self, item): self.store.append(item) def pop(self): if self.store: return self.store.pop() else: if self.report_error: raise IndexError('Attempt to pop a value from empty stack!') else: return None def tail(self): if self.store: return self.store.pop(0) else: return None def clear(self): self.store.clear() def count(self): return len(self.store) def is_empty(self): return not bool(self.store) def __str__(self): s = '[ ' for v in self.store: s += "'" + str(v) + "', " s = s[:-2] s += ' ]' return s def __repr__(self): return self.__str__()

# stack.py

class Stack:
    def __init__(self, report_errors=False):
        self.store = []
        self.report_error = report_errors

    def push(self, item):
        self.store.append(item)

    def pop(self):
        if self.store:
            return self.store.pop()
        else:
            if self.report_error:
                raise IndexError('Attempt to pop a value from empty stack!')
            else:
                return None

    def tail(self):
        if self.store:
            return self.store.pop(0)
        else:
            return None

    def clear(self):
        self.store.clear()

    def count(self):
        return len(self.store)

    def is_empty(self):
        return not bool(self.store)

    def __str__(self):
        s = '[ '
        for v in self.store:
            s += "'" + str(v) + "', "
        s = s[:-2]
        s += ' ]'
        return s

    def __repr__(self):
        return self.__str__()

class StringTable:

def __init__(self):

self.data = []

# Insert string and meta data into string list

def save(self, string):

entry = {'len': len(string), 'val': string}

self.data.append(entry)

return (len(self.data) - 1)

# Return string storage object (including length)

def get(self, idx):

return self.data[idx]

# Return length of string

def length(self, idx):

return int(self.data[idx]['len'])

# Return printable string only (no meta data).

def get_str(self, idx):

return str(self.data[idx]['val'])

def __str__(self):

return str(self.data)

class StringTable: def __init__(self): self.data = [] # Insert string and meta data into string list def save(self, string): entry = {'len': len(string), 'val': string} self.data.append(entry) return (len(self.data) - 1) # Return string storage object (including length) def get(self, idx): return self.data[idx] # Return length of string def length(self, idx): return int(self.data[idx]['len']) # Return printable string only (no meta data). def get_str(self, idx): return str(self.data[idx]['val']) def __str__(self): return str(self.data)

class StringTable:

    def __init__(self):
        self.data = []

    # Insert string and meta data into string list
    def save(self, string):
        entry = {'len': len(string), 'val': string}
        self.data.append(entry)
        return (len(self.data) - 1)

    # Return string storage object (including length)
    def get(self, idx):
        return self.data[idx]

    # Return length of string
    def length(self, idx):
        return int(self.data[idx]['len'])

    # Return printable string only (no meta data).
    def get_str(self, idx):
        return str(self.data[idx]['val'])

    def __str__(self):
        return str(self.data)

# tokens.py

# Data

TT_CHAR = 'CHAR' # 'c'

TT_STRING = 'STRING' # "string"

TT_NUMBER = 'NUMBER' # 1234 (integers only

TT_ID = 'ID'

# Arithmatic Operators

TT_ASSIGN = 'ASSIGN' # '='

TT_PLUS = 'PLUS' # '+'

TT_MINUS = 'MINUS' # '-'

TT_MUL = 'MUL' # '*'

TT_DIV = 'DIV' # '/'

TT_POW = 'POW' # '**"

# Bitwise Operators

TT_LSR = 'LSR' # '>>'

TT_LSL = 'LSL' # '<<'

TT_AND = 'AND' # '&' Bitwise AND

TT_OR = 'OR' # '|' Bitwise OR

TT_XOR = 'XOR' # '^' Bitwise XOR

TT_INV = 'INV' # '~' Bitwise NOT/Invert

# Relational Operators

TT_EQ = 'EQ' # '==' Is Equal?

TT_LESS = 'LESS' # '<' Is Less?

TT_GREATER = 'GREATER' # '>' Is greater?

# Logical Operators

TT_LAND = 'LAND' # AND Logical

TT_LOR = 'LOR' # OR Logical

TT_NEQ = 'NEQ' # NEQ Logical

# DEFINE Functions

TT_STARTDEF = 'STARTDEFINE' # ':'

TT_ENDDEF = 'ENDDEFINE' # ';'

# INTERNAL USE

TT_KEYWORD = 'KEYWORD'

TT_START = "START"

TT_EOF = 'EOF'

class Token:

def __init__(self, _type, _value, line=None, col=None):

self.type = _type

self.value = _value

self.col = col

self.line = line

def __str__(self):

return f'''type: {self.type}, value: {self.value},

at line {self.line} in column {self.col}'''

def __repr__(self):

return self.__str__()

# tokens.py # Data TT_CHAR = 'CHAR' # 'c' TT_STRING = 'STRING' # "string" TT_NUMBER = 'NUMBER' # 1234 (integers only TT_ID = 'ID' # Arithmatic Operators TT_ASSIGN = 'ASSIGN' # '=' TT_PLUS = 'PLUS' # '+' TT_MINUS = 'MINUS' # '-' TT_MUL = 'MUL' # '*' TT_DIV = 'DIV' # '/' TT_POW = 'POW' # '**" # Bitwise Operators TT_LSR = 'LSR' # '>>' TT_LSL = 'LSL' # '<<' TT_AND = 'AND' # '&' Bitwise AND TT_OR = 'OR' # '|' Bitwise OR TT_XOR = 'XOR' # '^' Bitwise XOR TT_INV = 'INV' # '~' Bitwise NOT/Invert # Relational Operators TT_EQ = 'EQ' # '==' Is Equal? TT_LESS = 'LESS' # '<' Is Less? TT_GREATER = 'GREATER' # '>' Is greater? # Logical Operators TT_LAND = 'LAND' # AND Logical TT_LOR = 'LOR' # OR Logical TT_NEQ = 'NEQ' # NEQ Logical # DEFINE Functions TT_STARTDEF = 'STARTDEFINE' # ':' TT_ENDDEF = 'ENDDEFINE' # ';' # INTERNAL USE TT_KEYWORD = 'KEYWORD' TT_START = "START" TT_EOF = 'EOF' class Token: def __init__(self, _type, _value, line=None, col=None): self.type = _type self.value = _value self.col = col self.line = line def __str__(self): return f'''type: {self.type}, value: {self.value}, at line {self.line} in column {self.col}''' def __repr__(self): return self.__str__()

# tokens.py


# Data
TT_CHAR     = 'CHAR'    # 'c'
TT_STRING   = 'STRING'  # "string"
TT_NUMBER   = 'NUMBER'  # 1234 (integers only
TT_ID       = 'ID'

# Arithmatic Operators
TT_ASSIGN   = 'ASSIGN'  #   '='
TT_PLUS     = 'PLUS'    #   '+'
TT_MINUS    = 'MINUS'   #   '-'
TT_MUL      = 'MUL'     #   '*'
TT_DIV      = 'DIV'     #   '/'
TT_POW      = 'POW'     #   '**"

# Bitwise Operators
TT_LSR      = 'LSR'     #   '>>'
TT_LSL      = 'LSL'     #   '<<'
TT_AND     = 'AND'    #   '&' Bitwise AND
TT_OR      = 'OR'     #   '|' Bitwise OR
TT_XOR     = 'XOR'     #   '^' Bitwise XOR
TT_INV     = 'INV'     #   '~' Bitwise NOT/Invert

# Relational Operators
TT_EQ       = 'EQ'      #   '==' Is Equal?
TT_LESS     = 'LESS'    #   '<'  Is Less?
TT_GREATER  = 'GREATER' #   '>' Is greater?

# Logical Operators
TT_LAND  = 'LAND'   #   AND Logical
TT_LOR   = 'LOR'    #   OR Logical
TT_NEQ   = 'NEQ'    #   NEQ Logical

# DEFINE Functions
TT_STARTDEF = 'STARTDEFINE' #   ':'
TT_ENDDEF   = 'ENDDEFINE'   #   ';'

# INTERNAL USE
TT_KEYWORD  = 'KEYWORD'
TT_START    = "START"
TT_EOF      = 'EOF'


class Token:
    def __init__(self, _type, _value, line=None, col=None):
        self.type = _type
        self.value = _value
        self.col = col
        self.line = line

    def __str__(self):
        return f'''type: {self.type}, value: {self.value}, 
                at line {self.line} in column {self.col}'''

    def __repr__(self):
        return self.__str__()

\ file: test.sol

\ Expected Output

\ -----------------------

\ [ '2', '3' ]

\ [ '5' ]

\ [ '5', '5' ]

\ [ '10' ]

\ [ '10', '6' ]

2 3 dump + dump

5 dump + dump

6 dump * dump

\ file: test.sol \ \ Expected Output \ ----------------------- \ [ '2', '3' ] \ [ '5' ] \ [ '5', '5' ] \ [ '10' ] \ [ '10', '6' ] \ 2 3 dump + dump 5 dump + dump 6 dump * dump

\ file: test.sol
\
\ Expected Output
\ -----------------------
\ [ '2', '3' ]
\ [ '5' ]
\ [ '5', '5' ]
\ [ '10' ]
\ [ '10', '6' ]
\

2 3 dump + dump
5 dump + dump
6 dump * dump

Until next time, Happy Coding!

Series Navigation<< Implementing Stack Oriented Languages – Part 3

Implementing Stack Oriented Languages – Part 4

Post Stastics

Strings and Variable

Leave a Reply Cancel reply

Recent Posts

Recent Comments

Archives

Categories

Meta