The Symbol Table
Well, it's been just over a year since my last blog post. I've been working on the Symbol Table (and also changing jobs, so a bit distracted).
Review of Grammar
I've had the standard ANTLR grammar reviewed and all looks good. So I progressed with the main symbol table work. All this code is up on GitHub.
What is a Symbol Table?
What do I intend to use the SymbolTable for exactly? Well, source code is broken up into Tokens, and those Tokens are used by the Parser to match up against the grammar. In effect ANTLR4 will build an 'abstract syntax tree' (AST) for me.
The ANTLR4 API then allows me to plug in 'visitors' and/or 'listeners'.
So when the AST is created from some EK9 code like:
... someIntegerValue <- 42 ...
Here is what the AST looks like:
This is where it starts to get interesting for me. There are 'built-in' concepts, such as Integer, that the EK9 language just must 'know about'. But how does it know about them? Also, how does the compiler 'know' 42 is an Integer?
If you look at the AST above, you can see that the great work done in ANTLR4 by Terence Parr, together with the EK9 grammar, enables me to 'see' that 42 is an 'integerLit', i.e. a literal of type Integer.
More importantly, when a developer creates a new 'class' or 'function', the compiler must know about those new types as well.
This is where a symbol table comes in. Those Symbols are actually types and so need to be recorded somewhere as types, so that when we declare a variable of a particular type we can resolve it.
So for the compiler to be able to deal with a statement as simple as someIntegerValue <- 42, I need to have defined:
- The grammar
- The Lexer
- The Parser
- The ANTLR4 Visitor
- A Symbol for type Integer
- A SymbolTable for EK9 where that Integer type can be recorded
- Also a SymbolTable where the variable 'someIntegerValue' can be recorded and linked to its type
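To make the last three items in that list concrete, here is a minimal sketch in Java. It is purely illustrative: the class names (BasicSymbolTableSketch, Symbol, SymbolTable) and their methods are assumptions made up for this example and are not taken from the EK9 code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; none of these names come from the EK9 compiler. */
final class BasicSymbolTableSketch {

  /** A named thing the compiler knows about; a variable also points at its type. */
  static final class Symbol {
    final String name;
    final Symbol type; // null for a type symbol in this simplified sketch

    Symbol(String name, Symbol type) {
      this.name = name;
      this.type = type;
    }
  }

  /** A flat table mapping names to symbols. */
  static final class SymbolTable {
    private final Map<String, Symbol> symbols = new LinkedHashMap<>();

    void define(Symbol symbol) {
      symbols.put(symbol.name, symbol);
    }

    Optional<Symbol> resolve(String name) {
      return Optional.ofNullable(symbols.get(name));
    }
  }

  public static void main(String[] args) {
    // The built-in table records the Integer type.
    SymbolTable builtIn = new SymbolTable();
    builtIn.define(new Symbol("Integer", null));

    // Having seen an 'integerLit', look up Integer and record the
    // variable against that type in a second table.
    SymbolTable module = new SymbolTable();
    Symbol integerType = builtIn.resolve("Integer").orElseThrow();
    module.define(new Symbol("someIntegerValue", integerType));

    Symbol variable = module.resolve("someIntegerValue").orElseThrow();
    System.out.println(variable.name + " has type " + variable.type.name);
  }
}
```

The point of the second table is only that the variable and the built-in type live in different places, yet the variable still links back to its type.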
But as soon as I want to do something with the variable someIntegerValue, like add another integer value to it, I'll need some operators on the Integer type. For that I'll need the type Integer to be an Aggregate (with methods/operators).
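As a rough sketch of what 'a type with operators' could look like, the Java below models an aggregate as a type symbol that carries a list of method symbols. The names AggregateSymbol and MethodSymbol, and the use of plain strings for type names, are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch; none of these names come from the EK9 compiler. */
final class AggregateSketch {

  /** An operator or method: a name, parameter type names and a return type name. */
  static final class MethodSymbol {
    final String name;
    final List<String> parameterTypes;
    final String returnType;

    MethodSymbol(String name, List<String> parameterTypes, String returnType) {
      this.name = name;
      this.parameterTypes = parameterTypes;
      this.returnType = returnType;
    }
  }

  /** A type that carries its own methods and operators. */
  static final class AggregateSymbol {
    final String name;
    private final List<MethodSymbol> methods = new ArrayList<>();

    AggregateSymbol(String name) {
      this.name = name;
    }

    void define(MethodSymbol method) {
      methods.add(method);
    }

    /** Finds a method by name and exact parameter types. */
    Optional<MethodSymbol> resolve(String methodName, List<String> argumentTypes) {
      return methods.stream()
          .filter(m -> m.name.equals(methodName) && m.parameterTypes.equals(argumentTypes))
          .findFirst();
    }
  }

  public static void main(String[] args) {
    AggregateSymbol integer = new AggregateSymbol("Integer");
    integer.define(new MethodSymbol("+", List.of("Integer"), "Integer"));

    // Adding an Integer to someIntegerValue means finding '+' on Integer.
    MethodSymbol plus = integer.resolve("+", List.of("Integer")).orElseThrow();
    System.out.println("Integer + Integer gives " + plus.returnType);
  }
}
```

Exact matching on parameter types keeps the sketch short; a real compiler would also have to consider things like type compatibility, which is left out here.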
Bootstrapping the SymbolTable
There is more than one SymbolTable; in fact there will be thousands. But there is only one main global SymbolTable that is part of the EK9 language. So the first job I have to do is define the concept of the SymbolTable and the idea of a Scope. A SymbolTable has a Scope; this is sort of like a 'prefix', or in programming terms a 'namespace'. So it has a 'name', and these names have to be unique, like a module name for example.
It is within this Scope that we need to define a Symbol. Now we can just keep a list of these Symbols. Whether two Symbols clash may depend on the kind of Symbol being defined.
So for example, I've described Integer; clearly, as types, these must be unique. But let's consider Methods on a Class, where we allow method overloading.
In that case, we would have several Methods (which will also be Symbols) with the same name, but with different parameters, in the same Scope.
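One way to picture that rule is to give every Symbol a 'clash key' that depends on its kind: a type clashes on its name alone, while a method clashes on its name plus its parameter types. The Java below is only a sketch of that idea under those assumptions; Kind, Symbol.key() and Scope.define() are invented names, and the types used in main are just examples.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the EK9 compiler. */
final class ClashRulesSketch {

  enum Kind { TYPE, METHOD }

  static final class Symbol {
    final Kind kind;
    final String name;
    final List<String> parameterTypes; // empty for types

    Symbol(Kind kind, String name, List<String> parameterTypes) {
      this.kind = kind;
      this.name = name;
      this.parameterTypes = parameterTypes;
    }

    /** Types clash on name alone; methods clash on name plus parameter types. */
    String key() {
      return kind == Kind.METHOD
          ? name + "(" + String.join(",", parameterTypes) + ")"
          : name;
    }
  }

  static final class Scope {
    final String scopeName;
    private final Map<String, Symbol> symbols = new HashMap<>();

    Scope(String scopeName) {
      this.scopeName = scopeName;
    }

    /** Returns false, leaving the scope unchanged, if the symbol clashes. */
    boolean define(Symbol symbol) {
      return symbols.putIfAbsent(symbol.key(), symbol) == null;
    }
  }

  public static void main(String[] args) {
    Scope scope = new Scope("SomeClass");

    // A second type called Integer clashes with the first.
    System.out.println(scope.define(new Symbol(Kind.TYPE, "Integer", List.of())));
    System.out.println(scope.define(new Symbol(Kind.TYPE, "Integer", List.of())));

    // Overloads share a name but differ in parameters, so both are accepted.
    System.out.println(scope.define(new Symbol(Kind.METHOD, "add", List.of("Integer"))));
    System.out.println(scope.define(new Symbol(Kind.METHOD, "add", List.of("String"))));

    // The same name with the same parameters is a genuine clash.
    System.out.println(scope.define(new Symbol(Kind.METHOD, "add", List.of("Integer"))));
  }
}
```

Run as written, this prints true, false, true, true, false.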
So that's the job I'm working on at the moment: bootstrapping the main EK9
Symbol Table with all the standard built-in types and then some of the main built-in
In my prototype compilers I actually used Java and reflection to do this, but now that I'm moving to the first reference compiler I'll define the Symbols in terms of Aggregates in a more abstract way. Only if the final compiled output targets Java will the appropriate Java class be employed. This will allow me to target different runtimes. Initially it will be Java, but I want to be able to target LLVM as well (at least).
Once you have SymbolTables, the next thing you need to be able to do after defining a Symbol is resolve a Symbol. But you have to bear in mind that programs tend to have nested structures. For example:
- Global (built-in EK9)
- module (the developers application)
So when you want to resolve a symbol from within the block above, it is necessary to resolve Symbols back up that nested structure, right up to the Global (built-in EK9 types).
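Resolving 'back up the nested structure' can be sketched as each scope holding a reference to the scope that encloses it, with lookups walking outwards until a match is found or the global scope is exhausted. The Java below is a simplified illustration under that assumption; the Scope class, the scope names and the string descriptions stored for each symbol are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; none of these names come from the EK9 compiler. */
final class NestedScopeSketch {

  static final class Scope {
    final String name;
    private final Scope enclosing; // null for the global scope
    private final Map<String, String> symbols = new HashMap<>(); // symbol name -> description

    Scope(String name, Scope enclosing) {
      this.name = name;
      this.enclosing = enclosing;
    }

    void define(String symbolName, String description) {
      symbols.put(symbolName, description);
    }

    /** Looks in this scope first, then walks outwards through the enclosing scopes. */
    Optional<String> resolve(String symbolName) {
      for (Scope scope = this; scope != null; scope = scope.enclosing) {
        String found = scope.symbols.get(symbolName);
        if (found != null) {
          return Optional.of(found);
        }
      }
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    Scope global = new Scope("global", null);
    global.define("Integer", "built-in type");

    Scope module = new Scope("my.application", global);
    module.define("someIntegerValue", "variable of type Integer");

    Scope block = new Scope("block", module);

    // Neither name is defined in the block itself; both are found further out.
    System.out.println(block.resolve("someIntegerValue").orElse("unresolved"));
    System.out.println(block.resolve("Integer").orElse("unresolved"));
    System.out.println(block.resolve("Missing").orElse("unresolved"));
  }
}
```

Because the innermost scope is checked first, a name defined in a block would hide the same name defined further out, which is the usual behaviour for nested scopes.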
So that's where I am at the moment, writing lots of Scopes and many tests. I hope to get the bulk of the SymbolTable done this year, but it'll depend on how draining my new job is!