Five Years Later

Having spent quite a bit of time defining the EK9 language, I started writing the compiler around 2020. Today, on the 2nd of October 2025, I have ‘hello world’ working from end to end. At the moment it just runs on a Java VM, but in time it will also be native.

Why so long?

Well, I went down quite a few rabbit holes in the ‘front-end’ of the compiler, to be honest. I was so interested in detecting developer issues and reporting errors that I just carried on. The strange thing is there are still more to pick up, like detecting missing dependency injection variables with ‘Applications’. I also want to work out cohesion and coupling and emit errors if the values are too low/high. I’ve already done this with complexity, and I think it will stop AI producing poor quality code.

In fact, it is this latter area of using AI that I’m most interested in, because I’ve already used Claude Code to write some EK9 source examples, and with a few adjustments in the compiler I managed to change its behaviour so that it produced more accurate code. I have the feeling I’m only skimming the surface of this. It’s really quite addictive and interesting to be able to automatically get an AI to alter its approach to code writing via compiler error messages. I see great potential here.

The Intermediate Representation

To be frank, it took me about three goes to get the right level of IR. I started off far too low, then I pulled back too much, so it was too close to EK9. Then I found I had issues in being able to target both the JVM (which is stack based) and LLVM (which is Static Single Assignment based). Plus I had to work out how I could do memory management via LLVM (as the JVM is garbage collection based). So a single Medium Level Intermediate Representation was my final approach. About 90% of it is still to do! But there is enough for hello world.
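To make the stack versus SSA tension concrete, here is a minimal sketch that lowers one toy expression both ways. It is not the EK9 IR: the class name `LoweringSketch`, the `Expr` types, and the instruction spellings are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy expression tree lowered two ways; an illustration only, not the EK9 IR. */
final class LoweringSketch {

  /** A tiny expression: either a constant or a binary '+' / '*' operation. */
  interface Expr {}
  record Const(int value) implements Expr {}
  record BinOp(char op, Expr left, Expr right) implements Expr {}

  private static String mnemonic(char op) {
    return op == '+' ? "add" : "mul";
  }

  /** Stack style (JVM-like): operands are pushed, an operator pops two and pushes one. */
  static List<String> toStack(Expr e) {
    List<String> out = new ArrayList<>();
    emitStack(e, out);
    return out;
  }

  private static void emitStack(Expr e, List<String> out) {
    if (e instanceof Const c) {
      out.add("push " + c.value());
    } else if (e instanceof BinOp b) {
      emitStack(b.left(), out);
      emitStack(b.right(), out);
      out.add(mnemonic(b.op()));
    }
  }

  /** SSA style (LLVM-like): each intermediate result gets a fresh name, assigned once. */
  static List<String> toSsa(Expr e) {
    List<String> out = new ArrayList<>();
    emitSsa(e, out);
    return out;
  }

  /** Returns the operand text (a literal or a temporary name) that holds the value of e. */
  private static String emitSsa(Expr e, List<String> out) {
    if (e instanceof Const c) {
      return Integer.toString(c.value());
    }
    BinOp b = (BinOp) e;
    String left = emitSsa(b.left(), out);
    String right = emitSsa(b.right(), out);
    String name = "%t" + out.size(); // one instruction per temporary, so size() is a fresh index
    out.add(name + " = " + mnemonic(b.op()) + " " + left + ", " + right);
    return name;
  }
}
```

For `(1 + 2) * 3` the stack form is `push 1, push 2, add, push 3, mul`, while the SSA form is `%t0 = add 1, 2` followed by `%t1 = mul %t0, 3`. An IR pitched at a medium level has to carry enough structure for a back-end to produce either shape.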

JVM Back End

Now that I had at least some IR ready, I was able to focus on checking that the IR was good enough to produce some sort of machine code. It turns out I had to keep fixing defects and omissions in the IR, and I suspect there will be more as I do more of the IR. This sort of validates my approach of working on the ‘middle end’ and ‘back-end’ concurrently. But I also need to start work on the LLVM back-end to ensure that is viable.

I decided to use the ‘ASM’ Java package to help create the JVM byte code. Once I had a ‘stack based’ approach (which again took me a couple of attempts), I could move forward.

Basically, the back-end now uses the generated IR (the Medium Level IR, or MLIR, actually) and the details from the Symbols (populated by the ‘front-end’) to generate byte code.

There’s lots of tidying up and refactoring to do - because I was unsure which way to go, there’s messy dead-end code I need to prune out - plus Claude is a ‘copy-paste master’.

Integrating it all

While the phases of the compiler are all defined and coordinated and work well together, I had to decide how to call the compiled code for a ‘program’ from some sort of ‘main’.

As an EK9 project can contain multiple ‘programs’, I needed to detect them all and then look at what arguments those programs needed. Unlike most languages, EK9 can marshal a range of types and promote them to the correct type for the ‘program’. It does this by looking at the type signature of the program; it then takes the ‘String’ values that are provided on the command line and attempts to marshal them into those types. If the number of arguments on the command line is incorrect, or the values cannot be marshalled, then an error is emitted and the program is not run.
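The marshalling step can be sketched roughly as follows. This is a minimal illustration, not the EK9 implementation: `ArgumentMarshaller` is a hypothetical name, and the Java types `String`, `Long`, `Double` and `Boolean` merely stand in for whatever EK9 types a program signature declares.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: turn command-line strings into typed program arguments. */
final class ArgumentMarshaller {

  /**
   * Converts each raw string to the type declared at the same position.
   * Throws IllegalArgumentException if the count or any value does not fit,
   * in which case the caller would report an error and not run the program.
   */
  static List<Object> marshal(List<Class<?>> parameterTypes, List<String> rawArgs) {
    if (parameterTypes.size() != rawArgs.size()) {
      throw new IllegalArgumentException(
          "Expected " + parameterTypes.size() + " argument(s) but got " + rawArgs.size());
    }
    List<Object> values = new ArrayList<>();
    for (int i = 0; i < rawArgs.size(); i++) {
      values.add(convert(parameterTypes.get(i), rawArgs.get(i)));
    }
    return values;
  }

  private static Object convert(Class<?> type, String raw) {
    try {
      if (type == String.class) {
        return raw;
      }
      if (type == Long.class) {
        return Long.parseLong(raw);
      }
      if (type == Double.class) {
        return Double.parseDouble(raw);
      }
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException(
          "Cannot marshal '" + raw + "' to " + type.getSimpleName(), e);
    }
    if (type == Boolean.class) {
      // Be strict: Boolean.parseBoolean would silently map any other text to false.
      if (raw.equals("true") || raw.equals("false")) {
        return Boolean.valueOf(raw);
      }
      throw new IllegalArgumentException("Cannot marshal '" + raw + "' to Boolean");
    }
    throw new IllegalArgumentException("Unsupported parameter type " + type.getSimpleName());
  }
}
```

With a signature of (`String`, `Long`, `Boolean`), the arguments `Steve 21 true` marshal cleanly, whereas `Steve abc true` or a missing argument is rejected before any program code runs.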

So, the compiler can detect all the programs and create an appropriate MLIR structure for the back-end to generate code for. But then I also needed the ‘main’ for the initial entry point. This was again done by using ASM to dynamically create an ek9/Main class that deals with creating the new program (as selected by the user running the command).

Almost there

The EK9 design is such that it is intended to ‘feel scripty’, but actually be a compiled language. So, the main reason for the ‘shebang’ of ‘#!ek9’ in all EK9 source files is so that an EK9 source file can be set as ‘executable’ and an ‘ek9’ executable will be run to execute the code.

As the initial compiler and runtime are Java based, I did not want any EK9 developers (well, just me at the moment) to have to deal with all the various Java ‘-jar’, ‘-cp’ some.jar command-line bits. So it’s back to my roots and ‘C’.

So the ‘C’ wrapper ‘ek9.c’ was created. Once compiled (using clang), it can be put on the user’s PATH; it then uses the environment variable ‘EK9_HOME’ to find the ‘jar’ that has the compiler in it. It also checks that a suitable version of Java is available (25 - yes, we’re on the bleeding edge). It can then use ‘fork’ to call java with all the right arguments to trigger the compilation. If the command was a ‘run’ command, the EK9 compiler does all the necessary checks to see if any code needs compiling, then compiles the EK9 sources to Java class files, extracts the ‘stdlib’ from within the compiler classes file, and creates a ‘fat jar’.
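The real wrapper is written in C, but the lookup it performs is simple enough to sketch. The version below is in Java and only assembles the argument vector; it does not launch anything. Everything except ‘EK9_HOME’ and ‘-jar’ is an assumption: `LauncherSketch` and the jar name `ek9-compiler.jar` are placeholders, since the actual file name is not given here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the command line the native wrapper hides from the user. */
final class LauncherSketch {

  // Placeholder name: the real compiler jar may be called something else.
  static final String COMPILER_JAR = "ek9-compiler.jar";

  /**
   * Builds the java invocation from the environment and the user's own arguments.
   * The environment is passed in as a map so the logic can be exercised without
   * touching the real process environment.
   */
  static List<String> buildCommand(Map<String, String> env, List<String> userArgs) {
    String home = env.get("EK9_HOME");
    if (home == null || home.isBlank()) {
      throw new IllegalStateException("EK9_HOME is not set");
    }
    List<String> command = new ArrayList<>();
    command.add("java");
    command.add("-jar");
    command.add(home + "/" + COMPILER_JAR); // Unix-style path, as the wrapper relies on fork
    command.addAll(userArgs);
    return command;
  }
}
```

In a real launcher the resulting list would be handed to something like `ProcessBuilder` (or, in the C wrapper, to an exec call after `fork`), with the Java version check done beforehand.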

So now, with a simple native binary, we get the ‘shebang’, easy command-line use for the user, and a migration path off Java in time (not sure when!) without any changes to user code or usage. The command runs our compiler and, if the compilation is successful and ‘run’ was requested, the command to execute is passed back to the ek9.c binary, which can then execute it.

Hello, World!

#!ek9
defines module introduction

  defines program

    HelloWorld()
      stdout <- Stdout()
      stdout.println("Hello, World!")

//EOF

Then when I run it (hardly a surprise):

 % ./helloWorld.ek9 
"Hello, World!"

Summary

It’s been a bit of a long haul, with lots of interesting distractions. I’ll spend a bit of time updating the error handling and then look at implementing the LLVM native parts for ‘hello world’.
