Immutability - why is it hard?

Background

In my previous post I dealt with flow control and mentioned immutability.

But why is immutability desirable at all? Why, as developers, would we not want to alter/mutate/change variables? After all, we need to calculate results and modify data all the time; why restrict ourselves?

Developers restrict themselves all the time: with public, protected and private modifiers, and with abstract types, interfaces/traits and polymorphic mechanisms. Information hiding and other abstractions are also a form of self-imposed constraint.

Constraints and Restrictions

We know that in the longer term it is safer to use the mechanisms above. Giving many functions unrestricted access to some global data, and letting them modify it in a free and easy manner, has been shown to create errors.

Being assured that some item of data/object cannot undergo any sort of mutation/change enables us to design a solution with that in mind.

Immutability in languages

Many programming languages have a relatively simple view of immutability in the form of const, final, let etc. But these really only govern whether a primitive value, or a reference to an Object, can be changed. They don't extend to the contents of complex data structures: lists, maps or object graphs.

Considerations

Consider a List with one or more values in it, declared to be immutable by some mechanism. Are we declaring that the reference to that List is immutable, that the contents of the List are immutable, or that the actual items in the list are immutable? What would we mean if we mixed variable and constant values in that list? How far should/does that immutability extend?
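The three readings can be told apart in a small sketch. Python is used here purely for illustration (it is not EK9, and the names are made up); a tuple stands in for a container "declared immutable".

```python
# A tuple: Python's built-in immutable sequence, holding one mutable item.
items = ([1, 2], "fixed")

# Reading 1: the contents of the container are fixed.
# Rebinding a slot is rejected (at runtime, in Python's case).
try:
    items[0] = [9, 9]
    slot_rebound = True
except TypeError:
    slot_rebound = False

# Reading 2: the items themselves are immutable.
# Not so here: the inner list can still be changed in place.
items[0].append(3)

# Reading 3: the reference is immutable.
# Not so here either: the name is an ordinary variable.
original = items
items = ("something", "else")

print(slot_rebound)   # False
print(original[0])    # [1, 2, 3]
```

So even a container that calls itself immutable only answers one of the three questions.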

Some languages have immutable versions of the more complex data structures. This means that all type signatures need to be specific about whether they accept mutable or immutable types. The alternative is to move the immutability checks to runtime, but then there is less compiler support for immutability. So with a List holding a mix of variable and constant values, we may get runtime errors in some circumstances and not in others.
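A minimal sketch of the runtime route, again in Python for illustration only. The helper add_item is hypothetical; nothing in its signature says that it needs a mutable argument, so the mismatch only shows up when the call is made.

```python
def add_item(values, item):
    """Hypothetical helper: appends in place, so it quietly assumes a mutable list."""
    values.append(item)
    return values

mutable = [1, 2]
frozen = (1, 2)   # tuple: the immutable counterpart of a list

add_item(mutable, 3)   # works: mutable is now [1, 2, 3]

try:
    add_item(frozen, 3)        # a tuple has no append method
    failed_at_runtime = False
except AttributeError:
    failed_at_runtime = True

print(failed_at_runtime)   # True
```

The same function succeeds or fails depending on which flavour of data reaches it, which is exactly the "some circumstances and not others" problem.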

Functional approach with pure

The world of Functional Programming has a very different, and rather more rigorous, approach to controlling change: the pure function.

Pure

The pure idea is very appealing in many ways. In short, what the developer is saying is: I won't alter any externally visible data or parameters used when this function is called.

What this gives you, in a single concept, is the assurance that data will not be altered. It therefore removes the need for const, final, let etc. and for all the immutable versions of mutable data structures.

But the price is high in some ways. It limits your ability to log, to write files, or even to implement some algorithms in the way you'd like.

So true pure functions really do stop all side effects, but they don't stop you reassigning local variables inside the function. They do not provide const, final and let semantics for local variables, because locals are not externally visible. But reassignment of variables can also lead to issues, and in EK9 we'd like to address those as well if possible.
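To make the distinction concrete, here is a small Python sketch (illustrative only; the function name and figures are invented). The function is pure in the usual sense, yet its local variable is reassigned several times.

```python
def total_with_tax(prices, rate):
    """Pure in the usual sense: the result depends only on the arguments,
    and nothing visible to the caller is changed."""
    total = 0.0
    for price in prices:
        total = total + price     # local reassignment, invisible to the caller
    total = total * (1 + rate)    # reassigned again
    return total

prices = [10.0, 20.0]
result = total_with_tax(prices, 0.5)

print(result)   # 45.0
print(prices)   # [10.0, 20.0]
```

From the outside nothing was mutated, but inside the function total held three different meanings, which is the kind of reassignment a const/final style would have prevented.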

EK9 use of constants and pure

The idea in EK9 is to try to meld constants, const, final, let and a pure-ish idea applied to functions/methods into a reliable way of delivering immutability and limiting variable mutation/reassignment.

EK9 does not have global variables; it provides Dependency Injection. It also has constants for use with its built-in types. This means that some of the issues surrounding immutability have been solved (as constants are by definition immutable).

By adding pure as a keyword on functions and methods, EK9 provides a mechanism to guarantee that any data passed in (irrespective of the complexity of the data structures) won't be mutated (modified). This is quite a misuse of the word pure as seen from a Functional Programming point of view, but it has a certain ring to it. So I'm going to misuse it.

Pure Functions, Objects and Constructors

In EK9 when a function or a method is marked as being 'pure' it means variables cannot be mutated and only other 'pure' functions and methods can be called.

This means that whatever is passed in as parameters can never be mutated. It also means that local variables and fields/properties can never be mutated (or reassigned).

Significantly, this means that a 'pure' Object constructor must be an exception to this rule. It has to be able to mutate its own fields/properties during object construction.

Clearly it must also be possible to assign/reassign/mutate any return parameter.

Side effects

EK9 is not as rigorous as true pure functions in the sense of forbidding all side effects. For example, EK9 does not consider writing files or sending information out as being impure.

It is only the mutation of data (with the exception of Object construction/return parameters) that EK9 limits with pure.
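As a rough analogy only (this is Python, not EK9, and Python enforces none of it), the two hypothetical functions below sit on either side of that line as I read it: both write output, but only the second mutates the data passed in.

```python
def summarise(values):
    """Would fit the EK9 sense of pure described above (my reading):
    it writes output, but never mutates its parameter."""
    print("count:", len(values))   # a side effect, but not a data mutation
    return sorted(values)          # builds and returns a new list

def summarise_in_place(values):
    """Would not fit: it mutates the caller's list."""
    print("count:", len(values))
    values.sort()                  # in-place mutation of a parameter
    return values

data = [3, 1, 2]
ordered = summarise(data)          # data is left as [3, 1, 2]
untouched = list(data)             # snapshot taken before the mutating call
summarise_in_place(data)           # data is now [1, 2, 3]
```

The difference is that in EK9 the compiler is meant to reject the second form when it is marked pure, rather than leaving it to convention.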

Implications

There are a number of implications of taking this approach. These are outlined below.

Constants

Constants can be passed into functions/methods; if those functions/methods are pure then it's quite clear that those constants cannot be mutated. If on the other hand the functions/methods are not marked as pure then those constants have to be converted to a variable. While it is possible to 'trace the const-ness' of a value, it leads to complexity and a push into runtime checks.

Imagine for a moment a constant being inserted into a List or some part of an object graph. The check now has to be a runtime check if functions/methods are not marked as pure. This limits the compiler's ability to detect issues at development time. EK9 avoids this, but at the price of making the developer knowingly convert a constant to a variable (when used in that context).

We only allow a constant to be passed into a pure function/method. EK9 enforces the conversion of a constant to a variable if it is to be used in a data structure or passed in to a non-pure function/method. All constant values must be converted to a variable if being returned from any function/method.
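This post does not show how the conversion is written in EK9, so the following is only a loose Python analogy under my own assumptions: a tuple plays the role of the constant, add_reviewed_tag is a hypothetical non-pure function, and the explicit list(...) call plays the role of the developer knowingly converting the constant to a variable.

```python
DEFAULT_TAGS = ("draft", "internal")   # treated as a constant

def add_reviewed_tag(tags):
    """Hypothetical non-pure function: mutates its argument in place."""
    tags.append("reviewed")

# The explicit, visible conversion step: take a mutable copy of the constant.
working_tags = list(DEFAULT_TAGS)
add_reviewed_tag(working_tags)

print(DEFAULT_TAGS)    # ('draft', 'internal')
print(working_tags)    # ['draft', 'internal', 'reviewed']
```

The constant itself is never exposed to code that might change it; only the copy is.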

Inheritance and Delegation

The creation of abstract pure functions and methods has a couple of additional implications. Specifically, it means that any functions and methods overriding them must also be pure. This is because the super has defined the contract that is to be used. When other functions/methods make calls via abstract mechanisms that are marked pure, the contract is set, and the implementations must also be pure to maintain it.

The converse is not true, however: if a super function/method is not pure, the implementation can be either pure or not pure.

Summary

The approach to immutability in EK9 is a bit of a 'cutting the Gordian knot' approach. It removes an entire set of concepts that exist in most modern imperative languages: the const, final and let syntax and all the compiler rules around them, and the profusion of immutable data structures needed to parallel the mutable versions (and the compiler checks on those). But most importantly of all, it pulls any immutability check back out of runtime into compile/design time.

The idea of pure in EK9 has some similarities to pure functions, but also a number of significant differences. Maybe the use of the word pure will lead to confusion or 'heated debate'. But it's a nice short word and has a sort of 'clean simplicity' to it. I could not find another word that conveyed this meaning.

The next blog post will discuss the constructs that EK9 will provide; by this I mean things like a class or a function. EK9 provides many more than just those two.

Steve's EK9 Language Blog