- I love functional programming; I've been programming functionally for over 3 years, mostly in LISP and Elixir. I've loved every second of it. Well, maybe not every second, but definitely most of them.
On our journey to get a valentine, it looks like we have stumbled across a black box. Why? Well... we could start shoving random numbers in it and see what happens.
hmm... [-1, 2]? okay, 1. [3, 4]? okay, 7. what if we try [-1, 2] again, see if it meets our criteria to be our valentine; not changing its answer. amazing!
We're in love with its predictability, and simplicity; the assurance that no matter what we give as input or how many times it's given, the function behaves in a consistent manner.
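As a sketch, here's one pure implementation consistent with those observations (the actual contents of the box are unknown; summing the pair is an assumption):

```javascript
// A pure "black box": the output depends only on the input.
// Summing the pair matches what we observed: [-1, 2] -> 1, [3, 4] -> 7.
function blackBox([a, b]) {
  return a + b;
}

blackBox([-1, 2]); // -> 1
blackBox([-1, 2]); // -> 1, again - it never changes its answer
```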
Imagine asking your partner what food they want to eat. But instead of a straightforward answer based on your question alone and some internal state (i.e. hunger levels, craving ratios, etc.), the response is influenced by other factors: the day of the week, and the state of the day or week prior. We can simulate this: ((go through the ratios and outputs)).
Let's see what causes this in this black box we don't love; don't pay attention to the implementation (there's some stuff we haven't talked about yet), but check out this line (MATH.RANDOM). This is a side effect; an unpredictable effect on the output.
Side effects introduce unpredictability into our programming relationships: the output is determined not just by the input, but also by the state of the outside world, or of the system, at the time of execution. Suddenly, the predictability we cherished is compromised.
Now, when I go and update the cache to be correct, we get the correct answer. I was able to change the behavior of the function by modifying state outside of it. Get the gist - impure vs pure functions, side effects, etc? Cool.
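A hypothetical sketch of what's going on (the cache, its keys, and the fallback logic are assumptions, not the real implementation): the function reads and depends on state that lives outside of it.

```javascript
// Impure black box: the answer depends on state outside the function,
// plus a genuinely unpredictable call (Math.random).
let cache = { "-1,2": 42 }; // stale outside state
function impureBlackBox(pair) {
  const key = pair.join(",");
  if (key in cache) return cache[key]; // reads state outside the function
  return pair[0] + pair[1] + (Math.random() < 0.5 ? 0 : 1); // unpredictable
}

impureBlackBox([-1, 2]); // -> 42 (wrong!) because of the stale cache
cache["-1,2"] = 1;       // mutating outside state changes the function's behavior
impureBlackBox([-1, 2]); // -> 1
```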
For now let's move back to studying our pure black boxes.
We love our black boxes a lot, and we want to share them with other people in our lives. So let's create another that takes a list of people and creates a custom valentine's letter for them.
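Something like this sketch (the function name and letter text are made up):

```javascript
// One black box dedicated to valentine letters: one letter per person.
function buildValentines(people) {
  const cards = [];
  for (const person of people) {
    cards.push(`Dear ${person}, will you be my valentine?`);
  }
  return cards;
}

buildValentines(["Alan Turing", "Alonzo Church"]);
// -> ["Dear Alan Turing, will you be my valentine?", ...]
```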
[SPACE] But a few months go by, and all our friends' birthdays are coming up soon (why are so many CS people born in June)! We don't want to make them sad, as we see here, so we make another black box to generate a letter, along with how many days from now we should give it to them:
But this is getting annoying; what about Christmas, Thanksgiving, or Easter? Making a new black box for every occasion, going through each person every single time to construct a list of cards, is getting really tedious.
[SPACE] What if we generalized this? [SPACE] We create a couple of black boxes that each take a person and generate a specific card for them; like a template for a card you could print off and fill in manually. Like this, for Alan Turing.
The ability in a language to pass a function around like this - like a variable - is what makes functions "first class". And the `buildCards` function takes a function as input, making it a "higher order function".
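A sketch of what this looks like (the card text is made up):

```javascript
// The card templates are themselves functions...
const valentineCard = (person) => `Dear ${person}, will you be my valentine?`;
const birthdayCard  = (person) => `Happy birthday, ${person}!`;

// ...and buildCards is higher order: it takes a function as input.
function buildCards(people, makeCard) {
  const cards = [];
  for (const person of people) {
    cards.push(makeCard(person));
  }
  return cards;
}

buildCards(["Alan Turing"], birthdayCard); // -> ["Happy birthday, Alan Turing!"]
```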
After exploring the simple yet profound beauty of higher order black boxes, our journey into the functional programming romance takes an intriguing turn. What's better than one black box? A black box that creates more black boxes.
All life on Earth carries its instructions in the form of DNA. During mitosis, special mechanisms in our cells read those instructions and copy them to hand to a new cell.
Very similarly, we can give our black boxes DNA from parents to reproduce and create new child black boxes. But in our world of black boxes, we refer to this DNA as "closures"; a function with its "environment" - the variables it has access to - attached to it. [SPACE]
Here we've created a black box that outputs a black box to create a certain type of letter for a person encoded in its DNA. If the type is "birthday", the output is a birthday card generator, and if it's "valentine", it's a valentine letter generator.
When we put in "Alan Turing" and "valentine", we obtain a black box with "person" (in this case, Alan Turing) in its DNA [OPEN DEV TOOLS -> valentineCardGenerator]. Now we evaluate the black box, and it returns a valentine letter for that person.
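A sketch of that function-producing function (names are my own; the closure keeps `person` in its environment, its "DNA"):

```javascript
// A black box that outputs black boxes. The returned function closes over
// `person` - that's the "DNA" it carries with it.
function cardGeneratorFor(person, type) {
  if (type === "birthday") {
    return () => `Happy birthday, ${person}!`;
  }
  return () => `Dear ${person}, will you be my valentine?`;
}

const valentineCardGenerator = cardGeneratorFor("Alan Turing", "valentine");
valentineCardGenerator(); // -> "Dear Alan Turing, will you be my valentine?"
```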
There's another way that we could implement the same functionality - "partial function application"; a fancy name for a simple idea: locking in some of a function's arguments now, and leaving the rest to be supplied when the function is finally applied.
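A sketch of the same card generator via partial application (the generic `partial` helper is my own, not a library API):

```javascript
// The full function takes all its arguments at once...
const makeCard = (type, person) =>
  type === "birthday"
    ? `Happy birthday, ${person}!`
    : `Dear ${person}, will you be my valentine?`;

// ...and partial locks in some arguments now, leaving the rest for later.
const partial = (fn, ...locked) => (...rest) => fn(...locked, ...rest);

const makeValentine = partial(makeCard, "valentine");
makeValentine("Alan Turing"); // -> "Dear Alan Turing, will you be my valentine?"
```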
We briefly mentioned side effects and likened them to the unpredictability of a partner choosing their food. We love our black boxes because they're reliable. But we've actually had an impostor among us throughout this presentation.
Specifically in `buildCards` [SPACE]
This function actually has a small side effect - though it's contained entirely within the function - when we append to the cards list. If we were to go about living a side-effect-free life, that means we can't change _anything_, in case something else depends on it.
In this case, looping through "people" inherently produces a side effect via mutation of the loop variable, so we need to eliminate it.
So how do we sus out an impostor? With copies and recursion.
Here we're not changing anything at all (except the stack), but we've kept the same functionality; this code is "immutable". When we call `build_cards` on a list of people, we're 100% certain we're not gonna change something and get something odd, short of a bug.
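A recursive sketch of the immutable `build_cards` (the destructuring style is my own choice):

```javascript
// Immutable, recursive build_cards: nothing is reassigned or pushed.
function build_cards(people, makeCard) {
  if (people.length === 0) return [];
  const [first, ...rest] = people; // destructuring copies; `people` is untouched
  return [makeCard(first), ...build_cards(rest, makeCard)];
}
```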
At a high level there are so many benefits to immutable functional programming:
- Readability and reasoning about code. Functional, immutable code reads like a book!
- Concurrency. When everything is immutable, it's much easier to prevent race conditions and deadlock. If you can avoid a mutex or semaphore here and there, that simplifies things so much. You also know for certain every thread or processor is working on the same data; it couldn't have changed when it was distributed.
- No side effects. I can't count how many times I've spent hours poring through a code base to find where exactly something is changing, or what it can affect. This helps verify correctness and makes unit testing a dream.
- I suppose this is more subjective, but it gives you an entirely new view of how computers work. When you work with functions, you're directly treating the computer as an automaton or Turing machine. There's so much to learn around functional programming and type theory, and it touches almost every field of mathematics and computer science.
But there are also downsides:
- Keeping immutability requires more computation in many cases (special data structures and copying)
- Typical arrays go out the window in most applications; the cons cell / linked list is much preferred.
- You also don't typically have mutable data structures like hash maps. Instead, you need to reach for balanced trees or tries. It takes a lot of practice, but you _can_ always write code just as efficient in terms of time complexity.
- All things considered, it's difficult to learn and to write. The rabbit hole is as deep as your imagination. I haven't even talked about monads or treating code as data (macros); we've barely scratched the surface.
But like all relationships, we need to compromise with our black boxes. Side effects are effectively _unavoidable_ as programmers; when people press the button, they don't want the computation to just exist out there in the computer void. No! They want to see the pop-up box, and the value updated on their account. We need to be able to mutate state wherever side effects are the whole point, like I/O, or where the extra compute cost of copying is too high, like in a relatively large database.
# Writing Code Functionally
The best way to learn is by practicing. So we'll spend some time refactoring some code to be more functional and develop some higher order functions and practice currying.
We'll start with a simple example: a function that takes a list of numbers and returns the sum of the squares of the even numbers.
For millennia, humans have thought about how best to find if a number is even. So the first function is the _most efficient_ way to do so; we know 0 is even, and then numbers alternate between even and odd. We could prove this by induction, but that's left to the reader.
Then for our `sumOfEvenSquares`, we go through all the numbers up to `upTo` and add each one's square if it's even.
To make this functional, we first need to keep `isEven` pure and immutable. We can do this again via recursion; 0 is even, so we do the same alternation between even and odd.
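A recursive sketch (assuming a non-negative integer input):

```javascript
// Pure, recursive isEven: 0 is even, and parity alternates as we count down.
// Assumes n is a non-negative integer (a negative n would recurse forever).
function isEven(n) {
  if (n === 0) return true;
  return !isEven(n - 1);
}
```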
For sumOfEvenSquares, we can make some higher order function to filter out the even numbers, and then map over the list to square them, and then reduce to sum them.
And, we fail, getting a "maximum call stack size exceeded" error in our incredibly naive recursive implementations of `filter`, `map`, and `reduce`. Thankfully JavaScript has these built in, so we'll use those.
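With the built-ins, a sketch of the pipeline (treating `upTo` as inclusive, which is an assumption, and using the modulo check for brevity):

```javascript
const isEven = (n) => n % 2 === 0;

// Build [0..upTo], keep the evens, square them, sum them.
const sumOfEvenSquares = (upTo) =>
  Array.from({ length: upTo + 1 }, (_, i) => i)
    .filter(isEven)
    .map((n) => n * n)
    .reduce((sum, sq) => sum + sq, 0);

sumOfEvenSquares(6); // 0 + 4 + 16 + 36 = 56
```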
So every time we check the previous number, we create a whole new stack frame to evaluate `isEvenFn(n - 1)` and then return control to the caller. Then we evaluate the negation, `!isEvenFn(n - 1)`, which requires that caller's stack frame to still be alive.
The runtime I'm using, Bun, uses JavaScriptCore - which supports Tail Call Optimization (which is, sadly, in the ES2015 spec but left unimplemented by most other engines). What is TCO? Tail call optimization allows us to write recursive functions that reuse the same stack frame, as long as the recursive call is the very last expression evaluated, leaving nothing more to do in the calling function.
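For instance, a tail-call form of `isEven` (a sketch; the parity accumulator argument is my own addition):

```javascript
// Tail-recursive isEven: the recursive call is the final expression, so a
// TCO-capable engine like JavaScriptCore can reuse the current stack frame.
// Assumes n is a non-negative integer.
function isEven(n, parity = true) {
  if (n === 0) return parity;
  return isEven(n - 1, !parity); // tail position: nothing left to do afterwards
}
```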
And there we go! It's passing. But it's much, much slower than the iterative solution. From my experimentation, Tail Call Optimization won't help us much here, since we spend most of our time in the linear-time isEven check, and we're constantly doing garbage collection, whereas our iterative solution has nothing to clean up.
The keen-eyed among you might notice that since we can represent the sum of the squares of the first $n$ natural numbers as $n(n + 1)(2n + 1) / 6$, we can transform each natural number to its even counterpart by scaling by two (each term becomes $(2k)^2 = 4k^2$), halve the bound of the sum, and get the answer in constant time. Yeah. But that's not as fun.
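Sketched out (the function name is mine):

```javascript
// Constant-time version: the evens up to upTo are 2k for k = 0..m, where
// m = floor(upTo / 2), and (2k)^2 = 4k^2, so the sum is 4 * m(m + 1)(2m + 1) / 6.
const sumOfEvenSquaresFast = (upTo) => {
  const m = Math.floor(upTo / 2);
  return (4 * m * (m + 1) * (2 * m + 1)) / 6;
};

sumOfEvenSquaresFast(6); // 0 + 4 + 16 + 36 = 56
```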
The Lambda calculus, developed by Alonzo Church in the 1930s, is a "formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution".
There are three rules that define the semantics of the Lambda Calculus:
1. "x": A variable is a character or string representing a parameter, and is a valid lambda term.
2. "(λ x . t)" is an "abstraction" - a function definition, taking as input the bound variable x and returning the lambda term t.
3. (M N) is an "application", applying a lambda term M with a lambda term N.
There, we're done; the rest is an exercise for the reader. Have a good day everyone! No... I wouldn't be that mean.
We can "evaluate" a lambda term by performing three types of reduction:
1. α-conversion: renaming of bound variables to avoid name clashes (out of scope - we're going to assume the user knows not to bind variable names already chosen elsewhere).
2. β-reduction: application of a function to an argument.
3. η-reduction: conversion of a function to a point-free form (out of scope).
To perform a β-reduction, we substitute the argument into the body of the function, replacing all instances of the bound variable with the argument.
For example, given the lambda term `(λ x . x) y`, we can do a β-reduction to obtain the value `y` after substituting x. This follows pretty intuitively, but more formally - we define substitution recursively - given M and N as lambda terms and x, y as variables - as follows:
1. x[x := N] = N
2. y[x := N] = y if y != x
3. (M N)[x := P] = (M[x := P]) (N[x := P])
4. (λ x . M)[x := N] = λ x . M
5. (λ y . M)[x := N] = λ y . (M[x := N]) if y != x and y is not _free_ in N
A variable is _free_ in a lambda term if it is not bound by a λ abstraction. In other words, a free variable is one that is not a parameter of the function, or in the immediate scope of an abstraction.
# Lambda Calculus as a Computational Model
We encode all computation and values in the lambda calculus as a series of functions and applications. For example, we can encode a natural number n as a function that takes a function f and a value x, and applies f to x n times.
1. `0 = λ f . λ x . x` (no application of f)
2. `1 = λ f . λ x . f x` (one application of f)
3. `2 = λ f . λ x . f (f x)` (two applications of f)
This is called the "Church encoding" of the natural numbers.
We can also define the successor function `succ` as a function that takes a Church-encoded natural number n, a function f, and a value x, and performs f on x n times, and then once more.
1. `succ = λ n . λ f . λ x . n f (f x)`
For example, let's apply `succ` to `0` and verify that `(succ 0) = 1` (and likewise `(succ 1) = 2`).
1. We perform substitution on n with the Church encoding for 0, with its variables renamed (to `λ g . λ y . y`) to avoid a name clash, giving `λ f . λ x . (λ g . λ y . y) f (f x)`.
2. Substituting g with f has no effect and returns the term `(λ y . y)`, leaving `λ f . λ x . (λ y . y) (f x)`.
3. `(λ y . y)` is the identity function, so the whole term reduces to `λ f . λ x . f x`.
We're done! We've constructed 1 from 0. We'll go through the same process to construct 2 from 1.
We can define the boolean values `true` and `false` as functions that take two arguments and return the first and second, respectively.
1. `true = λ x . λ y . x`
2. `false = λ x . λ y . y`
We can then define the logical operators `and`, `or`, and `not` in terms of these boolean values.
1. `and = λ p . λ q . p q false`
2. `or = λ p . λ q . p true q`
3. `not = λ p . p false true`
(a few examples...)
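One way to play with these is to translate them straight into JavaScript arrow functions (a sketch; `toBool` is a helper of mine for decoding the result):

```javascript
// Church booleans: functions of two arguments returning the first or second.
const TRUE  = (x) => (y) => x;
const FALSE = (x) => (y) => y;

// The operators, transcribed directly from the definitions above.
const and = (p) => (q) => p(q)(FALSE);
const or  = (p) => (q) => p(TRUE)(q);
const not = (p) => p(FALSE)(TRUE);

// Decode a Church boolean back into a JS boolean for display.
const toBool = (b) => b(true)(false);

toBool(and(TRUE)(FALSE)); // -> false
toBool(or(TRUE)(FALSE));  // -> true
toBool(not(FALSE));       // -> true
```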
Recursion is also possible in the lambda calculus via the funny-looking thing on the board, which you've probably seen before - the Y combinator. The Y combinator should satisfy the equation `(Y f) = (f (Y f))`, and is defined as follows:
1. `Y = λ f . (λ x . f (x x)) (λ x . f (x x))`
If we let it run by itself, we can see how our recursion works. We can kinda see some formation of a "stack" of sorts.
When we apply the Y combinator to a function, it will apply the function to itself, and then apply the function to itself again, and so on, until the function returns a value.
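In an eagerly evaluated language like JavaScript, the Y combinator as written diverges, so here's a sketch using its eager cousin, the Z combinator, which wraps the self-application in a lambda to delay it (the factorial example is my own):

```javascript
// Z combinator: like Y, but (x x) is eta-expanded to (v) => x(x)(v)
// so that strict evaluation doesn't loop forever.
const Z = (f) => ((x) => f((v) => x(x)(v)))((x) => f((v) => x(x)(v)));

// Factorial without ever naming itself: `self` is supplied by Z.
const fact = Z((self) => (n) => (n === 0 ? 1 : n * self(n - 1)));

fact(5); // -> 120
```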
# Writing a Quick Lambda Calculus Interpreter
I think the best way to learn the lambda calculus is by building an interpreter for it; it's a quick exercise, and it's a lot of fun.
When writing a Lambda Calculus interpreter, there are a few decisions to be made:
- [ ] is it the typed or untyped lambda calculus?
- [ ] is the language lazy or strict / eager?
- [ ] call by value or call by name? (call by value - environment, closures)
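As a sketch of one set of answers (untyped, strict, call by value with environments and closures; the AST shape is my own assumption):

```javascript
// Minimal untyped, call-by-value lambda calculus evaluator over a tiny AST.
// Terms: {t:"var", name} | {t:"lam", param, body} | {t:"app", fn, arg}
function evaluate(term, env = {}) {
  switch (term.t) {
    case "var":
      if (!(term.name in env)) throw new Error(`unbound variable ${term.name}`);
      return env[term.name];
    case "lam":
      // A closure: the function together with the environment it was defined in.
      return { param: term.param, body: term.body, env };
    case "app": {
      const fn = evaluate(term.fn, env);   // strict: evaluate the function...
      const arg = evaluate(term.arg, env); // ...and the argument, before applying
      return evaluate(fn.body, { ...fn.env, [fn.param]: arg });
    }
  }
}

// (λ x . x) y  -- with y pre-bound in the environment -- reduces to y's value.
const id = { t: "lam", param: "x", body: { t: "var", name: "x" } };
evaluate({ t: "app", fn: id, arg: { t: "var", name: "y" } }, { y: "hello" }); // -> "hello"
```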
There are a lot of bugs in our interpreter; first, there is no α-conversion, so there's a high chance we could have name conflicts within lambda terms. We could use De Bruijn indexing, but that's outside the scope of this presentation.
Obviously we've only scratched the surface of the lambda calculus and functional programming; I didn't even mention a lot of things I wanted to get to. So here's a list you can study on your own:
The lambda calculus is a "universal model of computation", meaning that any computable function can be expressed in the lambda calculus. This is known as the Church-Turing thesis, and is the basis for the theory of computation.
I highly recommend reading the paper "Compilation of Lambda-Calculus into Functional Machine Code". You know how we alluded to the "stack" that the Y combinator produced? This paper goes into how we can utilize that stack to compile lambda calculus into machine code on a stack machine by reversing the order through continuation passing style.