Metaprogramming In Elixir - Part 1 - 2022-05-02
Abstract
I like challenges and I like them even more if I can work on them whenever I want instead of having the pressure to do so immediately.
While trying to find a solution for an exercism exercise called “top-secret” where your task is to decode hidden messages in code, I realized that it requires me to work more than usual to understand what is going on and how to solve it.
To quote Angelika Tyborska while describing the exercises on exercism: “solving a practice exercise takes anywhere from five minutes to five hours”. After I spent five hours I thought I could also write a blogpost and dump even more hours into this topic.
Solving the “top-secret” exercise involves something called meta programming which I have heard of before but never interested me enough to take a look at. Maybe I was scared of the complexity that the first sentence of the Wikipedia implies:
Metaprogramming is a programming technique in which computer programs have the
ability to treat other programs as their data.
It sounds intriguing but also scary. I already have a hard time reading code, how am I supposed to make the computer read and act on code?! And to be honest, I haven’t read more that the first paragraph of the Wikipedia page yet.
This post describes my journey of learning what I think is required to solve the challenge.
The challenge
Your task is to extract secret messages from code. You do so by making use of the name and the arity (amount of arguments to a function) of a function. The arity dictates how many letters of the function name should be taken into account for the message.
An example:
def invert(list), do: Enum.reverse(list)
def lines(list, nth_element) do
list
|> Enum.take_every(nth_element)
|> length
end
def keep(list, only_keep) do
list
|> Enum.filter(fn x -> x == only_keep end)
end
def teams(list, team_size, keep_smaller_teams \\ true) do
list
|> Enum.split(team_size)
# TODO: implement keep_smaller_teams
end
Hint: In elixir it is common to refer to a function by appending the arity to it.
Going from top to bottom we have:
invert/1
-> i
lines/2
-> li
keep/2
-> ke
teams/3
-> tea
Assembling the messages yields iliketea
. So far so good. Doing this for the
first couple test cases was easy, but the edge cases made me refactor the code
over and over again and breaking what worked before which was a sign for me that
I missed something along the way and my approach is probably very wrong.
But let’s start at the beginning.
Abstract Syntax Trees
The last time I dealt with Abstract Syntax Trees (AST) was in university. At least I think I dealt with them. Maybe they were different trees. Who knows. As far as I understood it, you might think of ASTs as a way of representing code as data.
Elixir refers to ASTs as quoted expressions and offers a macro called
quote
which turns some code into a tuple with three elements.
iex(1)> quote do: sum(1,2,3)
{:sum, [], [1, 2, 3]
The first element (:sum
) is an atom describing the operation, the second
argument is a list describing metadata and the last element is a list which
contains other nodes.
quote
works a bit different atoms, floats, integers, strings, lists and
tuples where it just returns the value. These seem to be the leafs of the tree.
iex(1)> quote do: "hello"
"hello"
iex(2)> quote do: [1,2,3]
[1, 2, 3]
iex(3)> quote do: :hello
:hello
This already feels like everything we need to move code into data.
On a side node: quote
somehow cleared up some things for me (not all of
course) when dealing with expressions in Elixir.
iex(1)> quote do: 1+1
{:+, [context: Elixir, import: Kernel], [1, 1]}
iex(2)> quote do: 1==1
{:==, [context: Elixir, import: Kernel], [1, 1]}
The first approach
Let’s start by taking the first function of the example from the beginning and
see what we get after feeding it to quote
:
iex(1)> quote do: def invert(list), do: Enum.reverse(list)
{:def, [context: Elixir, import: Kernel],
[
{:invert, [context: Elixir], [{:list, [if_undefined: :apply], Elixir}]},
[
do: {{:., [], [{:__aliases__, [alias: false], [:Enum]}, :reverse]}, [],
[{:list, [if_undefined: :apply], Elixir}]}
]
]}
This is already quite nested and looks complicated. The metadata
field
contains values and makes things harder to read and understand. The function
name invert
became an operation and somewhere at the end we find the actual
function call we’re running.
In the past I did a lot of web scraping using things like Beautifulsoup and xpath to get what I need from a nested data structure. Naturally I tried the same with elixir and ended with a lot of pipelines looking something like this:
arity = data |> elem(2) |> Enum.at(0) |> elem(2) |> length
message = data |> elem(2) |> Enum.at(0) |> elem(0) |> Atom.to_string() |>
String.slice(0..arity-1)
It works. Sure. But not for long. To account for different structures, I added a
couple of case
statements but eventually resigned because I wasn’t able to
make it work and have a solution that is readable.
What next?
Now I’m sitting here and have done two things:
- I failed to finish the exercise (“in time”)
- I started a blog post without knowing how to continue once I reach the “What next?” point
More to come. Hopefully.