Malte Krupa

Slow Starting Jails On FreeBSD - 2022-09-10

The issue

Yesterday I tried to find a way to create jails more or less automatically from an Ansible configuration.

Today I wanted to have a quick look inside these jails to understand a bit more about how they differ from the host OS.

But even before I was able to run the first command inside a jail, I had to deal with an issue that makes it very hard to work with jails: slow startup time.

Have a look at this restart of a jail:

root@w0:~ # time service jail restart test1
Stopping jails: test1.
Starting jails: test1.
0.176u 0.139s 1:49.75 0.2%      151+217k 0+31io 0pf+0w

Stopping the jail was very fast, but starting it accounted for most of the nearly two minutes.

A setup like this is not something you can comfortably experiment with, so we need to debug it somehow. After a minute with Google I found a couple of hints pointing towards IPv6 and DNS.

To test both IPv6 and DNS I ran a couple of drills:

root@test1:/ # time drill nafn.de aaaa @2620:fe::fe
[..]
nafn.de.        3525    IN      AAAA    2a00:d0c0:200:0:3493:eeff:fe94:1775
0.000u 0.004s 0:00.00 0.0%      0+0k 0+0io 0pf+0w

Seems to work fine. What didn’t work so well was the same command using an IPv4 address.

root@test1:/ # time drill nafn.de aaaa @9.9.9.9
Error: error sending query: Could not send or receive, because of network error
0.003u 0.000s 0:15.21 0.0%      0+0k 0+0io 0pf+0w

At this point I knew two things:

IPv4 traffic from inside the jail is somehow broken
a jail without working DNS resolution will start very slowly
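
The two checks above can be wrapped in a small helper. This is only a sketch: check_resolver and DRILL_CMD are names made up for illustration, and it assumes drill exits non-zero when the query cannot be sent.

```shell
#!/bin/sh
# DRILL_CMD is a hypothetical knob so the lookup tool can be swapped out.
DRILL_CMD="${DRILL_CMD:-drill}"

# check_resolver NAME RESOLVER
# Sends an AAAA query for NAME to RESOLVER (IPv4 or IPv6 address)
# and prints one line saying whether the query succeeded.
check_resolver() {
    if "$DRILL_CMD" "$1" aaaa "@$2" >/dev/null 2>&1; then
        echo "ok   $2"
    else
        echo "FAIL $2"
    fi
}
```

Inside the jail, running check_resolver nafn.de 2620:fe::fe followed by check_resolver nafn.de 9.9.9.9 would show at a glance which address family is affected.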

Firewall and NAT

Since I already worked with a system like this about 14 years ago (back then I failed miserably), I knew it might have something to do with the firewall configuration.

The server is running at Hetzner and I’m not interested in buying more IPv4 addresses, so IPv4 inside the jails relies on NAT. And while running these drill commands, I remembered that I haven’t spent much time with pf yet.

I think I just took the first example I could find in /usr/share/examples/pf/ and stripped it down to something that I can work with and understand.

Of course I only took care of the things in the configuration file that I immediately needed at that time, so the file looked something like this:

ext_if="em0"
[..]
nat on $ext_if from lo0:network to any -> ($ext_if)
[..]

Before we get into why this cannot work, let’s see what this line actually means:

“Translate all packets coming from lo0:network to the address associated with $ext_if when they are going out on $ext_if”

This cannot work for multiple reasons. First, the jail IPs are not associated with lo0, so the rule never matches their traffic. Second, $ext_if has multiple addresses associated with it, so “the address associated with $ext_if” is ambiguous, which kind of makes this bogus. Completely useless so far.
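
To check both points, it helps to see which IPv4 addresses an interface actually carries. A minimal sketch, assuming the usual FreeBSD ifconfig output where each IPv4 address sits on a line starting with inet; list_inet_addrs is a hypothetical helper name.

```shell
#!/bin/sh
# list_inet_addrs: read ifconfig output on stdin and print one
# IPv4 address per line. Lines starting with "inet6" are skipped
# because only an exact first field of "inet" matches.
list_inet_addrs() {
    awk '$1 == "inet" { print $2 }'
}

# Intended use on the host (not run here):
#   ifconfig lo0 | list_inet_addrs
#   ifconfig em0 | list_inet_addrs
```

If the jail addresses do not show up for lo0, the lo0:network part of the rule cannot match them, and more than one line for em0 confirms the second point.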

Let’s have a look at a configuration that actually works:

ext_if="em0"
ext_ipv4="185.26.156.224"
int_network_range="10.0.0.0/8"
[..]
nat on $ext_if from $int_network_range to any -> $ext_ipv4
[..]

Meaning:

“Translate all packets coming from $int_network_range to the address $ext_ipv4 when they’re going out on $ext_if”

It’s a bit more static, but since all the relevant information is available via Ansible, the values are filled in automatically for each host.
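
The idea of filling in the rule from per-host values can be sketched in plain shell. This is not the Ansible template used here, only an illustration; render_nat_rule is a hypothetical helper, and the values in the usage comment are the ones from the example above.

```shell
#!/bin/sh
# render_nat_rule EXT_IF INT_NETWORK EXT_IPV4
# Prints a pf NAT rule built from per-host values, mirroring the
# macros ext_if, int_network_range and ext_ipv4 used above.
render_nat_rule() {
    printf 'nat on %s from %s to any -> %s\n' "$1" "$2" "$3"
}

# Example call with the values from this host:
#   render_nat_rule em0 10.0.0.0/8 185.26.156.224
```

A configuration management tool does the same substitution, just with its own variable store instead of positional arguments.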

Did it solve the problem?

It did. After applying this new configuration to the firewall, the jails started very quickly and IPv4 traffic from inside the jail worked too.

root@test1:/ # time drill nafn.de aaaa @9.9.9.9
[..]
nafn.de.        3600    IN      AAAA    2a00:d0c0:200:0:3493:eeff:fe94:1775
[..]
0.000u 0.003s 0:00.03 0.0%      0+0k 0+0io 0pf+0w

root@w0:~ # time service jail restart test1
Stopping jails: test1.
Starting jails: test1.
0.185u 0.162s 0:01.00 34.0%     108+194k 0+31io 0pf+0w
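
For completeness, applying a changed ruleset can be sketched like this. reload_pf and the PFCTL variable are made up for illustration, and the ruleset path is whatever file the host uses (/etc/pf.conf is assumed in the comment). The pfctl flags are -n (parse only, do not load), -f (load a file) and -s nat (show the loaded NAT rules).

```shell
#!/bin/sh
# PFCTL is a hypothetical knob so the command can be replaced.
PFCTL="${PFCTL:-pfctl}"

# reload_pf RULESET
# Checks the syntax of RULESET first and only loads it if the
# check passes, then prints the NAT rules that are now active.
reload_pf() {
    if ! "$PFCTL" -nf "$1"; then
        echo "syntax check failed, ruleset not loaded" >&2
        return 1
    fi
    "$PFCTL" -f "$1" && "$PFCTL" -s nat
}

# Intended use as root on the host (not run here):
#   reload_pf /etc/pf.conf
```

Checking the file before loading it avoids replacing a working ruleset with a broken one on a remote machine.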

\o/