Debian/Hurd switches to sysvinit


Previously, Debian/Hurd used a home-grown init system. Last year, I participated in GSoC and set out to make it boot using sysvinit instead.

On Debian/Hurd, one can switch between the available init systems using update-alternatives(8). With the latest Debian/Hurd packages uploaded earlier today, the priority of the old init system was lowered so that sysvinit is preferred. So as of now, Debian/Hurd is using sysvinit by default.
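Why lowering a priority is enough to change the default can be shown with a tiny model. This is only a sketch, assuming update-alternatives is in automatic mode, where the alternative with the highest priority wins. The name runsystem.gnu and all priority values are made up for illustration; only runsystem.sysv is taken from this blog.

```python
# Simplified model of update-alternatives(8) in automatic mode: the
# registered alternative with the highest priority becomes the default.
# The name "runsystem.gnu" and all priority values are assumptions.

def pick_default(alternatives, manual_choice=None):
    """Return the alternative a link group would point to."""
    if manual_choice is not None:
        # An administrator's explicit choice overrides the priorities.
        return manual_choice
    return max(alternatives, key=alternatives.get)


before = {"runsystem.gnu": 20, "runsystem.sysv": 10}  # hypothetical values
after = {"runsystem.gnu": 5, "runsystem.sysv": 10}    # old priority lowered

print(pick_default(before))  # runsystem.gnu
print(pick_default(after))   # runsystem.sysv
```

A system where the administrator picked an init system by hand stays on that choice, which the manual_choice parameter models.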

(If you are upgrading your Debian/Hurd installation now, please remember that you must use reboot-hurd or halt-hurd to shut it down whenever you switch to a different init system.)

And because screenshots of booting operating systems are just awesome, here is a current one. Note how smooth it is:

start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] exec init proc auth
INIT: version 2.88 booting
Using makefile-style concurrent boot in runlevel S.
Activating swap...done.
Checking root file system...fsck from util-linux 2.20.1
/dev/hd0s1: clean, 29799/181056 files, 206131/723200 blocks
done.
Creating compatibility symlink from /etc/mtab to /proc/mounts. ... (warning).
mount: cannot remount /proc: Invalid argument
Activating lvm and md swap...done.
Checking file systems...fsck from util-linux 2.20.1
done.
Cleaning up temporary files... /tmp.
Mounting local filesystems...done.
Activating swapfile swap...done.
Cleaning up temporary files....
Configuring network interfaces...Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
DHCPDISCOVER on /dev/eth0 to 255.255.255.255 port 67 interval 7
DHCPREQUEST on /dev/eth0 to 255.255.255.255 port 67
DHCPOFFER from 10.0.2.2
DHCPACK from 10.0.2.2
bound to 10.0.2.15 -- renewal in 40021 seconds.
done.
Cleaning up temporary files....
INIT: Entering runlevel: 2
Using makefile-style concurrent boot in runlevel 2.
Starting enhanced syslogd: rsyslogd.
Starting deferred execution scheduler: atd.
Starting periodic command scheduler: cron.
Starting OpenBSD Secure Shell server: sshd.

Debian GNU/Hurd jessie/sid debian console

login:

On portability of init systems


There is one thing in the current init system debate that irritates me. It is about the portability of init systems. The new init systems are evaluated by how portable they are. This is one of the arguments that is most often brought against systemd, which is understandable given the polarizing attitude of one of systemd's authors. In this context, the currently used sysvinit is considered portable. But in my book, it is not portable at all.

  • There is no standard for the proc filesystem.
  • Any program using the proc filesystem is therefore not portable.
  • sysvinit uses the proc filesystem (in src/{bootlog,hddown,killall5}.c).
  • Therefore, sysvinit is not portable.

But why can we use sysvinit to boot Debian/{kFreeBSD,Hurd}?

Debian/kFreeBSD uses linprocfs for /proc to provide a familiar Linux-like environment for the userspace tools available in Debian (say, pgrep). linprocfs was originally written by the FreeBSD folks to support running Linux binaries using the Linux Binary Compatibility layer.

Debian/Hurd uses procfs to provide /proc. This procfs translator was written mainly to provide a Linux-compatible /proc filesystem, for the same reason linprocfs is used by Debian/kFreeBSD.

So sysvinit works on those systems not because it is portable, but because the environment has been made Linux-like enough for sysvinit. We (most likely anyone not using Linux) often do this, because it is the easiest way to run popular software developed (mainly) for Linux. This is often the path of least resistance, as opposed to getting the upstream project to support the native way of doing things on platform X.

During gsoc last year I had to patch our procfs to finally be able to safely shut down Debian/Hurd systems using sysvinit. The problem was that at certain runlevel transitions (like shutting down, or I guess, switching to single user mode), sysvinit assumes that it is okay to stop and kill (almost) all processes on the system (that's what killall5 does). This might be okay on monolithic systems, but on (multiserver) microkernel systems like the Hurd, where your root filesystem and your network driver and stack are running as userspace processes, it is clearly not. I wonder how Linux systems using a FUSE-based root filesystem get away with this.

This highlights that sysvinit not only depends on a Linux-specific kernel interface (/proc), but also hard-codes assumptions about the system architecture.

Amusingly, systemd gets this right (ok, I'm not sure if it does, but it could get this right...). systemd organizes processes in cgroups, one for each service it starts and one for each login session or something like that. It can (could?) kill only those processes it is responsible for, leaving all essential translators (system servers) alone. In fact, even my tiny cgroupfs prototype can keep track of translators that are started by the root filesystem translator.
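The difference between the two strategies can be sketched in a few lines. This is a toy model, not code from sysvinit or systemd: the process table, session ids and cgroup names are invented, and killall5 is reduced to its documented rule of signalling everything outside its own session.

```python
# Toy process table: pid -> (name, session id, cgroup). The pids and names
# loosely follow a Hurd process list; sessions and cgroups are made up.
PROCESSES = {
    3: ("ext2fs", 3, "rootfs"),
    8: ("pflocal", 3, "rootfs"),
    6: ("sh", 6, "init"),
    30: ("sshd", 30, "ssh"),
    31: ("sshd-child", 30, "ssh"),
    40: ("cron", 40, "cron"),
}


def killall5_targets(processes, own_session):
    """Everything outside the caller's session, system servers included."""
    return sorted(pid for pid, (_, sid, _) in processes.items()
                  if sid != own_session)


def cgroup_targets(processes, cgroup):
    """Only the processes that belong to one service's cgroup."""
    return sorted(pid for pid, (_, _, cg) in processes.items()
                  if cg == cgroup)


print(killall5_targets(PROCESSES, own_session=6))  # [3, 8, 30, 31, 40]
print(cgroup_targets(PROCESSES, "ssh"))            # [30, 31]
```

In the first list the root filesystem (pid 3) is among the victims, which is exactly what must not happen on a multiserver system. The second list only contains what the service itself spawned.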

Final GSoC report


This is my final report :) GSoC was great, and I learned a lot about the Hurd and about Mach programming in general. I am also very pleased to announce that I reached my initial goal and almost all of my patches already made it upstream and into Debian :)

I spent my last week fixing issues I introduced and splitting up /hurd/init into two programs. This would make it possible to integrate the patch that frees PID 1 for sysvinit into the Hurd upstream sources. I didn't quite finish the separation, but my proof of concept works and I will finish this as my next Hurd project.

Looking back at the last fourteen weeks, I accomplished the following:

I implemented /proc/mounts and umount, freed up PID 1 for sysvinit, fixed ifupdown, sysvinit and initscripts on Hurd, implemented a proof-of-concept cgroupfs and fixed many small issues along the way. Almost all of my patches are already upstream and in Debian; a Debian/Hurd booting using sysvinit is just a few uploads away.

It has been a lot of fun and I will definitely see you around :)

Justus

cgroupfs is as cgroupy as it gets...


... at least until the cgroup interface is fixed. So, what can it do?

  • There are tasks and cgroup.procs. There are no thread IDs on Hurd, so cgroupfs works only on a per-process basis, not per-thread. Consequently tasks has the same semantics as cgroup.procs. Seeing that PIDs and TIDs can be used (mostly) interchangeably on Linux, I think this is okay to do.
  • You can create and destroy cgroups, and child processes are properly tracked.
  • You can register a release_agent, and it is executed whenever the last process in a cgroup dies.
  • There is notify_on_release to enable or disable the use of release_agent.
  • There is cgroup.clone_children; one can toggle this bit, but it is ignored.
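The semantics listed above fit into a small model. This is a sketch of the behaviour only, not the cgroupfs implementation: the class and method names are hypothetical, and the release agent is a plain Python callable instead of an executed program.

```python
class CgroupModel:
    """Toy model of per-process cgroup tracking (not the real cgroupfs)."""

    def __init__(self, release_agent):
        self.members = {"/": set()}          # cgroup path -> set of pids
        self.cgroup_of = {}                  # pid -> cgroup path
        self.notify_on_release = {"/": False}
        self.release_agent = release_agent   # called with the cgroup path

    def mkdir(self, path, notify=False):
        self.members[path] = set()
        self.notify_on_release[path] = notify

    def attach(self, pid, path):
        """Like `echo PID >> path/tasks`."""
        old = self.cgroup_of.get(pid)
        if old is not None:
            self.members[old].discard(pid)
        self.members[path].add(pid)
        self.cgroup_of[pid] = path

    def fork(self, parent, child):
        """A child starts out in its parent's cgroup."""
        self.attach(child, self.cgroup_of[parent])

    def exit(self, pid):
        path = self.cgroup_of.pop(pid)
        self.members[path].discard(pid)
        # Run the agent only when the last member is gone and it is enabled.
        if not self.members[path] and self.notify_on_release[path]:
            self.release_agent(path)


released = []
cg = CgroupModel(release_agent=released.append)
cg.mkdir("/init", notify=True)
cg.attach(6, "/init")
cg.fork(6, 16)
cg.exit(6)
print(sorted(cg.members["/init"]), released)   # [16] []
cg.exit(16)
print(sorted(cg.members["/init"]), released)   # [] ['/init']
```

The agent fires only after the second exit, because the forked child kept the cgroup populated.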

So, what's missing?

  • There are no controllers. I haven't looked into this, and resource accounting is one of the Hurd's weakest points, but it is conceivable that one could e.g. advise the scheduler inside the Mach kernel based upon the state of the cgroups if the cgroupfs process is sufficiently privileged (did I mention that any user can use cgroupfs?).
  • The notification API aka cgroup.event_control. The Hurd lacks eventfd(2), but even if that were implemented, this interface would still be impossible to implement. Rant below.
  • A patch for gnumach to make this bulletproof. I made some encouraging progress with that one this week, but there's nothing presentable yet.

So, what's wrong with the Linux cgroup API?

Well, for one thing, the whole API is underspecified. Yes, there is Documentation/cgroups/cgroups.txt, but that is not a specification, that's a howto at best. Second, the notification API is not particularly nice:

To register a new notification handler you need to:
 - create a file descriptor for event notification using eventfd(2);
 - open a control file to be monitored (e.g. memory.usage_in_bytes);
 - write "<event_fd> <control_fd> <args>" to cgroup.event_control.
   Interpretation of args is defined by control file implementation;

Seriously? There is a POSIX way to pass file descriptors around, but smashing the decimal representation of one into a string is not the way to do that. Linux gets away with this hack because the kernel knows which process wrote(2) that string in the first place, so it can parse the string into an integer and look it up in the table of file descriptors for that process.

Now the trouble for cgroupfs is that it is not the kernel, and even if it were, it wouldn't solve the problem, because on Hurd there are no file descriptors (well there are, but that's only to appease all the POSIX programs out there). Instead Hurd has ports, and you can send messages to ports, and this is pretty much everything that you can do on a Mach system. Reading a file works roughly like this:

  1. You open a file and get a port X.
  2. You send a message like "I'm like really interested in the first Y bytes of that file" to X.
  3. Whoever has the receiving end of X (probably the one who gave you X in the first place) answers your request.
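The three steps can be mimicked with ordinary queues. This is a loose analogy under stated assumptions, not Mach or Hurd code: a queue.Queue stands in for a port, the FileServer class and the message tuple layout are hypothetical, and the server is driven synchronously instead of running as its own task.

```python
import queue


class FileServer:
    """Stands in for a translator that owns the receiving end of a port."""

    def __init__(self, files):
        self.files = files
        self.ports = {}                     # port object -> file name

    def open(self, name):
        port = queue.Queue()                # step 1: the client gets a port
        self.ports[port] = name
        return port

    def serve_one(self, port):
        # Step 3: the holder of the receive end answers the request.
        op, amount, reply_port = port.get()
        if op == "read":
            data = self.files[self.ports[port]][:amount]
            reply_port.put(data)


def read_first_bytes(server, port, amount):
    reply_port = queue.Queue()
    port.put(("read", amount, reply_port))  # step 2: send a request message
    server.serve_one(port)                  # a real server runs on its own
    return reply_port.get()


server = FileServer({"/etc/hostname": b"debian\n"})
x = server.open("/etc/hostname")
print(read_first_bytes(server, x, 3))       # b'deb'
```

Note that the request carries its own reply port, so the answer comes back as just another message.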

Ports look pretty much like file descriptors: they are (usually small) integers, and you can make them, destroy them, and pass them around easily (yes, ports are first class objects in the Mach messaging system). Everything is implemented atop this mechanism. It is transport-agnostic; the other end could be on another machine and you wouldn't even know. You can create proxies or filters (in fact, that is exactly how the firewall eth-filter is implemented). It's beautiful and extensible at its heart, like Lego bricks.

So if X were a port to e.g. memory.usage_in_bytes and the cgroups interface were less braindead^W^Wmore carefully designed, so that on Hurd X could be transported like ports usually are, then cgroupfs could in fact use the port X' it receives to look up which file the caller is interested in (this is possible because cgroupfs was the one handing out the port in the first place) and generate notifications for that file. This is not possible when X is "serialized for transport" using sprintf, because port names are specific to each process, so X != X'. The kernel would do the translation while sending the message, but it obviously cannot do that if the number is carried in a character array.
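A small model shows why the decimal string is useless to the receiver. It is only an illustration of per-task name spaces: the Task class and both transfer functions are hypothetical, and the starting values for the names are arbitrary.

```python
class Task:
    """A task with its own private port name space."""

    def __init__(self):
        self.names = {}          # local integer name -> underlying port object
        self.next_name = 100

    def insert(self, port):
        name = self.next_name
        self.next_name += 1
        self.names[name] = port
        return name


def send_right(sender, receiver, name):
    """Kernel-style transfer: the name is translated for the receiver."""
    return receiver.insert(sender.names[name])


def send_as_text(sender, receiver, name):
    """sprintf-style transfer: the receiver only gets digits."""
    text = str(name)
    return receiver.names.get(int(text))   # looked up in the wrong name space


usage_file = object()                      # stands for memory.usage_in_bytes
client, cgroupfs = Task(), Task()
cgroupfs.next_name = 500                   # the name spaces are unrelated
x = client.insert(usage_file)

x_prime = send_right(client, cgroupfs, x)
print(x != x_prime, cgroupfs.names[x_prime] is usage_file)  # True True
print(send_as_text(client, cgroupfs, x))                    # None
```

With a proper transfer the receiver ends up with a different number that names the same object. With the string it ends up with the same number, which names nothing (or, worse, something unrelated).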

I'm not sure what I'm going to do next week. The gsoc timeline suggests a soft-pencils-down, time to scrub code and write documentation, not sure that this is applicable to me as I have pushed most of my work upstream as early as possible. I guess I will nag Samuel so that he merges the outstanding patches and continue working on my gnumach patch.

cgroupfs keeps track of processes


Tl;dr!!elfel1 Screenshot (slightly edited and annotated shell trace):

+ settrans -ca /cgroup /hurd/cgroupfs
+ mkdir /cgroup/init /cgroup/rootfs
+ echo $$ >> /cgroup/init/tasks  # $$ is 6
+ echo 3 >> /cgroup/rootfs/tasks # pid 3 is the root filesystem
+ sleep 1m & echo sleep has pid $!
sleep has pid 16
+ cat /proc/cmdline > /dev/null
+ tail /cgroup/init/tasks /cgroup/rootfs/tasks
==> /cgroup/init/tasks <==
6
16
20

==> /cgroup/rootfs/tasks <==
3
19
17
+ pstree -p
init(1)-+-auth(5)
        |-cgroupfs(14)
        |-ext2fs(3)-+-exec(4)
        |           |-null(17)
        |           |-pflocal(8)
        |           |-procfs(19)
        |           `-term(7)
        |-mach-defpager(10)
        |-root=device:hd0s1(2)
        `-sh(6)-+-pstree(21)
                `-sleep(16)

Isn't she a beauty?

So we bind the cgroupfs translator to /cgroup, create two cgroups, init and rootfs, and move the currently executing shell script (that later execs sysvinit) into the former and the root filesystem translator into the latter. We then spawn a sleep process and cat the content of /proc/cmdline into /dev/null, which makes the root filesystem start the /hurd/procfs and the /hurd/null translators. We then inspect /cgroup/{init,rootfs}/tasks and indeed find all the newly spawned processes in the cgroup their parent process was in.

This is accomplished by:

I also filed a bug report containing my patches for the sysvinit package (#721917). This is the second bug report I filed during my gsoc; the first one was for the ifupdown package (#720531), which Andrew Shadura improved and merged on the very next day, thanks Andrew!

Next week I'll continue to improve the cgroupfs translator, work on the notification prototype (hopefully fixing non-root subhurds in the process; this requires a similar notification mechanism for newly created tasks and making /hurd/proc just a little subhurd aware) and try to get my gnumach patch into a working shape. Currently the parental relation of processes is a Hurd-only concept and relies upon processes telling the /hurd/proc server that a newly created process is their child. This is automatically done if the process uses fork(2) of course, but not if it uses task_create to start a new Mach task.

What will I do next? cgroupfs \o/


With the ifupdown fixes that I published last week I actually reached my initial goal, that is to make Debian/Hurd boot using sysvinit and the initscripts provided by Debian. So on Monday we were discussing in #hurd what I could do next. Michael Banck suggested that I should port Upstart, but we agreed to do something different instead for two reasons:

  1. Upstart and systemd are somewhat competing to be the default init system for Debian, and we felt it might be inappropriate to get involved with this question as porting Upstart to Hurd would probably also enable it to be used on FreeBSD. The Upstart folks could then point out that Upstart is more portable because it runs on all kernels used by Debian.
  2. Upstart uses ptrace(2) to track child processes of servers it monitors. Obviously this is kind of a hack, and it was conjectured that Upstart would eventually use cgroups to do that. Also, the Hurd lacks support for ptrace(2) (that is most likely by choice by the way, ptrace(2) is not a nice interface and the Hurd (Mach actually) has much nicer interfaces to implement a debugger).

So we decided that no matter how the struggle between Upstart and systemd turns out, the Hurd would eventually need to support cgroups. So I started to write a cgroupfs translator. It is in its early stages, but it already looks and acts a lot like Linux' cgroups:

% settrans -ac cg ./cgroupfs --release-agent=foobar
% ls cg
release_agent  tasks
% tail -n3 cg/tasks
11395
12869
1266
% mkdir cg/foo
% echo 1266 >> cg/foo/tasks
% tail -n3 cg/tasks cg/foo/tasks
==> cg/tasks <==
215
11395
12869

==> cg/foo/tasks <==
1266

To make this fully functional I will have to modify /hurd/proc and most likely also GNU Mach, but on the bright side this will help make subhurds (the Hurd's native, by-design-for-free-and-without-overhead container-like functionality) work better and more securely (among other things this could enable non-root users to start subhurds). I will also look into porting libcg (I have a hacky patch series ready) so that we can actually test the cgroupfs translator. All current users of the cgroup interface are very Linux-specific (surprise!), and libcg looks like the easiest one to port. And they do have a test suite that could help me improve the cgroupfs translator.

No noweb anymore...


... which is probably a good thing. But here is the boot log you all have been waiting for:

start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] exec init proc auth
INIT: version 2.88 booting
Using makefile-style concurrent boot in runlevel S.
Activating swap...done.
Checking root file system...fsck from util-linux 2.20.1
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
end_request: I/O error, dev 02:00, sector 0
/dev/hd0s1: clean, 44693/181056 files, 291766/723200 blocks
done.
Activating lvm and md swap...(default pager): Already paging to partition hd0s5!
done.
Checking file systems...fsck from util-linux 2.20.1
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
end_request: I/O error, dev 02:00, sector 0
done.
Cleaning up temporary files... /tmp.
Mounting local filesystems...done.
Activating swapfile swap...(default pager): Already paging to partition hd0s5!
done.
df: Warning: cannot read table of mounted file systems: No such file or directory
Cleaning up temporary files....
Configuring network interfaces...Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
*** stack smashing detected ***: dhclient terminated
Aborted
Failed to bring up /dev/eth0.
done.
Cleaning up temporary files....
Setting up X socket directories... /tmp/.X11-unix /tmp/.ICE-unix.
INIT: Entering runlevel: 2
Using makefile-style concurrent boot in runlevel 2.
Starting enhanced syslogd: rsyslogd.
Starting deferred execution scheduler: atd.
Starting periodic command scheduler: cron.
Starting system message bus: dbusFailed to set socket option"/var/run/dbus/system_bus_socket": Protocol not available.
Starting OpenBSD Secure Shell server: sshd.
unexpected ACK from keyboard


GNU 0.3 (debian) (console)

login: root
[...]
root@debian:~# ifup /dev/eth0
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
*** stack smashing detected ***: dhclient terminated
Aborted
Failed to bring up /dev/eth0.
root@debian:~# dhclient -v -pf /run/dhclient.-dev-eth0.pid -lf /var/lib/dhcp/dhclient.-dev-eth0.leases /dev/eth0
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
*** stack smashing detected ***: dhclient terminated
Aborted
root@debian:~# dhclient -pf /run/dhclient.-dev-eth0.pid -lf /var/lib/dhcp/dhclient.-dev-eth0.leases /dev/eth0
root@debian:~# ifup /dev/eth0
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
DHCPREQUEST on /dev/eth0 to 255.255.255.255 port 67
DHCPACK from 10.0.2.2
bound to 10.0.2.15 -- renewal in 34108 seconds.
ps: comm: Unknown format spec
root@debian:~# halt

Broadcast message from root@debian (console) (Fri Aug 23 19:42:19 2013):

The system is going down for system halt NOW!
INIT: Switching to runlevel: 0root@debian:~#
INIT: Sending processes the TERM signal
INIT: Sending processes the KILL signal
Using makefile-style concurrent boot in runlevel 0.
Stopping deferred execution scheduler: atd.
task c10f53f8 deallocating an invalid port 2098928, most probably a bug.
Asking all remaining processes to terminate...done.
All processes ended within 1 seconds...done.
Stopping enhanced syslogd: rsyslogd.
Deconfiguring network interfaces...Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
DHCPRELEASE on /dev/eth0 to 10.0.2.2 port 67
/dev/eth0 (2):
  inet address  0.0.0.0
  netmask       255.255.255.0
  broadcast     10.0.2.255
  flags         BROADCAST ALLMULTI MULTICAST
  mtu           1500
done.
Deactivating swap...swapoff: /dev/hd0s5: 177152k swap space
done.
Unmounting weak filesystems...umount: /etc/mtab: Warning: duplicate entry for device /dev/hd0s1 (/servers/socket/26)
umount: /etc/mtab: Warning: duplicate entry for device /dev/hd0s1 (/dev/cons)
umount: could not find entry for: /dev/cons
umount: could not find entry for: /servers/socket/26
done.
mount: cannot remount /: Device or resource busy
Will now halt.
store a new irq 11init: notifying pfinet of shutdown...init: notifying tmpfs swap of shutdown...init: notifying tmpfs swap of shutdown...init: notifying tmpfs swap of shutdown...init: notifying ext2fs device:hd0s1 of shutdown...init: halting Mach (flags 0x8)...
In tight loop: hit ctl-alt-del to reboot

With some tiny patches for ifupdown I've been able to resolve network related issues. All of them? Of course not. The funny thing about developing for the Hurd is that once you fix one thing, some other thing or code path is executed that has never been run on Hurd before, and therefore something else breaks. In this case I fixed ifupdown to generate valid names for the pid file and leases file, and all of a sudden dhclient starts dying.

The funny thing about that is, if one drops the -v flag from the dhclient invocation as I did above, the crash isn't triggered, and once the lease file has been successfully written, it is safe to add the -v flag again. Not yet sure what goes on there; then again, looking at the source of isc-dhcp-client it is not so surprising that it crashes :/

When I first looked at ifupdown it was written in noweb, a literate programming tool. It is an interesting idea, even more so since (classic) C can be very verbose and cryptic. But it decouples the control flow from the structure of the program, which makes patching it quite a challenge, since it is not obvious where the changes have to go. This is how ifupdown looked some weeks ago:
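The file name problem is easy to see in the logs: on the Hurd the interface is called /dev/eth0, so a name like ifup-/dev/eth0.pid points into a directory that does not exist. The working dhclient invocation in the trace above uses dhclient.-dev-eth0.pid instead. A minimal sketch of that mapping follows. It is not the actual ifupdown patch (ifupdown is written in C), and the function name is hypothetical.

```python
def state_file_name(prefix, interface, suffix):
    """Build a file name that embeds an interface name.

    On the Hurd the interface is a path such as /dev/eth0, so the slashes
    have to be replaced before the name can be used as a single path
    component.
    """
    return f"{prefix}{interface.replace('/', '-')}{suffix}"


print(state_file_name("/run/dhclient.", "/dev/eth0", ".pid"))
# /run/dhclient.-dev-eth0.pid
print(state_file_name("/var/lib/dhcp/dhclient.", "/dev/eth0", ".leases"))
# /var/lib/dhcp/dhclient.-dev-eth0.leases
```

Interface names without slashes, such as eth0 on Linux, pass through unchanged.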

% wc --lines ifupdown.nw
6123 ifupdown.nw
% pdftk ifupdown.pdf dump_data | grep NumberOfPages
NumberOfPages: 113

The ifupdown.nw file is the noweb source, from which seven .c, four .h, two .pl files and one Makefile are generated. It also contains a ridiculous amount of documentation, to the point that the authors at several points did not know what to write and just dropped some nonsensical lines into the file. The source also compiles to a 113-page PDF file that contains all of the documentation and all of the code, not at all in the order that one would expect a program to be written, but in the order the authors chose to structure the documentation. Fortunately for me the maintainer decided to drop the noweb source and to add the generated files to the source control system. This made my job much easier :)

So here are the patches I published this week:

I must admit that I do not know exactly what I will do next week. Obviously fixing the dhclient crash would be nice, I'll look into that. But I'll surely find some useful thing to do.

All the important bits are there - please test and review :)


Finally, more bootlog-pr0n:

start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] exec init proc auth
INIT: version 2.88 booting
Using makefile-style concurrent boot in runlevel S.
Activating swap...done.
Checking root file system...fsck from util-linux 2.20.1
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
end_request: I/O error, dev 02:00, sector 0
/dev/hd0s1: clean, 44680/181056 files, 292234/723200 blocks
done.
Activating lvm and md swap...(default pager): Already paging to partition hd0s5!
done.
Checking file systems...fsck from util-linux 2.20.1
hd2 : tray open or drive not ready
hd2 : tray open or drive not ready
end_request: I/O error, dev 02:00, sector 0
done.
Cleaning up temporary files... /tmp.
Mounting local filesystems...done.
Activating swapfile swap...(default pager): Already paging to partition hd0s5!
done.
df: Warning: cannot read table of mounted file systems: No such file or directory
Cleaning up temporary files....
Configuring network interfaces...inetutils-ifconfig: invalid arguments
ifup: failed to open pid file /run/network/ifup-/dev/eth0.pid: No such file or directory
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

can't create /var/lib/dhcp/dhclient./dev/eth0.leases: No such file or directory
Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
DHCPDISCOVER on /dev/eth0 to 255.255.255.255 port 67 interval 8
DHCPREQUEST on /dev/eth0 to 255.255.255.255 port 67
DHCPOFFER from 10.0.2.2
DHCPACK from 10.0.2.2
can't create /var/lib/dhcp/dhclient./dev/eth0.leases: No such file or directory
bound to 10.0.2.15 -- renewal in 34744 seconds.
done.
Cleaning up temporary files....
Setting up X socket directories... /tmp/.X11-unix /tmp/.ICE-unix.
INIT: Entering runlevel: 2
Using makefile-style concurrent boot in runlevel 2.
Starting enhanced syslogd: rsyslogd.
Starting deferred execution scheduler: atd.
Starting periodic command scheduler: cron.
Starting system message bus: dbusFailed to set socket option"/var/run/dbus/system_bus_socket": Protocol not available.
Starting OpenBSD Secure Shell server: sshd.
unexpected ACK from keyboard


GNU 0.3 (debian) (console)

login: root
root@debian:~# shutdown -h now

Broadcast message from root@debian (console) (Fri Aug 16 20:02:47 2013):

The system is going down for system halt NOW!
INIT: Switching to runlevel: 0root@debian:~#
INIT: Sending processes the TERM signal
INIT: Sending processes the KILL signal
Using makefile-style concurrent boot in runlevel 0.
Stopping deferred execution scheduler: atd.
task c10f72a8 deallocating an invalid port 2098928, most probably a bug.
Asking all remaining processes to terminate...done.
All processes ended within 1 seconds...done.
Stopping enhanced syslogd: rsyslogd.
Deconfiguring network interfaces...ifdown: failed to open pid file /run/network/ifdown-/dev/eth0.pid: No such file or directory
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

can't create /var/lib/dhcp/dhclient./dev/eth0.leases: No such file or directory
Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
/bin/sh: 1: ifconfig: not found
done.
Deactivating swap...swapoff: /dev/hd0s5: 177152k swap space
done.
Unmounting weak filesystems...umount: /etc/mtab: Warning: duplicate entry for device /dev/hd0s1 (/dev/cons)
umount: could not find entry for: /dev/cons
done.
Unmounting local filesystems...done.
mount: cannot remount /: Device or resource busy
Will now halt.
init: notifying pfinet of shutdown...init: notifying tmpfs swap of shutdown...init: notifying tmpfs swap of shutdown...init: notifying tmpfs swap of shutdown...init: notifying ext2fs device:hd0s1 of shutdown...init: halting Mach (flags 0x8)...
In tight loop: hit ctl-alt-del to reboot

(You might note that df complains about not being able to read the mtab file. That is because it has been built with _PATH_MOUNTED being /var/run/mtab. This will correct itself when the coreutils package is being rebuilt against a patched libc.)

I spent my last two weeks polishing my patch series. That meant a lot of package rebuilds, and that means a lot of waiting (even more so on the Hurd, short version: fakeroot-hurd could be fast but is not yet working properly, fakeroot-tcp is slow) and thus some frustration ;)

I also had to pay special attention so that the upgraded packages could be installed without accidentally breaking anything in the process. Making sysvinit pid 1 is surprisingly tricky in this regard, since it breaks the ABI and requires a libc fix that also works with the current Hurd servers.

Here are the patches:

So I had some spare time on my hands while waiting for numerous package rebuilds, and I took this as an opportunity to read papers about Mach and to familiarize myself with mig, the Mach Interface Generator. While I have used it in the past, I had not yet looked at its implementation. I also had to patch the exec server, which contained both code implementing a questionable feature (on-demand unzipping of binaries) and code that was not even compiled (courtesy of the preprocessor) and had probably bit-rotted by now. So I figured I could spend my time doing some cleanups:

I have rebuilt all the necessary packages and uploaded them into an apt repository:

deb http://teythoon.cryptobitch.de/gsoc/heap/debian unstable main

Please use unstable for now. Also make sure that you have a recovery plan for your Debian/Hurd installation if anything goes wrong. For your convenience there's a seed tarball containing packages with the appropriate sources.list.d snippets and the repository key:

https://teythoon.cryptobitch.de/gsoc/heap/debian/seed.tar

If you want to switch to the new runsystem.sysv, do:

# update-alternatives --config runsystem

Whenever you switch runsystems, please use reboot-hurd to reboot the system. This is the most robust way.

Known issues:

  • procfs hardcodes the default kernel pid to 2. This breaks /proc/uptime and any program relying on it, most notably top and friends. Until this is properly fixed, you can do:

    # settrans -apfg /proc /hurd/procfs -c -k 3
    
  • The mtab translator should probably try to filter out non-filesystem translators. df complains loudly about /dev/cons for example.

Next week I will address the network related issues. By now they are the source of most of the error messages in the bootlog.

My worst week yet...


This hasn't been my week. I had to fix two of the Hurd installations I use for development and testing. I spent countless hours waiting for some package to be rebuilt and reading scientific papers about adding stuff to the Mach kernel: fascinating stuff like thread migration, which would (among other things) allow for proper resource accounting on Mach kernels.

Oh, and gcc is a sadistic bastard:

In file included from printf-parsewc.c:2:0:
printf-parsemb.c: In function ‘__parse_one_specwc’:
printf-parsemb.c:407:1: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.7/README.Bugs> for instructions.
The bug is not reproducible, so it is likely a hardware or OS problem.

A similar error came up like five hours into my libc rebuild. How nice of gcc to retry the build to figure out whether it is to blame or the environment only to throw away the successful build and abort my build process :/

  • I published my patch series addressing all sysvinit-related issues: http://lists.gnu.org/archive/html/bug-hurd/2013-08/msg00000.html
  • I have since found three minor issues, so I'll roll another series shortly. But since this somewhat breaks the ABI (/hurd/init is no longer PID 1), this is waiting for a libc rebuild that I haven't managed to get done so far.
  • I updated my patch series for the sysvinit package. I'm quite pleased with it as it is and I'll propose it for inclusion with the sysvinit package shortly. Unfortunately this depends on the patched hurd package, which is blocked by the libc package :/

Next week I'll finally propose my sysvinit patch series. Oh, and implement the -d flag in our umount. I thought I had implemented all the flags used by the initscripts, but somehow I missed -d. If I get bored, I'll take a look at the network related issues, but I bet I won't get there...

There is no spoon...


This is the Hurd shutting down:

root@debian:~# init 0
INIT: Switching to runlevel: 0
INIT: Sending processes the TERM signal
INIT: Sending processes the KILL signal
Using makefile-style concurrent boot in runlevel 0.
Stopping deferred execution scheduler: atd.
Asking all remaining processes to terminate...done.
All processes ended within 1 seconds...done.
Stopping enhanced syslogd: rsyslogd.
Deconfiguring network interfaces...ifdown: failed to open pid file /run/network/ifdown-/dev/eth0.pid: No such file or directory
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

can't create /var/lib/dhcp/dhclient./dev/eth0.leases: No such file or directory
Listening on Socket//dev/eth0
Sending on   Socket//dev/eth0
/bin/sh: 1: ifconfig: not found
done.
Deactivating swap...swapoff: /dev/hd0s5: 177152k swap space
done.
mount: cannot remount /: Device or resource busy
Will now halt.
INIT: no more processes left in this runlevel

The important line is the one saying Asking all remaining processes to terminate...done.. What happens there is that /sbin/killall5 is run, actually does its job and the system survives that. What was needed to fix killall5 is to mark some processes (all translators being started as root as well as any essential processes) as important and to exempt them from being frozen and killed by killall5. Furthermore it was necessary to fill in the correct values for the start_code and end_code fields of the /proc/*/stat records (this used to be an issue for Debian/kFreeBSD as well).
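For readers who have never looked at the stat records, here is a sketch of how a program might pull those two fields out of one. It assumes the Linux layout documented in proc(5), where startcode is field 26 and endcode is field 27. The sample line is fabricated, and what killall5 actually does with the values is not shown here.

```python
def parse_stat(line):
    """Parse one Linux-style /proc/<pid>/stat line into selected fields."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split around the last closing parenthesis.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition(" (")
    rest = tail.split()                    # rest[0] is field 3 (state)
    return {
        "pid": int(pid),
        "comm": comm,
        "state": rest[0],
        "start_code": int(rest[26 - 3]),   # field 26 in proc(5)
        "end_code": int(rest[27 - 3]),     # field 27 in proc(5)
    }


# Fabricated record: fields 4 to 25 are zeroed, the addresses are made up.
fields_4_to_25 = ["0"] * 22
sample = "3 (ext2 fs) S " + " ".join(fields_4_to_25) + " 134512640 134700000 0"

info = parse_stat(sample)
print(info["comm"], info["start_code"], info["end_code"])
# ext2 fs 134512640 134700000
```

A procfs that leaves these fields at zero hands such a parser values that are well-formed but meaningless, which is why filling them in correctly mattered.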

I am still in the process of cleaning up the patch series, I will finish that first thing Monday morning. I also spent my time polishing my mtab translator, the patch series is in its fourth revision and I consider it ready for inclusion:

http://lists.gnu.org/archive/html/bug-hurd/2013-07/msg00259.html

I also had this moment of clarity and enlightenment. For the first time I thought I finally understood what the Hurd really was. It is just a bunch of Mach programs that talk to each other in this really strange language and as a result they behave very much like the equivalent programs would on a different POSIX-like system. This is of course hidden away from the application programmer in our libc. This language is a mere convention, Richard calls this system personality. On some other level I have known this for a long time, but I never grasped the profound implications for users and developers, and for composability and the security of the system as a whole. I will try to show some of the cool stuff that can be done with such a system, stuff that is hard to do on monolithic kernels.

And I had to patch the exec server. You know what they say, once you looked into /hurd/exec, there is no going back.

I am very happy to report that I addressed all three major issues I identified in my second week. Next week I will clean up my initscripts patch series and submit it for inclusion. I will also rebuild the hurd, sysvinit and libc packages with all my patches included for broader testing. And if I ever run out of stuff to do first, I will have to look at the network related issues.

Contents © 2014 Justus Winter - Powered by Nikola