Debugging riddle of the day

One of our services failed to start on a test system (Ubuntu 12.04 on amd64). The stdout/stderr log streams contained only the string “Permission denied” – less than helpful. strace showed that the service tried to create a file under /run, where it has no write permission. This caused it to bail out:

open("/run/some_service", O_RDWR|O_CREAT|O_NOFOLLOW|O_CLOEXEC, 0644) = -1 EACCES (Permission denied)

Grepping the source code and configuration files for /run didn't turn up anything that could explain this open() call. Debugging with gdb gave further hints:

Breakpoint 2, 0x00007ffff73e3ea0 in open64 () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff73e3ea0 in open64 () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7bd69bf in shm_open () from /lib/x86_64-linux-gnu/librt.so.1
#2  0x0000000000400948 in daemonize () at service.cpp:93
#3  0x00000000004009ac in main () at main.cpp:24
(gdb) p (char*)$rdi
$1 = 0x7fffffffe550 "/run/some_service"
(gdb) frame 2
#2  0x0000000000400948 in daemonize () at service.cpp:93
93          int fd = shm_open(fname.c_str(), O_RDWR | O_CREAT, 0644);
(gdb) p fname
$2 = {...., _M_p = 0x602028 "/some_service"}}

The open("/run/some_service", ...) was caused by an shm_open("/some_service", ...).

This code works on other machines, so why does it fail on this particular one? Can you figure it out? Bonus points if you can explain why it is trying to access /run and not some other directory. You might find the shm_open() man page and source code helpful.

I'll be waiting for you.

Caaaaaaaaat!

The solution is pretty evident after examining the Linux version of shm_open(). By default, it tries to create shared memory files under /dev/shm. If that doesn't exist, it will pick the first tmpfs mount point from /proc/mounts.

In Ubuntu 12.04, /dev/shm is a symlink to /run/shm. On this machine the symlink was missing, which caused shm_open() to go hunting for a tmpfs filesystem, and /run happened to be the first one in /proc/mounts.

Re-creating the symlink solved the problem. Why it was missing in the first place is still unclear. In the aftermath, we're also improving the error messages in this part of the code to make such issues easier to diagnose.
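
To check a machine for this condition, something like the following sketch can help. It only mimics the fallback described above (first tmpfs entry in /proc/mounts) and is not glibc's actual code; the function names first_tmpfs and shm_dir_report are made up for this example.

```shell
# Print the mount point of the first tmpfs entry in a mounts file.
# Fields in /proc/mounts: device, mount point, filesystem type, options, ...
first_tmpfs() {
    awk '$3 == "tmpfs" { print $2; exit }' "${1:-/proc/mounts}"
}

# Report whether /dev/shm exists and, if not, which directory the
# fallback described above would end up using.
shm_dir_report() {
    if [ -d /dev/shm ]; then
        echo "/dev/shm exists, resolves to $(readlink -f /dev/shm)"
    else
        echo "/dev/shm is missing, first tmpfs mount is $(first_tmpfs)"
    fi
}
```

Calling shm_dir_report on a healthy Ubuntu 12.04 box should show /dev/shm resolving to /run/shm; on the broken machine it should have named /run as the first tmpfs mount.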


Behold the power of perf-tools

perf-tools is a collection of scripts for system-wide tracing on Linux. It's really, really cool. It's what the perf command should have included from day one, but didn't.

It is packaged in Debian and Ubuntu, but those versions lack some key features. As perf-tools consists of shell scripts (no compilation necessary), I recommend using the GitHub version directly:

git clone https://github.com/brendangregg/perf-tools.git

Two tools that are included are execsnoop and opensnoop, which trace new program executions and open() calls across the whole system.

$ sudo ./execsnoop
TIME        PID   PPID ARGS
21:12:56  22898  15674 ls --color=auto -la
21:12:56  22899  15674 git rev-parse --is-inside-work-tree
21:12:56  22900  15674 git rev-parse --git-dir
...

$ sudo ./opensnoop
Tracing open()s. Ctrl-C to end.
COMM             PID      FD FILE
opensnoop        22924   0x3 /etc/ld.so.cache
gawk             22924   0x3 /usr/lib/locale/locale-archive
top              15555   0x8 /proc/1/stat
...

Maybe the most interesting tool is uprobe. It's magic: it traces function calls in arbitrary user-space programs. With debugging symbols available, it can trace practically every function in a program. Without them, it can trace exported functions or arbitrary code locations (specified by raw address). It can also trace library code (e.g. libc). Having these possibilities on a production system without any prior setup is staggering.

$ sudo user/uprobe -F -l /tmp/a.out | grep quicksort
_Z9quicksortN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEES5_
$ sudo user/uprobe -F p:/tmp/a.out:_Z9quicksortN9__gnu_cxx17__normal_iteratorIPiSt6vectorIiSaIiEEEES5_
Tracing uprobe _Z9quicksort[snip] (p:_Z9quicksort[snip] /tmp/a.out:0x8ba). Ctrl-C to end.
   a.out-23171 [000] d... 1860355.891238: _Z9quicksort[snip]: (0x80488ba)
   a.out-23171 [000] d... 1860355.891353: _Z9quicksort[snip]: (0x80488ba)
   ...

(To demangle the C++ function names, use the c++filt tool.)
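
A small wrapper makes demangling convenient in a pipeline. This is only a sketch: the function name demangle is my own invention, and it passes its input through unchanged if c++filt (part of binutils) is not installed.

```shell
# Demangle C++ symbol names read from stdin, if c++filt is available.
# Falls back to copying stdin to stdout unchanged.
demangle() {
    if command -v c++filt >/dev/null 2>&1; then
        c++filt
    else
        cat
    fi
}
```

For example, the symbol listing above could be piped through it: sudo user/uprobe -F -l /tmp/a.out | demangle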

perf-tools really shows the power of the Linux perf/ftrace infrastructure, and makes it usable for the broad masses. There are several other tools that analyze latency and cache hit rates, trace kernel functions, and much more. To finally have such functionality in Linux is fabulous!


Running strace for multiple processes

Just a quick note about strace, the ancient Linux system-call tracing tool.

It can trace multiple processes at once: simply start it with multiple -p arguments (the numbers give the processes' PIDs):

sudo strace -p 2916 -p 2929 -p 2930 -p 2931 -p 2932 -o /tmp/strace.log

This is great for tracing daemons which use separate worker processes, for example Apache with mpm-prefork.
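
Typing out every PID gets tedious, so the argument list can be generated. The helper below is a sketch; the name strace_pid_args is made up, and the process name apache2 in the usage line is an assumption (that is what the Apache workers are called on Debian/Ubuntu).

```shell
# Turn a list of PIDs into repeated "-p PID" arguments for strace.
strace_pid_args() {
    for pid in "$@"; do
        printf '%s %s ' -p "$pid"
    done
}
```

Usage, relying on word splitting of the unquoted command substitutions: sudo strace $(strace_pid_args $(pgrep apache2)) -o /tmp/strace.log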


Configuring SSL on Apache 2.4

Configuring a modern web server to employ strong encryption and forward secrecy doesn't have to be hard. There is excellent documentation from Mozilla and from the OWASP.

Obtaining an SSL certificate

One major stumbling block is where to obtain an SSL certificate. In the future, this should hopefully be easy with Let's Encrypt. Until that is actually functional, StartSSL offers free SSL certificates. The process takes a bit of patience, but it's not difficult. There's also a StartSSL HOWTO from h-online.com.

While I've used StartSSL in the past, I had some trouble with them because 10 years after I registered greek0.net, someone grabbed greekO.net and StartSSL was alleging I was trying to mislead users?! So that was the end of my business with them...

I've now switched to Comodo's Positive SSL Certificate, which I like for a couple of reasons:

  • it lasts for 3 years,
  • it's really uncomplicated, and
  • it's crazy cheap: $7.45 per year.

The process of getting the cert from them was super easy, simpler than StartSSL. It took about 3 hours from going to their website to having the certificate installed on my server, most of it spent waiting for email verifications. Credit card payment was quick and easy. 10/10, would buy again :-)

Apache configuration

With the certificate acquisition out of the way, here are the juicy bits from my Apache config.

mod_ssl config:

# Enable only ciphers that support forward secrecy.
# See these two links for reference:
# https://stackoverflow.com/questions/17308690
# https://wiki.mozilla.org/Security/Server_Side_TLS#Non-Backward_Compatible_Ciphersuite
SSLCipherSuite ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!3DES:!MD5:!PSK

# Use server priorities for cipher algorithm choice.
SSLHonorCipherOrder on

# With Apache 2.4, SSLv2 is gone and only SSLv3 and TLSv* are supported.
# Disable SSLv3, all TLS protocols are OK.
SSLProtocol all -SSLv3

# Enable OCSP stapling
# With this, the client can verify that our certificate isn't revoked
# without having to query an external OCSP service.
SSLUseStapling On
SSLStaplingCache shmcb:${APACHE_RUN_DIR}/ssl_stapling(32768)

The per-site configuration:

SSLEngine On
# The private server key.
SSLCertificateKeyFile   /path/to/serverkey.key
# The certificate provided by the CA.
SSLCertificateFile      /path/to/certificate.crt
# The certificate chain bundle, a separate download from your CA.
SSLCertificateChainFile /path/to/cert-bundle


# Use a customized prime group for DH key exchange (vs Logjam attack).
# Generate a DH group file with:
#    openssl dhparam -out dhparams.pem 2048
#
# Newer Apache versions support the following command to set the dhparams:
SSLOpenSSLConfCmd DHParameters "/path/to/dhparams.pem"

# If Apache reports an error for the above line, remove it and include
# the dhparams in the certificate:
#   cat <CERT>.crt dhparams.pem > cert-with-dhparams.crt
#   SSLCertificateFile cert-with-dhparams.crt


# HSTS: Force browsers to require SSL for this domain for the next year.
# Down-grade to HTTP will cause browsers to abort with a security error.
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"


# HPKP: Pin the current key for the next two months.
# Generate hash using:
#   openssl rsa -in <serverkey>.key -outform der -pubout | \
#   openssl dgst -sha256 -binary | openssl enc -base64
#
# You ideally want to generate a backup key and include that here as well,
# in case the primary key is lost or compromised.
# Also note the implications for key rollover.
# See: https://developer.mozilla.org/en-US/docs/Web/Security/Public_Key_Pinning
Header always set Public-Key-Pins "pin-sha256=\"<HASH>\"; max-age=5184000; includeSubDomains"


# Disable compression to avoid BREACH HTTPS/SSL attack.
<Location />
    SetEnv no-gzip
</Location>

That should cover the basics.

Testing

As for SSL connection testing, I found the Qualys SSL Labs Test helpful. It shows which browsers (and browser versions) get which encryption quality (forward secrecy or not) and highlights common problems such as certificate chain issues.
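
For a quick local sanity check of the HSTS header, a small filter over the response headers is enough. This is a sketch under a few assumptions: the function name hsts_max_age_days is made up, and example.org in the usage line stands in for your own host.

```shell
# Read HTTP response headers on stdin and print the HSTS max-age in days.
# Prints nothing if no Strict-Transport-Security header is present.
hsts_max_age_days() {
    tr -d '\r' |
    awk 'tolower($0) ~ /^strict-transport-security:/ {
             if (match($0, /max-age=[0-9]+/)) {
                 # Skip the 8 characters of "max-age=" to get the number.
                 secs = substr($0, RSTART + 8, RLENGTH - 8)
                 print int(secs / 86400)
             }
         }'
}
```

Usage: curl -sI https://example.org/ | hsts_max_age_days (with the configuration above, this should report 365).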

Hope this helps someone out there!


Installing CyanogenMod 11 on a Samsung Galaxy S2

I bought my Samsung Galaxy S2 in 2011, and it's still going strong. It really was a great phone for the time and has held up incredibly well. Unfortunately, Samsung's support ended long ago, and users are stranded with obsolete (and insecure) firmware.

Fortunately, CyanogenMod still provides relatively recent images for the device. As of this writing, snapshots of CM11 (based on Android 4.4) are available, but there are no images of CM12.

Here is how I flashed CM11 to my phone. This is based on the official CyanogenMod wiki page for the SGS2 and on this xda-developers post. Since you can brick your phone if you don't know what you are doing, I suggest reading both of these pages. Note that you will need to factory-reset your phone, so back up all your data (files, apps, SMS, contacts, ...).

All the following steps have to be performed in a root shell on Linux.

To start from a clean slate, create a new Debian Jessie chroot (you may need to install debootstrap first). Don't use LXC/Docker/VMware here; you need raw hardware access:

host#  mkdir sgs2-flash-chroot
host#  cd sgs2-flash-chroot
host#  debootstrap jessie .
host#  mount --bind /dev/ dev
host#  mount --bind /sys sys
host#  mount --bind /proc proc

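Copy the following files to sgs2-flash-chroot/tmp (these are the files used by the commands below):

  • zImage (the kernel image flashed with heimdall)
  • Recovery_CWM_6.0.4.7_I9100.zip
  • cm-11-20141115-SNAPSHOT-M12-i9100.zip
  • gapps-kk-20140105-signed.zip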

Boot the phone into download-mode (shutdown, then VOLDOWN + HOME + POWER) and connect to the Linux computer.

host#  chroot .
chroot#  apt-get install heimdall-flash android-tools-adb
chroot#  heimdall print-pit
chroot#  cd /tmp
chroot#  heimdall flash --KERNEL zImage --no-reboot

Disconnect the USB cable and hold POWER until the phone shuts down. Reboot into recovery (VOLUP + HOME + POWER, let go of POWER after 5 seconds or you'll trigger a reboot). Then reconnect the USB cable.

chroot#  adb devices    # Check if device recognized.
chroot#  adb push Recovery_CWM_6.0.4.7_I9100.zip /emmc

In recovery, select "install from zip file" to flash the new recovery image. Then go into advanced -> "reboot recovery". Mount /storage/sdcard0 in the recovery menu, then reconnect the USB cable.

chroot#  adb devices    # Check if device recognized.
chroot#  adb push cm-11-20141115-SNAPSHOT-M12-i9100.zip /storage/sdcard0
chroot#  adb push gapps-kk-20140105-signed.zip /storage/sdcard0

Again in recovery, select "install from zip file" and install first the CM image, then the GApps package. Select "reboot" to boot into CyanogenMod. Then shut down again, reboot into recovery, wipe the cache, perform a factory reset, and reboot into CM. (Avoid doing a factory reset with the stock kernel because of the "super brick" problem.)

Done. You should now have a not-so-shiny-anymore Galaxy S2 running a new-and-shiny CyanogenMod 11. Enjoy :-)


Correct use of hyphens in man-pages

When writing manual pages, the question comes up when to use "-" and when to use "\-". The answer is actually quite simple. Use "-" whenever you want a hyphen and "\-" when you want a minus sign.

There are two exceptions though: In the name section, "\-" is used to separate program name from short description, as in "man \- an interface to on-line manuals".

The other exception is that you have to use "\-" for options/switches (-h, --foo, etc.). "\-" causes man to emit a U+002D Hyphen-Minus character, whereas "-" results in U+2010 Hyphen (in a Unicode locale).

U+002D is the normal ASCII hyphen character, the one programs use to test for switches. So "\-" allows copy & paste from the manpage, while "-" doesn't.
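
To find offenders in an existing page, a rough grep can flag option-like words that start with a plain "-". This is a heuristic sketch, not a roff parser; the function name find_unescaped_options is made up.

```shell
# Print (with line numbers) the lines of a man page source that contain
# an option-like word starting with a plain "-" instead of "\-".
# A "-" only counts if it follows whitespace or starts the line, so
# hyphenated words like "on-line" and escaped "\-" are not reported.
find_unescaped_options() {
    grep -nE '(^|[[:space:]])-{1,2}[[:alnum:]]' "$1"
}
```

A source line such as ".B -h" gets reported, while ".B \-\-foo" passes.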


ELF talk

Last Monday I gave a short talk about ELF objects and dynamic linking for the Debienna crowd. It went semi-well; people were quite interested but sometimes didn't seem to grasp what I was talking about. That was probably my fault: I didn't spend enough time preparing the talk, and the subject is difficult to begin with.

Perhaps I'll talk about the subject again for maks, Rhonda and baumgartner (if they are still interested), since they weren't able to attend.

In case anyone cares, I've written up some notes about ELF, dynamic linking, symbol lookup and related stuff, covering most of the things I talked about.


Cross-compiler fun

I needed to fix the elfutils build failure on ia64, but I didn't have access to such a machine. Fortunately Herbert Pötzl pointed out ski, an ia64 emulator for Linux.

Ski needs a custom guest kernel however, so I had to cross-compile that for ia64.

Setting up a cross-compiling toolchain on Debian is really easy nowadays; there's even a nice HOWTO describing the needed steps. For lazy people, pre-built packages are available.

When compiling the toolchain yourself, note that you may need more or different library packages than those listed in the HOWTO. This depends on the target architecture, e.g. for ia64 you will need libunwind7-dev, libatomic-ops-dev, and libc6.1 instead of libc6. Otherwise gcc will complain about missing build-dependencies.

For ia64 I ran into a linker error when building gcc, but a patch from Bertl's cross-compiling corner solved that.

While doing all this I wrote some scripts to automate the process, so compiling a cross-toolchain (for any architecture) is now a matter of 5 minutes configuration and one ./driver run. Whee!


Automatically syncing files between hosts without compromising security

The problem

The goal is to automatically synchronize files between several hosts without compromising the integrity of the separate machines. A nice tool for 2-way sync is unison. The Right Way (TM) to sync files between different machines is to tunnel the unison protocol over ssh, which unison supports well.

To run the sync automatically (e.g. via cron), you need to create an SSH keypair without a passphrase, so unison can log into the other machine without human interaction. This is where the problems start, since anyone who gets access to the private key (e.g. by compromising or stealing the machine the private key is on) can log into the other host.

Now ssh has a nice way to restrict what you can do with a specific key, so you can e.g. use the following in the remote host's ~/.ssh/authorized_keys:

command="/usr/bin/unison -server" ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA8K2cd0yemw...

That way someone who has the private key can't execute arbitrary commands, only unison in server mode. However, it's still possible to tell the unison server to overwrite arbitrary files (any that the user has write access to). This is a major problem, since files like ~/.bashrc can be overwritten too, so the next time the user logs in, arbitrary commands will be executed.

A possible solution

One solution is to simply create a new user on the remote host with a disabled password, and let unison run as that user (via adding the appropriate line to $HOME/.ssh/authorized_keys, and telling the local unison to use that username).

That's possible, but the .bashrc trick still works. It's just less likely that the code there is ever executed (root would have to use su to become that user).

For me this solution didn't work out since I wanted to sync my maildir, and it was hard to ensure that file permissions were set in a way that both allowed me to read my mail and allowed unison (running under user unison-sync) to sync the files.

The Right Solution (TM)

All the problems vanish as soon as you run unison under the user you'd normally use, but in a chroot. Now a full-blown chroot takes up a lot of space, and there's once again the danger that someone might enter the chroot and run some kind of shell (though the risk is even lower).

It's best to use a chroot which only contains the bare minimum of files necessary to run unison -server.

You get numerous advantages:

  • No problems with file permissions.
  • No shell inside the chroot that would read startup files from $HOME.
  • Hardly any space wasted. The whole chroot is about 4 MB in size.
  • Since the chroot is pretty much empty, many common exploits (well, shell codes) won't work.

How to do it

greek0@orest:/home/chroots/unichroot$ cat ~/.ssh/authorized_keys
command="/usr/bin/dchroot -q -c unison -- /usr/bin/unison -server" ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA8K2cd0.....

greek0@orest:/home/chroots/unichroot$ grep unison /etc/dchroot.conf
unison /home/chroots/unichroot

greek0@orest:/home/chroots/unichroot$ find . -maxdepth 3 | xargs ls -ld
drwxr-xr-x   2 root   root      4096 2006-08-04 16:39 ./bin
-rwxr-xr-x   1 root   root    576100 2006-08-04 15:06 ./bin/sash
lrwxrwxrwx   1 root   root         4 2006-08-04 16:39 ./bin/zsh -> sash
drwxr-xr-x   3 root   root      4096 2006-08-04 15:10 ./home
drwx------   4 greek0 greek0    4096 2006-08-04 15:18 ./home/greek0
drwx------  31 greek0 greek0    4096 2006-08-04 13:47 ./home/greek0/Maildir
drwx------   2 greek0 greek0    4096 2006-08-04 15:47 ./home/greek0/.unison
drwxr-xr-x   2 root   root      4096 2006-08-04 15:07 ./lib
-rwxr-xr-x   1 root   root     88164 2006-08-04 14:58 ./lib/ld-linux.so.2
-rwxr-xr-x   1 root   root   1151644 2006-08-04 14:56 ./lib/libc.so.6
-rw-r--r--   1 root   root      9592 2006-08-04 14:56 ./lib/libdl.so.2
-rw-r--r--   1 root   root    141040 2006-08-04 14:55 ./lib/libm.so.6
-rw-r--r--   1 root   root      9656 2006-08-04 14:55 ./lib/libutil.so.1
drwxr-xr-x   3 root   root      4096 2006-08-04 14:53 ./usr
drwxr-xr-x   2 root   root      4096 2006-08-04 14:55 ./usr/bin
lrwxrwxrwx   1 root   root        14 2006-08-04 15:12 ./usr/bin/unison -> unison-2.13.16
-rwxr-xr-x   1 root   root    955784 2006-08-04 14:54 ./usr/bin/unison-2.13.16

The zsh symlink is there because I have /bin/zsh as my shell in /etc/passwd, and dchroot also wants to use it in the chroot (for launching unison).

/home/greek0/Maildir is bind-mounted from outside the chroot; the bind mount is set up at boot time via /etc/fstab.

The chroot was created manually, simply by copying the files from the host. You obviously need /usr/bin/unison plus all the libraries it depends on. You can find those via readelf -d /usr/bin/unison | grep NEEDED. Additionally you need the dynamic linker /lib/ld-linux.so.2 (seen from readelf -l /usr/bin/unison | grep INTERP -A 1).
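
The readelf output can be reduced to bare library names with a short filter. This is a sketch; the function name needed_libs is made up, and it only lists the direct dependencies of the binary it is given.

```shell
# Extract library names from `readelf -d` output on stdin.
# The relevant lines look like:
#   0x00000001 (NEEDED)   Shared library: [libc.so.6]
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}
```

Usage: readelf -d /usr/bin/unison | needed_libs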

One thing to pay attention to is that most of the files copied from /lib are symlinks. Be sure to either use cp without options (which copies the contents of the link targets), or use cp -a and copy the link targets too.


Tools for mutt

Tools

mutt-bug

This is a tool that displays Debian bug reports in mutt. You can then directly read all messages sent to the bug and reply. The messages are fetched directly from the web interface, so there is no delay between requesting a bug and getting it by email.

This tool was originally written by Christoph Berg; I've made some modifications to make it work in arbitrary directories.

Usage:

mutt-bug bugnumber

Download: mutt-bug

gpgverify

gpg --verify is quite slow when you have large keyrings included (like the Debian keyring). This is nasty, since mutt has to wait until gpg is finished when displaying a gpg-signed message (with signature verification on). So I've written a tool that splits a huge keyring into a lot of smaller keyrings (one key per keyring) and a shell script to verify signatures, to be used from within mutt. The former tool is called splitkeyring.sh. The latter one is gpgverify.sh.

gpgverify.sh first invokes gpg --verify as normal and captures its output. If gpg failed because the key was not found in any keyring, the script checks whether the key is in one of the split keyrings, and if so, reruns gpg with that keyring included. Otherwise the gpg error is returned.

These scripts are still hacky; if you want to use them you'll probably have to modify them a bit. They aren't too big, so this shouldn't be too much of a problem.
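
The fallback logic can be sketched roughly as follows. This is not the actual gpgverify.sh: the directory layout (one keyring per key, named after the key ID), the SPLIT_KEYRING_DIR variable and both function names are assumptions made for illustration. It relies on gpg's machine-readable NO_PUBKEY status line, and unlike the real script it only returns a failure status instead of replaying gpg's error message.

```shell
# Hypothetical layout: one keyring per key, named <KEYID>.gpg, in this directory.
SPLIT_KEYRING_DIR=${SPLIT_KEYRING_DIR:-$HOME/.gnupg/split-keyrings}

# Extract the key ID from gpg's "[GNUPG:] NO_PUBKEY <keyid>" status line.
missing_key_id() {
    awk '$1 == "[GNUPG:]" && $2 == "NO_PUBKEY" { print $3; exit }'
}

# Verify normally; if the key is unknown, retry with the matching split keyring.
verify_with_split_keyrings() {
    status=$(gpg --status-fd 1 --verify "$@" 2>/dev/null) && return 0
    keyid=$(printf '%s\n' "$status" | missing_key_id)
    ring="$SPLIT_KEYRING_DIR/$keyid.gpg"
    if [ -n "$keyid" ] && [ -f "$ring" ]; then
        gpg --keyring "$ring" --verify "$@"
    else
        return 1
    fi
}
```

The point of the design is that the slow, huge keyring never has to be loaded: the common case uses the normal keyrings, and the rare miss loads exactly one tiny keyring.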

Download: gpgverify