pf on OS X 10.7

Wednesday, 14. 09. 2011  –  Category: sw

Today’s the first day that my new laptop, which runs OS X 10.7 (Lion), will sit on an untrusted network, so I figured it was time to port my firewall rules across from the old one, which ran OS X 10.6 (Snow Leopard).

I cut my UNIX teeth at a cryptoanarchist shop whose culture of paranoia makes me wary of Apple’s own firewall with its emphasis on letting all the hot shininess Just Work rather than being overly fussy about inbound connections. Furthermore, with IPv6 tunnels like aiccu, you aren’t behind the warm fuzzy comfort of NAT, you’re just there on the net with all the fun that entails. So, worth having some extra protection I reckon.

10.6 (and earlier) came with ipfw, a packet filter that’s knocked around the BSD world for some time. It works but isn’t overly featuresome (for example, it doesn’t support NAT in-kernel, so you monkey about passing packets to an external daemon). But it was Good Enough for an end system so I supplemented the system’s “Application Firewall” with an additional ipfw ruleset to give an approximation of safety when out and about, and way more permissive when on networks I trust.

On 10.6 I used WaterRoof to sort-of manage the ipfw rules, in that I only really used its launchd loader and would hack on the rules by hand. In the spirit of decrufting I figured I’d sort that myself and went to remind myself how to load ipfw rules en masse. I noticed at the top of the man page:

NAME
ipfw -- IP firewall and traffic shaper control program (DEPRECATED)
...
DESCRIPTION
Note that use of this utility is DEPRECATED. Please use pfctl(8) instead

Deprecated? Use pfctl instead? Good news everyone – OS X 10.7 now comes with pf, another BSD packet filter that I’ve chosen over ipfw on BSD hosts for years off the back of its featureset (native NAT, state syncing between failover firewall pairs, traffic queueing…).

Anyway, the point of this post is to point out a few things I noticed which are intriguing. Firstly, pf is not enabled by default. Further, Apple have added some moving parts around how it is enabled. From /etc/pf.conf:

# This file contains the main ruleset, which gets automatically loaded
# at startup. PF will not be automatically enabled, however. Instead,
# each component which utilizes PF is responsible for enabling and disabling
# PF via -E and -X as documented in pfctl(8). That will ensure that PF
# is disabled only when the last enable reference is released.

These two flags, -E and -X, are absent from pf on BSD. Here’s how they’re documented on OS X:

-E Enable the packet filter and increment the pf enable reference count.
-X token
Release the pf enable reference represented by the token passed.

This suggests that different system components might choose to enable and disable pf, and this is the mechanism to coordinate that. There’s a clue about which components in /etc/pf.anchors/com.apple, which is loaded by the main /etc/pf.conf. It defines additional rule anchors:

anchor "100.InternetSharing/*"
anchor "200.AirDrop/*"
anchor "250.ApplicationFirewall/*"

Interestingly, this host’s ApplicationFirewall has a bunch of entries when viewed in the Preferences GUI, yet the pf anchor of the same name is empty (and pf was disabled when I started out):

$ sudo pfctl -a com.apple/250.ApplicationFirewall -s rules
Password:
No ALTQ support in kernel
ALTQ related functions disabled

so I’m unsure what the status of this mechanism is. I’ve not had occasion to use AirDrop or connection sharing, but would be curious to see if either uses these anchors and enables pf temporarily.
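If you want to poke at all three anchors in one go, something like the following should do. It’s only a sketch: the anchor names are the ones listed in /etc/pf.anchors/com.apple above, the function name is my own invention, and I’m assuming pfctl’s -s Anchors (which lists sub-anchors) behaves on OS X as it does on BSD.

```sh
# Hypothetical helper: show the rules and sub-anchors under each Apple anchor.
# Anchor names are taken from /etc/pf.anchors/com.apple; pfctl needs root.
list_apple_anchors() {
    for name in 100.InternetSharing 200.AirDrop 250.ApplicationFirewall; do
        echo "== com.apple/$name"
        # Rules loaded directly into this anchor
        pfctl -a "com.apple/$name" -s rules
        # Anchors nested below it (the "/*" in the anchor lines)
        pfctl -a "com.apple/$name" -s Anchors
    done
}
```

Paste it into a root shell and run list_apple_anchors before and after toggling AirDrop or Internet Sharing to see whether anything turns up.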

Finally, what’s the token that’s passed to -X? You can ask pfctl for the current tokens:

$ sudo pfctl -s References
No ALTQ support in kernel
ALTQ related functions disabled
TOKENS:
PID Process Name TOKEN TIMESTAMP
17013 pfctl 18446743524308110600 0 days 01:05:50

I enabled pf with pfctl, so that makes sense. When I did so it didn’t inform me of the token, but I suppose an enabling process would spelunk the token shortly after enabling pf by looking up its own name and PID, and pass it back when it’s finished with pf.
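Here’s roughly how that spelunking might look from a shell. This is a guess at the workflow rather than anything Apple documents: the function name is hypothetical, and it assumes the column layout shown in the output above (PID first, token third), which would break on a process name containing spaces.

```sh
# Hypothetical helper: read `pfctl -s References` output on stdin and
# print the token held by the given PID.
# Assumes the layout above: PID, process name (no spaces), token, timestamp.
token_for_pid() {
    awk -v pid="$1" '$1 == pid { print $3 }'
}

# Possible usage (as root), releasing the reference held by PID 17013:
#   token=$(pfctl -s References | token_for_pid 17013)
#   pfctl -X "$token"
```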

Now, on with the actual job of ruleset writing and puzzling out the launchd voodoo required to enable it at boot.

Minor whinge: Apple could do with updating /etc/protocols:

# $FreeBSD: src/etc/protocols,v 1.14 2000/09/24 11:20:27 asmodai Exp $

Why whinge? It doesn’t know icmp6 is a valid alias for ipv6-icmp. Yep, minor.

Cyrus saslauthd and passwords containing quote marks

Saturday, 11. 06. 2011  –  Category: sw

On the back of reading how affordable and powerful GPUs make for insanely fast brute-force software (eg: whitepixel2) I recently did a round of password strengthening, even for accounts that aren’t immediately vulnerable to 30 billion MD5s a second (yes!) attacks.

I then found that whenever I sent mail using authenticated SMTP my mail server would lock up with saslauthd chewing the CPU. This authentication daemon is the glue between the MTA (Exim) and the IMAP server (Courier) – it logs into the IMAP service to test the SMTP user’s credentials. This little kink of indirection comes about because the IMAP daemon is downstream from the Exim host, in a BSD jail host, so its own authentication mechanisms aren’t visible to the MTA.

My new mail password contained a double-quote mark, which made me wonder if the password wasn’t being quoted properly. Testing a bit with openssl:

$ openssl s_client -starttls smtp -connect localhost:25
CONNECTED(00000003)
---
250 HELP
EHLO localhost
250-svc9.zomo.co.uk Hello localhost [127.0.0.1]
250-SIZE 52428800
250-PIPELINING
250-AUTH PLAIN LOGIN
250 HELP
AUTH PLAIN AGZvbwAi <-- this is Base64 for username foo, password "

[ hang ]
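For anyone wanting to build their own test string: the AUTH PLAIN argument is the Base64 of a NUL byte, the username, another NUL and the password (the leading authorisation identity is left empty). A small sketch, with a function name of my own choosing and assuming a base64 command is on the path:

```sh
# Hypothetical helper: build a SASL PLAIN response for a username and password.
# The layout is: empty authzid, NUL, username, NUL, password, all Base64 encoded.
auth_plain() {
    printf '\000%s\000%s' "$1" "$2" | base64
}

# auth_plain foo '"' gives the AGZvbwAi used in the session above.
```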


Compiling a -g debug variant of the daemon and aiming gdb at it:

$ sudo gdb /usr/local/sbin/saslauthd-debug 97103
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
...
(gdb) bt
#0 0x284250d1 in strchr () from /lib/libc.so.7
#1 0x0804a823 in qstring ()
#2 0x0804ac45 in auth_rimap ()
#3 0x0804f8e3 in do_auth ()
#4 0x0804e1f4 in do_request ()
#5 0x0804e53b in ipc_loop ()
#6 0x0805018d in main ()

What’s qstring()? It’s a function for escaping the quote marks in strings passed to the IMAP daemon. Turns out the count-the-quotemark logic wasn’t properly advancing along the string, so it would sit there spinning forever.

Trivial patch[1] fixes:

$ openssl s_client -starttls smtp -connect localhost:25
CONNECTED(00000003)
---
250 HELP
EHLO localhost
250-svc9.zomo.co.uk Hello localhost [127.0.0.1]
250-SIZE 52428800
250-PIPELINING
250-AUTH PLAIN LOGIN
250 HELP
AUTH PLAIN AGZvbwAi
535 Incorrect authentication data

Better :)
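For context, the job qstring() has to do is small: an IMAP quoted string is wrapped in double quotes, with any double quote or backslash inside it preceded by a backslash. The following is not the saslauthd code or the patch, just a shell sketch of that escaping rule under a made-up function name:

```sh
# Hypothetical helper: wrap a value as an IMAP quoted string.
# Backslashes and double quotes inside the value get a leading backslash.
imap_quote() {
    printf '"%s"\n' "$(printf '%s\n' "$1" | sed 's/[\\"]/\\&/g')"
}

# imap_quote 'pa"ss' prints "pa\"ss" (including the outer quotes)
```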

  1. Gist if it’s not inlined above

Competing webserver workloads

Thursday, 17. 02. 2011  –  Category: sw, web

Recently a client was receiving complaints that their busy server hosting both their WordPress sites and their OpenX[1] banner delivery was underperforming. Specifically, sites including their banners were seeing page loads hang on them. If you’re in the business of selling banners this is bad news. There were reports of the WordPress sites being slow too, but mostly from administrators[2] rather than site visitors.

I sorted out a bunch of request amplification issues but things still weren’t right, so I added a second server to help out. Instead of just chucking the combined traffic at both servers I used HAProxy to separate out the traffic to each, with a view to adding more OpenX servers as necessary.

Here’s what HAProxy’s stats had to say after some time running the sites split:

wordpress
        Queue            Session rate     Sessions                           Bytes
        Cur  Max  Limit  Cur  Max  Limit  Cur  Max  Limit  Total    LbTot    In          Out
app01   0    0    -      1    367         2    454  -      1758026  1758026  1336541933  30062777594

openx
        Queue            Session rate     Sessions                           Bytes
        Cur  Max  Limit  Cur  Max  Limit  Cur  Max  Limit  Total    LbTot    In          Out
app02   0    0    -      19   45          3    75   -      5588327  5588293  4216687748  11878951168

Some of these I found unsurprising – WordPress serves a higher volume of data, as it is content heavy compared to banner delivery and related click handling. Conversely the inbound data volume for OpenX is up because it’s loaded with click information.

What’s interesting is that the WordPress sites have a higher maximum concurrent session count, yet the total session count is far higher for the OpenX banners. This illustrates the benefit of separating out different server loads: one server is churning away pushing out fat content and even when heavily cached this burns enough resource that requests get queued and gum up, whilst another is fielding quick-in quick-out requests. When it’s not contending with its laggard sibling it can get on with its business unhindered.
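To put a number on “fat” versus “quick-in quick-out”, divide the outbound bytes by the total sessions from the stats above. A throwaway sketch (the function name is mine, and it uses integer division so the result is rounded down):

```sh
# Hypothetical helper: average bytes sent per session, rounded down.
# $1 = Bytes Out, $2 = Sessions Total, both read off the HAProxy stats.
bytes_per_session() {
    echo $(( $1 / $2 ))
}

bytes_per_session 30062777594 1758026   # wordpress (app01)
bytes_per_session 11878951168 5588327   # openx (app02)
```

That works out at roughly 17 KB per WordPress session against about 2 KB per OpenX session.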

Ultimately the visibility HAProxy affords beats an Apache scoreboard when that Apache is fielding two differently focused workloads.

  1. advertising is a necessary evil, right?
  2. and that turned out to be a pagination issue