Getting paranoid about ssh-agent
Wednesday, 09. 1. 2010 – Category: vague
A colleague asked me about my SSH setup, which uses a separate SSH agent for each set of keys that I use (I tend to use a different keypair for each client I work with) and also makes ssh-agent confirm with me each time a key is used.
What’s the point of all that? Because it’s trivially easy to take over someone else’s SSH agent if you have root on a box they’re forwarding to:
$ ssh-add -l
1024 c7:ba:59:92:98:40:f4:53:75:e3:7f:03:fc:0e:3b:bd /Volumes/key/ssh/id_dsa-zomo-bbc (DSA)
$ sudo -i
# ls -ld /tmp/ssh-*
drwx------ 2 victim admins 4096 Aug 27 16:20 /tmp/ssh-bsKJhM8501
drwx------ 2 me admins 4096 Sep 1 09:25 /tmp/ssh-NpAJW14419
# SSH_AUTH_SOCK=/tmp/ssh-bsKJhM8501/agent.8501 ssh-add -l
1024 7a:0a:df:bb:ab:cd:af:e1:04:97:cd:05:34:8c:b4:68 /home/victim/.ssh/id_dsa (DSA)
By setting SSH_AUTH_SOCK to their agent’s forwarding socket you can gain use of their agent for onward logins. Laws may apply.
To mitigate this risk, I use a collection of scripts that do two things:
- Run different SSH agents for different keys, so that a compromised agent has only limited use (eg: root on client A’s hosts can’t use it to access client B’s hosts).
- Require ssh-agent to prompt for confirmation before it uses a key, so that a compromised agent stands less chance of being exploited (if I’m away or I decline the request then nothing happens).
They’re here: http://github.com/zomo/ssh-bits. No points for elegance, but they scratch the itch.
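For flavour, here is a minimal sketch of the same two ideas. It is not the code from that repository: the function names and the `~/.ssh/agents` directory are made up for illustration. It relies on `ssh-agent -a` to bind an agent to a chosen socket and on `ssh-add -c` to demand confirmation on each use, which in turn needs an askpass program to be available.

```shell
#!/bin/sh
# Hypothetical layout: one agent socket per client under ~/.ssh/agents/.
AGENT_DIR="${AGENT_DIR:-$HOME/.ssh/agents}"

# Print the socket path used for a given client name.
agent_sock() {
    printf '%s/%s.sock\n' "$AGENT_DIR" "$1"
}

# Start an agent for the client unless one already answers on its socket.
start_agent() {
    sock=$(agent_sock "$1")
    mkdir -p "$AGENT_DIR" && chmod 700 "$AGENT_DIR"
    # ssh-add -l exits 2 when it cannot contact an agent.
    SSH_AUTH_SOCK="$sock" ssh-add -l >/dev/null 2>&1
    if [ $? -eq 2 ]; then
        rm -f "$sock"
        ssh-agent -a "$sock" >/dev/null
    fi
}

# Load a key with -c so the agent asks for confirmation on every use.
add_key() {
    SSH_AUTH_SOCK=$(agent_sock "$1") ssh-add -c "$2"
}

# Run ssh with only the named client's agent visible.
ssh_as() {
    client=$1; shift
    SSH_AUTH_SOCK=$(agent_sock "$client") ssh "$@"
}

# Usage (client name, key path and host are placeholders):
#   start_agent clientA
#   add_key clientA ~/.ssh/id_clientA
#   ssh_as clientA -A somehost
```

Because each `ssh_as` call exports only one client's socket, a forwarded agent on client A's hosts never holds client B's keys.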
cron
Wednesday, 02. 24. 2010
Obviously cron jobs are abundantly useful for so many things, all the way from basic housekeeping up to big application functionality.
They’re also the source of plenty of flail. What do I mean?
- They are neither code nor data, so often get overlooked, or shonkily installed, by application deployment tools
- They run with a minimal environment that can catch out the unwary: scripts that work in an interactive shell sometimes don’t from cron
- The default behaviour of mailing output to the cronjob owner generates large amounts of mail that gets ignored, filtered or bounced
- Jobs can fail silently and no-one notices until, say, you need to restore that backup that hasn’t run for the last six months
- Jobs that helpfully append their output to a log commonly don’t rotate that log
- It’s easy to have jobs overlapping if they get stuck or take longer than expected to complete. This is a splendid way of wedging a machine.
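The minimal environment point is easy to check before a job ever reaches a crontab. The sketch below runs a command with roughly what Vixie cron provides; the `cronenv` name is made up, and the `PATH` and `SHELL` values are assumptions based on common defaults, so check your own cron's documentation.

```shell
#!/bin/sh
# Run a command with roughly the environment cron would give it.
# PATH and SHELL below are assumed defaults, not guaranteed for every cron.
cronenv() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$*"
}

# Example: a variable exported in your login shell is invisible to the job.
#   MY_SETTING=1; export MY_SETTING
#   cronenv 'echo "MY_SETTING=${MY_SETTING:-unset}"'
#   prints: MY_SETTING=unset
```

If a script works at the prompt but fails under `cronenv`, it will most likely fail from cron for the same reason.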
The mail aspect is a particular peeve. At some jobs my mailbox has enjoyed several thousand cron-generated mails a day, and there’s no way I’m able to accurately look at each one and react to it. Mostly they contain expected output from successful job execution, so they’re easy to skip. But I don’t trust my eyes to get that right all the time.
One approach to this is to arrange for jobs to only send mail on error. This is an improvement, but it can lull you into thinking that a job is happily succeeding when in fact it’s either not running or the only-on-error logic is bust. Since cron jobs often cover essential system tasks like backing up, syncing data around and reporting, it’s vital that they don’t fail silently.
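Mail-only-on-error can be as small as the sketch below, which leans on the fact that cron mails whatever a job prints. The `quiet_unless_fail` name is made up for illustration.

```shell
#!/bin/sh
# Run a command, hold its output, and print it only if the command fails.
# Silence on success means cron has nothing to mail.
quiet_unless_fail() {
    out=$(mktemp) || return 1
    "$@" > "$out" 2>&1
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "FAILED (exit $rc): $*"
        cat "$out"
    fi
    rm -f "$out"
    return "$rc"
}

# Usage in a crontab entry (script path is a placeholder):
#   quiet_unless_fail /usr/local/bin/nightly-backup
```

Note that the weakness described above still holds: silence from this function looks exactly the same as a job that never ran, so it needs pairing with some positive signal that the job executed.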
I’ve worked somewhere that tackled this by collating cron-generated mails from diverse systems into a system mailbox and pattern matching them for failure signs. This seems slightly dubious (it’s fragile and labour intensive), but at least the system also flagged when mail from an expected job failed to arrive, and it got our inboxes tamed.
To tackle these problems I find myself writing wrappers for cronjobs. I’ve written several variants to meet different situations’ needs. Unhelpfully I call them all cronwrap. These wrappers set out to:
- Engage the amazingly useful lockrun utility to guard against multiple execution of stuck crons
- Place cron output into timestamped logs that can be both aged out and made available to interested parties
- Hook into local monitoring systems:
  - On execution, update a run counter (SNMP data or some simple text file)
  - On failure, send an SNMP trap or leave some bait for Nagios. Also, update a fail counter
  - If lockrun has prevented a job running owing to overlap, send an SNMP trap or similarly bait Nagios
- If required, send output by mail somewhere (sometimes this is necessary, even with the concerns listed above)
So, nothing surprising there. Using such wrappers helps keep cron jobs tamed and reliable, and it’s monitoring them near to where the action occurs, rather than mediating via SMTP.
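A stripped-down wrapper along those lines might look like the sketch below. It is not one of my actual cronwrap variants. The directory layout and counter file names are hypothetical, a `mkdir` lock stands in for lockrun so the example has no dependencies, and plain text counter files stand in for SNMP traps or Nagios bait.

```shell
#!/bin/sh
# cronwrap sketch: run a job under a lock, log its output, count runs and failures.
# All paths and file names here are hypothetical.
CRONWRAP_DIR="${CRONWRAP_DIR:-/var/tmp/cronwrap}"

# Add one to the number stored in a counter file (a missing file counts as 0).
bump() {
    n=$(cat "$1" 2>/dev/null || echo 0)
    echo $((n + 1)) > "$1"
}

# cronwrap NAME COMMAND [ARGS...]
cronwrap() {
    name=$1; shift
    dir="$CRONWRAP_DIR/$name"
    mkdir -p "$dir/logs" || return 1

    # mkdir is atomic, so it serves as a simple lock (standing in for lockrun).
    if ! mkdir "$dir/lock" 2>/dev/null; then
        bump "$dir/overlaps"    # bait for the monitoring system
        return 75
    fi

    # One timestamped log per run, easy to age out later.
    log="$dir/logs/$(date +%Y%m%dT%H%M%S).log"
    bump "$dir/runs"
    "$@" > "$log" 2>&1
    rc=$?
    if [ "$rc" -ne 0 ]; then
        bump "$dir/fails"
    fi

    rmdir "$dir/lock"
    return "$rc"
}

# Usage (job name and script path are placeholders):
#   cronwrap nightly-backup /usr/local/bin/nightly-backup
```

One design caveat: unlike a kernel file lock, a `mkdir` lock is left behind if the wrapper is killed mid-run, so a real version would want stale lock handling or lockrun itself.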
This is hardly invention either; there’s plenty of prior art with different nuances in behaviour to meet the needs of different environments. Perhaps I’ll merge the variants of my efforts and publish too.
What’s curious is that this functionality isn’t available inside the cron daemon[1] itself. It is perfectly placed to catch exit status, divert output and know if a job has overrun, and building it in would remove the need for all this additional monkeying to make jobs reliable and well behaved. If my C wasn’t just read-only I’d have a crack at it!
There, I’ve finally condensed all my cron rant into one sustained piece.
[1] To be clear, I’m talking about the BSD cron written by Paul Vixie. None of the variants I’ve seen address these concerns either. I’d love to know if there are any I’ve missed.
SOAP in unexpected “actually, quite easy” incident.
Wednesday, 09. 30. 2009
In today’s web service access shuffle, the mission is introducing a large number of new backend pools, traffic rules and virtual servers to Zeus ZXTM balancers. No time for monkeying around in the web UI, better check out their well-documented API. It uses SOAP, which I’ve never got busy with before – slightly apprehensive.
The reference documentation has examples in Perl and PHP which got me part of the way, but I’m most comfortable in Ruby now, and was happy to find[1] this pointer to using the soap4r library.
Chief bonus here is the wsdl2ruby.rb tool that’ll transform the WSDL data into Ruby objects with hierarchy, attribute accessors and everything else to make operating the API really comfortable. If your WSDL is a moving target during development it’ll even do this on the fly.
This meant getting the scripting done to configure the ZXTMs was pretty straightforward, without any faffing with the underlying access mechanism. Refreshing!
[1] I was going to paste example code, but this’ll do