The Thoughts of Matt Palmer
In a perfect world, when we design some aspect of our infrastructure, we'd get it right the first time, and would never have to change our design once it had been deployed.
Back in this world, however, we regularly have to change things as we expand our infrastructure. Sometimes we made a mistake originally, other times new requirements pop up. In either case we will end up doing things differently in the future than we did in the past.
The danger here is that if you only create your new machines or services differently, you've got effectively two separate and different systems to maintain. This may not sound like much, until it happens again... and again... and again... and then suddenly you've got 18 different ways of setting up backups, depending on how long ago you set it up and (if you're really unlucky) who did it and what the phase of the moon was at the time.
If you've been here before, you probably remember the pain of having some disaster happen and needing to fix it NOW NOW NOW, only to wade into the setup and realise that it's been done in a way that (you thought) died out six months ago -- and so you don't really remember how it works, and yeah, the documentation's been updated so many times since then that it doesn't mention the old way of doing things, and here comes your boss to ask how long it's going to be until it's all working again...
If you haven't had this experience (yet?), you'll have to trust me when I say that it's no fun to hit an older, out-of-date system. It's especially painful because the most common reason you change things is that the old way sucked. So you turn up at a critical moment to fix something in a poorly documented and sub-optimally configured system.
Fun, fun, funnity fun.
There are only two ways to avoid this problem: never change the way you do anything, or else plan your changes so that retrofitting is part of the plan every time you're modifying an existing service.
Assuming that never changing anything isn't an option, you need to plan to retrofit. When you're putting together the design for your system change (you do design your changes, right?) you need to audit what's out there, determine what's needed to bring it all up to date, and put together a plan to make that happen. It's like documentation and testing -- if you don't make an explicit effort to make it happen, it'll never get done.
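The audit step can be surprisingly mechanical. As a rough sketch only (the host names, config contents and the idea of fingerprinting by hash are all my own illustrative assumptions, not a description of any particular tool), you could group machines by what their config actually says and see how many variants are out there:

```python
import hashlib
from collections import defaultdict


def fingerprint(config_text: str) -> str:
    """Hash a config after stripping comments and blank lines."""
    lines = []
    for line in config_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            lines.append(line)
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()[:12]


def group_by_variant(configs: dict[str, str]) -> dict[str, list[str]]:
    """Map each distinct config fingerprint to the hosts that carry it."""
    variants = defaultdict(list)
    for host, text in sorted(configs.items()):
        variants[fingerprint(text)].append(host)
    return dict(variants)


# Hypothetical data: in practice the text would be collected from each host.
configs = {
    "mail1": "# new style\nbackup_tool = rsync\nschedule = nightly\n",
    "mail2": "backup_tool = rsync\nschedule = nightly\n",
    "web1": "backup_tool = tar\nschedule = weekly\n",
}

variants = group_by_variant(configs)
for digest, hosts in variants.items():
    print(digest, hosts)
```

More than one fingerprint means more than one way of doing things, and the host lists tell you where the retrofit work is.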
That sounds like a lot of work, and it is. Luckily, if you're already working in a well-maintained infrastructure you should only really have one existing config to upgrade, which shouldn't be too painful. However, if you've got a lot of machines to change there's no way around the fact that they've all got to be changed, and that can take time.
If you accept that you don't want configuration skew, you can start to think about how to engineer your systems and processes to make it less of a hassle to keep everything up to date. Putting some extra thought into how you set things up in the first place so that it's easier to update is at the heart of this. Automating deployment is an easy way to take some of the pain away from rolling out new configs, but there's also plenty of smaller decisions that help with updating configs: putting everything in revision control, making extensive use of templating and pulling out common config fragments, and making sure that you've got good notes and records of what has been done where.
The more you automate your deployment and maintenance, the less pain you feel when you need to retrofit. With something like Puppet, you can eliminate a lot of the pain because you're describing what you want done for all the relevant machines, and then the machines themselves work out what needs to be done to bring themselves up to date.
With an advanced automation system in place, you can say "I want all my mail servers to have setting X turned on" and off they go and set the option. Or you might want some of your mail servers to have a different setting, chosen based on an existing config setting or the class of machine.
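To make the "describe what you want, let the machines work out the difference" idea concrete, here is a toy sketch in Python. It is not Puppet's language or API; the class names, settings and machines are all made up for illustration:

```python
def desired_settings(machine_classes, class_settings):
    """Merge settings for every class a machine belongs to, in order."""
    merged = {}
    for cls in machine_classes:
        merged.update(class_settings.get(cls, {}))
    return merged


def plan_changes(current, desired):
    """Return only the settings that differ from the desired state."""
    return {k: v for k, v in desired.items() if current.get(k) != v}


# Hypothetical classes and settings, purely for illustration.
class_settings = {
    "mailserver": {"setting_x": "on", "max_queue": "500"},
    "mail-relay": {"max_queue": "2000"},
}
machines = {
    "mx1": {
        "classes": ["mailserver"],
        "current": {"setting_x": "off", "max_queue": "500"},
    },
    "relay1": {
        "classes": ["mailserver", "mail-relay"],
        "current": {"setting_x": "on", "max_queue": "500"},
    },
}

plans = {
    name: plan_changes(m["current"], desired_settings(m["classes"], class_settings))
    for name, m in machines.items()
}
for name, changes in plans.items():
    print(name, changes)
```

Each machine only gets told about the settings it is missing, which is the property that makes retrofitting cheap: change the description once, and the plan for every machine falls out of it.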
Of course, you might not already have that config setting or class defined, so you'll need to add it. Don't forget to retrofit that new config setting or class to your existing machines. That might involve finding the config for all the existing machines and modifying them.
Yep, no matter how much automation you've got you can't get away from the need to retrofit, but given the choice between editing a bunch of config files on my local machine (with all the power of my text editor to back me up) or logging into dozens of machines all over the globe to edit their configs and restart services (with all the risks of making a mistake) I know which one I'd choose...
posted at: 19:18 | category: /general | permalink
That is all.
posted at: 15:29 | category: /general | permalink
For the past couple of days, one of my e-mail addresses has been some spammer's choice for forged From: lines in their spew. So, as invalid addresses aren't high on a spammer's list of priorities, I get all the bounces. The fact that MTAs, in this day and age, don't have SMTP session recipient validation is unpleasant. However, with ISPs requiring all outgoing e-mail to go through their own servers, I can kinda see where that can break down. I still don't like it, but I'll live with it.
In amongst all the bounces, though, there's a lot of other, really obnoxious, crap. So far, I've had five "please click this link / reply to this e-mail so it'll go through" (AKA "please filter my spam for me")[1]. I've also had a large number of e-mails saying that my e-mail was blocked or unwanted or whatever, from spam filtering programs themselves.
What I haven't got any of, as far as I can determine, is any e-mail from enraged recipients saying "stop sending me this crap!" or anything of that nature.
The only conclusion I can reasonably draw from this data is that users know that source addresses are forged and there's no point replying to them, but the people whose job it is to write, maintain, and run anti-spam software don't. That's downright embarrassing. Not a single user was dumb enough to assume that I really sent the e-mail, but IT "professionals" who deal with spam for a living are.
If you are in any way involved in the production, sale, or use of an anti-spam product that hasn't realised that the from addresses of spams are universally forged, please shoot yourself in the head. Really. I'm sick to death of people who should know better doing the most stunningly stupid things regardless.
If you don't know that your software is spamming the rest of the world, then you're still on the hook. What other dumb shit is your system doing that you know nothing about? On the other hand, if you do know that your spam filter is contributing to the noise, you're even worse -- no spam has a real source address. If your software or system spews crap because some clueless manager told you to do it, then you need to grow some courage and ponder on the words of Napoleon:
A commander-in-chief cannot take as an excuse for his mistakes in warfare an order given by his sovereign or his minister, when the person giving the order is absent from the field of operations and is imperfectly aware or wholly unaware of the latest state of affairs. It follows that any commander-in-chief who undertakes to carry out a plan which he considers defective is at fault; he must put forward his reasons, insist on the plan being changed, and finally tender his resignation rather than be the instrument of his army's downfall.
In other words, if you did it and you know you shouldn't have, it's still your fault, regardless of why you did it. Take some responsibility for your actions, for fuck's sake.
[1] Every single one of which I was more than happy to confirm -- if you want other people to do a job for you, you have to deal with the fact that some of them might not do it in quite the way you expect. I encourage anyone else who thinks that an anti-spam system that requires the rest of the world to filter your inbox is stupid (even disregarding the likely problems of infinite loops if everyone had a challenge-response inbox) to do the same.
posted at: 14:20 | category: /general | permalink
I really don't think I have anything to add to Joel Spolsky's latest essay, on the horror of architecture astronauts. My favourite quote:
It sort of bothers me, intellectually, that there are these people running around acting like they're building the next great thing who keep serving us the same exact TV dinner that I didn't want on Sunday night, and I didn't want it when you tried to serve it again Monday night, and you crunched it up and mixed in some cheese and I didn't eat that Tuesday night, and here it is Wednesday and you've rebuilt the whole goddamn TV dinner industry from the ground up and you're giving me 1955 salisbury steak that I just DON'T WANT. What is it going to take for you to get the message that customers don't want the things that architecture astronauts just love to build.
posted at: 19:24 | category: /general | permalink
It's stories like the following that make you realise just how puny your talents are...
(Reposted from a place not to be named, by the legendary Al Viro)
Give them what I got yesterday. As in, box that got
a) Linux kernel running.
b) init and bash running.
c) serial and floppy - built as modules. And not loaded.
d) no sash.
e) no ethernet.
f) bloody large number-crunching that Should Not Be Aborted(tm).
g) libc and ld-linux.so - unlinked (self-LART by owner).
Now, I could tell the guy to piss off, but... WTF? He had decent beer and
was properly scared. Oh, well... So we have no exec() for anything
dynamically linked. And we have no chance to access any external stuff -
insmod is linked dynamically, so no insmod floppy for you. Shutting the
system down was not an option due to (f) (aside of dealing with fsck later -
umount(8) is dynamically linked too). Now, I knew that both
/lib/libc.so-2.1.2 and
/lib/ld-linux.so-2.1.2 were still alive - mmaped by init, for one. And
/proc/1/maps would even contain their inumbers. So the plan of attack was
to create a file in root and then cannibalize the entry (change inumber in
the directory). Alas - not enough. In-core inode got zero i_nlink and
if I would just create a link by hands it would not become positive. I.e.
still remove-upon-close. But. But if we will manage to call link() on the
hand-made link we will get i_nlink raised to 1 - iget() will find the same
in-core inode, so we are OK. We'll have to revert the phony link to avoid
PO'd fsck, but that's not a problem...
So there we go: assuming that we got static ln
echo >foo
ln foo bar
flip inumber in foo entry to point to libc
ln foo /lib/libc.so-2.1.2
flip inumber.......................... bar
repeat for ld-linux.so
rm foo bar
begin recovering other damage (self-LART was a bit larger).
But... we don't have this flip stuff and we don't have (aaarrgh) static ln.
Oh, shit... OK, but all we really need is a couple of syscalls, right?
There we go:
eax=__NR_link;
ebx=s1;
ecx=s2;
int 0x80;
eax=__NR_exit;
ebx=0;
int 0x80;
s1: "foo"
s2: "bar"
The next step was getting the syscall numbers. grep? We don't need no
stinkin' grep.
# while read i; do case $i in *__NR_link*) echo $i;; esac; done </usr/include/asm/unistd.h
#define __NR_link 9
# while read i; do case $i in *__NR_exit*) echo $i;; esac; done </usr/include/asm/unistd.h
#define __NR_exit 1
#define __NR__exit __NR_exit
Now, scratching the head and recalling intel code...
start: b8 09 00 00 00
bb (address of s1)
b9 (address of s2)
cd 80
b8 01 00 00 00
bb 00 00 00 00
cd 80
s1: 66 6f 6f 00
s2: 62 61 72 00
Fine, but... a.out support is compiled... you guessed it, as module and not
currently loaded. So we are in for crufting up an ELF binary. OK, we don't
actually need 100%-correct ELF, just something that will pass for the
exec(). Now, we don't have shell scripts, but . will work. So the next
step was rolling more(1) in shell and reading through the relevant code
(surprisingly small - binfmt_elf.c and two headers). After much swearing
the following abortion was created:
7f 45 4c 46 00 00 00 00 00 00 00 00 00 00 00 00
02 00 03 00 01 00 00 00 start______ 34 00 00 00
00 00 00 00 00 00 00 00 34 00 20 00 01 00 28 00
00 00 00 00 01 00 00 00 00 00 00 00 base_______
base_______ size_______ size_______ 05 00 00 00
00 10 00 00
start:
b8 09 00 00 00 bb start+1d___ b9 start+21___ cd
80 b8 01 00 00 00 bb 00 00 00 00 cd 80 66 6f 6f
00 62 61 72 00
OK, set base to something page-aligned, start=base+54, size=79. There we go:
7f 45 4c 46 00 00 00 00 00 00 00 00 00 00 00 00
02 00 03 00 01 00 00 00 54 00 00 80 34 00 00 00
00 00 00 00 00 00 00 00 34 00 20 00 01 00 28 00
00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 80
00 00 00 80 79 00 00 00 79 00 00 00 05 00 00 00
00 10 00 00 b8 09 00 00 00 bb 71 00 00 80 b9 75
00 00 80 cd 80 b8 01 00 00 00 bb 00 00 00 00 cd
80 66 6f 6f 00 62 61 72 00
well, while read l; do for i in `echo $l`; do echo -ne "\\$i"; done; done
made for oct2bin, overwriting /usr/bin/emacs with the output of that gave
us static equivalent of ln foo bar. And it worked. The rest was essentially
the same - the only tricky part was to find the location of directory entry.
Which was done with (lseek, read byte, exit(said_byte)) and shell wrapper
around that. So we had a way to read a block and dump it on the console.
The rest was obvious - start from relevant block in inode table and walk
through the references... Once we got that, it was an easy ride - trick
with inumber flipping brought libc and dynamic linker back and after that
we had 99% of system back into the working state. Amazing how little you
actually need to bring the system back to life...
Hot damn that's some deep hackery...
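If you want to convince yourself that the numbers in that dump hang together, the layout can be rebuilt and checked against it. This is only a sketch using Python's struct module, with the field values read straight off the hex above; the field names in the comments are the standard 32-bit ELF ones, and the base address 0x80000000 is simply what the final dump uses:

```python
import struct

BASE = 0x80000000          # page-aligned load address seen in the final dump
EHDR_SIZE = 0x34           # size of a 32-bit ELF header
PHDR_SIZE = 0x20           # size of one 32-bit program header
START = BASE + EHDR_SIZE + PHDR_SIZE   # code begins right after both headers

CODE_LEN = 0x1d            # bytes of machine code before the strings
s1 = START + CODE_LEN      # address of "foo\0"
s2 = s1 + 4                # address of "bar\0"

code = (
    b"\xb8" + struct.pack("<I", 9)      # mov eax, 9   (__NR_link)
    + b"\xbb" + struct.pack("<I", s1)   # mov ebx, s1
    + b"\xb9" + struct.pack("<I", s2)   # mov ecx, s2
    + b"\xcd\x80"                       # int 0x80
    + b"\xb8" + struct.pack("<I", 1)    # mov eax, 1   (__NR_exit)
    + b"\xbb" + struct.pack("<I", 0)    # mov ebx, 0
    + b"\xcd\x80"                       # int 0x80
)
strings = b"foo\x00bar\x00"
size = EHDR_SIZE + PHDR_SIZE + len(code) + len(strings)

ehdr = b"\x7fELF" + bytes(12) + struct.pack(
    "<HHIIIIIHHHHHH",
    2,          # e_type: executable
    3,          # e_machine: i386
    1,          # e_version
    START,      # e_entry
    EHDR_SIZE,  # e_phoff: program header follows the ELF header
    0,          # e_shoff: no section headers
    0,          # e_flags
    EHDR_SIZE,  # e_ehsize
    PHDR_SIZE,  # e_phentsize
    1,          # e_phnum
    0x28,       # e_shentsize
    0,          # e_shnum
    0,          # e_shstrndx
)
phdr = struct.pack(
    "<IIIIIIII",
    1,          # p_type: PT_LOAD
    0,          # p_offset: map the file from its first byte
    BASE,       # p_vaddr
    BASE,       # p_paddr
    size,       # p_filesz
    size,       # p_memsz
    5,          # p_flags: read + execute
    0x1000,     # p_align
)
image = ehdr + phdr + code + strings

# The final dump from the story, copied verbatim.
STORY_DUMP = """
7f 45 4c 46 00 00 00 00 00 00 00 00 00 00 00 00
02 00 03 00 01 00 00 00 54 00 00 80 34 00 00 00
00 00 00 00 00 00 00 00 34 00 20 00 01 00 28 00
00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 80
00 00 00 80 79 00 00 00 79 00 00 00 05 00 00 00
00 10 00 00 b8 09 00 00 00 bb 71 00 00 80 b9 75
00 00 80 cd 80 b8 01 00 00 00 bb 00 00 00 00 cd
80 66 6f 6f 00 62 61 72 00
"""
matches = image == bytes.fromhex(STORY_DUMP)
print(hex(START), hex(size), matches)
```

This only checks the arithmetic (start=base+54, size=79, and the two string addresses). Whether a modern kernel would still exec a header this sparse is a separate question that I haven't tested.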
posted at: 13:00 | category: /general | permalink
Blahsagne, n.: A tasteless pile of pasta sheets and nominally tomato-based sauce, usually found in the cold store section of supermarkets.
posted at: 20:37 | category: /general | permalink
If you insist on having several levels of pulldown menu in your application, would you mind not having each one make an AJAX call back to the server to get the contents of the menu? The "click... wait... click... wait..." to navigate where I need to go gets particularly tedious when I have to do it in excess of 100 times today.
Stabbity Stab,
Matt
posted at: 19:49 | category: /general | permalink
Ever since I started University, I've been a member of the Institute of Electrical and Electronics Engineers. Although I don't really do what you would call "Engineering" any more, I like to read the IEEE Spectrum magazine and some of the Computer Society journals are quite relevant to what I do.
However, this year the IEEE have produced what is, without a doubt, one of the most shithouse online ordering systems I have ever had the misfortune of trying to use. To start with, if you happen to go to the account login screen with cookies disabled, you get a page that is utterly blank. I've seen a lot of ways to fail to handle cookies, but I think the completely blank page is a first for me.
Once you work out what's going on and log in, you're presented with a long and painful journey through a series of "checkout" pages, mostly alike. I think I got presented with my list of journals about four times. After navigating that pest, it comes up with this little gem when you start the credit card processing step:
Then, every time you input practically anything into the form, it does a Javascript submit and does things to the form. For example, when you enter your credit card number, you get a form back that has blanked out all but the last four digits of the card number. OK, a little dumb, but not spectacularly so. Although, when you're a quick typist and have filled in the next three fields while the web browser was playing footsie with the slow-as-molasses web server, having that data wiped out when the page refreshes is a bit of a pest.
Far more annoying is when you start entering your address "for verification purposes". Here, you get a form resubmit every time, presumably to update the form based on (for example) your selected country, but not a lot changes. Even though I picked Australia, I still got a pull-down with all of the US states in it. You could short-circuit all this with the "choose existing address" option, except that (despite me being logged in) my existing addresses aren't in a select box, but in a separate pop-up window (which didn't actually display anything after all).
In short, the IEEE's online renewal system is a disgrace, and is an embarrassment to what is purportedly a technical organisation.
posted at: 16:37 | category: /general | permalink
You know you've got a bad wiki when you start pining for "the good old days" of MediaWiki. Yes, this wiki I need to use is that bad.
posted at: 09:27 | category: /general | permalink
As Internet-connected people, we tend to interact with a lot of people who aren't necessarily geographically co-located. Hence today's cool little program I found: gworldclock. You pick an arbitrary number of timezones to have in your list, and the current time in all of them is displayed in a neat list, quietly ticking away regularly. Small? Yes. Trivial? Getting close to it. Really Useful? You betcha. It also has a very tidy 'rendezvous' mode, where you choose a time, a timezone (from your list of selected timezones) and it shows the corresponding time in all of your other chosen timezones. Again, something that 20 minutes of coding will probably produce, but the whole point of this open source thing is not having to re-invent the wheel...
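For the curious, the rendezvous idea really is about that small. Here is a minimal sketch using Python's standard zoneinfo module; it is not how gworldclock does it, the function name and example zones are my own, and it assumes the system timezone database (or the tzdata package) is available:

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def rendezvous(when: str, source_zone: str, other_zones: list[str]) -> dict[str, str]:
    """Show a wall-clock time from one zone as local time in other zones."""
    moment = datetime.strptime(when, "%Y-%m-%d %H:%M").replace(
        tzinfo=ZoneInfo(source_zone)
    )
    return {
        zone: moment.astimezone(ZoneInfo(zone)).strftime("%Y-%m-%d %H:%M")
        for zone in other_zones
    }


# Example zones chosen arbitrarily for illustration.
times = rendezvous(
    "2024-01-15 09:00",
    "Australia/Sydney",
    ["Europe/London", "America/New_York"],
)
for zone, local in times.items():
    print(f"{zone:20} {local}")
```

The timezone database does all the hard work (daylight saving included), which is rather the point about not re-inventing wheels.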
The 'g' does indicate a GTK application, but there is a kworldclock for people who wouldn't be seen dead with a GTK application on their desktop, and a console-only version, tzwatch, for anyone who lives the text-mode life.
posted at: 11:49 | category: /general | permalink