Jeremy Moskowitz recently posted a great article entitled Backup Tips for the 21st Century: Backup procedures so easy, your Mom could (and should) do it. This is not directed at IT managers or anyone else who has to manage a business network, although there are certainly some common themes, which we’ll talk about a bit later. Rather, the article is targeted at the average home user – you know, those people who are always asking you to help them with some kind of computer problem, because you “know about computers.”

I’d strongly recommend that you click over and read his entire article, and share it with as many people as possible, because he goes into detail on why you should be doing each of these things. But just to give you a little taste of it, here are the seven things:

  1. Get an online backup service (e.g., Carbonite.com, Mozy.com, etc.)
  2. Get a full-disk backup program
  3. Back up to an external USB drive (in fact, get two or three – they’re cheap)
  4. Don’t keep all your backups in your house
  5. Rotate between at least two, possibly three USB drives
  6. Keep copies of your original disks, downloadables, keycodes, and drivers
  7. Test your restore procedure

Although he feels strongly that you should do all seven in order to be absolutely safe, he also points out that just doing one of them will make you better off than most people – who don’t do anything at all! (And if you only do one, he suggests #3.)

Why should people do these things? Because, in Jeremy’s words, “DISK DRIVES ALWAYS FAIL. ALWAYS. It’s a guarantee. Even the newest ones with no moving parts. They all fail. Eventually.” And he’s right. The only question is when. I’ve seen drives fail within days of being installed (not many, but some), and drives last for years. But eventually, they will wear out. When they do, the data on them is toast, so you’d better either have a backup or have deep pockets to pay someone who specializes in forensic data recovery, and who may or may not be able to recover your most precious data from the dead drive no matter how much you’re willing to pay.

So, how does this translate to sound business practice? Allow me to paraphrase his seven points, and combine a couple of them:

  • Make sure you’re getting a copy of your data out of the building. Use an on-line service, stream data to a repository at a branch office, or just take a copy home every Friday. But do something to get a copy out of the building.
  • Your backup strategy should encompass both machine images and file/folder based backups. If you lose an entire system, it’s a lot faster to restore from an image than to reinstall the OS from scratch and then restore the data files. On the other hand, if all you need is a single file, or a single email message or mailbox, you don’t want to have to restore an entire image just to get that one thing you need.
  • What he said about disks failing goes double (at least) for tapes. Tapes are far less reliable than hard disks. Their capacity is limited. They wear out quickly. The drives get dirty and are subject to a variety of mechanical problems. Unless you’ve either got an expensive autoloader or a night operator to swap tapes in the middle of the night, if your tape fills up you either cancel the job when you come in the next morning, or you finish the backup during working hours and live with the performance hit of doing that while users are trying to work. That’s why we believe so strongly in disk-to-disk backups.
  • Keep copies of your original disks, downloadables, keycodes, and drivers. (Not much I can add to that point.)
  • Test your restore procedure. (Not much I can add to that either.) If you don’t ever do a test restore, you only think you’re getting good backups. And if you’re not, you won’t know about it until you have a catastrophic failure and find out that your data is gone forever.
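
A test restore is only convincing if you actually compare what came back with what went in. Here is a minimal Python sketch of one way to do that, assuming you have restored a backup into a scratch folder alongside the live data. The function names and the folder layout are ours, purely for illustration; they are not part of any particular backup product.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(original_dir, restored_dir) -> list[str]:
    """List every file that is missing from, or different in, the restored copy."""
    original_dir = Path(original_dir)
    restored_dir = Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append(f"missing: {relative}")
        elif file_digest(source) != file_digest(restored):
            problems.append(f"differs: {relative}")
    return problems
```

An empty list means every file in the original folder came back byte-for-byte identical. Anything else is a problem you would much rather find today than on the day the server dies.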

That’s all for today – you go read Jeremy’s post in full, I’m going to swing by the local office superstore and pick up a couple more USB hard drives…

Virtualization can mean different things depending on who you ask, so we are going to take a broad look at what virtualization is, the different forms it comes in, and why it is so popular.

This is going to be pretty basic stuff, so if you are looking for more advanced material, I promise we will cover it in future posts.

Virtualization has been getting a lot of buzz the last few years as it moved from being “bleeding edge” technology to becoming an industry standard. You may have even heard that there are lots of benefits to virtualizing your datacenter…but you may not be sure whether it’s for you, how it works, or even what it means.

There are several kinds of virtualization, including server virtualization, storage virtualization, application virtualization, network virtualization, and desktop virtualization. But when most folks talk about virtualization, they’re referring to server virtualization, so that’s what we will cover today.

So, what is server virtualization? Simply put, server virtualization is the technology that allows multiple (virtual) servers to reside on a single piece of (physical) hardware and share the resources of the physical server – while still maintaining separate operating environments, so that a problem that crops up in one virtual server won’t affect the operation of others that may be running on the same physical “host.” To help explain what this means, I’m going to use the house and condo analogy.

Let’s say you’re a land developer and you build residential property. You cut your land into smaller plots and build one house per plot. As part of the land development, you need to bring in all the utilities from the main street to each and every plot. All of this development costs money. To make matters worse, you know that your city’s population is growing, you’re running out of land to build on, and you also need to control the spiraling costs of building materials. How do you cut costs and provide more homes for a growing population on a limited amount of land?

Figure 1 - Typical cul-de-sac USA

Perhaps instead of building single-family homes and having one resident per plot you start building condominiums that hold several residents each. Now the utilities that are brought in to the condo complex are shared by all the residents and yet no one ever sees the other residents’ bills. You’re making more efficient use of the land you have and not wasting time and money bringing in utilities to each individual house. Plus one yard is easier to take care of than ten yards.

Figure 2 - 1 & 2bd Condos Available Now!!

So how does this relate to server virtualization?

Each plot of land is a physical server, the structure you build on that plot is a server “workload” (i.e., Exchange, SQL, file server, print server, etc.), and the city is your data center. The utilities are things like power, cooling, and network connectivity. When there is only one workload per physical server, a lot of space and resources get wasted. It’s common to see only 10-15% (if that) processor utilization on physical servers which run only one operating system and one application.

With server virtualization we can now create several “virtual” servers on one physical piece of hardware – think of the hardware as little “server condos” if you like. Just as you can have one-bedroom, two-bedroom, and three-bedroom units in a single building, you can allocate differing amounts of processing and memory resources to the virtual servers depending on the requirements of each individual workload. Each virtual server can now share the physical resources of the host machine with the other virtual servers and never know that they are sharing. In fact, each virtual server “thinks” it’s running on its own dedicated hardware platform. By doing this you can now utilize 80-90% of the processing power of the hardware you own, and cut down on the total amount of power, cooling, and floor space you need in your data center.
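
To put rough numbers on the jump from 10-15% utilization to 80-90%, here is a back-of-the-envelope sketch in Python. It assumes the host is the same class of machine as the single-purpose servers and that processor load simply adds up, which ignores memory, disk, and the overhead of the virtualization layer itself. The function name is ours.

```python
def workloads_per_host(per_workload_pct: int, target_pct: int) -> int:
    """How many workloads fit on one host before it passes the target utilization."""
    return target_pct // per_workload_pct


for per_workload in (10, 15):
    for target in (80, 90):
        count = workloads_per_host(per_workload, target)
        print(f"{per_workload}% each, {target}% target: {count} workloads per host")
```

Under those assumptions, a single host could carry somewhere between five and nine of these lightly loaded workloads.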

For example (just pulling numbers out of the air), let’s say that you’ve been paying an average of $5K each for servers that would handle a single workload. If you need four of them, that’s $20K in hardware cost. But if you can buy one server for $8 – 10K to virtualize these 4 machines, that’s a significant reduction in hardware cost. And with fewer machines to plug in and keep cool, your savings can be up to 40% on power consumption alone. (Did you know that we’ve now reached the point where, over the service life of a typical new server, it’s going to cost you more to keep it cool than it cost you to buy it?)
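
If you want to plug in your own numbers, the hardware arithmetic above is easy to sketch in Python. The figures are the same out-of-the-air ones used in the example, and the function name is ours.

```python
def hardware_savings(server_count: int, cost_each: int, host_cost: int) -> tuple[int, int]:
    """Return (dollars saved, percent saved) when one host replaces several servers."""
    before = server_count * cost_each
    saved = before - host_cost
    return saved, saved * 100 // before


for host_cost in (8_000, 10_000):
    saved, percent = hardware_savings(4, 5_000, host_cost)
    print(f"${host_cost:,} host: save ${saved:,} ({percent}% of hardware cost)")
```

With four $5K servers replaced by one $8-10K host, the hardware bill drops by $10-12K, or 50-60%, before you count any power or cooling savings.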

Since the virtual servers are all located on one physical box you now have fewer pieces of hardware to maintain – allowing the IT staff to use their time more efficiently. You’ll save space in your data center. You’ll also cut down on the amount of waste (some of it hazardous) that must be recycled or disposed of when your hardware finally reaches its end-of-life.

You’ve also cut down the time needed to bring a new server on line. In the past you would have had to acquire the hardware, assemble it, rack it, connect it to the network, install and patch the OS, install and configure the application, test it all, and finally put it into service. Now that the servers are virtual they can be created, configured, and put into production in a few hours as opposed to the weeks it used to take. In some cases, by using templates for commonly-needed workloads, it can take only minutes. This makes for a much more flexible and scalable environment.

So server virtualization can:

  • Cut hardware costs
  • Cut energy costs (for both power and cooling)
  • Cut system maintenance time and costs
  • Create a very scalable and flexible data center
  • Save space
  • Create a more environmentally friendly data center (a.k.a. “green computing”)

These are the main reasons that server virtualization has become an industry standard. According to folks like Gartner, we’ve now reached the point where the majority of new servers placed into service are being virtualized, and the majority of enterprises have made it a standard practice to virtualize all new servers unless there is a compelling reason why a server can’t or shouldn’t be virtualized. Virtualization also makes it easier to implement things like high availability, disaster recovery, and business continuity, but that’s a subject for a future post.

Part 1 and Part 2 of this series covered the basic cryptographic concepts behind SSL certificates, and looked at how an SSL certificate is constructed and how it is validated. This installment will discuss what different kinds of certificates exist, some things to watch out for, and two big takeaways that will save you time, money, and aggravation.

Traditionally, an SSL server certificate, such as the Wells Fargo Bank certificate that we discussed in Part 2, was issued for the Fully Qualified Domain Name (“FQDN”) of the server the certificate is intended to secure. The certificate we discussed was specifically issued to “www.wellsfargo.com.” This is called the “Common Name” (abbreviated as “CN”), and is specified in the “Subject” field of the certificate:

The Common Name Field


That means that if there were other DNS entries, such as “remote.wellsfargo.com” or “email.wellsfargo.com” that happened to resolve to the IP address of the same physical server, and you pointed your browser at one of those, you would get a certificate error – because the certificate was issued to “www.wellsfargo.com,” not to one of those other entries, and the browser won’t be happy unless the host name in the address bar exactly matches the Common Name listed in the Subject field.

In recent years, a couple of new kinds of certificates have been introduced. One is the Multiple Domain, or “UCC” (Unified Communications Certificate) certificate. [Note: Yes, I realize that saying “UCC certificate” is inherently redundant – like saying “PIN number.” If, by chance, my former English professor is reading this, I apologize. But I’m going to do it anyway.] A UCC certificate contains an extra field called the “Subject Alternative Names” field, which can be used to list multiple subdomains that the certificate can be used to secure. For example, a UCC certificate could be used to secure “remote.mooselogic.com,” “email.mooselogic.com,” “extranet.mooselogic.com,” and so forth, provided that all of those subdomains are explicitly listed in the Subject Alternative Names field. That means that you must specify what subdomains you want listed when you purchase the certificate, and if you want to add or delete one, the certificate must be regenerated by the issuer (which will generally cost you more money).

In addition to the Subject Alternative Names field, a UCC certificate still has a “Common Name” listed in the “Subject” field. However, according to RFC 2818 (the standard that describes HTTP over SSL/TLS), if the Subject Alternative Names field is present, the client browser is supposed to ignore the contents of the Common Name field (although not all of them do). Therefore, if the common name is “www.mooselogic.com,” but that common name is not repeated as one of the Subject Alternative Names, a browser that strictly adhered to the standard would end up with a certificate error if it tried to connect to “www.mooselogic.com.” This interaction between Common Name and Subject Alternative Names has some implications for mobile devices that we’ll come back to in a bit.

The other new kind of certificate is the “Wildcard” certificate. A Wildcard certificate could be issued for, say, “*.mooselogic.com,” and used to secure any and all first level subdomains. (E.g., email.mooselogic.com is a first level subdomain; email.seattle.mooselogic.com is not a first level subdomain, and could not be secured with a Wildcard certificate.) A Wildcard certificate does not contain a Subject Alternative Names field – instead, the Wildcard (“*.mooselogic.com”) is actually listed as the Common Name in the Subject field.

If you are running a browser that was released anytime since 2003, it should support Subject Alternative Names, and probably Wildcard certificates as well. In that case, your browser will be happy if one of the following three conditions is true:

  1. The host name in the address bar of the browser exactly matches the Common Name of the certificate. (Unless the cert is a UCC cert, in which case the browser is supposed to ignore the Common Name.)
  2. The Common Name is a Wildcard, and the host name in the address bar matches the Wildcard.
  3. The cert is a UCC cert, and the host name in the address bar exactly matches one of the names listed in the Subject Alternative Names field.
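
For the programmers in the audience, those three rules can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the code any real browser uses, and the function name `certificate_matches` is our own invention.

```python
def certificate_matches(host: str, common_name: str, alt_names=()) -> bool:
    """Apply the three matching rules to a host name typed into the address bar."""
    host = host.lower()
    common_name = common_name.lower()

    # Rule 3: a UCC cert is judged only by its Subject Alternative Names.
    if alt_names:
        return host in [name.lower() for name in alt_names]

    # Rule 2: a Wildcard stands in for exactly one leading label.
    if common_name.startswith("*."):
        first_label, _, rest = host.partition(".")
        return bool(first_label) and rest == common_name[2:]

    # Rule 1: otherwise the host must match the Common Name exactly.
    return host == common_name


print(certificate_matches("email.mooselogic.com", "*.mooselogic.com"))          # True
print(certificate_matches("email.seattle.mooselogic.com", "*.mooselogic.com"))  # False
```

Note that a strict implementation like this one rejects a UCC certificate whose Common Name is not repeated in the Subject Alternative Names list.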

However, if you are using a browser on a mobile device of some kind, it’s a different story. Windows Mobile 6.x devices support both Subject Alternative Names and Wildcards. Windows Mobile 5 devices support Subject Alternative Names, but do not support Wildcards. If you’re not running a Windows Mobile 5 or 6 device, you’re going to have to check with the vendor of your mobile device. Some support both Subject Alternative Names and Wildcards, some only support one of them, some support neither.

So what? Well, if you’re trying to use a mobile device to synchronize e-mail with an Exchange Server, it’s usually done by pointing the device at the same URL that you’re using for Outlook Web Access (“OWA”). If you’re using a Wildcard certificate to secure your OWA site, and your mobile device doesn’t support Wildcards, you’re out of luck – it’s simply not going to work. However, if you’re using a UCC certificate to secure your OWA site, and the URL of the OWA site is also the Common Name of that UCC certificate, your mobile device will be happy even if it doesn’t support UCC certificates…because it will simply look in the Common Name field and find a match.

So here’s big takeaway #1: If you’re going to use a UCC certificate to secure multiple URLs, and one of those URLs happens to be the URL you’re going to use to synch email to mobile devices, make sure that URL is the Common Name of the certificate in addition to being listed as one of the Subject Alternative Names.

Another common “gotcha” involving SSL and mobile devices involves intermediate certificates. Remember that “chain of trust” discussion from the second post in this series? It is increasingly common to find that the certificate you have purchased to secure your Web site is not chained directly to the CA’s trusted root. Instead, there is at least one intermediate certificate in the chain between the trusted root and the certificate you purchased. This isn’t a problem for “big Windows,” because the browser is smart enough to sense that the certificate the server is presenting is not chained directly to the trusted root that it knows about, and to request the intermediate certificate(s) so it can validate the complete chain of trust. Mobile devices, including Windows Mobile devices, are not that smart.

Mobile devices depend on the server to present the entire certificate chain, including any intermediate certificates, at the time of connection. And the server won’t do that unless all of the intermediate certificates are present in that server’s own local computer certificate store. Installing the certificates into IIS for use in securing the OWA Web site does not automatically put them in the local computer certificate store – you must explicitly import them.

But why purchase a commercial certificate at all? Can’t you be your own Certificate Authority if you’re running a Windows Active Directory Domain? Yes, you can…if you don’t care about supporting connections from any PCs other than ones that have been joined to the domain, and you don’t care about supporting mobile devices. For example, when you set up a Windows Small Business Server, the wizard that configures that server for OWA automatically secures it with a self-issued certificate. That’s not a problem for any PC or laptop that has been joined to your SBS domain, because the very act of joining a computer to a domain inserts the domain’s own self-issued root certificate into the computer’s trusted root certificates store. But if you then try to connect from your home PC, or your mother-in-law’s PC, or any other PC that isn’t a member of your domain, you get a certificate error. At least with a PC, you have the opportunity to override the error and connect to the Web site anyway…but a mobile device will simply fail to connect, while typically giving you very little information about what the problem is.

You may or may not be able to manually import a certificate into the trusted root certificate store of your mobile device. Some mobile operators give their subscribers that level of “management access” to their mobile devices and some don’t. Some mobile operators provide special certificate installation utilities for their smart phones, some don’t. Sometimes there are workarounds, sometimes there aren’t. To our knowledge, there is no definitive list available of which mobile devices have their certificate stores locked down and which don’t. So the question is: How much is your time worth? The first time you (or we, on your behalf) spend a half day trying to make a mobile device work with an SSL certificate that wasn’t built into the phone, you will have spent more money – not to mention the time and aggravation – than it would have cost to go to a public CA and purchase a certificate that’s already supported.

So big takeaway #2 is: If you’re going to synch e-mail to mobile devices, do yourself a favor and decide in advance what mobile devices you’re going to support, then buy an SSL certificate from a public CA whose trusted root is already supported by those mobile devices. You’ll save money in the long run, and probably keep your blood pressure lower as well.

For more information on certificates and mobile devices, including a list of the trusted root certificates that ship with Windows Mobile 5 and Windows Mobile 6 devices, download the Moose Logic Technical Bulletin entitled Recommended Best Practices for Exchange Synchronization with Mobile Devices.

In Part 1, we discussed basic cryptography, and worked our way up to symmetrical encryption systems such as AES, which accepts key lengths as long as 256 bits. We also discussed why key length was important to a cryptosystem, and alluded to the fact that there are also asymmetrical systems. An asymmetrical system uses a key pair, such that anything that is encrypted with one key can only be decrypted with the other. The math behind such a system is way beyond the scope of this humble blog, so we ask that you simply take our word for it that such systems exist.

In most systems that make use of key pairs, one of the keys is made public, and the other is kept secret. The framework for issuing, distributing, and vouching for those keys is generically called a Public Key Infrastructure, or PKI. Consider the following use cases:

  • If you know my public key, you can use it to encrypt a message that only I can decrypt – assuming, of course, that I’ve kept my private key truly private.
  • I can encrypt something using my private key that can then be decrypted by anyone who has my public key. And since the public key is, well, public, that means pretty much anyone.

The first use case has obvious benefits. But what good is encrypting a message that anyone could theoretically decrypt? Well, if you know my public key – and you have some way of knowing that it’s really mine – then you can be pretty sure that any message that you successfully decrypt with it must have been sent by me (again, assuming that I’ve kept my private key safe). That makes for a pretty good digital signature.
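
If you would like to see both use cases in action without wading through the math, here is a toy sketch in Python. It assumes an RSA-style key pair built from the tiny textbook primes 61 and 53, which is hopelessly insecure and only good for illustration. The variable and function names are ours.

```python
# Toy RSA-style key pair built from the primes 61 and 53 (illustration only).
MODULUS = 61 * 53        # 3233, shared by both keys
PUBLIC_KEY = 17          # the exponent everyone is allowed to know
PRIVATE_KEY = 2753       # the exponent I keep secret (17 * 2753 = 1 mod 3120)


def apply_key(value: int, key: int) -> int:
    """Encrypting and decrypting are the same operation with different keys."""
    return pow(value, key, MODULUS)


message = 65

# Use case 1: you encrypt with my public key, and only my private key reads it.
secret = apply_key(message, PUBLIC_KEY)
print(apply_key(secret, PRIVATE_KEY) == message)   # True

# Use case 2: I encrypt with my private key, and anyone can check it came from me.
signed = apply_key(message, PRIVATE_KEY)
print(apply_key(signed, PUBLIC_KEY) == message)    # True
```

Whatever one key does, only the other key can undo, which is the whole trick behind both secrecy and signatures.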

So, the question becomes this: to use some kind of PKI, how can I securely transmit my public key to the people who might want to communicate with me (or authenticate my communication with them), in such a way that they are confident that it’s really my public key? One way, certainly, to get it to you would be to physically hand it to you on some kind of storage medium – which could be a USB flash drive, a CD, or even a piece of paper. If that’s not feasible, perhaps I could give it to someone you trust, who could then give it to you and vouch for its authenticity. That, by the way, is the concept behind “PGP” (which stands for “Pretty Good Privacy”) – you establish a circle of trust, and when Bob sends you Jane’s public key, it’s up to you to decide how much you really trust Bob.

That might be acceptable for exchanging secure e-mail with your friends, but it’s probably not good enough if we’re talking about securing access to your on-line banking system. So how do you know, when you point your browser at, say, www.wellsfargo.com, that the server you end up talking to, which is asking you for stuff like your social security number and password, is really a server that belongs to Wells Fargo Bank? Enter the SSL Certificate.

The next time you’re on your favorite banking or shopping site, click on the little padlock symbol (exactly where that symbol is will depend on what browser and version you’re running…IE8 displays it right at the end of the address field where the URL is displayed). I’m going to stick with Wells Fargo Bank for now, and break down what you should see if you choose “View Certificate.”

First, you’ll see something like this:

SSL Certificate Screen Capture


It tells you briefly the purpose for which the certificate was issued (to ensure the identity of a remote computer). It tells you that the certificate was issued to www.wellsfargo.com, and that it is valid through June 5, 2010. But how do you know you can trust it? Where did the certificate come from?

Well, if you click on the “Certification Path” tab, it will tell you where it came from:

SSL Chain of Trust


This shows the “chain of trust,” and tells you that the “root” of that chain is the VeriSign Class 3 Public Primary Certificate Authority (“CA”). VeriSign is one of the public CAs that companies like Wells Fargo can go to and purchase certificates for purposes such as this. VeriSign is, in fact, probably the best known (and most expensive). But how do we really know that the certificate is valid? Just because it says it came from VeriSign, how do we know it really did?

Hold that thought, and let’s click on the “Details” tab:

Certificate Details Tab


Now we’re getting into the nitty-gritty of the certificate. Note the two fields called “Signature algorithm” and “Signature hash algorithm.” Remember those – we’ll come back to them later. For now, scroll down on the upper portion of the certificate, and highlight “Public key:”
The Public Key


You can now see the actual public key for this certificate displayed in the lower part of the window. Basically, the server is trying to tell you, “Here’s my public key. Use it to encrypt anything that you want to securely send to me.” But, still, how do we know it’s real? How do we know it hasn’t been tampered with?

Scroll down to the bottom, and you’ll see a reference to something called a “thumbprint:”

The Thumbprint


You will also see what algorithm was used to generate the thumbprint – in this case, it’s an algorithm called “sha1.” You probably don’t know what the “sha1” algorithm is, but your PC does. It’s used to generate a “hash value” on the contents of the certificate (remember, to your PC the entire certificate is just one long binary number). This is a one-way computation – in other words, it is not possible to look at the hash value, even knowing the algorithm, and work backwards to determine the original value that was used to generate the hash.

So the certificate is signed by running a hash algorithm on its contents (the one named in the “Signature hash algorithm” field we noted earlier), and then encrypting the result using the private key of the next-higher certificate in the chain of trust. That encrypted result is transmitted with the certificate as its digital signature. (Strictly speaking, the “thumbprint” that Windows displays is a separate hash that your own PC computes over the whole certificate so that it can be identified easily; it is the signature that actually travels with the certificate. The principle is the same either way.) Your computer then takes the contents of the certificate, runs the same hash algorithm on it, and compares the result with the transmitted signature, which it decrypts using the public key of the next-higher certificate in the chain of trust. If the hash values exactly match, you can be confident that the certificate hasn’t been tampered with – because it would be, for all practical purposes, impossible to tamper with the contents of the certificate without invalidating the signature.
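
The sign-and-check procedure described above can be sketched in Python. This is a toy under loudly stated assumptions: the issuer's key pair is an RSA-style pair built from two well-known Mersenne primes (far too small for real use), the "certificate" is just a string of bytes we made up, and real certificates also pad and encode the hash in ways we skip here. All of the names are ours.

```python
import hashlib

# Toy issuer key pair (illustration only; needs Python 3.8+ for pow(x, -1, m)).
P, Q = 2**61 - 1, 2**89 - 1
MODULUS = P * Q
ISSUER_PUBLIC = 65537
ISSUER_PRIVATE = pow(ISSUER_PUBLIC, -1, (P - 1) * (Q - 1))


def hash_as_number(contents: bytes) -> int:
    """Run sha1 over the contents and squeeze the result into the key's range."""
    return int.from_bytes(hashlib.sha1(contents).digest(), "big") % MODULUS


def sign(contents: bytes) -> int:
    """What the issuer does: hash the contents, then encrypt the hash privately."""
    return pow(hash_as_number(contents), ISSUER_PRIVATE, MODULUS)


def verify(contents: bytes, signature: int) -> bool:
    """What your PC does: decrypt with the public key and compare the hashes."""
    return pow(signature, ISSUER_PUBLIC, MODULUS) == hash_as_number(contents)


certificate = b"CN=www.example.com; valid through June 5, 2010"
signature = sign(certificate)

print(verify(certificate, signature))                              # True
print(verify(certificate.replace(b"example", b"evil"), signature)) # False
```

Change even one character of the contents and the hashes no longer agree, which is exactly what your browser is checking for.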

Now, back to the basic question of trust. Each certificate in the chain of trust is, in this fashion, digitally signed using the private key of the certificate above it in the chain. You can, in fact, click on any certificate in the chain, click the button labeled “View Certificate,” and examine the details of that certificate, just as we examined the details in the example above. Ultimately, we find ourselves at the VeriSign “root” certificate, and the ultimate question becomes how do we really know that the public key presented in that root certificate, and which we are supposed to use to validate the signature of the next lower certificate in the chain, is valid – since we’ve now come to the end of the chain, and have no higher authority to use to validate the signature of the root?

The answer (which may surprise you) is that the manufacturer of your browser software made a deal with VeriSign to build their root certificate into the browser. If you’re running Internet Explorer, go to the “Tools” menu, and choose “Internet Options.” Click on the “Content” tab, and then click the “Certificates” button. Then click the tab that says “Trusted Root Certification Authorities,” and scroll down. Guess what?

Trusted Root Certificates


There it is. Ever notice that from time to time Microsoft’s update service pushes out “root certificate updates?” Now you know what that’s all about – they’re either adding certificates to the trusted root certificate store, or replacing ones that are about to expire.

So, now that we know we can trust the certificate that the Wells Fargo Web server presented to us, what do we do with it? Well, we’ve learned two things from the certificate: First, we know that the server we’re communicating with is indeed what it claims to be – part of the www.wellsfargo.com server farm; second, we know that server’s public key. Since we know its public key, we know that we can send information to it securely. However, we don’t want to use the public/private key pair for our entire session, because it so happens that an asymmetrical encryption scheme requires more processing effort than a symmetrical scheme – it would be preferable to use a symmetrical encryption scheme. But we don’t want to just use the same symmetrical key every time, because one of the basic precepts of cryptography is that the more encrypted data you have to work with, the easier it is to break the encryption. Therefore, your PC will use the server’s public key to securely negotiate a session key that will be used to symmetrically encrypt and decrypt just this banking session. Tomorrow, or next week, when you log on again, your PC will negotiate a totally different session key.
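
Here is a heavily simplified Python sketch of that idea. It is not how SSL actually negotiates keys: we assume a toy RSA-style server key built from two Mersenne primes, and we stand in for a real symmetric cipher such as AES with a homemade XOR keystream that should never be used for anything real. Every name below is ours.

```python
import hashlib
import secrets

# Toy server key pair (illustration only; needs Python 3.8+ for pow(x, -1, m)).
P, Q = 2**61 - 1, 2**89 - 1
MODULUS = P * Q
SERVER_PUBLIC = 65537
SERVER_PRIVATE = pow(SERVER_PUBLIC, -1, (P - 1) * (Q - 1))


def symmetric_cipher(session_key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a keystream derived from the key.

    Running it twice with the same key returns the original data.
    """
    key_bytes = session_key.to_bytes(16, "big")
    output = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key_bytes + counter).digest()
        output.extend(b ^ k for b, k in zip(data[start:start + 32], pad))
    return bytes(output)


def negotiate_session_key() -> tuple[int, int]:
    """Return (the key the client picked, the key the server recovered)."""
    client_key = secrets.randbits(128)                 # fresh for every session
    wrapped = pow(client_key, SERVER_PUBLIC, MODULUS)  # only this goes on the wire
    server_key = pow(wrapped, SERVER_PRIVATE, MODULUS)
    return client_key, server_key


client_key, server_key = negotiate_session_key()
ciphertext = symmetric_cipher(client_key, b"Transfer $100 to savings")
print(symmetric_cipher(server_key, ciphertext))   # b'Transfer $100 to savings'
```

The slow asymmetrical math is used exactly once, to get the session key across safely. Everything after that uses the fast symmetrical cipher, and the next session starts over with a brand-new key.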

In the next post, we’ll talk more about the different kinds of certificates, what they’re used for, and some of the pitfalls of using certificates to secure communications in your own networks.

We’ve seen a lot of confusion regarding what SSL certificates are all about – what they are, what they do, how you use them to secure a Web site, what the “gotchas” are when you’re trying to set up mobile devices to synchronize with an Exchange server, etc. So we’re going to attempt, over a few posts, to explain in layman’s terms (OK, a fairly technical layman) what it’s all about. However, before you can really understand what SSL is all about, you need to understand a little bit about cryptography.

When we were kids, we probably all played around at one time or another with a simple substitution cipher – where each letter of the alphabet was replaced by another letter, and the same substitution was used for the entire message. It may have been done by simply reversing the alphabet (e.g., Z=A, Y=B, etc.), by shifting all the letters “x” letters to the right or left, or by using your Little Orphan Annie Decoder Ring. (The one-letter-to-the-left substitution cipher was famously used by Arthur C. Clarke in 2001: A Space Odyssey to turn “IBM” into “HAL” – the computer that ran the spaceship.)
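
If you never owned a decoder ring, a few lines of Python will do the same job. This is just a sketch of the two substitutions mentioned above, and the function names are ours.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def shift_cipher(text: str, shift: int) -> str:
    """Shift every letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)


def reverse_alphabet(text: str) -> str:
    """Swap A with Z, B with Y, and so on."""
    return text.upper().translate(str.maketrans(ALPHABET, ALPHABET[::-1]))


print(shift_cipher("IBM", -1))   # HAL
print(shift_cipher("HAL", 1))    # IBM
```

Decoding is simply the same shift in the opposite direction, and reversing the alphabet twice gets you back where you started.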

The problem with such a simple cipher is that it may fool your average six-year-old, but that’s about it – because (among other things) it does nothing to conceal frequency patterns. The letter “e” is, by far, the most frequently used letter in the English language, followed by “t,” “a,” “o,” etc. (If you want the full list, you can find it at http://en.wikipedia.org/wiki/Letter_frequency.) So whichever letter shows up most frequently in your encoded message is likely to represent the letter “e,” and so forth…and the longer the message is, the more obvious these patterns become. It would be nice to have a system that used a different substitution method for each letter of the message so that the frequency patterns are also concealed.
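
To see how little work this takes, here is a short Python sketch that counts letters in a message encoded with the one-letter-to-the-left shift and guesses the shift from the most common letter. The sample message and the function names are ours, and on a message this short the trick only works because we picked one with plenty of e's in it.

```python
from collections import Counter


def shift_left_one(text: str) -> str:
    """Encode by moving each letter one place to the left (A wraps to Z)."""
    return "".join(
        chr((ord(ch) - ord("A") - 1) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text.upper()
    )


def letter_counts(text: str) -> list[tuple[str, int]]:
    """Return (letter, count) pairs, most frequent first."""
    return Counter(ch for ch in text.upper() if "A" <= ch <= "Z").most_common()


ciphertext = shift_left_one("MEET ME HERE AT SEVEN")
top_letter, count = letter_counts(ciphertext)[0]

# Assume the most common letter stands for "E" and work out the shift from that.
guessed_shift = (ord(top_letter) - ord("E")) % 26

print(ciphertext)                 # LDDS LD GDQD ZS RDUDM
print(top_letter, count)          # D 7
print(guessed_shift)              # 25, which is the same as one place to the left
```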

One approach to this is the so-called “one-time pad,” which is provably impossible to break if it is properly implemented (that is, the pad is truly random, kept secret, and never reused). This is constructed by selecting letters at random, for example, drawing them from a hopper similar to that used for a bingo game. A letter is drawn, it’s written down, then it goes back into the hopper which is again shuffled, and another letter is drawn. This process is continued until you have enough random letters written down to encode the longest message you might care about. Two copies are then made: one which will be used to encode a message, and the other which will be used to decode it. After they are used once, they are destroyed (hence the “one-time” portion of the name). One-time pads were commonly used in World War II to encrypt the most sensitive messages.

To use a one-time pad, you take the first letter of your message and assign it a numerical value of 1 to 26 (1=A, 26=Z). Then you add to that numerical value the numerical value of the first letter of the pad. That gives you the numerical value of the first letter of your ciphertext. If the sum is greater than 26, you subtract 26 from it. This kind of arithmetic is called “modulo 26,” and while you may not have heard that term, we do these kinds of calculations all the time: If it’s 10:00 am, and you’re asked what time it will be in five hours, you know without even thinking hard that it will be 3:00 pm. Effectively, you’re doing modulo 12 arithmetic: 10 + 5 = 15, but 15 is more than 12, so we have to subtract 12 from it to yield 3:00. (Unless you’re in the military, in which case 15:00 is a perfectly legitimate time.) So as we work through the following example, it might be helpful to visualize a clock that, instead of having the numbers 1 – 12 on the face, has the letters A – Z…and when the hand comes around to “Z,” it then starts over at “A.”
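
The clock arithmetic is easy to try out. Here is a small Python sketch, using a made-up helper called `clock_add`, that works for both a 12-hour clock and a 26-letter “alphabet clock.” It assumes the values run from 1 up to the size of the clock face, as in the description above.

```python
def clock_add(a, b, modulus):
    """Add two values on a clock face numbered 1..modulus.

    Shifting down by one before taking the remainder, and back up by
    one afterward, keeps the result in 1..modulus instead of
    0..modulus-1.
    """
    return (a + b - 1) % modulus + 1

print(clock_add(10, 5, 12))   # 10:00 plus five hours -> 3
print(clock_add(12, 26, 26))  # L(12) + Z(26) -> 12, which is L again
print(clock_add(8, 4, 26))    # H(8) + D(4) -> 12, no wrap-around needed
```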

Let’s say that your message is, “Hello world.” Let’s further assume that the first ten characters of your one-time pad are: DKZII MIAVR. (By the way, I came up with these by going to www.random.org, and using their on-line random number generator to generate ten random numbers between 1 and 26.) So let’s write out our message – I’ll put the numerical value of each letter next to it in parentheses – then write the characters from the one-time pad below them, and then do the math:


  H(8)  E(5)  L(12) L(12) O(15) W(23) O(15) R(18) L(12) D(4)
+ D(4)  K(11) Z(26) I(9)  I(9)  M(13) I(9)  A(1)  V(22) R(18)
-------------------------------------------------------------
= L(12) P(16) L(12) U(21) X(24) J(10) X(24) S(19) H(8)  V(22)


So our ciphertext is: LPLUX JXSHV. Note that, in the addition above, there were three times (L + Z, W + M, and L + V) when the sum exceeded 26, so we had to subtract 26 from that sum to come up with a number that we could actually map to a letter. Our recipient, who presumably has a copy of the pad, simply reverses the calculation by subtracting the pad from the ciphertext (adding 26 whenever the result is less than 1) to yield the original message.
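
Here is the same worked example as a short Python sketch, in case you want to check the math or try your own message. The function names are invented for this post, and the code assumes the 1=A through 26=Z numbering used above; spaces and punctuation are simply dropped.

```python
def only_letters(text):
    """Upper-case the text and keep just the letters A-Z."""
    return [ch for ch in text.upper() if "A" <= ch <= "Z"]

def letter_value(ch):
    return ord(ch) - ord("A") + 1      # A=1 ... Z=26

def value_letter(n):
    return chr(n - 1 + ord("A"))       # 1=A ... 26=Z

def otp_encrypt(message, pad):
    """Add each pad letter to the matching message letter, modulo 26."""
    msg, key = only_letters(message), only_letters(pad)
    if len(key) < len(msg):
        raise ValueError("the pad must be at least as long as the message")
    out = []
    for m, p in zip(msg, key):
        total = letter_value(m) + letter_value(p)
        if total > 26:
            total -= 26
        out.append(value_letter(total))
    return "".join(out)

def otp_decrypt(ciphertext, pad):
    """Subtract each pad letter from the matching ciphertext letter."""
    ct, key = only_letters(ciphertext), only_letters(pad)
    if len(key) < len(ct):
        raise ValueError("the pad must be at least as long as the message")
    out = []
    for c, p in zip(ct, key):
        diff = letter_value(c) - letter_value(p)
        if diff < 1:
            diff += 26
        out.append(value_letter(diff))
    return "".join(out)

print(otp_encrypt("Hello world", "DKZII MIAVR"))  # LPLUXJXSHV
print(otp_decrypt("LPLUX JXSHV", "DKZII MIAVR"))  # HELLOWORLD
```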

While one-time pads are very secure, you do have the logistical problem of getting a copy of the pad to the intended recipient of the message. So this approach doesn’t help us much when we’re trying to secure computer communications – where often you don’t know in advance exactly who you will need to communicate with, e.g., a banking site or a typical Internet e-commerce site. Instead, we need something that lends itself to automated coding and decoding.

During World War II, the Germans used a cipher machine called “Enigma.” This machine had a series of wheels and gears that operated in such a way that each time a letter was typed, the wheels would rotate into a new position, which would determine how the next letter would be encoded. The first Enigma machine had spaces for three wheels; a later model had spaces for four. All the recipient needed to know was which wheels to use (they generally had more wheels to choose from than the machine had spaces for) and how to set the initial positions of the wheels, and the message could be decoded. In modern terms, we would call this information the “key.”

One of the major turning points in the war occurred when the British were able to come up with a mathematical model (or “algorithm”) of how the Enigma machine worked. Alan Turing (yes, that Alan Turing) was a key player in that effort, and the roots of modern digital computing trace back to Bletchley Park and that code-breaking effort. (For a very entertaining read, I highly recommend Cryptonomicon by Neal Stephenson, in which Bletchley Park and the code breakers play a leading role.)

Today, we have computers that can perform complex mathematical algorithms very quickly, and the commonly used encryption algorithms are generally made public, specifically so that researchers will attack and attempt to break them. That way, the weak ones get weeded out pretty quickly. But they all work by performing some kind of mathematical manipulation of the numbers that represent the text (and to a computer, all text consists of numbers anyway), and they all require some kind of key, or “seed value,” to get the computation going. Therefore, since the encryption algorithm itself is public knowledge, the security of the system depends entirely on the key.

One such system is the “Advanced Encryption Standard” (“AES”), which happens to be the one adopted by the U.S. government. AES allows for keys that are 128 bits, 192 bits, or 256 bits long. Assuming there isn’t some kind of structural weakness in the AES algorithm – in which case it would presumably have been weeded out before anyone who was serious about security started using it – the logical way to attack it is to sequentially use all possible keys until you find the one that decodes the message. This is called a “brute force” attack. Of course, with a key length of n bits, there are 2^n possible keys. So every bit that’s added to the length of the key doubles the number of possible keys.
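
The doubling is easy to check for yourself. In this small Python sketch, `key_count` is just a name I picked, and the 8-bit and 16-bit sizes are there only for comparison; they are not AES key lengths.

```python
def key_count(bits):
    """Number of distinct keys that fit in `bits` bits."""
    return 2 ** bits

for bits in (8, 16, 128, 192, 256):
    print(f"{bits:>3}-bit key: {key_count(bits):,} possible keys")

# Adding a single bit doubles the number of keys an attacker must try.
print(key_count(129) == 2 * key_count(128))  # True
```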

It is generally accepted that the computing power required to try all possible 128-bit keys will be out of reach for the foreseeable future, unless some unanticipated breakthrough in technology occurs that dramatically increases processing power. Of course, such a breakthrough is entirely possible, which is why AES also allows for 192-bit and 256-bit keys – and remember, a 256-bit key isn’t just twice as hard to break as a 128-bit key, it’s 2^128 times as hard. (And 2^128 is roughly equal to the digit “3” followed by 38 zeros.) Therefore the government requires 192- or 256-bit keys for “highly sensitive” data.
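
To put some numbers on “out of reach,” here is a back-of-the-envelope Python sketch. The rate of one trillion keys per second is purely an assumption I picked for illustration, not a measurement of any real hardware, and `years_to_try_all_keys` is a made-up helper name.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

# Hypothetical attacker speed, chosen only for illustration.
ASSUMED_KEYS_PER_SECOND = 10 ** 12

def years_to_try_all_keys(bits, keys_per_second=ASSUMED_KEYS_PER_SECOND):
    """Years needed to try every key of the given length at the assumed rate."""
    return 2 ** bits / keys_per_second / SECONDS_PER_YEAR

# How many times larger is the 256-bit key space than the 128-bit one?
ratio = 2 ** 256 // 2 ** 128
print(ratio == 2 ** 128)  # True
print(f"{2 ** 128:,}")    # a 39-digit number that starts with 3

print(f"{years_to_try_all_keys(128):.2e} years for 128-bit keys")
print(f"{years_to_try_all_keys(256):.2e} years for 256-bit keys")
```

Even at that assumed rate, working through all the 128-bit keys comes out on the order of 10^19 years, and on average an attacker would have to get through about half of them before hitting the right one.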

AES uses a symmetrical key, meaning that the same key is used both to encrypt and decrypt the message, just as was the case with the old Enigma machine. In the next post of this series, we’ll talk about asymmetrical encryption systems, and try to work our way around to talking about SSL certificates.