
Saturday, May 20, 2017

Entropy


Entropy is often misleadingly described as a kind of "disorder". But that falls far short. Originally introduced to explain the limited efficiency of steam engines, the concept is now used in many other disciplines as well.

Hardly any concept in physics is used as eagerly outside of physics – and as often at odds with its actual meaning – as that of entropy. Yet the concept has a precisely circumscribed meaning. A concrete definition of this physical quantity was given by the Austrian physicist Ludwig Boltzmann in the second half of the 19th century. He focused on the microscopic behavior of a fluid, that is, a gas or a liquid. Crucially for his definition, he understood the disordered motion of its atoms or molecules as heat.

Entropy in the bathtub

In a closed system with fixed volume and a fixed number of particles, Boltzmann postulated, the entropy is proportional to the logarithm of the number of microstates of the system. By microstates he meant all the ways in which the molecules or atoms of the confined fluid can arrange themselves. His formula thus defines entropy as a measure of the "arrangement freedom" of the molecules and atoms: if the number of accessible microstates rises, the entropy grows. If there are fewer ways for the particles of the fluid to arrange themselves, the entropy is smaller.

Figure: The upper image shows a chamber divided in two by a wall. Gas atoms, drawn as red spheres, are present only in the left half. In the lower image the wall has been removed and the gas atoms can spread out into twice the volume.
Increase in entropy

Boltzmann's formula is often interpreted as if entropy were synonymous with "disorder". This simplified picture is easily misleading, though. Take the foam in a bathtub: when the bubbles burst and the water surface becomes smooth, disorder appears to decrease. The entropy, however, does not! In fact it even increases, because after the foam has collapsed, the space available to the liquid's molecules is no longer restricted to the thin skins of the bubbles – the number of accessible microstates has therefore grown. The entropy has increased.

Boltzmann's definition illuminates one side of the concept – but entropy also has another, macroscopic side, which the German physicist Rudolf Clausius had uncovered some years earlier. At the beginning of the 18th century the steam engine was invented, the classic heat engine. Heat engines convert a temperature difference into mechanical work. Physicists at the time were trying to understand the principles these machines obey. To their irritation, they found that only a few percent of the thermal energy could be converted into mechanical energy. The rest was somehow lost – without their understanding why.

The quality of energy

The theory of thermodynamics seemed to lack a physical concept that accounted for the differing quality of energy and limited the ability to convert thermal energy into mechanical energy. The solution arrived in the form of entropy. In the mid-19th century, Clausius introduced the concept as a thermodynamic quantity and defined it as a macroscopic measure of a property that limits the usability of energy.

According to Clausius, the change in a system's entropy depends on the heat supplied and on the temperature at which the transfer takes place. Entropy is always transferred along with heat, he concluded. Beyond that, Clausius found that in closed systems entropy, unlike energy, is not a conserved quantity. This insight entered physics as the second law of thermodynamics:

"In a closed system, the entropy never decreases."

Entropy therefore always increases or stays constant. This introduces an arrow of time into the physics of closed systems, because as long as entropy grows, thermodynamic processes in closed systems are irreversible – they cannot run backwards.

Photo: The black-lacquered gears of a man-high machine in motion.
Heat engine

A process would be reversible only if the entropy remained constant. That, however, is possible only in theory. All real processes are irreversible. Loosely following Boltzmann, one can also say: the number of accessible microstates never decreases. This microscopic interpretation extends Clausius's macroscopic, thermodynamic one. With entropy, the riddle of the energy that vanished in heat engines could finally be solved (see box). Part of the heat energy perpetually escapes mechanical use and is released again, because the entropy of a closed system must not decrease.

A versatile concept

Since the insights of Clausius and Boltzmann, entropy has spread into other areas of physics as well. It has even been taken up outside physics, at least as a mathematical concept. In 1948, for example, the American mathematician and electrical engineer Claude Shannon introduced the so-called information entropy, a quantity he used to characterize the loss of information in transmissions over telephone lines.

Entropy also plays a role in chemistry and biology: in certain open systems, new structures can form as long as entropy is released to the surroundings. These must be so-called dissipative systems, in which energy is converted into thermal energy. This theory of structure formation goes back to the Belgian physical chemist Ilya Prigogine. To this day, papers are being published that add new facets to the physical scope of the concept.

 

 

Efficiency and entropy

Figure: A cylinder that is otherwise completely thermally insulated is closed at the bottom by an ideal heat conductor. It contains an ideal gas as the working substance. At the top of the cylinder sits a movable piston.
The Carnot cycle

Why is the efficiency of heat engines limited? Rudolf Clausius solved this riddle by introducing the concept of entropy. He considered the cycle of an idealized heat engine in which expansion and compression alternate under isothermal (constant temperature) and isentropic (constant entropy) conditions. Combining conservation of energy with the second law of thermodynamics yields the following inequality for the efficiency of this so-called Carnot cycle:

η ≤ (T1 − T2) / T1

T1 and T2 are the two temperatures between which the cycle is run. The maximum achievable efficiency of a heat engine is thus limited by the laws of thermodynamics. An example: if the machine runs between 100 and 200 degrees Celsius, the maximum achievable efficiency is about 21 percent (the temperatures must be entered into the formula in kelvins).
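The worked example above can be checked with a few lines of code. This is an illustrative sketch; the function name is my own, and the only physics in it is the Carnot bound itself:

```python
# Maximum (Carnot) efficiency of a heat engine between two reservoirs.
# Temperatures must be absolute, so we convert from Celsius to Kelvin.

def carnot_efficiency(t_hot_c, t_cold_c):
    """Upper bound on efficiency for reservoir temperatures in degrees Celsius."""
    t_hot = t_hot_c + 273.15   # hot reservoir, in Kelvin
    t_cold = t_cold_c + 273.15  # cold reservoir, in Kelvin
    return (t_hot - t_cold) / t_hot

# The example from the text: operating between 200 °C and 100 °C.
eta = carnot_efficiency(200, 100)
print(f"Maximum efficiency: {eta:.1%}")  # about 21%
```

Note that the same 100-degree difference placed at lower absolute temperatures would allow a higher efficiency, since the bound depends on the ratio of the temperatures, not their difference alone.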

Two further useful results can be derived mathematically from conservation of energy and the second law of thermodynamics. First, heat can flow from a cold body to a warm one only if work is expended – refrigerators and heat pumps need a supply of energy. Second, no work can be extracted from a single heat reservoir at constant temperature; a flow of heat between reservoirs at different temperatures is always required.

Entropy in formulas

The word entropy is a coinage by Rudolf Clausius from Greek roots and translates roughly as "transformation content". According to the physicist, the entropy change ΔS of a system is related to the supplied heat and the temperature as follows:

ΔS = ΔQ / T

Here ΔQ denotes a small amount of heat supplied to the system reversibly, and T the temperature at which the transfer takes place. The formula says that entropy is always transferred along with heat. Boltzmann's definition of entropy rests on the understanding of heat as the disordered motion of atoms or molecules. According to him, the entropy S is given by the following formula:

S = k · ln W

The entropy is thus proportional to the logarithm of the number W of "microstates" of a system, with all other parameters – such as volume and particle number – held fixed. The microstates are the ways in which the molecules or atoms of a confined fluid can be arranged. The constant k is the Boltzmann constant.
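Boltzmann's formula can be applied directly to the gas-expansion figure from earlier: when the wall is removed and the volume doubles, each particle has twice as many places to be, so W grows by a factor of 2 per particle. A small sketch (the function name is mine; the constants are the standard CODATA values):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, in J/K

def entropy_change_volume_doubling(n_particles):
    # Doubling the volume doubles the positions available to each particle,
    # so W grows by a factor of 2**N and, by S = k ln W,
    # ΔS = k ln(W2/W1) = N * k * ln 2.
    return n_particles * K_B * math.log(2)

# One mole of gas expanding into twice the volume, as in the figure:
N_A = 6.02214076e23  # Avogadro's number
delta_s = entropy_change_volume_doubling(N_A)
print(f"ΔS ≈ {delta_s:.2f} J/K")  # ≈ 5.76 J/K, which is R·ln 2
```

The entropy increase is positive even though nothing "disordered" happened in any everyday sense – the number of accessible microstates simply grew, just as in the bathtub-foam example.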

Saturday, December 10, 2016

SSL / TLS



SSL/TLS begins with a procedure called the handshake, in which client and server agree on which cryptographic algorithms they will use (the cipher suite), and do some asymmetric cryptography magic to establish a shared secret, which, for the duration of the session, will be used to do symmetric encryption and integrity checks on subsequently exchanged application data.

Usually, the "key exchange" bit assumes that:

  • the server sends his public key as part of a certificate (as specified in X.509);
  • the client validates the certificate with regard to his own set of trust anchors (aka "root certificates") to make sure it is genuine, and contains the public key of the intended server;
  • the client uses the server's public key to do the asymmetric key exchange, normally by encrypting a random string with the server's public key (the server will use his private key to decrypt it back)(I am skipping over details irrelevant for this answer).

The hard part about a server certificate is that the server needs to have his public key in a certificate which the client will accept. Since the client accepts certificates based on their signature, to be verified against a finite set of a priori known public keys (the keys of the "root certification authorities"), a certificate apt for validation may be obtained only by asking the said certification authorities. Most CAs will do this only at a price (StartSSL will do it for free). Things are simpler if you control the client code, because you could add your own "root certificate" in the client, and, in effect, manage your own CA and issue your own certificates. This is still key management (the dreaded Public Key Infrastructure), and, as such, entails a non-zero cost.

You can do "certificate-less" SSL in two ways:

  • You can use one of the "DH_anon" cipher suites. These are cipher suites in which both client and server agree that there is no certificate at all; key exchange relies on Diffie-Hellman key exchange. Most client implementations (read: Web browsers) can support that, but it is very often deactivated by default.

  • You can have your server send his public key as a fake certificate. A certificate is just a container format; as long as you put a sequence of bytes of approximately the right length in the "signature" field, it will fit the format (traditionally, we use "self-signed" certificates -- as if the server was its own CA -- but any sequence of junk bytes will do). Of course, the client will no longer be able to validate the certificate. Web browsers will loudly complain with scary warnings.

You may note that while a Web browser will shout and scream at the sight of a self-signed certificate, it will also allow (after clicking a few times on "I know what I am doing" buttons) to install an "exception", by which the client accepts to use that unvalidated certificate. The good part of the exception is that it can be made permanent: the client stores a copy of the offending certificate, and, the next time the client connects to the same server and the server sends back the very same certificate, the client no longer complains. With an installed exception, there is a window of weakness (namely, when the user decides to "install the exception": at that time, he does not really know whether the certificate is the right one, or that of a meddling attacker), but afterward the security is restored.

(It is still an idea of questionable quality, to train your users to bypass the scary warnings by installing exceptions.)

PuTTY is not an SSL client; it uses the SSH protocol, which is conceptually similar to, but practically distinct from, the SSL protocol. The SSH protocol is certificate-less, because it relies on the same model as what I described above with self-signed certificates and installed exceptions. The first time you connect to a given server, the SSH client (PuTTY) will ask for confirmation before accepting the server public key, normally displaying the "fingerprint" of the public key (a hash value) so that you may compare that fingerprint with the expected value (that you know "out-of-band", e.g. you learned it by heart, or you phoned the sysadmin and he spelled it out for you). The SSH client will then permanently record the public key, and not ask for confirmation anymore – but if a given server suddenly uses a different public key, the SSH client will wail even more loudly than a Web browser confronted with a new self-signed certificate.
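The "trust on first use" model described above can be sketched in a few lines. This is a toy illustration, not real SSH code: the key store, messages, and key bytes are all hypothetical placeholders, and a real client would pin the full key, not just a hash it computed itself:

```python
import hashlib

known_hosts = {}  # maps server name -> pinned public-key fingerprint

def fingerprint(public_key_bytes):
    """SHA-256 fingerprint of a public key, as a hex string."""
    return hashlib.sha256(public_key_bytes).hexdigest()

def check_server_key(host, public_key_bytes):
    fp = fingerprint(public_key_bytes)
    if host not in known_hosts:
        # First contact: this is the window of weakness. The user should
        # verify the fingerprint out-of-band before accepting it.
        known_hosts[host] = fp
        return "accepted on first use"
    if known_hosts[host] == fp:
        # Same key as last time: security is restored after first use.
        return "key matches pinned fingerprint"
    # Key changed: possible man-in-the-middle. Complain loudly.
    return "WARNING: host key changed!"

print(check_server_key("example.net", b"server-public-key"))  # first contact
print(check_server_key("example.net", b"server-public-key"))  # same key, silent
print(check_server_key("example.net", b"attacker-key"))       # loud warning
```

The asymmetry is the point: after the single risky first contact, any later substitution of the key is detected automatically.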

Sunday, February 21, 2016

IOPS Are A Scam

https://www.brentozar.com/archive/2013/09/iops-are-a-scam/

Storage vendors brag about the IOPS that their hardware can provide. Cloud providers have offered guaranteed IOPS for a while now. It seems that no matter where we turn, we can't get away from IOPS.

WHAT ARE YOU MEASURING?

When someone says IOPS, what are they referring to? IOPS is an acronym for Input/Output Operations Per Second. It's a measure of how many physical read/write operations a device can perform in one second.

IOPS are relied upon as an arbiter of storage performance. After all, if something has 7,000 IOPS, it's gotta be faster than something with only 300 IOPS, right?

The answer, as it turns out, is a resounding "maybe."

Most storage vendors perform their IOPS measurements using a 4k block size, which is irrelevant for SQL Server workloads; remember that SQL Server reads data 64k at a time (mostly). Are you slowly getting the feeling that the shiny thing you bought is a piece of wood covered in aluminum foil?

Those 50,000 IOPS SSDs are really only going to give you 3,125 64KiB IOPS. And that 7,000 IOPS number that Amazon promised you? That's in 16KiB IOPS. When you scale those numbers to 64KiB IOPS it works out to 1,750 64KiB IOPS for SQL Server RDS.
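The conversion behind those numbers is just a ratio of block sizes, on the assumption that the device's bytes-per-second capability stays the same. A quick sketch (the function name is mine):

```python
# Convert an IOPS rating at one block size into the equivalent rating at
# another block size, assuming constant throughput in bytes per second.

def convert_iops(iops, from_kib, to_kib):
    return iops * from_kib / to_kib

# The numbers from the text:
print(convert_iops(50_000, 4, 64))  # 50k IOPS at 4 KiB -> 3125.0 at 64 KiB
print(convert_iops(7_000, 16, 64))  # 7k IOPS at 16 KiB -> 1750.0 at 64 KiB
```

Run the vendor's figure through this before comparing devices; two "IOPS" numbers measured at different block sizes aren't comparable at all.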

Latency as illustrated by ping



LATENCY VS IOPS

What about latency? Where does that fit in?

Latency is a measure of the duration between issuing a request and receiving a response. If you've ever played Counter-Strike, or just run ping, you know about latency. Latency is what we blame when we have unpredictable response times, can't get to google, or when I can't manage to get a headshot because I'm terrible at shooters.

Why does latency matter for disks?

It takes time to spin a disk platter and it takes time to move the read/write head of a disk into position. This introduces latency into rotational hard drives. Rotational HDDs have great sequential read/write numbers, but terrible random read/write numbers for the simple reason that the laws of physics get in the way.

Even SSDs have latency, though. Within an SSD, a controller is responsible for a finite number of chips. Some SSDs have multiple controllers, some have only one. Either way, a controller can only pull data off of the device so fast. As requests queue up, latency can be introduced.

On busy systems, the PCI-Express bus can even become a bottleneck. The PCI-E bus is shared among I/O controllers, network controllers, and other expansion cards. If several of those devices are in use at the same time, it's possible to see latency just from access to the PCI-E bus.

What could trigger PCI-E bottlenecks? A pair of high end PCI-E SSDs (TempDB) can theoretically produce more data than the PCI-E bus can transfer. When you use both PCI-E SSDs and Fibre Channel HBAs, it's easy to run into situations that can introduce random latency into PCI-E performance.

WHAT ABOUT THROUGHPUT?

Throughput is often measured as IOPS * operation size in bytes. So when you see that a disk is able to perform X IOPS or Y MB/s, you know what that number means – it's a measure of capability, but not necessarily timeliness. You could get 4,000 MB/s delivered after a 500 millisecond delay.

Although throughput is a good indication of what you can actually expect from a disk under perfect lab test conditions, it's still no good for measuring performance.

Amazon's SQL Server RDS promise of 7,000 IOPS sounds great until you put it into perspective. 7,000 IOPS * 16KiB = 112,000 KiB per second – that's roughly 100MBps. Or, as you or I might call it, 1 gigabit ethernet.
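That back-of-the-envelope arithmetic is easy to reproduce (a small sketch; the function name is mine, and MB here means decimal megabytes):

```python
# Throughput implied by an IOPS figure: IOPS * operation size.

def throughput_mb_per_s(iops, block_kib):
    kib_per_s = iops * block_kib          # KiB transferred per second
    return kib_per_s * 1024 / 1_000_000   # convert to decimal MB/s

# Amazon's promise of 7,000 IOPS at 16 KiB:
mbps = throughput_mb_per_s(7_000, 16)
print(f"{mbps:.0f} MB/s")  # ~115 MB/s
```

Gigabit ethernet tops out around 125 MB/s, so the "7,000 IOPS" promise is indeed in the same ballpark as a single gigabit link.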

WHAT DOES GOOD STORAGE LOOK LIKE?

Measuring storage performance is tricky. IOPS and throughput are a measurement of activity, but there's no measure of timeliness involved. Latency is a measure of timeliness, but it's devoid of speed.

Combining IOPS, throughput, and latency numbers is a step in the right direction. It lets us combine activity (IOPS), throughput (MB/s), and performance (latency) to examine system performance. 

Predictable latency is incredibly important for disk drives. If we have no idea how the disks will perform, we can't predict application performance and have acceptable SLAs.

In their Systematic Look at EC2 I/O, Scalyr demonstrate that drive latency varies widely in EC2. While these numbers will vary across storage providers, keep in mind that latency is a very real thing and it can cause problems for shared storage and dedicated disks alike.

WHAT CAN WE DO ABOUT IOPS AND LATENCY?

The first step is to make sure we know what the numbers mean. Don't hesitate to convert the vendor's numbers into something relevant for your scenario. It's easy enough to turn 4k IOPS into 64k IOPS or to convert IOPS into MB/s measurements. Once we've converted to an understandable metric, we can verify performance using SQLIO and compare the advertised numbers with real world numbers.

But to get the most out of our hardware, we need to make sure that we're following best practices for SQL Server set up. Once we know that SQL Server is set up well, it's also important to consider adding memory, carefully tuning indexes, and avoiding query anti-patterns.

Even though we can't make storage faster, we can make storage do less work. In the end, making the storage do less gets the same results as making the storage faster.

Sunday, December 27, 2015

Bookstore sells some data centre capacity, becomes Microsoft, Oracle's nemesis

Sysadmin's 2015 review part 1 With 2015 drawing to a close and 2016 about to begin, it is time to reflect on the fact that the world never stops changing. The tech industry certainly changes, and so here's one sysadmin's view of the industry's movers and shakers.

In part one we're going to look at Amazon, Oracle and Microsoft. As I see it, the strategies of these three companies are broken reflections of one another. Oracle is trying to become Amazon. Microsoft is trying to become Oracle. Amazon's plans are completely unrelated to either of them.

Amazon

Amazon started life as an online book store. It quickly expanded to become an online store of, well... everything! This "everything" included the spare capacity of its own data centre. Amazon could have become "just another hosting provider", but to dismiss it as such – and the exceptionally ignorant among us enjoy loudly doing so – is to fail to understand the very first thing about Amazon.

No matter in which area of endeavour it chooses to participate, Amazon is corporately obsessed with efficiency. Amazon commoditises everything, from books to labour to computing to logistics. Amazon automates and orchestrates. It lives and breathes metrics and analytics.

It is more than a business strategy; it is a religion. The idea that everything can be improved through instrumentation, metrics and analytics is the religion of the Seattle tech scene. Microsoft is infected by it, as is virtually every other tech business in the area.

Humans are inefficient and their judgment not as pure as that of an algorithm. Imagine an entire metropolis where everyone has perpetual Google envy, but they try to address it by figuring out a way to ship you a box of bananas that costs half a cent less than the previous method.

Recently, Amazon has decided that owning the online world isn't enough. It wants to be Walmart. It wants physical stores and even more warehouses. It wants sub-warehouses everywhere delivering you goods automatically.

Above all else, Amazon wants those pesky humans out of the picture. Robots in the warehouses, robots to transfer goods from the primary storage facilities to the local sub-warehouses and drones to deliver goods directly to your doorstep.

And Amazon wants to deliver everything you need. Compute resource, physical goods, you name it. If someone buys something – anything – Amazon wants its 30 per cent.

Oracle

Oracle, meanwhile, is obsessed with what Amazon was five years ago, and in doing so it is missing the bigger picture. Oracle wants to move to a subscription revenue model where it not only has an absolute lock on licensing, but the workloads in question are running on Oracle's cloud, too.

Oracle completely misses the point of what Amazon is about, and in doing so it guarantees it will never be the success Amazon is becoming.

As discussed above, Amazon's creation of AWS was essentially an accident. An outgrowth of Amazon's obsession with efficiency. Amazon simply can't have idle server capacity around, especially once the nerds have created this really neat layer of automation and orchestration for using those servers that make the idle capacity easy to sell!

But in doing so, Amazon turned self-service automated hosting of compute workloads into a commodity with relatively low prices and ease of use. Both of these things are anathema to Oracle, but necessary for success as a public cloud provider.

Oracle views becoming a cloud provider as a means of seeing who is using how much of what and how often. This is important because Oracle could then automate its licensing changes to squeeze the maximum amount of dollars out of its customers, adapting in near real time to any attempts to use licensing loopholes.

If Oracle can move enough customers over to its cloud with its new sales policies, then the cloud should help Oracle see increased short term revenue. Keeping to its existing licensing strategies, however, seems doomed to failure when the commodity approach of Amazon is just a click away.

Microsoft

Microsoft has a very successful public cloud. It also has some of the most advanced technologies in any number of markets and when assembled and analysed as a whole probably was the single most impressive technological product portfolio of any company on the planet. For Microsoft, this isn't even close to enough.

In your correspondent's opinion, it seems as if Microsoft operates on the belief that for every computer in use, Microsoft is owed a tithe. Every desktop ever sold should bring in a minimum amount. Every server in use should bring in a much – much – larger amount. Microsoft is seemingly so tied to this model that it has apparently been blinded to rational or useful licensing overhauls for almost two decades.

Today, however, Microsoft no longer has a monopoly on the endpoint. Mobile is huge and Microsoft is a non-entity there. Millions of people have never owned a Microsoft product but manage to access the internet every single day. Soon that number will reach a billion. What's Microsoft to do?

The answer for desktop users, apparently, is to alienate their existing installed base with intrusive Windows 10 advertisements that the average user can't make go away, download a copy of the operating system to their devices unwanted and plan to trigger an install of Windows 10 even when not requested by the user.

I'm personally bitter about the "download a copy of the operating system to their devices unwanted" because this happened to me while I had a device connected to a MiFi device while in another country. It ended up costing me hundreds of dollars, and there's absolutely nothing I can do about it.

Microsoft wants its endpoint dominance back because it allowed the firm to keep end users addicted to a huge number of Microsoft products (such as Office). These products used to reinforce Microsoft's position and in turn drove the uptake of Microsoft Server products. In turn these make using Microsoft endpoints easier.

On the Server side of the house, more than anything, Microsoft wants customers to stop running their own IT. For the very same reasons as Oracle, it wants customers to be using its public cloud for everything.

In order to help push customers to the cloud, Microsoft is making running workloads on your own tin as miserable as possible, and appears to be adopting some of Oracle's most reviled licensing strategies. One of these strategies is per core licensing.

Microsoft isn't, as the term might imply, charging some rational amount per actual core, perhaps with varying tiers based on the power and capability of the core. No, Microsoft is packaging cores up in minimum bundles, apparently ensuring that the average user can't possibly license things optimally and that ultimately the cost of on-premises workload hosting will be above that of hosting it on Azure.

Like Oracle, Microsoft is also massively incentivising its salesforce and what's left of its partners to sell Azure instead of on-premises licensing. Facing competition on various fronts, Microsoft will clearly try anything to get customers locked in.

If Microsoft succeeds, it will be brilliant: once locked back in to its ecosystem, end users will be with it for another decade, maybe two. But if it fails, this will go down in history as a textbook example of corporate hubris.

Microsoft stands upon the razor's edge. Will the superiority of its technology and its cross-integration win out? Or will customers say "Hold, enough"?

In part 2, I'll take a shot at decoding Cisco, Dell and HPE. ®

Friday, December 18, 2015

rsync.net: ZFS Replication to the cloud is finally here—and it’s fast

Even an rsync-lifer admits ZFS replication and rsync.net are making data transfers better.

by Jim Salter - Dec 17, 2015 2:00pm CET

In mid-August, the first commercially available ZFS cloud replication target became available at rsync.net. Who cares, right? As the service itself states, "If you're not sure what this means, our product is Not For You."

Of course, this product is for someone—and to those would-be users, this really will matter. Fully appreciating the new rsync.net (spoiler alert: it's pretty impressive!) means first having a grasp on basic data transfer technologies. And while ZFS replication techniques are burgeoning today, you must actually begin by examining the technology that ZFS is slowly supplanting.

A love affair with rsync

Further Reading

Bitrot and atomic COWs: Inside "next-gen" filesystems

We look at the amazing features in ZFS and btrfs—and why you need them.

Revisiting a first love of any kind makes for a romantic trip down memory lane, and that's what revisiting rsync—as in "rsync.net"—feels like for me. It's hard to write an article that's inevitably going to end up trashing the tool, because I've been wildly in love with it for more than 15 years. Andrew Tridgell (of Samba fame) first announced rsync publicly in June of 1996. He used it for three chapters of his PhD thesis three years later, about the time that I discovered and began enthusiastically using it. For what it's worth, the earliest record of my professional involvement with major open source tools—at least that I've discovered—is my activity on the rsync mailing list in the early 2000s.

Rsync is a tool for synchronizing folders and/or files from one location to another. Adhering to true Unix design philosophy, it's a simple tool to use. There is no GUI, no wizard, and you can use it for the most basic of tasks without being hindered by its interface. But somewhat rare for any tool, in my experience, rsync is also very elegant. It makes a task which is humanly intuitive seem simple despite being objectively complex. In common use, rsync looks like this:

root@test:~# rsync -ha --progress /source/folder /target/

Invoking this command will make sure that once it's over with, there will be a /target/folder, and it will contain all of the same files that the original /source/folder contains. Simple, right? Since we invoked the argument -a (for archive), the sync will be recursive, and the timestamps, ownership, permissions, and all other attributes of the files and folders involved will remain unchanged on the target, just as they are on the source. Since we invoked -h, we'll get human-readable units (like G, M, and K rather than raw bytes, as appropriate). And --progress means we'll get a nice per-file progress bar showing how fast the transfer is going.

So far, this isn't much more than a kinda-nice version of copy. But where it gets interesting is when /target/folder already exists. In that case, rsync will compare each of those files in /source/folder with its counterpart in /target/folder, and it will only update the latter if the source has changed. This keeps everything in the target updated with the least amount of thrashing necessary. This is much cleaner than doing a brute-force copy of everything, changed or not!

It gets even better when you rsync to a remote machine:

root@test:~# rsync -ha --progress /source/folder user@targetmachine:/target/

When rsyncing remotely, rsync still looks over the list of files in the source and target locations, and the tool only messes with files that have changed. It gets even better still—rsync also tokenizes the changed files on each end and then exchanges the tokens to figure out which blocks in the files have changed. Rsync then only moves those individual blocks across the network. (Holy saved bandwidth, Batman!)
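The block-token idea can be sketched very compactly. This is a greatly simplified illustration, not rsync's actual algorithm: real rsync pairs a cheap rolling weak checksum with a stronger hash so it can find matching blocks at *any* offset, while this toy only compares fixed, aligned blocks:

```python
import hashlib

BLOCK = 4  # absurdly small block size, for demonstration only

def block_hashes(data):
    """The 'tokens' the receiver sends: one hash per fixed-size block."""
    return [hashlib.md5(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def delta(new_data, old_hashes):
    """Return (index, block) pairs for blocks the receiver doesn't have."""
    changed = []
    for n, h in enumerate(block_hashes(new_data)):
        if n >= len(old_hashes) or old_hashes[n] != h:
            changed.append((n, new_data[n * BLOCK:(n + 1) * BLOCK]))
    return changed

old = b"the quick brown fox!"  # what the target already has
new = b"the quick brown cat!"  # what the source now contains

# Only the final block differs, so only it would cross the wire:
print(delta(new, block_hashes(old)))  # [(4, b'cat!')]
```

Even this crude version shows the payoff: for a one-word change in a large file, the network sees a handful of small hashes plus a single changed block, not the whole file.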

You can go further and further down this rabbit hole of "what can rsync do." Inline compression to save even more bandwidth? Check. A daemon on the server end to expose only certain directories or files, require authentication, only allow certain IPs access, or allow read-only access to one group but write access to another? You got it. Running "rsync" without any arguments gets you a "cheat sheet" of valid command line arguments several pages long.

To Windows-only admins whose eyes are glazing over by now: rsync is "kinda like robocopy" in the same way that you might look at a light saber and think it's "kinda like a sword."

If rsync's so great, why is ZFS replication even a thing?

This really is the million dollar question. I hate to admit it, but I'd been using ZFS myself for something like four years before I realized the answer. In order to demonstrate how effective each technology is, let's go to the numbers. I'm using rsync.net's new ZFS replication service on the target end and a Linode VM on the source end. I'm also going to be using my own open source orchestration tool syncoid to greatly simplify the otherwise-tedious process of ZFS replication.

First test: what if we copy 1GB of raw data from Linode to rsync.net? First, let's try it with the old tried and true rsync:

root@rsyncnettest:~# time rsync -ha --progress /test/linodetest/ root@myzfs.rsync.net:/mnt/test/linodetest/
sending incremental file list
./
1G.bin
          1.07G 100%    6.60MB/s    0:02:35 (xfr#1, to-chk=0/2)

real	2m36.636s
user	0m22.744s
sys	0m3.616s

And now, with ZFS send/receive, as orchestrated by syncoid:

root@rsyncnettest:~# time syncoid --compress=none test/linodetest root@myzfs.rsync.net:test/linodetest
INFO: Sending oldest full snapshot test/linodetest@1G-clean (~ 1.0 GB) to new target filesystem:
   1GB 0:02:32 [6.54MB/s] [=================================================>] 100%
INFO: Updating new target filesystem with incremental test/linodetest@1G-clean ... syncoid_rsyncnettest_2015-09-18:17:15:53 (~ 4 KB):
1.52kB 0:00:00 [67.1kB/s] [===================>                              ] 38%

real	2m36.685s
user	0m0.244s
sys	0m2.548s

Time-wise, there's really not much to look at. Either way, we transfer 1GB of data in two minutes, 36 seconds and change. It is a little interesting to note that rsync ate up 26 seconds of CPU time while ZFS replication used less than three seconds, but still, this race is kind of a snoozefest.

So let's make things more interesting. Now that we have our 1GB of data actually there, what happens if we change it just enough to force a re-synchronization? In order to do so, we'll touch the file, which doesn't do anything but change its timestamp to the current time.

Just like before, we'll start out with rsync:

root@rsyncnettest:/test# touch /test/linodetest/1G.bin
root@rsyncnettest:/test# time rsync -ha --progress /test/linodetest/ root@myzfs.rsync.net:/mnt/test/linodetest
sending incremental file list
1G.bin
          1.07G 100%  160.47MB/s    0:00:06 (xfr#1, to-chk=0/2)

real	0m13.248s
user	0m6.100s
sys	0m0.296s

And now let's try ZFS:

root@rsyncnettest:/test# touch /test/linodetest/1G.bin
root@rsyncnettest:/test# time syncoid --compress=none test/linodetest root@myzfs.rsync.net:test/linodetest
INFO: Sending incremental test/linodetest@syncoid_rsyncnettest_2015-09-18:16:07:06 ... syncoid_rsyncnettest_2015-09-18:16:07:10 (~ 4 KB):
6.73kB 0:00:00 [ 277kB/s] [==================================================] 149%

real	0m1.740s
user	0m0.068s
sys	0m0.076s

Now things start to get real. Rsync needed 13 seconds to get the job done, while ZFS needed less than two. This problem scales, too. For a touched 8GB file, rsync will take 111.9 seconds to re-synchronize, while ZFS still needs only 1.7.

Touching is not even the worst-case scenario. What if, instead, we move a file from one place to another—or even just rename the folder it's in? For this test, we have synchronized folders containing 8GB of data in /test/linodetest/1. Once we've got that done, we rename /test/linodetest/1 to /test/linodetest/2 and resynchronize. Rsync is up first:

root@rsyncnettest:/test# mv /test/linodetest/1 /test/linodetest/2
root@rsyncnettest:/test# time rsync -ha --progress --delete /test/linodetest/ root@myzfs.rsync.net:/mnt/test/linodetest/
sending incremental file list
deleting 1/8G.bin
deleting 1/
./
2/
2/8G.bin
          8.59G 100%    5.56MB/s    0:24:34 (xfr#1, to-chk=0/3)

real	24m39.267s
user	3m15.944s
sys	0m30.056s

Ouch. What's essentially a subtle change requires nearly half an hour of real time and nearly four minutes of CPU time. But with ZFS...

root@rsyncnettest:/test# mv /test/linodetest/1 /test/linodetest/2
root@rsyncnettest:/test# time syncoid --compress=none test/linodetest root@myzfs.rsync.net:test/linodetest
INFO: Sending incremental test/linodetest@syncoid_rsyncnettest_2015-09-18:16:17:29 ... syncoid_rsyncnettest_2015-09-18:16:19:06 (~ 4 KB):
9.41kB 0:00:00 [ 407kB/s] [==================================================] 209%

real	0m1.707s
user	0m0.072s
sys	0m0.024s

Yep—it took the same old 1.7 seconds for ZFS to re-sync, no matter whether we touched a 1GB file, touched an 8GB file, or even moved an 8GB file from one place to another. In the last test, that's almost three full orders of magnitude faster than rsync: 1.7 seconds versus 1,479.3 seconds. Poor rsync never stood a chance.


OK, ZFS is faster sometimes. Does it matter?

I have to be honest—I feel a little like a monster. Most casual users' experience of rsync will be "it rocks!" and "how could anything be better than this?" But after 15 years of daily use, I knew exactly what rsync's weaknesses were, and I targeted them ruthlessly.

As for ZFS replication's weaknesses, well, it really only has one: you need to be using ZFS on both ends. On the one hand, I think you should already want ZFS on both ends. There's a giant laundry list of features you can only get with a next-generation filesystem. But you could easily find yourself stuck with a lesser filesystem—and if you're stuck, you're stuck. No ZFS, no ZFS replication.

Aside from that, ZFS replication ranges from "just as fast as anything else" to "noticeably faster than anything else" to "sit down, shut up, and hold on." The particular use case that drove me to finally exploring replication—which was much, much more daunting before tools like syncoid automated it—was the replication of VM images.
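For a sense of what such tools wrap, manual ZFS replication boils down to snapshotting a dataset and piping zfs send over ssh. This is only a sketch—it requires ZFS on both machines, and the pool, dataset, snapshot, and host names below are invented:

```
# initial full send of a snapshot to the backup host
zfs snapshot tank/vm@monday
zfs send tank/vm@monday | ssh backuphost zfs receive tank/vm

# later: send only the blocks that changed between the two snapshots
zfs snapshot tank/vm@tuesday
zfs send -i tank/vm@monday tank/vm@tuesday | ssh backuphost zfs receive tank/vm
```

Because ZFS already knows which blocks changed between two snapshots, the incremental send never has to read through the whole dataset—the source of the dramatic timing differences in the tests earlier.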

Virtualization keeps getting more and more prevalent, and VMs mean gigantic single files. rsync has a lot of trouble with these. The tool can save you network bandwidth when synchronizing a huge file with only a few changes, but it can't save you disk bandwidth, since rsync needs to read through and tokenize the entire file on both ends before it can even begin moving data across the wire. This was enough to be painful, even on our little 8GB test file. On a two terabyte VM image, it turns into a complete non-starter. I can (and do!) sync a two terabyte VM image daily (across a 5mbps Internet connection) usually in well under an hour. Rsync would need about seven hours just to tokenize those files before it even began actually synchronizing them... and it would render the entire system practically unusable while it did, since it would be greedily reading from the disks at maximum speed in order to do so.

The moral of the story? Replication definitely matters.

rsync is great, but when you feel the need for (data transfer) speed...

What about rsync.net?

Now that we know what ZFS replication is and why it matters, let's talk about rsync.net. I can't resist poking a little fun, but I like these folks. The no-nonsense "if you don't know what it is, it isn't for you" approach is a little blunt, but it makes a realistic assessment of what they're there for. This is a basic service offering extremely high-quality infrastructure to admins who know what they're doing and want to use standard system tools without getting hamstrung by "friendly" interfaces aimed at Joe and Jane Six-Pack. They've been around since 2001, and they are sporting some pretty big names in their "these are our customers" list, including Disney, ESPN, and 3M.

What you're actually getting with your rsync.net subscription is a big, honking VM with as much space on it as you paid for. You can use that VM as a target for rsync—the basic service they've been offering to the world for fourteen years now—or, now, for ZFS replication. It's kind of a sysadmin's dream. You can install what you want, however you want, without any "helpful" management interface getting in your way. Despite the fact that they'd never heard of my helper application Syncoid, I was able to get its dependencies installed immediately and get right to syncing without any trouble. As a veteran sysadmin... niiiiice.

I had a little more trouble testing their bandwidth, simply because it's hard to get enough bandwidth to really challenge their setup. I spun up a ridiculously expensive Linode instance ($960/mo, thank Ghu for hourly billing!) that claimed to offer 10Gbps outbound bandwidth, but it turned out to be... less impressive. Whether I sent to rsync.net or did a speed test with any speedtest.net provider within 200 miles of the exit point of Linode's network, the results turned out the same—about 57mbps. It's possible that Linode really is offering 10Gbps outbound in aggregate but is using traffic-shaping to limit single pipes to what I saw. But I frankly didn't have the time, or the inclination, to test.

Did I mention speedtest.net? Did I mention that rsync.net offers you full root access to your VM and doesn't get in your way? I'm back to my happy place now. A couple of git clones later, I had a working copy of a command-line-only interface to speedtest.net's testing infrastructure on my rsync.net VM, and I could test it that way:

root@3730:/usr/local/bin # python ./speedtest_cli.py
Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Castle Access (69.43.165.28)...
Selecting best server based on latency...
Hosted by I2B Networks Inc (San Diego, CA) [57.35 km]: 4.423 ms
Testing download speed........................................
Download: 501.63 Mbit/s
Testing upload speed..................................................
Upload: 241.71 Mbit/s

502mbps down, 242mbps up. Am I limited by rsync.net there, or by the speedtest.net infrastructure? I honestly don't know. Results were pretty similar with several other speedtest.net servers, so these are some pretty reasonable numbers as far as I can tell. The TL;DR is "almost certainly more bandwidth than you can use," and that's good enough for me... particularly considering that my 1TB VM with rsync.net is only $60/mo.

For a backup target, though, network bandwidth isn't the only concern. What about disk bandwidth? Can you write those incoming ZFS snapshots as fast as the network will carry them? In order to find out, I got a little crafty. Since ZFS supports inline compression and I have no way to be sure rsync.net isn't using it where I can't see it, I wrote a few lines of Perl to generate 1GB of incompressible (pseudo-random) data in memory and then write it repeatedly to the disk.

#!/usr/local/bin/perl

# Grab 1GB of incompressible data into memory...
print "Obtaining 1G of pseudorandom data:\n";
my $G1 = `dd if=/dev/urandom bs=1024M count=1 2>/dev/null`;

# ...then write it to disk ten times, with pv measuring throughput.
print "Beginning write test:\n";
my $loop = 0;
open FH, "| pv -s 10G > /mnt/test/linodetest/test.bin";
while ($loop < 10) { print FH $G1; $loop++; }
close FH;

Looks good. So let's see what happens when we hammer the disks with a 10G stream of random data:

root@3730:/mnt/test/linodetest # perl ~/test.pl
Obtaining 1G of pseudorandom data:
Beginning write test:
 10GiB 0:00:49 [ 208MiB/s] [==================================================>] 100%

Ultimately, I can receive data from the network at about 60MB/sec, which won't stress my storage since it can write at >200 MB/sec. We're solid... seriously solid. From our perspective as the user, it looks just like we have our own beefy machine with 8GB of RAM, a good CPU, and several high-quality hard drives all to ourselves. With a nice fat several-hundred-mbps pipe. In a datacenter. With support. For $60/mo.

Good

  • That price is pretty amazing for what you get—it's about on par with Amazon S3, while offering you any number of things S3 won't and can't give you.
  • rsync.net is so simple. If you know what you're doing, it's going to be extremely easy to work with and extremely difficult to compromise—there's no big complex Internet-facing Web interface to get compromised, it's just you, ssh, and the stuff you put in place.
  • When vulnerabilities do pop up, they're going to be extremely easy to address: FreeBSD will issue the patches, and rsync.net will apply them, making the vulnerability window extremely small.
  • With no tier 1 support, anybody you talk to is going to be a serious *nix engineer, with serious security policies they understand. The kind of social engineering that owned Mat Honan's iCloud account will be extremely difficult to pull off.

Bad

  • Some of the above strengths are also weaknesses. Again, there is no tier 1 support for rsync.net—if you need support, you're going to be talking to a real, no-kidding *nix engineer.
  • If you have to use that support, well, it can get frustrating. I did have some back and forth with support while writing this review, and I learned some things. (I wasn't aware of the High Performance Networking fork of SSH until I reached out to them.) But despite the fact that the folks at rsync.net knew I was writing a review that might end up on the front page of Ars Technica, it generally took anywhere from several hours to a day to get an e-mail response.

Ugly

  • There's nothing else like rsync.net commercially available right now, but this is a pretty specialized service. Neither I nor rsync.net is likely to advocate it as a replacement for things like Dropbox any time soon.

Jim Salter (@jrssnet) is an author, public speaker, small business owner, mercenary sysadmin, and father of three—not necessarily in that order. He got his first real taste of open source by running Apache on his very own dedicated FreeBSD 3.1 server back in 1999, and he's been a fierce advocate of FOSS ever since. He also created and maintains http://freebsdwiki.net and http://ubuntuwiki.net.

In the name of full disclosure, the author developed and maintained the ZFS replication tool referenced above (Syncoid). For this piece, the shell sessions would have become a lot more cumbersome to follow if replication were instead done manually. While there are commercial options, Syncoid is a fully open source, GPL v3.0 licensed tool. The links above lead directly to the GitHub repo, where it can be freely downloaded and used by anyone.

Tuesday, November 24, 2015

Kill the Password: A String of Characters Won’t Protect You

You have a secret that can ruin your life.

It's not a well-kept secret, either. Just a simple string of characters—maybe six of them if you're careless, 16 if you're cautious—that can reveal everything about you.


Your email. Your bank account. Your address and credit card number. Photos of your kids or, worse, of yourself, naked. The precise location where you're sitting right now as you read these words. Since the dawn of the information age, we've bought into the idea that a password, so long as it's elaborate enough, is an adequate means of protecting all this precious data. But in 2012 that's a fallacy, a fantasy, an outdated sales pitch. And anyone who still mouths it is a sucker—or someone who takes you for one.

No matter how complex, no matter how unique, your passwords can no longer protect you.

Look around. Leaks and dumps—hackers breaking into computer systems and releasing lists of usernames and passwords on the open web—are now regular occurrences. The way we daisy-chain accounts, with our email address doubling as a universal username, creates a single point of failure that can be exploited with devastating results. Thanks to an explosion of personal information being stored in the cloud, tricking customer service agents into resetting passwords has never been easier. All a hacker has to do is use personal information that's publicly available on one service to gain entry into another.

This summer, hackers destroyed my entire digital life in the span of an hour. My Apple, Twitter, and Gmail passwords were all robust—seven, 10, and 19 characters, respectively, all alphanumeric, some with symbols thrown in as well—but the three accounts were linked, so once the hackers had conned their way into one, they had them all. They really just wanted my Twitter handle: @mat. As a three-letter username, it's considered prestigious. And to delay me from getting it back, they used my Apple account to wipe every one of my devices, my iPhone and iPad and MacBook, deleting all my messages and documents and every picture I'd ever taken of my 18-month-old daughter.

The age of the password is over. We just haven't realized it yet.

Since that awful day, I've devoted myself to researching the world of online security. And what I have found is utterly terrifying. Our digital lives are simply too easy to crack. Imagine that I want to get into your email. Let's say you're on AOL. All I need to do is go to the website and supply your name plus maybe the city you were born in, info that's easy to find in the age of Google. With that, AOL gives me a password reset, and I can log in as you.

First thing I do? Search for the word "bank" to figure out where you do your online banking. I go there and click on the Forgot Password? link. I get the password reset and log in to your account, which I control. Now I own your checking account as well as your email.

This summer I learned how to get into, well, everything. With two minutes and $4 to spend at a sketchy foreign website, I could report back with your credit card, phone, and Social Security numbers and your home address. Allow me five minutes more and I could be inside your accounts for, say, Amazon, Best Buy, Hulu, Microsoft, and Netflix. With yet 10 more, I could take over your AT&T, Comcast, and Verizon. Give me 20—total—and I own your PayPal. Some of those security holes are plugged now. But not all, and new ones are discovered every day.

The common weakness in these hacks is the password. It's an artifact from a time when our computers were not hyper-connected. Today, nothing you do, no precaution you take, no long or random string of characters can stop a truly dedicated and devious individual from cracking your account. The age of the password has come to an end; we just haven't realized it yet.

Passwords are as old as civilization. And for as long as they've existed, people have been breaking them.

In 413 BC, at the height of the Peloponnesian War, the Athenian general Demosthenes landed in Sicily with 5,000 soldiers to assist in the attack on Syracuse. Things were looking good for the Greeks. Syracuse, a key ally of Sparta, seemed sure to fall.

But during a chaotic nighttime battle at Epipole, Demosthenes' forces were scattered, and while attempting to regroup they began calling out their watchword, a prearranged term that would identify soldiers as friendly. The Syracusans picked up on the code and passed it quietly through their ranks. At times when the Greeks looked too formidable, the watchword allowed their opponents to pose as allies. Employing this ruse, the undermatched Syracusans decimated the invaders, and when the sun rose, their cavalry mopped up the rest. It was a turning point in the war.

The first computers to use passwords were likely those in MIT's Compatible Time-Sharing System, developed in 1961. To limit the time any one user could spend on the system, CTSS used a login to ration access. By 1962, a PhD student named Allan Scherr, wanting more than his four-hour allotment, had defeated the login with a simple hack: he located the file containing the passwords and printed out all of them. After that, he got as much time as he wanted.

During the formative years of the web, as we all went online, passwords worked pretty well. This was due largely to how little data they actually needed to protect. Our passwords were limited to a handful of applications: an ISP for email and maybe an ecommerce site or two. Because almost no personal information was in the cloud—the cloud was barely a wisp at that point—there was little payoff for breaking into an individual's accounts; the serious hackers were still going after big corporate systems.

So we were lulled into complacency. Email addresses morphed into a sort of universal login, serving as our username just about everywhere. This practice persisted even as the number of accounts—the number of failure points—grew exponentially. Web-based email was the gateway to a new slate of cloud apps. We began banking in the cloud, tracking our finances in the cloud, and doing our taxes in the cloud. We stashed our photos, our documents, our data in the cloud.

Eventually, as the number of epic hacks increased, we started to lean on a curious psychological crutch: the notion of the "strong" password. It's the compromise that growing web companies came up with to keep people signing up and entrusting data to their sites. It's the Band-Aid that's now being washed away in a river of blood.

Every security framework needs to make two major trade-offs to function in the real world. The first is convenience: The most secure system isn't any good if it's a total pain to access. Requiring you to remember a 256-character hexadecimal password might keep your data safe, but you're no more likely to get into your account than anyone else. Better security is easy if you're willing to greatly inconvenience users, but that's not a workable compromise.

A Password Hacker in Action

The following is from a January 2012 live chat between Apple online support and a hacker posing as Brian—a real Apple customer. The hacker's goal: resetting the password and taking over the account.

Apple: Can you answer a question from the account? Name of your best friend?

Hacker: I think that is "Kevin" or "Austin" or "Max."

Apple: None of those answers are correct. Do you think you may have entered last names with the answer?

Hacker: I might have, but I don't think so. I've provided the last 4, is that not enough?

Apple: The last four of the card are incorrect. Do you have another card?

Hacker: Can you check again? I'm looking at my Visa here, the last 4 is "5555."

Apple: Yes, I have checked again. 5555 is not what is on the account. Did you try to reset online and choose email authentication?

Hacker: Yes, but my email has been hacked. I think the hacker added a credit card to the account, as many of my accounts had the same thing happen to them.

Apple: You want to try the first and last name for the best friend question?

Hacker: Be right back. The chicken is burning, sorry. One second.

Apple: OK.

Hacker: Here, I'm back. I think the answer might be Chris? He's a good friend.

Apple: I am sorry, Brian, but that answer is incorrect.

Hacker: Christopher A********h is the full name. Another possibility is Raymond M*******r.

Apple: Both of those are incorrect as well.

Hacker: I'm just gonna list off some friends that might be haha. Brian C**a. Bryan Y***t. Steven M***y.

Apple: How about this. Give me the name of one of your custom mail folders.

Hacker: "Google" "Gmail" "Apple" I think. I'm a programmer at Google.

Apple: OK, "Apple" is correct. Can I have an alternate email address for you?

Hacker: The alternate email I used when I made the account?

Apple: I will need an email address to send you the password reset.

Hacker: Can you send it to "toe@aol.com"?

Apple: The email has been sent.

Hacker: Thanks!

The second trade-off is privacy. If the whole system is designed to keep data secret, users will hardly stand for a security regime that shreds their privacy in the process. Imagine a miracle safe for your bedroom: It doesn't need a key or a password. That's because security techs are in the room, watching it 24/7, and they unlock the safe whenever they see that it's you. Not exactly ideal. Without privacy, we could have perfect security, but no one would accept a system like that.

For decades now, web companies have been terrified by both trade-offs. They have wanted the act of signing up and using their service to seem both totally private and perfectly simple—the very state of affairs that makes adequate security impossible. So they've settled on the strong password as the cure. Make it long enough, throw in some caps and numbers, tack on an exclamation point, and everything will be fine.

But for years it hasn't been fine. In the age of the algorithm, when our laptops pack more processing power than a high-end workstation did a decade ago, cracking a long password with brute force computation takes just a few million extra cycles. That's not even counting the new hacking techniques that simply steal our passwords or bypass them entirely—techniques that no password length or complexity can ever prevent. The number of data breaches in the US increased by 67 percent in 2011, and each major breach is enormously expensive: After Sony's PlayStation account database was hacked in 2011, the company had to shell out $171 million to rebuild its network and protect users from identity theft. Add up the total cost, including lost business, and a single hack can become a billion-dollar catastrophe.

How do our online passwords fall? In every imaginable way: They're guessed, lifted from a password dump, cracked by brute force, stolen with a keylogger, or reset completely by conning a company's customer support department.

Let's start with the simplest hack: guessing. Carelessness, it turns out, is the biggest security risk of all. Despite years of being told not to, people still use lousy, predictable passwords. When security consultant Mark Burnett compiled a list of the 10,000 most common passwords based on easily available sources (like passwords dumped online by hackers and simple Google searches), he found the number one password people used was, yes, "password." The second most popular? The number 123456. If you use a dumb password like that, getting into your account is trivial. Free software tools with names like Cain and Abel or John the Ripper automate password-cracking to such an extent that, very literally, any idiot can do it. All you need is an Internet connection and a list of common passwords—which, not coincidentally, are readily available online, often in database-friendly formats.

What's shocking isn't that people still use such terrible passwords. It's that some companies continue to allow it. The same lists that can be used to crack passwords can also be used to make sure no one is able to choose those passwords in the first place. But saving us from our bad habits isn't nearly enough to salvage the password as a security mechanism.

Our other common mistake is password reuse. During the past two years, more than 280 million "hashes" (i.e., encrypted but readily crackable passwords) have been dumped online for everyone to see. LinkedIn, Yahoo, Gawker, and eHarmony all had security breaches in which the usernames and passwords of millions of people were stolen and then dropped on the open web. A comparison of two dumps found that 49 percent of people had reused usernames and passwords between the hacked sites.

"Password reuse is what really kills you," says Diana Smetters, a software engineer at Google who works on authentication systems. "There is a very efficient economy for exchanging that information." Often the hackers who dump the lists on the web are, relatively speaking, the good guys. The bad guys are stealing the passwords and selling them quietly on the black market. Your login may have already been compromised, and you might not know it—until that account, or another that you use the same credentials for, is destroyed.

Hackers also get our passwords through trickery. The most well-known technique is phishing, which involves mimicking a familiar site and asking users to enter their login information. Steven Downey, CTO of Shipley Energy in Pennsylvania, described how this technique compromised the online account of one of his company's board members this past spring. The executive had used a complex alphanumeric password to protect her AOL email. But you don't need to crack a password if you can persuade its owner to give it to you freely.

The hacker phished his way in: He sent her an email that linked to a bogus AOL page, which asked for her password. She entered it. After that he did nothing. At first, that is. The hacker just lurked, reading all her messages and getting to know her. He learned where she banked and that she had an accountant who handled her finances. He even learned her electronic mannerisms, the phrases and salutations she used. Only then did he pose as her and send an email to her accountant, ordering three separate wire transfers totaling roughly $120,000 to a bank in Australia. Her bank at home sent $89,000 before the scam was detected.

An even more sinister means of stealing passwords is to use malware: hidden programs that burrow into your computer and secretly send your data to other people. According to a Verizon report, malware attacks accounted for 69 percent of data breaches in 2011. They are epidemic on Windows and, increasingly, Android. Malware works most commonly by installing a keylogger or some other form of spyware that watches what you type or see. Its targets are often large organizations, where the goal is not to steal one password or a thousand passwords but to access an entire system.

One devastating example is ZeuS, a piece of malware that first appeared in 2007. Clicking a rogue link, usually from a phishing email, installs it on your computer. Then, like a good human hacker, it sits and waits for you to log in to an online banking account somewhere. As soon as you do, ZeuS grabs your password and sends it back to a server accessible to the hacker. In a single case in 2010, the FBI helped apprehend five individuals in the Ukraine who had employed ZeuS to steal $70 million from 390 victims, primarily small businesses in the US.

Targeting such companies is actually typical. "Hackers are increasingly going after small businesses," says Jeremy Grant, who runs the Department of Commerce's National Strategy for Trusted Identities in Cyberspace. Essentially, he's the guy in charge of figuring out how to get us past the current password regime. "They have more money than individuals and less protection than large corporations."

How to Survive the Password Apocalypse

Until we figure out a better system for protecting our stuff online, here are four mistakes you should never make—and four moves that will make your accounts harder (but not impossible) to crack.—M.H.

DON'T

  • Reuse passwords. If you do, a hacker who gets just one of your accounts will own them all.
  • Use a dictionary word as your password. If you must, then string several together into a pass phrase.
  • Use standard number substitutions. Think "P455w0rd" is a good password? N0p3! Cracking tools now have those built in.
  • Use a short password—no matter how weird. Today's processing speeds mean that even passwords like "h6!r$q" are quickly crackable. Your best defense is the longest possible password.

DO

  • Enable two-factor authentication when offered. When you log in from a strange location, a system like this will send you a text message with a code to confirm. Yes, that can be cracked, but it's better than nothing.
  • Give bogus answers to security questions. Think of them as a secondary password. Just keep your answers memorable. My first car? Why, it was a "Camper Van Beethoven Freaking Rules."
  • Scrub your online presence. One of the easiest ways to hack into an account is through your email and billing address information. Sites like Spokeo and WhitePages.com offer opt-out mechanisms to get your information removed from their databases.
  • Use a unique, secure email address for password recoveries. If a hacker knows where your password reset goes, that's a line of attack. So create a special account you never use for communications. And make sure to choose a username that isn't tied to your name—like m****n@wired.com—so it can't be easily guessed.

If our problems with passwords ended there, we could probably save the system. We could ban dumb passwords and discourage reuse. We could train people to outsmart phishing attempts. (Just look closely at the URL of any site that asks for a password.) We could use antivirus software to root out malware.

But we'd be left with the weakest link of all: human memory. Passwords need to be hard in order not to be routinely cracked or guessed. So if your password is any good at all, there's a very good chance you'll forget it—especially if you follow the prevailing wisdom and don't write it down. Because of that, every password-based system needs a mechanism to reset your account. And the inevitable trade-offs (security versus privacy versus convenience) mean that recovering a forgotten password can't be too onerous. That's precisely what opens your account to being easily overtaken via social engineering. Although "socialing" was responsible for just 7 percent of the hacking cases that government agencies tracked last year, it raked in 37 percent of the total data stolen.

Socialing is how my Apple ID was stolen this past summer. The hackers persuaded Apple to reset my password by calling with details about my address and the last four digits of my credit card. Because I had designated my Apple mailbox as a backup address for my Gmail account, the hackers could reset that too, deleting my entire account—eight years' worth of email and documents—in the process. They also posed as me on Twitter and posted racist and antigay diatribes there.

After my story set off a wave of publicity, Apple changed its practices: It temporarily quit issuing password resets over the phone. But you could still get one online. And so a month later, a different exploit was used against New York Times technology columnist David Pogue. This time the hackers were able to reset his password online by getting past his "security questions."

You know the drill. To reset a lost login, you need to supply answers to questions that (supposedly) only you know. For his Apple ID, Pogue had picked (1) What was your first car? (2) What is your favorite model of car? and (3) Where were you on January 1, 2000? Answers to the first two were available on Google: He had written that a Corolla had been his first car, and had recently sung the praises of his Toyota Prius. The hackers just took a wild guess on the third question. It turns out that at the dawn of the new millennium, David Pogue, like the rest of the world, was at a "party."

With that, the hackers were in. They dove into his address book (he's pals with magician David Blaine!) and locked him out of his kitchen iMac.

OK, you might think, but that could never happen to me: David Pogue is Internet-famous, a prolific writer for the major media whose every brain wave goes online. But have you thought about your LinkedIn account? Your Facebook page? Your kids' pages or your friends' or family's? If you have a serious web presence, your answers to the standard questions—still often the only options available—are trivial to root out. Your mother's maiden name is on Ancestry.com, your high school mascot is on Classmates, your birthday is on Facebook, and so is your best friend's name—even if it takes a few tries.

The ultimate problem with the password is that it's a single point of failure, open to many avenues of attack. We can't possibly have a password-based security system that's memorable enough to allow mobile logins, nimble enough to vary from site to site, convenient enough to be easily reset, and yet also secure against brute-force hacking. But today that's exactly what we're banking on—literally.

Who is doing this? Who wants to work that hard to destroy your life? The answer tends to break down into two groups, both of them equally scary: overseas syndicates and bored kids.

The syndicates are scary because they're efficient and wildly prolific. Malware and virus-writing used to be something hobbyist hackers did for fun, as proofs of concept. Not anymore. Sometime around the mid-2000s, organized crime took over. Today's virus writer is more likely to be a member of the professional criminal class operating out of the former Soviet Union than some kid in a Boston dorm room. There's a good reason for that: money.

Given the sums at stake—in 2011 Russian-speaking hackers alone took in roughly $4.5 billion from cybercrime—it's no wonder that the practice has become organized, industrialized, and even violent. Moreover, these criminals are targeting not just businesses and financial institutions but individuals too. Russian cybercriminals, many of whom have ties to the traditional Russian mafia, took in tens of millions of dollars from individuals last year, largely by harvesting online banking passwords through phishing and malware schemes. In other words, when someone steals your Citibank password, there's a good chance it's the mob.

But teenagers are, if anything, scarier, because they're so innovative. The groups that hacked David Pogue and me shared a common member: a 14-year-old kid who goes by the handle "Dictate." He isn't a hacker in the traditional sense. He's just calling companies or chatting with them online and asking for password resets. But that does not make him any less effective. He and others like him start by looking for information about you that's publicly available: your name, email, and home address, for example, which are easy to get from sites like Spokeo and WhitePages.com. Then he uses that data to reset your password in places like Hulu and Netflix, where billing information, including the last four digits of your credit card number, is kept visibly on file. Once he has those four digits, he can get into AOL, Microsoft, and other crucial sites. Soon, through patience and trial and error, he'll have your email, your photos, your files—just as he had mine.

Matthew Prince protected his Google Apps account with a second code that would be sent to his phone—so the hackers got his cell account. Photo: Ethan Hill

Why do kids like Dictate do it? Mostly just for lulz: to fuck shit up and watch it burn. One favorite goal is merely to piss off people by posting racist or otherwise offensive messages on their personal accounts. As Dictate explains, "Racism invokes a funnier reaction in people. Hacking, people don't care too much. When we jacked @jennarose3xo"—aka Jenna Rose, an unfortunate teen singer whose videos got widely hate-watched in 2010—"I got no reaction from just tweeting that I jacked her stuff. We got a reaction when we uploaded a video of some black guys and pretended to be them." Apparently, sociopathy sells.

A lot of these kids came out of the Xbox hacking scene, where the networked competition of gamers encouraged kids to learn cheats to get what they wanted. In particular they developed techniques to steal so-called OG (original gamer) tags—the simple ones, like Dictate instead of Dictate27098—from the people who'd claimed them first. One hacker to come out of that universe was "Cosmo," who was one of the first to discover many of the most brilliant socialing exploits out there, including those used on Amazon and PayPal. ("It just came to me," he said with pride when I met him a few months ago at his grandmother's house in southern California.) In early 2012, Cosmo's group, UGNazi, took down sites ranging from Nasdaq to the CIA to 4chan. It obtained personal information about Michael Bloomberg, Barack Obama, and Oprah Winfrey. When the FBI finally arrested this shadowy figure in June, they found that he was just 15 years old; when he and I met a few months later, I had to drive.

It's precisely because of the relentless dedication of kids like Dictate and Cosmo that the password system cannot be salvaged. You can't arrest them all, and even if you did, new ones would keep growing up. Think of the dilemma this way: Any password-reset system that will be acceptable to a 65-year-old user will fall in seconds to a 14-year-old hacker.

For the same reason, many of the silver bullets that people imagine will supplement—and save—passwords are vulnerable as well. For example, last spring hackers broke into the security company RSA and stole data relating to its SecurID tokens, supposedly hack-proof devices that provide secondary codes to accompany passwords. RSA never divulged just what was taken, but it's widely believed that the hackers got enough data to duplicate the numbers the tokens generate. If they also learned the tokens' device IDs, they'd be able to penetrate the most secure systems in corporate America.

On the consumer side, we hear a lot about the magic of Google's two-factor authentication for Gmail. It works like this: First you confirm a mobile phone number with Google. After that, whenever you try to log in from an unfamiliar IP address, the company sends an additional code to your phone: the second factor. Does this keep your account safer? Absolutely, and if you're a Gmail user, you should enable it this very minute. Will a two-factor system like Gmail's save passwords from obsolescence? Let me tell you about what happened to Matthew Prince.
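Stripped to its essentials, the code-by-phone flow described above amounts to issuing a short random code and accepting it only once, within a short window. A hedged sketch of that general pattern—not Google's actual implementation; the five-minute expiry is an assumed policy:

```python
import secrets
import time

CODE_TTL_SECONDS = 300  # codes expire after five minutes (assumed policy)

def issue_code() -> tuple[str, float]:
    """Generate a 6-digit one-time code and record when it was issued."""
    code = f"{secrets.randbelow(1_000_000):06d}"
    return code, time.time()

def verify_code(submitted: str, code: str, issued_at: float) -> bool:
    """Accept the code only if it matches and hasn't expired."""
    if time.time() - issued_at > CODE_TTL_SECONDS:
        return False
    # Constant-time comparison, so an attacker can't learn the code
    # digit by digit from response timing.
    return secrets.compare_digest(submitted, code)
```

Notice what this second factor actually proves: not who you are, but that you currently control the phone. That distinction is exactly what the Prince story below turns on.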

This past summer UGNazi decided to go after Prince, CEO of a web performance and security company called CloudFlare. They wanted to get into his Google Apps account, but it was protected by two-factor. What to do? The hackers hit his AT&T cell phone account. As it turns out, AT&T uses Social Security numbers essentially as an over-the-phone password. Give the carrier those nine digits—or even just the last four—along with the name, phone number, and billing address on an account and it lets anyone add a forwarding number to any account in its system. And getting a Social Security number these days is simple: They're sold openly online, in shockingly complete databases.

Prince's hackers used the SSN to add a forwarding number to his AT&T service and then made a password-reset request with Google. So when the automated call came in, it was forwarded to them. Voilà—the account was theirs. Two-factor just added a second step and a little expense. The longer we stay on this outdated system—the more Social Security numbers that get passed around in databases, the more login combinations that get dumped, the more we put our entire lives online for all to see—the faster these hacks will get.

The age of the password has come to an end; we just haven't realized it yet. And no one has figured out what will take its place. What we can say for sure is this: Access to our data can no longer hinge on secrets—a string of characters, 10 strings of characters, the answers to 50 questions—that only we're supposed to know. The Internet doesn't do secrets. Everyone is a few clicks away from knowing everything.

Instead, our new system will need to hinge on who we are and what we do: where we go and when, what we have with us, how we act when we're there. And each vital account will need to cue off many such pieces of information—not just two, and definitely not just one.

This last point is crucial. It's what's so brilliant about Google's two-factor authentication, but the company simply hasn't pushed the insight far enough. Two factors should be a bare minimum. Think about it: When you see a man on the street and think it might be your friend, you don't ask for his ID. Instead, you look at a combination of signals. He has a new haircut, but does that look like his jacket? Does his voice sound the same? Is he in a place he's likely to be? If many points don't match, you wouldn't believe his ID; even if the photo seemed right, you'd just assume it had been faked.

And that, in essence, will be the future of online identity verification. It may very well include passwords, much like the IDs in our example. But it will no longer be a password-based system, any more than our system of personal identification is based on photo IDs. The password will be just one token in a multifaceted process. Jeremy Grant of the Department of Commerce calls this an identity ecosystem.

"Cosmo," a teenage hacker in Long Beach, California, used social-engineering exploits to crack accounts at Amazon, AOL, AT&T, Microsoft, Netflix, PayPal, and more. Photo: Sandra Garcia

What about biometrics? After watching lots of movies, many of us would like to think that a fingerprint reader or iris scanner could be what passwords used to be: a single-factor solution, an instant verification. But they both have two inherent problems. First, the infrastructure to support them doesn't exist, a chicken-or-egg issue that almost always spells death for a new technology. Because fingerprint readers and iris scanners are expensive and buggy, no one uses them, and because no one uses them, they never become cheaper or better.

The second, bigger problem is also the Achilles' heel of any one-factor system: A fingerprint or iris scan is a single piece of data, and single pieces of data will be stolen. Dirk Balfanz, a software engineer on Google's security team, points out that passcodes and keys can be replaced, but biometrics are forever: "It's hard for me to get a new finger if my print gets lifted off a glass," he jokes. While iris scans look groovy in the movies, in the age of high-definition photography, using your face or your eye or even your fingerprint as a one-stop verification just means that anyone who can copy it can also get in.

Does that sound far-fetched? It's not. Kevin Mitnick, the fabled social engineer who spent five years in prison for his hacking heroics, now runs his own security company, which gets paid to break into systems and then tell the owners how it was done. In one recent exploit, the client was using voice authentication. To get in, you had to recite a series of randomly generated numbers, and both the sequence and the speaker's voice had to match. Mitnick called his client and recorded their conversation, tricking him into using the numbers zero through nine in conversation. He then split up the audio, played the numbers back in the right sequence, and—presto.

None of this is to say that biometrics won't play a crucial role in future security systems. Devices might require a biometric confirmation just to use them. (Android phones can already pull this off, and given Apple's recent purchase of mobile-biometrics firm AuthenTec, it seems a safe bet that this is coming to iOS as well.) Those devices will then help to identify you: Your computer or a remote website you're trying to access will confirm a particular device. Already, then, you've verified something you are and something you have. But if you're logging in to your bank account from an entirely unlikely place—say, Lagos, Nigeria—then you may have to go through a few more steps. Maybe you'll have to speak a phrase into the microphone and match your voiceprint. Maybe your phone's camera snaps a picture of your face and sends it to three friends, one of whom has to confirm your identity before you can proceed.

In many ways, our data providers will learn to think somewhat like credit card companies do today: monitoring patterns to flag anomalies, then shutting down activity if it seems like fraud. "A lot of what you'll see is that sort of risk analytics," Grant says. "Providers will be able to see where you're logging in from, what kind of operating system you're using."
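The risk analytics Grant describes boil down to scoring each login by how many familiar signals it matches, and challenging the user when the score runs high. A toy sketch of the idea—the signal names and weights are purely illustrative, not from any real provider:

```python
def login_risk(signal_matches: dict[str, bool]) -> float:
    """Combine weighted signals into a crude risk score in [0, 1].

    A missing or non-matching signal contributes its full weight to the
    risk; a login that matches everything scores 0.0.
    """
    weights = {
        "known_device": 0.35,
        "usual_location": 0.30,
        "usual_os": 0.15,
        "usual_hours": 0.20,
    }
    risk = sum(w for name, w in weights.items()
               if not signal_matches.get(name, False))
    return round(risk, 2)
```

A login from your own laptop at home scores near zero and sails through; the same credentials from an unknown device in Lagos score near one and trigger extra verification steps.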

Google is already pushing in this direction, going beyond two-factor to examine each login and see how it relates to the previous one in terms of location, device, and other signals the company won't disclose. If it sees something aberrant, it will force a user to answer questions about the account. "If you can't pass those questions," Smetters says, "we'll send you a notification and tell you to change your password—because you've been owned."

The other thing that's clear about our future password system is which trade-off—convenience or privacy—we'll need to make. It's true that a multifactor system will involve some minor sacrifices in convenience as we jump through various hoops to access our accounts. But it will involve far more significant sacrifices in privacy. The security system will need to draw upon your location and habits, perhaps even your patterns of speech or your very DNA.

We need to make that trade-off, and eventually we will. The only way forward is real identity verification: to allow our movements and metrics to be tracked in all sorts of ways and to have those movements and metrics tied to our actual identity. We are not going to retreat from the cloud—to bring our photos and email back onto our hard drives. We live there now. So we need a system that makes use of what the cloud already knows: who we are and who we talk to, where we go and what we do there, what we own and what we look like, what we say and how we sound, and maybe even what we think.

That shift will involve significant investment and inconvenience, and it will likely make privacy advocates deeply wary. It sounds creepy. But the alternative is chaos and theft and yet more pleas from "friends" in London who have just been mugged. Times have changed. We've entrusted everything we have to a fundamentally broken system. The first step is to acknowledge that fact. The second is to fix it.

Mat Honan (@mat) is a senior writer for Wired and Wired.com's Gadget Lab.