Cancel Cable: How Internet Pirates Get Free Stuff
Chapter 2 – Understanding BitTorrent
BitTorrent is the most popular communications protocol (set of standard rules) that pirates use to exchange files over the internet. You can use the internet your whole life knowing nothing about its many protocols, but it pays to learn the basics of BitTorrent. (Don’t confuse BitTorrent the protocol with BitTorrent the company, which was founded by Bram Cohen, the protocol’s inventor.)
Client-Server Networks
When you visit a typical website and click a link to download a (non-BitTorrent) file, you’re using traditional client-server file distribution. The browser on your computer (the client) tells the server (the remote system holding the desired file) to transfer a copy of the file to your computer. As the download progresses, sequential pieces of the file travel over the internet and are assembled into a whole file on your drive at completion. The protocol handling the transfer is usually HTTP (Hypertext Transfer Protocol) or FTP (File Transfer Protocol). This scheme works well in general but has a few problems:
- You depend solely on the file’s original distributor (a single point of failure) — if the server stalls, you can’t download.
- Popular downloads are prone to bottlenecks as more and more people try to suck files from a single source. (Technically, the client-server approach doesn’t scale.)
- The more popular the download, the more it costs the server in bandwidth charges.
- If the client or server has a problem mid-download (a power outage, lost connection, or system crash), then you’re stuck with an incomplete file and typically must restart the download — possibly a big download — from scratch.
Peer-to-Peer Networks
Adequate mirroring (use of cloned servers) alleviates some of the problems of client-server networks, but BitTorrent solves them outright by using a peer-to-peer (P2P) file-sharing network. Unlike a server-based network, where most of the resources lie with a few central servers, a P2P network has only peers, which are ordinary computers (like yours) that all act as equal points on the network. Every machine on a P2P network can simultaneously download from and upload to every other machine, so the notion of dedicated clients and servers doesn’t apply to P2P.
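The scaling difference can be made concrete with a little arithmetic. The sketch below is an idealized model, not a measurement: it assumes the server's upload capacity is split evenly among clients, and that every peer in a P2P swarm contributes all of its upload capacity. The function names and the capacity figures (100 Mbps for the server or seed, 1 Mbps per peer) are made up for illustration.

```python
def client_server_rate(server_up_mbps: float, n_clients: int) -> float:
    """Per-client download rate when one server feeds every client."""
    return server_up_mbps / n_clients


def p2p_rate(seed_up_mbps: float, peer_up_mbps: float, n_peers: int) -> float:
    """Per-peer rate when every peer also uploads (idealized upper bound)."""
    total_upload = seed_up_mbps + peer_up_mbps * n_peers
    return total_upload / n_peers


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, client_server_rate(100, n), p2p_rate(100, 1, n))
```

In this model the client-server rate keeps shrinking as the crowd grows, while the P2P rate levels off near each peer's own upload capacity, because every newcomer brings bandwidth along with demand.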
What You’ll Need
To download files via BitTorrent, you need:
- A high-speed internet connection such as DSL, cable, fiber, T1, or satellite. BitTorrent is about transferring big files, but if you’re downloading a small document or photo, a dial-up connection will work in a creaky sort of way.
- A computer running a mainstream operating system. This book covers Windows and Mac OS X.
- A free program called a BitTorrent client, described in Chapter 6.
- A hard drive with lots of free space.
When you visit a pirate website for the first time, you might be surprised by the massive amount and variety of what’s freely available and the human motivations behind it. People share files to be generous, share knowledge, spread propaganda, return favors, sabotage employers, spread viruses, refute reputations, show technical prowess, advertise products, compete with other sharers, sell services, escape obscurity, be useful to others, betray friends, defy authority, show off to girls, earn bragging rights, and on and on.
Despite its strictureless amorality, the world of mass piracy has rules. (Rules emerge in all self-organizing complex systems.) Experienced file-sharers:
- Use filenames and keywords that make it easy for others to find the files.
- Organize multiple-file downloads in folders.
- Encode files in popular and standard formats such as MP3 for audio files and PDF or EPUB for books.
- Split different categories of files into independent distributions (movies, music, books, games, and so on).
With experience, you’ll notice other rules, self-enforced because no one wants to look like a tourist. Individual pirate sites have their own rules (some forbid porn, for example) that they enforce by removing offending files or banning violators.
BitTorrent, Step by Step
Let’s look at the birth, life, and decline of a generic file shared via BitTorrent. As a new pirate, you’ll be downloading files that other people have provided. The first step below is something you do yourself only when you’re sharing your own files with others. Any number of files can be shared in a single download, but for simplicity this example uses only one file.
One seeder. The original sharer uses his BitTorrent client to create a torrent file and save it on his hard drive. This file contains metadata, or information about the file to share, not the file itself. A torrent file:
- Has a filename that describes what’s being shared, so that people can search for it. The filename for a TV show, for example, should contain at least the show’s title and episode number.
- Has the filename extension .torrent (for details about extensions, see Chapter 3).
- Names the file to share. (The sharer’s BitTorrent client, not the torrent file itself, keeps track of where that file sits on his hard drive.)
- Specifies a tracker to manage file sharing. A tracker is a server but not in the sense used in a client-server network. That is, it’s not a central location that holds the file but a traffic cop that directs the connections of everyone who’s transferring (downloading or uploading) the shared file. Trackers can negotiate huge numbers of connections; it’s common for tens of thousands of people to transfer the same movie at the same time. The sharer can choose from among many public (open) and private trackers run by pirate websites.
- Contains other metadata, such as filenames, file sizes, and error-checking values (checksums).
- Is small, about 20 KB. If you’re curious, you can open a torrent file in a text editor, but its contents are encoded for compactness.
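The encoding that a text editor shows is called bencoding. As a rough illustration of the metadata listed above, here is a minimal decoder sketch with no error handling. The sample torrent is invented: the tracker address `tracker.example`, the filename `show.mkv`, and the sizes are placeholders, and the 80 zero bytes stand in for the real checksums.

```python
def bdecode(data: bytes, i: int = 0):
    """Decode one bencoded value at offset i; return (value, next_offset)."""
    c = data[i:i + 1]
    if c == b"i":                          # integer: i<digits>e
        end = data.index(b"e", i)
        return int(data[i + 1:end]), end + 1
    if c == b"l":                          # list: l<items>e
        i += 1
        items = []
        while data[i:i + 1] != b"e":
            item, i = bdecode(data, i)
            items.append(item)
        return items, i + 1
    if c == b"d":                          # dictionary: d<key><value>...e
        i += 1
        result = {}
        while data[i:i + 1] != b"e":
            key, i = bdecode(data, i)
            value, i = bdecode(data, i)
            result[key] = value
        return result, i + 1
    colon = data.index(b":", i)            # byte string: <length>:<bytes>
    length = int(data[i:colon])
    start = colon + 1
    return data[start:start + length], start + length


# Invented single-file torrent: 1 MiB file, 256 KiB pieces, 4 checksums.
sample = (
    b"d8:announce31:http://tracker.example/announce"
    b"4:infod6:lengthi1048576e4:name8:show.mkv"
    b"12:piece lengthi262144e6:pieces80:" + bytes(80) + b"ee"
)

if __name__ == "__main__":
    meta, _ = bdecode(sample)
    print(meta[b"announce"])               # the tracker
    print(meta[b"info"][b"name"])          # the shared file's name
    print(meta[b"info"][b"length"])        # its size in bytes
```

To inspect a real torrent file, read it in binary mode and pass the bytes to `bdecode`; the keys shown here (`announce`, `info`, `name`, `length`, `piece length`, `pieces`) are the standard ones for a single-file torrent.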
The sharer then uploads the torrent file to a pirate website (not necessarily the same site whose tracker he’s using) and types a title and description full of searchable keywords. The tracker adds the torrent to its pool, and then the site indexes it and gives it its own webpage and download link.
The nascent torrent now waits for people to notice and download it. The original sharer is, for now, the torrent’s only seeder. A seeder is a peer who has an entire copy of a file and offers it for upload. In a torrent’s early stages, the initial seeder can’t turn off his computer, as this would make the complete file unavailable. Nor can he edit, rename, delete, or move the shared file on his drive, which would corrupt the torrent.
One seeder/one leecher. You find the torrent (see Chapter 8), click its link in your browser, and then open the torrent file in your BitTorrent client. The download starts as the file travels, slowly at first, over the network from the original seeder’s hard drive to yours. You’re now a leecher: a peer who doesn’t have the entire file and is downloading it. Peers (seeders and leechers) sharing the same torrent are called a swarm. A file distributed via BitTorrent is broken into many equal-sized pieces (the last piece may be smaller), like the cars of a freight train. These pieces are sent randomly, not sequentially, to a swarm’s leechers.
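Splitting a file into pieces, and checking each piece against a checksum, can be sketched in a few lines. This is an illustration rather than any client's actual code: the function names and the tiny 1,024-byte piece size are chosen for the example (real torrents typically use pieces of hundreds of kilobytes or more). SHA-1 is the checksum that the original BitTorrent protocol stores for each piece.

```python
import hashlib


def split_into_pieces(data: bytes, piece_length: int) -> list[bytes]:
    """Cut data into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces: list[bytes]) -> list[bytes]:
    """One SHA-1 digest (20 bytes) per piece, as stored in a torrent file."""
    return [hashlib.sha1(p).digest() for p in pieces]


def verify_piece(index: int, piece: bytes, hashes: list[bytes]) -> bool:
    """A downloader accepts a piece only if its digest matches."""
    return hashlib.sha1(piece).digest() == hashes[index]


if __name__ == "__main__":
    shared_file = bytes(range(256)) * 10          # 2,560 bytes of stand-in data
    pieces = split_into_pieces(shared_file, 1024)
    hashes = piece_hashes(pieces)
    print([len(p) for p in pieces])               # sizes of the three pieces
    print(verify_piece(1, pieces[1], hashes))     # an intact piece passes
    print(verify_piece(1, b"corrupt", hashes))    # a damaged piece fails
```

Because every piece is verified on its own, a leecher can accept pieces from any peer in any order and throw away only the ones that arrive damaged.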
One seeder/two leechers. Time passes. It’s just the two of you, seed and leech. You’ve downloaded most of the file but are still missing pieces. Then a new leecher joins the swarm and starts downloading the file. And here’s where BitTorrent changes the game: the file pieces of everyone in a swarm, seeders and leechers alike, are available to everyone else in the swarm. So the new leecher downloads pieces not only from the original seeder but also from you, even though you don’t yet have the entire file. Now you’re simultaneously downloading pieces from the original seeder and uploading pieces to the new leecher. In a short time, the other leecher will get random pieces from the original seeder that you don’t yet have, and he will start uploading to you too. Contrast this scheme with that of a client-server network, where clients can’t communicate with each other and can get pieces only from the server.
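One way to picture this three-way exchange is to treat each peer as a set of piece numbers. The sketch below is a toy model with an invented eight-piece file; the peer names and the pieces each one holds are assumptions for the example, and real clients track this with more compact data structures.

```python
def pieces_to_offer(uploader: set[int], downloader: set[int]) -> set[int]:
    """Pieces the uploader holds that the downloader still lacks."""
    return uploader - downloader


TOTAL_PIECES = 8
seeder = set(range(TOTAL_PIECES))   # has the entire file
you = {0, 2, 3, 5, 6}               # most of the file, still missing 1, 4, 7
newcomer = {1}                      # just joined, got one piece from the seeder

if __name__ == "__main__":
    print(pieces_to_offer(you, newcomer))     # what you can upload to the newcomer
    print(pieces_to_offer(newcomer, you))     # what the newcomer can upload to you
    print(pieces_to_offer(seeder, you))       # what only the seeder can give you
```

Even with a single piece, the newcomer already has something you need, which is why leechers can help one another before either has the whole file.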
Two seeders/many leechers. You finish downloading the file and the torrent gains steam as new leechers join the swarm. You’ve transformed from leecher to seeder now that you have the entire file, meaning you’re no longer downloading, only uploading to leechers. To everyone else in the swarm, you’re now no different from the original seeder.
You now can delete the torrent file and do what you want with the file that you’ve downloaded. But quitting early is impolite, so seed for a few more hours (or days). BitTorrent works best when people continue to seed after their downloads finish. When seeding, you can open or copy the files that you’ve downloaded, but if you edit, rename, delete, or move them, they become unavailable to the swarm. On most home connections, uploading is much slower than downloading (typically, about one tenth the speed), so seeding won’t dent your bandwidth, particularly in large swarms.
Usage note: Originally, leecher referred to someone who downloaded much more than he uploaded. Most BitTorrent sites now use the term neutrally, but in some contexts leecher still denotes selfishness and peer is the neutral term.
Many seeders/many leechers. The swarm grows large as new leechers join and old leechers become seeders. Now any peer can shut down his computer or quit the swarm without affecting the other peers. If a leecher quits, he can resume downloading later at the point where he left off. Even the initial seeder can quit the swarm now that so many other seeders exist. The tracker manages connections and traffic flow as peers come and go.
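The tracker's traffic-cop role can be sketched as a simple registry: peers announce themselves, and the tracker answers with the other peers in the same swarm. The class below is a toy, not a real tracker: `ToyTracker`, the torrent ID, and the peer names are invented, and a real tracker speaks to clients over the network. The "started" and "stopped" events mirror the ones that clients actually send.

```python
from collections import defaultdict


class ToyTracker:
    """Keeps a list of peers per torrent; never touches the shared file."""

    def __init__(self) -> None:
        self.swarms: dict[str, set[str]] = defaultdict(set)

    def announce(self, torrent_id: str, peer: str, event: str = "started") -> list[str]:
        """Record a peer joining or leaving; return the other peers in the swarm."""
        swarm = self.swarms[torrent_id]
        if event == "stopped":
            swarm.discard(peer)
        else:
            swarm.add(peer)
        return sorted(swarm - {peer})


if __name__ == "__main__":
    tracker = ToyTracker()
    print(tracker.announce("some-torrent", "original-seeder"))
    print(tracker.announce("some-torrent", "you"))
    tracker.announce("some-torrent", "original-seeder", event="stopped")
    print(tracker.announce("some-torrent", "newcomer"))
```

Note that the tracker hands out only addresses. Once peers know about one another, the pieces travel directly between them.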
Few seeders/few leechers. Over time, the torrent declines as seeders leave the swarm. A torrent’s life span can be hours or years, depending on its popularity and the conduct of its seeders. Other things equal, the more a swarm shrinks, the longer it takes to complete a download. Swarms with only a handful of peers can be quite slow.
Death. A torrent with no seeders is dead but can be revived if someone reseeds by rejoining the swarm as a seeder to allow the remaining leechers to complete their downloads. Torrent listings show the current number of seeders and leechers (zero seeders = dead). Some pirate websites exclude dead torrents from search results by default.
If a torrent dies while you’re still downloading, don’t just delete it from your BitTorrent client. Wait a few days to see whether a reseeder appears. If a torrent dies just before you’ve completed downloading a file, you may be able to get the missing pieces from the swarm’s other leechers. If so, it’s courteous to stay on as a seeder. If not, you’re out of luck unless you can get the same file from a different torrent or an alternative source like RapidShare or Usenet.
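Whether a seederless swarm can still finish comes down to whether the remaining leechers hold every piece among them. A minimal sketch, assuming each leecher is modeled as a set of piece numbers (the function name and the five-piece examples are invented):

```python
def missing_from_swarm(total_pieces: int, peers: list[set[int]]) -> set[int]:
    """Pieces that no remaining peer holds; empty means downloads can finish."""
    available = set().union(*peers)
    return set(range(total_pieces)) - available


if __name__ == "__main__":
    # Two leechers who together cover all five pieces: the torrent can recover.
    print(missing_from_swarm(5, [{0, 1, 2}, {3, 4}]))
    # Nobody has piece 4: everyone is stuck until a reseeder appears.
    print(missing_from_swarm(5, [{0, 1, 2}, {2, 3}]))
```

In the first case each leecher can complete the file from the other and then stay on as a seeder. In the second, the missing piece has to come from a reseeder or another source.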
Now you know how BitTorrent works. A few more points:
- In contrast to this slow-motion example, the genesis of real-life torrents is often rapid. Newly released torrents for popular TV shows and movies swell to thousands of leechers in minutes. BitTorrent easily handles such flash crowds.
- You might prefer to think of the pieces of a file as pages of a book or nucleotides of a DNA strand (rather than cars of a freight train) because pieces must be reassembled in their proper order for the information to keep its integrity.
- You can’t choose whom you trade pieces with. BitTorrent clients enforce tit-for-tat trading by monitoring peers and choking leechers who try to game bandwidth.
- Your personal files are safe. BitTorrent restricts swarm access to only the shared files on your drive. Despite millions of savvy users, no fatal security flaw has come to light.
- To see BitTorrent in action, visit mg8.org/processing/bt.html.
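The point above about reassembling pieces in their proper order can be sketched as follows. This is an illustration under simple assumptions: pieces arrive in arbitrary order, each is tagged with its piece number, and the function name and sample data are invented.

```python
def reassemble(received: dict[int, bytes], total_pieces: int) -> bytes:
    """Join pieces by piece number, whatever order they arrived in."""
    missing = set(range(total_pieces)) - set(received)
    if missing:
        raise ValueError(f"still missing pieces: {sorted(missing)}")
    return b"".join(received[i] for i in range(total_pieces))


if __name__ == "__main__":
    received: dict[int, bytes] = {}
    # Pieces show up out of order, as they would from a swarm.
    for index, piece in [(2, b"rent"), (0, b"Bit"), (1, b"Tor")]:
        received[index] = piece
    print(reassemble(received, 3))
```

Arrival order does not matter because each piece carries its position, just as a page number tells you where a loose page belongs in a book.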