Saturday, February 8, 2014

Computational mind, Part 2

Note: This series of posts isn't meant to be a technical computers for dummies guide. Quite the opposite. The conclusion will be very non-technical. I just need to get a few details out of the way first.

The Internet has been around, in one flavor or another, for over 40 years. There are two groups of people with respect to that knowledge. The first group, recent immigrants to the net, aren't surprised at all, since they just assumed that it had been around forever. The second group, people who grew up with desktop and laptop computers in the 1990s and 2000s, are very surprised. They thought that the Internet had been raised with the home computer. In some ways it was, though it was born well before them.

The Internet's daddy was called ARPANet. ARPA stands for the Advanced Research Projects Agency. It later became DARPA, with the D standing for Defense. Yep, the Internet was invented by the government. The Department of Defense, no less. Al Gore made himself the butt of many jokes when he claimed to have invented the Internet. Ol' Al was still at Harvard when the Internet was invented. The NSA probably figures that since the Internet was invented by the government, they don't have to feel so bad about using it for their nefarious ends. I doubt they would feel bad about it in any case.

ARPANet woke up in 1969. It was the world's first packet-switched network. Before that, everything was circuit-switched. That distinction has survived all the way to your cell phone, by the way. Circuit switching is the way your land line telephone worked. You picked up the receiver and dialed some digits. The wire from your house went to a place that had a bunch of switches in it that connected your wire to the wire going into the house belonging to the person you called. Effectively, there was one continuous piece of wire between you and whoever you called. Kind of like the old idea of two tin cans connected by a piece of string. Barring eavesdropping, it was just you and whoever you called on the line. In packet switching, a lot of houses could be connected together via something called a router. It was originally called an Interface Message Processor, or IMP, but we'll stick with router. The information you want to send is bundled up into little packets that have an address on them and the router sends them where they need to go. So you have a wire to the router, but only the router has a wire to everywhere else. Think of it as a network traffic cop. When ARPANet woke up, it had four routers. The Internet today has tens of millions of routers.
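For the curious, here is a rough sketch of the traffic cop idea in Python. It is a toy, not how a real router is built: the names Packet, Router, attach, and forward are made up for this illustration, and the "wires" are just lists standing in for each house's inbox.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    destination: str  # the address written on the packet
    payload: str      # the little piece of information being carried


class Router:
    """A toy router: one wire to each attached house, looked up by address."""

    def __init__(self):
        self.wires = {}  # address -> list standing in for that house's inbox

    def attach(self, address):
        """Run a wire from the router to a new house and return its inbox."""
        self.wires[address] = []
        return self.wires[address]

    def forward(self, packet):
        """Read the address on the packet and push it down the right wire."""
        inbox = self.wires.get(packet.destination)
        if inbox is None:
            return False  # no wire to that address, so the packet is dropped
        inbox.append(packet.payload)
        return True


router = Router()
alice_inbox = router.attach("alice")
bob_inbox = router.attach("bob")

router.forward(Packet("bob", "hello"))
router.forward(Packet("alice", "hi back"))
delivered = router.forward(Packet("carol", "anyone home?"))  # nobody by that name
```

Notice that alice and bob never have a wire to each other. Each one only has a wire to the router, and the router decides where every packet goes by reading the address on it.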

In networking there is a term, protocol, that is widely used for a number of purposes. Just like in diplomacy for humans, a network protocol lets computers agree on a way to talk to each other. There are hundreds of protocols in use today, but we're just interested in the Internet Protocol, or IP. If you watch any cop shows lately, they talk about tracking bad guys down by using their IP address. It's a load of crap, for the most part, but it is based on the fact that every IP packet has an address on it so the routers know where to send it. Because the address follows a very specific Internet format, it is called an IP address. The first Internet protocol was called the Transmission Control Protocol, or TCP. To associate it specifically with the Internet, it is usually referred to as TCP/IP. TCP is used because most of the information you want to send won't fit in one packet. There can be a huge number of routers connected in different ways, so all of the broken up packets containing your information can get to you via different routes, and they may not arrive in order. TCP provides a way to number your packets so that the computer on the receiving end can put them back together in the right way, kind of like a jigsaw puzzle. TCP allowed for a large network of interconnected computers, in other words, the Internet.
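The jigsaw puzzle idea can be sketched in a few lines of Python. This is only an illustration of numbering and reordering. Real TCP does a lot more (acknowledgements, resending lost packets), and it numbers bytes rather than whole packets. The function names and the packet size of 8 characters are assumptions made up for the example.

```python
import random


def split_into_packets(message, size):
    """Chop a message into (sequence number, chunk) pairs of at most `size` characters."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Sort the packets by sequence number and glue the chunks back together."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return "".join(chunk for _, chunk in ordered)


message = "Hello from Topeka, Kansas to Beijing, China"
packets = split_into_packets(message, size=8)

# Pretend the packets took different routes and showed up out of order.
random.shuffle(packets)

received = reassemble(packets)
```

However the packets get shuffled along the way, the numbers on them are enough for the receiving end to put the message back in the right order.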

There is a whole bunch of other technical stuff, like collision detection, token passing, OSI layers, and so on, but they aren't important for the point of these posts. Plus, this has gotten pretty deep already. You'll just have to trust me when I say all of this will be important later.

The actual Internet woke up in 1982. At first, it was just used to interconnect universities and government agencies. It was called NSFNet, after the National Science Foundation, which funded the interconnected locations. Shortly after that, commercial providers, called ISPs, started plugging into the network. In less than five years, the Internet was worldwide. At first, the only way that information could be passed was via programs like Gopher, Archie, and Veronica. I told you computer geeks are weird. The biggest repository of information was something called Usenet (use net). It was essentially a big electronic bulletin board. People could leave messages on it for other people, and other people could respond with messages for other people, and so on. A conversation of sorts could be carried on. Facebook is a direct descendant of Usenet. Today, the Usenet group alt.binaries contains the vast majority of the information on Usenet. All of its subgroups contain things like computer programs, music, and, of course, porn. Most of the major places that host Usenet still have everything that was posted there in the last ten years, at least. There is some information on Usenet that goes back to the beginning, 30 or more years ago.

In 1989, a guy named Tim Berners-Lee came up with the idea of creating a large, heavily interconnected network, a web, using something called hyperlinking. Think of a spider web. Hyperlinking is a way for various pieces of information to be linked to parts of other information. When you "click on a link," you are using hyperlinking. The link contains an address for another piece of information. Thus was born the World Wide Web. That's what the www in web page addresses means. The idea caught fire, and within a few years, it was everywhere. Since there were no limitations on what could be hyperlinked, the first few years of the web were a struggle. There was a lot of crap out there. And I do mean crap. Web pages were nothing like they are today. Anybody could build a web page, and unfortunately a lot of people did.

[Image: a diagram of Wikipedia's view of a small part of the web]

The World Wide Web very quickly became a monstrous maze of information. Very early on, directories were built, but the web was growing too fast for them to keep up. The immediate response was to write programs that did exactly what a spider does on a web. They were called web crawlers. Because of the nature of hyperlinking, a web page was usually connected to another web page which was usually connected to yet another web page. The crawler simply follows the hyperlinks and makes a note of each page it encounters. The information is put into a huge database. Another program, called a search engine, uses the web crawler database to find things on the web. The very first large-scale search engine was built by the Digital Equipment Corporation (DEC). It was called AltaVista. The web page for it is still around, but it sends you to Yahoo. Yahoo was the first commercial search engine. Google is now the largest search engine. Here's the important fact to remember. Google's thousands of web crawlers, which run 24 hours a day, seven days a week, have only indexed part of the web. Most estimates put it at 70%. With a little under 2 billion pages indexed, that means that roughly 800 million pages are off in dark, dusty corners of the Internet that no one knows about. 800 million.
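Here is a small Python sketch of what a crawler does, and why it misses things. It doesn't touch the real web. TOY_WEB is a made-up handful of pages, each listing the pages it links to, and the crawl function is my own simplification, not how any real search engine's crawler is written.

```python
from collections import deque

# A pretend web: each page name maps to the pages it links to.
TOY_WEB = {
    "home": ["news", "about"],
    "news": ["home", "story"],
    "about": ["home"],
    "story": ["news"],
    "attic": ["home"],  # links out, but no other page links to it
}


def crawl(web, start):
    """Follow hyperlinks from `start`, noting each page the first time it is seen."""
    seen = {start}
    queue = deque([start])
    index = []  # stands in for the crawler's huge database
    while queue:
        page = queue.popleft()
        index.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl(TOY_WEB, "home")
missed = sorted(set(TOY_WEB) - set(index))
```

The crawler finds home, news, about, and story, but never attic, because no page it visits links there. That is the toy version of those dark, dusty corners: a page that nothing points to is invisible to a program that only follows links.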

Today, most people don't even use the term World Wide Web. It's just the Internet. They aren't aware that all of the things that were on the Internet before the World Wide Web are still there. Most people don't even use a conventional email program like Thunderbird or (trying not to choke on this) Outlook. Most people these days use things like Yahoo, Hotmail, or Google Mail. They get there through pages on the World Wide Web. Very few of the users of the Internet conceive of its size and complexity. Those of us who were around in the beginning can't really conceive of it. We know it's huge, but the human brain balks at the actual size of it. And it's still growing. More and more smartphones and pads are connected every day. The use of home control systems is growing. We're starting to connect our cars. It was a long road from ARPANet to Internet, and it was covered in a very short time.

Now here is an important point to remember. When something is put on the Internet, for the overwhelming majority of cases, it is there forever. Teenage girls learn this to their own chagrin. Those racy pictures you sent to your boyfriend will have a life of their own. Something you put on the Internet will end up on someone else's computer. It will be there until the owner deletes it or the computer dies without a backup. Even if it disappears from all of the publicly available places, it can find its way back onto the net from all of the computers that had it before it disappeared. Google keeps archive copies of web pages that don't have homes any more. There are millions of pages that have been gone for years that are still archived on Google. I had another blog here on Blogspot before Google bought it. When that happened, a bunch of blogs were deleted, mine included. However, I would bet real money that some or all of it is still out there somewhere.

There is no feat of engineering in human history that has ever been close to the Internet in size and complexity. It is ubiquitous. It connects hundreds of millions of people. Over the next 5 or 10 years, that number will be in the billions. And, it all comes down to the fact that you can sit in your living room in Topeka, Kansas and say hello to someone in Beijing, China in less time than it took you to type the word.

Up next: The microprocessor revolution.
