CIRCA:WWW

Based on a presentation by Michael Burden (slides at Google Docs)
October 14, 2010

What is WWW?

WWW stands for the World Wide Web. It is an internet service allowing navigation between interlinked hypertext and hypermedia documents.

Development of the idea of the WWW

Almost a steampunk web

In 1934, [Paul] Otlet sketched out plans for a global network of computers (or “electric telescopes,” as he called them) that would allow people to search and browse through millions of interlinked documents, images, audio and video files. He described how people would use the devices to send messages to one another, share files and even congregate in online social networks. He called the whole thing a “réseau,” which might be translated as “network” — or arguably, “web.” [Wright, 2008]

The world's knowledge was growing too large and Otlet was trying to find a way for people to search through the enormous and scattered volumes of printed and published material available, and to access its contents.

By 1895 he was collecting data on every book ever published, along with many other media forms, including photographs, pamphlets, posters, and magazines. Around 1915 he had established a service that allowed customers to send a request for information via telegram or mail. But he never moved beyond a centralized storehouse of knowledge.

Hypertext

An early attempt to provide access to all the world's knowledge was the publication of encyclopedias in the 18th century. The most famous early encyclopedia, the Encyclopédie, was led by Denis Diderot and Jean le Rond d'Alembert and modelled on an earlier English work, the Cyclopaedia by Ephraim Chambers, published in 1728. One of Chambers's innovations was cross-referencing between related articles. Diderot expanded such links to allow "the publication of non-sequential information implemented as nodes or articles connected by links" [1].

In 1945 Vannevar Bush described the Memex in an article in the Atlantic Monthly. It was a personal mechanical library, containing any records a person chose to store. Each stored document would receive a numerical code, for easy retrieval. Users could create associative trails of links between documents, mimicking the way that the human mind linked ideas or memories.

Digital systems are particularly suited to non-sequential navigation of information, so innovation in navigation grew with technological development.

In the mid-1960s Ted Nelson coined the term hypertext to refer to text that is not only related to another text, but also allows immediate access to the related text. In 1968 Douglas Engelbart gave the first public demonstration of a working hypertext system.

In 1987 Apple released one of the most successful hypertext applications, HyperCard.

Although the Internet had been developing in parallel with hypertext, hypertext implementations were generally limited to access within a single dataspace. The breakthrough of the World Wide Web was to combine hypertext navigation with the decentralized capabilities of the Internet.

The Idea of the World Wide Web

HyperText is a way to link and access information of various kinds as a web of nodes in which the user can browse at will. Potentially, HyperText provides a single user-interface to many large classes of stored information such as reports, notes, data-bases, computer documentation and on-line systems help. We propose the implementation of a simple scheme to incorporate several different servers of machine-stored information already available at CERN, including an analysis of the requirements for information access needs by experiments. [Berners-Lee & Cailliau, 1990]


Like the encyclopedia editors and earlier hypertext developers, Tim Berners-Lee was concerned with how to access an enormous quantity of informational resources. Unlike them, however, he did not try to copy all the information or hold it within a single application; he wanted only to create a navigation system that linked resources where they already were.

Technology of the WWW

The TCP/IP protocol suite

The internet and other global networks are made possible through a stack of layered protocols:

  • Link layer - the protocols for exchanging packets of data along a link between two nodes
  • Internet layer (IP) - the protocols for routing packets from a source host to a destination host, possibly across several networks
  • Transport layer (TCP) - end-to-end communication services for applications across a network
  • Application layer - any protocol or method enabling process-to-process communication

The World Wide Web is located in the Application layer. Other services and applications in this layer include:

  • BitTorrent (peer-to-peer file exchange)
  • Internet Relay Chat or IRC (text messaging)
  • Voice over IP or VoIP (telephony)
  • Simple Mail Transfer Protocol or SMTP (email, typically sending)
  • Internet Message Access Protocol or IMAP, and Post Office Protocol or POP (email retrieval)
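
The layering can be made concrete with a short sketch. The Python below is an illustration only, not part of the original presentation: it starts a throwaway web server on the local machine (the `HelloHandler` class and `demo` function are hypothetical names), then opens a plain TCP connection and types an HTTP request into it by hand. TCP carries the bytes; HTTP is only the agreed meaning of those bytes.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request logging


def fetch_raw(host, port, path="/"):
    """Open a TCP connection and speak HTTP over it by hand."""
    # Application layer: an HTTP request is just structured text.
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    # Transport layer: TCP delivers those bytes reliably and in order.
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Serve one page on the loopback interface and fetch it."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_raw("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo().decode("ascii"))
```

The reply contains a status line, some headers, a blank line, and then the HTML document itself. The link and internet layers do not appear in the code because the operating system handles them.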

Standards for the Web

A decentralized service requires certain standards, protocols and methods in order to function.

  • Uniform Resource Identifier or URI is a string that identifies a resource on the Internet by name or location. The most common kind of URI is the Uniform Resource Locator (URL), the 'address' of a web resource (e.g. http://www.google.com)
  • The Domain Name System (DNS) is a hierarchical naming scheme that maps domain names, such as the www.google.com in a URL, to the numeric IP addresses of hosts. The highest level of the hierarchy, the top-level domains, includes .ca, .com, .net, and .fr
  • Hypertext Transfer Protocol or HTTP is the means of requesting and delivering hypermedia documents across the internet
    • HTTP Secure or HTTPS allows encrypted exchange of hypermedia
  • HyperText Markup Language or HTML is the language for marking up hypertext documents
    • HTML developed from SGML
  • Cascading Style Sheets or CSS is a language for describing the presentation of HTML documents
  • Portable Network Graphics or PNG (informally "PNG's Not GIF") is a format for encoding an image
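
As a rough illustration of how a URL is put together, the sketch below uses Python's standard `urllib.parse` module to take one apart. The `describe_url` helper is a hypothetical name introduced here; the list of DNS labels is reversed only to show the hierarchy from the top-level domain downwards.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the parts named by the web standards."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "scheme": parts.scheme,        # which protocol to speak, e.g. http
        "host": host,                  # the domain name that DNS resolves
        # DNS is hierarchical: read right to left, top-level domain first.
        "dns_labels": host.split(".")[::-1] if host else [],
        "path": parts.path or "/",     # which resource on that host
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    for key, value in describe_url("http://www.google.com/search?q=web#top").items():
        print(f"{key}: {value}")
```

For http://www.google.com the labels come out as com, google, www, which is the order in which the name is looked up in the DNS hierarchy.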

The World Wide Web and Humanities Computing Projects

Suitability to Certain Projects

The Web is a means of hyperlinking digital documents and media. Although the term Web is sometimes used interchangeably with the term Internet, this is not accurate: the WWW is one specific application that runs on the Internet. Not all humanities computing projects will be suited to the Web's capabilities.

Awareness of Changing Standards

World Wide Web standards change over time. Any project that relies on the WWW over a long period should track changes to those standards, because a change may mean that a website no longer functions the way it was originally designed. HTML5, for example, is being introduced over the next decade or so to replace the current version of HTML.

Web Browsers

Users access media on the web through client programs, most commonly a web browser. The most popular web browsers are Internet Explorer, Firefox, Chrome, Safari and Opera. Different web browsers may behave differently when presenting hypermedia to the user. Different versions of the same browser may also behave differently. Web browsers may also contain features independent of the Web.

Browsers as Platforms

Web browsers have grown in power and now function more like an operating system than a display terminal. Using the web browser as a platform may suit some projects. Some hypermedia are themselves applications that run in the web browser; Flash and Java applications are examples. Some projects may be suited to offline use of such hypermedia within the web browser.

Information is Public

Security of information is a concern on the Internet. In order for a web browser to display a hypermedia document, it must receive a copy of that document, so anything published on the Web can be saved and redistributed by whoever views it. Additionally, unseen content may accompany the document, including cookies (small pieces of data that the server asks the browser to store on the user's computer).
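
To show what "unseen content" looks like in practice, here is a minimal sketch that picks cookies out of a set of HTTP response headers using Python's standard `http.cookies` module. The header values (`visitor_id`, `theme`) and the `cookies_from_headers` helper are invented for illustration and do not come from any real site.

```python
from http.cookies import SimpleCookie


def cookies_from_headers(header_lines):
    """Collect the name/value pairs a server asks the browser to store."""
    jar = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "set-cookie":
            cookie = SimpleCookie()
            cookie.load(value.strip())
            for key, morsel in cookie.items():
                jar[key] = morsel.value
    return jar


# Hypothetical headers accompanying an HTML page; the reader never sees them.
response_headers = [
    "Content-Type: text/html",
    "Set-Cookie: visitor_id=abc123; Path=/",
    "Set-Cookie: theme=dark; Path=/",
]

if __name__ == "__main__":
    print(cookies_from_headers(response_headers))
```

A browser stores such values and sends them back with later requests to the same site, which is how a server can recognise a returning visitor.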

Search Engines

Finding specific resources on the World Wide Web can be time consuming. Search engines attempt to index large portions of the web to enable faster searching. Each search engine sends out 'spiders' (also called crawlers), programs that follow links from page to page and automatically read and index the pages they reach. A robots.txt file asks spiders to skip particular files and directories; well-behaved spiders honour it, but it is a request and not an access control. Search engines rank websites in their search results based on different criteria. A field, Search Engine Optimization (SEO), exists specifically to achieve higher ranking for websites in online searches. It can be useful to understand search engine ranking in order to make a project more visible to people using search engines.
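
The two basic steps of a spider can be sketched in a few lines of Python using only the standard library: ask whether robots.txt permits a page, then collect the links on that page so they can be visited next. This is a simplified illustration, not how any particular search engine works. The robots.txt rules, the example.org addresses, and the names `LinkCollector`, `extract_links` and `allowed` are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Gather the href of every <a> tag, resolved against the page address."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs that a page links to."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def allowed(robots_txt, user_agent, url):
    """Check a URL against the text of a robots.txt file."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Hypothetical site content used only for this demonstration.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
PAGE = '<a href="/about.html">About</a> <a href="/private/notes.html">Notes</a>'

if __name__ == "__main__":
    for link in extract_links(PAGE, "http://example.org/index.html"):
        verdict = "index" if allowed(ROBOTS_TXT, "ExampleSpider", link) else "skip"
        print(verdict, link)
```

Nothing stops a spider from ignoring the answer from `allowed`, which is why robots.txt should not be relied on to keep material private.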

Citations
