HTTP: The Definitive Guide

Book description

Behind every web transaction lies the Hypertext Transfer Protocol (HTTP) -- the language of web browsers and servers, of portals and search engines, of e-commerce and web services. Understanding HTTP is essential for practically all web-based programming, design, analysis, and administration.

While the basics of HTTP are elegantly simple, the protocol's advanced features are notoriously confusing, because they knit together complex technologies and terminology from many disciplines. This book clearly explains HTTP and these interrelated core technologies, in twenty-one logically organized chapters, backed up by hundreds of detailed illustrations and examples, and convenient reference appendices. HTTP: The Definitive Guide explains everything people need to use HTTP efficiently -- including the "black arts" and "tricks of the trade" -- in a concise and readable manner.

In addition to explaining the basic HTTP features, syntax, and guidelines, this book clarifies related but often misunderstood topics, such as: TCP connection management, web proxy and cache architectures, web robots and robots.txt files, Basic and Digest authentication, secure HTTP transactions, entity body processing, internationalized content, and traffic redirection.

Many technical professionals will benefit from this book. Internet architects and developers who need to design and develop software, IT professionals who need to understand Internet architectural components and interactions, multimedia designers who need to publish and host multimedia, performance engineers who need to optimize web performance, technical marketing professionals who need a clear picture of core web architectures and protocols, as well as untold numbers of students and hobbyists will all benefit from the knowledge packed in this volume.

There are many books that explain how to use the Web, but this is the one that explains how the Web works. Written by experts with years of design and implementation experience, this book is the definitive technical bible that describes the "why" and the "how" of HTTP and web core technologies. HTTP: The Definitive Guide is an essential reference that no technically inclined member of the Internet community should be without.

Table of contents

  1. HTTP: The Definitive Guide
  2. Preface
    1. Running Example: Joe’s Hardware Store
    2. Chapter-by-Chapter Guide
    3. Typographic Conventions
    4. Comments and Questions
    5. Acknowledgments
  3. I. HTTP: The Web’s Foundation
    1. 1. Overview of HTTP
      1. 1.1. HTTP: The Internet’s Multimedia Courier
      2. 1.2. Web Clients and Servers
      3. 1.3. Resources
        1. 1.3.1. Media Types
        2. 1.3.2. URIs
        3. 1.3.3. URLs
        4. 1.3.4. URNs
      4. 1.4. Transactions
        1. 1.4.1. Methods
        2. 1.4.2. Status Codes
        3. 1.4.3. Web Pages Can Consist of Multiple Objects
      5. 1.5. Messages
        1. 1.5.1. Simple Message Example
      6. 1.6. Connections
        1. 1.6.1. TCP/IP
        2. 1.6.2. Connections, IP Addresses, and Port Numbers
        3. 1.6.3. A Real Example Using Telnet
      7. 1.7. Protocol Versions
      8. 1.8. Architectural Components of the Web
        1. 1.8.1. Proxies
        2. 1.8.2. Caches
        3. 1.8.3. Gateways
        4. 1.8.4. Tunnels
        5. 1.8.5. Agents
      9. 1.9. The End of the Beginning
      10. 1.10. For More Information
        1. 1.10.1. HTTP Protocol Information
        2. 1.10.2. Historical Perspective
        3. 1.10.3. Other World Wide Web Information
    2. 2. URLs and Resources
      1. 2.1. Navigating the Internet’s Resources
        1. 2.1.1. The Dark Days Before URLs
      2. 2.2. URL Syntax
        1. 2.2.1. Schemes: What Protocol to Use
        2. 2.2.2. Hosts and Ports
        3. 2.2.3. Usernames and Passwords
        4. 2.2.4. Paths
        5. 2.2.5. Parameters
        6. 2.2.6. Query Strings
        7. 2.2.7. Fragments
      3. 2.3. URL Shortcuts
        1. 2.3.1. Relative URLs
          1. 2.3.1.1. Base URLs
          2. 2.3.1.2. Resolving relative references
        2. 2.3.2. Expandomatic URLs
      4. 2.4. Shady Characters
        1. 2.4.1. The URL Character Set
        2. 2.4.2. Encoding Mechanisms
        3. 2.4.3. Character Restrictions
        4. 2.4.4. A Bit More
      5. 2.5. A Sea of Schemes
      6. 2.6. The Future
        1. 2.6.1. If Not Now, When?
      7. 2.7. For More Information
    3. 3. HTTP Messages
      1. 3.1. The Flow of Messages
        1. 3.1.1. Messages Commute Inbound to the Origin Server
        2. 3.1.2. Messages Flow Downstream
      2. 3.2. The Parts of a Message
        1. 3.2.1. Message Syntax
        2. 3.2.2. Start Lines
          1. 3.2.2.1. Request line
          2. 3.2.2.2. Response line
          3. 3.2.2.3. Methods
          4. 3.2.2.4. Status codes
          5. 3.2.2.5. Reason phrases
          6. 3.2.2.6. Version numbers
        3. 3.2.3. Headers
          1. 3.2.3.1. Header classifications
          2. 3.2.3.2. Header continuation lines
        4. 3.2.4. Entity Bodies
        5. 3.2.5. Version 0.9 Messages
      3. 3.3. Methods
        1. 3.3.1. Safe Methods
        2. 3.3.2. GET
        3. 3.3.3. HEAD
        4. 3.3.4. PUT
        5. 3.3.5. POST
        6. 3.3.6. TRACE
        7. 3.3.7. OPTIONS
        8. 3.3.8. DELETE
        9. 3.3.9. Extension Methods
      4. 3.4. Status Codes
        1. 3.4.1. 100-199: Informational Status Codes
          1. 3.4.1.1. Clients and 100 Continue
          2. 3.4.1.2. Servers and 100 Continue
          3. 3.4.1.3. Proxies and 100 Continue
        2. 3.4.2. 200-299: Success Status Codes
        3. 3.4.3. 300-399: Redirection Status Codes
        4. 3.4.4. 400-499: Client Error Status Codes
        5. 3.4.5. 500-599: Server Error Status Codes
      5. 3.5. Headers
        1. 3.5.1. General Headers
          1. 3.5.1.1. General caching headers
        2. 3.5.2. Request Headers
          1. 3.5.2.1. Accept headers
          2. 3.5.2.2. Conditional request headers
          3. 3.5.2.3. Request security headers
          4. 3.5.2.4. Proxy request headers
        3. 3.5.3. Response Headers
          1. 3.5.3.1. Negotiation headers
          2. 3.5.3.2. Response security headers
        4. 3.5.4. Entity Headers
          1. 3.5.4.1. Content headers
          2. 3.5.4.2. Entity caching headers
      6. 3.6. For More Information
    4. 4. Connection Management
      1. 4.1. TCP Connections
        1. 4.1.1. TCP Reliable Data Pipes
        2. 4.1.2. TCP Streams Are Segmented and Shipped by IP Packets
        3. 4.1.3. Keeping TCP Connections Straight
        4. 4.1.4. Programming with TCP Sockets
      2. 4.2. TCP Performance Considerations
        1. 4.2.1. HTTP Transaction Delays
        2. 4.2.2. Performance Focus Areas
        3. 4.2.3. TCP Connection Handshake Delays
        4. 4.2.4. Delayed Acknowledgments
        5. 4.2.5. TCP Slow Start
        6. 4.2.6. Nagle’s Algorithm and TCP_NODELAY
        7. 4.2.7. TIME_WAIT Accumulation and Port Exhaustion
      3. 4.3. HTTP Connection Handling
        1. 4.3.1. The Oft-Misunderstood Connection Header
        2. 4.3.2. Serial Transaction Delays
      4. 4.4. Parallel Connections
        1. 4.4.1. Parallel Connections May Make Pages Load Faster
        2. 4.4.2. Parallel Connections Are Not Always Faster
        3. 4.4.3. Parallel Connections May “Feel” Faster
      5. 4.5. Persistent Connections
        1. 4.5.1. Persistent Versus Parallel Connections
        2. 4.5.2. HTTP/1.0+ Keep-Alive Connections
        3. 4.5.3. Keep-Alive Operation
        4. 4.5.4. Keep-Alive Options
        5. 4.5.5. Keep-Alive Connection Restrictions and Rules
        6. 4.5.6. Keep-Alive and Dumb Proxies
          1. 4.5.6.1. The Connection header and blind relays
          2. 4.5.6.2. Proxies and hop-by-hop headers
        7. 4.5.7. The Proxy-Connection Hack
        8. 4.5.8. HTTP/1.1 Persistent Connections
        9. 4.5.9. Persistent Connection Restrictions and Rules
      6. 4.6. Pipelined Connections
      7. 4.7. The Mysteries of Connection Close
        1. 4.7.1. “At Will” Disconnection
        2. 4.7.2. Content-Length and Truncation
        3. 4.7.3. Connection Close Tolerance, Retries, and Idempotency
        4. 4.7.4. Graceful Connection Close
          1. 4.7.4.1. Full and half closes
          2. 4.7.4.2. TCP close and reset errors
          3. 4.7.4.3. Graceful close
      8. 4.8. For More Information
        1. 4.8.1. HTTP Connections
        2. 4.8.2. HTTP Performance Issues
        3. 4.8.3. TCP/IP
  4. II. HTTP Architecture
    1. 5. Web Servers
      1. 5.1. Web Servers Come in All Shapes and Sizes
        1. 5.1.1. Web Server Implementations
        2. 5.1.2. General-Purpose Software Web Servers
        3. 5.1.3. Web Server Appliances
        4. 5.1.4. Embedded Web Servers
      2. 5.2. A Minimal Perl Web Server
      3. 5.3. What Real Web Servers Do
      4. 5.4. Step 1: Accepting Client Connections
        1. 5.4.1. Handling New Connections
        2. 5.4.2. Client Hostname Identification
        3. 5.4.3. Determining the Client User Through ident
      5. 5.5. Step 2: Receiving Request Messages
        1. 5.5.1. Internal Representations of Messages
        2. 5.5.2. Connection Input/Output Processing Architectures
      6. 5.6. Step 3: Processing Requests
      7. 5.7. Step 4: Mapping and Accessing Resources
        1. 5.7.1. Docroots
          1. 5.7.1.1. Virtually hosted docroots
          2. 5.7.1.2. User home directory docroots
        2. 5.7.2. Directory Listings
        3. 5.7.3. Dynamic Content Resource Mapping
        4. 5.7.4. Server-Side Includes (SSI)
        5. 5.7.5. Access Controls
      8. 5.8. Step 5: Building Responses
        1. 5.8.1. Response Entities
        2. 5.8.2. MIME Typing
        3. 5.8.3. Redirection
      9. 5.9. Step 6: Sending Responses
      10. 5.10. Step 7: Logging
      11. 5.11. For More Information
    2. 6. Proxies
      1. 6.1. Web Intermediaries
        1. 6.1.1. Private and Shared Proxies
        2. 6.1.2. Proxies Versus Gateways
      2. 6.2. Why Use Proxies?
      3. 6.3. Where Do Proxies Go?
        1. 6.3.1. Proxy Server Deployment
        2. 6.3.2. Proxy Hierarchies
          1. 6.3.2.1. Proxy hierarchy content routing
        3. 6.3.3. How Proxies Get Traffic
      4. 6.4. Client Proxy Settings
        1. 6.4.1. Client Proxy Configuration: Manual
        2. 6.4.2. Client Proxy Configuration: PAC Files
        3. 6.4.3. Client Proxy Configuration: WPAD
      5. 6.5. Tricky Things About Proxy Requests
        1. 6.5.1. Proxy URIs Differ from Server URIs
        2. 6.5.2. The Same Problem with Virtual Hosting
        3. 6.5.3. Intercepting Proxies Get Partial URIs
        4. 6.5.4. Proxies Can Handle Both Proxy and Server Requests
        5. 6.5.5. In-Flight URI Modification
        6. 6.5.6. URI Client Auto-Expansion and Hostname Resolution
        7. 6.5.7. URI Resolution Without a Proxy
        8. 6.5.8. URI Resolution with an Explicit Proxy
        9. 6.5.9. URI Resolution with an Intercepting Proxy
      6. 6.6. Tracing Messages
        1. 6.6.1. The Via Header
          1. 6.6.1.1. Via syntax
          2. 6.6.1.2. Via request and response paths
          3. 6.6.1.3. Via and gateways
          4. 6.6.1.4. The Server and Via headers
          5. 6.6.1.5. Privacy and security implications of Via
        2. 6.6.2. The TRACE Method
          1. 6.6.2.1. Max-Forwards
      7. 6.7. Proxy Authentication
      8. 6.8. Proxy Interoperation
        1. 6.8.1. Handling Unsupported Headers and Methods
        2. 6.8.2. OPTIONS: Discovering Optional Feature Support
        3. 6.8.3. The Allow Header
      9. 6.9. For More Information
    3. 7. Caching
      1. 7.1. Redundant Data Transfers
      2. 7.2. Bandwidth Bottlenecks
      3. 7.3. Flash Crowds
      4. 7.4. Distance Delays
      5. 7.5. Hits and Misses
        1. 7.5.1. Revalidations
        2. 7.5.2. Hit Rate
        3. 7.5.3. Byte Hit Rate
        4. 7.5.4. Distinguishing Hits and Misses
      6. 7.6. Cache Topologies
        1. 7.6.1. Private Caches
        2. 7.6.2. Public Proxy Caches
        3. 7.6.3. Proxy Cache Hierarchies
        4. 7.6.4. Cache Meshes, Content Routing, and Peering
      7. 7.7. Cache Processing Steps
        1. 7.7.1. Step 1: Receiving
        2. 7.7.2. Step 2: Parsing
        3. 7.7.3. Step 3: Lookup
        4. 7.7.4. Step 4: Freshness Check
        5. 7.7.5. Step 5: Response Creation
        6. 7.7.6. Step 6: Sending
        7. 7.7.7. Step 7: Logging
        8. 7.7.8. Cache Processing Flowchart
      8. 7.8. Keeping Copies Fresh
        1. 7.8.1. Document Expiration
        2. 7.8.2. Expiration Dates and Ages
        3. 7.8.3. Server Revalidation
        4. 7.8.4. Revalidation with Conditional Methods
        5. 7.8.5. If-Modified-Since: Date Revalidation
        6. 7.8.6. If-None-Match: Entity Tag Revalidation
        7. 7.8.7. Weak and Strong Validators
        8. 7.8.8. When to Use Entity Tags and Last-Modified Dates
      9. 7.9. Controlling Cachability
        1. 7.9.1. No-Cache and No-Store Headers
        2. 7.9.2. Max-Age Response Headers
        3. 7.9.3. Expires Response Headers
        4. 7.9.4. Must-Revalidate Response Headers
        5. 7.9.5. Heuristic Expiration
        6. 7.9.6. Client Freshness Constraints
        7. 7.9.7. Cautions
      10. 7.10. Setting Cache Controls
        1. 7.10.1. Controlling HTTP Headers with Apache
        2. 7.10.2. Controlling HTML Caching Through HTTP-EQUIV
      11. 7.11. Detailed Algorithms
        1. 7.11.1. Age and Freshness Lifetime
        2. 7.11.2. Age Computation
          1. 7.11.2.1. Apparent age is based on the Date header
          2. 7.11.2.2. Hop-by-hop age calculations
          3. 7.11.2.3. Compensating for network delays
        3. 7.11.3. Complete Age-Calculation Algorithm
        4. 7.11.4. Freshness Lifetime Computation
        5. 7.11.5. Complete Server-Freshness Algorithm
      12. 7.12. Caches and Advertising
        1. 7.12.1. The Advertiser’s Dilemma
        2. 7.12.2. The Publisher’s Response
        3. 7.12.3. Log Migration
        4. 7.12.4. Hit Metering and Usage Limiting
      13. 7.13. For More Information
    4. 8. Integration Points: Gateways, Tunnels, and Relays
      1. 8.1. Gateways
        1. 8.1.1. Client-Side and Server-Side Gateways
      2. 8.2. Protocol Gateways
        1. 8.2.1. HTTP/*: Server-Side Web Gateways
        2. 8.2.2. HTTP/HTTPS: Server-Side Security Gateways
        3. 8.2.3. HTTPS/HTTP: Client-Side Security Accelerator Gateways
      3. 8.3. Resource Gateways
        1. 8.3.1. Common Gateway Interface (CGI)
        2. 8.3.2. Server Extension APIs
      4. 8.4. Application Interfaces and Web Services
      5. 8.5. Tunnels
        1. 8.5.1. Establishing HTTP Tunnels with CONNECT
          1. 8.5.1.1. CONNECT requests
          2. 8.5.1.2. CONNECT responses
        2. 8.5.2. Data Tunneling, Timing, and Connection Management
        3. 8.5.3. SSL Tunneling
        4. 8.5.4. SSL Tunneling Versus HTTP/HTTPS Gateways
        5. 8.5.5. Tunnel Authentication
        6. 8.5.6. Tunnel Security Considerations
      6. 8.6. Relays
      7. 8.7. For More Information
    5. 9. Web Robots
      1. 9.1. Crawlers and Crawling
        1. 9.1.1. Where to Start: The “Root Set”
        2. 9.1.2. Extracting Links and Normalizing Relative Links
        3. 9.1.3. Cycle Avoidance
        4. 9.1.4. Loops and Dups
        5. 9.1.5. Trails of Breadcrumbs
        6. 9.1.6. Aliases and Robot Cycles
        7. 9.1.7. Canonicalizing URLs
        8. 9.1.8. Filesystem Link Cycles
        9. 9.1.9. Dynamic Virtual Web Spaces
        10. 9.1.10. Avoiding Loops and Dups
      2. 9.2. Robotic HTTP
        1. 9.2.1. Identifying Request Headers
        2. 9.2.2. Virtual Hosting
        3. 9.2.3. Conditional Requests
        4. 9.2.4. Response Handling
          1. 9.2.4.1. Status codes
          2. 9.2.4.2. Entities
        5. 9.2.5. User-Agent Targeting
      3. 9.3. Misbehaving Robots
      4. 9.4. Excluding Robots
        1. 9.4.1. The Robots Exclusion Standard
        2. 9.4.2. Web Sites and robots.txt Files
          1. 9.4.2.1. Fetching robots.txt
          2. 9.4.2.2. Response codes
        3. 9.4.3. robots.txt File Format
          1. 9.4.3.1. The User-Agent line
          2. 9.4.3.2. The Disallow and Allow lines
          3. 9.4.3.3. Disallow/Allow prefix matching
        4. 9.4.4. Other robots.txt Wisdom
        5. 9.4.5. Caching and Expiration of robots.txt
        6. 9.4.6. Robot Exclusion Perl Code
        7. 9.4.7. HTML Robot-Control META Tags
          1. 9.4.7.1. Robot META directives
          2. 9.4.7.2. Search engine META tags
      5. 9.5. Robot Etiquette
      6. 9.6. Search Engines
        1. 9.6.1. Think Big
        2. 9.6.2. Modern Search Engine Architecture
        3. 9.6.3. Full-Text Index
        4. 9.6.4. Posting the Query
        5. 9.6.5. Sorting and Presenting the Results
        6. 9.6.6. Spoofing
      7. 9.7. For More Information
    6. 10. HTTP-NG
      1. 10.1. HTTP’s Growing Pains
      2. 10.2. HTTP-NG Activity
      3. 10.3. Modularize and Enhance
      4. 10.4. Distributed Objects
      5. 10.5. Layer 1: Messaging
      6. 10.6. Layer 2: Remote Invocation
      7. 10.7. Layer 3: Web Application
      8. 10.8. WebMUX
      9. 10.9. Binary Wire Protocol
      10. 10.10. Current Status
      11. 10.11. For More Information
  5. III. Identification, Authorization, and Security
    1. 11. Client Identification and Cookies
      1. 11.1. The Personal Touch
      2. 11.2. HTTP Headers
      3. 11.3. Client IP Address
      4. 11.4. User Login
      5. 11.5. Fat URLs
      6. 11.6. Cookies
        1. 11.6.1. Types of Cookies
        2. 11.6.2. How Cookies Work
        3. 11.6.3. Cookie Jar: Client-Side State
          1. 11.6.3.1. Netscape Navigator cookies
          2. 11.6.3.2. Microsoft Internet Explorer cookies
        4. 11.6.4. Different Cookies for Different Sites
          1. 11.6.4.1. Cookie Domain attribute
          2. 11.6.4.2. Cookie Path attribute
        5. 11.6.5. Cookie Ingredients
        6. 11.6.6. Version 0 (Netscape) Cookies
          1. 11.6.6.1. Version 0 Set-Cookie header
          2. 11.6.6.2. Version 0 Cookie header
        7. 11.6.7. Version 1 (RFC 2965) Cookies
          1. 11.6.7.1. Version 1 Set-Cookie2 header
          2. 11.6.7.2. Version 1 Cookie header
          3. 11.6.7.3. Version 1 Cookie2 header and version negotiation
        8. 11.6.8. Cookies and Session Tracking
        9. 11.6.9. Cookies and Caching
        10. 11.6.10. Cookies, Security, and Privacy
      7. 11.7. For More Information
    2. 12. Basic Authentication
      1. 12.1. Authentication
        1. 12.1.1. HTTP’s Challenge/Response Authentication Framework
        2. 12.1.2. Authentication Protocols and Headers
        3. 12.1.3. Security Realms
      2. 12.2. Basic Authentication
        1. 12.2.1. Basic Authentication Example
        2. 12.2.2. Base-64 Username/Password Encoding
        3. 12.2.3. Proxy Authentication
      3. 12.3. The Security Flaws of Basic Authentication
      4. 12.4. For More Information
    3. 13. Digest Authentication
      1. 13.1. The Improvements of Digest Authentication
        1. 13.1.1. Using Digests to Keep Passwords Secret
        2. 13.1.2. One-Way Digests
        3. 13.1.3. Using Nonces to Prevent Replays
        4. 13.1.4. The Digest Authentication Handshake
      2. 13.2. Digest Calculations
        1. 13.2.1. Digest Algorithm Input Data
        2. 13.2.2. The Algorithms H(d) and KD(s,d)
        3. 13.2.3. The Security-Related Data (A1)
        4. 13.2.4. The Message-Related Data (A2)
        5. 13.2.5. Overall Digest Algorithm
        6. 13.2.6. Digest Authentication Session
        7. 13.2.7. Preemptive Authorization
          1. 13.2.7.1. Next nonce pregeneration
          2. 13.2.7.2. Limited nonce reuse
          3. 13.2.7.3. Synchronized nonce generation
        8. 13.2.8. Nonce Selection
        9. 13.2.9. Symmetric Authentication
      3. 13.3. Quality of Protection Enhancements
        1. 13.3.1. Message Integrity Protection
        2. 13.3.2. Digest Authentication Headers
      4. 13.4. Practical Considerations
        1. 13.4.1. Multiple Challenges
        2. 13.4.2. Error Handling
        3. 13.4.3. Protection Spaces
        4. 13.4.4. Rewriting URIs
        5. 13.4.5. Caches
      5. 13.5. Security Considerations
        1. 13.5.1. Header Tampering
        2. 13.5.2. Replay Attacks
        3. 13.5.3. Multiple Authentication Mechanisms
        4. 13.5.4. Dictionary Attacks
        5. 13.5.5. Hostile Proxies and Man-in-the-Middle Attacks
        6. 13.5.6. Chosen Plaintext Attacks
        7. 13.5.7. Storing Passwords
      6. 13.6. For More Information
    4. 14. Secure HTTP
      1. 14.1. Making HTTP Safe
        1. 14.1.1. HTTPS
      2. 14.2. Digital Cryptography
        1. 14.2.1. The Art and Science of Secret Coding
        2. 14.2.2. Ciphers
        3. 14.2.3. Cipher Machines
        4. 14.2.4. Keyed Ciphers
        5. 14.2.5. Digital Ciphers
      3. 14.3. Symmetric-Key Cryptography
        1. 14.3.1. Key Length and Enumeration Attacks
        2. 14.3.2. Establishing Shared Keys
      4. 14.4. Public-Key Cryptography
        1. 14.4.1. RSA
        2. 14.4.2. Hybrid Cryptosystems and Session Keys
      5. 14.5. Digital Signatures
        1. 14.5.1. Signatures Are Cryptographic Checksums
      6. 14.6. Digital Certificates
        1. 14.6.1. The Guts of a Certificate
        2. 14.6.2. X.509 v3 Certificates
        3. 14.6.3. Using Certificates to Authenticate Servers
      7. 14.7. HTTPS: The Details
        1. 14.7.1. HTTPS Overview
        2. 14.7.2. HTTPS Schemes
        3. 14.7.3. Secure Transport Setup
        4. 14.7.4. SSL Handshake
        5. 14.7.5. Server Certificates
        6. 14.7.6. Site Certificate Validation
        7. 14.7.7. Virtual Hosting and Certificates
      8. 14.8. A Real HTTPS Client
        1. 14.8.1. OpenSSL
        2. 14.8.2. A Simple HTTPS Client
        3. 14.8.3. Executing Our Simple OpenSSL Client
      9. 14.9. Tunneling Secure Traffic Through Proxies
      10. 14.10. For More Information
        1. 14.10.1. HTTP Security
        2. 14.10.2. SSL and TLS
        3. 14.10.3. Public-Key Infrastructure
        4. 14.10.4. Digital Cryptography
  6. IV. Entities, Encodings, and Internationalization
    1. 15. Entities and Encodings
      1. 15.1. Messages Are Crates, Entities Are Cargo
        1. 15.1.1. Entity Bodies
      2. 15.2. Content-Length: The Entity’s Size
        1. 15.2.1. Detecting Truncation
        2. 15.2.2. Incorrect Content-Length
        3. 15.2.3. Content-Length and Persistent Connections
        4. 15.2.4. Content Encoding
        5. 15.2.5. Rules for Determining Entity Body Length
      3. 15.3. Entity Digests
      4. 15.4. Media Type and Charset
        1. 15.4.1. Character Encodings for Text Media
        2. 15.4.2. Multipart Media Types
        3. 15.4.3. Multipart Form Submissions
        4. 15.4.4. Multipart Range Responses
      5. 15.5. Content Encoding
        1. 15.5.1. The Content-Encoding Process
        2. 15.5.2. Content-Encoding Types
        3. 15.5.3. Accept-Encoding Headers
      6. 15.6. Transfer Encoding and Chunked Encoding
        1. 15.6.1. Safe Transport
        2. 15.6.2. Transfer-Encoding Headers
        3. 15.6.3. Chunked Encoding
          1. 15.6.3.1. Chunking and persistent connections
          2. 15.6.3.2. Trailers in chunked messages
        4. 15.6.4. Combining Content and Transfer Encodings
        5. 15.6.5. Transfer-Encoding Rules
      7. 15.7. Time-Varying Instances
      8. 15.8. Validators and Freshness
        1. 15.8.1. Freshness
        2. 15.8.2. Conditionals and Validators
      9. 15.9. Range Requests
      10. 15.10. Delta Encoding
        1. 15.10.1. Instance Manipulations, Delta Generators, and Delta Appliers
      11. 15.11. For More Information
    2. 16. Internationalization
      1. 16.1. HTTP Support for International Content
      2. 16.2. Character Sets and HTTP
        1. 16.2.1. Charset Is a Character-to-Bits Encoding
        2. 16.2.2. How Character Sets and Encodings Work
        3. 16.2.3. The Wrong Charset Gives the Wrong Characters
        4. 16.2.4. Standardized MIME Charset Values
        5. 16.2.5. Content-Type Charset Header and META Tags
        6. 16.2.6. The Accept-Charset Header
      3. 16.3. Multilingual Character Encoding Primer
        1. 16.3.1. Character Set Terminology
        2. 16.3.2. Charset Is Poorly Named
        3. 16.3.3. Characters
        4. 16.3.4. Glyphs, Ligatures, and Presentation Forms
        5. 16.3.5. Coded Character Sets
          1. 16.3.5.1. US-ASCII: The mother of all character sets
          2. 16.3.5.2. iso-8859
          3. 16.3.5.3. JIS X 0201
          4. 16.3.5.4. JIS X 0208 and JIS X 0212
          5. 16.3.5.5. UCS
        6. 16.3.6. Character Encoding Schemes
          1. 16.3.6.1. 8-bit
          2. 16.3.6.2. UTF-8
          3. 16.3.6.3. iso-2022-jp
          4. 16.3.6.4. euc-jp
      4. 16.4. Language Tags and HTTP
        1. 16.4.1. The Content-Language Header
        2. 16.4.2. The Accept-Language Header
        3. 16.4.3. Types of Language Tags
        4. 16.4.4. Subtags
        5. 16.4.5. Capitalization
        6. 16.4.6. IANA Language Tag Registrations
        7. 16.4.7. First Subtag: Namespace
        8. 16.4.8. Second Subtag: Namespace
        9. 16.4.9. Remaining Subtags: Namespace
        10. 16.4.10. Configuring Language Preferences
        11. 16.4.11. Language Tag Reference Tables
      5. 16.5. Internationalized URIs
        1. 16.5.1. Global Transcribability Versus Meaningful Characters
        2. 16.5.2. URI Character Repertoire
        3. 16.5.3. Escaping and Unescaping
        4. 16.5.4. Escaping International Characters
        5. 16.5.5. Modal Switches in URIs
      6. 16.6. Other Considerations
        1. 16.6.1. Headers and Out-of-Spec Data
        2. 16.6.2. Dates
        3. 16.6.3. Domain Names
      7. 16.7. For More Information
        1. 16.7.1. Appendixes
        2. 16.7.2. Internet Internationalization
        3. 16.7.3. International Standards
    3. 17. Content Negotiation and Transcoding
      1. 17.1. Content-Negotiation Techniques
      2. 17.2. Client-Driven Negotiation
      3. 17.3. Server-Driven Negotiation
        1. 17.3.1. Content-Negotiation Headers
        2. 17.3.2. Content-Negotiation Header Quality Values
        3. 17.3.3. Varying on Other Headers
        4. 17.3.4. Content Negotiation on Apache
          1. 17.3.4.1. Using type-map files
          2. 17.3.4.2. Using MultiViews
        5. 17.3.5. Server-Side Extensions
      4. 17.4. Transparent Negotiation
        1. 17.4.1. Caching and Alternates
        2. 17.4.2. The Vary Header
      5. 17.5. Transcoding
        1. 17.5.1. Format Conversion
        2. 17.5.2. Information Synthesis
        3. 17.5.3. Content Injection
        4. 17.5.4. Transcoding Versus Static Pregeneration
      6. 17.6. Next Steps
      7. 17.7. For More Information
  7. V. Content Publishing and Distribution
    1. 18. Web Hosting
      1. 18.1. Hosting Services
        1. 18.1.1. A Simple Example: Dedicated Hosting
      2. 18.2. Virtual Hosting
        1. 18.2.1. Virtual Server Request Lacks Host Information
        2. 18.2.2. Making Virtual Hosting Work
          1. 18.2.2.1. Virtual hosting by URL path
          2. 18.2.2.2. Virtual hosting by port number
          3. 18.2.2.3. Virtual hosting by IP address
          4. 18.2.2.4. Virtual hosting by Host header
        3. 18.2.3. HTTP/1.1 Host Headers
          1. 18.2.3.1. Syntax and usage
          2. 18.2.3.2. Missing Host headers
          3. 18.2.3.3. Interpreting Host headers
          4. 18.2.3.4. Host headers and proxies
      3. 18.3. Making Web Sites Reliable
        1. 18.3.1. Mirrored Server Farms
        2. 18.3.2. Content Distribution Networks
        3. 18.3.3. Surrogate Caches in CDNs
        4. 18.3.4. Proxy Caches in CDNs
      4. 18.4. Making Web Sites Fast
      5. 18.5. For More Information
    2. 19. Publishing Systems
      1. 19.1. FrontPage Server Extensions for Publishing Support
        1. 19.1.1. FrontPage Server Extensions
        2. 19.1.2. FrontPage Vocabulary
        3. 19.1.3. The FrontPage RPC Protocol
          1. 19.1.3.1. Request
          2. 19.1.3.2. Response
        4. 19.1.4. FrontPage Security Model
      2. 19.2. WebDAV and Collaborative Authoring
        1. 19.2.1. WebDAV Methods
        2. 19.2.2. WebDAV and XML
        3. 19.2.3. WebDAV Headers
        4. 19.2.4. WebDAV Locking and Overwrite Prevention
        5. 19.2.5. The LOCK Method
          1. 19.2.5.1. The opaquelocktoken scheme
          2. 19.2.5.2. The <lockdiscovery> XML element
          3. 19.2.5.3. Lock refreshes and the Timeout header
        6. 19.2.6. The UNLOCK Method
        7. 19.2.7. Properties and META Data
        8. 19.2.8. The PROPFIND Method
        9. 19.2.9. The PROPPATCH Method
        10. 19.2.10. Collections and Namespace Management
        11. 19.2.11. The MKCOL Method
        12. 19.2.12. The DELETE Method
        13. 19.2.13. The COPY and MOVE Methods
          1. 19.2.13.1. Overwrite header effect
          2. 19.2.13.2. COPY/MOVE of properties
          3. 19.2.13.3. Locked resources and COPY/MOVE
        14. 19.2.14. Enhanced HTTP/1.1 Methods
          1. 19.2.14.1. The PUT method
          2. 19.2.14.2. The OPTIONS method
        15. 19.2.15. Version Management in WebDAV
        16. 19.2.16. Future of WebDAV
      3. 19.3. For More Information
    3. 20. Redirection and Load Balancing
      1. 20.1. Why Redirect?
      2. 20.2. Where to Redirect
      3. 20.3. Overview of Redirection Protocols
      4. 20.4. General Redirection Methods
        1. 20.4.1. HTTP Redirection
        2. 20.4.2. DNS Redirection
          1. 20.4.2.1. DNS round robin
          2. 20.4.2.2. Multiple addresses and round-robin address rotation
          3. 20.4.2.3. DNS round robin for load balancing
          4. 20.4.2.4. The impact of DNS caching
          5. 20.4.2.5. Other DNS-based redirection algorithms
        3. 20.4.3. Anycast Addressing
        4. 20.4.4. IP MAC Forwarding
        5. 20.4.5. IP Address Forwarding
        6. 20.4.6. Network Element Control Protocol
          1. 20.4.6.1. Messages
      5. 20.5. Proxy Redirection Methods
        1. 20.5.1. Explicit Browser Configuration
        2. 20.5.2. Proxy Auto-configuration
        3. 20.5.3. Web Proxy Autodiscovery Protocol
          1. 20.5.3.1. PAC file autodiscovery
          2. 20.5.3.2. WPAD algorithm
          3. 20.5.3.3. CURL discovery using DHCP
          4. 20.5.3.4. DNS A record lookup
          5. 20.5.3.5. Retrieving the PAC file
          6. 20.5.3.6. When to execute WPAD
          7. 20.5.3.7. WPAD spoofing
          8. 20.5.3.8. Timeouts
          9. 20.5.3.9. Administrator considerations
      6. 20.6. Cache Redirection Methods
        1. 20.6.1. WCCP Redirection
          1. 20.6.1.1. How WCCP redirection works
          2. 20.6.1.2. WCCP2 messages
          3. 20.6.1.3. Message components
          4. 20.6.1.4. Service groups
          5. 20.6.1.5. GRE packet encapsulation
          6. 20.6.1.6. WCCP load balancing
      7. 20.7. Internet Cache Protocol
      8. 20.8. Cache Array Routing Protocol
      9. 20.9. Hyper Text Caching Protocol
        1. 20.9.1. HTCP Authentication
        2. 20.9.2. Setting Caching Policies
      10. 20.10. For More Information
    4. 21. Logging and Usage Tracking
      1. 21.1. What to Log?
      2. 21.2. Log Formats
        1. 21.2.1. Common Log Format
        2. 21.2.2. Combined Log Format
        3. 21.2.3. Netscape Extended Log Format
        4. 21.2.4. Netscape Extended 2 Log Format
        5. 21.2.5. Squid Proxy Log Format
      3. 21.3. Hit Metering
        1. 21.3.1. Overview
        2. 21.3.2. The Meter Header
      4. 21.4. A Word on Privacy
      5. 21.5. For More Information
  8. VI. Appendixes
    1. A. URI Schemes
    2. B. HTTP Status Codes
      1. B.1. Status Code Classifications
      2. B.2. Status Codes
    3. C. HTTP Header Reference
      1. Accept
      2. Accept-Charset
      3. Accept-Encoding
      4. Accept-Language
      5. Accept-Ranges
      6. Age
      7. Allow
      8. Authorization
      9. Cache-Control
      10. Client-ip
      11. Connection
      12. Content-Base
      13. Content-Encoding
      14. Content-Language
      15. Content-Length
      16. Content-Location
      17. Content-MD5
      18. Content-Range
      19. Content-Type
      20. Cookie
      21. Cookie2
      22. Date
      23. ETag
      24. Expect
      25. Expires
      26. From
      27. Host
      28. If-Modified-Since
      29. If-Match
      30. If-None-Match
      31. If-Range
      32. If-Unmodified-Since
      33. Last-Modified
      34. Location
      35. Max-Forwards
      36. MIME-Version
      37. Pragma
      38. Proxy-Authenticate
      39. Proxy-Authorization
      40. Proxy-Connection
      41. Public
      42. Range
      43. Referer
      44. Retry-After
      45. Server
      46. Set-Cookie
      47. Set-Cookie2
      48. TE
      49. Trailer
      50. Title
      51. Transfer-Encoding
      52. UA-(CPU, Disp, OS, Color, Pixels)
      53. Upgrade
      54. User-Agent
      55. Vary
      56. Via
      57. Warning
      58. WWW-Authenticate
      59. X-Cache
      60. X-Forwarded-For
      61. X-Pad
      62. X-Serial-Number
    4. D. MIME Types
      1. D.1. Background
      2. D.2. MIME Type Structure
        1. D.2.1. Discrete Types
        2. D.2.2. Composite Types
        3. D.2.3. Multipart Types
        4. D.2.4. Syntax
      3. D.3. MIME Type IANA Registration
        1. D.3.1. Registration Trees
        2. D.3.2. Registration Process
        3. D.3.3. Registration Rules
        4. D.3.4. Registration Template
        5. D.3.5. MIME Media Type Registry
      4. D.4. MIME Type Tables
        1. D.4.1. application/*
        2. D.4.2. audio/*
        3. D.4.3. chemical/*
        4. D.4.4. image/*
        5. D.4.5. message/*
        6. D.4.6. model/*
        7. D.4.7. multipart/*
        8. D.4.8. text/*
        9. D.4.9. video/*
        10. D.4.10. Experimental Types
    5. E. Base-64 Encoding
      1. E.1. Base-64 Encoding Makes Binary Data Safe
      2. E.2. Eight Bits to Six Bits
      3. E.3. Base-64 Padding
      4. E.4. Perl Implementation
      5. E.5. For More Information
    6. F. Digest Authentication
      1. F.1. Digest WWW-Authenticate Directives
      2. F.2. Digest Authorization Directives
      3. F.3. Digest Authentication-Info Directives
      4. F.4. Reference Code
        1. F.4.1. File “digcalc.h”
        2. F.4.2. File “digcalc.c”
        3. F.4.3. File “digtest.c”
    7. G. Language Tags
      1. G.1. First Subtag Rules
      2. G.2. Second Subtag Rules
      3. G.3. IANA-Registered Language Tags
      4. G.4. ISO 639 Language Codes
      5. G.5. ISO 3166 Country Codes
      6. G.6. Language Administrative Organizations
    8. H. MIME Charset Registry
      1. H.1. MIME Charset Registry
      2. H.2. Preferred MIME Names
      3. H.3. Registered Charsets
  9. Index
  10. About the Authors
  11. Colophon
  12. Copyright

Product information

  • Title: HTTP: The Definitive Guide
  • Author(s): David Gourley, Brian Totty, Marjorie Sayer, Anshu Aggarwal, Sailu Reddy
  • Release date: September 2002
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781565925090