Failure Is Not An Option: How to build reliable computer systems from unreliable parts using Open Source software.

By Jeff Silverman
  1. Acknowledgements
  2. Introduction: The problem
  3. Management and organizations
    1. Cost and Value
    2. Roles and functions
    3. Creating a reliable organization
      1. Ethics
      2. Communication
      3. Job descriptions
    4. Outsourcing - Make or Buy?
  4. Support:
    1. Facilities
    2. Shipping and Receiving
    3. Inventory
    4. Accounting: accounts payable, accounts receivable, auditing
    5. Security
  5. Technology part 1: Design - Why? What? Where? When?
    1. Establishing design goals
      1. Cost
      2. Performance
      3. Reliability
      4. Security
      5. Ease of accommodating change (expansion and contraction, new software)
    2. Developing implementation designs - How?
      1. Strategy
        1. A quick introduction to the laws of probability
        2. Failure resistance
        3. Fail safe
        4. Failure tolerant
          1. Components
          2. Computers - VIPs
          3. applications
          4. networks of computers
          5. rooms/buildings
      2. Physical Security
      3. Facilities
        1. Disaster recovery site and how to amortize its cost
        2. Design rules for the facility
          1. Measure power consumption
          2. Measure cooling load
        3. Accommodate future growth
      4. Communications infrastructure
        1. The OSI Network model
        2. Layer 3 and 4
        3. Source Network Address Translation and Destination Network Address Translation
        4. Virtual Servers
      5. Software
        1. Make or buy (or obtain)?
        2. Software packages
        3. Single tier vs. multi tier
        4. Programming languages
      6. Monitoring
      7. Logging
  6. Technology part 2: Implementation - how?
    1. Processes and process documentation
    2. Hardware
    3. Software
      1. Application
      2. Network services
        1. DNS
        2. Authentication
          1. NIS (yp)
          2. Kerberos
          3. LDAP
        3. NTP
      3. Backup
      4. Security
        1. Intrusion prevention
        2. Intrusion detection
        3. port scanning prevention
        4. security in the backend
      5. system monitoring
      6. Fail over
        1. lvs - ipvsadm
        2. keepalived
        3. The high availability daemon ha.d
      7. database
      8. Webserver
        1. Apache
        2. khttpd
      9. Log processing
      10. Monitoring
    4. Network infrastructure
      1. Network Address Translation
    5. Monitoring
      1. Low level - ping testing
      2. Low level - networking service testing: NTP, DNS, NIS (yp), LDAP, Kerberos
      3. Medium level - SNMP
      4. Medium level - memory, CPU utilization checking
      5. Daemon level - SMTP (E-mail)
      6. Application level - health check
    6. Logging
  7. Technology part 3: Operations - Why? Who?
    1. Monitoring
    2. Responding to problems
      I have a dream: an automatic application start from a cold metal machine.
    3. System administrator training
    4. Auditing - SOX, HIPAA, FACTA
    5. Projects
  8. Case Study
  9. Conclusions
    1. The four way arms race - performance, reliability, security, cost
    2. Technology trends
    3. Outsourcing
  10. Stories
  11. Glossary and Acronyms.
  12. Index, appendices, bibliography
        The ideas page.

        Typography

        This book was written in HTML and it uses typography to convey meaning.  Stories from the trenches are shown in this italic font.  Computer dialogs have keyboard input, sample output, and variables.  Where the sequence goes through menus, tabs, buttons and the like, the labels are rendered in strong emphasis.

        Things that are important (most of the text in this book, if truth be told, is not that important) are labeled important.

        Text that cautions you against stupid mistakes that will not injure or kill you is labeled with caution.

        Text that warns you about things that are potentially injurious or lethal is labeled with warning.

$Log: index.html,v $
Revision 1.1.1.1  2006/10/01 23:36:20  cvsuser
Initial checkin to CVS
      
Revision 1.2  2006/09/20 21:22:45  jeffs
Updated the acknowledgements list

Revision 1.1 2006/01/05 06:02:19 jeffs
Initial revision