TODO



Outline(?):

if sequence changes, then revalidation required
  sampling not enough
  ordering is
    only one way to change a unix machine -- syscall interface
      make sure same syscalls in same order
    behavior will always change if disk state changes
      cat /dev/*
problem domain
  simplicity, harried admins, commercial environment
  remove unpredictability, nondeterminism
    through validation, repeatability, invariance [alva]
      log files, user data not invariant, not subject to validation
      comment lines in config files are subject to validation
        otherwise why have comment lines?  they are human code
  congruence, not convergence
    diminishing returns as automation increases [Northrup]
  enterprise-wide consistent builds and subsequent management
    continuous management over life of machine
      some operations can be done live
      some must be done at boot
      most operations must be repeated 
      no rebuilds in production
      no testing in production
      no convergence
    more you automate, the easier [layers]
      convergence is opposite
    testing is role of admin, not vendor
      vendor unit testing
      admin integration testing
  full auto
    can't be done halfway
    politics main issue until we reach critical mass of automated sites
  problem solving vs. corrective action
    humans solve problems
    code implements action
  tend to be commercial
    why
  don't tend to be academic
    why
      athena exception
  should they be different?	
disk state
  2^N
  only spot checks are possible
    suitable only for (partial) convergence, not congruence
    2^(N-S)
    cfengine
      encourages use of spot checks
      100k is .005 percent of a 2GB disk
      any rule can be triggered by 2^(N-S) disk states
        non-deterministic?
    sampling cannot be deterministic, no matter how careful
      action triggered by result of sample can be retriggered by future change 
      editing prior actions 
  ignore it
    ISconf
      act only on permanent history
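
The disk-state figures above can be checked with a short calculation. This is a rough sketch only: it assumes binary units (1k = 1024 bytes, 2 GB = 2 * 1024^3 bytes), and the function names are invented for illustration. It works with exponents rather than the state counts themselves, since 2^N for a real disk is far too large to print.

```python
# Figures from the outline; binary units (1k = 1024 bytes) are an assumption.
SAMPLE_BYTES = 100 * 1024        # the 100k sample
DISK_BYTES = 2 * 1024**3         # the 2 GB disk


def sample_percent(sample_bytes, disk_bytes):
    """Percentage of the disk that the sample actually inspects."""
    return 100.0 * sample_bytes / disk_bytes


def unseen_bits(sample_bytes, disk_bytes):
    """Return N - S in bits; 2**(N - S) disk states match any one sample."""
    return 8 * (disk_bytes - sample_bytes)


if __name__ == "__main__":
    print("sample covers %.4f%% of the disk"
          % sample_percent(SAMPLE_BYTES, DISK_BYTES))
    print("disk states matching one sample: 2**%d"
          % unseen_bits(SAMPLE_BYTES, DISK_BYTES))
```

Under these assumptions the sample covers roughly .005 percent of the disk, and every bit it does not inspect doubles the number of disk states that look identical to the rule.
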
sequence
  is a full expression of a host
    managed
    unmanaged
  2!
    hacmp cluster build
    culling and leaves
  orthogonality
    local admin talent -- who decides?
      difficult to train
  reproducible outcome
    testing and validation
  changing order
    means changing set
    randomization
      N! means infeasible to prove same outcome
        N as low as 8 or 9
      full regression math 
      genetic algorithm 
        build is 1-hour fitness function
          DNA 8 = 4.6 years
  determining set
    discovery order
  dependency graph
    8-30 deep for hacmp
    breaking implicit order breaks build 
      show examples
  discovering sequence is easy
    discovery order
    P, P'
      ISconf makefile edit
  deterministic works
    easy to train
    start at beginning of host life
    only add new to end
    testing and validation
  new hosts
    create by prototype, not class
    include known subsequences
      internal ordering preserved
  ISconf/make
    expresses sequenced groups of sequenced operations
    human error (missing deps) masked by deterministic make behavior
      randomized known to break
    only one possible disk state represented by given timestamps
      deterministic state transitions
  convergence-only tools
    don't care about sequence
    can't do it
    "can't just tack things onto end" [alva]
      must recreate convergence via rewrite
        editing prior actions
        admin can't be depended on to detect non-orthogonal subsystems
  security
    not exploited
  unforeseen dependencies
    cause build failures during development, common failure mode
  dependent order between peers
    x43,x42
      commutativity
    cfengine 1.X ignores?
    make implicitly avoids altering order
  parallelizing bad
  multiple admin agents bad
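
Two points from the sequence notes can be sketched in a few lines: the N! cost of testing every ordering, and the fact that two operations need not commute. The toy "disk" (a dict of path to content) and the operations install_app and tune_app are hypothetical, made up for this illustration; they do not come from ISconf, cfengine, or any real build.

```python
import math
from itertools import permutations

HOURS_PER_YEAR = 24 * 365


def years_to_try_all_orderings(n_ops, hours_per_build=1.0):
    """Time to build once for every ordering of n_ops operations."""
    return math.factorial(n_ops) * hours_per_build / HOURS_PER_YEAR


# Two hypothetical operations on a toy "disk" (a dict of path -> content).
def install_app(disk):
    """Write the package's default config, overwriting any existing one."""
    new = dict(disk)
    new["/etc/app.conf"] = "port=80"
    return new


def tune_app(disk):
    """Edit the config, but only if the package has put one there."""
    new = dict(disk)
    if "/etc/app.conf" in new:
        new["/etc/app.conf"] = "port=8080"
    return new


def distinct_outcomes(ops):
    """Apply every permutation of ops to an empty disk; collect end states."""
    outcomes = set()
    for ordering in permutations(ops):
        disk = {}
        for op in ordering:
            disk = op(disk)
        outcomes.add(tuple(sorted(disk.items())))
    return outcomes


if __name__ == "__main__":
    print("8 operations, 1-hour builds: %.1f years"
          % years_to_try_all_orderings(8))
    print("end states for 2 operations: %d"
          % len(distinct_outcomes([install_app, tune_app])))
```

With a one-hour build, eight operations already give 8! = 40320 builds, which is the 4.6 years noted above. The two toy operations end in different disk states depending on which runs first, so even a set of two is not safe to reorder.
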
chomsky's hierarchy
  four levels
  state
    configuration tools pretend to be in here
      cfengine 
        detect/act
        no permanent history
      ISconf 
        timestamp state
        act only on permanent history
  turing
    useful model for illustrating points
    machines+config tool combo actually here
    difficult for admin to avoid turing equivalence
      more complexity than simple state mods means difficult to predict
      avoid pitfalls through ordering
    reload ruleset?
    cfengine
      one-tape 
    ISconf
      two-tape
    all collapse into one-tape
      self-modifying
        avoid pitfalls thru ordering
        direct
        indirect -- kernel shared lib etc.
          difficult to predict
          must be assumed self-modifying
      church?
      n-tape to 1-tape equivalence theorem
      complexity requires simplification
        ordering is a simplification
church
  order/outcome undecidable?
  discovering a valid ordering is like halting problem
table comparing cfengine, isconf
out of band changes
  security breach
LSB
  "transitivity of validation"
cannot prove that two configurations always exhibit same behavior?
  so two configuration operations undecidable by lifting?
test environments
  needed
  testing in production bad
enterprise consistency via reproducible sequences
  allows us to ignore orthogonality
'make' isn't the only way to go
ISconf
  stable for years
    barrier problem
  cloth wings and piano wire
  6 hours to implement at cat
  4 days clearing fud first
future
  kernel level support to intercept open() etc.
ISconf vs cfengine
  orthogonal to each other
  combination of convergence and congruence might be ideal
alva's recap

Joel comments Thu Aug 1 14:02:12 PDT 2002 rogerwilco.com: Turing equivalence is a factor now with UNIX machines; we no longer have discrete hardware channel controllers and the like. There is no more out-of-band management, except for a mounted root partition. We have evolved to a model that is nearly UTM-equivalent. To get out of Turing equivalence, we would need to get rid of in-band changes.
Order doesn't matter
An automated administration tool utilizing a sufficiently descriptive language will be able to detect the current disk state. It can do this through sampling a subset of the disk. The tool will be able to take the correct action in all cases. Reproducible change order is not required.
Order matters
It is not possible to forecast and code for all possible machine states; if there are N bits on a disk, then there are 2^N possible disk states. Sampling is deceptive; a 100k sample cannot reliably detect the state of a 2GB disk. The only method of creating reproducible change is to deterministically order the changes made to a disk. Reproducible change is required in order to validate a build or an enterprise-wide update.
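
The contrast between the two positions can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of how any real tool works: the disk is a dict of path to content, and the paths, contents, and helper names (sample, replay) are all hypothetical.

```python
# Toy model: a "disk" is a dict of path -> content. All names are hypothetical.
BASELINE = {"/etc/app.conf": "port=80"}


def sample(disk, paths):
    """Inspect only the listed paths, as a sampling tool would."""
    return {path: disk.get(path) for path in paths}


def replay(baseline, history):
    """Apply an ordered list of (path, content) changes to a baseline image."""
    disk = dict(baseline)
    for path, content in history:
        disk[path] = content
    return disk


# Two hosts that differ only in a file the sample never looks at.
host_a = replay(BASELINE, [("/lib/libfoo.so", "v1")])
host_b = replay(BASELINE, [("/lib/libfoo.so", "v2")])
SAMPLED_PATHS = ["/etc/app.conf"]

# One ordered change history, replayed on two fresh copies of the baseline.
HISTORY = [("/lib/libfoo.so", "v2"), ("/etc/app.conf", "port=8080")]
host_c = replay(BASELINE, HISTORY)
host_d = replay(BASELINE, HISTORY)

if __name__ == "__main__":
    same_sample = sample(host_a, SAMPLED_PATHS) == sample(host_b, SAMPLED_PATHS)
    print("samples match:", same_sample)
    print("disks match:", host_a == host_b)
    print("replayed hosts match:", host_c == host_d)
```

In this model the sample reports hosts a and b as identical although their disks differ, while two hosts built by replaying the same ordered history from the same baseline are identical by construction.
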

[ possible para about how sampling subsets of disk content to detect and act on current state is non-reentrant. show how this, too, highlights the importance of ordering ]