Artificial Intelligence

Introduction: The candidate elimination algorithm keeps two lists of hypotheses consistent with the training data:

(i) the set of most specific hypotheses S, and

(ii) the set of most general hypotheses G. Together these two boundary sets are enough to derive the whole version space VS.

Steps:

  1. Initialize G to the set of maximally general hypotheses in H
  2. Initialize S to the set of maximally specific hypotheses in H
  3. For each training example X:
      • If X is positive: generalize S if necessary
      • If X is negative: specialize G if necessary
  4. Output {G, S}

Step (a) Positive examples

If X is positive:

  • Remove from G any hypothesis inconsistent with X
  • For each hypothesis h in S that is not consistent with X:
      • Remove h from S
      • Add to S all minimal generalizations h' of h such that h' is consistent with X and some member of G is more general than h'
  • Remove from S any hypothesis that is more general than another hypothesis in S

Step (b) Negative examples

If X is negative:

  • Remove from S any hypothesis inconsistent with X
  • For each hypothesis h in G that is not consistent with X:
      • Remove h from G
      • Add to G all minimal specializations h' of h such that h' is consistent with X and some member of S is more specific than h'
  • Remove from G any hypothesis that is less general than another hypothesis in G
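The update rules above can be sketched in Python for conjunctive attribute-value hypotheses, the representation classically used with this algorithm. A hypothesis is a tuple of per-attribute constraints: a concrete value, '?' (any value), or None (no value). The attribute domains and training data below are illustrative assumptions, not from the text.

```python
# Sketch of candidate elimination for conjunctive hypotheses over
# discrete attributes. A hypothesis is a tuple of constraints:
# a concrete value, '?' (any value), or None (no acceptable value).

def matches(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general(h1, h2):
    """True if h1 is more general than or equal to h2."""
    if any(c is None for c in h2):      # h2 matches no instance at all
        return True
    return all(c1 == '?' or c1 == c2 for c1, c2 in zip(h1, h2))

def min_generalization(h, x):
    """The unique minimal generalization of h that covers instance x."""
    return tuple(v if c is None else (c if c == v else '?')
                 for c, v in zip(h, x))

def min_specializations(g, x, domains):
    """All minimal specializations of g that exclude instance x."""
    specs = []
    for i, c in enumerate(g):
        if c == '?':
            for v in domains[i]:
                if v != x[i]:
                    specs.append(g[:i] + (v,) + g[i + 1:])
    return specs

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [tuple([None] * n)]   # maximally specific boundary
    G = [tuple(['?'] * n)]    # maximally general boundary
    for x, positive in examples:
        if positive:
            # Remove from G any hypothesis inconsistent with x,
            # then minimally generalize the members of S that need it.
            G = [g for g in G if matches(g, x)]
            S = [h for h in (s if matches(s, x) else min_generalization(s, x)
                             for s in S)
                 if any(more_general(g, h) for g in G)]
            S = [s for s in S
                 if not any(s != t and more_general(s, t) for t in S)]
        else:
            # Remove from S any hypothesis inconsistent with x,
            # then minimally specialize the members of G that need it.
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                else:
                    new_G += [h for h in min_specializations(g, x, domains)
                              if any(more_general(h, s) for s in S)]
            G = [g for g in new_G
                 if not any(g != h and more_general(h, g) for h in new_G)]
    return S, G

# Illustrative run with three attributes (Sky, AirTemp, Humidity):
domains = [('Sunny', 'Rainy'), ('Warm', 'Cold'), ('Normal', 'High')]
examples = [
    (('Sunny', 'Warm', 'Normal'), True),
    (('Sunny', 'Warm', 'High'), True),
    (('Rainy', 'Cold', 'High'), False),
]
S, G = candidate_elimination(examples, domains)
print('S =', S)   # [('Sunny', 'Warm', '?')]
print('G =', G)   # [('Sunny', '?', '?'), ('?', 'Warm', '?')]
```

After the two positive examples S has generalized to ('Sunny', 'Warm', '?'), and the negative example then specializes G; every hypothesis in the version space lies between these two boundaries.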

The candidate elimination algorithm is guaranteed to converge to the correct target hypothesis provided that:
a) No errors exist in the examples
b) The target concept is included in the hypothesis space H

If there are errors in the examples:
a) The correct hypothesis will be inconsistent with the erroneous example and will therefore be eliminated.
b) If the S and G sets converge to an empty version space, we have evidence that the true concept lies outside the hypothesis space H.
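This collapse can be demonstrated on a deliberately tiny hypothesis space. In the sketch below a hypothesis over a single attribute is 'ø' (matches nothing), a concrete value, or '?' (matches everything); feeding the same instance first as positive and then as negative, as a labeling error would, empties both boundary sets. The representation and data are illustrative.

```python
# Candidate elimination over one attribute with domain {'A', 'B'}.
# A hypothesis is 'ø' (matches nothing), a value, or '?' (matches all).
DOMAIN = ('A', 'B')

def matches(h, x):
    return h == '?' or h == x

def more_general(g, h):
    """True if g is more general than or equal to h."""
    return g == '?' or h == 'ø' or g == h

def generalize(s, x):
    """Minimal generalization of s that covers instance x."""
    if s == 'ø':
        return x
    return s if s == x else '?'

def run(examples):
    S, G = ['ø'], ['?']
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]
            S = [h for h in (generalize(s, x) for s in S)
                 if any(more_general(g, h) for g in G)]
        else:
            S = [s for s in S if not matches(s, x)]
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                else:
                    # Specialize g to each value that excludes x, keeping
                    # only specializations above some member of S.
                    new_G += [v for v in DOMAIN
                              if v != x and any(more_general(v, s) for s in S)]
            G = new_G
    return S, G

print(run([('A', True)]))                # consistent data: (['A'], ['?'])
print(run([('A', True), ('A', False)]))  # contradictory labels: ([], [])
```

With consistent data the boundaries bracket the target as expected; with the contradictory labels both S and G become empty, signaling that no hypothesis in the space is consistent with the data.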