
Like the BACON systems we have looked at, DALTON employs heuristics

to search a constrained space of solutions. However, unlike BACON,

whose heuristics were chiefly domain-independent algebraic manipulations

chosen to find regularities in clean data, the heuristics used in DALTON

are embodiments of Avogadro's atomic hypothesis. In this sense,

Langley et al. characterize DALTON as a "theory driven" system,

whereas BACON is seen to be a "data driven" system. Chapter 8 shows

how a simple version of atomic theory can lead to some of the

same "discoveries" made by John Dalton, and how a single refinement

can account for an error of John Dalton's.




The DALTON system takes a high-level description of a chemical reaction

(such as "hydrogen and oxygen combine to form water") and attempts

to produce a description of this reaction that is consistent with

Avogadro's atomic hypothesis.  Simply put, Avogadro's hypothesis was

that compounds are composed of molecules and that molecules are in

turn composed of atoms. This model of chemistry is built into the

production rules of the DALTON system, which in turn make several

assumptions about the nature of a correct solution and the relevant

constraints upon operator application.



DALTON "knows", for example, that oxygen, hydrogen, and nitrogen are

chemical elements, as opposed to more complex compounds, and it

also assumes that a molecule of an element is composed of some integral

number of atoms.  These assumptions are used implicitly by the production rules

of the system. The first three rules together are sufficient to "discover"

John Dalton's molecular model of the reaction in which hydrogen and

oxygen combine to form water. These rules are:




1) SPECIFY-MOLECULES. Given a high-level description of a reaction,

DALTON guesses the number of molecules of each substance that

are involved. These guesses are small integers, starting at 1 and progressing

to some small, but unspecified, upper bound. If no operator application

at lower levels turns out to be consistent with a guess,

DALTON will backtrack to this operator and try

the next larger integer.



2) SPECIFY-ELEMENTS. Given a (partially) instantiated description of

a reaction, this operator guesses the number of atoms

that make up a molecule of an element. Thus it is necessary for

the system to know in advance that oxygen and hydrogen are elements

and that water is not. Once again, guesses begin at 1 and iterate on

failure up to some unspecified upper bound. Note that this behavior,

particularly beginning with 1 and only trying larger numbers if no

model is consistent with this guess, effectively produces John Dalton's

rule of greatest simplicity. We shall have more to say about this

below.



3) SPECIFY-COMPOUND. Given a reaction in which all but one of the

substances (presumably the product) is fully specified, this operator

attempts to find a consistent specification for the remaining

compound. This operator assumes a fundamental principle of conservation,

i.e., both that all quantities present in the reaction are preserved

and that atoms are in fact indivisible. Given these constraints, this operator

searches for a match between the structural descriptions of the

inputs and outputs of the reaction. If one is found, the system will

halt with a description of the reaction; otherwise, this operator fails and

the search must backtrack to applications of the SPECIFY-ELEMENTS and/or

SPECIFY-MOLECULES operators.
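
The interplay of the three operators can be sketched in a few lines of Python. This is only an illustration of the search as summarized above, not the authors' production-system implementation: the function names, the data layout, and the upper bound max_guess are all assumptions made for the example, and nested loops stand in for the backtracking.

```python
from itertools import product

# Hypothetical sketch of the three-operator search; names and the
# bound max_guess are illustrative assumptions, not from the source.

def specify_compound(elements, mol_counts, atoms_per_mol, n_product):
    """Conservation check: share all input atoms equally among the
    product molecules. Returns the product's formula, or None if some
    atom would have to be split."""
    formula = {}
    for elem in elements:
        total_atoms = mol_counts[elem] * atoms_per_mol[elem]
        if total_atoms % n_product != 0:
            return None  # atoms are indivisible, so this guess fails
        formula[elem] = total_atoms // n_product
    return formula

def dalton_search(elements, max_guess=4):
    """Search for the simplest consistent model of the reaction
    (element, ..., element -> compound)."""
    k = len(elements)
    guesses = range(1, max_guess + 1)
    # SPECIFY-MOLECULES: molecule counts for each input and the product.
    for counts in product(guesses, repeat=k + 1):
        mol_counts = dict(zip(elements, counts[:k]))
        n_product = counts[k]
        # SPECIFY-ELEMENTS: atoms per molecule of each element.
        for atoms in product(guesses, repeat=k):
            atoms_per_mol = dict(zip(elements, atoms))
            # SPECIFY-COMPOUND: try to fill in the remaining compound.
            formula = specify_compound(
                elements, mol_counts, atoms_per_mol, n_product)
            if formula is not None:
                return mol_counts, atoms_per_mol, n_product, formula
    return None  # no consistent model within the bound

result = dalton_search(["hydrogen", "oxygen"])
```

Because every guess starts at 1, the first consistent model for hydrogen and oxygen has one single-atom molecule of each element forming one molecule of water, so the sketch returns the formula {"hydrogen": 1, "oxygen": 1}, i.e. water as HO.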



There are several points worth noting about this third operator. First,

part of its heuristic power comes from the fact that it expects all

structures to be consistent with all previous models it has constructed.

If it turns out that it cannot explain a reaction given the hypothesized

structures involved in a previously examined reaction, DALTON will find

a solution for the current reaction and then try to find a solution for the

earlier problems that is consistent with this solution. Thus history

becomes important to DALTON, and it is conceivable that its final

model of atomic structure will depend upon the order in which

problems are presented to it.



Furthermore, from the description presented, it appears that DALTON

is only capable of analyzing reactions of the form 



	(element,element,...,element -> compound)



which is of limited usefulness and generality.


Langley et al. show how these three operators can be used to derive the

same explanation of the (hydrogen, oxygen -> water) reaction that

was proposed by John Dalton. They point out that though this explanation

is now considered incorrect, a correct explanation would have been

obtained (presumably by both John Dalton and DALTON) if the 

(hydrogen, oxygen -> hydrogen peroxide) reaction had been considered first.

Even so, this latter reaction would itself still not be modeled correctly.



To overcome this deficiency of the model, the authors show how another

heuristic rule can be introduced to model Avogadro's belief that the

ratios of combining volumes were directly related to the number

of molecules involved in the reaction. This heuristic, which they

call INFER-MULTIPLES, assumes that the number of molecules involved

in a reaction (of gases) is an integral multiple of the volume of the

gas. Given just this new rule, and data concerning the relevant

volumes, DALTON (which they rename DALTON*) is capable of

constructing yet another incorrect but consistent explanation of the

(hydrogen, oxygen -> water) reaction.
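
To make the effect of the volume constraint concrete, here is a small self-contained Python sketch. It is an illustration under stated assumptions rather than the authors' rule: the function name dalton_star, the bounds max_multiple and max_atoms, and the integer encoding of volumes are all invented for the example. The volume data used (two volumes of hydrogen and one of oxygen yielding two volumes of water vapor) are the familiar combining volumes for this reaction.

```python
from itertools import product

# Hypothetical sketch of a search constrained by INFER-MULTIPLES; the
# names and bounds are illustrative assumptions, not from the source.

def dalton_star(input_volumes, product_volume, max_multiple=3, max_atoms=4):
    """input_volumes maps each gaseous element to its combining volume
    (a small integer); product_volume is the volume of the compound."""
    elements = list(input_volumes)
    # INFER-MULTIPLES: molecule counts are an integral multiple k of
    # the observed volumes, so they are no longer free guesses.
    for k in range(1, max_multiple + 1):
        mol_counts = {e: k * v for e, v in input_volumes.items()}
        n_product = k * product_volume
        # SPECIFY-ELEMENTS: atoms per molecule, simplest guesses first.
        for atoms in product(range(1, max_atoms + 1), repeat=len(elements)):
            formula = {}
            for elem, n_atoms in zip(elements, atoms):
                total_atoms = mol_counts[elem] * n_atoms
                if total_atoms % n_product != 0:
                    break  # would split an atom: reject this guess
                formula[elem] = total_atoms // n_product
            else:
                # SPECIFY-COMPOUND succeeded for every element.
                return mol_counts, dict(zip(elements, atoms)), n_product, formula
    return None

model = dalton_star({"hydrogen": 2, "oxygen": 1}, product_volume=2)
```

In this sketch the first consistent model has two one-atom hydrogen molecules and one two-atom oxygen molecule forming two molecules of water with formula {"hydrogen": 1, "oxygen": 1}. The volumes are respected and atoms are conserved, yet water still comes out as HO, which illustrates how a model can be consistent and still incorrect.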



The implication of the work to this point is that we can find the same

result as John Dalton given only SPECIFY-MOLECULES, SPECIFY-ELEMENTS,

and SPECIFY-COMPOUND. The new heuristic effectively embodies the

atomic theory of Gay-Lussac, and adding it to the simpler account

does in fact give us the same (incorrect) result arrived at by Gay-Lussac.






