.he 'LIMLOG''Page %'
.fo 'Steve Hardy'- % -'October 1977'
.ce2
Limited Inference in Language Understanding
--------------------------------------------
.sp
This handout continues a description of language 'understanding'
systems started in the TEMPLATE handout. That handout described two
'stimulus response' models of language communication - ELIZA and
PARRY.
.sp
If one were to tell ELIZA, say, that: 
 		: [JOHN IS A FIREMAN]
.br
it would respond in a superficially appropriate way by printing out:
 		** [HOW INTERESTING]
.br
or		** [DO GO ON]
.br
ELIZA's lack of understanding creates problems if I later
ask:
 		: [IS JOHN A FIREMAN]
.br
It would 'bluff', perhaps by printing:
 		** [WHY DO YOU ASK]
.br
or		** [IS IT IMPORTANT TO YOU THAT JOHN BE A FIREMAN]
.br
.sp
A subtler example would be my telling ELIZA that:
 		: [ EVERY FIREMAN HAS REDBRACES]
.br
and later asking
 		: [DOES JOHN HAVE REDBRACES]
.br
.sp
Crucially, ELIZA and PARRY lack the ability to infer anything
from what they are told.
A natural development of the models is to add some 'inference machine' to a
stimulus-response system like ELIZA. An enhanced system of this type
could maintain some internal model of the world. Its 'responses'
to input 'stimuli' would involve operations on this model.
.sp
The CONVERSE demo describes a very crude program of this type. CONVERSE
has two types of knowledge - a database of facts about the world and programs
to manipulate this database. Its knowledge of the world is represented by 'association lists',
for example:
 	: VARS STEVE;
 	: [[HAIR BROWN] [SEX MALE] [FATHER FRANK]...]
 	: 	-> STEVE;
.br
Here I am represented by a list of associations between 'properties' (like
FATHER) and 'attributes' (like FRANK).
This list is stored as the value of the variable STEVE.
.sp
The CONVERSE system has a number of templates, such as:
 	: [WHAT IS THE ?PROP OF ?OBJECT]
.br
The 'response' associated with this pattern is a program to search the list
representing OBJECT for the attribute related to the property PROP.
Other patterns in CONVERSE cause additions to (or even creation of) association
lists. For example, the input:
 	: [THE WIFE OF FRANK IS MEG]
.br
matches the pattern
 	: [THE ?PROP OF ?OBJ IS ?VAL]
.br
and causes the running of an appropriate 'response function' to add the
new element [WIFE MEG] to the association list representing FRANK.
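The matching-and-response machinery just described is easy to sketch. Below is a rough re-expression in modern Python (the real CONVERSE is a POP11 program; the names match, tell, ask and world are my own inventions for illustration, not the demo's code):

```python
# A sketch of CONVERSE-style template matching (assumed names, not the
# demo's own code). Association lists become dicts; a ?WORD in a
# template is a variable that captures one word of the input.

def match(pattern, words):
    """Return variable bindings if the sentence fits the template, else None."""
    if len(pattern) != len(words):
        return None
    bindings = {}
    for p, w in zip(pattern, words):
        if p.startswith('?'):
            bindings[p[1:]] = w       # ?PROP captures one word
        elif p != w:
            return None               # literal words must agree
    return bindings

world = {'FRANK': {'SEX': 'MALE'}}    # each object is an 'association list'

def tell(sentence):
    b = match(['THE', '?PROP', 'OF', '?OBJ', 'IS', '?VAL'], sentence.split())
    if b:
        world.setdefault(b['OBJ'], {})[b['PROP']] = b['VAL']
        return ['OKAY']
    return ['DO GO ON']               # ELIZA-style bluff when nothing matches

def ask(sentence):
    b = match(['WHAT', 'IS', 'THE', '?PROP', 'OF', '?OBJ'], sentence.split())
    if b and b['PROP'] in world.get(b['OBJ'], {}):
        return [world[b['OBJ']][b['PROP']]]
    return ['I', 'DONT', 'KNOW']

tell('THE WIFE OF FRANK IS MEG')      # adds the new element to FRANK's list
```

After the tell, asking [WHAT IS THE WIFE OF FRANK] yields [MEG], while a question about an unstored property gets [I DONT KNOW].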
.sp
The CONVERSE system cannot make complicated inferences, being
restricted to simple storage and retrieval of facts.
.sp
NB. Since writing the CONVERSE demo I have written two new
packages for POP11. The first of these, DATABASE, provides a
representational system more powerful and convenient than association
lists; the second, PFUNCTIONS, simplifies the activation of
functions on a pattern-directed
basis. Using these two packages would simplify 
CONVERSE enormously, for example:
.tp8
 	: PFUNCTION WHATIS([WHAT IS THE ?P OF ?X]);
 	:	VARS R;
 	:	IF PRESENT([^P ^X ?R]) THEN
 	:		R
 	:	ELSE
 	:		[I DONT KNOW]
 	:	CLOSE
 	: END;
.br
.tp4
 	: PFUNCTION THEIS([THE ?P OF ?X IS ?R]);
 	:	ADD([^P ^X ^R]);
 	:	[OKAY - I HAVE NOTED THAT FACT]
 	: END;
.br
.tp9
 	: PFUNCTION DOANY([DO ANY OF ??LIST HAVE ?R AS ?P]);
 	:	UNTIL LIST = [] DO
 	:		IF PRESENT([^P ^(HD(LIST)) ^R]) THEN
 	:			[YES - ^(HD(LIST))]
 	:		EXIT;
 	:		TL(LIST) -> LIST;
 	:	CLOSE;
 	:	[I DONT KNOW]
 	: END;
.br
These functions create and access a database of items like:
 		[CAPITAL FRANCE PARIS]
.br
 		[SEX STEVE MALE]
.br
 		[WEALTH STEVE LOW]
.bb
The representation of the world used by CONVERSE is
limited (it is instructive to consider what things CONVERSE
cannot be told). Even within its limits, CONVERSE is a bit
sloppy. It would accept, at face value, the two assertions
 	: [THE CAPITAL OF FRANCE IS PARIS]
 	: [THE CAPITAL OF FRANCE IS LONDON]
.br
without worrying that FRANCE can't have more than one CAPITAL.
.sp
To cure this defect would require only relatively minor changes to the
THEIS function (to check for a contradiction).
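The repair might look like this in Python (a sketch only - the demo's THEIS is POP11, and the assumption that every property is single-valued is mine):

```python
# A THEIS-style response function that checks for a contradiction before
# storing a fact. Assumption (mine): each property of an object, such as
# CAPITAL, may hold only one value at a time.

facts = {}    # (property, object) -> attribute

def theis(prop, obj, val):
    old = facts.get((prop, obj))
    if old is not None and old != val:
        # refuse the new fact rather than store a contradiction
        return ['NO', '-', 'THE', prop, 'OF', obj, 'IS', old]
    facts[(prop, obj)] = val
    return ['OKAY', '-', 'I', 'HAVE', 'NOTED', 'THAT', 'FACT']

theis('CAPITAL', 'FRANCE', 'PARIS')    # accepted and stored
```

Now [THE CAPITAL OF FRANCE IS LONDON] would provoke [NO - THE CAPITAL OF FRANCE IS PARIS] instead of silent acceptance.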
.sp
Notice, though, that being able to detect inconsistencies is a skill
we have to add to CONVERSE. It doesn't come for free as an
implication of being able to represent the contradictory facts.
This seems reasonable enough to me; I am sure much of my own
knowledge is contradictory. This might be because, like CONVERSE, I don't
have the 'programs' to detect the inconsistency, or because I
don't have the time to run them.
(Unlike CONVERSE, there are lots of things I want to do; worrying
about inconsistencies in my knowledge is not of the utmost
importance).
.bb
Bertram Raphael, who has written an introduction to AI
called 'The Thinking Computer', developed a system called
SIR (Semantic Information Retrieval),
very like CONVERSE. Unlike CONVERSE, SIR could combine stored facts
to answer questions like
 	:	 [DOES JOHN HAVE REDBRACES]
.br
.sp
SIR's understanding of the syntax of English is very limited. This is because Raphael was concerned with
what the input meant, rather than how it was phrased.
.sp
Actually, SIR's grasp of meaning was also rather limited (hence the
'Limited Logic' in the title of this demo).
.sp
For the moment, I concern myself with only two types of assertion:
 	: [?OBJ IS A ?TYPE]
.br
and
 	: [EVERY ?TYPE HAS ?PART]
.br
There are two associated questions:
 	: [IS ?OBJ A ?TYPE]
.br
and
 	: [DOES ?OBJ HAVE ?PART]
.br
.sp
The two types of assertion can be used to generate hierarchical trees of relationships between objects; for example:
 	:	 [JOHN IS A FIREMAN]
 	:	 [FIREMAN IS A HUMAN]
 	:	 [HUMAN IS A OBJECT]
 	:	 [EVERY HUMAN HAS LEG]
 	:	 [EVERY OBJECT HAS WEIGHT]
 	:	 [EVERY LEG HAS FOOT]
 	:	 [FOOT IS A OBJECT]
.br
Notice that a PARRY-type preprocessor could be used to make SIR's input more
natural.
.sp
To answer questions like
 	: [DOES JOHN HAVE FOOT]
or
 	: [DOES FOOT HAVE WEIGHT]
.br
requires a complex chain of inferences:
finding some type of thing that has the desired attribute and then showing that JOHN is of that type. Even the latter task is not trivial.
How would you infer that [JOHN IS A HUMAN] given the above information -
can you make your intuitions sufficiently precise to
turn them into a program?
.sp
Here is a rough outline:
.in+8
To see if some object is of a given type, see if an appropriate ISA relationship is
stored explicitly. If not, find all explicit ISA relationships
between the object and some intermediate types, and then show that one of these
intermediate types is an instance of the required type.
.in-8
Translated into POP11 (and using DATABASE) this becomes:
.tp13
 	: PFUNCTION ISQ([IS ?OBJ A ?TYPE]);
 	:	IF PRESENT([ISA ^OBJ ^TYPE]) THEN
 	:		[YES]
 	:	ELSE
 	:		VARS TEMP;
 	:		FOREACH [ISA ^OBJ ?TEMP] DO
 	:		    IF REPLYTO([IS ^TEMP A ^TYPE]) = [YES] THEN
 	:			  [YES]
 	:		    EXIT
 	:		CLOSE;
 	:		[NO]
 	:	CLOSE
 	: END;
.br
Notice this function accesses a database of facts like:
.br
 		[ISA JOHN FIREMAN]
.br
 		[HAS FIREMAN REDBRACES]
.br
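The recursive search that ISQ performs (and that a companion HASQ would need) can be re-expressed in Python over the sample facts about JOHN given earlier. This is a sketch of the technique, not the demo's code:

```python
# The ISA hierarchy as a list of (relation, object, value) facts, and
# recursive queries over it. The database mirrors the sample assertions
# about JOHN above; this is a sketch, not the POP11 of the demo.

db = [('ISA', 'JOHN', 'FIREMAN'),
      ('ISA', 'FIREMAN', 'HUMAN'),
      ('ISA', 'HUMAN', 'OBJECT'),
      ('ISA', 'FOOT', 'OBJECT'),
      ('HAS', 'HUMAN', 'LEG'),
      ('HAS', 'OBJECT', 'WEIGHT'),
      ('HAS', 'LEG', 'FOOT')]

def isa(obj, typ):
    """Is obj of type typ, directly or through intermediate types?"""
    if ('ISA', obj, typ) in db:
        return True
    # try every intermediate type explicitly stored for obj
    return any(isa(mid, typ)
               for (rel, o, mid) in db if rel == 'ISA' and o == obj)

def has(obj, part):
    """Does obj have part - perhaps because some type obj belongs to
    has it, or because some part of obj has it in turn?"""
    for (rel, owner, p) in db:
        if rel != 'HAS':
            continue
        if owner == obj or isa(obj, owner):
            if p == part or has(p, part):
                return True
    return False
```

Note that has must chase both the ISA links (JOHN is a HUMAN, so he has a LEG) and the HAS links (a LEG has a FOOT, so JOHN has a FOOT).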
.sp
This assumes the existence of a function called REPLYTO,
perhaps defined thus:
 	: FUNCTION REPLYTO(SENTENCE);
 	:	GETONE(SENTENCE, [ISQ HASQ ...])
 	: END;
.br
.sp
This may be slightly redundant since there is only one function,
ISQ, to answer
questions of the form
 	: [IS ?OBJ A ?TYPE]
.br
A more advanced version of SIR might have to answer questions so
complex that there are several methods of answering them.
.bb
After Raphael had written his SIR system, Cordell Green, working
at Stanford University, produced a system called QA3 (it was his
third Question Answering program) which could manipulate 'first order
predicate logic'. This language is very powerful in that quite complex statements
can be represented. Green used a single 'rule of inference' (i.e. program
to infer new facts from old ones) called 'resolution', powerful enough
that it is the only one needed for the system to make any necessary
inference. QA3 wasn't intended as a natural language system, so its input is
very stylised:
 	: (IS(X,T) AND HAS(T,A)) IMPLY HAS(X,A);
.br
If X is of type T and T has attribute A, then X has attribute A.
 	: (IS(X,Y) AND IS(Y,Z)) IMPLY IS(X,Z);
.br
If X is of type Y and Y is of type Z, then X is of type Z.
 	: (HAS(X,Y) AND HAS(Y,Z)) IMPLY HAS(X,Z);
.br
If X has part Y and Y has part Z, then X has part Z.
 	: IS("JOHN", "FIREMAN");
.br
John is a fireman.
 	: HAS("FIREMAN", "REDBRACES");
.br
Firemen have redbraces.
 	: HAS("JOHN", "REDBRACES") =>
.br
Does John have red braces?
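Resolution itself is not easy to summarise, but the effect of the three IMPLY statements on this tiny database can be imitated by forward chaining - repeatedly applying them to the stored facts until no new IS or HAS facts appear. The Python below is my sketch of that weaker method, not of Green's resolution:

```python
# Forward chaining with QA3's three rules: IS is transitive, HAS is
# transitive, and members of a type inherit its attributes. (This is a
# sketch of a simpler method than resolution.)

def close_under_rules(facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        for (r1, a, b) in facts:
            for (r2, c, d) in facts:
                if b != c:
                    continue          # rules only chain through a shared middle term
                if r1 == 'IS' and r2 == 'IS':
                    new.add(('IS', a, d))
                if r1 == 'HAS' and r2 == 'HAS':
                    new.add(('HAS', a, d))
                if r1 == 'IS' and r2 == 'HAS':
                    new.add(('HAS', a, d))
        if not new <= facts:
            facts |= new
            changed = True
    return facts

facts = close_under_rules({('IS', 'JOHN', 'FIREMAN'),
                           ('HAS', 'FIREMAN', 'REDBRACES')})
# ('HAS', 'JOHN', 'REDBRACES') is now among the facts
```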
.sp
'Theorem proving' systems like QA3 are now widely considered to be not
very useful, and since their workings are hard to understand I don't want to write
a handout on them. If you are interested you could read "Resolution
as a basis for Question Answering" by Robinson in Machine Intelligence 3
(Edinburgh University Press).
.bb
Although SIR's representational language was very weak, the apparent
success of 'logic machines' like QA3 suggested that it was time
to concentrate more effort on extending the syntactic capabilities of
language understanding systems.
.sp
One straightforward experiment in this direction was Green's
BASEBALL system. This had a pre-defined database of facts about baseball games played
over one season. Each game had a number of stored attributes, such as time played,
winning team, losing team, location etc. The look-up routines return
a list of all games matching some template,
expressed by a partial set of attribute values.
.sp
Questions fed into the system were converted into these templates, for
example:
 		HOW MANY TEAMS WERE BEATEN BY
 		THE REDSOX AT HOUSTON STADIUM IN JUNE
.br
gives a game specification of
 	[[WINNER REDSOX] [DATE JUNE] [PLACE HOUSTON]]
.br
This specification is done by splitting the input into
preposition groups. A preposition group beginning with
BY 'fills' a WINNER 'slot', IN fills DATE slots, AT fills
PLACE slots, and so on. (BASEBALL
has a PARRY-type pre-processor to remove unknown words).
.sp
Key words (and phrases) like HOW MANY, WHO, WHEN, and WHERE, dictate
what is to be done with the retrieved list of matching games. "HOW MANY"
means print out the number of games. WHEN means print out the DATE,
and WHERE the PLACE. A little cleverness must be employed with WHO questions
since these might refer to either WINNER or LOSER. An obvious heuristic
is that if, as in the question:
 	WHO WAS BEATEN BY REDSOX IN MAY
.br
the WINNER slot is specified in the question, then the
questioner wants to know the LOSER.
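Both steps - building the game specification from preposition groups, and deciding which slot WHO asks about - are easy to sketch in Python. The word tables below are my own guesses at the flavour of the thing, not BASEBALL's actual vocabulary:

```python
# Slot filling by preposition groups, plus the WHO heuristic. The
# PREP_SLOT and NOISE tables are illustrative assumptions.

PREP_SLOT = {'BY': 'WINNER', 'IN': 'DATE', 'AT': 'PLACE'}
NOISE = {'HOW', 'MANY', 'TEAMS', 'WERE', 'WAS', 'BEATEN',
         'THE', 'WHO', 'STADIUM'}

def specification(question):
    """Turn a question into a partial set of attribute values."""
    words = [w for w in question.split() if w not in NOISE]
    spec = {}
    for prep, value in zip(words, words[1:]):
        if prep in PREP_SLOT:
            spec[PREP_SLOT[prep]] = value   # e.g. BY REDSOX -> WINNER REDSOX
    return spec

def who_slot(spec):
    # if the question already fixes the WINNER, WHO must mean the LOSER
    return 'LOSER' if 'WINNER' in spec else 'WINNER'

spec = specification('WHO WAS BEATEN BY REDSOX IN MAY')
# spec is {'WINNER': 'REDSOX', 'DATE': 'MAY'}, so who_slot(spec) is 'LOSER'
```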
.sp
A second heuristic is to determine the 'agent' and 'patient'
of the sentence.
A noun phrase after an active verb like BEAT is the 'patient'
and hence the LOSER; one before BEAT is the WINNER.
.sp
Notice that the above question could be rephrased as
 	IN MAY, WHO WAS BEATEN BY THE REDSOX
.br
The position of a preposition group doesn't matter.
.bb
Whilst BASEBALL broke free of the total rigidity of fixed templates, its
technique could work only in simple contexts. By the time it was written
(in the mid-sixties) there existed a number of computer
languages (like POP) which seemed more like natural languages in the
sense of syntactic variety. These computer languages were developed using
the notion of generative grammars proposed by Chomsky
(the Fontana 'Modern Master' book on Chomsky is worth reading
if you are interested). The SILLYSENT demo gives some ideas to play with
concerning generative grammars.
.sp
As a last example of limited logic systems, consider the STUDENT
system, developed by Danny Bobrow. This system solves simple
'story algebra' problems of a type you probably remember (with hate?)
from school maths lessons.
I think the STUDENT system is quite clever in its own way - at least it
doesn't give that feeling of being conned that I get from systems like ELIZA.
.sp
This system is described in the STUDENT demo.
