by Ed Perley
This page describes the planning for a program I intend to design in the future. The goal is a program that can correctly interpret English text entered at a keyboard and give a meaningful reply based on its knowledge of words.
Some years ago, I designed a math program on an OSI computer that could parse inquiries in English. For instance, if I entered, "What is five plus two divided by 6?", the program would respond with the correct answer. It could understand the English spellings of numbers from zero to ninety-nine. As an added touch, I designed it so that it could understand inquiries in French as well. Four math operations were allowed: addition, subtraction, multiplication, and division. For all practical purposes, I think one could say that the computer running this program actually understood English, within the narrow context of mathematics.
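The code of that OSI program is not reproduced here, but the number-word part of such a parser might look roughly like the following C sketch. The table names and the functions find_word and number_from_words are hypothetical, and the sketch assumes lowercase input with compound numbers hyphenated, as in "forty-two".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical lookup tables; the original program's internals are not described. */
static const char *const SMALL[] = {
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen"
};
static const char *const TENS[] = {
    "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"
};

/* Return the index of the first `len` characters of `word` in `table`, or -1. */
static int find_word(const char *word, size_t len,
                     const char *const *table, int count)
{
    for (int i = 0; i < count; i++) {
        if (strlen(table[i]) == len && strncmp(word, table[i], len) == 0)
            return i;
    }
    return -1;
}

/* Convert "seven", "forty", or "forty-two" to its value (0 to 99).
   Returns -1 if the text is not a recognized number word. */
int number_from_words(const char *text)
{
    assert(text != NULL);
    size_t len = strlen(text);

    /* Zero through nineteen are single words. */
    int small = find_word(text, len, SMALL, 20);
    if (small >= 0)
        return small;

    /* Otherwise expect a tens word, optionally followed by "-" and a unit. */
    const char *hyphen = strchr(text, '-');
    size_t tens_len = hyphen ? (size_t)(hyphen - text) : len;
    int tens = find_word(text, tens_len, TENS, 8);
    if (tens < 0)
        return -1;
    int value = (tens + 2) * 10;   /* TENS[0] is "twenty" */
    if (hyphen == NULL)
        return value;

    /* Only "one" through "nine" may follow the hyphen. */
    int unit = find_word(hyphen + 1, strlen(hyphen + 1), SMALL, 10);
    if (unit < 1)
        return -1;
    return value + unit;
}
```

With these assumptions, number_from_words("forty-two") returns 42, and an unrecognized word such as "eleventy" returns -1. Handling the four operations and the French vocabulary would need further tables of the same kind.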
Getting a computer to understand math questions is relatively easy, since mathematics is very precise, and computers are essentially numerical machines. But how do you get a computer to understand something like "John lives in a red house"? It is a much more daunting challenge, because nonmathematical words are so much less precise. So how can a machine "learn" the meaning of a word, and how to use it properly and relevantly in a sentence?
The C language appears to be the best choice for this program, because a program of this type benefits from linked lists and high speed. The program I envision is able to understand the meaning of a word by how it relates to other words in the program's memory.
The words will be stored in a string array, to which the user can continually add new words. The words will be related to each other by a complex system of linked lists inspired by the connections between neurons in the brain. Each word will have an associated code in a second array that indicates whether it is a noun, adjective, verb, or adverb. A third table will contain pointers, one for each word. Each pointer will point to a structure I will call a neuron.
Each neuron will have six pointers. Pointer #1 will point back to the word the neuron is associated with. If more than four different words are to be related to that word, Pointer #6 will point to another neuron, which will be associated with the same word. This structure will allow any number of words to relate to a single word.
The other pointers will point to words that relate to its word in some way. The logical structure of the program will probably contain the elements shown below.
A string array of words:
Type: noun, verb, adjective, adverb, logical, other
Pointer 1: Points back to word or to a neuron pointing to it.
Pointer 2: Points to another word
Pointers 3 to 5: The same as Pointer 2, each with its own set of modifiers
Pointer 6: Points to another neuron if another is needed to relate to more words.
The modifiers are numerical codes that tell how the word relates to the other words it is pointing at.
The first modifier will indicate whether the word being pointed to has:
The second modifier will be used to relate nouns to verbs. It could range from 0 for a combination that would never be valid to 10 for a combination that could be considered one hundred percent valid. For instance, if the two words were "tree" and "run", the modifier would be zero. If the words were "dog" and "run", the modifier could be five, to indicate that a dog will run some of the time.
Additional modifiers could be used to indicate the compatibility of adjectives with nouns and of adverbs with verbs.
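As a rough illustration, the word tables, neurons, and noun-to-verb modifier described above might be sketched in C as follows. This is only a sketch under assumptions, not a settled design: words are referenced by their index in the string array instead of by raw pointers, the array sizes are arbitrary, and the names Link, Neuron, add_word, relate, validity_of, and neuron_count are all hypothetical. The first modifier is carried as a plain field because its meaning is not yet settled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 100         /* assumed capacity */
#define MAX_LEN 32            /* assumed maximum word length, including '\0' */
#define LINKS_PER_NEURON 4    /* Pointers 2 to 5 */

enum WordType { NOUN, VERB, ADJECTIVE, ADVERB, LOGICAL, OTHER };

struct Link {
    int target;          /* index of the related word, or -1 if unused */
    int first_modifier;  /* meaning not settled; left at 0 in this sketch */
    int validity;        /* second modifier: 0 (never valid) to 10 (always valid) */
};

struct Neuron {
    int word;                             /* Pointer 1: the word this neuron belongs to */
    struct Link links[LINKS_PER_NEURON];  /* Pointers 2 to 5, each with its modifiers */
    struct Neuron *next;                  /* Pointer 6: overflow neuron for the same word */
};

static char words[MAX_WORDS][MAX_LEN];     /* the string array of words */
static enum WordType types[MAX_WORDS];     /* type code for each word */
static struct Neuron *neurons[MAX_WORDS];  /* first neuron for each word */
static int word_count = 0;

/* Store a new word with its type and return its index. */
int add_word(const char *text, enum WordType type)
{
    assert(word_count < MAX_WORDS && strlen(text) < MAX_LEN);
    strcpy(words[word_count], text);
    types[word_count] = type;
    neurons[word_count] = NULL;
    return word_count++;
}

/* Allocate an empty neuron for `word`; returns NULL if out of memory. */
static struct Neuron *new_neuron(int word)
{
    struct Neuron *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->word = word;
    n->next = NULL;
    for (int i = 0; i < LINKS_PER_NEURON; i++) {
        n->links[i].target = -1;
        n->links[i].first_modifier = 0;
        n->links[i].validity = 0;
    }
    return n;
}

/* Link word `from` to word `to`, chaining a further neuron through
   Pointer 6 when the existing ones are full. Returns 0, or -1 if out of memory. */
int relate(int from, int to, int validity)
{
    assert(from >= 0 && from < word_count && to >= 0 && to < word_count);
    struct Neuron **slot = &neurons[from];
    for (;;) {
        if (*slot == NULL && (*slot = new_neuron(from)) == NULL)
            return -1;
        for (int i = 0; i < LINKS_PER_NEURON; i++) {
            struct Link *link = &(*slot)->links[i];
            if (link->target == -1) {
                link->target = to;
                link->validity = validity;
                return 0;
            }
        }
        slot = &(*slot)->next;   /* all four links used: move along the chain */
    }
}

/* Return the second modifier for the pair, or -1 if the words are not linked. */
int validity_of(int from, int to)
{
    for (const struct Neuron *n = neurons[from]; n != NULL; n = n->next)
        for (int i = 0; i < LINKS_PER_NEURON; i++)
            if (n->links[i].target == to)
                return n->links[i].validity;
    return -1;
}

/* Count how many neurons are chained to a word. */
int neuron_count(int word)
{
    int count = 0;
    for (const struct Neuron *n = neurons[word]; n != NULL; n = n->next)
        count++;
    return count;
}
```

With "dog", "tree", and "run" added through add_word, calling relate(dog, run, 5) and relate(tree, run, 0) records the two examples above, and validity_of(dog, run) then returns 5. Relating a fifth word to "dog" makes relate allocate a second neuron and attach it through the last pointer.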
A hierarchy of nouns and verbs will be set up, with the most general terms at the top, branching to more and more precise ones. For the nouns, the most basic terms could be: Matter, Energy, Time, and Spirit. For the verbs, the most basic terms could be: To Be, To Go, To Change, or To Change Something.
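One possible way to picture the noun hierarchy in C is a set of terms that each point to the more general term above them. This is a sketch only. The four root terms come from the list above, while "animal" and "dog", their placement under Matter, and the names Term, is_kind_of, and depth are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct Term {
    const char *name;
    const struct Term *parent;   /* the more general term, or NULL at the top */
};

/* The most basic noun terms. */
const struct Term MATTER = { "Matter", NULL };
const struct Term ENERGY = { "Energy", NULL };
const struct Term TIME   = { "Time",   NULL };
const struct Term SPIRIT = { "Spirit", NULL };

/* Hypothetical, more precise terms; their placement is an assumption. */
const struct Term ANIMAL = { "animal", &MATTER };
const struct Term DOG    = { "dog",    &ANIMAL };

/* Return 1 if `term` is `general` itself or lies somewhere beneath it. */
int is_kind_of(const struct Term *term, const struct Term *general)
{
    assert(general != NULL);
    for (; term != NULL; term = term->parent) {
        if (term == general)
            return 1;
    }
    return 0;
}

/* Number of steps from `term` up to its most basic term. */
int depth(const struct Term *term)
{
    assert(term != NULL);
    int steps = 0;
    while (term->parent != NULL) {
        term = term->parent;
        steps++;
    }
    return steps;
}
```

Under these assumptions, is_kind_of(&DOG, &MATTER) returns 1 and is_kind_of(&DOG, &ENERGY) returns 0. The verb hierarchy could use the same structure with To Be, To Go, To Change, and To Change Something at the top.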
The computer will be programmed to understand logical words, such as AND, OR and NOT.
The user will add new words to the basic logic structure. Each word will be placed in the hierarchy of words according to how the user answers questions the program asks about it. Based on the answers it gets, the program will set up the word's neurons and point the appropriate neurons to each other.
Exactly how this program will actually run requires more thought on the author's part. The ultimate goal is to allow a reasonable dialog between the computer and the user, as was possible with the math program described above. It might be desirable for the program to have some limited ability to modify itself automatically as it learns new words and word relationships.
To simplify the problem somewhat, the program will only deal with the present, and will ignore the flow of time.
There will be no work on this program in the near future. However, further information about it will be provided when and if the design process begins.