Wallace has created tens of thousands of patterns in AIML that can help interpret English input.
I like AIML, and it certainly was a breakthrough ten years ago. But I am experiencing some challenges with it.
Amy Iris has an architecture with the following four components:
- Parsing
- Knowledge Base
- Context Management
- User Interface
An issue I have with AIML in its current form is that the first three components are intertwined. The Parser and Knowledge Base are the most tightly coupled of all (to allow the recursive calls that AIML writes as "<srai>").
And I am currently thinking that standard "Alice-style" Context Management is just not as powerful as I'd like. AIML allows you to work with a couple of pre-defined Context variables (called "that" and "topic"). In addition, AIML allows you to set your own context variables. But context management seems to be one of the weakest links in Alice.
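To make the idea concrete, here is a minimal sketch of that context as a standalone object: the two predefined variables plus a dictionary for anything the bot author sets. The class and method names are my own for illustration, not part of AIML or PyAIML.

```python
class Context:
    """Conversation state, kept apart from the parser and knowledge base."""

    def __init__(self):
        self.that = ""        # the bot's previous reply
        self.topic = ""       # the current topic of conversation
        self.variables = {}   # any further variables the bot author sets

    def set(self, name, value):
        # "topic" is predefined; everything else is a user-defined variable.
        if name == "topic":
            self.topic = value
        else:
            self.variables[name] = value

    def get(self, name, default=""):
        if name == "topic":
            return self.topic
        return self.variables.get(name, default)


ctx = Context()
ctx.that = "DO YOU LIKE MOVIES"             # remembered after the bot speaks
ctx.set("topic", "MOVIES")
ctx.set("favorite_movie", "Casablanca")     # hypothetical user-defined variable
```

Even in this toy form the limitation shows: the context is a flat bag of strings, with only one previous reply and one topic at a time.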
So I'm experimenting with separating the four functions in my AIML processor. Separation should permit "best in class" substitution of each component. Should be interesting to see how this goes.
Parsing with AIML
I thought it might help to try to explain how AIML parses. If you load the ALICE brain into PyAIML or another AIML processor, it will build a giant parse tree (sometimes called a trie). At the root level, there are about 2900 branches (in ALICE), which are the choices for the first word of input. So if you type in the sentence "What time is it?", you will go down the "what" branch (which is one of those 2900 branches). Off of the node at the end of the "what" branch, there is a "time" branch. And off that node, there's an "is" branch, and off of that node is an "it" branch. Finally, there's an answer at the "it" node.
Of course there are some wildcard branches as well, which add to the complexity.
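Here is a rough sketch of such a tree in Python, with one branch per word and a "*" branch that absorbs one or more words. It is a simplification under my own assumptions: the names (Node, add_pattern, match) are made up for illustration and are not PyAIML's API, and real AIML has more wildcard types and matching priorities than shown here.

```python
class Node:
    def __init__(self):
        self.children = {}    # word -> Node, one entry per branch
        self.template = None  # the answer stored where a pattern ends


def add_pattern(root, pattern, template):
    """Walk down the tree word by word, creating branches as needed."""
    node = root
    for word in pattern.upper().split():
        node = node.children.setdefault(word, Node())
    node.template = template


def normalize(text):
    """Drop punctuation and upper-case the input, then split into words."""
    cleaned = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return cleaned.upper().split()


def match(node, words):
    """Return the template reached by following `words`, or None."""
    if not words:
        return node.template
    # Try the exact-word branch first.
    child = node.children.get(words[0])
    if child is not None:
        found = match(child, words[1:])
        if found is not None:
            return found
    # Fall back to the wildcard branch, letting it absorb 1..n words.
    star = node.children.get("*")
    if star is not None:
        for i in range(1, len(words) + 1):
            found = match(star, words[i:])
            if found is not None:
                return found
    return None


root = Node()
add_pattern(root, "WHAT TIME IS IT", "clock answer")
add_pattern(root, "WHAT IS *", "definition answer")

time_reply = match(root, normalize("What time is it?"))
wild_reply = match(root, normalize("What is a trie"))
```

Both patterns share the single "WHAT" branch at the root, so len(root.children) is 1 here, where the full ALICE brain has about 2900.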
The tree thins out as you go down. For example, there are only 240 branches off the "What" node. There are far fewer things that make sense (to ALICE and in general) as a second word (after "What") than there are first words. Admittedly, in common English, there are far more than 240 - since nearly any noun would work in that context. But in common conversation, surprisingly few choices come up after the word "what" - more than 240 I'm sure, but significantly fewer than the entire English vocabulary.
This is a really nice way to organize the parse tree. My frustration is with the re-writes. If the user says "Do you know what time it is?", standard AIML definitions will reduce that to "What time it is", eliminating the part that says "Do you know", and parse the remainder*. So far, so good - the parser will use recursion to continue its work.
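A toy version of that reduce-and-recurse step might look like the following. In real AIML the reduction is written as a category whose template re-submits the wildcard text through <srai>; here I fake it with a hypothetical list of leading phrases to drop, and a placeholder answer table standing in for the parse tree.

```python
# Hypothetical reduction rules: leading phrases dropped before re-parsing.
REDUCTIONS = [
    ("DO", "YOU", "KNOW"),
    ("CAN", "YOU", "TELL", "ME"),
]

# Placeholder answers, standing in for the templates in the parse tree.
ANSWERS = {
    ("WHAT", "TIME", "IT", "IS"): "Time to check a clock.",
}


def respond(words, depth=0):
    """Answer directly if possible, else strip a known prefix and recurse."""
    words = tuple(words)
    if depth > 10:  # guard against runaway recursion
        return None
    if words in ANSWERS:
        return ANSWERS[words]
    for prefix in REDUCTIONS:
        if len(words) > len(prefix) and words[:len(prefix)] == prefix:
            # The srai-like step: re-parse whatever is left over.
            return respond(words[len(prefix):], depth + 1)
    return None


reply = respond("DO YOU KNOW WHAT TIME IT IS".split())
```

Note that the remainder keeps the user's word order ("WHAT TIME IT IS"), which is why the answer table above is keyed on that phrasing.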
But in some of the templates, answer text is embedded right alongside a recursive call. That works well for the ALICE chatbot, but not for my purposes. I am looking for a separation of powers - don't mix the parsing function with the knowledge base function.
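One way to picture the separation I am after: the parser only maps input to a symbolic label, and a separate knowledge base turns that label into an answer. This is a sketch of the idea, not how my processor or PyAIML actually works; the labels (ASK_TIME, ASK_NAME) and function names are invented for illustration.

```python
from datetime import datetime

# Parser side: patterns map to symbolic labels, never to answer text.
PATTERNS = {
    ("WHAT", "TIME", "IS", "IT"): "ASK_TIME",
    ("WHAT", "IS", "YOUR", "NAME"): "ASK_NAME",
}


def parse(text):
    """Return a label for the input, or None if nothing matches."""
    cleaned = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return PATTERNS.get(tuple(cleaned.upper().split()))


# Knowledge base side: labels map to answers; no parsing happens here.
KNOWLEDGE = {
    "ASK_TIME": lambda: datetime.now().strftime("%H:%M"),
    "ASK_NAME": lambda: "Amy Iris",
}


def answer(label):
    """Look up the handler for a label and call it, or return None."""
    handler = KNOWLEDGE.get(label)
    return handler() if handler is not None else None


label = parse("What time is it?")
name_reply = answer(parse("What is your name?"))
```

With this split, either table can be swapped out without touching the other, which is the "best in class" substitution I mentioned earlier.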
*It'd be more appropriate, I think, to reduce to "What time is it", as opposed to the current "What time it is". AIML definitions could be written to do this, but I don't believe that they are in the standard AIML (AAA) definitions.
