-
Get a monthly update on best practices for delivering successful software.
As anyone knows who has been to a Business Rules presentation, it's all about RETE. Who has the best RETE? Which engines are mere pretenders and don't actually have RETE? Which poor slobs have only partially baked version of RETE?
Which brings us to the overwhelming question: what is RETE and why do you need it? By way of answering this questions, we need to go back to what a BRE really is. In most cases, a BRE is an inference engine or production system. According to the Wikipedia, a production system is
...a collection of productions (rules), a working memory
of facts and an algorithm, known as forward chaining, for producing new
facts from old. A rule becomes eligible to "fire" when its conditions
match some set of elements currently in working memory. A conflict
resolution strategy determines which of several eligible rules (the
conflict set) fires next. A condition is a list of symbols which
represent constants, which must be matched exactly; variables which
bind to the thing they match and "<> symbol" which matches a
field not equal to symbol.
What does this actually mean? It means you have conditions on the left hand side (the "IF" part) that if matched trigger some action on the right hand side (the "THEN" part). The right hand side may add, delete or modify facts that will cause new rules to match. For example, I may have a fact that "Bob has a Cat" and the rules that "IF Person has a Cat THEN Person hates Dogs" and "IF Person hates Dogs THEN Person is a Mailman." The first time through my rules I see that "IF Bob has a Cat THEN Bob hates Dogs" fires, and now we have two facts, namely "Bob has a Cat" and "Bob hates Dogs." Now another rule matches, namely "IF Bob hates Dogs THEN Bob is a Mailman," and after firing we have three facts, namely "Bob has a Cat," "Bob hates Dogs," and "Bob is a Mailman." So from the initial fact "Bob has a Cat" we inferred that "Bob is a Mailman." Essentially these inference engines chain syllogisms to come up with new facts.
There's a little more to it than that, such as if we have several rules that match, which one gets triggered first? If we have a bunch of rules that matched on an original set of facts and those facts get changed, then the rules that no longer match have to be kicked out. Still, you get the idea.
If you think of implementing a production system the naive way, you would scan the left hand side for matches, pick the one you like, then rescan the left hand side for matches if the facts changed and repeat. That rescanning of the left hand side gets very expensive if you have a lot of rules and facts. In fact, it was so expensive in the olden days that nobody really used production rules except as objects of academic study.
Enter RETE. Designed by Charles Forgy in 1979, it took the approach of only reevaluating those parts of the left hand side that were effected by a change in the facts. The implementation uses a network (RETE is Latin for "net") to organize terms, conditions and rules so that facts come in the top and rules (or rule retractions) come out the bottom. The RETE algorithm was a huge improvement in performance over the naive approach and made production systems -- or Business Rule Engines -- feasible.
Still, if you aren't doing inferencing, i.e. you are only doing a single pass on the rules or you aren't modifying the facts, there are other algorithms and approaches that in many circumstances blow the doors off of RETE.
If you're problem involves cleaning data ("If the zip code doesn't match the city and state") or checking for eligibility for discounts ("If the customer has purchased more than $5000 in the last 3 months"), then you may in fact not need an inference engine. If, however, you are doing compex financial or clinical reasoning, you may well need inferencing capabilities (Clinical Events->Diagnosis->Recommended Action).
Fortunatley, many commercial vendors support other models of rule evaluation than RETE, so even if you've already invested in one you're not stuck using RETE. Some even support hybrid approaches, i.e. some RETE and some rule chaining. So, before you fall in love with a rule engine because it has the best RETE implementation, ask yourself: do I really need inferencing capabilities in my BRE, or is my problem like most and can make use of a simpler solution?
Related posts:
Topics: Business Rules Engines
Many commercial engines like iLog and JESS provide backward chaining features. iLog also has a sequential mode, which is similar to LEAPS algorithm. In my own implementation, I support a variety of options, like turning off the alphaMemory in cases where an application doesn’t need to modify or retract facts. I’ve also implemented additional objectTypeNode hashing so that applications with lots of simple rules will not see a significant decrease in performance. For example, I ran some tests with 8K rules (http://woolfel.blogspot.com/2005/12/constant-performance-is-achievable.html) to demonstrate constant performance is achievable in “very” specific use cases.
across the rule engine world, there’s also Mindbox, which uses a CBR engine. FairIssac has the RETEIII engine, which also has strengths in certain areas.
peter
Comment by Peter Lin, Tuesday, April 18, 2006 @ 10:21 am