BRE Patterns II: The Transformer

My last post in this series covered the Fact Harvest pattern for using business rule engines (BREs). My intent with these patterns is not to offer deep architectural or design insight, but to help BRE novices pierce the jargon fog that leaves them shaking their heads over how to use these beasts. This time we'll be talking about "The Transformer."

Pattern Name: The Transformer

Synopsis
Suppose you want to make many transformations on a dataset but don't know all of the details of what you need to do up front. You could use a Little Language or Interpreter pattern to compose various transformation operations dynamically, but making such a solution perform well would involve a fair amount of work. Instead, you could use a BRE to perform the transformations efficiently without having to develop your own domain-specific language.

Context
Take the case of an ETL (Extraction, Transformation and Loading) process. This is the sort of technology you might employ if you provide data analytics for a particular industry (automotive, health care, insurance, etc.) and you ingest huge amounts of data in varying formats. Typically you would use expensive commercial tools like Informatica or Ascential, or the open source tool Kettle (have a look), to wrangle the data into shape and load it into a database of some sort.

These ETL tools provide various plugins that can be chained together in a data transformation process. You can even write your own plugins in C, C++, Java or other languages. You can embed a BRE as a transformation plugin in one of these ETL tools. Your "facts" would be individual rows of data that are modified by your rules. For example, to do some simpleminded address cleaning, we might use rules like this one:

 

IF dirtyrecord.ZIPCODE equals georef.ZIPCODE
AND dirtyrecord.CITY does not equal georef.CITY
THEN
    cleanrecord.CITY = georef.CITY
END
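
To make the rule concrete, here is a minimal Python sketch of the same logic written by hand, outside of any rule engine. The ZIPCODE and CITY field names come from the rule above; everything else (the clean_city function, the dictionary-based rows and the sample reference data) is an assumption made up for illustration and is not part of any ETL tool or BRE API.

```python
# Hypothetical sketch: rows are plain dicts, georef is a ZIP -> city lookup.

def clean_city(dirty_record, georef_by_zip):
    """Return a cleaned copy of dirty_record.

    If the record's ZIPCODE appears in the reference data and its CITY
    disagrees with the reference city, the reference city wins.
    """
    clean_record = dict(dirty_record)  # copy so the input row is not mutated
    ref_city = georef_by_zip.get(dirty_record.get("ZIPCODE"))
    if ref_city is not None and dirty_record.get("CITY") != ref_city:
        clean_record["CITY"] = ref_city
    return clean_record


# Illustrative reference data, invented for this example.
GEOREF = {"60601": "Chicago", "10001": "New York"}

rows = [
    {"ZIPCODE": "60601", "CITY": "Chicgo"},    # misspelled city, known ZIP
    {"ZIPCODE": "10001", "CITY": "New York"},  # already clean
    {"ZIPCODE": "99999", "CITY": "Nowhere"},   # ZIP not in reference data
]
cleaned = [clean_city(row, GEOREF) for row in rows]
```

In a BRE, the condition and action would live in the rule set rather than in compiled plugin code, which is what makes it possible to change them without rebuilding the transformation step.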

If you are processing truly huge amounts of data and your processing requirements are not that complex, you may need to evaluate rule execution algorithms other than RETE for better performance. Many of the commercial vendors provide such alternate execution algorithms; ILOG, for example, has a sequential execution mode for large numbers of independent rules. As always, do your own evaluation rather than trust vendor claims.
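
As a rough illustration of the idea behind sequential execution, the sketch below applies each rule at most once per row, in a fixed order, with no re-matching after a rule changes the row. This is a simplified assumption about what "sequential" means here, not a description of how ILOG or any other vendor implements it. The Rule class, the run_sequential function and the STATE field are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, str]


@dataclass
class Rule:
    name: str
    condition: Callable[[Row], bool]  # the IF part
    action: Callable[[Row], None]     # the THEN part, mutates the row


def run_sequential(rules: List[Rule], row: Row) -> List[str]:
    """Apply each rule at most once, in list order; return fired rule names."""
    fired = []
    for rule in rules:
        if rule.condition(row):
            rule.action(row)
            fired.append(rule.name)
    return fired


# Two independent, made-up cleaning rules.
rules = [
    Rule("trim-city",
         lambda r: r["CITY"] != r["CITY"].strip(),
         lambda r: r.update(CITY=r["CITY"].strip())),
    Rule("upper-state",
         lambda r: not r["STATE"].isupper(),
         lambda r: r.update(STATE=r["STATE"].upper())),
]

row = {"CITY": " Chicago ", "STATE": "il"}
fired = run_sequential(rules, row)
```

Because the rules do not depend on each other's results, a single pass is enough, and the engine can skip the bookkeeping needed to track which rules become eligible again after a fact changes.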

Consequence
The various commercial and open source ETL tools do have canned logic for tasks such as normalization or name and address cleaning, and they also have transformation plugins that allow for scripting. Embedding a BRE allows you to deploy powerful new data transformation rules more quickly than with scripting or custom plugins written in a language like C. Using the pseudo-natural language capabilities of BREs, it is possible to construct a DSL (Domain Specific Language) more quickly and with far greater runtime performance than building it from scratch or using other, less powerful tools.

Again, I hope this example is useful to you and leaves one less person shaking their head and asking: "How the heck do I use a Business Rules Engine?"

