Anti-patterns in Code: Ubiquitous HashMap

I consider this an anti-pattern for parameter passing. The name describes both a specific data structure and the mechanisms surrounding its (ab)use. It involves a way of modeling information where the semantics of the data may depend on disparate parts of the system. I have given this pattern other names in the past, such as Great Big Bucket In The Sky, Passing Parameters Under The Table, Stuff In The Context Of Everything, or just Big Giant HashMap. You get the idea. I will describe a specific case, and then explain why I feel this approach is problematic.
First, however, a story that just so happens to be rooted in fact...
Once upon a time there was a data structure used to store thread-local, user-specific data (user name and a request token) for all classes participating in a transaction. This proved to be so useful that the same data structure was used to store even more information about both the user and the transaction itself. Eventually the data structure grew to contain information irrelevant to its originally stated purpose, such as exception data thrown or handled during processing, logging and debug information, system-level credentials used to authenticate to other systems during processing, ad nauseam.
Rather than dealing with the growing mess, the team partitioned the original data structure into two or three copies of itself and named each one after the type of data being stuffed into it. Somehow this did not stop others from piling on to the mess. Quite the opposite: it spurred others to create their own Ubiquitous HashMaps so that distant pieces of code could pass information under the covers. The system became brittle, and people complained about the situation but found it difficult to change.
While I name this anti-pattern the “Ubiquitous HashMap”, the problem has less to do with the ubiquity of the data structure (or the fact that it tends to be a 'HashMap') than with the way in which it opens itself up to abuse.
Take the following sample code that starts the ball rolling:
    import java.util.HashMap;

    public class BigContextInTheSky {
        // One map per thread, created the first time that thread asks for it.
        private static ThreadLocal<HashMap<String, Object>> context;

        static {
            context = new ThreadLocal<HashMap<String, Object>>() {
                @Override
                protected HashMap<String, Object> initialValue() {
                    return new HashMap<String, Object>();
                }
            };
        }
    }
If you like where this is going in this hypothetical example, you might be tempted to add the following methods, thereby making the current user’s name available to anything executing in the same thread, avoiding the annoyance of having to pass the current user's name all the way down the stack.
    public static String getCurrentUserName() {
        return (String) context.get().get("name");
    }

    public static void setCurrentUserName(String name) {
        context.get().put("name", name);
    }
Seems harmless so far, but there are already problems with this approach. First, there are temporal issues, and the contract is not clear. Without further investigation, we do not know who exactly sets the name, and when. Callers of this class seem just as likely to overwrite the name as to read it. How do you ensure the name does not change depending on where you are in the call stack? Even more to the point, when is it safe to read the current user’s name, and what happens if it has not been initialized yet? If we are dealing with other, mutable types, when is it safe to hold a reference to objects in the Ubiquitous HashMap?
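To make the temporal questions concrete, here is a minimal sketch that reuses the accessors above. The class name, the `auditAsSystemUser` helper, and the user names are hypothetical, and `ThreadLocal.withInitial` assumes Java 8 or later. It shows a read before anyone has set the name, and a write buried in a helper whose signature gives no hint of it.

```java
import java.util.HashMap;
import java.util.Map;

public class TemporalCouplingDemo {
    // Same shape as the context above, reduced to what the demo needs.
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    public static String getCurrentUserName() {
        return (String) CONTEXT.get().get("name");
    }

    public static void setCurrentUserName(String name) {
        CONTEXT.get().put("name", name);
    }

    // Hypothetical helper somewhere deep in the call stack that quietly changes the name.
    public static void auditAsSystemUser() {
        setCurrentUserName("system");
    }

    // Returns what a caller observes before, between, and after the calls.
    public static String[] observe() {
        CONTEXT.remove();                          // start with a fresh map on this thread
        String beforeInit = getCurrentUserName();  // nobody has set the name yet
        setCurrentUserName("alice");
        String afterSet = getCurrentUserName();
        auditAsSystemUser();                       // nothing in the signature reveals the write
        String afterHelper = getCurrentUserName();
        return new String[] { beforeInit, afterSet, afterHelper };
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println("before init:  " + seen[0]); // null
        System.out.println("after set:    " + seen[1]); // alice
        System.out.println("after helper: " + seen[2]); // system
    }
}
```

The caller that set "alice" has no way to know, short of reading every method it calls, that the name is "system" by the time control returns.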
While many of the issues (such as the ones above) mirror the typical set of problems you find dealing with global variables, there are two sub-categories here related to the thread-local nature of this data structure. First, components of the application are soon incapable of participating reliably in multi-threaded transactions. For example, introducing a single JMS call necessitates either managing two ubiquitous maps and synchronizing one against the other, or else avoiding any code that uses the ubiquitous map in the new thread.
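The hand-off problem is easy to reproduce without any messaging infrastructure. The sketch below is an illustration only (the class and method names are hypothetical, and a plain `Thread` stands in for whatever thread a JMS listener would run on): a value placed in the thread-local map by the caller is simply not there for code running on another thread.

```java
import java.util.HashMap;
import java.util.Map;

public class ThreadHandOffDemo {
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    // Sets the name on the calling thread, then reports what a new thread sees.
    public static String readNameInNewThread(String nameSetByCaller) {
        CONTEXT.get().put("name", nameSetByCaller);

        final String[] seenByWorker = new String[1];
        Thread worker = new Thread(() ->
                seenByWorker[0] = (String) CONTEXT.get().get("name"));
        worker.start();
        try {
            worker.join(); // after join(), the worker's write is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByWorker[0];
    }

    public static void main(String[] args) {
        // The worker gets its own empty map, so the lookup yields null.
        System.out.println("worker sees: " + readNameInNewThread("alice"));
    }
}
```

Any code that relies on the map has to either copy its contents across by hand or stay out of the second thread, which is exactly the choice described above.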
Second, it is ridiculously expensive to maintain in the long run. Any cruft that the class attracts will be much harder to refactor than in a normal class, because of the same temporal issues mentioned above. Furthermore, while it is possible to test around this mechanism, it requires effort to do so, and this effort is not a one-time cost. It will need to be repeated for every test against a caller of the ubiquitous map, even if it means creating a harness for tests to execute in whenever they may require the services of the Ubiquitous HashMap. If you avoid mocking out values in the Ubiquitous HashMap during one test, that test may pick up modifications to the map made by tests which ran earlier in the same thread, producing unwanted side-effects at best, or false positives at worst.
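A small sketch of that leak, with two plain methods standing in for test cases so no test framework is assumed. All names here are hypothetical. The second "test" never touches the map, yet its result depends on whether the first one ran before it on the same thread.

```java
import java.util.HashMap;
import java.util.Map;

public class TestLeakDemo {
    private static final ThreadLocal<Map<String, Object>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    // Hypothetical code under test: greets whoever the context says is current.
    public static String greet() {
        Object name = CONTEXT.get().get("name");
        return name == null ? "Hello, guest" : "Hello, " + name;
    }

    // "Test" 1 sets up a user and forgets to clean up afterwards.
    public static String firstTest() {
        CONTEXT.get().put("name", "alice");
        return greet();
    }

    // "Test" 2 expects the guest greeting and never touches the context.
    public static String secondTest() {
        return greet();
    }

    // The recurring harness cost: discard this thread's map before each test.
    public static void resetContext() {
        CONTEXT.remove();
    }

    public static void main(String[] args) {
        resetContext();
        System.out.println(firstTest());  // Hello, alice
        System.out.println(secondTest()); // Hello, alice (leaked from the first test)
        resetContext();
        System.out.println(secondTest()); // Hello, guest
    }
}
```

The reset call is cheap to write once, but it has to be remembered in every test class that reaches the map, directly or indirectly.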
Be wary of these kinds of issues, and if you are currently dealing with them, seek to contain the scope. Writing package dependency tests will at least put a stake in the ground to keep them quarantined until you find a more permanent solution.
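As a rough idea of what such a quarantine test might look like, here is a deliberately naive sketch. It assumes the source text of each class is already loaded into a map, and the package names are invented for illustration. A real project would more likely lean on a dedicated dependency-analysis tool than on string matching.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ContextQuarantineCheck {
    // Hypothetical names: the quarantined class and the one package allowed to use it.
    static final String QUARANTINED = "com.example.context.BigContextInTheSky";
    static final String ALLOWED_PACKAGE = "com.example.web";

    // sources maps a fully qualified class name to that class's source text.
    // Returns the classes outside the allowed package that import the quarantined class.
    public static List<String> findViolations(Map<String, String> sources) {
        List<String> violations = new ArrayList<>();
        for (Map.Entry<String, String> entry : sources.entrySet()) {
            boolean usesContext = entry.getValue().contains("import " + QUARANTINED + ";");
            boolean allowed = entry.getKey().startsWith(ALLOWED_PACKAGE + ".");
            if (usesContext && !allowed) {
                violations.add(entry.getKey());
            }
        }
        return violations;
    }

    // Small in-memory sample so the check can be exercised without a file system.
    public static Map<String, String> sampleSources() {
        Map<String, String> sources = new LinkedHashMap<>();
        sources.put("com.example.web.LoginFilter",
                "import " + QUARANTINED + ";\nclass LoginFilter {}");
        sources.put("com.example.billing.InvoiceService",
                "import " + QUARANTINED + ";\nclass InvoiceService {}");
        sources.put("com.example.billing.TaxTable",
                "class TaxTable {}");
        return sources;
    }

    public static void main(String[] args) {
        // Only InvoiceService is outside the allowed package and imports the context.
        System.out.println(findViolations(sampleSources()));
    }
}
```

Run as part of the build, a check along these lines fails as soon as a new package starts depending on the map, which is the stake in the ground described above.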
As for what that solution entails, I believe any such solution would be implementation-dependent. The Ubiquitous HashMap seems to come about as a reaction to an unwieldy or overly-stateless architecture, so in order for its use to be less appealing, the system's architecture needs to provide easy answers to managing this kind of data across the application. I find the best defense against Ubiquitous HashMap to be a clean architecture, and in this case an ounce of prevention is truly worth a pound of cure.
Topics: Java
Comments: 4 so far

> …the typical set of problems you find dealing with global variables
> …a reaction to an unwieldy or overly-stateless architecture
I’ve seen this often when you’re down to crunch-time and you need a high-level piece of data way down in the call-stack (or vice-versa) and you just can’t change 1000 method signatures to include the parameter(s) by sunrise. So, you fall victim to this sort of thing.
Another variant: Passing Parameters as a HashMap, similar to:
void foo( Map params )
That will bite you eventually, as well.
Comment by HarryC, Monday, March 3, 2008 @ 1:27 pm
Good point, Harry. I think it’s important to understand the motivations behind the scenes.
That said, even if one were to apply this to some code, it can’t happen without keeping it tightly scoped and on a very short leash. And even then it adds quite a bit of technical debt.
Comment by Ivan Moscoso, Tuesday, March 4, 2008 @ 8:45 am
This is how most loosely typed languages work. Every object is a Map more or less. Passing around HashMaps is just a way of working around Java’s strong typing. It can be abused quite easily of course. The problem arises when the required data for a function become ambiguous because they are all hidden inside a map and not in the method signature.
Comment by Justin, Tuesday, March 4, 2008 @ 7:55 pm
a.k.a. The Kitchen Sink Design Pattern
Problem:
The data requirements keep changing but the interface must remain constant.
Solution:
1. Create a kitchen sink for storing an unlimited number of objects of any type
2. Create one interface with one method that accepts and returns the kitchen sink
3. Implement the same interface in all services and across all layers everywhere
Strategy:
Implement the kitchen sink using a HashMap.
Forces:
When the domain model does not matter and you just want quick and dirty access to any data from anywhere.
Pros:
- The interface will never change
- The kitchen sink can never be full
Cons:
- Can grow very large
- Serialization can be expensive
- Synchronization can become unmanageable
- Data can become volatile
- Can become difficult to find the data you want
Comment by WarpedJavaGuy, Tuesday, March 4, 2008 @ 11:56 pm