10 June 2008 - 12:20
Grid design patterns - de-normalized data

I am watching developments in grid and large-scale computing with a great deal of interest because I have always found distributed computing interesting.

One very interesting thing that I come across pretty often is that in large-scale computing data tends to get de-normalized: it grows so large that you cannot fit the entities and the relationships between them in the same database (this InfoQ article is pretty good).
Let me give an example: suppose that you have a table of users and a table of messages between users with these columns:
users: userID int, firstName String, lastName String, email String
user_messages: fromUserID int, toUserID int, messageText String
In the initial design a message from user A to user B would have only one record in user_messages and each user will be able to get the messages it has sent and the messages it has received from this table.
Now let’s say that the number of users sky-rockets to the point where you need to partition your database horizontally into a series of identically structured instances (the replicas below), each holding a subset of the users. The problem that you will face now is where to store the data that is shared between these replicas, namely the table user_messages. You cannot store it in the replica hosting user A’s data, in the replica hosting user B’s data, or in its own instance, because it will grow too large and because you will need to carry out a join over 2 physically remote databases. The solution is to drop user_messages in favor of received_user_messages (from_user_name, from_user_email, message_text) and sent_user_messages (to_user_name, to_user_email, message_text), which store the messages that a user has received and the ones that it has sent, respectively. Each database replica will have its own instance of these 2 tables, each mapped to the user table. As you can see we are not storing the userID in the message tables, but rather the user information which was previously retrieved via joins.
Sending a message from user A to user B in this environment typically involves updating 2 tables in 2 separate database instances, which would usually be treated as a distributed transaction. However, a distributed transaction is very costly, so you often resort to updating one table synchronously and sending a message to update the second table asynchronously (the overhead of delivering a JMS message is way less than the overhead of locking rows in the second database). With the asynchronous update you are still enforcing the relationships between the user table and the user message tables, but at a later point in time.
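Here is a minimal sketch of the asynchronous variant in Java. Everything in it is an assumption made for illustration: the in-memory Shard class stands in for one database instance, a BlockingQueue stands in for the JMS queue, and none of the names come from a real system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// De-normalized rows: each one carries the other party's name and email
// instead of a userID that would have to be joined against the users table.
record SentMessage(String toUserName, String toUserEmail, String messageText) {}
record ReceivedMessage(String fromUserName, String fromUserEmail, String messageText) {}
record User(String name, String email) {}

// Hypothetical stand-in for one database instance holding a subset of the users.
final class Shard {
    final List<SentMessage> sentUserMessages = new ArrayList<>();
    final List<ReceivedMessage> receivedUserMessages = new ArrayList<>();
}

// The unit of work that travels over the queue to the recipient's instance.
record DeliveryRequest(Shard recipientShard, ReceivedMessage row) {}

final class MessageSender {
    // Stand-in for a JMS queue (an assumption, not a real JMS API).
    final BlockingQueue<DeliveryRequest> deliveryQueue = new LinkedBlockingQueue<>();

    // Step 1, synchronous: write the sender's copy locally and enqueue the delivery.
    void send(User from, Shard fromShard, User to, Shard toShard, String text) {
        fromShard.sentUserMessages.add(new SentMessage(to.name(), to.email(), text));
        deliveryQueue.add(new DeliveryRequest(toShard,
                new ReceivedMessage(from.name(), from.email(), text)));
    }

    // Step 2, asynchronous: a consumer applies the pending deliveries later.
    int drainDeliveries() {
        int applied = 0;
        DeliveryRequest request;
        while ((request = deliveryQueue.poll()) != null) {
            request.recipientShard().receivedUserMessages.add(request.row());
            applied++;
        }
        return applied;
    }
}
```

Between send and drainDeliveries the recipient's table does not yet contain the message; that window is the "later point in time" at which the relationship gets enforced.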
De-normalized data comes with synchronization costs: when you are updating the user information on the users table you will need to propagate the changes to received_user_messages and sent_user_messages so that the data stored in these tables will be up-to-date. This could be done via asynchronous processing as well, sending a message about data being changed on an ESB and having the concerned parties listening for and processing it. Synchronization costs should be watched very carefully because they could spawn a very high number of messages. Ideally we would synchronize data which is updated rarely (such as user information) in order to keep the number of synchronization messages down. Carrying out synchronization procedures in batches could be a way to deal with a large number of synchronization messages with the side effect that it may increase the latency with which the synchronization takes place.
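A rough sketch of batched synchronization, in Java, under stated assumptions: the ReceivedRow class, the EmailChanged event and the UserInfoSynchronizer are hypothetical names, and a plain list stands in for the received_user_messages table. The listener only records changes; a periodic batch applies them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical de-normalized row that embeds a copy of the sender's details.
final class ReceivedRow {
    String fromUserName;
    String fromUserEmail;
    final String messageText;

    ReceivedRow(String fromUserName, String fromUserEmail, String messageText) {
        this.fromUserName = fromUserName;
        this.fromUserEmail = fromUserEmail;
        this.messageText = messageText;
    }
}

// Hypothetical change notification published when a user's email is updated.
record EmailChanged(String oldEmail, String newEmail) {}

final class UserInfoSynchronizer {
    private final List<EmailChanged> pending = new ArrayList<>();

    // Called by the listener on the bus; it only records the change.
    void onEmailChanged(EmailChanged event) {
        pending.add(event);
    }

    // Called periodically: applies all pending changes in arrival order
    // and returns the number of row updates performed.
    int applyBatch(List<ReceivedRow> receivedUserMessages) {
        int updatedRows = 0;
        for (EmailChanged event : pending) {
            for (ReceivedRow row : receivedUserMessages) {
                if (row.fromUserEmail.equals(event.oldEmail())) {
                    row.fromUserEmail = event.newEmail();
                    updatedRows++;
                }
            }
        }
        pending.clear();
        return updatedRows;
    }
}
```

The longer the interval between calls to applyBatch, the fewer passes over the table, and the longer the copies stay stale; that is the latency trade-off mentioned above.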

Large scale computing typically seems to revolve around asynchronous processing (because in transactions message passing is cheaper than database access) and de-normalized data, along with the overhead implied by it. The relationships previously enforced by normalizing data are usually enforced by passing JMS messages which carry out the data changes asynchronously. This design is driven primarily by the large volume of data which cannot be serviced by only one database and by the costs of carrying out a distributed transaction spanning 2 databases.
This is one design that I see emerging for large scale computing: de-normalized data along with asynchronous processes which are enforcing the relationships between entities via message passing.

It remains to be seen if this type of grid architecture will prevail in the future. Working against it is the emergence of multi-core processors, which would allow for scaling up cheaply, as envisaged by Brian Goetz in this article. If chips with hundreds of cores and large amounts of memory become reality, scaling out could end up costing more than scaling up and all of the above could become history pretty fast. Scaling out will continue to make sense in some environments with humongous amounts of data (think Amazon, Google, eBay, etc…) but for the current fastest growing segment in grid applications, typically applications processing moderately large amounts of data, it may make sense to scale up once chips with a large number of cores become a reality.

To watch…

P.S. These shifts in data processing (moving to a grid-type architecture in order to accommodate the increase in data and then back to a single-box architecture because of the appearance of chips with a large number of cores) make a pretty good case for abstraction in order to lower the costs of transition from one architecture to another. Ideally the architecture and the environment in which an application runs should not affect the business logic of that application. One way to insulate your application’s business logic from these infrastructure issues is by abstraction.
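One possible shape of such an abstraction, sketched in Java. The MessageStore interface, its two implementations and the Inbox class are all hypothetical; the point is only that the business logic sees one interface whether the data lives on a single box or is spread over several partitions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The only thing the business logic knows about storage.
interface MessageStore {
    void save(String userEmail, String messageText);
    List<String> findByUser(String userEmail);
}

// Scale-up variant: everything lives in one store.
final class SingleBoxMessageStore implements MessageStore {
    private final Map<String, List<String>> messages = new HashMap<>();

    @Override
    public void save(String userEmail, String messageText) {
        messages.computeIfAbsent(userEmail, key -> new ArrayList<>()).add(messageText);
    }

    @Override
    public List<String> findByUser(String userEmail) {
        return messages.getOrDefault(userEmail, List.of());
    }
}

// Scale-out variant: routes each user to one of several partitions.
final class PartitionedMessageStore implements MessageStore {
    private final List<MessageStore> partitions;

    PartitionedMessageStore(List<MessageStore> partitions) {
        this.partitions = List.copyOf(partitions);
    }

    private MessageStore partitionFor(String userEmail) {
        return partitions.get(Math.floorMod(userEmail.hashCode(), partitions.size()));
    }

    @Override
    public void save(String userEmail, String messageText) {
        partitionFor(userEmail).save(userEmail, messageText);
    }

    @Override
    public List<String> findByUser(String userEmail) {
        return partitionFor(userEmail).findByUser(userEmail);
    }
}

// Business logic: unchanged whichever store is plugged in.
final class Inbox {
    private final MessageStore store;

    Inbox(MessageStore store) {
        this.store = store;
    }

    void receive(String userEmail, String text) {
        if (text.isBlank()) {
            throw new IllegalArgumentException("empty message");
        }
        store.save(userEmail, text);
    }

    int count(String userEmail) {
        return store.findByUser(userEmail).size();
    }
}
```

Moving from one architecture to the other then means constructing Inbox with a different MessageStore, not rewriting Inbox.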

No Comments | Tags: Development, Favorites

17 March 2008 - 17:22
What is missing in OS testing tools

I was watching this infoQ presentation by Alexandru Popescu and Cedric Beust on testing when I realized that a big market for testing is seriously neglected.

Regarding my own use of testing, I would say that JUnit pretty much fits the bill and I don’t see a need to move to TestNG. I was watching the presentation not so much to learn more about TestNG as to get some exposure to the market for testing products (*). The features which were presented, and which were said to be in high demand from users, were the ability to define test groups, to test data connectivity and to define dependencies between tests (**). Pretty fair. I would take this functionality to be a logical progression from the ability to run a bunch of tests in no particular order, as JUnit lets you do: you start looking at the tests you run from a higher level and you look for ways to aggregate them into higher-level constructs, such as test groups, which can then be manipulated according to various needs.
So far I would say that TestNG is looking at ways to manage the complexity coming out of a high number of tests. TestNG is for test-heavy shops, where testing is considered a concern on par with or close to development. Not a bad thing, and I am pretty sure that TestNG covers some functionality which is in high demand by Java developers.

There is, however, one need for testing that so far is largely unsatisfied and sorely missed: the need to test workflows or series of events. Let’s say that you have an application that is a series of MDBs, each accepting messages, transforming them and then outputting them to the next MDB. You would want to be able to test this application end-to-end.
Let’s say that you have an application that receives market trades and needs to process and transform them in order to send them downstream to tax systems, settlement systems, etc… You would want to put thru a trade set up in a certain way, trace its execution thru these flows and determine whether, at what stage and in what shape this trade has reached the Application-to-Tax-System gateway.
When I think about work-flow testing I usually think about defining a message, inputting this message into a work-flow engine and then defining interceptors to see how the original message has been manipulated at various stages. You may need to test both the work-flow itself (to see if the message has been going thru the work-flow that it needed to go thru), the end-result (to see where the message has been forwarded to and in what shape) as well as how it behaves at various stages in this workflow (if needed). I see work-flow testing primarily concerned with interception of messages (and this would probably be a great use of AOP) and with the possibility to correlate messages passing thru this work-flow with the original message.
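To make the idea concrete, here is a small Java sketch of interception plus correlation. It does not use any existing workflow engine or testing framework; Message, Stage, Observation and InterceptedWorkflow are hypothetical names, and the stages are plain functions rather than MDBs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// A message carries a correlation id so that every intermediate form
// can be traced back to the original input.
record Message(String correlationId, String payload) {}

// One named step of the workflow.
record Stage(String name, UnaryOperator<String> transform) {}

// What the interceptor saw leaving a given stage.
record Observation(String stageName, Message output) {}

final class InterceptedWorkflow {
    private final List<Stage> stages;
    private final List<Observation> observations = new ArrayList<>();

    InterceptedWorkflow(List<Stage> stages) {
        this.stages = List.copyOf(stages);
    }

    // Runs the message thru every stage, recording the output of each one.
    Message run(Message input) {
        Message current = input;
        for (Stage stage : stages) {
            current = new Message(current.correlationId(),
                    stage.transform().apply(current.payload()));
            observations.add(new Observation(stage.name(), current));
        }
        return current;
    }

    // All observations belonging to one original message, in stage order.
    List<Observation> traceOf(String correlationId) {
        return observations.stream()
                .filter(o -> o.output().correlationId().equals(correlationId))
                .toList();
    }
}
```

A test can then assert on three things separately: the path taken (the stage names in the trace), the shape of the message at a given stage, and the end result returned by run.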

OS testing tools so far are limited to synchronous testing. I wonder at what point the need for work-flow testing will become so pressing and the demand for it so great that one OS shop will start implementing a framework for testing workflows. Then we could use Event-Driven-Testing for testing an application written in an Event-Driven-Architecture manner…

* I know that trying to form an opinion on a market such as the market for OS testing tools from a vendor presentation is a pretty risky business, what you get from vendor presentations is usually distorted because of bias and time constraints but I will assume that this presentation gave a fair image of what users want from testing tools.

** From what I know TestNG is a lot more than some annotations that give you the possibility to form test-groups and test dependencies. However, I would say that these issues are probably considered more important and more aligned with the market for OS testing tools since they were the ones which were included in this presentation.

P.S. If I were to choose one enterprise concern which is not addressed by the Spring stack I would choose work-flow. Spring has modules for integration, batch, transaction management, security management, connectivity to various end-points, etc…, but it is missing this capability. From what I see work-flows are used pretty heavily in the enterprise space and for now I think they are mostly implemented either by in-house solutions or by commercial solutions. OS could probably make a contribution in this space as well. And if it does it should probably try to give the developer the ability to test work-flows, or at least make it easy for him/her.

P.P.S. The comments by Alan Keefer on this thread reinforce my beliefs that we need to carry out tests at every level, from high-level to unit test. For a work-flow based system this would mean that we need to test the work-flows themselves and not only the units making them up.

Later edit:  There are actually a few OS workflow tools. OSWorkflow, the ones at the bottom of this article and probably a few others. I should try them at one point.

No Comments | Tags: AOP, Development, Favorites

12 March 2008 - 1:54
Pros and cons of standards

I was reading Bill Burke’s post on transaction compensation via REST and jBPM and I have to say that I agree with most of his remarks. Bill makes a very interesting point about compensating a transaction: that the compensation itself is an activity of a business process (the activity of handling failure, very likely outside of the system involved) and that this activity could be implemented as a regular activity in a business process using jBPM. It could also be exposed to the outside world thru a REST service.

This is a very interesting take on compensations and his proposition (that compensations are regular business activities that are part of a business process and that could be coded as such) gives a lot of flexibility to handling compensations. However, it is exactly this flexibility that will probably force someone away from the compensation scheme devised by Bill and back to WS-BA. As Mark pointed out, you still need to force the end-points to run the same version of the communication protocol, be it WS-BA or an ad-hoc REST-based protocol.
However, I would go further and say that the flexibility devised by Bill runs counter to market acceptance. Standards are straitjackets which people choose to wear when it makes sense, typically when it makes some process (such as interacting with a different system) more efficient. Coding a business activity in WS-BA would enable you to plug it effortlessly into different systems implementing the WS-BA standard, in theory at least. You would expose your application to the outside world effortlessly and you could interact with a greater number of systems because you have chosen to trim down your application to fit into the WS-BA straitjacket.

On the other hand, if your application cannot be forced into the WS-BA standard it would probably make sense to drop this standard and expose the compensation logic as you and your partners agree. It would not be the first time that a standard gets ditched for a proprietary solution particular to a few partners and certainly not the last time. Sometimes a straitjacket is just too stifling…

No Comments | Tags: Development

22 February 2008 - 3:16
DTOs are not rich beans

Of all the problems of Data Transfer Objects (DTOs) the one that stands out is the anemic domain problem: DTOs are basically objects with a few variables and getters/setters for them, lacking any complex behavior. This way of developing applications has been decried time and time again so I will not spend time explaining the problem.

However, I started realizing that maybe DTOs should not contain any complex behavior and that they should be as dumb as dirt, especially when being used for communication between multiple systems. Consider this example: we have a tax system which exposes the object TaxInformation, which various systems use for getting information about taxes. Let’s say that the commodity trades system, the equity trades system and the FX trades system are all using it. Let’s say that this object looks like this:

public class TaxInformation {

    // The trade this tax information was requested for.
    private Trade trade;

    // Receipt created together with the DTO.
    private TaxReceipt taxReceipt;

    public TaxInformation(Trade trade) {
        this.trade = trade;
        this.taxReceipt = new TaxReceipt();
    }

    public TaxReceipt getTaxReceipt() {
        return taxReceipt;
    }
}

As you can see, this is a DTO dumber than dirt. Let’s create an endpoint creating and serving such an object:

public class TaxService {

    // Builds the DTO for a trade and hands it to the caller; no business rules yet.
    public TaxInformation getTaxInformation(Trade trade) {
        TaxInformation taxInfo = new TaxInformation(trade);
        return taxInfo;
    }
}

All is fine, but a requirement comes in which says that if the trade is restricted then the tax receipt of the tax information object related to that trade should also be restricted. Here you are presented with a few choices:

1) Implement this requirement in the TaxInformation object probably in the constructor.

2) Implement this requirement in the TaxService object.

3) Implement this requirement in the TaxReceipt object, in its constructor. For the sake of argument, let’s assume that we do not have this possibility, so I am ruling it out.

OOP purists will immediately recommend choice #1, because this would turn the DTO into a rich bean and would do away with the anemic DTO. This is a mistake, because all the systems using this object for communicating with the tax system will start having class versioning errors unless they are updated with this DTO’s new class. The problem is versioning such a DTO. Putting complex behavior in a DTO raises the risk of change in the DTO, which raises the effort of propagating this change to the systems which use this DTO for communication. At some point it makes sense to drop DTOs and use some sort of protocol for communication, preferably a protocol which deals with change pretty easily (XML documents with schemas which always get extended and with constraints which get changed very rarely are a pretty good fit).
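For comparison, here is how choice #2 might look in Java. This is a sketch under assumptions: the post never defines Trade or TaxReceipt, so the minimal versions below (including the restricted flag and its setter on TaxReceipt) are hypothetical, and the classes are package-private only to keep the example in one file.

```java
// Hypothetical minimal versions of the types the post leaves undefined.
final class Trade {
    private final boolean restricted;

    Trade(boolean restricted) {
        this.restricted = restricted;
    }

    boolean isRestricted() {
        return restricted;
    }
}

final class TaxReceipt {
    private boolean restricted; // assumed to exist already on the receipt

    boolean isRestricted() {
        return restricted;
    }

    void setRestricted(boolean restricted) {
        this.restricted = restricted;
    }
}

// The DTO stays exactly as dumb as before: fields, constructor, getter.
final class TaxInformation {
    private final Trade trade;
    private final TaxReceipt taxReceipt;

    TaxInformation(Trade trade) {
        this.trade = trade;
        this.taxReceipt = new TaxReceipt();
    }

    TaxReceipt getTaxReceipt() {
        return taxReceipt;
    }
}

final class TaxService {
    TaxInformation getTaxInformation(Trade trade) {
        TaxInformation taxInfo = new TaxInformation(trade);
        // The new rule lives in the service, which only the tax system deploys,
        // so the class shared with the client systems does not change.
        taxInfo.getTaxReceipt().setRestricted(trade.isRestricted());
        return taxInfo;
    }
}
```

With this placement the commodity, equity and FX systems keep working against the same TaxInformation class; only the tax system needs a new release.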

DTOs should probably be used in systems whose lifecycles are kept in synch (i.e. they get upgraded in synch); in such systems you can propagate type changes pretty easily because you have a certain amount of control over their lifecycles (*). Expanding on this, I would say that typed languages are probably most effective locally because type changes cannot be propagated over large distances efficiently. Type-coupling (or API coupling) is a very hard coupling and it should be avoided and replaced with protocols. DTOs do create an anemic model and should be avoided if possible, but if you choose types for communication between systems it would make sense to keep these DTOs to their most basic function of passing data around or to have procedures for propagating type changes in all the systems using types for communication.
Sometimes weak DTOs are actually a good thing…

* BTW, whether a set of systems whose lifecycles are kept in synch is actually one big system with one particular lifecycle is a pretty good question. I would say that you can think about it as a big system.

No Comments | Tags: Development

17 January 2008 - 19:47
Java now and in the future

If you read most of what is published today about the Java platform, it seems that the future of Java doesn’t look very good at the moment, as it keeps losing battle after battle with the movement behind dynamic languages such as Ruby and PHP. At the same time improvements to the language which could give it some boost seem to be badly implemented, most people being unhappy with the implementation of generics and the proposed closure enhancement looking pretty horrific.

You have to wonder what will happen to Java in the future. Will it disappear and turn into a dinosaur that could not adapt to a changing IT environment, or will it manage to survive the problems that it has at this time? A good look at the Java language would have to consider the language itself and the libraries/frameworks built on it, the contributors (both to the language features and to its libraries/frameworks), as well as the users of the language and its libraries, i.e. the developers that create Java applications.

The different types of contributors to the Java language. I would start by looking at the contributors and I would split them in 2 camps: corporate contributors and non-corporate contributors (*). Looking at the chief contributions these 2 types of contributors make to the Java platform I would say that the corporate contributors are mostly active contributing libraries and frameworks and that the non corporate contributors are contributing Java language variants such as Groovy and Scala (**). The split between contributors seems to mirror the split between the language itself and the libraries/frameworks being built under it, this is an important point.

It is important to look at the motivations of each contributor type for contributing to the Java platform: the corporate entities are mostly contributing libraries and frameworks that target specific problems with a wide audience while the non-corporate entities typically target problems with small audiences, such as languages. The size of the entities involved in a particular task is usually a good indicator of its audience and of the need for coordination (the bigger the number of entities, the bigger the need for coordination).
Corporate entities are more effective in projects where a certain amount of discipline and coordination is required (such as when defining the WS-* specs), while non-corporate entities are more effective in projects which do not require a large amount of coordination. The need for coordination between corporate entities is primarily driven by the number of these entities and the fact that these entities have different, sometimes competing, interests to which a common denominator has to be found.
At the opposite end of the need for coordination you would find most of the entities that created the languages currently running on the Java platform: these languages are initiated by individuals and are maintained by a single team that is usually pretty small.

One important thing about the contribution of corporate entities to the Java language in terms of libraries/frameworks is that these corporate entities need these huge investments in libraries/frameworks to be relevant in the future. Backward compatibility is very much needed for them in order to provide the stability required for continuing to make these contributions, this is very important to consider when thinking about truncating Java (see small section below) and when making language changes.

The mushrooming of JVM languages (***). You would expect that once the JRE got modified to allow easier access to different languages on the Java platform the number of Java variants would grow very large, to the point of becoming unmanageable. This has not happened so far, and if I were to identify the reasons for this I would say that 1) there is an entry barrier for creating a language, since you need to design it well, and 2) there is a need to keep relevant the investments made when committing to a particular language. The small number of languages used on the Java platform, in sharp contrast to the number of Lisp variants, is due to the fact that when an entity adopts such a language it makes a commitment to use it in order to keep development costs down (you don’t want your web development shop to work in 10 different languages, each needing a guru). The need to keep development costs down, which doesn’t exist in Lisp’s world because Lisp is used by academia rather than by everyday coders, puts a ceiling on the number of languages running on the Java platform. I would expect this pressure on the number of languages running on the Java platform to persist in the future: new JVM languages will start being used only if they truly offer gains (****).

Open sourcing Java. Just like the mushrooming of JVM languages, the mushrooming of Java-like languages which branch out of the main Java language maintained by Sun didn’t happen (looking at the hundreds of Linux variants you would have thought that it should happen). I think that this is due to exactly the same thing that prevented the mushrooming of JVM languages - the commitment that using a language entails.

Truncating Java. The JRE got too big and even with today’s networks it is still a pain to download and install it. Split it up in OSGi bundles and spin them off the runtime (with Java AWT and Swing being the first victims). Keep in the runtime only what is absolutely necessary for the current enterprise libraries to work well (the collections library, threads, etc…).

Competition from Ruby. Ruby appears as a serious contender to Java, supposedly gaining market share and mind-share. I don’t think that this will last long, primarily because Ruby will not have the resources for re-creating the current Java libraries. I also think that re-creating Java’s libraries in a different language is a waste of time. My opinion is that it is beneficial to learn Ruby in order to use JRuby and tap into the vast libraries currently available on the Java platform.

My opinion on Java’s present and future. I think that the current split in contribution to the Java platform (corporate contributors generating libraries, frameworks, products, etc… and the non-corporate contributors creating languages) is correct. Let each camp go forward in its own way, the corporate entities will continue to produce the specs, libraries and frameworks that we all use and that we all need to be relevant in the future while the non-corporate entities will work on creating new languages from which to call those libraries. There will not be a mushrooming of languages because the costs associated with using a language will keep the number of languages down. There will be some healthy competition from various languages that attain buzzword-status, but it will not last for long.
This would conclude my post. I admit that its subject (Java’s present and future) is very broad and covering it in one blog post is very hard but I have these opinions that I want to share with the world.

* By non-corporate entities I mean most small cohesive groups that grow around Java such as the groups maintaining Groovy, Scala, etc… By corporate entities I mean both large corporations involved in various specs (such as IBM, BEA, Oracle, etc…) and small corporations such as Spring Source or Red Hat. Also, the contribution of a corporate entity to the Java platform is not restricted to the contribution that entity makes in the open-source space, but it means all the code that entity has created in the Java platform, closed source as well. I know this definition borders somewhat on using the language rather than contributing to it, but I will keep it this way.

** I know that Java languages are also contributed by corporate entities, JRuby being one such example (Sun brought JRuby under its umbrella, although JRuby was started by a non-corporate entity).

*** By JVM languages I mean languages different from Java that run on the JVM such as Groovy, Scala, JRuby, etc…

**** Interestingly enough, Java platform languages would be a pretty good case study for word-of-mouth advertising: some guy created Scala, another guy tried it and blogged about it, a few more did the same, some other guy requested some features, etc…, and before you know it Scala is slowly being shoved into the spotlight.

No Comments | Tags: Development, Favorites

28 December 2007 - 21:09
PL-SQL vs Java - speed vs ease of development

You know the old debate between PL-SQL developers and Java developers: “Your Java app will never run as fast as my PL-SQL programs”, “you will never be able to pass a message to an external system and update a DB row at the same time”.

Well, the 2 camps are both right: PL-SQL is very good at performance, while Java is very good at modeling an application. We should use each platform’s strengths. So what would follow from this approach? Well, I think it is pretty obvious: create small PL-SQL procedures, each focused tightly on its target, and orchestrate them in Java. Avoid creating big PL-SQL procedures because one side effect of this approach would be creating high complexity in a language and a platform that doesn’t deal with complexity so well. Calling these small PL-SQL procedures and assembling them into larger blocks from Java code is basically orchestrating them from a platform that handles complexity pretty well.
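A sketch of what this orchestration could look like in Java over plain JDBC. The procedure names (validate_trade, compute_tax, book_settlement), the ProcedureCaller interface and the TradeSettlement class are invented for illustration; only Connection.prepareCall and the JDBC {call ...} escape syntax are standard.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Narrow seam around "call one stored procedure", so that the orchestration
// can be exercised without a database.
interface ProcedureCaller {
    void call(String procedureName, long tradeId) throws SQLException;
}

// JDBC-backed implementation using the standard {call ...} escape syntax.
final class JdbcProcedureCaller implements ProcedureCaller {
    private final Connection connection;

    JdbcProcedureCaller(Connection connection) {
        this.connection = connection;
    }

    @Override
    public void call(String procedureName, long tradeId) throws SQLException {
        // procedureName must be a trusted constant; it is concatenated into the call text.
        try (CallableStatement statement =
                     connection.prepareCall("{call " + procedureName + "(?)}")) {
            statement.setLong(1, tradeId);
            statement.execute();
        }
    }
}

// Java owns the flow; each hypothetical procedure does one small job in the database.
final class TradeSettlement {
    private final ProcedureCaller procedures;

    TradeSettlement(ProcedureCaller procedures) {
        this.procedures = procedures;
    }

    void settle(long tradeId) throws SQLException {
        procedures.call("validate_trade", tradeId);
        procedures.call("compute_tax", tradeId);
        procedures.call("book_settlement", tradeId);
    }
}
```

The sequencing, branching and error handling stay in Java, where they are easier to read and test, while the row-level work stays next to the data.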

Do this and you will have a Happy New Year!

No Comments | Tags: Development

16 November 2007 - 19:06
The case for meaningful names for methods/classes, etc…

Code contains knowledge about the application and not only the commands to execute that application; this is something that is not widely appreciated.
Meaningful names for classes/methods, etc… will expose the knowledge buried in the code better to a person working with that code. When you are changing the behavior of a method or a class it is a good policy to also change its name in order to expose the behavior of that class better.
The knowledge of an application should not be encapsulated only in UML diagrams or PowerPoint documents, it should be encapsulated into the code itself because easier to read code translates into a better application.

I have become interested lately into knowledge management. It is a pretty interesting field that applies pretty well to software development.

Later Edit: In the ideal case a person could learn a bit about the domain by going over the APIs; in such an environment you could say that the domain knowledge is pretty well encapsulated in the code. Of course, this would not mean that the domain knowledge should only exist in the code, only that it is desirable to have the code reflect the domain.

No Comments | Tags: Development, Favorites

16 November 2007 - 14:36
Push the technology way back

I was listening to this presentation by some people at ThoughtWorks about designing an application when I was struck by this quote which goes roughly like this:
I keep thinking of clever designs to push the technology out of the picture so that changes in the business domain can be implemented more easily.

Very clever quote. The business domain keeps changing a lot more frequently these days, I think due to an increase in the number of stakeholders which increases the variety of requirements, so it is imperative to be able to make those domain changes easily. Being unable to make these domain changes because of technological reasons carries a pretty big cost.
This is why the technology should take a back seat to the domain. Ideally it would be abstracted so that it is decoupled from the domain.

Another quote that I liked was that you are given a domain when you start working on a project; you are not making this domain up, and your implementation should reflect that domain very well. The job of the domain modeler is to find out that domain and then represent it as faithfully as possible. The domain will change, but it will not change in a radical fashion because this would imply that the underlying business changes radically and this doesn’t happen that often.
As I was saying before, the domain changes more frequently these days so it is better to have your application very well aligned with the domain so that changes in the domain can be translated into changes in the application very easily. If you cannot add a new feature because some framework doesn’t allow you to then you have a pretty big problem.

Another big theme in this panel was how to defer decisions till the last possible moment, which is a pretty neat thing: it pays off to make decisions later when you probably have more information to base your decisions on.

All in all, a pretty interesting presentation.

No Comments | Tags: Development, Favorites

15 October 2007 - 15:37
Annotations used for deployment

I was reading this post from Bill Burke’s blog and I found myself shocked by the sheer number of annotations required for making a business object WS-BA compatible (if I understand well the post).
One thing that I don’t like about using annotations is that they tend to stay with the code. I think that the documentation provided by these annotations should live outside of the code, because in this case you are coupling your business component to an external concern: you are coupling your component to the WS specification.
What you get in Bill’s post is the first stage of annotation creep: you need to externalize your component in order to plug it into an operational environment and you do it thru an annotation. Next operational environment comes with a brand-new set of annotations. And so on…
I keep thinking that the best way to deal with all these annotations would be to sub-class the original interface and apply annotations to it. You will have quite a lot of problems: the classes which are mapped to the original interface will need to be sub-classed as well and their sub-classes mapped to the annotated interface. You will have problems when updating the original interface, you will find that you need to update all its annotated sub-interfaces. And a whole other set of problems.
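A small Java sketch of the sub-interface idea. The @Compensatable annotation is made up for this example and is not part of WS-BA or of any framework; BookingService, WsBaBookingService and AnnotationLookup are equally hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical deployment annotation; not a real WS-BA API.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Compensatable {
    String compensatedBy();
}

// Original business interface: no operational concerns.
interface BookingService {
    String book(String customer);

    void cancel(String bookingId);
}

// Environment-specific sub-interface: re-declares the method only to carry the annotation.
interface WsBaBookingService extends BookingService {
    @Override
    @Compensatable(compensatedBy = "cancel")
    String book(String customer);
}

final class AnnotationLookup {
    // Returns the compensating method name declared on the given type for a
    // single-String-argument method, or null if the method is not annotated there.
    static String compensationFor(Class<?> type, String methodName) {
        try {
            Compensatable annotation =
                    type.getMethod(methodName, String.class).getAnnotation(Compensatable.class);
            return annotation == null ? null : annotation.compensatedBy();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }
}
```

The original interface stays clean, but the cost described above is visible even here: every annotated method has to be re-declared, and any signature change in BookingService has to be repeated in each environment-specific sub-interface.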
You feel like you almost need to define inheritance in annotations: an interface would inherit the methods from an interface and the annotations from a different place and you would work with this new interface properly annotated for the operational environment. My opinion is that annotations, like any other type of static, hard-to-change documentation, have serious limitations when it comes to exposing a class to many, varied environments and when you are using it for documenting operational concerns. No wonder so many people prefer XML…

BTW, this post uses Bill Burke’s post only for displaying an example of annotation-creep.

Later edit: If you think about the way annotations are currently used (putting them all over your code base in order to insert a component into various frameworks) they can be thought of as some sort of cross-cutting concerns (they are cross-cutting to a certain extent). I wonder if using something like an aspect would not do a better job of expressing the various concerns that annotations are currently used for. Aspects are orthogonal to the code-base at the same time, a pretty neat thing.
Wouldn’t it make sense to WS-BA-enable your whole application by specifying a point-cut and then applying annotations against this point-cut? It would be a lot easier and it would de-couple your code-base from external concerns. I am not sure if it is doable (I could think of a solution, but it is not pretty).
I think that the way annotations are currently used is not 100% OK and that it poses some maintainability problems in the future. Putting information into annotations is partly a response to XML-creep (as Bill Burke pointed out in the comments) but it is a solution far from perfect. What we are dealing with here is the need to add some information to a component in order to plug it into a framework. Annotations, as well as XML, are not doing a good job because they address this problem (plugging the component into the framework) at the individual level, at the component level. What we really need is a mechanism that would let you plug a whole set of components into a framework easily (I suggested using point-cuts).

Later Edit 2.0: I think that adding querying capabilities at the code-base level (similar to AspectJ’s point-cut language) along with a mechanism to interpret these queries in Java would be the killer feature that Java lovers are waiting for. It would create a whole new way of developing: for example, AOP would be implemented by joining point-cuts to an interception mechanism. A lot of the current mappings (in XML or in annotations) could be replaced by defining a point-cut and then a mechanism to interpret this point-cut. In the above case you would define a point-cut selecting the components that you want to WS-BA-enable and a mechanism for working with the components returned by this point-cut. Embedding a framework into an application would consist of creating a point-cut and then passing it to the framework in order to process it.
This would be application-level introspection and reflection and it would do away with both annotation-creep and XML-creep. I can’t dwell too much on it cause I gotta go to sleep.
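A rough sketch of what "joining a point-cut to an interception mechanism" could look like with nothing but the JDK. It assumes a much weaker query language than AspectJ’s: the point-cut here is just a `Predicate<Method>`, and the interception mechanism is a `java.lang.reflect.Proxy`. `OrderService`, `PlainOrderService`, `PointcutDemo` and `weave` are hypothetical names made up for the illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// A business interface with no annotations and no XML mapping.
interface OrderService {
    String placeOrder(String item);
    String findOrder(String id);
}

class PlainOrderService implements OrderService {
    @Override
    public String placeOrder(String item) {
        return "placed " + item;
    }

    @Override
    public String findOrder(String id) {
        return "order " + id;
    }
}

class PointcutDemo {
    // Records which calls were intercepted; stands in for whatever a framework would do.
    static final List<String> LOG = new ArrayList<>();

    // Joins a point-cut (a query over methods) to an interception mechanism (a dynamic proxy).
    @SuppressWarnings("unchecked")
    static <T> T weave(Class<T> type, T target, Predicate<Method> pointcut) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (pointcut.test(method)) {
                LOG.add("intercepted " + method.getName());
            }
            // Always delegate to the real component.
            return method.invoke(target, args);
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        // The "query": every method whose name starts with "place".
        Predicate<Method> writes = m -> m.getName().startsWith("place");
        OrderService service = weave(OrderService.class, new PlainOrderService(), writes);
        System.out.println(service.placeOrder("book"));
        System.out.println(service.findOrder("42"));
        System.out.println(LOG);
    }
}
```

The component itself carries no framework information at all; the selection lives in the predicate, and the same predicate could be handed to a different mechanism. A real code-base query would of course have to range over all classes of the application, which this sketch does not attempt.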

Later Edit 3: In this post Bill Burke argues for using meta-annotations in order to de-couple your code-base from framework-specific annotations. Adding a level of indirection in front of the framework-specific annotations will certainly free your application from framework-specific dependencies, but you will still tie your code to a representation (the one specified by your own annotations, which serve as end-points for the original framework-specific annotations). I am not sure that this is desirable. I still think that your code-base should not be polluted by any external concerns.
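For readers who have not seen the meta-annotation idea, here is a minimal sketch of the indirection, written from my understanding of the general technique rather than from Bill’s post. `FrameworkService` stands in for a framework-specific annotation and `AppService` for the application-owned one; both names, and the `resolve` helper, are hypothetical.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical framework-specific annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.ANNOTATION_TYPE})
@interface FrameworkService {
    String scope();
}

// Application-owned annotation: the only place that names the framework.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@FrameworkService(scope = "singleton")
@interface AppService {
}

// Business code depends on AppService only.
@AppService
class BillingComponent {
}

class MetaAnnotationDemo {
    // Looks for the framework annotation on the class itself, then one level up,
    // on the annotations that the class carries.
    static FrameworkService resolve(Class<?> type) {
        FrameworkService direct = type.getAnnotation(FrameworkService.class);
        if (direct != null) {
            return direct;
        }
        for (Annotation a : type.getAnnotations()) {
            FrameworkService meta = a.annotationType().getAnnotation(FrameworkService.class);
            if (meta != null) {
                return meta;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        FrameworkService found = resolve(BillingComponent.class);
        System.out.println(found == null ? "not a service" : found.scope());
    }
}
```

Swapping frameworks now means editing `AppService` rather than every component, yet `BillingComponent` is still marked up with `@AppService`, which is exactly the remaining tie to a representation mentioned above.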
It is interesting to see what will happen:
1) Will the world continue with annotation-creep and XML-creep for plugging their applications into various frameworks?
2) Will the world settle for a mid-way solution, like the meta-annotation solution suggested by Bill for the same problem?
3) Will the world try to solve this through a different design?
4) Will the world decide that framework plugging is so costly that we might as well migrate to a different language/platform in order to reduce these costs?
It is all a question of supply and demand, basic economics: plugging an application into a framework is currently a pretty costly operation. If the demand for this operation rises significantly (the number of times an application is plugged into a framework increases) it would make sense to make an investment (new methodology, new framework, etc…) in order to slash the costs. Either you make the investment or your language/platform becomes so costly that it proves cost-effective to move to a different language/platform.

10 Comments | Tags: Development, Econo-computing, Favorites

4 July 2007 - 18:09
Java and dynamic languages

I was watching this presentation by Rod Johnson on infoq when I came across the part where he started talking about using dynamic languages in a Java application. The main reason that he gave for using dynamic languages in a Java app was that, due to pushing the core business logic into domain objects, the service layer becomes very thin, “script-like” as he called it. Rod saw an advantage in using dynamic languages (DLs) to implement the service layer as a series of scripts, if I interpret his take correctly.
This is an interesting proposition. I find it a bit off-the-wall and at the same time very timid, which is telling about how DLs are currently treated in the Java world. There is a bit of fear about using DLs, stemming mainly from the lack of exposure to them: as with anything you do not know, you are bound to make mistakes, and the fear is that getting too involved with DLs may lead to some very costly ones. At the same time you fear being left behind while the rest of the computing world embarks on these popular DLs; you may be missing out on the next great thing (BTW, the next great thing is very rare). This could sum up the attitude that Java developers have towards DLs: DLs are pretty interesting things, too bad I don’t know how to work with them.
I decided to write this post in order to correct a misconception that most of the people writing about DLs have: that DLs are all about the capabilities which are encapsulated in a language. I think that DLs are about their capabilities but, just as importantly, about the workforce coding in them, and I find that the scripting example put forward by Rod Johnson fits this statement.

One big problem with using a DL in a Java application is the exposure that the developers using it have had to that DL. You will find that most developers will absorb a foreign language only so much, and will tend not to get too deep into it unless it applies heavily to their tasks. You will also find that getting to know a DL well requires quite a lot of effort, effort that most developers are not willing to put in. Since the correct use of a DL depends on the exposure to that DL, it follows that you cannot really use a DL’s more sophisticated features and that you will also try to avoid developing complex constructs in a DL. The scripting example becomes very telling here, because scripts are nimble little things which do a very small and simple task; I would say that this type of task can be coded by a regular Java developer in a language that they have not completely mastered. I would even venture to say that, because of the high costs of adopting a DL, DLs will never get beyond simple nimble things in the Java world. I am not so sure about coding the service layer in scripts though. Will the service layer dwindle in complexity to the point where you could script it up? I am not so sure, time will tell.

P.S. The costs for adopting a DL are the costs of adopting any language, in particular a domain-specific language (DSL). These costs are roll-back costs (you cannot roll-back a language out of an application so easily), training costs (the work-force needs to be trained in this language) and human resources costs (if you happen to choose a language which is not considered cool anymore you may end up having recruitment problems, who wants to work on something perceived as a dinosaur?)

This would explain why developers have the same reaction to DSLs as they have to DLs: a bit of fear laced with excitement.

P.S. 2 I wrote this post in a hurry, I think it shows thru.

Later Edit: Now that scripting languages are moving into JEE applications, it would be interesting to see whether JEE applications will be picked up by developers who work in scripting languages. If Rod was right in his assumption that the service layer could be scripted up, it is conceivable that a Java app could be picked up by a Ruby shop that would modify its service layer to fit its needs. I don’t see this happening, because the Ruby people would have minimal control over the app in this case (in this scenario they would have control only over the service layer, even though in theory it is possible to plug Ruby business logic beans into the original Java app via Spring) and because it would require that the Ruby team has some exposure to a Java application deployment environment (Spring files, etc…). Again, it requires that a set of developers (Ruby developers in this case) has some exposure to the other language and environment (the Java application) with all the problems that this implies.
Later Edit 2: Today I read about an interesting case of scripting: JRuby GUI APIs. What I find interesting about coding GUI in Ruby is the fact that it lowers the barrier of entry to coding them to scripting, a coding style that is much closer to a graphic designer than your typical Swing event handlers. Who knows, maybe the future desktop designers will grok JRuby…

2 Comments | Tags: Development