31 August 2006 - 21:15 REST and SOAP

I have started reading on REST-type architectures (I recommend this intro in particular). It is an interesting exercise to compare them to SOAP, one of the main differences being the amount of information a client has to know about the remote system it is interacting with. With SOAP the client must know the whole API stack in order to be able to carry out a series of actions while a REST client gets passed the resource for its next actions as it progresses. To go back to the example in the above intro: as the REST client searches the catalog it is given URIs which can be used for carrying out certain actions (viewing an article, adding it to cart, etc…). A REST client just needs to agree on the semantics of the data (how do you specify the URI for viewing an article or adding it to the shopping cart) with the server and then it just needs to call those URIs when it needs a particular action taken.
In contrast, a SOAP client needs to know the whole server API before engaging in a conversation with the server. It needs to know what method to call and what types to use in order to retrieve a product, get product information, add it to shopping cart, etc… It needs all of this beforehand, otherwise it doesn’t know what to do.
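To make the contrast concrete, here is a minimal sketch of the REST side, assuming an in-memory dictionary stands in for the remote server. The URIs, the relation names ("view", "add-to-cart", "view-cart") and the helper functions are all made up for illustration; the only things the client knows in advance are one entry URI and the meaning of the relation names.

```python
# Hypothetical in-memory "server": each URI maps to a representation that
# carries the links (relation name -> URI) for the client's next actions.
RESOURCES = {
    "/catalog?q=lamp": {
        "items": [
            {"name": "Desk lamp",
             "links": {"view": "/articles/17", "add-to-cart": "/cart/add/17"}},
        ],
    },
    "/articles/17": {
        "name": "Desk lamp",
        "links": {"add-to-cart": "/cart/add/17"},
    },
    "/cart/add/17": {"status": "added", "links": {"view-cart": "/cart"}},
    "/cart": {"items": ["Desk lamp"], "links": {}},
}


def request(uri):
    """Stand-in for an HTTP request; returns the representation stored at uri."""
    return RESOURCES[uri]


def follow(representation, relation):
    """Follow a link by its agreed-upon relation name, never by a hard-coded URI."""
    return request(representation["links"][relation])


def buy_first_match(search_uri):
    """The client discovers every URI after the first one from the responses."""
    results = request(search_uri)
    first = results["items"][0]
    article = follow(first, "view")
    added = follow(article, "add-to-cart")
    cart = follow(added, "view-cart")
    return cart["items"]


print(buy_first_match("/catalog?q=lamp"))
```

If the server moved the cart to another URI, this client would keep working as long as the relation names stayed the same. A SOAP-style client would instead be written against a fixed list of operations and types known before the first call.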
The REST architecture is very similar to the world wide web: a human being browses a page, sees some links, follows them if they are interesting, etc… The SOAP architecture is, to a certain extent, similar to a human being using Microsoft Office: if you do not know the bazillion buttons and dropdowns MS Office comes with you cannot use it. In order to use MS Office you need to know its user guide, similar to having to know the API of a SOAP server in order to use it.
I like REST, conceptually for now, because it gives me the same feeling as when I was studying distributed computing in university: you have a node, you specify its relationships with other nodes, you set it up in the network and before you know it a whole web of relations comes alive. In REST you have to think in local terms (to what other nodes am I connected now that I am on this node and what can I do now?), while with SOAP you have to think in global terms (I know the whole application, I have this data, what do I do with it now?).
That being said, I am wondering how a REST architecture will scale when the number of connections between nodes increases. Let’s say that I have the product node and for now I have two other nodes (or URIs in REST-speak) attached to it: ordering the product and viewing it. What if I need to add 15 more actions to this node? Will that affect the size of the message I am sending the client and effectively consume a lot of resources? What if these actions are not that different from one another? How will the clients be able to make a decision about where to proceed next?
Another point I’d like to make is the fact that when programming to a REST architecture you are intimately tied to it. You cannot download the API, put a layer of abstraction on top of it and plug it into your application as you would do in a SOAP scenario. With REST you get a node, you parse it for its relationships to the rest of the application (and you have to do this all the time because the REST server may decide to change the URIs of those relationships) and then you are ready to do something. In SOAP the relationships are expressed in the API, in REST they are discovered every time you process a remote node. You are dependent on the remote server a lot more than in a SOAP scenario. Interacting with REST servers may result in tightly coupled systems*.
One last thing (related to the above) would be the fact that REST is so different from SOAP that a customer wishing to communicate with a REST-based system may have to alter its application entirely (not only the layer it uses for communicating with the rest of the world). REST doesn’t let people get their feet wet, you are either in it or totally outside of it. This could hamper mass adoption.
I am waiting to see how this unfolds. Given the fact that most corporations grok SOAP I would be surprised if this technology will make a beach-head in the enterprise space. But you never know… I, for one, wouldn’t mind getting that feeling I got in university again…

Later edit: A REST application is probably best used for distributed computing where various nodes talk to one another. In this scenario the concept of a REST client and a REST server doesn’t exist, basically the REST client is immersed in the network the REST server(s) are running in. You cannot have an external (at least I don’t think so) REST client, the REST client is similar to a spider that goes from node to node carrying out certain actions at each node and finishes by becoming part of the network. I think it is for this reason that trying to connect an external system to a REST network fails, because the external system is too intimately tied to the REST network and finishes by becoming part of it.
Comments welcome!!

Later edit: You can abstract the interaction with a REST-server and plug it in your application, except that it is awkward mainly because REST-ful behavior is very different from your standard interaction with a library that you know before-hand.
* I am not sure about this, please drop a comment if you want to talk about it.

No Comments | Tags: Development, Favorites

30 August 2006 - 14:45 The emergence of the managed environment

You wouldn’t have guessed that this post is about EJBs, right? I wrote a rather critical post about EJBs, but I’d like to take a second look at them.
I was looking at EJBs from a historical point of view and I realized that the EJBs are probably the first managed environment. Their intention was noble: to free the developer of any operational concerns and focus exclusively on the business logic. The developer was not supposed to worry about securing an application, about propagating credentials in a cluster, the environment was supposed to do this. The developer was not supposed to code for concurrency, the environment was going to do this. The developer was not supposed to deal with resource management, the environment would have done this for it. The developer was not supposed to code TX behaviour, but rather declare it and let the environment enforce it and propagate it in a cluster. The developer was not supposed to connect to a DB manually, but rather delegate this to the environment. The aim was really high, the functionality that this managed environment was supposed to implement was not trivial. On top of this, the spec defined this managed environment to be a distributed managed environment. Very high goals…
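The "declare it and let the environment enforce it" idea can be sketched in a few lines. This is only an illustration under loud assumptions: EJB expressed transaction attributes in deployment descriptors, not in decorators, and the `FakeTransactionManager` below merely records calls instead of talking to a real resource. All names are hypothetical.

```python
import functools


class FakeTransactionManager:
    """Hypothetical stand-in for the container's transaction service.

    It only records which calls were made; nothing is really rolled back.
    """

    def __init__(self):
        self.log = []

    def begin(self):
        self.log.append("begin")

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


TX = FakeTransactionManager()


def transactional(func):
    """Declare that func runs in a transaction; the 'environment' enforces it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TX.begin()
        try:
            result = func(*args, **kwargs)
        except Exception:
            TX.rollback()
            raise
        TX.commit()
        return result

    return wrapper


@transactional
def transfer(accounts, source, target, amount):
    """Business logic only: no begin, commit or rollback calls in here."""
    if accounts[source] < amount:
        raise ValueError("insufficient funds")
    accounts[source] -= amount
    accounts[target] += amount


accounts = {"a": 100, "b": 0}
transfer(accounts, "a", "b", 30)
print(accounts, TX.log)
```

The business method stays free of operational code, which is the noble intention described above; the price is that the method now only behaves correctly inside the environment that wraps it.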
The result was disappointing: in order to embed your component in this managed environment you had to code to a series of interfaces and design policies which had to become second nature to a developer, in other words the environment was very intrusive. Embedding your component in the environment also came with a very large overhead: for some beans no fewer than 6-7 files were required in order to turn an ordinary class into an EJB. Working with entity beans was particularly hard on anyone who had to use collections in the early days of EJBs: I remember scanning the log of the application and waiting forever for the app server to load 20 entities. This managed environment also came with some perverse side effects, the best known of which was vendor lock-in: porting an application from one app server to another was an undertaking that very few organizations had the resources to carry out.

Now, if we look at it from a “historical” point of view, the above doesn’t seem so bad for a first try. One cool thing about EJBs that resonates well with me is the fact that EJBs promoted the use of interfaces. One very good use of this would have been the abstraction of a business process behind interfaces, unfortunately this didn’t take off, most of the implementations were going the other way around, from concrete class to interface, XDoclet-style.
The “mistakes” that this operational environment made would be that it was advertised as “enterprise” and that it was too inclusive. Trying to address every issue important in enterprise computing on your first shot is prone to failure. The fact that this was done in a “democratic” fashion with each industry vendor pitching in his customers’ needs didn’t make defining the spec easier.

Some people look at EJBs as an abject failure. I look at them as the first step on the road to something new, a first step loaded with mistakes as all first steps are. Currently everything is running in a container, in a managed environment if you wish, which you get to define and tinker with if you wish. The lessons learned by working with EJBs proved valuable when the current successful containers were designed. It’s an evolution, if anything.

No Comments | Tags: Development, Favorites

29 August 2006 - 18:33 Communication thru UML

I am reading Martin Fowler’s UML Distilled and I am going over his introduction to use cases. So far it looks like use cases are the one point where domain experts and developers come together, and the main interaction between them. Basically, the domain experts communicate the behaviour of the system to the system architect, who can extract from it an essence which can further be abstracted into collaboration diagrams, class diagrams, etc… A conceptual domain model is built and supposedly from this domain model you can start drawing collaboration diagrams and the rest.
This doesn’t get even close to the complexity of the business logic. It is just so simplistic. One other “great” way to “model” a business was to treat nouns as classes and verbs as methods. I don’t know why, but to me it sounds like medieval witch-craft. This packaging of information so that it fits inside a developer’s head is atrocious. It probably works with someone that has some knowledge of the business, but it doesn’t work with someone that is just entering the business. It is simplistic to assume that the ramp-up time for getting into a domain can be obviated by some collaboration diagrams. You have to know what you model, and you have to know it very well before starting to design.

Let’s take an architect who spent the last 10 years architecting video games. He probably has his head filled with design patterns, UML diagrams, tricks, frameworks, etc… that enable him to do his job very well. Suppose that the same person is given the task of designing an application in the insurance industry. After collecting a mountain of use cases, will he be able to see points of change in the application he’s building, will he be able to abstract the business process that he is implementing so that the application changes easily as the business process itself changes? I’d be surprised. That architect will go thru his collection of use cases and come up with a platform independent model and from it design the whole application. And he will probably fail to capture quite a few assumptions that the domain expert takes for granted and create a system that deviates from the original intent. He would have to adapt himself to the domain and then start to make some assumptions about it in order to model it better.

UML can be used as a tool for communication between the business analysts and developers, but we should not expect too much from this tool. I think that the potential for errors and the noise is pretty big for this communication channel. Once things start to get complex these diagrams will not be able to translate efficiently the process described by the domain owners to the system architect. Instead of relying on various meetings with the business analysts, the architect should be able to understand the process, should be able to ask the business owners meaningful questions about the process that he is implementing, and only then should he start modeling it. It would help a lot if the business analysts had some sort of a manual that distils the business and that the system architect can read before engaging with them.
I, for one, treat use cases and collaboration diagrams mostly as documentation needed for bringing a developer on board, rather than a way to pull information out of a domain.

As a final note I would say that the ability to adapt to a new environment is becoming a requirement for a developer. To function efficiently across a wide array of industries, a developer should learn how to adapt to a new business environment, and pretty easily. A developer should scratch beyond the surface of an industry in order to make meaningful decisions.

P.S. I’ll give an example from my experience.
I was designing an e-commerce application that had to support multiple customers and had to be industry-agnostic. What was the functionality that was most prone to change from one customer to another and which had to be abstracted first? Navigation? Layout? No. It was/is the prices. Every customer has its internal pricing structure (tiered, contract-based, rebates, etc…) that has to be reflected in the application. I learned it the hard way, and it is a valuable lesson. When I designed a B2B application the first interface that I wrote was the interface for handling prices because this was one piece of functionality that I knew the customer cared about and that was prone to change. Do you think that this need ever transpired from the various interactions I had with the customers and various use cases I walked thru? No. Pricing was the functionality most prone to change and there was nothing out there to tell me that: most customers would assume that their price schema is what everyone is using. It was the interaction with quite a few customers that showed me that what is really prone to change is the pricing module.
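A pricing interface of the kind described above might look roughly like the sketch below. It is not the interface from the original application: the class names, the tier table and the discount figures are invented for illustration, and rebates are left out to keep it short.

```python
from abc import ABC, abstractmethod
from decimal import Decimal


class PricingStrategy(ABC):
    """Hypothetical interface: the rest of the application only sees this."""

    @abstractmethod
    def unit_price(self, list_price: Decimal, quantity: int) -> Decimal:
        """Return the price per unit for this customer and quantity."""


class TieredPricing(PricingStrategy):
    """Discount grows with the ordered quantity."""

    def __init__(self, tiers):
        # tiers: (minimum quantity, discount fraction) pairs,
        # e.g. (10, Decimal("0.05")) means 5% off from 10 units upwards.
        self.tiers = sorted(tiers)

    def unit_price(self, list_price, quantity):
        discount = Decimal("0")
        for minimum, fraction in self.tiers:
            if quantity >= minimum:
                discount = fraction  # the highest tier reached wins
        return list_price * (Decimal("1") - discount)


class ContractPricing(PricingStrategy):
    """A price negotiated up front, independent of list price and quantity."""

    def __init__(self, agreed_price):
        self.agreed_price = agreed_price

    def unit_price(self, list_price, quantity):
        return self.agreed_price


def order_total(strategy, list_price, quantity):
    """Order code depends on the interface, not on any one price schema."""
    return strategy.unit_price(list_price, quantity) * quantity


tiered = TieredPricing([(10, Decimal("0.05")), (100, Decimal("0.10"))])
contract = ContractPricing(Decimal("17.50"))
print(order_total(tiered, Decimal("20.00"), 10))
print(order_total(contract, Decimal("20.00"), 4))
```

Adding a new customer with its own price schema then means writing one more implementation, while the ordering code stays untouched.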
P.P.S. In the above application the next thing that had to be abstracted was the interaction with their back-office systems, but it is a lot more obvious that it has to be abstracted.

No Comments | Tags: Development

25 August 2006 - 14:34 Open Source Trademarks

One of the revenue models for open source (OS) companies is the selling of support services and documentation. JBoss is widely known for pioneering the model of making money off support rather than off selling a product. The question is how they protect this revenue stream. You can set up a shop somewhere in Eastern Europe, staff it with committers to that particular open source project in order to gain credibility, build a workforce that masters the product and costs 3 to 4 times less than its Western counterparts and start selling support services. Chances are that you will be able to leap ahead of the original open source company because of the pricing power you wield due to lower costs.
So, how do the OS companies protect themselves from this threat? One great protection is thru the use of trademarks. Take a look at Red Hat, MySQL, and JBoss. Boy, do they protect their brand. MySQL goes as far as to prevent the use of its trademark on documentation. An IT shop selling support services for MySQL (for example) could be taken to court by MySQL because they are using the MySQL trademark (when they are advertising “Buy great MySQL support from us”) without consent from the MySQL group. An IT shop planning to compete with them on support should make sure it flies under the radar and doesn’t make a significant dent in the original OS company’s revenues. The moment it gains traction in the market and starts having an effect they should prepare to meet the original OS company’s legal team. Chances are they will not be able to sell their services in a country/region that takes intellectual property (IP) seriously.
Eastern Europe could sell support for OS software to small and medium businesses in the West mainly because this is a market that is not in the OS companies’ cross-hairs. Western small and medium businesses could gain from this service (no-cost software and low-cost services) because currently their other alternatives for troubleshooting their Linux desktop are googling their problems or calling tech-savvy cousin Joe.
I really like the way these OS companies maneuver. You may have the impression that OS is run by a bunch of pony-tailed hippies fed on utopia, but the fact is that these guys grok IP and grok it very well. These guys seem pretty well prepared for an environment in which IP is the main expenditure. A lot better than some closed-source establishments…

Later Edit: Who would have thought this is possible? You know what I think? That the APIs are a commodity, what is not a commodity is the knowledge to use them, knowledge which is very important given the size and power of these APIs. It looks like RedHat is protecting that knowledge from becoming a commodity pretty well, but the manner in which they are doing it smacks of despair. You would have thought that the almighty “professional open source” model would not rely on hunting down individuals who piggyback on an OS product to make some money, but it looks like RedHat is looking at locking down that revenue stream.

No Comments | Tags: Development, IT in S-E Europe

25 August 2006 - 13:42 Using Google

These days I have done something that I have not done for the last year: I searched on Google. Yep, I didn’t do a search on Google for quite a while. The reasons that kept me away from Google were first the Google desktop, which behaves more like a trojan than anything else, and then its relationship with the Chinese government. Its hunger for information didn’t score many points with me either: Google is a vacuum cleaner which absorbs every piece of information it can get its hands on. The huge (at the time) mailbox size of GMail reflects, to a certain extent, its hunger for information. Give the users a huge mailbox and load the mail program with features that make the users reluctant to delete email (such as tracking conversations) and they will keep most of their data with you… Its never expiring cookie says the same thing: "We want information and we will do whatever is needed to get it".
To go back to Google search, I didn’t use Google search because I didn’t have to. The environment in which I was working before didn’t require me to search the web, most of the information I needed I could access very quickly, either thru information systems or thru people.
Well, that changed. I changed jobs and in the new environment I found myself tackling an application server problem. With little documentation and no tech-support I had to go back to Google. I spent 3 hours on Google trying to find some documentation with no results. It’s not the fact that Google didn’t help me find what I needed that prompted me to write this post. It is the fact that I grew to view web-search as a last choice for information retrieval. Over the last year I have put up a library on del.icio.us where I go if I need to find something. I have subscribed to various feeds which provide me with very good information that I index according to my needs. I came back to it when I needed to do some research about outsourcing from Western Europe: I didn’t have to go to Google and start from scratch and sift thru mountains of garbage in order to find what I needed. The effort of putting up a small library, of interacting with human beings interested in the same fields as you, is a good long-term investment.
I am slowly coming to the conclusion that dependence on web-search reflects a poor environment (in my case poor documentation and no tech support). What I needed should have been covered way better by my application server provider, I should not have to go to Google and struggle to find a significant keyword that would provide me with the answer. My application server provider should have found a way to channel all the information that their users have to me. They failed and the only thing I could fall back on was searching on Google. (I am wondering how many tech departments out there outsource to Google the indexing of their information rather than putting up a decent system that their users can rely on.)
Using a search engine may mean, to a certain extent, the assumption of instant gratification: "I don’t really want to put the effort in acquiring this information, I’ll just ask Google for it". Don’t be surprised if Google will not find it for you. It may also mean a poor environment where your source of information is a search engine that is not an authority in the field you are interested in.

No Comments | Tags: Development

24 August 2006 - 17:13 Some effects of IT outsourcing in Romania

I was talking to my cousin in the beautiful(??) country of Romania about developments in the IT industry over there. From what I remember it looks like the wages in the IT industry have shot up significantly, being one of the very few drivers of salary growth over there. From some email I got from a friend in Romania it looks like positions which require 3 to 4 years of experience are going for at least 1000-1500 euros a month. Unfortunately for IT professionals some costs (such as the cost of owning a home) have greatly outpaced salary growth…
One of the effects on the locals will be the slow disappearance of in-house IT departments. Many businesses have in-house IT departments covering pretty much the whole spectrum of an IT operation: Oracle DBAs, JEE specialists, sys-admins, etc… Obviously, this cannot be supported anymore because of the migration of IT specialists towards better-paying jobs coming from foreign corporations. The local businesses are left with a problem: who will continue to carry out our in-house, incredibly customized and brittle IT operations?
In the problem lies the opportunity: an entrepreneur could create an IT company that implements these processes (preferably using an open architecture in order to accommodate customers as diverse as possible), have the businesses shed their IT departments (they cannot afford them anyway) and have them outsource their IT operations to his company. Outsourcing in the land of off-shoring seems contrarian, but it is one of the few viable options that a business can use in order to continue to automate its processes (book-keeping, customer relations, etc…).
Open-source could play a very important role in the implementation of an outsourced environment, please remember that SE Europe is dirt poor. Operations Support Systems for Java could answer some questions about an open architecture for creating very diverse and customized implementations of various business processes.
Opportunity lies in providing IT services to home-grown businesses as well as to the off-shoring behemoths that are visiting the area.

1 Comment | Tags: IT in S-E Europe

24 August 2006 - 17:11 IT in South-Eastern Europe

I decided to create this category in order to keep up with the latest developments in that part of the world. I happen to call Romania my home country and I am interested in what happens over there, IT-wise or otherwise.

No Comments | Tags: IT in S-E Europe

20 August 2006 - 22:35 Service Data Objects

I have been reading the specs for the Service Data Objects (here and here) a bit puzzled. The scope of the spec is daunting: “You need to know only one API, the SDO API, which lets you work with data from multiple data sources, including relational databases, entity EJB components, XML pages, Web services, the Java Connector Architecture, JavaServer Pages pages, and more.” One API to rule them all. All laughing aside I really do not understand the effort: don’t we already have JCA, JDO and JPA? Who needs another persistence API? Who needs this persistence API to be platform-agnostic, so that your JEE app can use a .Net app as data repository?
The only interesting thing about it is the disconnected data graph, but I think you could abstract this behavior in JDO or in a DAO. One thing I find odd about this implementation of a disconnected data graph is the fact that it comes with a locking strategy built in (namely optimistic-locking) when it would have made sense to be able to specify this declaratively or at least provide for ways to plug in your own data layer. If you have 2 sales reps working remotely on the same purchase order you should provide for collision detection or resolution if one tries to override the other’s order when it re-commits it.
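The collision between the two sales reps can be illustrated with a version counter, which is one common way to implement optimistic locking. This sketch does not use the SDO API; the `OrderStore` class, its method names and the purchase order data are all assumptions made up for the example.

```python
import copy


class StaleDataError(Exception):
    """Raised when a disconnected copy was changed by someone else meanwhile."""


class OrderStore:
    """Hypothetical in-memory data source with a version counter per order."""

    def __init__(self):
        self._rows = {}

    def insert(self, order_id, data):
        self._rows[order_id] = {"version": 1, "data": dict(data)}

    def checkout(self, order_id):
        """Hand out a disconnected copy plus the version it was read at."""
        return copy.deepcopy(self._rows[order_id])

    def commit(self, order_id, snapshot):
        """Accept the copy only if nobody committed since it was checked out."""
        current = self._rows[order_id]
        if snapshot["version"] != current["version"]:
            raise StaleDataError(
                f"order {order_id} changed since version {snapshot['version']}"
            )
        self._rows[order_id] = {
            "version": current["version"] + 1,
            "data": dict(snapshot["data"]),
        }


store = OrderStore()
store.insert("PO-1", {"quantity": 5})

# Two sales reps take the same purchase order offline.
rep_a = store.checkout("PO-1")
rep_b = store.checkout("PO-1")

rep_a["data"]["quantity"] = 7
store.commit("PO-1", rep_a)  # first commit wins

rep_b["data"]["quantity"] = 9
try:
    store.commit("PO-1", rep_b)  # second commit is detected as a collision
except StaleDataError as error:
    print("collision:", error)
```

Here the policy (reject the stale copy) is hard-wired into `commit`. Being able to swap that policy, for example for a merge step, is the kind of pluggability the paragraph above asks for.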
What I find the worst in this spec is the granularity of access between 2 applications tied together thru SDO. SDO’s purpose is to abstract the data layer to the level that you can plug in any data source, including another application. This in turn means that an application has access to another application’s data layer. Access to an external application’s data layer is very fine-grained access to another application, which is a big no-no. Think just about the amount of coupling that you create between 2 applications, or services if we want to use the buzzword du jour. Access between applications should be coarse-grained, not fine-grained. An application should publish its coarse-grained behavior (which should not be prone to change), not its data layer. A data layer that gets published and used by external applications will evolve with great difficulty because the dependencies involving it are enterprise-wide and not application-wide.
SDO appears as the data layer of a client-server application in a heterogeneous environment, however I am skeptical about how useful or how needed it is. You could say that it is tied to the re-emergence of the rich client, but I am neither sure that this rich client is really needed nor that it needs these capabilities.
I don’t know. The more I read about it, the less I like it. I’ll be watching it with interest, but I am betting against it. The main reason for betting against it is that the effort for implementing SDO is huge (implementing the Data Mediator and the Data Graph in a heterogeneous environment is not trivial at all) while the benefits are really modest. I would be surprised if it achieves any penetration in the enterprise space.

Later Edit: I was reading this article on infoq.com about offline storage in AJAX applications. It looks like we have one more disconnected client type that will need the capabilities of SDO. Will anyone think about porting SDO to an AJAX environment? I hardly think so.

No Comments | Tags: Development

16 August 2006 - 19:37 C programming is not OOP

You are probably familiar with the chorus of C programmers saying "We were doing OOP in C and we were doing a good job of it". If you are not, here is an article about how to program in C using OOP concepts. The article addresses only encapsulation; inheritance and polymorphism are left to better-suited languages like Smalltalk.
They probably did use some OOP constructs in C. However, what they were doing was a process which could not scale. The object-oriented C that was produced was a twisted mangle of code that had a pretty high entry barrier: it was very complex and, I would assume, poorly documented. I assume that it was poorly documented because it was some sort of a hack which was known only among a small group of luminaries at a time when documentation was not considered crucial to the development process. As I said, this process could not scale, it is obvious. Implementing it across a large developer base would have required error-proof coding practices, it would have required strict enforcement of these practices, and this would have been very costly. At the end of it the developer would have been turned into a human compiler who would have spent most of his time making sure that his procedural code was also object-oriented. The sheer complexity of this task would have prevented mass adoption and ultimately stopped the "object-oriented" C’s growth: how could you have implemented more exotic uses of OOP if you had to constantly juggle functions passed as pointers, crazy typedefs and so on? Another roadblock to its success would have been the various conflicts that would have emerged between various groups if this object-oriented C had been successful. Various OOP frameworks would have emerged each with its vocal set of fans and dissenters.
Then Bjarne Stroustrup came along with C++. C++ put the house in order and formalized the language to the extent that development could scale meaningfully. The OOP hack known among 5-10 programmers in a team was suddenly released to the world at large. The workforce could adapt to a standard that was not going to change according to the moods of some gurus. This was a process that could scale and that scaled very well: tens of thousands of developers used C++ for over a decade developing very complex systems.
The group claiming that they were developing in "object-oriented" C suffer from seeing the world thru glasses tinted according to their beliefs. I get the same feeling anytime I hear people abusing the buzzword du jour: "We were doing SOA for years". Not really, sending messages over a CORBA-talking pipe is not SOA.

1 Comment | Tags: Development

13 August 2006 - 20:31 Various uses of RSS

RSS, or Really Simple Syndication or Rich Site Summary, is a popular broadcasting protocol. A collection of items is made available to outside parties by packaging a digest of it and releasing it. The outside parties can determine what items have been added to the feed, which ones were updated, etc… The digest also provides pointers toward the location of the items.
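For readers who have not seen a feed from the inside, here is a small sketch of how such a digest could be packaged with Python's standard library. The element names (`channel`, `item`, `title`, `link`, `guid`, `pubDate`) come from RSS 2.0; the `build_feed` helper, the example.com URLs and the sample entries are invented for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title, link, description, items):
    """Package a collection of items into an RSS 2.0 digest (as a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        # The link is the pointer toward the location of the item.
        ET.SubElement(item, "link").text = entry["link"]
        # The guid lets outside parties tell new items from ones already seen.
        ET.SubElement(item, "guid").text = entry["guid"]
        ET.SubElement(item, "pubDate").text = format_datetime(entry["published"])
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample data.
entries = [
    {"title": "Various uses of RSS",
     "link": "http://example.com/posts/2",
     "guid": "post-2",
     "published": datetime(2006, 8, 13, 20, 31, tzinfo=timezone.utc)},
    {"title": "C programming is not OOP",
     "link": "http://example.com/posts/1",
     "guid": "post-1",
     "published": datetime(2006, 8, 16, 19, 37, tzinfo=timezone.utc)},
]

feed = build_feed("Sample blog", "http://example.com/", "A sample feed", entries)
print(feed)
```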
RSS is pretty popular among the DIY crowd which uses it mostly for keeping in synch with the blogs and sites it follows. Various sites, such as rojo.com, are using it under the hood for enabling users to keep in touch with their blogs. It is creeping into browsers and since IE 7 will have an RSS reader it is a safe bet that this technology is becoming mainstream. I find it interesting that it is becoming mainstream because this is a mindset change: rather than looking for information on the web you wait for this info to be pushed to you. Anyway, I didn’t intend to talk about social changes in this entry.

I am more interested in RSS’s behavior and the way you could apply it to various problems.
One of the first uses would be to use RSS for broadcasting events in a group. Let’s say that you are a MySpace user and want to send a message to a group of people. MySpace could broadcast the message to your group by using RSS: each group member would have access to the group’s feed and any member could write to this feed. The result would be that each member could send a message which is guaranteed to be received by the rest of the group as long as the RSS server is up. Setting up this messaging infrastructure in a web application could be as easy as getting an RSS AJAX client and mapping it to the appropriate feed.
Let’s go to some exotic uses of RSS. The fact that RSS is a platform-agnostic broadcasting protocol means that it could be used for managing the interactions between various services in an SOA environment. To go back to the tax example: you have to apply a service (taxation) across a wide array of applications deployed on different platforms in a financial institution. You could have each of these applications broadcast events (such as a stock has been traded, a commodity has been bought or sold, etc…) in an RSS feed and the tax program would listen to all these events and take appropriate action. The messaging infrastructure would be pretty simple to implement on both sides, I assume that there are quite a few clients written for various platforms.
You could use it for pushing deltas in a cluster if guaranteed delivery is not mandatory. A company could consolidate its messaging infrastructure using RSS: the email client would be replaced with an RSS client that is mapped to a specific feed, various groupware software could work off RSS feeds.

Various client-server interactions could be implemented easily with RSS: I wrote a desktop application that had to connect to a server and get the latest updates for catalogs, accounts, etc… This could have been done very easily using RSS. The changes in a catalog would have been published in an RSS feed and the client would have just had to connect to this feed and get its updates. Gone is the proprietary protocol which cost me a few nights, instead you get a tried, tested and true RSS client and plug it in your application.
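The client side of such a catalog update could be as small as the sketch below, assuming the server publishes each change as an RSS item with a unique `guid`. Fetching the feed over HTTP is left out; the feed is passed in as a string, and the catalog entries and URLs are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed as the desktop client might receive it from the server.
FEED_V1 = """<rss version="2.0">
  <channel>
    <title>Catalog updates</title>
    <link>http://example.com/catalog</link>
    <description>Changes to the product catalog</description>
    <item>
      <title>Price change: desk lamp</title>
      <link>http://example.com/catalog/17</link>
      <guid>change-1</guid>
    </item>
    <item>
      <title>New product: floor lamp</title>
      <link>http://example.com/catalog/18</link>
      <guid>change-2</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return the items not processed yet and remember their guids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid is None or guid in seen_guids:
            continue  # skip items without an id and items already handled
        seen_guids.add(guid)
        fresh.append({"title": item.findtext("title"),
                      "link": item.findtext("link")})
    return fresh


seen = set()
for update in new_items(FEED_V1, seen):
    print("applying", update["title"], "from", update["link"])

# Polling the unchanged feed again yields nothing new.
print(new_items(FEED_V1, seen))
```

The set of seen guids is the only state the client keeps between polls, so it could be written to disk and reloaded on the next start.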
An anti-virus or a patch manager could get a list of the latest virus definitions or patches and their location in an RSS feed and proceed with installation based on the information published in the feed.

I don’t want to be misunderstood. I am not suggesting gutting MOM and replacing it with RSS, I’m suggesting that you consider RSS when you need an infrastructure for passing messages with a certain reliability rather than building it in-house. You could have various parts of your application talking to each other literally in seconds. I hold the conviction that code-writing is an error-prone process which has to be minimized as much as possible. In order to do this we have to start thinking in processes, we have to look at various technologies in an open way, realize their potential and use it.

Later edit: This site seems to be a good resource for RSS-related software.

No Comments | Tags: Development