29 July 2009 - 14:42
Old and new media

For the past two weeks I have been staying at home without a computer and without cable TV. I took this opportunity to see how my grandparents would have got their news, and read newspapers throughout this period. When I went back to work and to a computer I saw that I had not missed much: the newspapers covered all the major events of this short period (ethnic unrest in China, a continuous drop in equity markets and the subsequent rebound, etc.). The number of news items that reached me through this channel was a lot smaller than what I usually get, but I found that the coverage was much better and the subjects were treated in depth.

I think that the main difference between old media and new media is that old media is delivered through a much narrower channel, and that the width of this delivery channel largely dictates the format and the news that reach you.

Your typical daily is a newspaper with a limited number of pages. The number of pages is large enough to let the daily cover several sections, but small enough to keep the number of sections in check. A typical daily ends up having Local, National, International, Business and Sports sections.
The limited number of pages means that stories compete with each other to get printed. The result is that a daily covers only the issues which are the most important to its readership. Some print dailies try to compensate for the limited number of articles they can offer with the quality of those articles (this partly supports the case for print media moving into the luxury goods category, because of the scarcity of items its format can deliver when compared with new media).
The limited number of pages also means that journalists compete with each other to get published, and this further enhances the quality of the newspaper. Print media with a large supply of journalists (New York Times, Washington Post, Los Angeles Times, etc.) tends to print articles of very good quality, as only the best journalists get published.

New media changes all this: the very wide channel along which content gets pushed to readers increases the diversity of the content while it diminishes its quality. The only constraint that I see on content consumption in new media is not the scarcity of supplied content, but rather the opportunity cost of consuming a particular piece of content: while you are consuming one piece of content you are forgoing another.
I think that opportunity cost will define the way content gets consumed in new media, and that new media organizations will have to dedicate more resources to marketing themselves, because the competition between them will be far fiercer as the typical boundaries between content sources disappear (*).

* One example of a boundary between content sources which disappears and generates competition is the geographic distribution of newspapers: the New York Times and the Los Angeles Times were not competing with one another previously (each was distributed in a different city, so they competed only with local newspapers), but they now compete with each other because both are distributed via the web.

Tags: Miscellaneous

2 July 2009 - 16:47
Key-value stores and relational databases

Relational databases are going through a rough patch right now, with some pundits going as far as writing them off. Their main problem is that they are typically constrained to run within a single physical box (*), so scaling out is pretty hard. Once your datasets reach a certain size you will probably need to look beyond the typical relational database for other ways to store your data.

One type of alternative datastore that is getting a lot of attention is the distributed key-value store, which maps keys to values and assigns the keys to multiple storage nodes according to a hash function (**). In such a setting you get an object through its key, you work with it and then you save it back in the datastore. The fact that this configuration scales out easily (***) makes this kind of datastore very appealing, and if you are working with large datasets you will probably have no choice but to use something like it.
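
To make the idea concrete, here is a minimal sketch in Python of keys being assigned to nodes by a hash function. Everything in it is an assumption made for illustration: the `PartitionedStore` class, the in-memory dicts standing in for storage nodes, and the choice of MD5 with a simple modulo. It is not how any of the products mentioned in this post is implemented.

```python
import copy
import hashlib


class PartitionedStore:
    """Toy key-value store that spreads keys over several in-memory nodes."""

    def __init__(self, node_count):
        # Each "node" is a plain dict standing in for a storage server.
        self.nodes = [{} for _ in range(node_count)]

    def node_for(self, key):
        # Hash the key and map the digest onto one of the nodes.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key, value):
        # Store a copy, as a real store would store a serialized value.
        self.nodes[self.node_for(key)][key] = copy.deepcopy(value)

    def get(self, key):
        # Hand back a copy so changes only persist after an explicit put().
        return copy.deepcopy(self.nodes[self.node_for(key)].get(key))


store = PartitionedStore(node_count=4)
store.put("customer:42", {"name": "Alice", "balance": 10})

# Get the object through its key, work with it, then save it back.
customer = store.get("customer:42")
customer["balance"] += 5
store.put("customer:42", customer)
```

A plain modulo keeps the sketch short, but it remaps most keys whenever the number of nodes changes, which is why real distributed stores tend to use schemes such as consistent hashing instead.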

The transition from relational databases to key-value stores will probably involve taking relations out of your application, re-modelling the application so that related data is grouped together, and streaming data to a reporting database. Relational databases gave the user both data storage and relations between entities, which was both a blessing and a curse (relations made reporting easier, but they also gave unchecked access to data, which allowed all sorts of corners to be cut). Key-value stores take relations out: data access is more local, you typically have access only to what is immediate to a particular entity, and you cannot perform queries spanning your entire data model. This type of data access could actually turn out to be a good thing, because it will force an application to have a cleaner design: you will not be able to rely on monster SQL statements to compensate for design deficiencies.
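
As a rough illustration of what grouping data together can look like, the Python sketch below stores an order and its lines as a single value under one key. The order structure, the `order:<id>` key format and the `order_total` helper are all made up for this example, and a plain dict stands in for the key-value store.

```python
import json

# A plain dict stands in for the key-value store; values are JSON strings,
# as they might be once serialized for storage.
store = {}

# Relational modelling would split this into an orders table and an
# order lines table joined by order id. Here the lines travel with the order.
order = {
    "id": 1001,
    "customer_id": 42,
    "lines": [
        {"sku": "A-1", "quantity": 2, "unit_price": 5.0},
        {"sku": "B-7", "quantity": 1, "unit_price": 12.5},
    ],
}
store["order:1001"] = json.dumps(order)


def order_total(store, order_id):
    # One lookup returns everything needed; no join is required.
    order = json.loads(store[f"order:{order_id}"])
    return sum(line["quantity"] * line["unit_price"] for line in order["lines"])


total = order_total(store, 1001)
```

The trade-off is the one described above: anything about a single order is one lookup away, but a question such as "which orders contain a given product" has no cheap answer without scanning every value.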

In an application which needs to handle massive amounts of data, relations between data will be delegated to the activities which require them (reporting is a pretty good example), and data will be streamed from the key-value stores to the relational databases where these activities take place.
All in all, a pretty good division of tasks: applications with a (hopefully) cleaner model rely on key-value stores, which stream data to relational databases for reporting.
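
The division of tasks can be sketched with Python's built-in `sqlite3` module standing in for the reporting database. The sample data, the `orders` table and the `stream_to_reporting` function are assumptions made for illustration, and the sketch copies everything in one pass where a real pipeline would more likely stream changes incrementally.

```python
import sqlite3

# Hypothetical key-value data: one aggregate per key.
kv_store = {
    "order:1001": {"customer_id": 42, "total": 22.5},
    "order:1002": {"customer_id": 42, "total": 7.5},
    "order:1003": {"customer_id": 7, "total": 30.0},
}

# An in-memory SQLite database plays the role of the reporting database.
reporting_db = sqlite3.connect(":memory:")
reporting_db.execute(
    "CREATE TABLE orders ("
    "order_key TEXT PRIMARY KEY, customer_id INTEGER, total REAL)"
)


def stream_to_reporting(kv_store, db):
    # Copy each value into a relational row; re-running the copy simply
    # overwrites rows that already exist.
    rows = [(key, v["customer_id"], v["total"]) for key, v in kv_store.items()]
    db.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    db.commit()


stream_to_reporting(kv_store, reporting_db)

# A query spanning all orders, which the key-value store cannot answer
# without scanning every value.
report = reporting_db.execute(
    "SELECT customer_id, SUM(total) FROM orders "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()
```

The application only ever reads and writes whole values by key, while the aggregate query runs against the relational copy.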

None of the above means that relational databases will disappear: the vast majority of applications do not need to process the massive amounts of data which render your typical database impractical. Relational databases will be around for quite a while.

* Obviously, there are database clusters (such as Oracle RAC), but they are pretty costly.

** Examples of key-value stores are Google’s Bigtable, Amazon’s Dynamo and IBM’s WebSphere eXtreme Scale.

*** Actually, there are constraints attached to it. To get the best performance you need to model your application so that data access occurs within a single node, in order to avoid a distributed transaction spanning multiple nodes. Please read this article by Billy Newport from IBM for a better understanding.
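
One common way to keep data access within a single node is to let only part of the key decide placement. The Python sketch below assumes a made-up key convention, `<customer id>/<entity>`, and a fixed cluster of four nodes; neither comes from the article mentioned above.

```python
import hashlib

NODE_COUNT = 4  # assumed cluster size


def node_for(partition_key):
    # Same idea as hashing the whole key, applied to the partition key only.
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NODE_COUNT


def partition_key(key):
    # Hypothetical convention: "<customer id>/<entity>", e.g. "42/order:1001".
    # Only the part before the slash decides placement.
    return key.split("/", 1)[0]


keys = ["42/profile", "42/order:1001", "42/order:1002"]
nodes = {node_for(partition_key(k)) for k in keys}
```

Because every key of customer 42 shares the same partition key, `nodes` holds exactly one node, so an update touching the profile and both orders stays local to it.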

Tags: Development, Management