November 18, 2004
Congratulations to the C-JDBC team!
It is a new example of how the Apache Software Foundation and the ObjectWeb Consortium collaborate to create great Open Source products.
Another great example which has not gotten much attention (because it lacks the cool factor) is the inclusion of Howl in Apache Geronimo and ActiveMQ so that these products can support transaction persistence and recovery.
Another milestone I'm waiting for is the integration of JOTM in Geronimo. Jean-Pierre Laisne and I were at the origin of the change of license of JOTM from LGPL to BSD to make it possible to use JOTM in Geronimo (the ASF does not accept the integration of LGPL code in its codebase). I believed that the change of license was a good opportunity both for JOTM and for the J2EE Open Source community in general, because the ObjectWeb and Apache communities would then be able to contribute to it instead of wasting resources and time duplicating a transaction manager.
To be honest, JOTM has not attracted a lot of contributors yet, but I expect that to change once it's integrated in Geronimo and can reach a wider audience among J2EE developers.
Brian Behlendorf used the term coopetition (cooperation + competition) to describe the Apache and ObjectWeb relationship, and that's quite an accurate description (no need for lawyers between the two, only software layers). In the end, what counts is that both communities gain from collaborating and sharing code.
November 8, 2004
When you call the toString() method on an array in Java, you don't get a useful representation:
Object[] foo = {"bar", "baz"};
System.out.println(foo);
>>> [Ljava.lang.Object;@e0b6f5
I used to create my own representation of the array:
StringBuffer buff = new StringBuffer("[");
for (int i = 0; i < foo.length; i++) {
    if (i > 0) {
        buff.append(", ");
    }
    buff.append(foo[i]);
}
buff.append("]");
System.out.println(buff.toString());
>>> [bar, baz]
But I recently discovered that the Collections Framework already offers this function:
System.out.println(Arrays.asList(foo));
>>> [bar, baz]
The trick is that you rely on the stringified representation of a List
to get the stringified representation of the array.
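Putting it together in a self-contained example (note also that Java 5, just released, adds Arrays.toString, which prints the array directly without the List detour):

```java
import java.util.Arrays;

public class ArrayPrinting {
    public static void main(String[] args) {
        Object[] foo = {"bar", "baz"};

        // The List wrapper's toString() does the work:
        System.out.println(Arrays.asList(foo));   // prints [bar, baz]

        // Java 5 adds a direct helper with the same output:
        System.out.println(Arrays.toString(foo)); // prints [bar, baz]
    }
}
```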
October 27, 2004
Well there was a time when you let me know
What's really going on below
But now you never show that to me do ya
But remember when I moved in you
And the holy dove was moving too
And every breath we drew was Hallelujah
Hallelujah Hallelujah Hallelujah
I like the original one from Leonard Cohen but Jeff Buckley's version is closer to my heart.
October 12, 2004
I often make the same search queries in Gmail (e.g. give me all the mail I sent to a specific mailing list). Unfortunately, these queries are not kept by Gmail and you have to type them every time.
But I found a simple way to make Gmail remember them.
So, if you want to access Gmail queries directly from its search field history, you just have to search for them on Google first (provided you have the Google cookie).
Since the Gmail search field keeps the history of your Google searches, they will be available the next time you search your mail in Gmail.
October 12, 2004
Murphy's Law has been expressed mathematically as
((U + C + I) × (10 − S)) / 20 × A × 1 / (1 − sin(F / 10))
- U for urgency
- C for complexity
- I for importance
- S for skill
- F for frequency
- A for aggravation
So now, I know how to beat it!
No bad luck anymore... ;-)
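Tongue in cheek, the formula is easy to evaluate; here is a small sketch (the 1-to-9 rating scale for each factor is my assumption, not something stated here):

```java
public class MurphysLaw {
    // Evaluates the Murphy's Law score from the formula
    // ((U + C + I) * (10 - S)) / 20 * A * 1 / (1 - sin(F / 10)).
    // U: urgency, C: complexity, I: importance, S: skill,
    // F: frequency, A: aggravation; each assumed rated 1 to 9.
    public static double risk(double u, double c, double i,
                              double s, double f, double a) {
        return ((u + c + i) * (10 - s)) / 20 * a * 1 / (1 - Math.sin(f / 10));
    }

    public static void main(String[] args) {
        // A skilled, calm person doing a rare, trivial task:
        System.out.println(risk(1, 1, 1, 9, 1, 1));
        // A rushed novice on a complex, important, frequent task:
        System.out.println(risk(9, 9, 9, 1, 9, 9));
    }
}
```

Unsurprisingly, raising the skill factor S is the most direct way to beat it.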
October 7, 2004
I often need to compare two files to see their differences.
The UNIX geek in me uses diff --side-by-side
for that. But sometimes I want something fancier, with bright colors and easier scrolling.
Eclipse provides a handy feature to compare two unrelated files (i.e. not two CVS revisions of the same file) but it's quite difficult to find at first.
To compare two files in Eclipse, select both files (Control-click them) and in the contextual menu (right button), choose Compare With > Each Other. Et voila! You get a nicer diff than its CLI counterpart.
October 4, 2004
Last winter, I had the opportunity to work on road traffic supervision software. I was in charge of the design and development of the communication and data aggregation subprojects.
The communication subproject received data from various sources:
- the probes and sensors on the roads, which regularly sent traffic data (number of cars, speed, ...)
- weather information from Meteo France
- various information from partners (cabs, fire department, ...)
When the communication subproject received a piece of information, it dispatched it, depending on its type, to the data aggregation subproject, which computed interesting values (traffic trend, weather trend, ...) based on a set of customizable rules.
Then depending on the result values, the aggregation subproject used the communication subproject to dispatch the information to the supervision center (to alert supervisors), to the archives or to information displays on the roads (Big storm coming!!!).
To design these two parts, I used a simple loosely coupled system where the information flowed as messages and components declared their interest in specific message types. The messages were either XML messages (we also retrieved CSV files and mapped them to XML) or simple MapMessages.
I had a centralized Content-Based Router which, depending on the message type, sent each message to the interested components; they did their work and dispatched the message back to the router once their tasks were finished.
The flow was quite simple and could be expressed using a finite state machine.
The router was also in charge of thread management (the application was heavily multithreaded): if there were not enough threads in the pool, new ones were created. If the pool's maximum size was reached, the messages were queued and waited for free threads to take care of them.
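The routing core can be sketched in a few lines. This is a synchronous, illustrative version (the class and method names are mine, not the project's), leaving out the XML/MapMessage handling and the thread pool:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A content-based router: components subscribe to the message types
// they care about, and the router dispatches each message accordingly.
public class Router {

    // Callback interface for interested components.
    public interface Component {
        void onMessage(String type, Object payload);
    }

    private final Map<String, List<Component>> interests = new HashMap<>();

    // A component declares its interest in a message type.
    public void subscribe(String type, Component component) {
        interests.computeIfAbsent(type, t -> new ArrayList<>()).add(component);
    }

    // Route a message to every component interested in its type.
    public void dispatch(String type, Object payload) {
        for (Component c : interests.getOrDefault(type, Collections.emptyList())) {
            c.onMessage(type, payload);
        }
    }
}
```

A component that finishes its work simply calls dispatch again with the resulting message, which is what lets the flow be modeled as a finite state machine.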
The application was delivered on time and is up and running quite well.
I left for another project but I kept thinking about what I could have done if I had the opportunity to make a second version of this project.
What bothered me about my design was that the thread management was not smart enough. Let's take a simple use case: an accident. In that case, there could be a massive number of incoming messages containing road information (traffic jams are forming, speed is varying quickly, ...). But the priority in that case is to get the data to the supervision center and display it on the roads to prevent other accidents. We don't want to lose time computing values while we have such an important message to communicate. However, the router was not smart enough to prioritize this type of message.
Another interesting use case was weather variation: chances are that if the weather degrades quickly (and we computed this trend), the road traffic will be affected some time later. But the application was not smart enough to know that if the weather was degrading, it should increase its thread pool size to be ready for a huge data acquisition.
What's more, there are regular peaks of activity in the morning and the afternoon, when people drive to work or back home. But the router was not smart enough to increase its thread pool to be ready for these regular peaks.
Enter SEDA. I first heard about SEDA from John Beatty on his weblog, in a post about removing buttons and knobs from servers with a SEDA architecture (incidentally, I was at HPTS when David Campbell talked about SQL Server tuning, but I don't think he mentioned SEDA, did he?).
If I had the opportunity to write a second version of this project, I'd go for a SEDA architecture style.
The main advantage I see in using a SEDA style would be to have separate thread pools for the different components (incoming communication, aggregation, outgoing communication) and to be able to tune them in response to the behavior of the system. Taking the weather degradation use case, it'd be possible to increase the incoming communication thread pool when the weather is degrading.
In the same way, in case of an accident (which we know about from incoming messages), we could prioritize the outgoing communication pool to make sure the most important thing gets done first: giving the information to the competent people.
The thread pools could tune themselves (for example to be ready for the morning and afternoon peaks) and could also interact with each other (the aggregation pool could increase the incoming pool if the weather is degrading).
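A stage in that style could look roughly like this sketch built on java.util.concurrent; all names are illustrative, and a real SEDA stage would also carry its event handler and admission control:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A SEDA-style stage: each stage owns a bounded event queue and its own
// thread pool, and the pool can be resized at runtime so that another
// stage (or a rule engine) can grow it when load is anticipated.
public class Stage {

    private final ThreadPoolExecutor pool;

    public Stage(int coreThreads, int maxThreads, int queueCapacity) {
        // Events queue up when all threads are busy, up to queueCapacity.
        pool = new ThreadPoolExecutor(coreThreads, maxThreads,
                60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
    }

    // Enqueue an event for this stage.
    public void submit(Runnable event) {
        pool.execute(event);
    }

    // Called by a controller or another stage, e.g. the aggregation
    // stage growing the incoming stage when the weather degrades.
    // (Order is for growing; shrinking would set the core size first.)
    public void resize(int coreThreads, int maxThreads) {
        pool.setMaximumPoolSize(maxThreads);
        pool.setCorePoolSize(coreThreads);
    }

    // Backlog size: a natural health indicator for self-tuning rules.
    public int backlog() {
        return pool.getQueue().size();
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

The backlog of each stage gives the rules something concrete to watch: a growing backlog on the incoming stage is exactly the signal that should trigger a resize before the peak hits.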
The application would then behave better and, more importantly, degrade better.
The more I think about it, the more I believe that we can significantly raise the self-awareness of our applications. Using SEDA and a good set of rules, we can make our applications smarter and self-tunable. David Campbell had a good point: most of the properties we set for our applications are not accurate, and the applications would be better than us at finding the correct values (while we can only guess them). Applications could be more aware of their health and status, and could know how to degrade nicely if they knew how to prioritize their work. SEDA makes it possible to get such behavior while keeping it separate from the business code.
Since that project, I haven't had the opportunity to work on anything similar, but I'd like to give SEDA a try on a new project to validate that we can design and build better applications by raising their self-awareness.
What do you think? Have you had the opportunity to work on a SEDA project? What was your experience?
October 4, 2004
Google has updated Gmail and added a few features such as inline contacts and an Atom feed (more details).
On my account I saw two other great features:
- the ability to save drafts
- mail forwarding
Draft saving was a feature I was eagerly awaiting (I've been bitten too many times writing a mail in Gmail and accidentally closing the tab...)
October 1, 2004
Recently, I stumbled on some deprecated code whose author had taken deprecation a little too seriously.
I was in the process of upgrading a dependency on an internal project we were using. The modifications were listed at the bugfix level only, which meant there should have been no changes to the project's API.
I saw that some public methods I used were marked as deprecated and I wrote on my todo list to refactor my code to use the new public methods later. It was not a priority task and it could have waited for the next release.
Unfortunately, the developer was too extreme in his understanding of API deprecation: the deprecated methods had been gutted and now always returned null.
Of course, all my tests went crazy after I upgraded the project.
I had an interesting discussion with the developer afterwards. Basically his argument was that these methods were deprecated with good reasons and shouldn't be used anymore because they were not safe.
I understood his point but I had two objections:
- deprecated methods shouldn't be used, but they must still be usable
- deprecating methods and modifying them in the same release is not friendly to the project's users, because it forces them to refactor their code without any delay.
His project was no longer backwards compatible, even though the release was supposed to contain only bug fixes. He defined backwards compatible as "the API signature did not change". But IMHO, the fact that his methods always return null breaks the contract of the API. Broadly, I see the contract of an API as more than just the sum of its public method signatures; it also includes the preconditions and postconditions associated with the methods. And one of the (implicit) postconditions of these methods was that they did not return null.
Sun defines deprecation as:
A deprecated API is one that you are no longer recommended to use, due to changes in the API. While deprecated classes, methods, and fields are still implemented, they may be removed in future implementations, so you should not use them in new code, and if possible rewrite old code not to use them.
I told the developer that deprecating the methods and modifying them within the same release is of no use. It breaks the API just as surely as if the methods had been plainly and simply removed, because I had no choice but to remove the calls to the deprecated methods immediately. It offered only the illusion of backwards compatibility.
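The friendly pattern, sketched on a hypothetical API, is to keep the deprecated method fully functional by delegating to its replacement, and only remove it in a later major release:

```java
// Hypothetical API illustrating friendly deprecation: the old method
// stays fully functional by delegating to its replacement, so callers
// can migrate at their own pace before the next major release.
public class Parser {

    /**
     * Old entry point, kept working.
     *
     * @deprecated use {@link #parse(String)} instead; this method will
     *             be removed in the next major release.
     */
    @Deprecated
    public String parseString(String input) {
        return parse(input); // same behavior as before, no gutted body
    }

    /** New, preferred entry point. */
    public String parse(String input) {
        return input.trim();
    }
}
```

The compiler then warns callers of parseString without breaking them, which is exactly the gentle pressure deprecation is meant to apply.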
When you have a public API that evolves, deprecation is an invaluable tool, but you still have to use it carefully so that your users can upgrade their code at their own pace, not when you want them to. If you want a clean break in your API, the release version of your project should reflect that big change and it should come with a big warning flag.
I'm interested to know how you handle deprecation of your API project, especially in the Open Source Software world.
A useful resource concerning deprecation is How and When To Deprecate APIs.
October 1, 2004
Reading my newsfeed this morning, I saw the Google Labs Aptitude Test. Question 12 is "What is the most beautiful equation ever derived?".
To me the most beautiful equation (derived or not, I don't care) is without any contest
e^(iπ) + 1 = 0
This equation is not a mathematical expression, it's pure poetry:
- You've got e and π, two numbers we are not able to know (i.e. to compute exactly) but which can still be found everywhere in our universe.
- You've also got i, which exists only in our imagination but can still accurately describe phenomena happening in nature.
- 1, which is the beginning of many, and finally 0, which exists only to describe what does not exist.
When I explain to friends what I like so much in mathematics (and to a lesser extent in computing), I show them this equation and rewrite it as:
1 = −e^(iπ)
It then shows that something as simple to apprehend as 1 can also be viewed as something as complex and cryptic as −e^(iπ). They are exactly the same thing, yet they are two different entities, seemingly unrelated except by the power of our imagination and intelligence.
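You can even check the identity numerically through Euler's formula, e^(iθ) = cos θ + i sin θ (the imaginary part only vanishes up to floating-point rounding):

```java
public class EulerIdentity {
    public static void main(String[] args) {
        // Euler's formula: e^(i*pi) = cos(pi) + i*sin(pi)
        double realPart = Math.cos(Math.PI);      // ≈ -1.0
        double imaginaryPart = Math.sin(Math.PI); // ≈ 1.2e-16, i.e. 0 up to rounding
        System.out.println("e^(i*pi) = " + realPart + " + " + imaginaryPart + "i");
    }
}
```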
Simplicity and complexity are sometimes different facets of the same entity, and they may differ only in how you look at it.