Sunday, March 28, 2010

In short: The advantage of no power

In the second week of this month I went to Munich again, because working in the office together with Jens and Manuel is much more fun than sitting there alone.
Somehow I forgot the USB cable I usually use to charge my cell phone, so it ran out of battery after a few days. Since then I have been astonished at how well I can live without it. The reason might be that I mainly used my cell phone as an (alarm) clock with telephone/SMS functions. At first I used KAlarm to wake me up with my favourite music, but this wasn't very energy friendly (okay, I want a beach right in front of my apartment in Dortmund, but not THAT fast :P), so I bought a normal alarm clock which can receive the atomic clock time signal transmitted from near Frankfurt (its range is over 1500 km, so you might be able to receive it as well). The point was that if I travel around, putting batteries in and out, I don't want to have to set the time again each time. It cost about 8 €, which is a fair price.

Since the battery has no power, I have it instead, meaning total control over when I call, whom I call and so on.
This is real freedom! I just love it :)
Maybe I should buy a wristwatch and put the cell phone away (nearly) forever... (and the secret services would not be able to trace me anymore... Big Brother is watching you ;) )

Donations, plus a few words about Jesus and religion

Yesterday I had one of my kind moments and donated a one-month Last.fm subscription to a good mate in New Zealand, and I was happy about doing so. Not that I am not a friendly and nice person anyway, but I love to make other people happy.
Today I was thinking about what to write in my next article and I thought "you always wanted to donate some money to a good cause".
So, why didn't I do it before? Let's put that question aside for now.

The first time I thought about donating money was about 2 years ago. I was in 12th grade at a German high school, and in religion class we read a really awesome book called "Was würde Jesus heute sagen" ("What would Jesus say today") by a German politician called Heiner Geißler. I learned many things from this book and I am thankful that I read it.

Sunday, March 21, 2010

Testing the next generation Last.fm client

Note: The software shown is in pre-alpha status and might change at any time.

This morning I was just wondering what to do with this wonderful day. So I had a great breakfast with some boiled eggs, a cup of coffee and some corn flakes, watched some TV and took a bath afterwards. But what to do then? Start a new Java project? Uhm, I coded all week long, but then again the sun is not shining that much outside.

So let's install the current Last.fm client from GitHub. At this point, a big green plus to the programmers at Last.fm for putting the whole Last.fm desktop software in the git repository (which includes the radio player, too) under the GNU General Public License. That's awesome!
Looking at the network graph, I saw that eartle has the most recent version. As I am not familiar with git yet, I launched a console, typed "man git" and read the git manual.
Afterwards I did
mkdir ~/lastfm
cd ~/lastfm
git clone git://github.com/eartle/lastfm-desktop.git
So far, so good. When I tried the normal way of compiling (./configure && make), I ran into some errors.
I read the READMEs, installed the missing packages and additionally got liblastfm via "git clone git://github.com/eartle/liblastfm.git". I compiled the latter (you have to install some additional packages) and gave the Last.fm client another try. It failed with an error that "boost" could not be found. I googled a bit and did
sudo aptitude install libboost-dev
You might have to install boost-build, too. I tried again and it failed again, this time because of something called "yajl" (Yet Another JSON Library). Too bad the package is only included in the new Ubuntu 10.04, and I did not want to wait a month for the next version, so I downloaded the source via
git clone git://github.com/lloyd/yajl
I compiled it, then switched to the lastfm-desktop folder and compiled successfully:
cd ~/lastfm/lastfm-desktop
./configure
make
Then I tried to launch "./_bin/radio", but it could not find some libraries. I guess the way I fixed it was really dirty (I copied the required libraries to /usr/lib), and I will undo that and try a better way next time.


The client

From the console, I ran
cd ~/lastfm/lastfm-desktop/
_bin/radio -stylesheet app/radio/radio.css &
Main window
It has everything to satisfy your needs. Looks quite simple, but is very effective in functionality.
Hybrid stations
You can mix up to three different stations, like your recommendations, Meat Loaf's similar artists and the contents of a playlist. Awesome feature!
Hybrid stations advanced
If you check "Show options" you can connect your selected stations with words like 'and', 'or' and 'not', so you could do something like "play my radio station and the chillout tag, but not Lady Gaga".
Now playing
Shows the currently playing track, has all needed buttons. Tag and share are inside the dropdown arrow at the lower right corner.
Station settings
Playback options
More popular or more obscure? You decide. You can select not only the popularity of the artists you want to listen to, but also how often they should repeat (in percent).
The tagging window
The tagging window is awesome: you can drag tags directly into the text box on the right. Too bad it currently always displays the wrong track.
About window
So don't tell me I'm telling you fairy tales ;)

One word on functionality before finishing: the client is still pre-alpha. Playback often just stops, stays silent and then continues after a couple of minutes.

As I am a curious guy (especially regarding the API), I wondered about those lastfm://rql/somelongstring URLs in the console output and took a look at the source (hooray for the GPL :) ). What did I discover? It's the station URL for the hybrid stations...

Sunday, March 14, 2010

Testing Java HTML parsers

A few weeks ago I had to code a data export that needed valid XHTML output, so I tested the speed of a bunch of Java HTML parser/cleaner libraries.
Jens suggested that I publish the results here, and I thought that was a really great idea.
First, I'd like to present each library and hopefully give some useful information on it. "Maven" indicates whether the library can be found in the Maven repositories.

Jericho HTML-Parser

License: Eclipse Public License/LGPL
Maven: Yes
Has many features, like recognizing PHP tags and is easy to use.

JTidy

License: MIT License
Maven: Yes, but only the "old" builds
JTidy is tiny and pretty fast, and can output well-formed XHTML.
It has bad internal exception handling (lots of empty catch blocks!).

HTMLCleaner

License: BSD License
Maven: No
DOM based, supports XPath (really cool). Has a good bunch of configuration options.

NekoHTML

License: Apache Software License
Maven: Yes
Good, fast, and seems to be well known.

TagSoup

License: Apache 2.0
Maven: Yes
Parses HTML and provides a SAX interface. The entry class is "Parser", to which a custom SAX handler can be passed.
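
To make the "custom SAX handler" part a bit more concrete, here is a minimal sketch of such a handler that collects the href of every <a> element. The class name AnchorHandler and its helper methods are made up for illustration. To keep the example free of extra dependencies it is driven by the JDK's built-in SAX parser on well-formed input; the assumption is that the same handler could be handed to TagSoup's Parser through the standard XMLReader interface, which is not shown or tested here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX handler that remembers the href of every <a> element it sees. */
final class AnchorHandler extends DefaultHandler {
    private final List<String> hrefs = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // qName is checked because localName is empty when the reader is not namespace-aware
        if ("a".equalsIgnoreCase(qName)) {
            String href = attrs.getValue("href");
            if (href != null) {
                hrefs.add(href);
            }
        }
    }

    List<String> getHrefs() {
        return hrefs;
    }

    /** Runs the given SAX reader over the markup and returns the collected hrefs. */
    static List<String> collect(XMLReader reader, String markup) {
        AnchorHandler handler = new AnchorHandler();
        reader.setContentHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(markup)));
        } catch (IOException | SAXException e) {
            // report the failure instead of silently returning an incomplete list
            throw new IllegalStateException("parsing failed", e);
        }
        return handler.getHrefs();
    }

    /** The JDK's own SAX reader; unlike an HTML cleaner it only accepts well-formed XML. */
    static XMLReader jdkReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }
}
```

Calling AnchorHandler.collect(AnchorHandler.jdkReader(), markup) returns the hrefs in document order; anchors without an href attribute are skipped.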

HTML Parser

License: GPL

HotSAX

License: LGPL
HotSAX looked pretty fast, but according to its homepage it is still in the pre-alpha stage, so it was not useful for my task.

Java Swing HTML parser

Comes with Sun Java.
XHTML is a stricter form of HTML 4.01, but this parser only supports HTML 3.2, so it was out of the question for my purposes. I just wanted to mention it here.
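
For the curious, the task of selecting all <a> tags could look roughly like this with the Swing parser. This is only a sketch and was not part of the measured tests; the class name SwingAnchorExtractor and the method extractHrefs are made up, and only the href values are collected.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

/** Hypothetical helper that lists the href of every <a> start tag in an HTML string. */
final class SwingAnchorExtractor {

    static List<String> extractHrefs(String html) {
        final List<String> hrefs = new ArrayList<>();

        // The parser reports each start tag to this callback while it reads the input.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        // copy the value right away, the attribute set belongs to the parser
                        hrefs.add(href.toString());
                    }
                }
            }
        };

        try {
            // third argument: ignore charset declarations found inside the document
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hrefs;
    }
}
```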

Cobra: Java HTML Renderer and Parser

License: LGPL 2.1
The major plus of this one is that it is capable of parsing JavaScript and CSS, too. The browser is a good start (my admiration for that project!) although it fails all Acid tests. Nevertheless, that says nothing about the parser's quality.
One con is that this library is really slow.

Mozilla Java HTML-Parser

License: Mozilla Public License 1.1 (MPL 1.1)
The setup is not really suitable for a team of several developers, so it dropped out of the test selection.

Test results

The task was to load a predefined, really erroneous HTML document and select all <a> tags.
I used JUnit tests for each parser/cleaner and took the measurement ten times; the first run was skipped because it includes the compilation time.
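
The measuring procedure described above can be sketched as follows. This is not the original test code: the class name TimingStats, its method names and the choice of the sample standard deviation for the "deviation" value are assumptions.

```java
/** Hypothetical helper for timing a task several times and summarising the result. */
final class TimingStats {

    /** Runs the task the given number of times and returns each duration in milliseconds. */
    static double[] measure(Runnable task, int runs) {
        double[] timings = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            timings[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return timings;
    }

    /** Returns {mean, sample standard deviation}, ignoring the first (warm-up) run. */
    static double[] summarize(double[] timings) {
        int n = timings.length - 1; // number of runs that count
        if (n < 2) {
            throw new IllegalArgumentException("need at least three runs");
        }
        double sum = 0;
        for (int i = 1; i < timings.length; i++) {
            sum += timings[i];
        }
        double mean = sum / n;

        double squaredDiffs = 0;
        for (int i = 1; i < timings.length; i++) {
            double diff = timings[i] - mean;
            squaredDiffs += diff * diff;
        }
        double deviation = Math.sqrt(squaredDiffs / (n - 1));
        return new double[] {mean, deviation};
    }
}
```

With ten runs, TimingStats.summarize(TimingStats.measure(task, 10)) averages the last nine of them.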
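| Rank | Name | Time (ms) | Deviation (ms) |
|---|---|---|---|
| 1 | HTMLCleaner | 95 | ±18 |
| 2 | HotSAX | 124 | ±19 |
| 3 | JTidy | 158.3 | ±17 |
| 4 | Jericho HTML | 150 | ±59 |
| 5 | NekoHTML | 380 | ±44 |
| 6 | TagSoup | 439 | ±50.5 |
| 7 | Cobra | 675 | ±100 |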


JTidy is listed before Jericho HTML because it had the better deviation. I first used HTMLCleaner, because its lead in time was really big. The problem was that it couldn't handle some of the real input data. HotSAX was pre-alpha (although its results are very good), so JTidy was my next choice, as I needed reliability. I have not had a single problem with it; it works really well.
The last point to make is that the results of the Cobra parser are very bad...

Jens is working on a website where I can provide the testing source code; I will put a link here when it is online.
If anyone is interested in more detailed statistics, just contact me and I'll put them here.

As someone has recently begun to work on JTidy again, I'll try the SVN version soon and tell you the results in another post, promise! I hope they have improved the exception handling.
Have a look at this piece of code:
public Node parse(InputStream in, OutputStream out)
{
    Node document = null;

    try
    {
        document = parse(in, null, out);
    }
    catch (FileNotFoundException fnfe) {}
    catch (IOException e) {}

    return document;
}

That's gruesome, isn't it?
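
A less gruesome variant would simply let the caller see the problem. The following is only a sketch of that idea, not JTidy code: SafeParser and its DocumentSource interface are hypothetical stand-ins for the library's internal parse method.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical wrapper that reports I/O problems instead of hiding them. */
final class SafeParser {

    /** Hypothetical stand-in for the library's internal parse method. */
    interface DocumentSource<T> {
        T parse(InputStream in, OutputStream out) throws IOException;
    }

    /**
     * Delegates to the source and declares IOException instead of swallowing it,
     * so a failed read can no longer be mistaken for an empty (null) document.
     */
    static <T> T parse(DocumentSource<T> source, InputStream in, OutputStream out)
            throws IOException {
        if (in == null) {
            throw new IllegalArgumentException("input stream must not be null");
        }
        return source.parse(in, out);
    }
}
```

The caller then has to decide what to do with a broken stream, which is exactly the decision the empty catch blocks took away.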

Sunday, March 07, 2010

Icons, Limit by Frank Schätzing

Today I tried to create some nice icons for Twitter and for "bookmark this page": one with a bird, two with t's and one with a star inside. The star especially was kind of difficult, as the icon was 20x20 pixels. If you try to center the star by its top peak, you end up one pixel too far to either the left or the right, as you would need an odd number of pixels to do so (if the star's topmost pixel is to be centered).
While trying to draw a star manually I happened to google "gimp star" and found the Gfig filter, which can create some really nifty shapes using vectors, and a "create-a-star" button was there, too. Currently there is one major drawback: the lack of a zoom function. I tried using KMag (KDE Magnifier) and it was useful, but if you try drawing on 20 px, the mouse cursor covers large parts of the image. Luckily there's already a ticket on GIMP's bug tracker :)
The result was acceptable, but I had to solve the centering problem and make some corrections. The solution was quite simple: I took the raw shape of the star and made the peaks 2 pixels wide instead of one, so its peaks became rounded.
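
The centering problem is plain parity arithmetic, which a few lines of Java can illustrate. This is just a sketch; the class PixelCentering and its methods are made up for this example.

```java
/** Hypothetical helper showing why a 1 px peak cannot be centered on an even-width icon. */
final class PixelCentering {

    /** Returns {left, right} free pixels when a shape is placed as centrally as possible. */
    static int[] margins(int canvasWidth, int shapeWidth) {
        if (shapeWidth < 0 || shapeWidth > canvasWidth) {
            throw new IllegalArgumentException("shape must fit on the canvas");
        }
        int free = canvasWidth - shapeWidth;
        int left = free / 2; // integer division rounds down
        return new int[] {left, free - left};
    }

    /** True if the same number of pixels remains on both sides of the shape. */
    static boolean fitsExactly(int canvasWidth, int shapeWidth) {
        int[] m = margins(canvasWidth, shapeWidth);
        return m[0] == m[1];
    }
}
```

On a 20 px wide icon a 1 px peak leaves 9 pixels on one side and 10 on the other, while a 2 px peak leaves 9 on each side.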

After breakfast I went to my grandparents, and because these visits can sometimes take longer than expected, I took along the book I recently bought in Munich. It's called "Limit" by Frank Schätzing and only cost about 26 euros for 1300 pages. I have only read ~150 of them so far and, truly, this book is awesome! The main plot takes place in 2025 and is about harvesting He-3 for power generation using nuclear fusion. As always, the facts Schätzing presents in his book are very well researched (plus a realistic extra candy of science fiction, like in "The Swarm"), and the book respects recent political developments like Barack Obama supporting green energy.
This book is really awesome! Currently no English translation seems to be available at amazon.com that I could link to, too bad.