Archive for the ‘Java’ Category

Bookmark Cleanup #2

Tuesday, April 15th, 2008

More links from my bookmarks. Mostly Java today.

More links coming soon.

Hyperic SIGAR

Monday, February 11th, 2008

Hyperic SIGAR (System Information Gatherer and Reporter) is a cross-platform, cross-language library and command-line tool for accessing operating system and hardware information in C, Java, Perl and C#. SIGAR is licensed under the GPL version 2. Not quite sure what this implies for Java projects, but anyway.

Just spotted this library as part of the upcoming GridGain 2.0 release.

Distributed File Systems

Saturday, February 9th, 2008

I’ve been thinking about the best way to configure a bunch of computers for doing large-scale machine learning experiments. One problem that always pops up is how to get some piece of the data to the node that needs to process it (a mapping in the Map Reduce framework).

You can cook up various schemes to distribute the data, but in the end I don’t think anything is going to beat the simplicity of a shared file system. However, when your cluster starts getting big and your data starts getting large, you start running into problems with traditional shared file systems like NFS (contention mostly). This leads one to consider a truly distributed file system.

It should come as no surprise that Google has the Google File System. I think many of the amazing things the people at Google are able to do can be attributed to the fact that they have their map-reduce and distributed file system infrastructure properly sorted out.

For the rest of us, there’s Hadoop, which is nice, but still not quite as easy to use as I’d like it. Ideally, I want to install the latest version of my Linux distribution or run a setup program on Windows and it should just work. No mess, no fuss. On Windows I want to see my distributed file system as a drive letter (or as a directory on Linux): this makes it easy to make legacy applications (C++ programs, MATLAB scripts, etc.) operate on your data. Along these lines, Hadoop has something called Pipes which could be used in some cases, but ideally I want the fact that I’m operating on distributed data to be completely transparent to my applications.

Here OpenAFS is showing some promise. It seems some guys are working on an IFS driver for OpenAFS (see OpenAFS for Windows Requested Features and Road Map). IFS looks like the right way to integrate a new file system with the Windows platform. Last I checked, Hadoop didn’t support all the functions of a general purpose file system, but maybe it could still be integrated with IFS to give it a really nice interface for Windows users. I don’t know what OpenAFS does on Linux, but I’m assuming it works nicely there already. I should investigate…

I mention Hadoop and OpenAFS, since they seem to be the only candidates in the list of distributed file systems on Wikipedia that appear to be free, properly maintained and generally useful.

Once you have your data sorted out, you still need to distribute your computation across the nodes in your cluster. I’ll discuss that in another post.

By the way, the Hadoop folks recently created a subproject called Mahout, that is focusing on building distributed implementations of various machine learning algorithms, following the ideas published in Map-Reduce for Machine Learning on Multicore.

Jar Jar Links and One-JAR

Tuesday, February 5th, 2008

Java links of the day: Jar Jar Links (jarjar) and One-JAR. I’ve used jarjar before, but I’ve run into some bugs when trying to bundle certain libraries. The JRuby project have also had this issue. I guess jarjar’s maintenance went south after Google bought Tonic Systems…

Java Native Access (JNA)

Friday, February 1st, 2008

I’ve been meaning to write something about Java Native Access (JNA), but I’ve been too busy actually using it! According to the JNA site, “JNA provides Java programs easy access to native shared libraries.” Python folks have had the same functionality in ctypes for a while now.

I’ve been using JNA for about 9 months for code related to my master’s thesis. I’ve built Java code on top of native libraries for BLAS (mostly Intel MKL for now), HDF, PRIMME and MATLAB.

JNA’s future is looking bright. It provides an easy-to-use alternative to JNI. The JNA maintainer, Timothy Wall, is extremely active on the mailing list. Even the JRuby folks are catching on.

Java compiler bugs

Friday, February 1st, 2008

During the work on the Java code related to my master’s thesis, I’ve run into two bugs in Sun’s Java compiler (using Java 6).

The first bug was Bug ID: 6570761 Possible generics regression - inconvertible types. I’ve since changed my design so that this bug no longer affects me, but it was annoying nonetheless.

The second bug has been reported by others, but there doesn’t seem to exist a report for it in Sun’s bug database. I submitted a bug report to them in November of 2007, but the report seems to have been ignored since then. I figured I’d reproduce the report here in case anybody else runs into this problem.

The offending code looks like this:

interface IA {
    IA op();
}
interface IB {
    IB op();
}
public interface IC extends IA, IB {
    IC op();
}

The error message is:

IC.java:7: types IB and IA are incompatible; both define op(), but with unrelated return types

This same issue has previously been raised for the Eclipse compiler:

It seems §9.4.1 of the Java Language Specification applies to this problem. It says:

“An interface inherits from its direct superinterfaces all methods of the superinterfaces that are not overridden by a declaration in the interface. It is possible for an interface to inherit several methods with override-equivalent signatures (§8.4.2). Such a situation does not in itself cause a compile-time error. The interface is considered to inherit all the methods. However, one of the inherited methods must be return type substitutable for any other inherited method; otherwise, a compile-time error occurs.”
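To make the rule concrete, here is a minimal sketch of the same shape with an implementing class added. CImpl and CovariantDemo are made-up names for illustration. IC declares its own op(), so it overrides both inherited methods, and its return type IC is a subtype of both IA and IB. As far as I can tell, a compiler that follows the specification should accept this, while the affected javac versions reject the IC declaration.

```java
// Same shape as the snippet above; CImpl and CovariantDemo are hypothetical names.
interface IA {
    IA op();
}

interface IB {
    IB op();
}

// IC's own op() overrides IA.op() and IB.op(). The return type IC is a
// subtype of IA and of IB, so it is substitutable for each of them.
interface IC extends IA, IB {
    IC op();
}

class CImpl implements IC {
    public IC op() {
        return this;
    }
}

class CovariantDemo {
    // Calls op() through each static type and reports whether all three
    // calls ended up in the single implementation in CImpl.
    static boolean sameObjectThroughAllViews() {
        IC c = new CImpl();
        IA a = c;
        IB b = c;
        return a.op() == c && b.op() == c && c.op() == c;
    }

    public static void main(String[] args) {
        System.out.println(sameObjectThroughAllViews());
    }
}
```

Whichever interface type you hold, op() dispatches to the one method in CImpl, which is why there is no real ambiguity for the compiler to complain about.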

While Sun drags its feet with this issue (maybe it’ll get fixed for Java 8 in a few decades from now), you can use Eclipse. If you need to build with Ant, you can get it to use the Eclipse compiler by setting the build.compiler property to org.eclipse.jdt.core.JDTCompilerAdapter and including ecj.jar in your Ant classpath.

It seems some French guys were also grappling with this problem: Héritage multiple des interfaces et surcharge de méthodes.

Update: Finally found the right bug for the covariant return problem with the help of Jonathan Gibbons from Sun. It is Bug ID: 6294779 Problem with interface inheritance and covariant return types. The bug was created in 2005. Don’t know why I couldn’t find it with my previous searches, but maybe other people will have better luck now.

libffi

Friday, February 1st, 2008

It seems there is some renewed interest in libffi, the library used by ctypes and JNA to call functions in native libraries from Python and Java, respectively. Until recently, the efforts around libffi have been very fragmented, with various patches available only in ctypes, only in JNA, or only elsewhere.

On the ctypes side, there are some patches to build libffi with Visual Studio, which are useful for Windows junkies like myself. There is also a patch for Win64 support, which really needs to get into JNA (Java on Windows Server 2003 x64 rocks!). Timothy Wall of JNA fame has also produced some patches. A lot of this work has featured on the gcc-patches mailing list.

Anthony Green is also doing some work on libffi, but this seems to be happening separately from the work of the gcc-patches folks. Hopefully all these disparate efforts can be unified so that we can all benefit from a single libffi that works on many platforms (including Win64, please!).

Hadoop does WebDAV

Wednesday, July 25th, 2007

There’s recently been some movement on HADOOP-496, the issue for exposing the Hadoop Distributed File System as a WebDAV store. Maximum respect to Enis Soztutar for sorting this out.

The patch hasn’t made it into SVN yet, but I hope it will show up there in a few weeks. Getting files onto your HDFS has never been this easy: add a new network place in Windows Explorer, open the folder, drag and drop! I’m pretty sure your favourite Linux desktop environment will have similar functionality.

I’ve also been running the litmus WebDAV server protocol compliance test suite against this Hadoop/WebDAV combination to test the patch. There are still a few issues to fix, but things are looking good.

In case you want to add a WebDAV interface to your Java application, you might consider building it on top of Apache Jackrabbit, as this patch does. I took a brief look at doing what this patch does with Jakarta Slide, but even with the WebDAV Construction Kit to help you, Slide is a rather large beast.

More from Eclipse and Java land

Wednesday, July 4th, 2007

My new favourite Eclipse 3.3 keyboard shortcut is Ctrl+3. It allows you to search through almost anything you can do in the IDE. If you’re a BASH user, Ctrl+3 will remind you of Ctrl+R (reverse-i-search).

InfoQ has an article on New Concurrency Features for Java SE 7, with a link to Doug Lea’s A Java Fork/Join Framework. Along these lines, I have been playing with CompletionServices and Executors quite a bit the past 3 weeks. I hope to present the fruits of my labour here soon.
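For anyone who hasn’t used these classes, here is a minimal sketch of the basic pattern with java.util.concurrent: submit a batch of tasks to a thread pool through an ExecutorCompletionService and collect the results as they finish. The class name, the pool size of 4 and the squaring task are all made up for illustration; this is not the code I’ve been working on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceSketch {
    // Squares each input on a small thread pool and sums the results in
    // whatever order the tasks happen to finish.
    static long sumOfSquares(List<Integer> inputs)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary
        try {
            CompletionService<Long> completion =
                    new ExecutorCompletionService<Long>(pool);
            for (final int n : inputs) {
                completion.submit(new Callable<Long>() {
                    public Long call() {
                        return (long) n * n;
                    }
                });
            }
            long total = 0;
            for (int i = 0; i < inputs.size(); i++) {
                // take() blocks until the next finished task is available.
                total += completion.take().get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> inputs = new ArrayList<Integer>();
        for (int i = 1; i <= 10; i++) {
            inputs.add(i);
        }
        System.out.println(sumOfSquares(inputs)); // prints 385
    }
}
```

The nice thing about the CompletionService is that the consuming loop never waits on a slow task while faster ones are already done, which a plain list of Futures doesn’t give you.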

Eclipse 3.3 First Impressions

Sunday, July 1st, 2007

Eclipse 3.3 (aka Eclipse Europa) was released a few days ago and I decided to give it a spin. Here are some notes in no particular order:

  • This probably isn’t a new feature, but you can configure your default web browser under Window | Web Browser.
  • You can now do “Sort Members” on a whole project. You can also “Organize Imports” in the same way. Between Eclipse’s code formatter, cleanups on save, the ability to export settings for these features and The Checkstyle Plug-in for Eclipse there is no reason I can think of for all the Java code in your project not to look exactly the same.
  • Libraries referenced by a Java project are all grouped together under a “Referenced Libraries” item in the tree view. Now you don’t have to fiddle with the filters so often.
  • I gave the Dynamic Languages Toolkit a quick spin. I was able to get up and running with JRuby in seconds.
  • The Help system uses Jetty now. Nice. But for some reason it wants to listen on all my interfaces, causing Windows to pop up a Firewall allow/block dialog the first time you launch the Help system. What’s up with that?
  • The Java compiler’s warnings with regards to generics have been improved. Way back in the 3.3Mx days I found that it rejected some invalid generics code that would compile with 3.2, so it looks like the checking itself has improved too.
  • The new Run/Debug launching is weird. The default is now to have Eclipse launch what it thinks you want to launch, instead of launching the last thing you launched. You can turn off this behavior under Window | Preferences | Run/Debug | Launching by selecting “Always launch the previously launched application”.
  • The new interface for the Rename refactoring is annoying. Previously I could Alt+Shift+R and a dialog would pop up with the previous name selected, so that if I typed something, it would overwrite the old name. Now I get an editing thing that takes my cursor position in the name I’m trying to refactor into account, with no obvious way of completely nuking the old name so that I can just type the new one quickly.
  • The existing FindBugs plugin still works. Nice. Still need to install a Subversion plugin. Maybe it’s time to switch to Subversive?
  • The Remote System Explorer is very useful.

Now for my only major gripe so far: the platform proxy settings. Bug 154100 - Platform level proxy settings has the details. These proxy settings were clearly not designed by someone who actually uses a proxy on a daily basis, especially with a laptop. Here are some hints:

  • As I move around with my laptop between networks, my proxy settings usually need to change. The proxy settings need some kind of “profile” feature. I guess I could switch workspaces, but that’s just annoying.
  • As far as I can tell, I can’t exclude ranges of IPs or hostnames with wildcards (e.g., *.foo.com or 192.168.0.0/255.255.0.0 or 192.168.0.0/16). I guess I could type in the host name or IP address of every machine on my network, but that’s going to take a while.

And if you’re going to fix this second issue, don’t be stupid like Firefox. If I add 192.168.0.0/255.255.0.0 to the list of IPs to exclude, I actually also mean any host name that resolves to an IP in that range.
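To show what I mean, here is a rough sketch of the check I’d want, using only java.net.InetAddress. The class and method names are made up, it handles IPv4 only, and it has nothing to do with how Eclipse or Firefox actually implement their exclusion lists. It accepts both the /16 and the /255.255.0.0 notation, and it tests every address the host resolves to.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ProxyBypassSketch {
    // Packs the four bytes of an IPv4 address into one int.
    static int toInt(InetAddress address) {
        byte[] b = address.getAddress();
        if (b.length != 4) {
            throw new IllegalArgumentException("IPv4 only: " + address);
        }
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
                | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    // Accepts a prefix length ("16") or a dotted netmask ("255.255.0.0").
    static int parseMask(String maskPart) throws UnknownHostException {
        if (maskPart.indexOf('.') >= 0) {
            return toInt(InetAddress.getByName(maskPart));
        }
        int bits = Integer.parseInt(maskPart);
        if (bits < 0 || bits > 32) {
            throw new IllegalArgumentException("Bad prefix length: " + maskPart);
        }
        // A shift by 32 is a no-op in Java, so /0 needs its own case.
        return bits == 0 ? 0 : -1 << (32 - bits);
    }

    // True if any IPv4 address the host resolves to falls inside the range.
    // Host names trigger a DNS lookup; IP literals are only parsed.
    static boolean bypassesProxy(String host, String range)
            throws UnknownHostException {
        int slash = range.indexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Expected network/mask: " + range);
        }
        int network = toInt(InetAddress.getByName(range.substring(0, slash)));
        int mask = parseMask(range.substring(slash + 1));
        for (InetAddress candidate : InetAddress.getAllByName(host)) {
            if (candidate.getAddress().length == 4
                    && (toInt(candidate) & mask) == (network & mask)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(bypassesProxy("192.168.5.7", "192.168.0.0/16"));          // true
        System.out.println(bypassesProxy("192.168.5.7", "192.168.0.0/255.255.0.0")); // true
        System.out.println(bypassesProxy("10.0.0.1", "192.168.0.0/16"));             // false
    }
}
```

Passing a host name instead of an IP literal to bypassesProxy is the whole point: the name gets resolved first, and the exclusion applies to whatever addresses come back.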

Update: Tried Subversive. No, thank you. I’ll stick with Subclipse for the moment.