Friday August 01, 2008
About me: murphee, maintainer of EclipseShell.
Thanks to a hilariARSE sequence of events, I am now stranded in Canada.
Indefinitely.
Indefinitely, ey?
Well. It‘s only indefinite until I bother to make it definite.
“Why?” you ask. Let's just say I came for the Fringe of Rubyism, and stayed because of the rubycolored tape of bureaucracy.
Why am I posting this here? I'm getting tired of typing up my story, which I've now done 4 or 5 times, in two languages, making me seem either the survivor of a heroic struggle or a Chaplinesque Tramp, thumbing his nose in the general direction of all that is gray, calcified and bureaucratic. (I have to admit, I'm starting to forget which version is true.)
So, I thought I'd type up the official version here, just so the various murphee appreciation groups (“murphee's mercenaries”, the “gathering groupees” and the “ranting rodents”, among others) won't have to be notified individually.
So, what I'm doing right now is beboppin', flipflopping, bubble gum popping through the mean streets of Toronto. Not quite flipflopping, cause I'm here with nothing but a backpack's worth of spare clothes, trying to live in hotels without having to sell a major organ to foot the bill later on. Mind you, Toronto is a convenient place to do so… I'm staying in a pretty fancy hotel right now for about 120 Canadian bucks… in a city like London or NYC, you'd pay that much for the privilege to use a hobo's armpit as shelter. And you'd be grateful, that's what you'd be!
Talking about NYC… maybe it‘s just me, but this place does remind me of Manhattan, in a way. Take Yonge street, which is a bit like Broadway, going all the way up the city, ahustlin‘ and abustlin‘ all day and night. Not to mention that Times Square knockoff that is the intersection of Yonge and Dundas… which is a good place to check out every now and then; the Toronto natives seem to be staging a whole bunch of free entertainment here.
Tom Waits' Atlanta performance on NPR podcast
Them folks over at NPR managed to tape Tom Waits' performance in Atlanta (Let's all hail The Eyeball Kid blog for keeping us in the know about all that Tom Waits magic).
Before you despair looking for downloads, go to the page describing the Atlanta performance then use the “Podcast” button in the upper right hand corner, which will take you to the podcast‘s overview page. Either subscribe to the podcast with a pod wrangler of your choice, or download the file directly.
Although the audio version is great, I do indeed pity everyone who hasn‘t been at the live performance. No one can stomp, march, convulse, grimace, gesticulate, or simply articulate an audience into a frenzy like the growling force of nature that is The Tom.
Oh oh oh… make sure you don‘t miss the family reunion song… ‘bout 25 minutes in… one of the highlights, which seems to be a melange of “God‘s Away on Business“ and “Everything goes to hell anyway” and some other bits‘n bobs.
I know what I‘ll be listening to while hotel hopping here in the far north, where the moose roam… (don‘t ask).
MenTaLguY points out some fancy tunes
Yes. When I'm not busy tracking Tom Waits around the US (this year: Atlanta, yay!), this is the other kind of music that twangs its way out of my MP3omat's sound trumpet.
A Banjo (yeah… it's probably not a banjo, looks like one though), a gang of throat singers, and a backdrop of ice floes drifting by… throw in a polar bear (to ward off the ubiquitous paparazzi), a huge pile of lumber and Brunhilda (my trusty old axe), and you have the ingredients for a (hypothetical) hiatus for old murphee.
(2008-06-07 17:29:20.0)
What Newspeak brings to the table
Newspeak is one of those projects that have the Smalltalk space aquiver. Not least because it's headed by Gilad Bracha, one of the folks behind Strongtalk. After years of slogging in the Java mines over at Sun HQ, he went off to Cadence, who apparently are in need of a new language, and pulled in a bunch of other Smalltalkers. (As a sidenote, I award the "Best Bridge Burning Award" to Gilad for his farewell entry on his old Sun blog).
It took me a while to put the pieces of the Newspeak puzzle together, but recently I finally had my light bulb moment. I have to say: I‘m quite intrigued about how it solves some problems by removing features and sticking to a few principles. In this post, I‘ll focus on the delta of concepts that sold Newspeak to me; there‘s a lot more to talk about but I‘ll leave that to the linked papers.
One of the basic ideas is to take something away: global state. Nothing‘s global.
I hear you say: “Big whoop! No one uses global variables anymore, and we have the decency to have a bad conscience when we use singletons…”
Fine… but there‘s one big global lookup table that we all use. The lookup table that takes a class name and returns the class implementation.
The ever present dream of the software world is to have software components resembling meatspace components, a la integrated circuits (IC), screws, tires, carburetors, arse implants,... whatever. All those useful bits'n pieces that can, more or less, be used in projects instead of reinvented from scratch.
However: we'll never get to that state in any reasonable way by messing about in the way we do today. There are solutions that move us, somewhat, closer. OSGi, in the Java space, is a way to encapsulate components and also make dependencies explicit. The only way you can use a class is to import it… well, unless it has been loaded by the system classloader, which covers tons of classes. In OSGi, there are various ways to do that: importing packages, importing everything exported by a bundle, and similar approaches.
What does this have to do with Newspeak? Newspeak has all this baked in from the start. A class in Newspeak doesn't have access to any shared, global state at all. Remember that global class table I mentioned above? It's not available in Newspeak. A class in Newspeak only sees a class if someone explicitly passes in a reference to the class… OR: if it has access to it in its local environment.
Ah… I should have mentioned: Newspeak has only classes, no packages, modules or any other special concepts. The only way to organize classes is to… nest them. From the paper on nesting method calls (which also describes the way Newspeak handles lookup of methods in the presence of nested classes):
class Sup = ()
class Outer = (
  m = (^91)
  class Inner = Sup ( foo = (^m) )
)
(Note: the ^91 is Smalltalkish for return 91, and m = (^91) is a method definition.)
Which means: related classes are located in the same class, basically a library or module. This also makes for some neat bonuses, like parameterizing a whole library the same way you parameterize a class because… well, they‘re the same concept. Ie. you can load the same library multiple times, but initialize it with different classes. (Sample from the Newspeak overview paper):
class MyApp platform: p args: args = (... )
As you can see, the class takes arguments, which means to use this class, you have to pass in all it needs to work.
This model probably leaves you with a chicken or egg question: Who initializes the root class? The nesting stops with an object literal that basically represents the application's configuration and initialization. The object literal is like a class definition without the 'class'. The sample below creates an application, using the MyApp class which is parameterized with access to the platform object (some general capabilities). Note that Newspeak uses Smalltalkish message send syntax, which means platform commandLineArgs means sending the commandLineArgs message to the platform object. What you also notice: there's no new here. That is where the ubiquitous "factories" come in. The class definition defines a pattern (call it a method signature) you have to use to create a new object. While this may sound like a constructor, it's not. Actually, accessing MyApp is interpreted as a message send, which means: MyApp platform: platform args: platform commandLineArgs is a call to the class' 'factory method' to create a new object. I put “factory method” in quotes because… well, it's not really a special concept, it's just a regular old message send.
( ) (
class MyApp platform: p args: args = (...)
public main: platform = ( MyApp platform: platform args: platform commandLineArgs)
)
That's the main method (for startup or initialization) plus configuration in one. After all: it contains all dependencies and assembles them here. In a way, this represents configuration management data like which libraries are required and “linked”. In other systems this is hidden in some other data formats, like make files (which define what gets linked together), Ant files, OSGi Bundle configurations, etc. ... except Newspeak just uses Newspeak to configure all this.
As Gilad mentions in the self sustaining systems talk (see links section), passing in dependencies explicitly is a neat property because it makes the dependencies explicit. Instead of the sea of bits'n pieces as in other systems, Newspeak modules tell you what they depend on… and they do so in a quite DRY way. No external configuration files; instead the Newspeak module code is what defines the dependencies. Throw in proper AST support for the language (which is there and customizable, just watch the talk), and you can analyze these from your code as well… just as you would with other module systems that do require config.
Let me try it from another angle.
Ever used a Dependency Injection Framework? It's all the craze in the Java world. Hell, you can't boogie five seconds in the Java space without tripping over a DIF framework... and they keep on popping up. If you don't know what a DIF does, check Martin Fowler's article about them.
DIFs are cool because at their very core, there's irony. See, the first thing that happens right after you decide to use a DIF: your project gets a dependency injected. (In a way, this is reminiscent of how a new drug user kicks off everything with an injection, which leads to a life of dependency... I wonder if that's where the name comes from... but I digress). Yes, suddenly your project depends on the bleeping DIF framework, with all the merry issues that a dependency on a piece of software entails.
If you‘re lucky the dependency mostly stays outside of your own code, and is located in configuration files or initialization code (in that case, the DIF can be helpful). Other DIF systems, on the other hand, require you to put annotations into your code. A nice strategy, also used by bunnyboilers all over the world, ie. you‘d better stick with it (bunnyboiler, DIF), because any attempt at terminating the relationship involves cutting (body parts, code)... But I digress (again).
The other interesting aspect of the DIF phenomenon is that it‘s a neat symptom of a basic flaw. A basic flaw in a system often opens up a niche that will be filled by players offering a workaround or solution. A bit like nasty, annoying, useless wisdom teeth keep dentists in a healthy supply of golf clubs, or how mankind‘s basic gullibility keeps the infomercial industry alive.
What does this have to do with Newspeak? Well, I'm taken by how a little bit of consistency, a consistently applied idea, can make for a powerful, simple system which completely obviates some clumsy, complicated workaround. Just like the number tower or immediate values bypass all the nonsense and madness of primitives/AutoBoxing/etc., Newspeak solves the DIF problem. How? As I mentioned above: a class needs to have all its dependencies passed to it... otherwise it won't see them.
Ah!
Son of a gun!
Paint me pink and call me Leeloo.
There it is: the solution to the dependency problem. Simple. Unremarkable, actually. And most importantly: free of three letter acronyms, products, or religious wars (yes, wherever there are two solutions to a programmer‘s problems, there‘s a war abrewin‘). No need for books written about DIF frameworks, no need for training courses, no need to decide among many of these systems,...
It‘s not even an issue… it‘s just how you do things.
You want to create instances of a class but don‘t want to hardcode a name in? Well… nothing easier than that. Everything you want to use has to be passed in anyway… so there you go.
Actually… throw in the messaging concept of Newspeak… it doesn‘t even have to be a class. Just pass in anything that handles the right message, the one to create an instance, and you‘re done.
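For the Java-minded, here is roughly what that looks like when you do it by hand. This is only a sketch of the idea, not Newspeak: the names Greeter, PoliteGreeter and Report are made up for illustration, and java.util.function.Supplier stands in for "anything that handles the create message".

```java
import java.util.function.Supplier;

// Hypothetical names throughout; this only mimics the idea in Java.
interface Greeter {
    String greet(String name);
}

class PoliteGreeter implements Greeter {
    public String greet(String name) {
        return "Good day, " + name;
    }
}

class Report {
    private final Supplier<Greeter> greeterFactory;

    // Everything Report needs is handed in; it never names a concrete class.
    Report(Supplier<Greeter> greeterFactory) {
        this.greeterFactory = greeterFactory;
    }

    String header(String reader) {
        Greeter greeter = greeterFactory.get(); // ask the factory for an instance
        return greeter.greet(reader);
    }
}

class WiringDemo {
    public static void main(String[] args) {
        // A constructor reference and a lambda both "handle the right message".
        Report fromClass = new Report(PoliteGreeter::new);
        Report fromLambda = new Report(() -> name -> "Yo, " + name);
        System.out.println(fromClass.header("murphee"));
        System.out.println(fromLambda.header("murphee"));
    }
}
```

The difference, of course, is that in Java this is a discipline you choose to follow, since Report could still reach for any class by name. In Newspeak, as far as I understand it, there is nothing else to reach for.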
See how the whole DI problem has become a nonissue? You can hardly fill a sentence with it... unlike the billion words spilled by everyone talking about DI. It's become unremarkable, which is the best thing for a concept. Any concept that disappears frees up some space in our minds for more interesting problems. It's like the eradication of smallpox: once a big problem, caused many a headache, lots of inoculation and general brouhaha. Until it was gone… and now it's a nonissue for us, leaving our minds to worry about other things, like bird flu or Britney's mental state. Yay.
(Disclaimer: Just a quick message to particularly easily offended DIF or IoC groupies with nothing better to do: Don‘t bother commenting. I know these things are solutions to a problem. That‘s fine (I guess). But what‘s even dandier is a world or at least a language where these problems don‘t even arise).
Think about the big global class lookup table I mentioned above. Besides the problems described above, it has another problem: it gives a lot of functionality into the hands of everyone who can access it. Eg. every class can use java.io.File or similar classes to open a file.
Why?
Because everyone can see and use the class.
Hiding all classes by default also means: you can only use functionality if the class was given access to the code. You could say: you get the capability to do something handed to you.
Sounds like a natural model. After all, if you rent a house… you get the key to it, which means: you can now get into the house. The house is locked (as in “door lock”) by default and all those people running around have no possibility to get into the house. Let's ignore the existence of battering rams, locksmiths or MacGyver: in this model the key's the only way to get in.
Actually. If you read J.K. Rowling‘s books, there‘s another way of seeing it. Remember the Secret Keeper concept in the ol‘ H. Potter books? It‘s possible to hide something completely, and the only way to make it even visible to others is to tell them a shared secret. I guess it‘s time to reveal that the Potterverse seems to be running on an OS using capabilities for access control.
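To make the key idea a bit more concrete, here is a small Java sketch. The ReadCapability interface and the WordCounter class are hypothetical names of my own. Bear in mind Java only lets you imitate the pattern, because the global class table is still right there for anyone who wants to cheat.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical capability: the right to read exactly one thing.
interface ReadCapability {
    Reader open();
}

class WordCounter {
    // No java.io.File in sight: the counter can only read what it was handed.
    static int countWords(ReadCapability source) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = source.open()) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String trimmed = text.toString().trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }
}

class CapabilityDemo {
    public static void main(String[] args) {
        // The caller decides what the key opens; here it is an in-memory string.
        ReadCapability key = () -> new StringReader("the moose roam free");
        System.out.println(WordCounter.countWords(key)); // prints 4
    }
}
```

Whoever holds the key can read; whoever doesn't, can't even find the door. At least in a language that enforces it.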
I was going to prattle on wisely about capabilities, eg. how the Mach kernel had some type of capabilities built into their port system, etc. However… a quick glance at a few messages on the cap-talk mailing list tells me one thing: Capabilities are yet another one of those CS topics I had considered to be a simple idea. Except, they're not, and instead they're a whole research field, complete with people with strong opinions and a whole Dyson sphere worth of papers. So I'll shut up, and just finish this up with some links to Newspeak stuff.
There are a few interviews with Gilad at Channel 9:
Finally, the language mentioned as influence for some of Newspeak's ideas is the E language/system, eg. look at the explanation of capabilities on the E website.
Disclaimer: all of these insights into Newspeak were gleaned from publicly available material, which is somewhat scarce. This means: I might have misunderstood some concepts or details. If you know better, feel free to point out errors. Also: there are many other concepts that make up Newspeak, like using message sending for everything, that I didn't go into or fawn over explicitly.
Systems to watch out for: Newspeak, Ometa
Oh what a tease: Gilad Bracha promises that Newspeak will be available under ASL. Now we just have to wait for it to appear; until then, Gilad points to all current papers and information on Newspeak.
The system looks very nifty, particularly the focus on modularity and/or composability. Be sure to read all the papers (only a handful right now), or wimp out and skim Gilad's Blog 'Room 101'.
I‘ve been slowly assembling the puzzle that is Newspeak… and it‘s slowly coming together, looking dandy. Come on… any technology that, with a single concept, condemns the horrors of Dependency Injection frameworks to purgatory. Aye folks, simply stubbornly sticking to late binding gives a whole cottage industry the boot… whudda thunk, huh?
Well, I guess Alan Kay did… does actually! Rafael de F. Ferreira had an interesting post about various DSLs, among them OMeta. In his current project, Alan Kay and friends again don their pioneering hats, and explore the frontiers of software development. Be sure to read Steps Toward the Reinvention of Programming (PDF).
(2008-05-06 21:23:54.0)
PEHDTSCKJMBA or: Tom Waits tours this summer!
Flipping Wombats! The Eyeball Kid blog has all the news about Tom Waits‘ summer tour.
So, No time Toulouse, hop on over to the Video of the Press Conference with Tom Waits announcing the tour details.
Yes… you have to watch the video of the press conference... it's only a few minutes and it'll be worth your time. And then... you'll belong to that exclusive club who knows what PEHDTSCKJMBA means.
So… next steps from here? Well, until we know when tickets can be purchased, there‘s time for preparations.
Like getting out the old steel spiked boots (for the ticket stampede), stop snacking on chalk (to get all nice and gravelly by the time of the concert), and then figure out where I hid my travelling hat… the damn US tour starts in the middle of June, wrapping up (as it seems) early July, cutting it awfully close for my current ocean hopping schedule.
So yeah… lots to do. No time, Toulouse, and I shall stop dilly dallying right now and start preppin‘ for the tour. Oh… ahem… Yehaw!
(2008-05-05 12:33:00.0)
Hell hath no fury like Polymorphism scorned...
Let me tell you: Polymorphism is a testy ol‘ bat… Ignore at your own risk. Case in point: Slava Pestov points out the latest victim of an ignored Polymorphism: NIO2
If there's one thing in CS that you shouldn't ignore, it's Polymorphism. Before you click away, thinking I'm talking about boring ol' OOP Polymorphism, just indulge me by reading a few more paragraphs… might be interesting.
You see… Polymorphism isn't just about classes. Actually, it doesn't have anything to do with classes at all. The fact that Java's Polymorphism construct, Interfaces, must be implemented by classes comes from the “Pure” OOP idea of Java. Polymorphism simply means that there can be multiple implementations hidden behind the same name.
To show how Polymorphism has nothing to do with OOP, let me point out some Polymorphism in a place that‘s quite far removed from OOP: the C language (note: I‘m talking about C not C++).
Now… C has just about no Polymorphism. At all. Except for a few areas where the total lack of it was totally obvious: a few operators.
Think about it:
a = b + c
You see: this code works, no matter whether b or c are int, float, long or all those other insanities in the type system (long long anyone?).
While this behavior seems obvious you have to remember that it isn't: there's a big difference between adding up ints or floats. Adding up the contents of a float variable with an int adder will produce a result, but not one you want: floats contain a data structure, made up of a mantissa and an exponent... treating that as a scalar int is a sure way to getting nutted by your team.
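If you want to see just how wrong the int adder gets it, here is a small Java sketch (Java rather than C, simply because Java hands you the raw bits via Float.floatToIntBits and Float.intBitsToFloat). The class and method names are mine. It pushes the bit patterns of 1.0 and 2.0 through integer addition and reads the result back as a float.

```java
class BitAdderDemo {
    // Adds two floats by pushing their raw bit patterns through the int adder.
    static float addWithIntAdder(float b, float c) {
        int rawSum = Float.floatToIntBits(b) + Float.floatToIntBits(c);
        return Float.intBitsToFloat(rawSum);
    }

    public static void main(String[] args) {
        System.out.println(1.0f + 2.0f);                 // 3.0
        System.out.println(addWithIntAdder(1.0f, 2.0f)); // Infinity
    }
}
```

The bits of 1.0f are 0x3F800000 and those of 2.0f are 0x40000000; add them as ints and you get 0x7F800000, which happens to be the encoding of positive infinity. Slightly off from 3.0.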
What does this have to do with Polymorphism? A lot: it means C's Plus operator is polymorphic: it uses the same name, but has different implementations for different types. This code:
float b = 1.0; float c = 2.0; a = b + c;
means that the code generated for the addition looks different than the one for this code:
int b = 1; int c = 2; a = b + c;
Quite convenient, huh? Much nicer than having to use a different operator for different variable types, isn‘t it?
Funny, then, that this is about where the Polymorphism ends in C. There is no (convenient, extensible) way for the developer to define their own polymorphic constructs. Eg. you can't overload a function.
“Where would you need that?”, you ask. Well… let's see. Say you're implementing a list data structure. Let's see… what do we need for that:
// We need a constructor - this list will be a linked one
List* list_create();
// some basics
void add(void * item);
int length();
// ... whatever else
OK… simple enough. Once you‘re done, you‘re happily using your list in various places around your codebase. Until… you use it in a place where you don‘t want the performance or space requirements of a linked list. Something like a Vector/ArrayList style implementation would be much nicer.
OK… fine… let‘s do that… except… now your whole list design becomes awkward. Let‘s look at your first draft for the new list:
ArrayList* arraylist_create();
// some basics
void arraylist_add(...);
int arraylist_length();
Hmm… that‘s not nice. Why does the firstborn list (the linked one) get the nice add or length function, but this one needs to get a prefixed one? Doesn‘t seem fair. Particularly because… no one really cares how the damn list is implemented when the only thing of interest is the number of items in it (let‘s ignore the fact that a linked list has different performance characteristics).
Wouldn‘t it be much nicer if, around your codebase, you could always just call, say, length when you want the item count? Well... we could hack around it. How about this: there's only one implementation of all list functions. What they do is to determine the type of the list, and then delegate to the right implementation. Ie. a linked list gets its linkedlist_length called, etc. etc.
Of course, you know where this ends up: a handrolled implementation of Polymorphism. The interface to the list structure must implement the Pattern Matching for Polymorphism (ie. get the type of the list from a structure and then choose the correct implementation to delegate the call). This isn‘t just a lot of overhead… it‘s also not really extensible. What if someone adds their own list type? They‘ll need to mess with all the list functions and add in their own list type.
What a mess!
What this all comes down to is: Polymorphism is nice. It's nice to use the functions of a list independently of its specific implementation. It's nice to read from a socket, no matter whether it's a TCP socket, an SSL socket, a Unix domain socket or whatever else.
Before you think the answer to all this is OOP… think again. It doesn't matter at all. The basic principle is Polymorphism, backed up by... Pattern Matching.
After all… let's look at the length implementation of our C sample:
// pseudo code, don't bother pointing out the myriad of ways
// this thing will cause the compiler to fail
int length(void *list){
    if(list->type == LINKED_LIST){
        return linkedlist_length(list);
    }
    if(list->type == ARRAY_LIST){
        return arraylist_length(list);
    }
    return -1;
}
Think about what this code does… it tries to match the list‘s type to a number of options… yeah… probably a bit much to refer to this as Pattern Matching ... but that‘s what it is. It‘s useful to think about it this way, because this gives us a name for a concept we need for Polymorphism.
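For contrast, here is a sketch of the same length lookup when the language does the matching for you. It's Java, and the names (Sized, LinkedItems, ArrayItems, Haiku) are invented for this example. The point is only that nobody writes the if chain, and a new type can join in without touching existing code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical types: each one supplies its own length, no type tag needed.
interface Sized {
    int length();
}

class LinkedItems implements Sized {
    private final LinkedList<Object> items = new LinkedList<>();
    void add(Object item) { items.add(item); }
    public int length() { return items.size(); }
}

class ArrayItems implements Sized {
    private final ArrayList<Object> items = new ArrayList<>();
    void add(Object item) { items.add(item); }
    public int length() { return items.size(); }
}

// Somebody else's type joins in without editing any of the code above.
class Haiku implements Sized {
    public int length() { return 17; } // syllables
}

class LengthDemo {
    static int totalLength(List<Sized> things) {
        int total = 0;
        for (Sized thing : things) {
            total += thing.length(); // the runtime picks the implementation
        }
        return total;
    }

    public static void main(String[] args) {
        LinkedItems linked = new LinkedItems();
        linked.add("a");
        linked.add("b");
        ArrayItems array = new ArrayItems();
        array.add("c");
        List<Sized> things = Arrays.<Sized>asList(linked, array, new Haiku());
        System.out.println(totalLength(things)); // prints 20
    }
}
```

Same matching as in the C version, just done by the runtime on the type of each object instead of by a handrolled chain of comparisons.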
Just have a look around the programming language world: we see Polymorphism and Pattern Matching in so many places around Java:
Overriding is a rather confused concept. It seems like a special feature of class based inheritance… but that‘s just a bad way of thinking about it: it‘s Polymorphism. The fact that you‘re changing the behavior of a superclass doesn‘t matter. What you‘re actually saying is: I want to provide a different implementation for the function with this name based on its arguments.
You might say: this is stupid: how are you going to distinguish x.length() implementations by their arguments? They ain't got no arguments! Well... not in the popular message sending syntax used here... but when you think about it for a moment... that's not quite true. In a language like Java (to pick just one OOP language, feel free to use any other like Ruby, Smalltalk, C#,...), a nonstatic method always has one argument, albeit one hidden from the method signature: this.
ArrayList.length(), with an explicit this, looks like:
public int length(ArrayList this){
    return this.store.length();
}
With this in mind… we realize that all the length methods in all classes are just functions distinguished from each other by the type patterns in their argument lists. Every time you call one of them, they're found by pattern matching, ie. take the class of the this argument and find the function definition that takes it as an argument.
There's no need to think in classes… only think in terms of Pattern Matching.
Eg.
public int length(Equator this){
return 40075;
}
Actually… these examples are only pseudo code, they won‘t work correctly in Java. What we want is the language implementation to take the method name and choose the right implementation based on the runtime type of the object. This has nothing to do with dynamic language stuff... it's just how Java's OOP works. Why don't the code examples work? Because of the way another type of polymorphism in Java works: overloading
Actually… no. There is actually no reason for overloaded operators to be separate concepts… an operator is, after all, just a special form of a function call, one that's succinct and can be prefix/infix/postfix. So… let's use the proper name:
public int foo(String this){
// do something stringy
}
public int foo(Integer this){
// int that eger
}
//...
Object a = surpriseMeWithAnIntegerOrString();
foo(a);
What I thought this would do, is to pick the right method based on the object type at runtime. Heh… naive little tyke. The code won‘t even compile. The compiler will complain loudly about not being able to choose, and you‘ll only get it to shut up by manually casting the object… which of course makes this code all the more tedious to write:
Object a = surpriseMeWithAnIntegerOrString();
if(a instanceof String){
foo((String)a);
}
if(a instanceof Integer){
foo((Integer)a);
}
And we‘re back to Pattern Matching… tedious, handrolled, not extensible Pattern Matching. (Told ya the term would come in handy).
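Here is a version that does compile, as a sketch. I've added a catch-all foo(Object) overload, which is my own addition and not part of the example above, purely to show which overload Java picks when the static type is Object. The class and method names are made up.

```java
class OverloadDemo {
    static String foo(Object o)  { return "object"; }
    static String foo(String s)  { return "string"; }
    static String foo(Integer i) { return "integer"; }

    // Hand-rolled runtime matching, same shape as the instanceof chain above.
    static String dispatch(Object a) {
        if (a instanceof String)  { return foo((String) a); }
        if (a instanceof Integer) { return foo((Integer) a); }
        return foo(a);
    }

    public static void main(String[] args) {
        Object a = "surprise";
        // Overloads are chosen at compile time from the declared type (Object),
        // not from the runtime type (String).
        System.out.println(foo(a));       // object
        System.out.println(dispatch(a));  // string
        System.out.println(dispatch(42)); // integer
    }
}
```

So even when the compiler stops complaining, it still doesn't do what the naive little tyke hoped for: the choice is made from the declared type, and the runtime type only matters if you match on it yourself.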
The Visitor example uses only one argument, but I‘m sure you can think of overloaded methods with many arguments. Which brings us to the concept of multimethods.
public int foo(FooVisitor this, String item){
    // do something stringy
}
public int foo(FooVisitor this, Integer item){
    // integer ahoi...
}
Ah! It‘s double dispatch because now there are two items to consider, i.e. the pattern is two elements long. No matter what, this is just a special case of multi dispatch, just like Single dispatch, which is what OOP languages do when you call x.foo(), ie. dynamically find the right implementation of foo based on the type of the receiver, the this, in this case x.
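Since Java only does single dispatch, getting the two-element pattern means handrolling it as the classic Visitor. A minimal sketch, with hypothetical names (Element, Word, Count, Describer) standing in for whatever you're actually visiting:

```java
// Hypothetical element types and visitor, just to show the two hops.
interface Visitor {
    String visit(Word word);
    String visit(Count count);
}

interface Element {
    String accept(Visitor visitor);
}

class Word implements Element {
    // Hop one picked Word.accept at runtime; in here "this" is statically
    // a Word, so the compiler can select visit(Word) for hop two.
    public String accept(Visitor visitor) { return visitor.visit(this); }
}

class Count implements Element {
    public String accept(Visitor visitor) { return visitor.visit(this); }
}

class Describer implements Visitor {
    public String visit(Word word)   { return "something stringy"; }
    public String visit(Count count) { return "integer ahoi"; }
}

class VisitorDemo {
    static String describe(Element element) {
        return element.accept(new Describer());
    }

    public static void main(String[] args) {
        Element surprise = new Count();
        System.out.println(describe(surprise));   // integer ahoi
        System.out.println(describe(new Word())); // something stringy
    }
}
```

Two single dispatches chained together to fake one double dispatch. In a language with multimethods, both accept methods and the whole Element interface would simply not need to exist.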
Which brings us to OOP.
Ah… did you see that explanation of single dispatch in the last section? You see, it's funny how the message sending (don't confuse this with “message passing”) syntax can be seen as just a case of Polymorphism + Pattern Matching.
Somehow, the message sending idea has caught on, and now it‘s the prevalent syntax of mainstream OOP languages. I.e. you always have a receiver ie. an object, that you're sending a message to, like:
receiver.message(argument);
It‘s a programming model… but is it really one that‘s good to use for everything? Obviously not… or we wouldn‘t have class methods (Java‘s static methods), which don‘t have a receiver.
OOP is easy to visualize: you model your problem as a bunch of things (“objects“), and they pretty much do everything by throwing each other messages (“method” calls in mainstream languages).
You can write a call to length like this:
length(list);
Or, you can take the first argument and write it in a special location: before the function name, followed by a period:
list.length();
Now… ain't that slick? These two calls go to the exact same function for the list type, except in one case the focus is on the function applied to an object, whereas the other shows this as a message “sent” to the object. I hope this makes it clear how the Polymorphism that multidispatch allows is a more powerful model than plain old message sending. I say “more powerful” simply because it can implement more styles of programming without building your own Pattern Matcher or other systems… just remember how you have to handroll double dispatch every time you want to provide a Visitor style to iterate over something.
I still owe you an explanation how all this polymorphism talk relates to the woes of NIO2.
It all has to do with ignoring Polymorphism, or at least thinking of Polymorphism as only being that thing to do with class based OOP or inheritance.
See, the original Buffer implementation considered the OOP part, but failed to consider the rest. Buffer's an abstract class, with many implementations in subclasses, for ByteBuffers, MappedBuffers, etc. A lot of the interface manages to be implementation independent… except for two minor details:
By hardcoding these to be ints, NIO decided that there could be many implementations of Buffers... except ones that are bigger than about 2 billion elements.
But wait: No one cares about the particular bit width of a buffer's index. The only thing you want is a Buffer that responds to something like: buffer.get(index). The value passed in to this function only needs to behave like a mathematical integer: it needs to be comparable to another integerlike number and support some arithmetic. Ie. it needs to be comparable to the maximum size of the buffer, so the Buffer object can complain about it being out of range if it must. If the buffer is backed by, say, a linked list structure, the index must be comparable, eg. if the lookup/get method loops through all elements and does a linked_backing_store.currentElement().getIndex == index to determine whether it has found the element at this index. And so on.
So why did NIO choose to go with int? Who knows… speed concerns might have been one reason. I can't imagine a lack of foresight… after all… NIO was created in this decade, and the idea of 64 bit systems ... well, wasn't an idea anymore, it was very much reality.
The real problem of course was the fateful decision back in the early days of Java to ignore Polymorphism, and expose hardware implementation details to the programmer. By forcing the developer to choose between 8, 16, 32, 64 bit integer types, and then making them separate incompatible types, you were suddenly forced to make very limiting design decisions. You see: Nobody cares whether the index fits inside 8 bits or needs to be represented by some data structure because it's so big…
What developers care about is a way to use an integer number, if possible in the most efficient way for the CPU they‘re targeting.
Naturally, this has been solved a long time before Java was even a glint in the mailman's glasseye (Spot the TV reference!). You start with a representation that fits inside the CPU's native elements (registers,...), but once it exceeds that, you switch to a more fitting representation.
LISPs, Smalltalks, Ruby, etc, etc, etc all do this. One way to implement this is immediate values, where the integer value is stored inside a pointer. The pointer is distinguished from regular pointers by a tag (that's why the concept also has the name “tagged values”). Numbers in Ruby, for instance, start out as Fixnums that fit inside a pointer; an operation that yields something bigger than Fixnum's maximum value yields a Bignum... which behaves the same way as a Fixnum, except that it stores bigger values and is actually stored on the heap.
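You can fake the promotion in Java, by hand and for one operation at a time. The sketch below is my own illustration of the idea, not how any of those languages implement it: it tries the native long addition via Math.addExact and falls back to BigInteger when that overflows.

```java
import java.math.BigInteger;

class PromotingAdd {
    // Returns a Long while the sum fits into 64 bits, a BigInteger once it doesn't.
    static Number add(long a, long b) {
        try {
            return Math.addExact(a, b); // throws ArithmeticException on overflow
        } catch (ArithmeticException overflow) {
            return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
        }
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));              // 3
        System.out.println(add(Long.MAX_VALUE, 1)); // 9223372036854775808
    }
}
```

Note what's missing: the caller gets a Number back and has to care which kind it is before doing anything else with it. In Ruby or Smalltalk the result is just an integer, and you carry on.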
A solution like this would have made Java a whole lot more consistent, and saved a lot of people a lot of work. Just think of the NIO2 team, who had to duplicate the whole Buffer hierarchy, basically copy/pasting the same stuff just because by choosing int for the Buffer classes, the designers limited themselves to indices that fit inside 31 bits (remember: there's a sign bit). The choice of a 32 bit type wasn't a proper choice… there's absolutely no reason for a buffer to be limited to, roughly, two billion elements or less.
Why are Java developers that use a 64 bit VM excluded from opening a GigaPixel RGB bitmap image file, that uses one byte per color, with a MappedBuffer? It‘s not different from an image file that contains only 100 MegaPixels.
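For the record, the arithmetic behind that question, as a tiny sketch (BufferLimit and its methods are names invented for this example):

```java
final class BufferLimit {
    /** Bytes needed for an image with the given pixel count and bytes per pixel. */
    static long imageBytes(long pixels, int bytesPerPixel) {
        return pixels * bytesPerPixel;
    }

    /** True if every byte of a buffer this large can be addressed with an int index. */
    static boolean fitsIntIndex(long bytes) {
        return bytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long gigaPixel = imageBytes(1_000_000_000L, 3);      // RGB, one byte per color
        long hundredMegaPixel = imageBytes(100_000_000L, 3);
        System.out.println(gigaPixel + " bytes, int-indexable: " + fitsIntIndex(gigaPixel));
        System.out.println(hundredMegaPixel + " bytes, int-indexable: " + fitsIntIndex(hundredMegaPixel));
    }
}
```

Three billion bytes against a ceiling of Integer.MAX_VALUE (2,147,483,647): the GigaPixel file is out, the 100 MegaPixel one is in, and nothing about the images themselves explains the difference.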
Sure, it‘s possible now… but only after bumping into the upper barrier, writing your senator to be allowed to upgrade to Java 7 (the one with NIO2), and then using BigBuffer. And then realizing that all your carefully Genericized code won‘t compile, because you have code like:
public void add(List<? extends Buffer> buffers)
strewn around your code base. Since BigBuffer is a copy/paste reimplementation of Buffer… well… have fun figuring out a good way
of consolidating the two, or buckle down and replace all your uses of Buffer with BigBuffer. ... Of course... good luck with any libraries that take Buffer objects and won't accept the BigBuffer no matter how much you tell the library that “Hey! Look! It has the same bloody interface… it‘s just… just… using a long instead of an int…“.
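One way of consolidating the two, sketched with made-up stand-in classes (IntIndexedBuffer and LongIndexedBuffer are NOT the real java.nio.Buffer or the proposed BigBuffer, and AnyBuffer is an interface invented here): hide both behind a common interface that speaks long, and write your generic code against that.

```java
import java.util.List;

/** Stand-in for a buffer class that indexes with int (hypothetical). */
final class IntIndexedBuffer {
    private final int capacity;
    IntIndexedBuffer(int capacity) { this.capacity = capacity; }
    int capacity() { return capacity; }
}

/** Stand-in for its copy/paste twin that indexes with long (hypothetical). */
final class LongIndexedBuffer {
    private final long capacity;
    LongIndexedBuffer(long capacity) { this.capacity = capacity; }
    long capacity() { return capacity; }
}

/** The common interface the two unrelated classes never got. */
interface AnyBuffer {
    long capacity();
}

final class BufferAdapters {
    // The int result is widened to long by the method reference.
    static AnyBuffer wrap(IntIndexedBuffer b) { return b::capacity; }

    static AnyBuffer wrap(LongIndexedBuffer b) { return b::capacity; }

    /** Generic code written once, against the common interface. */
    static long totalCapacity(List<? extends AnyBuffer> buffers) {
        long total = 0;
        for (AnyBuffer b : buffers) {
            total += b.capacity();
        }
        return total;
    }
}
```

This only helps with code you own. A library whose methods demand the concrete class still won't take the wrapper, which is exactly the complaint above.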
FileChannel class… try to spot which one‘s the new method in Java 7:
map(FileChannel.MapMode mode, long position, long size)
mapBigBuffer(FileChannel.MapMode mode, long position, long size)
Sigh… so… yeah. The next time someone asks you why you like them frufru, quiche eating languages like Ruby, Smalltalk, LISP and friends… kindly point out to them that by not duplicating the errors of other mainstream languages, they managed to avoid whole canyons of problems. And no: this has nothing to do with dynamic typing... it just stems from not ignoring Polymorphism in your designs.
(2008-02-10 19:15:12.0) Permalink Comments [2]
Userspace Threads =~ Kernel Threads OR: How I came to like Async I/O
You know… I‘d like to know something. Why in the blistering world am I lugging around, approximately, a 100 billion neurons? Those little whiskery bastards. Nothing to do but slouch about all day, only occasionally firing some signals. Or, to put it for the Web 2.0 generation: messaging their pals on the synaptic Social Graph-structure that makes up our brains.
But while they‘re keeping busy, processing data, occasionally procuring epiphanies and pestering me with requests for more information… they still miss glaring holes in my understanding of the world and crucial concepts.
Case in point, the topic keeping everyone awake during lunch time: the tradeoffs of userspace threading systems. A topic I‘d stuffed into the storage compartments of my mind years ago, believing I‘d understood it.
Except… of course… I didn‘t. I had simply eaten up the common misconceptions without putting up much of a fight. Bugger. Just goes to show that the monoculture of Java, that I‘d spent the first half of this decade in, is a terrible place for a mind to be (before you call this Java bashing: this was just my particular mono culture, groupthink hell…).
Once I looked into other systems, such as Smalltalk, it turned out there do exist solutions for this problem out there.
So here goes: language runtimes that map the language‘s threads (or processes or whatever they‘re called) directly to kernel threads are not necessarily superior to systems that use userspace threading for the language threads.
Now… whether a thread is scheduled by the OS or by a userspace scheduler is insignificant. Before you put up the word “cooperative”... think twice: userspace schedulers can work preemptively too. So.. no difference.
Here‘s the rub… and this is what I hadn‘t fully realized until some time ago. Two problems, I thought, kept userspace threads inferior to kernel threads. Let‘s look at them… and simultaneously, why they aren‘t problems:
… read and expects it to block until the data has arrived. Except... the interface between this call and the OS doesn't do it this way... it uses select or other non-blocking I/O to do the read asynchronously. The reading language thread is suspended, and another userspace thread is scheduled... the runtime's busy again. Once the data is available, the OS notifies the runtime and the userspace thread is scheduled again. Wait... did you notice it? The runtime didn't block or wait, twiddling its thumbs. It kept on running userspace code. The whole blocking syscall monster slain... transparently to the user's code, ie. it doesn't have to implement this via select or other means.
Using multiple cores
… select. So: these are delegated to a kernel thread. Ie. the userspace thread requesting this functionality is suspended; a request for the syscall invocation is sent to a kernel thread (to increase speed, imagine we have a thread pool of fired up threads). The request contains all necessary data, the userspace thread id, etc. With the thread suspended, the userspace scheduler wakes another thread… and the CPU‘s busy again. The kernel thread performs the call, ie. is blocked until the syscall returns, then simply kicks the scheduler and hands it the response packet (result + thread id). The suspended thread is scheduled again, and presented with the result for its nasty blocking call. Voila.
This does put the Ruby 1.9 threading model in perspective. I thought it was a weird model, a big honking GIL but no true parallelism for Ruby code? But what the native threads allow is full native threading compatibility. If a standard library method wants to do some asynchronous calling – it can! It can simply fire up a native thread and do whatever it needs. Sure: this means that only one Ruby thread can be scheduled … but many other threads can busily handle synchronous syscalls and I/O, and this work can be scheduled to other cores.
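The delegate-to-a-kernel-thread trick can be sketched in a few lines of Java. This is a toy under loud assumptions: GreenRuntime is a made-up class, a single-threaded executor stands in for the userspace scheduler, and a CompletableFuture plays the part of the response packet. A real runtime would suspend and resume actual thread stacks rather than chain callbacks.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical toy runtime: one scheduler thread plus a pool for blocking calls. */
final class GreenRuntime implements AutoCloseable {
    // The single thread that runs "language" code, standing in for the userspace scheduler.
    private final ExecutorService scheduler = Executors.newSingleThreadExecutor();
    // Kernel threads that are allowed to sit around inside blocking calls.
    private final ExecutorService blockers = Executors.newFixedThreadPool(4);

    /** Runs a piece of language-level code on the scheduler thread. */
    CompletableFuture<Void> run(Runnable task) {
        return CompletableFuture.runAsync(task, scheduler);
    }

    /**
     * Hands a blocking call to the pool, then resumes the caller's continuation
     * on the scheduler thread once the result is in.
     */
    <T, R> CompletableFuture<R> blockingCall(Supplier<T> syscall, Function<T, R> resume) {
        return CompletableFuture
                .supplyAsync(syscall, blockers)      // a pool thread blocks here
                .thenApplyAsync(resume, scheduler);  // "kick the scheduler" with the result
    }

    @Override
    public void close() {
        scheduler.shutdown();
        blockers.shutdown();
    }
}
```

While some pool thread is stuck inside the supplier, the scheduler thread is free to take whatever run() hands it. The blocking hasn't gone away, it has just been moved to a place where it doesn't stall the language threads.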
Not to mention: there‘s still the option of creating MultiVMs, a la Rubinius.
Whether it‘s possible to have two Ruby schedulers inside a single Ruby VM… that gets into a different area. It‘d mean that the VM kernel would have to be thread safe, i.e. all the internal management data structures. Eg. the data structure managing class data. If a new class is loaded, the class table/list must be locked, modified, unlocked. Why? Because if another thread accesses it, it mustn‘t be in an unusable state. Or: think about open classes: if changing a class needs a multi step operation, an atomic
transaction, you must protect the modifying and the reading code.
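The lock/modify/unlock dance for a class table, as a minimal sketch. ClassTable is a hypothetical stand-in (method bodies are just strings here), not how any actual Ruby VM stores its classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical class table: class name -> (method name -> implementation). */
final class ClassTable {
    private final Map<String, Map<String, String>> classes = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Multi-step change to an open class; readers never see the half-done state. */
    void reopen(String className, Map<String, String> newMethods) {
        lock.writeLock().lock();
        try {
            Map<String, String> methods =
                    classes.computeIfAbsent(className, name -> new HashMap<>());
            methods.putAll(newMethods);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Method lookup; many readers may hold the read lock at the same time. */
    String lookup(String className, String methodName) {
        lock.readLock().lock();
        try {
            Map<String, String> methods = classes.get(className);
            return methods == null ? null : methods.get(methodName);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Note that the reading side needs protecting too, not just the writer. That's the cost that hits every single method lookup once two schedulers share one VM.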
BTW… let‘s end on a tangent. This MultiVM idea? See… an interesting pattern emerges with Rubinius - which already has MultiVMs, which do run on different kernel threads. What we need now is to establish patterns, libraries and idioms to actually make use of this.
You see… starting off a separate VM is difficult without MultiVM
features… as I explained in the linked InfoQ article. Not just that: IPC‘s a bitch, even if you only want to exchange minuscule nuggets of data.
With the MultiVM library… it‘s fine. Particularly because the setup‘s hidden behind the nice interface. Hell… you could do the same interface for plain ol‘ Ruby 1.8.6 by firing up another Ruby process. Sure… it changes the cost structure (firing up an OS process is more expensive)... but at least all the nasty setup worry‘s gone. The library can take care of all the details, setups, OS specific workarounds… and that‘s that.
One blog entry that put me up to writing this is All System Calls Should Be Asynchronous (and others on the same blog).
This just reminded me of how we constantly have to reevaluate preconceptions and design decisions… Async I/O might have been a
nuisance with languages that didn‘t offer things like Coroutines or Continuations, ie. where you basically had to handroll state machines to handle it. But with languages that do offer them, you can get I/O APIs that feel synchronous to the high level language, but are actually implemented using asynchronous APIs underneath.
Come to think of it… that‘s exactly what the OS does. I/O is asynchronous… the moment an Ethernet packet is in your
network card‘s buffers, the card will (yes, asynchronously) kick (interrupt) your CPU and shout “Hey! There‘s a new packet… use it or lose it, pal!”.
Let me close with a question: am I missing something? Is there a reason why mapping threads to kernel threads and using synchronous versions of syscalls for I/O is better than their async/nonblocking brethren? One potential problem might be the bad or inconsistent state of nonblocking I/O APIs across OSes.
Even if that‘s the case… what‘s the solution? Midterm, what I could think of is simply replacing crappy OSes with ones that do async I/O
properly. How? Well, I‘m out of time here, but feel free to google for “Paravirtualization” once in a while. For instance, take an OS that supports m:n threading like NetBSD, rip out its guts and run it on a Hypervisor like Xen.
But I digress… let me know what you think, all you reddit KnowItAlls… what am I missing?
(2008-02-05 19:20:29.0) Permalink Comments [1]
Hats off to Microsoft everyone...
Here‘s something I never thought I‘d say: “Congratulations Microsoft! You‘ve done a remarkable job”. Wait! I mean that seriously, no sarcasm in sight.
What would drive me to say something as politically incorrect as this? Well, it‘s a quote from this little love fest on the Ruby.NET mailing list:
I can‘t help but agree here: I do believe RoR on .NET would be a good starting point, but ASP.NET is too powerful of a framework (in particular, as you point out, the ASP.NET pipeline) to play second fiddle. A solid integration of the best of both worlds is really what I personally believe would be the best overall approach: Place the focus on bringing RoR to ASP.NET as a first-class ASP.NET-based application, not ASP.NET to RoR as a first-class Rails application.
I just love the fact that MS has managed to condition their developer community so thoroughly... Soooooo thoroughly that even guys on the mailing list of an Open Source alternative to MS‘ IronRuby, will hasten to ensure that MS‘ products get preferential treatment.
Yeah… Screw the Open Source product Rails.
Screw the fact that once a Ruby implementation can run a Rails application flawlessly, it has crossed a significant milestone towards being fully compatible to MRI.
Screw all that… first focus on supporting ASP.NET correctly… then maybe add in Rails support… but only so it‘s compatible with ASP.NET.
So yeah… my hat is off to Microsoft and their community liaison department for doing such a great job. Hey… I vote for a raise for those guys and gals in that department… you gotta respect people doing a good job. Move over Steve Jobs‘ Reality Distortion Field… the .NET developer community‘s minds seem to have been warped around Microsoft‘s hemorrhoids so thoroughly as to basically form a symbiotic relationship.
It gets even funnier, once you realize that even Microsoft‘s own employees keep on telling everyone how Rails support in IronRuby is important. Isn‘t it curious that evil Microsoft (yeah they are evil ... Hell, John Lam didn't even bother to hide his pointy tail in his recent RubyConf07 talk) would go on about being compatible, whereas people on the Open Source Ruby.NET project are racking their brains how to best preserve the value of MS products such as ASP.NET.
Oh boy…
(2008-02-05 18:07:58.0) Permalink Comments [1]
Oy… someone‘s awfully proud of having internalized bit operations
Unless I‘m misunderstanding this, the poster argues that keeping inlined bit twiddling code is preferable to extracting this code and stuffing it into a function. The offending quote from the book (also in the post):
Hm. Frankly, if I see (low + high) >>> 1 sprinkled over a code base I know a few things about the programmer that wrote it:
while(foo()){
    int x = nextFoo();
    // do stuff
}
into:
int x = 0;
while(foo()){
    x = nextFoo();
    // do stuff
}
calculateMidpoint in there, I instantly know what this expression does. Why? Because it says what it does... right there in the code. When I see (low + high) >>> 1, I have to decode it. It‘s not difficult… but it‘s still more work for the reader. But that‘s the problem our old brains have: we have a limited number of general purpose registers (GPRs)... so we have to use them carefully. If I need all GPRs to figure out that snippet of code, this pushes out all the other little nuggets of information that I‘ve accumulated… like what the containing code actually does, what that variable is or that I left my hand on the scalding hot coffee mug… This is the secret behind the power of abstraction: by naming a concept and only dealing with that (instead of balancing the whole concept in my mind) I can combine more than one concept simultaneously.
calculateMidpoint instead of that other snippet is more beautiful, in a small way.
Now… just so I‘m not just negative… I do kinda agree with some of the blog entry, i.e. that many code monkeys programmers nowadays have little to no education in the basics. But then... everyone has a different interpretation of what basics are.
Basically: we have limited brain space, not to mention limited time to acquire skills. By moving up the abstraction layers to using more powerful tools, we can do more.
If you‘re not convinced: this article titled Brain Rot by Theodore Gray argues this much better and in more depth. If you ever thought the world‘s going to the dogs because teenagers of today don‘t know how to do [old skill you spent a lot of time learning], then this article is for you.
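Here is roughly what stuffing the bit twiddling into a function looks like. The name calculateMidpoint is from the discussion above; the Midpoint class and the little binary search around it are just scaffolding invented for this sketch.

```java
final class Midpoint {
    /**
     * Midpoint of two non-negative ints with low <= high. The unsigned shift
     * keeps the result correct even when low + high overflows int.
     */
    static int calculateMidpoint(int low, int high) {
        return (low + high) >>> 1;
    }

    /** Binary search over a sorted array; returns the index of key, or -1. */
    static int indexOf(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = calculateMidpoint(low, high);
            if (sorted[mid] < key) {
                low = mid + 1;
            } else if (sorted[mid] > key) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }
}
```

The cleverness still exists, but it lives in exactly one place, with a comment explaining why it's there. The loop itself reads like the description of the algorithm.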
Note: BTW the title of this post was chosen to avoid using that cliche‘d old Dijkstra title. I‘ll give you a head start with decoding by providing the Wikipedia entry for Superbia...
(2007-12-13 16:40:32.0) Permalink Comments [1]
Conference season is upon us… and I‘m smack in the middle of it. I got my little bag onna stick (apple and loaf of bread laptop and passport), hobostyle, and set off Sunday.
Currently I‘m chillin‘ in Manhattan, hanging out in the Club lounge of my hotel (whose daily rate I still can‘t believe), using the free wireless and living off f‘ree h‘ors d‘oeuvre. Free is good… better than the kind of prices you usually deal with in hotels.
Not to mention cabs… every once in a while, when traveling, I seem to have “gullible” spraypainted on my forehead, plain for everybody, but me, to see. Yesterday, at JFK, I actually hopped into an unmarked, unlicensed taxi… no Yellow Cab, just a plain white car. I can‘t remember why I ended up there… but I do remember holding on to my only defensive weapon, my mobile phone, during most of the ride... (if you don't think a mobile phone is good for self defense, you haven't seen my 1999 humdinger of a S1088). While the cabbie was friendly enough, I had visions of ending up in some back alley, stripped of all possessions, bareassed and embarrassed for falling for a tourist trap. Well... lucky me: it ended well with me being safely deposited next to my hotel, and the cabbie riding off with ... wayyyy too much of my money. Next time I'd like to take a GPS receiver, just to see what kinds of fractal looperoo I had been taken on. All I know is that I passed by the 1964 NY World‘s Fair site twice…
Oh well… actually, the stay in NYC is just a temporary one… I had to go via NYC anyway, so I thought I might as well stay four days, before going off to these two conferences:
So… lots of travelling, bad airline food, cramped seating and the horror of the LGA (I‘ll never understand how people can call Heathrow bad… they must never have seen LGA‘s two dimensional waiting lines of hell, not to mention the roofs that seem to be leaking after a bit of rain…). Oh well, on the upside there‘s coast to coast sampling of Starbuckses, lots of rooms in highrise hotels (48th floor in NYC… Yay! I can‘t help it… I stayed in a crappy, teenzy apartment for ages just because it was on the 12th floor…).
PS: Disclaimer: above blog entry is entirely devoid of useful information for you reader… I guess that‘s why it‘s filed under /proc/murphee …
(2007-10-29 21:39:00.0) Permalink Comments [0]
Daily Show Archives now online!
Oh oh oh! Look at what lovely young Morgan Webb dragged in: The Daily Show is now available online! 8 years worth of it! Ding… dong!
And not just that… it‘s even legal, paid for by pre/post run ads (which, BTW will be burned into your mind after a wild romp through the archives).
Thanks Morgan. Well… and thanks, big, faceless company that made this possible.
Oh yeah… to get you started: here, have some Tom Waits meets Jon Stewart action.
(2007-10-22 19:11:30.0) Permalink Comments [0]
iPhone/iPod touch native apps - now for real
Engadget reports a new item on Apple's HotNews page ... yes, straight from the horse's mouth: iPhone and iPod native apps SDK is coming in February 2008 (note to future readers: it‘s the news item from October 17th 2007 – I can‘t find a Permalink for this particular item).
Dammit Apple… stop tempting me… a 2nd generation iPod touch is going to be hard to resist…
(2007-10-17 12:54:31.0) Permalink Comments [0]
Finally, a motivational poster I'd want...
Hehe… John Wiseman points to a motivational poster that's actually inspiring. And while there are other CS heroes out there, LISP creator John McCarthy has the disapproving glare to make it work. Others like Donald Knuth are probably known to more people, but no one‘s ever been inspired to take a critical look at the methods they use by the smiley, skinny Buddha that is Knuth. Nope, that kind of introspection can only be achieved by a glare that says “Look, get your act together or I‘ll do some blightin‘ smitin‘!”.
On my mind: Rubinius, Pattern Matching, Generic functions vs. parametric polymorphism vs. OOP, Manhattan hotels, whiskers on kittens (wha?).
In my ears: er… I guess Jill Scott via Gilles Peterson... a bientot…
(2007-10-12 14:13:34.0) Permalink Comments [0]
Here we go again, JRuby 1.0.1 is out and so is JParseTree 1.0.1.717. In case you‘re wondering: the first three figures of the version number are the JRuby version that JParseTree works with, the rest is just the SVN version of the released code.
This release is compatible with JRuby 1.0.1 and ParseTree 2.0.1 (BTW ignore the APIs mentioned in the ParseTree release… the names haven‘t changed, it‘s just a formatting error in the release).
Important: this release will only work with JRuby 1.0.1 because of a changed API in JRuby. In case you're wondering, it's the JRuby.parse(source, filename) method which was replaced with a three argument version. That‘s what you get for using unsupported APIs…
Anyway… enjoy… and watch out for ParseWeasels…
(2007-08-26 15:08:03.0) Permalink Comments [0]