Rip.java: stream manipulation for Java programmers
I never learned
sed or
awk. Or even Perl. But I'm pretty good with Java's
regex, and I'm familiar with the new
text formatting facilities in Java 5.
So rather than tricking myself into learning
sed and
awk, I wrote my own stream processor that uses Java's regex and pattern syntax:
jessewilson$ Rip.java
Usage: Rip [flags] <regex> <format>
regex: a Java regular expression, with groups
http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html
you can (parenthesize) groups
\s whitespace
\S non-whitespace
\w word characters
\W non-word
format: a Java Formatter string
http://java.sun.com/javase/6/docs/api/java/util/Formatter.html
%[argument_index$][flags][width][.precision]conversion
'%s', '%1$s' - the full matched text
'%2$s' the first (parenthesized) group
Use 'single quotes' to prevent bash from interfering
flags:
--skip_unmatched: ignore input that doesn't match <regex>
-s:
--newline <text>: use <text> to separate lines in output
-n <text>:
So it takes Java regexes in, finds matching groups in parentheses, and then spits those back out using String.format. Here are some examples:
jessewilson$ echo "7278 ttys001 0:00.66 ssh jessewilson.publicobject.com" |
Rip.java 'ssh.*' '%s'
ssh jessewilson.publicobject.com
jessewilson$ echo "http://publicobject.com/glazedlists/ Glazed Lists Homepage" |
Rip.java 'http://([\w.]+)\S*\s+(.*)' '%3$s: %2$s'
Glazed Lists Homepage: publicobject.com
These examples are certainly the tip-of-the-iceberg. I suspect I'll be using this tool to munge output from many processes into the input for many other processes.
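For the curious, the core idea fits in a few lines. This is a minimal sketch and not the actual Rip source: the class name RipSketch and the method rip are made up for illustration, and it handles one line at a time with no flags. The full match becomes format argument 1 and each parenthesized group follows it, which is why '%2$s' means the first group.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the regex-plus-format idea; not the real Rip source. */
class RipSketch {
  /**
   * Finds the first match of regex in line and renders it with format.
   * Argument 1 is the full match and argument n+1 is group n.
   * Returns null when the line doesn't match.
   */
  static String rip(String regex, String format, String line) {
    Matcher matcher = Pattern.compile(regex).matcher(line);
    if (!matcher.find()) {
      return null;
    }
    // Copy group 0 (the full match) and every parenthesized group into the argument list.
    Object[] args = new Object[matcher.groupCount() + 1];
    for (int i = 0; i <= matcher.groupCount(); i++) {
      args[i] = matcher.group(i);
    }
    return String.format(format, args);
  }
}
```

Calling RipSketch.rip with the regex, format and input from the second example above returns "Glazed Lists Homepage: publicobject.com".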
Try Rip Out
Download
Rip.java, make it executable (
chmod a+x Rip.java) and put it somewhere on your path. In what is almost certainly more clever than useful, I hacked it up so the uncompiled source can be executed directly by Bash:
/*bin/mkdir /tmp/rip 2> /dev/null
javac -d /tmp/rip $0
java -cp /tmp/rip Rip "$@"
exit
*/
import java.io.*;
import java.util.*;
import java.util.regex.*;
public class Rip {
...
}
Replace my clever hack with a
.class and wrapper script if you'd prefer.
# posted by Jesse Wilson
on Saturday, August 23, 2008
Coding in the small with Google Collections: AbstractIterator
Part 17 in a Series. I really like the Java Collections API. So much so that I use 'em when I'm doing work that isn't particularly collectioney. For example, I recently wrote a quick-n-dirty app that rewrote some files line-by-line. Instead of using a
Reader as input, I used an
Iterator<String>. The easiest way to create such an iterator is to load the entire file into memory first.
Before:
public Iterator<String> linesIterator(Reader reader) {
BufferedReader buffered = new BufferedReader(reader);
List<String> lines = new ArrayList<String>();
try {
for (String line; (line = buffered.readLine()) != null; ) {
lines.add(line);
}
} catch (IOException e) {
throw new RuntimeException(e);
}
return lines.iterator();
}
That code is simple, but inefficient. And it won't work if the file doesn't fit into memory. A better approach is to implement
Iterator and to read through the file on-demand as the lines are requested. Google Collections'
AbstractIterator makes this easy. Whenever a new line is requested, it gets called back to read it from the stream.
After:
public Iterator<String> linesIterator(Reader reader) {
final BufferedReader buffered = new BufferedReader(reader);
return new AbstractIterator<String>() {
protected String computeNext() {
try {
String line = buffered.readLine();
return line != null ? line : endOfData();
} catch (IOException e) {
throw new RuntimeException(e);
}
}
};
}
This class really takes the fuss out of custom iterators. Now it's not difficult to create iterators that compute a series, process a data stream, or even compose other iterators.
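To see what fuss is being taken out, here is roughly the same lazy iterator written against plain java.util.Iterator. It's a sketch under my own assumptions (the class name LazyLines and the helper readAll are hypothetical, and this is not how AbstractIterator is implemented internally), but it shows the read-ahead bookkeeping you'd otherwise write by hand.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical hand-rolled equivalent of the AbstractIterator version. */
class LazyLines {
  static Iterator<String> linesIterator(Reader reader) {
    final BufferedReader buffered = new BufferedReader(reader);
    return new Iterator<String>() {
      private String next;   // line read ahead but not yet handed out
      private boolean done;  // true once readLine() has returned null

      @Override public boolean hasNext() {
        if (next == null && !done) {
          try {
            next = buffered.readLine();
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
          done = (next == null);
        }
        return next != null;
      }

      @Override public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        String result = next;
        next = null;
        return result;
      }
    };
  }

  /** Small helper for trying it out: drains the iterator into a list. */
  static List<String> readAll(String text) {
    List<String> result = new ArrayList<String>();
    Iterator<String> lines = linesIterator(new StringReader(text));
    while (lines.hasNext()) {
      result.add(lines.next());
    }
    return result;
  }
}
```

Two fields, two methods and an easy-to-miss NoSuchElementException, versus a single computeNext().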
# posted by Jesse Wilson
on Wednesday, August 13, 2008
Coding in the small with Google Collections: Sets.union, intersection and difference
Part 16 in a Series. The traditional approach to unions is to first create a new Set, and then to
addAll using each component set. You can use a similar approach to do differences and intersections.
Before:
private static final ImmutableSet<String> LEGAL_PARAMETERS;
static {
Set<String> tmp = new HashSet<String>();
tmp.addAll(REQUIRED_PARAMETERS);
tmp.addAll(OPTIONAL_PARAMETERS);
LEGAL_PARAMETERS = ImmutableSet.copyOf(tmp);
}
public void login(Map<String, String> params) {
if (!LEGAL_PARAMETERS.containsAll(params.keySet())) {
Set<String> unrecognized = new HashSet<String>(params.keySet());
unrecognized.removeAll(LEGAL_PARAMETERS);
throw new IllegalArgumentException("Unrecognized parameters: "
+ unrecognized);
}
if (!params.keySet().containsAll(REQUIRED_PARAMETERS)) {
Set<String> missing = new HashSet<String>(REQUIRED_PARAMETERS);
missing.removeAll(params.keySet());
throw new IllegalArgumentException("Missing parameters: " + missing);
}
...
}
Google Collections has methods that do set arithmetic in a
single line.
After:
private static final ImmutableSet<String> LEGAL_PARAMETERS
= Sets.union(REQUIRED_PARAMETERS, OPTIONAL_PARAMETERS).immutableCopy();
public void login(Map<String, String> requestParameters) {
if (!LEGAL_PARAMETERS.containsAll(requestParameters.keySet())) {
throw new IllegalArgumentException("Unrecognized parameters: "
+ Sets.difference(requestParameters.keySet(), LEGAL_PARAMETERS));
}
if (!requestParameters.keySet().containsAll(REQUIRED_PARAMETERS)) {
throw new IllegalArgumentException("Missing parameters: "
+ Sets.difference(REQUIRED_PARAMETERS, requestParameters.keySet()));
}
...
}
Unlike the traditional approach, these methods don't do any copies! Instead, they return views that delegate to the provided sets. In the occasional case when the copy is worthwhile, there's a handy method
immutableCopy to give you one.
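If "view" sounds abstract, here's a rough JDK-only sketch of what a difference view could look like. The names SetViews and differenceView are hypothetical and this isn't the Google Collections implementation; the point is only that nothing is copied, so the result tracks later changes to either input set.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical sketch of a delegating set view; not the library's code. */
class SetViews {
  /** Returns an unmodifiable, live view of the elements in a that are not in b. */
  static <E> Set<E> differenceView(final Set<E> a, final Set<?> b) {
    return new AbstractSet<E>() {
      @Override public Iterator<E> iterator() {
        // Filter lazily on every call; no elements are stored in the view.
        return a.stream().filter(e -> !b.contains(e)).iterator();
      }

      @Override public int size() {
        // Computed on demand, so it costs a pass over a each time.
        return (int) a.stream().filter(e -> !b.contains(e)).count();
      }

      @Override public boolean contains(Object o) {
        return a.contains(o) && !b.contains(o);
      }
    };
  }
}
```

The tradeoff is visible in size(): a view is free to create but pays on every query, which is when a one-off copy becomes worthwhile.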
# posted by Jesse Wilson
on Monday, August 11, 2008
Google Collections talk, Aug 6 at the Googleplex
The
Google Tech Users Group is hosting a talk that will interest Java developers:
Overview:
How the Google Collections Library builds on java.util to provide more building blocks for doing your job.
Where:
Building 42 of the Googleplex, Mountain View, California
When:
6:00pm Food, social, demos and announcements
7:00pm Talk by Kevin Bourrillion
If you'll be in the valley, you can
register for the free event. After the talk, please join us for beers 'n' boardgames!
# posted by Jesse Wilson
on Thursday, July 31, 2008
Correctness and my wife
I do this really annoying thing when I'm hanging out with my wife. I correct her when she uses the "wrong" words...
We're walking around town when we see something out of the ordinary - like a humongous fat dog or a friendly hobo or a police chase.
Her: "That was random"
Me: "It was unexpected. Nothing really random about it."
Her: "Ohhkayyy programmer boy. It was random. Get over it."
Our apartment is in a bit of a ghetto. So we're lying in bed and we can hear another loud argument between the neighbours downstairs. That couple has one of those dramatic relationships and they break out in yelling one night every month or so.
Me: "Jodie, why are you so anxious? Get some sleep, gotta wake up early in the morning!"
Her: "I can't sleep. I'm worried."
Me: "About what?"
Her: "Lots of abstract things. For example, I'm worried that she'll pull a gun on him, and they'll shoot a bullet through the ceiling, and it will hit me!"
Me: "Huh? That's not going to happen. And if it would, it would totally hit you in the ass and then you'd have a cool story."
Her: "But I'm a worrier. I always worry about abstract things like that."
Me: "That's not abstract. It's imaginary! You're worrying about something very explicit and vivid. An abstract worry might be about say, your happiness or the future. But not getting shot in the ass. That's just imaginary."
Her: "Ohhkayyy programmer boy. It was random. Get over it."
As you can see, I'm clearly getting all caught up in anal-retentive word usage and it drives Jodie crazy. I think it's 'cause I read code all day and argue with my coworkers whether the method should be named
safeSubstring() or
lenientSubstring(). She was surprised to hear that as somebody who uses computer languages all day, I spend a lot of time thinking about words from English.
I gotta stop code-reviewing our conversations.
# posted by Jesse Wilson
on Friday, July 25, 2008
Two use cases - two names?
Today I did something I've never done before - I created a method that's an 'alias' to another method. I still think this is the right thing to do, but I still find it kind of weird...
Getting Providers
Binder.getProvider() allows your module to get a
Provider<T> while the injector is still being created. The returned provider can't be used until the injector is ready, but that usually isn't a deal breaker:
public class ListOfFiltersModule extends AbstractModule {
public void configure() {
/* get the filters set, perhaps bound with multibindings */
final Provider<Set<Filter>> filterSetProvider
= getProvider(Key.get(new TypeLiteral<Set<Filter>>() {}));
/* use that provider in a provider instance binding */
bind(new TypeLiteral<List<Filter>>() {}).toProvider(
new Provider<List<Filter>>() {
public List<Filter> get() {
Set<Filter> filtersUnordered = filterSetProvider.get();
List<Filter> result = new ArrayList<Filter>();
result.addAll(filtersUnordered);
Collections.sort(result, FILTER_ORDER);
return Collections.unmodifiableList(result);
}
});
}
}
For this purpose, this works pretty well.
Listing Dependencies
There's another use case for
getProvider - making dependencies explicit. It's handy to include the list of bindings that a module
wants right in the module. This helps Guice to fail-faster if a required binding is missing. More importantly, it makes the dependency obvious to the maintainers of the code:
public class DeLoreanModule extends AbstractModule {
public void configure() {
/* bindings needed from other modules */
getProvider(FluxCapacitor.class);
getProvider(TimeCircuits.class);
/* our bindings */
bind(Car.class).toProvider(DmcCarProvider.class);
}
}
This is okay,
but a clarifying comment is necessary here. Otherwise the
getProvider call looks out of place - the maintainers you're trying to help may delete this "unnecessary" call!
requireBinding
Giving
getProvider a better name for use case #2 makes the intentions more explicit. It also means we can axe the now-redundant comment:
public class DeLoreanModule extends AbstractModule {
public void configure() {
requireBinding(FluxCapacitor.class);
requireBinding(TimeCircuits.class);
bind(Car.class).toProvider(DmcCarProvider.class);
}
}
Behind-the-scenes, all
requireBinding() does is call through to
getProvider() to ensure that key is bound. But this is just an implementation detail - we could later change
requireBinding to do something different if there was a preferred alternative.
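Stripped of Guice, the alias pattern looks something like this. Everything here is a hypothetical stand-in (AliasSketch is not a Guice class and its getProvider just returns the class it was given), so treat it as a sketch of the shape rather than the real implementation.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a binder; not Guice's API. */
class AliasSketch {
  private final Set<Class<?>> requested = new HashSet<Class<?>>();

  /** Use case 1: get a handle to something that will be bound later. */
  <T> Class<T> getProvider(Class<T> type) {
    requested.add(type);
    return type;
  }

  /** Use case 2: declare a dependency. Same implementation, clearer name. */
  void requireBinding(Class<?> type) {
    getProvider(type);  // the return value is deliberately ignored
  }

  /** Lets callers observe that both entry points record the same thing. */
  boolean wasRequested(Class<?> type) {
    return requested.contains(type);
  }
}
```

The alias costs one line, and the call site now says what it means without needing a comment.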
Multiple Names
We have two use cases and two names. But only one implementation! I think it's legit, but it's a bit surprising. Where else does this come up?
# posted by Jesse Wilson
on Thursday, July 24, 2008
Don't create multiple annotations with the same simple name
Nobody reads imports. Good IDEs do their best to pretend imports don't even exist - they'll hide 'em from you, and manage them for you. They'll even add imports on demand when you're writing new code.
Suppose you create your own, say
@Inject or
@RequestScoped annotation. In the code, it's practically impossible to differentiate between this and a Guice-supplied annotation:
package com.publicobject.pizza;
import com.google.common.base.Preconditions;
import com.google.common.collect.ImmutableSet;
import com.google.inject.Injector;
import com.google.inject.Provider;
import com.publicobject.pizza.annotation.RequestScoped;
import com.publicobject.pizza.annotation.Inject;
import com.publicobject.pizza.geography.GeographyService;
import com.publicobject.pizza.hr.EmployeeRoster;
@RequestScoped
public class PizzaStore {
@Inject PizzaStore(GeographyService geography,
EmployeeRoster workers) { ... }
}
Your head will explode debugging problems if the wrong annotation is applied. Guice can detect
some problems (blowing up on a mismatched scope annotation) but it's still risky business.
# posted by Jesse Wilson
on Monday, July 21, 2008
TypeResolver tells you what List.get() returns
It's diminishingly rare that I get to write code that improves the internals of both
Glazed Lists and
Guice...
Glazed Lists' BeanProperty
BeanProperty is a convenient utility class that can expose a JavaBeans getter/setter property as its own object. You give it a class (like
Baz.class) and a property name (like "bar") and it gives you a full property object - you can use it to read and write the property. It even exposes the property's type:
class Baz {
private String bar;
String getBar() {
return bar;
}
void setBar(String bar) {
this.bar = bar;
}
}
public void testBaz() {
Baz baz = new Baz();
BeanProperty<Baz> bar = new BeanProperty<Baz>(Baz.class, "bar");
assertEquals(String.class, bar.getValueClass());
bar.set(baz, "hello");
assertEquals("hello", baz.getBar());
}
Eric Burke reported
a bug where
BeanProperty wasn't doing the right thing for generic properties. In this example, we report the value type of Foo.bar as
Object.class rather than
String.class:
class Foo<T> {
private T bar;
T getBar() {
return bar;
}
}
class Baz extends Foo<String> {}
In order to get the return type of
Baz.getBar(), we need to map Foo's type parameter
T to
java.lang.String.
Guice's ProviderMethods
The upcoming release of Guice lets you specify bindings with annotated methods:
class BarProviderMethods {
@Provides @Singleton
Bar provideBar() {
Bar result = new Bar();
result.setBaz("hello");
return result;
}
}
...but we run into problems if users specify generic provider methods. Guice wants to bind the return type of the provider method. Unfortunately, due to generics, that type might be insufficient:
class SetProviderMethods<T> {
@Provides
Set<T> provideSetOfT(T onlyElement) {
return ImmutableSet.of(onlyElement);
}
}
Enter TypeResolver
This class takes a generic type (like
ArrayList<String>) and exposes
precisely what the return types will be:
public void testTypeResolver() throws Exception {
Type listOfString = new TypeLiteral<List<String>>() {}.getType();
Method getMethod = List.class.getMethod("get", int.class);
TypeResolver resolver = new TypeResolver(listOfString);
assertEquals(String.class, resolver.getReturnType(getMethod));
}
I suspect this class is generally useful for any app that uses a reasonable amount of reflection. If this is useful to you, grab the source from
Guice SVN.
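To give a feel for what the resolver has to do, here's a much-simplified sketch using only java.lang.reflect. It is not the TypeResolver source: the class name ReturnTypeSketch and its methods are hypothetical, and it only handles a type variable declared on the direct superclass, where the real thing has to cope with interfaces, deeper hierarchies and nested generics.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

/** Hypothetical, simplified sketch; not the TypeResolver implementation. */
class ReturnTypeSketch {
  static class Foo<T> {
    T bar;
    T getBar() { return bar; }
  }

  static class Baz extends Foo<String> {}

  /** Maps a type-variable return type to the argument supplied by subclass's direct superclass. */
  static Type resolveReturnType(Class<?> subclass, Method method) {
    Type declared = method.getGenericReturnType();          // T
    Type superType = subclass.getGenericSuperclass();       // Foo<String>
    if (!(declared instanceof TypeVariable) || !(superType instanceof ParameterizedType)) {
      return declared;
    }
    ParameterizedType parameterized = (ParameterizedType) superType;
    TypeVariable<?>[] variables = ((Class<?>) parameterized.getRawType()).getTypeParameters();
    Type[] arguments = parameterized.getActualTypeArguments();
    for (int i = 0; i < variables.length; i++) {
      if (variables[i].equals(declared)) {
        return arguments[i];                                 // String
      }
    }
    return declared;
  }

  /** Convenience wrapper so callers don't have to deal with NoSuchMethodException. */
  static Type returnTypeOfBazGetBar() {
    try {
      return resolveReturnType(Baz.class, Foo.class.getDeclaredMethod("getBar"));
    } catch (NoSuchMethodException e) {
      throw new AssertionError(e);
    }
  }
}
```

Plain reflection on getBar() reports Object for the return type; walking the generic superclass is what recovers String.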
# posted by Jesse Wilson
on Sunday, July 20, 2008
The reasons I'm not on iPhone
It's really tempting. As far as devices go, iPhone 3G is the best there is. It's a generation ahead of its competitors, and the gap is growing. The app store is great for both developers and for its users. But I'm not gonna get one:
Amazon MP3. Lots of good songs with no 'deregister this computer' nonsense. But Amazon MP3
cannot compete with iTunes on Apple's platform—third party apps can't download while I surf the web. This also kills movie and video downloads.
Skype. Voice-over-IP is forbidden on the iPhone 'cause AT&T wants to bill voice traffic at a different rate than data. Even on my home network, AT&T would prefer I pay them for each minute of each call. I want net neutrality but the iPhone won't have any of it.
Flash and Java. That means no Line Rider, Hulu, Naval Command or flexgames.com. Apple would prefer for developers to write iPhone-only apps rather than phone-independent apps. This is good for Apple but bad for developers.
The iPhone is a lock-in platform. Sorry, Apple, but I'm not ready for that level of commitment.
# posted by Jesse Wilson
on Friday, July 18, 2008
Strict vs. Forgiving APIs
Suppose it's the early 1990's and you're James Gosling implementing
String.substring(int, int) for the first time. What should happen when the index arguments are out-of-range? Should these tests pass? Or throw?
public void testSubstring() {
assertEquals("class", "superclass".substring(5, 32));
assertEquals("super", "superclass".substring(-2, 5));
assertEquals("", "superclass".substring(20, 24));
assertEquals("superclass", "superclass".substring(10, 0));
}
Forgiving APIs
In a forgiving API, these tests pass. The implementation would recognize the out-of-range indices and correct for them. Benefits of forgiving APIs:
- Fault-tolerant. An off-by-one mistake won't bring a production system to its knees.
- Easier to code against. If you don't know what to use for a given argument, just pass
null and the implementation will do something reasonable.
Strict APIs
In a strict API, the out-of-range arguments to
substring are forbidden and the method throws an
IllegalArgumentException. Benefits of strict APIs:
- Fail-fast. An off-by-one mistake will be caught in unit tests, if they exist.
- Easier to maintain. By limiting the number of valid inputs, there's less behaviour to maintain and test.
- More Predictable. Mapping invalid inputs to behaviour is an artform. In the example, should
substring(10, 0) return the empty string? Or "superclass"? What would the caller expect?
For maintainability, I almost always prefer strict APIs. I like to think of the classes in my code as the gears in a fine Swiss watch. Everything fits together tightly, with firm constraints on both the inputs and the outputs. I can refactor with confidence because the system simply won't work if I've introduced problems into it. With a forgiving API, I could introduce bugs and not find out about them until much later.
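Here's what the two styles might look like side by side. Both methods are hypothetical helpers (neither is how java.lang.String behaves), and the lenient one encodes just one of several defensible guesses: it clamps each index into range and swaps them when they're reversed, which is what it takes to make all four tests above pass.

```java
/** Hypothetical helpers contrasting the two API styles. */
class SubstringSketch {
  /** Strict: anything outside 0 <= begin <= end <= length is rejected up front. */
  static String strictSubstring(String s, int begin, int end) {
    if (begin < 0 || end > s.length() || begin > end) {
      throw new IllegalArgumentException(
          "begin=" + begin + ", end=" + end + ", length=" + s.length());
    }
    return s.substring(begin, end);
  }

  /** Forgiving: reorders reversed indices and clamps both into [0, length]. */
  static String lenientSubstring(String s, int begin, int end) {
    int lo = clamp(Math.min(begin, end), s.length());
    int hi = clamp(Math.max(begin, end), s.length());
    return s.substring(lo, hi);
  }

  private static int clamp(int index, int length) {
    return Math.max(0, Math.min(index, length));
  }
}
```

Notice how much policy hides in the lenient version. Swapping reversed indices was my call; returning the empty string would have been just as easy to justify, and a caller has no way to know which one they'll get.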
# posted by Jesse Wilson
on Monday, June 30, 2008
What's a Hierarchical Injector?
Our application has two implementations for one interface.
EnergySource is implemented by both
Plutonium and
LightningBolt:
class DeLorean {
@Inject TimeCircuits timeCircuits;
@Inject FluxCapacitor fluxCapacitor;
@Inject EnergySource energySource;
}
interface FluxCapacitor {
boolean isFluxing();
}
@Singleton
class RealFluxCapacitor implements FluxCapacitor {
@Inject TimeCircuits timeCircuits;
boolean isFluxing;
public boolean isFluxing() {
return isFluxing;
}
}
class TimeCircuits {
Date whereYouveBeen;
Date whereYouAre;
Date whereYourGoing;
}
interface EnergySource {
void generateOnePointTwentyOneGigawatts();
}
class Plutonium implements EnergySource { ... }
class LightningBolt implements EnergySource { ... }
And to allow for sequels, we assume other implementations of
EnergySource are possible. We'd like to create an
Injector immediately and create a Plutonium-powered DeLorean. Shortly thereafter, we'd like to re-use that same
Injector, but with a lightning bolt for energy.

Option one: Factory classes
We can solve this problem by introducing a
DeLorean.Factory class that accepts an
EnergySource as its only parameter:
class DeLorean {
private final TimeCircuits timeCircuits;
private final FluxCapacitor fluxCapacitor;
private final EnergySource energySource;
DeLorean(TimeCircuits timeCircuits,
FluxCapacitor fluxCapacitor,
EnergySource energySource) {
this.timeCircuits = timeCircuits;
this.fluxCapacitor = fluxCapacitor;
this.energySource = energySource;
}
static class Factory {
@Inject TimeCircuits timeCircuits;
@Inject FluxCapacitor fluxCapacitor;
DeLorean create(EnergySource energySource) {
return new DeLorean(timeCircuits, fluxCapacitor, energySource);
}
}
}
This works for our
specific problem, but in general it's quite awkward:
- It requires a gross amount of boilerplate code.
- It discourages refactoring of the DeLorean class.
- It increases the complexity of getting an EnergySource.
- It doesn't work unless EnergySource is a direct dependency of the DeLorean class. Otherwise you need to create lots of little factories that cascade.
- And EnergySource is no longer in-the-club: it doesn't participate in Guice's injection, AOP, scoping, etc.
Option two: AssistedInject
AssistedInject is a Guice extension that's intended to reduce the boilerplate of option one. Instead of a factory class, we write a factory interface plus annotations:
class DeLorean {
TimeCircuits timeCircuits;
FluxCapacitor fluxCapacitor;
EnergySource energySource;
@AssistedInject
DeLorean(TimeCircuits timeCircuits,
FluxCapacitor fluxCapacitor,
@Assisted EnergySource energySource) {
this.timeCircuits = timeCircuits;
this.fluxCapacitor = fluxCapacitor;
this.energySource = energySource;
}
interface Factory {
DeLorean create(EnergySource energySource);
}
}
This fixes some problems. But the core issue still remains: getting an instance of
EnergySource is difficult. Unlike regular Guice (
@Inject is the new
new), you need to change all callers if you add a dependency on
EnergySource.
Option three: Hierarchical Injectors
The premise is simple.
@Inject anything, even stuff you don't know at injector-creation time. So our
DeLorean class would look exactly as it would if
EnergySource was constant:
class DeLorean {
TimeCircuits timeCircuits;
FluxCapacitor fluxCapacitor;
EnergySource energySource;
@Inject
DeLorean(TimeCircuits timeCircuits,
FluxCapacitor fluxCapacitor,
EnergySource energySource) {
this.timeCircuits = timeCircuits;
this.fluxCapacitor = fluxCapacitor;
this.energySource = energySource;
}
}
To use it, we start with an
Injector that has bindings for everything
except for
EnergySource. Next, we create a second injector that extends the first, and binds either
Plutonium or
LightningBolt. This second injector fills in its missing binding.
The injectors share singletons, so we don't have to worry about having multiple
TimeCircuits. Static analysis is applied to both injectors as a whole, where complete information is known. And all objects are
in-the-club and get Guice value-adds like injection, scoping and AOP.
This is the solution to the mystical
Robot Legs problem, wherein we have a
RobotLeg class that needs to be injected with either a
LeftFoot or a
RightFoot, depending on where the leg will ultimately be used.
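The lookup rule can be sketched without Guice at all. Everything below is hypothetical (MiniInjector is a toy, not Guice's Injector, and it models neither singletons nor up-front validation): a child checks its own bindings first, then walks up to its parent, and providers are always run against the injector that was asked, so a parent's DeLorean binding can pick up a child's EnergySource.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical toy model of parent/child injectors; not Guice. */
class HierarchySketch {
  interface EnergySource {}
  static class Plutonium implements EnergySource {}
  static class LightningBolt implements EnergySource {}

  static class DeLorean {
    final EnergySource energySource;
    DeLorean(EnergySource energySource) { this.energySource = energySource; }
  }

  static class MiniInjector {
    private final MiniInjector parent;  // null for the root injector
    private final Map<Class<?>, Function<MiniInjector, ?>> bindings =
        new HashMap<Class<?>, Function<MiniInjector, ?>>();

    MiniInjector(MiniInjector parent) { this.parent = parent; }

    <T> MiniInjector bind(Class<T> type, Function<MiniInjector, ? extends T> provider) {
      bindings.put(type, provider);
      return this;
    }

    /** Resolves type here or in an ancestor, running the provider against this injector. */
    <T> T get(Class<T> type) {
      return type.cast(find(type).apply(this));
    }

    private Function<MiniInjector, ?> find(Class<?> type) {
      for (MiniInjector i = this; i != null; i = i.parent) {
        Function<MiniInjector, ?> provider = i.bindings.get(type);
        if (provider != null) {
          return provider;
        }
      }
      throw new IllegalStateException("No binding for " + type.getName());
    }
  }
}
```

With a parent that binds DeLorean and two children that bind Plutonium and LightningBolt respectively, each child builds a differently powered car from the same parent binding, while asking the parent directly fails with a missing-binding error.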
Criticism of Hierarchical Injectors
They suggest competing bindings. One
parent injector could have relations with multiple
child injectors. In our example, the parent injector binds
DeLorean and
TimeCircuits, and each child binds a different
EnergySource.
They require abstract Injectors. The parent injector in our example wouldn't be able to create an instance of
DeLorean, since it doesn't have all of the prerequisite bindings. This is just weird.
They're complex. Guice was born out of making code simpler. Does the conceptual weight of hierarchical injectors justify their inclusion?
Going forward
Today's Guice includes a simplified implementation of hierarchical injectors written by Dan Halem. It doesn't cover the interesting (but complex) case where the parent injector cannot fulfill all of its bindings. I'm studying the use cases, trying to come up with a balance between ease-of-use and power.
For example, one idea is to require users to explicitly call-out bindings that child injectors will provide:
public void configure() {
bind(EnergySource.class).throughChildInjector();
}
I'd also like to do something similar to AssistedInject's factory interfaces. This way the second injector would be created, used and discarded transparently, so the user never needs to see it. From the user's perspective, this would just be like AssistedInject, but the assisted parameters could be injected anywhere.
If you have suggested use-cases or ideas, I'd love to hear 'em.
# posted by Jesse Wilson
on Monday, June 23, 2008
Wanted: Guice Injector Graphing
One of the nice new features of Guice 2.0 is the new introspection API. It's the equivalent of
java.lang.reflect for Guice - it lets you inspect your application at runtime. Our goal is to make it easy to write rich tools for Guice. A natural use case is visualizing an application. The right graph can reveal the structure of your application. I've opened a feature request for this,
Issue 213. I created a proof-of-concept to drum-up excitement for this idea.
Example Graph: Application Code
class DeLorean {
@Inject TimeCircuits timeCircuits;
@Inject FluxCapacitor fluxCapacitor;
@Inject EnergySource energySource;
}
class FluxCapacitor {
@Inject TimeCircuits timeCircuits;
}
class TimeCircuits {
Date whereYouveBeen;
Date whereYouAre;
Date whereYourGoing;
}
interface EnergySource {
String generateOnePointTwentyOneGigawatts();
}
class Plutonium implements EnergySource {
public String generateOnePointTwentyOneGigawatts() {
return "newk-you-ler";
}
}
Example Graph: Guice Configuration
Injector injector = Guice.createInjector(new AbstractModule() {
protected void configure() {
bind(EnergySource.class).to(Plutonium.class);
bind(FluxCapacitor.class);
bind(DeLorean.class);
}
});
.dot File
My
Grapher code takes the above Injector and outputs a
.dot file that describes a graph:
digraph injector {
"FluxCapacitor" -> "FluxCapacitor.()" [arrowhead=onormal];
"FluxCapacitor" -> "TimeCircuits" [label=timeCircuits]
"DeLorean" -> "DeLorean.()" [arrowhead=onormal];
"DeLorean" -> "TimeCircuits" [label=timeCircuits]
"DeLorean" -> "FluxCapacitor" [label=fluxCapacitor]
"DeLorean" -> "EnergySource" [label=energySource]
"EnergySource" -> "Plutonium" [arrowhead=onormal];
}
The Rendered Graph
Finally,
Graphviz renders the
.dot file to a pretty picture:

This graph is a good start, but there's a long way to go. Unfortunately, I don't have the bandwidth to take this project to completion and am seeking a contributor. If you're interested, post a note on
the issue. Coding is its own reward!
# posted by Jesse Wilson
on Monday, June 23, 2008
Integer.class and int.class as Guice Keys
Shortly after
fixing arrays, I've found another
multiple representations bug. This problem is probably familiar - I'm confusing primitive types (like
int) with wrapper types (like
Integer).
It's one binding
The critical question: should these tests pass?
assertEquals(Key.get(int.class), Key.get(Integer.class));
assertEquals(TypeLiteral.get(int.class), TypeLiteral.get(Integer.class));
Currently these are non-equal, so Guice has special cases so that they both work. But some special cases are missing! Consider issue
116:
Injector injector = Guice.createInjector(new AbstractModule() {
protected void configure() {
bind(int.class).toInstance(1984);
}
});
assertEquals(1984, (int) injector.getInstance(int.class)); /* passes */
assertEquals(1984, (int) injector.getInstance(Integer.class)); /* passes */
assertEquals(1984, (int) injector.getInstance(Key.get(int.class))); /* passes */
assertEquals(1984, (int) injector.getInstance(Key.get(Integer.class))); /* passes */
assertNotNull(injector.getBinding(Key.get(int.class))); /* passes */
assertNotNull(injector.getBinding(Key.get(Integer.class))); /* fails! */
Should Key be fixed?
Yes. I think I'll change
Key so that it always uses
Integer.class, regardless of whether it was created with
int.class or
Integer.class. Otherwise, this stuff is just too prone to bugs. For example, our new Binder SPI can return
Provider<int>, even though that's not a valid type.
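The normalization itself is simple. Here's a sketch of the kind of canonicalization I have in mind, with the caveat that KeySketch and canonicalize are hypothetical names and not what's in the Guice source: every primitive class maps to its wrapper, and everything else passes through untouched.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of primitive-to-wrapper normalization; not Guice's Key. */
class KeySketch {
  private static final Map<Class<?>, Class<?>> WRAPPERS = new HashMap<Class<?>, Class<?>>();
  static {
    WRAPPERS.put(boolean.class, Boolean.class);
    WRAPPERS.put(byte.class, Byte.class);
    WRAPPERS.put(short.class, Short.class);
    WRAPPERS.put(char.class, Character.class);
    WRAPPERS.put(int.class, Integer.class);
    WRAPPERS.put(long.class, Long.class);
    WRAPPERS.put(float.class, Float.class);
    WRAPPERS.put(double.class, Double.class);
  }

  /** Returns the wrapper for a primitive class, or the class itself otherwise. */
  static Class<?> canonicalize(Class<?> type) {
    Class<?> wrapper = WRAPPERS.get(type);
    return wrapper != null ? wrapper : type;
  }
}
```

Doing this once, at the point where a key is created, means nothing downstream ever sees int.class.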
Should TypeLiteral be fixed?
Probably not. I'm leaning towards leaving it as-is. Consider an interface with these methods:
public boolean remove(Integer value);
public Integer remove(int index);
If I change
TypeLiteral, then the "remove" method is ambiguous when I know the
TypeLiteral of the parameter type. But as a side effect of being inconsistent with
Key, this test will always fail:
TypeLiteral<Integer> primitive = TypeLiteral.get(int.class);
TypeLiteral<Integer> wrapper = new TypeLiteral<Integer>() {};
assertEquals(Key.get(primitive), Key.get(wrapper));
assertEquals(primitive, wrapper);
I think it's a fair compromise.
Fixing the right thing
By making my changes to
Key and
TypeLiteral, I can make Guice behave consistently throughout all of its APIs. I won't have to worry about users who bind both
int.class and
Integer.class. And I should be able to rip out some special cases from both Guice and its extensions.
If there's a
Guice issue that you'd like fixed, this is a great time to get your feelings heard. I'm spending a lot of time on the issues list, trying to decide what will make the cut for 2.0.
# posted by Jesse Wilson
on Sunday, June 01, 2008
Wanted: javax.interceptor extension for Guice
I'm feverishly preparing Guice for the 2.0 release later this summer, and tonight I scanned through our
issues list. There's a whole bunch of good features that I won't get to before our release. So I'm looking for Guice users to help out with development!
Introducing javax.interceptor
javax.interceptor is a fairly-simple method interception package for modern Enterprise Java stacks. These
examples show how it works. Create a class with an
@Interceptors annotation:
@Interceptors(AuditInterceptor.class)
public class AccountBean implements Account {
private int balance = 0;
public void deposit(int amount) {
balance += amount;
}
public void withdraw(int amount) {
balance -= amount;
}
}
Then create the interceptor:
public class AuditInterceptor {
@AroundInvoke
public Object audit(InvocationContext invocationContext) throws Exception {
System.out.println("Invoking method: " + invocationContext.getMethod());
return invocationContext.proceed();
}
}
If everything works as intended, the interceptor's
audit() method will intercept all calls to
deposit() and
withdraw().
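If you haven't seen around-invoke interception before, the same idea can be tried out with nothing but the JDK's dynamic proxies. This sketch is not javax.interceptor and not Guice: the class AuditSketch, the Account interface shown here (including its balance() method) and the audited helper are all assumptions made for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;

/** Hypothetical JDK-proxy sketch of around-invoke interception. */
class AuditSketch {
  interface Account {
    void deposit(int amount);
    void withdraw(int amount);
    int balance();
  }

  static class AccountBean implements Account {
    private int balance = 0;
    public void deposit(int amount) { balance += amount; }
    public void withdraw(int amount) { balance -= amount; }
    public int balance() { return balance; }
  }

  /** Wraps target so every call is recorded in log before it proceeds. */
  static Account audited(final Account target, final List<String> log) {
    InvocationHandler handler = new InvocationHandler() {
      @Override public Object invoke(Object proxy, Method method, Object[] args)
          throws Throwable {
        log.add("Invoking method: " + method.getName());
        try {
          return method.invoke(target, args);  // the equivalent of proceed()
        } catch (InvocationTargetException e) {
          throw e.getCause();                  // rethrow what the target threw
        }
      }
    };
    return (Account) Proxy.newProxyInstance(
        Account.class.getClassLoader(), new Class<?>[] {Account.class}, handler);
  }
}
```

Proxies only work through an interface, which is one reason the annotation-driven approach above is nicer for application code.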
Guice's MethodInterceptor
Guice has another API for this,
MethodInterceptor. Guice uses
Matchers to support arbitrary selection of interceptable methods.
Wanted: an Extension for javax.interceptor
I believe this feature could be implemented as an extension to Guice. The extension would certainly require some
clever tricks, but it should not require changes to the Guice internals. Here's what I guess the implementation could look like:
public class JavaxInterceptorModule extends AbstractModule {
public void configure() {
JavaxInterceptor interceptor = new JavaxInterceptor();
injectMembers(interceptor);
bindInterceptor(Matchers.any(), new InterceptMethodsMatcher(), interceptor);
}
static class JavaxInterceptor {
/**
* Loop over all the injector's bindings, looking for @Interceptors.
* Then verify that the interceptor classes are either injectable (ie.
* they have bindings), or they have a no-arg constructor. This
* isn't strictly necessary, but it allows us to fail faster, which is
* always a Guicy thing to do.
*/
@Inject void initialize(Injector injector) { ... }
/**
* Validate that the target method is intercepted. We need to
* consider just-in-time bindings that might not have been
* checked during initialize(). We also need to check for the
* ExcludeClassInterceptors etc. annotations.
*
* <p>Instantiate all of the interceptors for this method, then
* run 'em. We'll need our own implementation of InvocationContext
* to pass to the interceptors.
*/
Object invoke(MethodInvocation invocation) { ... }
}
}
Of course it's not this simple. The implementation needs to be tested to be consistent with the Java EE implementations. It needs to have reasonable error handling. And there are nuances related to inheritance etc. It also needs thorough unit tests.
Recruiting Contributors
Does writing this code sound fun to you? If it does, we'd love your help. Post your interest on
the bug! You'll need to
check out the Guice code, write the code and tests, and upload a patch. I'll code review the patch (a fairly involved process) and we'll iterate until it's perfect.
In return, you'll get to see your
@author line in the Guice source code. You'll probably learn a lot about AOP, Guice and reflection. It's resume padding. And of course, coding is its own reward.
# posted by Jesse Wilson
on Friday, May 30, 2008
Bug pattern: multiple ways to represent the same data
There's a class of bugs that come up when one logical datatype has representations in multiple classes. The best example of this is
1 vs.
1L. Both ones represent the same data. But
new Integer(1) is not equal to
new Long(1) according to the corresponding equals() methods.
Calling
contains(1) on a
List<Long> compiles just fine, it just won't ever return true. Similarly for
Map.get() and
Set.contains(). Anything that depends on
equals() is broken if you mix different types to express 'one'.
The problem is that each defines an equals method that is local to its class. This is a fair design - but as a consequence these types should not be mixed.
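It's easy to see for yourself. The class name MixedTypesSketch below is just something I made up for the demonstration; the behaviour is plain JDK.

```java
import java.util.Arrays;
import java.util.List;

/** Small demonstration of mixing Integer and Long with equals()-based lookups. */
class MixedTypesSketch {
  static final List<Long> LONGS = Arrays.asList(1L, 2L, 3L);

  /** Compiles because contains() takes Object, but the literal 1 is boxed to an Integer. */
  static boolean containsIntOne() {
    return LONGS.contains(1);
  }

  /** The same lookup with a long literal, which is boxed to a Long. */
  static boolean containsLongOne() {
    return LONGS.contains(1L);
  }

  /** Integer.equals() requires the other object to be an Integer too. */
  static boolean integerEqualsLong() {
    return Integer.valueOf(1).equals(Long.valueOf(1));
  }
}
```

containsIntOne() returns false, containsLongOne() returns true, and integerEqualsLong() returns false.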
A short catalog of tricky types
...that can cause you pain if you mix them. These types can all represent the same logical value. But if you mix them, you will certainly get burned:
- "0" : Byte, Short, Integer, Long
- "0.0" : Float, Double
- "Jan 1, 1970 12:00.00am UTC" : Date, long, Calendar
- "http://publicobject.com" : URI, URL
- "integer type" : int.class, Integer.class
- "ABC" : StringBuffer, StringBuilder, CharSequence, String
- "natural order" : Comparators.naturalOrder(), null
- "String[].class" : GenericArrayType, Class (both of which implement Type)
A shorter catalog of good types
Fortunately, in a few places the JDK has interfaces that dictate how
equals and
hashCode must be implemented. As a consequence, you can freely intermix these types without consequence:
- Sets: HashSet, LinkedHashSet, TreeSet
- Maps: ConcurrentHashMap, HashMap, Collections.emptyMap()
- Lists: ArrayList, LinkedList, Vector, Arrays.asList
Defining this behaviour for interfaces is somewhat difficult - use these classes as a guide. All implementations must implement the spec exactly or behaviour will be unreliable.
Recommendations
Avoid creating classes that allow one logical datatype to be represented by different classes. If you must, consider writing an interface to specify equals and hashCode at that level.
Choose a preferred, canonical form for your data. For example, if you consider 'null' and 'empty string' to be equal, choose one form and stick to it. Throw IllegalArgumentExceptions to callers that use the wrong one. If you're using collections, always use the canonical type for inserts and lookups.
Use a smart IDE like
IntelliJ IDEA. It'll warn you when you mix types.
An Aside...
It turns out that Guice 1.0 suffered an ugly bug because of this problem. You can represent arrays in two different ways using Java 5's Types API. Either as an instance of
Class or as an instance of
GenericArrayType. The two are equivalent but not equals(). As a consequence, some injections would incorrectly fail with 'missing bindings' exceptions.
# posted by Jesse Wilson
on Wednesday, May 28, 2008