Implementing Java Reflection
Introduction
On this page, I give a description of the possible ways of implementing Java
Reflection, both in general and in terms of the JavaFlint translation to an
intermediate language, and I try to summarize the pros and cons of each one.
Standard Implementations
Almost all JVM implementations currently implement the Java Reflection API in
a similar way: the critical methods of the classes Class, Field, Method, etc.
are written in native code (C, assembly or special machine code). For
example, the Field::get(Object o) method returns the value of the field
in the object o. This method is implemented in native code, and does:
(1) explicit checks to make sure the semantics of using the reflection
package are the same as those of regular Java code, i.e. that
o is a non-null instance of the declaring class of the
field, plus access checking (unless overridden);
(2) given the pointer to o, it adds the offset of the desired field, reads
the value stored at that address, and casts it to Object before returning it.
The Field class has several private variables, including an integer variable
storing the offset. When the JVM loads a new class, it creates a new
instance of Field for each field in the class and it initializes the
offset and other private variables appropriately.
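The two steps above can be mimicked in plain Java as a rough sketch. Real JVMs do this in native code on raw memory; here an object is modeled as an array of slots, and SimClass, SimObject and OffsetField are hypothetical names invented for illustration, not part of any JVM. The sketch also simplifies the checks: it ignores subclasses and throws an unchecked exception where the real API throws IllegalAccessException.

```java
// Hypothetical model of a loaded class: a name and a number of field slots.
final class SimClass {
    final String name;
    final int slotCount;
    SimClass(String name, int slotCount) {
        this.name = name;
        this.slotCount = slotCount;
    }
}

// Hypothetical model of a heap object: a class pointer plus raw slots.
final class SimObject {
    final SimClass type;
    final Object[] slots;
    SimObject(SimClass type) {
        this.type = type;
        this.slots = new Object[type.slotCount];
    }
}

// Stand-in for the natively implemented Field class.
final class OffsetField {
    private final SimClass declaringClass;
    private final int offset;        // filled in by the "JVM" at class-load time
    private final boolean isPublic;
    private boolean accessible;      // models the access-check override

    OffsetField(SimClass declaringClass, int offset, boolean isPublic) {
        this.declaringClass = declaringClass;
        this.offset = offset;
        this.isPublic = isPublic;
    }

    void setAccessible(boolean flag) { accessible = flag; }

    Object get(SimObject o) {
        // (1) explicit checks that regular Java code would get for free
        if (o == null) throw new NullPointerException("object is null");
        if (o.type != declaringClass)
            throw new IllegalArgumentException("not an instance of " + declaringClass.name);
        if (!isPublic && !accessible)
            throw new IllegalStateException("field is not accessible");
        // (2) "pointer + offset", returned as an Object
        return o.slots[offset];
    }
}

class OffsetFieldDemo {
    public static void main(String[] args) {
        SimClass point = new SimClass("Point", 1);
        OffsetField x = new OffsetField(point, 0, true);  // one Field per field in the class
        SimObject p = new SimObject(point);
        p.slots[0] = 42;
        System.out.println(x.get(p));
    }
}
```

Note that nothing in OffsetField depends on the field's name or type, only on the offset, which is why this style is tied to the object layout.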
Pros: This is the simplest approach. It is probably
about as efficient as reflection can get since it is only doing a few
checks and then a pointer offset and cast.
Cons:
- Although straightforward, it is a somewhat ugly approach. There has to
be explicit code which checks that, for example, the semantics of Java
field selection are preserved by the reflection classes. Since the
methods are not implemented in Java, there is no difference between
accessing public and private fields; the explicit checks just make sure
that appropriate access control is enforced.
- There is a tight coupling between the JVM and the reflection class
libraries, in that the JVM must know exactly what the reflection API is
like, so that it knows what classes to instantiate when it loads a
new class. This coupling is probably unavoidable with reflection, but
the current implementation is quite limiting since the classes in the
class library assume that objects are laid out in a certain way in the
JVM (hence the offset variables, etc.). If a JVM implementation has a
different object layout it might not be able to use the standard Sun
Java Reflection libraries.
- The reflection libraries have to be included
in the TCB (trusted computing base), because the
implementations are native code and cannot be type-checked.
- It is difficult to express this approach in a formal manner, since the
implementations of methods are in a different (unsafe) language. In
the translation from source to target language, how to formally express
the reflection API in the source language, while maintaining consistency
with the Java specification, is unclear.
(So, ideally Java should have been designed so that the classes in the
Reflection API have about the same interface as they do now, but with a
cleaner implementation, and not so much dependence on the object layout in
the JVM.)
Pure Java Implementation
The first alternative approach to implementing reflection is to try to
have a "pure Java" implementation of all the classes in the reflection
library. The idea is that the interfaces stay exactly the same, but
the classes Class, Method, Field, etc. are made abstract so that instances
of these classes themselves are never created. Instead, instances of
subclasses of these classes are created and used.
Now, when the JVM loads a class, instead of simply creating instances of
class Class, it first dynamically generates a new subclass of Class and then
creates an appropriate instance of that subclass. The JVM generates the
subclasses by filling in a class definition template. Once the template
has been filled in with the information related to the class that is being
loaded, this subclass of Class can be added to the class list of the JVM,
bytecode verified and linked in with the program.
So, for example, the simplified template for the subclass of Field might be:
class Field_<CLASSNAME>_<FIELDNAME> extends Field {
public Field_<CLASSNAME>_<FIELDNAME>() // constructor
{ super( "<FIELDNAME>", ... ); ... }
public Object get(Object o) { // get value of this field in o
// check that o is of correct type by casting
// then select the field and return it
return ((<CLASSNAME>) o).<FIELDNAME>;
}
public void set(Object o, Object v) { // set value of this field in o to v
// cast v to the declared field type so that the assignment type-checks
((<CLASSNAME>) o).<FIELDNAME> = (<FIELDTYPE>) v;
}
}
When the JVM loads a Point class, say:
class Point { public Integer x; }
it fills in the template above to create a new subclass of Field:
class Field_Point_x extends Field {
public Field_Point_x() // constructor
{ super( "x", ... ); ... }
public Object get(Object o) { // get value of this field in o
// check that o is of correct type by casting
// then select the field and return it
return ((Point) o).x;
}
public void set(Object o, Object v) { // set value of this field in o to v
// cast v to the declared field type so that the assignment type-checks
((Point) o).x = (Integer) v;
}
}
This class can now be type-checked and compiled along with the rest of the
classes. There is no need for native code in the implementation. Class Class
is dealt with similarly: filling in templates creates a subclass which,
upon calling p.getClass().getField("x") for a Point p, will return an
instance of the Field_Point_x class.
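As a rough check that a filled-in template really is ordinary, type-checkable Java, here is a self-contained sketch. Because java.lang.reflect.Field is final in the real library, the sketch uses a hypothetical abstract class AbstractField in its place, and the constructor arguments elided in the template are simply left out. The set method casts v to Integer so that the assignment type-checks.

```java
// Hypothetical abstract replacement for the reflection library's Field class.
abstract class AbstractField {
    private final String name;
    protected AbstractField(String name) { this.name = name; }
    public String getName() { return name; }
    public abstract Object get(Object o);
    public abstract void set(Object o, Object v);
}

class Point { public Integer x; }

// What the JVM would generate from the template when it loads Point.
class Field_Point_x extends AbstractField {
    public Field_Point_x() { super("x"); }

    @Override public Object get(Object o) {
        // The cast enforces "o is a Point"; a null o fails with NullPointerException.
        return ((Point) o).x;
    }

    @Override public void set(Object o, Object v) {
        // Both casts are checked by the JVM, so no hand-written checks are needed.
        ((Point) o).x = (Integer) v;
    }
}

class PureJavaFieldDemo {
    public static void main(String[] args) {
        Point p = new Point();
        AbstractField f = new Field_Point_x();
        f.set(p, 7);
        System.out.println(f.getName() + " = " + f.get(p));
    }
}
```

Passing an object of the wrong class fails with a ClassCastException from the ordinary cast, which is the sense in which the language semantics are enforced automatically.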
There are several other details to handle. For example, every class has
a getClass() method, including the Field_Point_x class above. To avoid
an infinite regress of Field_Field_Point_x_... classes,
the getClass() method in these template-generated classes would also be
overridden so that, when viewed through reflection, such subclasses appear
to be the Field class itself rather than subclasses of it. (Since
Object.getClass() is final in the current language, this override would
itself need support from the language or the JVM.)
Pros:
- No native code.
- Semantics of the Java language are enforced automatically, no need for
explicit checks (although they might be put in, in order to generate
appropriate reflection exceptions).
- JVMs are free to choose any object layout, since there
is no longer any need for offsets or other
implementation-related variables in the classes of the reflection
library.
- The reflection library is no longer in the TCB, since all the classes can
be verified, type-checked, etc. as regular Java classes.
- It would seem that it would be easier to specify this setup in a formal
manner, since there is no non-Java code. We might just have a "class
loading" transformation which takes a list of classes and for each
class generates the subclasses of the reflection library and creates a
new class list which can then be type-checked, linked, compiled and
run (or translated to a target language; the target language requires no
special additional features to support reflection).
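The template-filling step of such a "class loading" transformation could be sketched as plain string substitution. This is only an illustration under stated assumptions: a real JVM would more likely emit bytecode than source text, FieldTemplate and fill are hypothetical names, the <FIELDTYPE> placeholder is an addition needed for the cast in set, and the elided constructor arguments are left out.

```java
// Hypothetical helper that fills in the Field subclass template as source text.
final class FieldTemplate {
    private static final String TEMPLATE = String.join("\n",
        "class Field_<CLASSNAME>_<FIELDNAME> extends Field {",
        "  public Field_<CLASSNAME>_<FIELDNAME>() { super(\"<FIELDNAME>\"); }",
        "  public Object get(Object o) { return ((<CLASSNAME>) o).<FIELDNAME>; }",
        "  public void set(Object o, Object v) {",
        "    ((<CLASSNAME>) o).<FIELDNAME> = (<FIELDTYPE>) v;",
        "  }",
        "}");

    // Substitutes every occurrence of each placeholder.
    static String fill(String className, String fieldName, String fieldType) {
        return TEMPLATE
            .replace("<CLASSNAME>", className)
            .replace("<FIELDNAME>", fieldName)
            .replace("<FIELDTYPE>", fieldType);
    }
}

class FieldTemplateDemo {
    public static void main(String[] args) {
        // One generated class per field of each loaded class.
        System.out.println(FieldTemplate.fill("Point", "x", "Integer"));
    }
}
```

The output is an ordinary class definition that can be appended to the class list and passed through the same type-checking and linking as user code.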
Cons:
- The biggest problem with this (in a pure Java framework, maybe not in the
context of translation to Flint) is with private fields. Clearly, if
x in the Point class above is private, then the Field_Point_x::get() and
set() methods will not work. This is fine, because reflection is
supposed to preserve semantics of Java field access, etc. However, the
current reflection API does specify that in certain cases access control
checks can be overridden. This is necessary if using reflection for, say,
pickling and unpickling objects: the pickler (which is trusted somehow,
maybe by having certain security permissions), needs to be able to access
private data as well.
I imagine that to get around this problem, JVML would have to be extended
with one more special instruction which allows overriding of access
control if the caller has appropriate security permissions.
In the context of a Java-to-Flint translation, we might be able to avoid
this problem by filling in the reflection class templates, not in
the source language, but after the Java classes have been translated into
a set of Flint data. Although private fields in Flint would be hidden
in existentials, the reflection classes might be instantiated inside
the package somehow, so that they know about the private fields, but it
is not clear if this will work.
- This approach might require some minor changes to the Java
language itself and to the definition of the classes in the class
library. It therefore might not be completely compliant with the current
specification of Java Reflection.
- Efficiency is a concern, since at first glance it seems that this
approach will lead to an explosion in the number of classes
(not just instances) that are created. This problem might be alleviated
partially by some sort of smart sharing of the reflection classes.
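The access-control override discussed in the first item above is what Field.setAccessible provides in the current API, and it is exactly what a template-generated pure Java subclass cannot reproduce. A minimal sketch of the behavior a pickler relies on, using the real java.lang.reflect API; Account and AccessOverrideDemo are hypothetical names, and setAccessible may be refused under a restrictive security policy.

```java
import java.lang.reflect.Field;

class Account {
    private Integer balance = 100;
}

class AccessOverrideDemo {
    // Reads a field reflectively; returns null if the field is missing
    // or if access to it is denied.
    static Object read(Object target, String fieldName, boolean override) {
        try {
            Field f = target.getClass().getDeclaredField(fieldName);
            if (override) {
                f.setAccessible(true);  // suppress Java language access checks
            }
            return f.get(target);
        } catch (ReflectiveOperationException e) {
            // NoSuchFieldException or IllegalAccessException
            return null;
        }
    }

    public static void main(String[] args) {
        Account a = new Account();
        System.out.println(read(a, "balance", false)); // access check applies
        System.out.println(read(a, "balance", true));  // access check overridden
    }
}
```

Without the override, the reflective read is rejected just as the direct expression a.balance would be rejected by the compiler, which is the behavior the pure Java approach gives automatically.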
Certainly if one had to redesign the Java language from scratch, I would
think that this approach to handling reflection is much cleaner than the first
approach above. However, it is a little more complicated and it might
require a few additions to the JVML instruction set itself, as mentioned.
Using Intensional Type Analysis (ITA)
In the context of a Java-to-Flint translation, we could try another approach,
which is to take standard Java reflection as it is currently specified,
formalize it to some degree in the source language and then translate it
into a target language containing ITA. The implementation of methods of the
classes Class, Field, Method, etc. would be done in the target language
which supports ITA. So their implementations could be as simple as the
native code implementations described above (i.e. selection from records using
integer offsets, which are runtime values) but at the same time could be
type-checked.
By doing things this way, we would hope to get the best of both previous
approaches: simplicity, efficiency and compliance with the current specification
of the reflection API while remaining outside of the TCB (since all code
can still be type-checked). The interaction with privacy and overriding
access control will probably depend on the form of ITA that is supported
by the target language.
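Java has no intensional type analysis, so the following is only a loose analogy and not the Flint target language. It pairs a runtime offset with a runtime representation of the type stored at that offset, so that selection by offset still has a precise static result type. TypedOffset is a hypothetical name, and the check here is a dynamic cast, whereas a target language with ITA would aim to establish it during type-checking.

```java
// Hypothetical field descriptor: a runtime offset plus a runtime
// representation of the type stored at that offset.
final class TypedOffset<T> {
    private final int offset;
    private final Class<T> type;

    TypedOffset(int offset, Class<T> type) {
        this.offset = offset;
        this.type = type;
    }

    // Selects from a record by integer offset, as the native code does,
    // but the result type T is known to the type checker.
    T select(Object[] record) {
        return type.cast(record[offset]);
    }
}

class TypedOffsetDemo {
    public static void main(String[] args) {
        Object[] point = { 3, "label" };  // a record laid out as slots
        TypedOffset<Integer> x = new TypedOffset<>(0, Integer.class);
        TypedOffset<String> name = new TypedOffset<>(1, String.class);
        int xv = x.select(point);         // no cast needed at the use site
        String nv = name.select(point);
        System.out.println(xv + " " + nv);
    }
}
```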
This is the current state of my research...
Last updated 2000-11-16