Commit ca6b827f by Nic Ferrier

* gcj.texi (About CNI): New node.

From-SVN: r49417
parent d3c52658
......@@ -123,6 +123,7 @@ files and object files, and it can read both Java source code and
* Invoking jcf-dump:: Print information about class files
* Invoking gij:: Interpreting Java bytecodes
* Invoking jv-convert:: Converting from one encoding to another
* About CNI:: Description of the Cygnus Native Interface
* Resources:: Where to look for more information
@end menu
......@@ -395,7 +396,7 @@ question actually does violate array bounds constraints.
@item -fjni
With @code{gcj} there are two options for writing native methods: CNI
and JNI. By default @code{gcj} assumes you are using CNI. If you are
and JNI@. By default @code{gcj} assumes you are using CNI@. If you are
compiling a class with native methods, and these methods are implemented
using JNI, then you must use @code{-fjni}. This option causes
@code{gcj} to generate stubs which will invoke the underlying JNI
......@@ -831,6 +832,874 @@ Print version information, then exit.
@c man end
@node About CNI
@chapter About CNI
This documents CNI, the Cygnus Native Interface,
which is a convenient way to write Java native methods using C++.
This is a more efficient, more convenient, but less portable
alternative to the standard JNI (Java Native Interface).
@menu
* Basic concepts:: Introduction to using CNI@.
* Packages:: How packages are mapped to C++.
* Primitive types:: Handling Java types in C++.
* Interfaces:: How Java interfaces map to C++.
* Objects and Classes:: C++ and Java classes.
* Class Initialization:: How objects are initialized.
* Object allocation:: How to create Java objects in C++.
* Arrays:: Dealing with Java arrays in C++.
* Methods:: Java methods in C++.
* Strings:: Information about Java Strings.
* Mixing with C++:: How CNI can interoperate with C++.
* Exception Handling:: How exceptions are handled.
* Synchronization:: Synchronizing between Java and C++.
* Reflection:: Using reflection from C++.
@end menu
@node Basic concepts
@section Basic concepts
In terms of language features, Java is mostly a subset
of C++. Java has a few important extensions, plus a powerful standard
class library, but on the whole that does not change the basic similarity.
Java is a hybrid object-oriented language, with a few native types,
in addition to class types. It is class-based, where a class may have
static as well as per-object fields, and static as well as instance methods.
Non-static methods may be virtual, and may be overloaded. Overloading is
resolved at compile time by matching the actual argument types against
the parameter types. Virtual methods are implemented using indirect calls
through a dispatch table (virtual function table). Objects are
allocated on the heap, and initialized using a constructor method.
Classes are organized in a package hierarchy.
All of the listed attributes are also true of C++, though C++ has
extra features (for example in C++ objects may be allocated not just
on the heap, but also statically or in a local stack frame). Because
@code{gcj} uses the same compiler technology as G++ (the GNU
C++ compiler), it is possible to make the intersection of the two
languages use the same ABI (object representation and calling
conventions). The key idea in CNI is that Java objects are C++
objects, and all Java classes are C++ classes (but not the other way
around). So the most important task in integrating Java and C++ is to
remove gratuitous incompatibilities.
You write CNI code as a regular C++ source file. (You do have to use
a Java/CNI-aware C++ compiler, specifically a recent version of G++.)
@noindent A CNI C++ source file must have:
@example
#include <gcj/cni.h>
@end example
@noindent and then must include one header file for each Java class it uses, e.g.:
@example
#include <java/lang/Character.h>
#include <java/util/Date.h>
#include <java/lang/IndexOutOfBoundsException.h>
@end example
@noindent These header files are automatically generated by @code{gcjh}.
CNI provides some functions and macros to make using Java objects and
primitive types from C++ easier. In general, these CNI functions and
macros start with the @code{Jv} prefix, for example the function
@code{JvNewObjectArray}. This convention is used to avoid conflicts
with other libraries. Internal functions in CNI start with the prefix
@code{_Jv_}. You should not call these; if you find a need to, let us
know and we will try to come up with an alternate solution. (This
manual lists @code{_Jv_AllocBytes} as an example; CNI should instead
provide a @code{JvAllocBytes} function.)
@subsection Limitations
Although a Java class is just a C++ class, you are not freed from the
shackles of Java: a @acronym{CNI} C++ class must still adhere to the
rules of the Java programming language.
For example: it is not possible to declare a method in a CNI class
that will take a C string (@code{char*}) as an argument, or to declare a
member variable of some non-Java datatype.
@node Packages
@section Packages
The only global names in Java are class names, and packages. A
@dfn{package} can contain zero or more classes, and also zero or more
sub-packages. Every class belongs to either an unnamed package or a
package that has a hierarchical and globally unique name.
A Java package is mapped to a C++ @dfn{namespace}. The Java class
@code{java.lang.String} is in the package @code{java.lang}, which is a
sub-package of @code{java}. The C++ equivalent is the class
@code{java::lang::String}, which is in the namespace @code{java::lang}
which is in the namespace @code{java}.
@noindent Here is how you could express this:
@example
// @r{Declare the class(es), possibly in a header file:}
namespace java @{
namespace lang @{
class Object;
class String;
...
@}
@}
class java::lang::String : public java::lang::Object
@{
...
@};
@end example
@noindent The @code{gcjh} tool automatically generates the necessary namespace
declarations.
@subsection Leaving out package names
Always using the fully qualified name of a Java class can be
tiresomely verbose. Using the fully qualified name also ties the code
to a single package making code changes necessary should the class
move from one package to another. The Java @code{package} declaration
specifies that the following class declarations are in the named
package, without having to explicitly name the full package
qualifiers. The @code{package} declaration can be
followed by zero or more @code{import} declarations, which
allow either a single class or all the classes in a package to be
named by a simple identifier. C++ provides something similar with the
@code{using} declaration and directive.
@noindent In Java:
@example
import @var{package-name}.@var{class-name};
@end example
@noindent allows the program text to refer to @var{class-name} as a shorthand for
the fully qualified name: @code{@var{package-name}.@var{class-name}}.
@noindent To achieve the same effect in C++, you have to do this:
@example
using @var{package-name}::@var{class-name};
@end example
@noindent Java also supports imports on demand, like this:
@example
import @var{package-name}.*;
@end example
@noindent Doing this allows any class from the package @var{package-name} to be
referred to only by its class-name within the program text.
@noindent The same effect can be achieved in C++ like this:
@example
using namespace @var{package-name};
@end example
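@noindent The difference between the two forms can be sketched in plain C++.
The namespace @code{demo::util} and the classes in it are hypothetical
stand-ins, not @code{gcjh}-generated headers, so this only illustrates the
scoping rules:

```cpp
#include <cassert>

// Hypothetical stand-ins for gcjh-generated declarations; not real CNI headers.
namespace demo {
namespace util {
struct Date { long millis; };
struct Hashtable { int capacity; };
}  // namespace util
}  // namespace demo

// Comparable to "import demo.util.Date;": only the name Date is brought in.
long date_millis_with_using_declaration()
{
  using demo::util::Date;
  Date d = { 42 };
  return d.millis;
}

// Comparable to "import demo.util.*;": every name in demo::util is visible.
int capacity_with_using_directive()
{
  using namespace demo::util;
  Hashtable ht = { 120 };
  return ht.capacity;
}
```

Placing the @code{using} declaration or directive inside a function, as here,
limits its effect to that function body.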
@node Primitive types
@section Primitive types
Java provides 8 @dfn{primitive} types which represent integers, floats,
characters and booleans (and also the void type). C++ has its own
very similar concrete types. Such types in C++ however are not always
implemented in the same way (an int might be 16, 32 or 64 bits for example)
so CNI provides a special C++ type for each primitive Java type:
@multitable @columnfractions .20 .25 .60
@item @strong{Java type} @tab @strong{C/C++ typename} @tab @strong{Description}
@item @code{char} @tab @code{jchar} @tab 16-bit Unicode character
@item @code{boolean} @tab @code{jboolean} @tab logical (true or false) values
@item @code{byte} @tab @code{jbyte} @tab 8-bit signed integer
@item @code{short} @tab @code{jshort} @tab 16-bit signed integer
@item @code{int} @tab @code{jint} @tab 32-bit signed integer
@item @code{long} @tab @code{jlong} @tab 64-bit signed integer
@item @code{float} @tab @code{jfloat} @tab 32-bit IEEE floating point number
@item @code{double} @tab @code{jdouble} @tab 64-bit IEEE floating point number
@item @code{void} @tab @code{void} @tab no value
@end multitable
When referring to a Java type in C++, you should always use these C++ typenames
(e.g.: @code{jint}), because the size of the built-in C++ types can vary
between platforms.
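@noindent The reason for the dedicated typenames can be sketched with
fixed-width aliases. The aliases below live in a hypothetical
@code{sketch} namespace and are an assumption made for illustration; the
real definitions come from the CNI headers and may differ:

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <limits>

// Illustrative aliases only; the real definitions come from the CNI headers.
namespace sketch {
typedef std::int8_t   jbyte;
typedef std::int16_t  jshort;
typedef std::int32_t  jint;
typedef std::int64_t  jlong;
typedef std::uint16_t jchar;
}  // namespace sketch

// Unlike plain int or long, these widths are the same on every platform.
static_assert(sizeof(sketch::jbyte)  * CHAR_BIT == 8,  "jbyte must be 8 bits");
static_assert(sizeof(sketch::jshort) * CHAR_BIT == 16, "jshort must be 16 bits");
static_assert(sizeof(sketch::jint)   * CHAR_BIT == 32, "jint must be 32 bits");
static_assert(sizeof(sketch::jlong)  * CHAR_BIT == 64, "jlong must be 64 bits");
static_assert(sizeof(sketch::jchar)  * CHAR_BIT == 16, "jchar must be 16 bits");

// Returns true if value can be stored in a 32-bit Java int without loss.
bool fits_in_jint(long long value)
{
  return value >= std::numeric_limits<sketch::jint>::min()
      && value <= std::numeric_limits<sketch::jint>::max();
}
```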
@subsection Reference types associated with primitive types
In Java each primitive type has an associated reference type,
e.g.: @code{boolean} has an associated @code{java.lang.Boolean} class.
In order to make working with such classes easier GCJ provides the macro
@code{JvPrimClass}:
@deffn macro JvPrimClass type
Return a pointer to the @code{Class} object corresponding to the type supplied.
@example
JvPrimClass(void) @result{} java.lang.Void.TYPE
@end example
@end deffn
@node Interfaces
@section Interfaces
A Java class can @dfn{implement} zero or more
@dfn{interfaces}, in addition to inheriting from
a single base class.
@acronym{CNI} allows CNI code to implement methods of interfaces.
You can also call methods through interface references, with some
limitations.
@acronym{CNI} doesn't understand interface inheritance at all yet. So,
you can only call an interface method when the declared type of the
reference it is called through matches the interface which declares that
method. The workaround is to cast the interface reference to the right
superinterface.
For example if you have:
@example
interface A
@{
void a();
@}
interface B extends A
@{
void b();
@}
@end example
and declare a variable of type @code{B} in C++, you can't call
@code{a()} unless you cast it to an @code{A} first.
@node Objects and Classes
@section Objects and Classes
@subsection Classes
All Java classes are derived from @code{java.lang.Object}. C++ does
not have a unique root class, but we use the C++ class
@code{java::lang::Object} as the C++ version of the
@code{java.lang.Object} Java class. All other Java classes are mapped
into corresponding C++ classes derived from @code{java::lang::Object}.
Interface inheritance (the @code{implements} keyword) is currently not
reflected in the C++ mapping.
@subsection Object fields
Each object contains an object header, followed by the instance fields
of the class, in order. The object header consists of a single
pointer to a dispatch or virtual function table. (There may be extra
fields @emph{in front of} the object, for example for memory
management, but this is invisible to the application, and the
reference to the object points to the dispatch table pointer.)
The fields are laid out in the same order, alignment, and size as in
C++. Specifically, 8-bit and 16-bit native types (@code{byte},
@code{short}, @code{char}, and @code{boolean}) are @emph{not} widened
to 32 bits. Note that the Java VM does extend 8-bit and 16-bit types
to 32 bits when on the VM stack or temporary registers.
If you include the @code{gcjh}-generated header for a
class, you can access fields of Java classes in the @emph{natural}
way. For example, given the following Java class:
@example
public class Int
@{
public int i;
public Int (int i) @{ this.i = i; @}
public static Int zero = new Int(0);
@}
@end example
you can write:
@example
#include <gcj/cni.h>
#include <Int.h>
Int*
mult (Int *p, jint k)
@{
if (k == 0)
return Int::zero; // @r{Static member access.}
return new Int(p->i * k);
@}
@end example
@subsection Access specifiers
CNI does not strictly enforce the Java access
specifiers, because Java permissions cannot be directly mapped
into C++ permissions. Private Java fields and methods are mapped
to private C++ fields and methods, but other fields and methods
are mapped to public fields and methods.
@node Class Initialization
@section Class Initialization
Java requires that each class be automatically initialized at the time
of the first active use. Initializing a class involves
initializing the static fields, running code in class initializer
methods, and initializing base classes. There may also be
some implementation specific actions, such as allocating
@code{String} objects corresponding to string literals in
the code.
The GCJ compiler inserts calls to @code{JvInitClass} at appropriate
places to ensure that a class is initialized when required. The C++
compiler does not insert these calls automatically---it is the
programmer's responsibility to make sure classes are initialized.
However, this is fairly painless because of the conventions assumed by
the Java system.
First, @code{libgcj} will make sure a class is initialized
before an instance of that class is created. This is one
of the responsibilities of the @code{new} operation. This is
taken care of both in Java code, and in C++ code. (When the G++
compiler sees a @code{new} of a Java class, it will call
a routine in @code{libgcj} to allocate the object, and that
routine will take care of initializing the class.) It follows that you can
access an instance field, or call an instance (non-static)
method and be safe in the knowledge that the class and all
of its base classes have been initialized.
Invoking a static method is also safe. This is because the
Java compiler adds code to the start of a static method to make sure
the class is initialized. However, the C++ compiler does not
add this extra code. Hence, if you write a native static method
using CNI, you are responsible for calling @code{JvInitClass}
before doing anything else in the method (unless you are sure
it is safe to leave it out).
Accessing a static field also requires the class of the
field to be initialized. The Java compiler will generate code
to call @code{JvInitClass} before getting or setting the field.
However, the C++ compiler will not generate this extra code,
so it is your responsibility to make sure the class is
initialized before you access a static field from C++.
@node Object allocation
@section Object allocation
New Java objects are allocated using a
@dfn{class instance creation expression}, e.g.:
@example
new @var{Type} ( ... )
@end example
The same syntax is used in C++. The main difference is that
C++ objects have to be explicitly deleted; in Java they are
automatically deleted by the garbage collector.
Using @acronym{CNI}, you can allocate a new Java object
using standard C++ syntax and the C++ compiler will allocate
memory from the garbage collector. If you have overloaded
constructors, the compiler will choose the correct one
using standard C++ overload resolution rules.
@noindent For example:
@example
java::util::Hashtable *ht = new java::util::Hashtable(120);
@end example
@deftypefun void* _Jv_AllocBytes (jsize @var{size})
Allocates @var{size} bytes from the heap. The memory is not scanned
by the garbage collector but is freed if no references to it are discovered.
@end deftypefun
@node Arrays
@section Arrays
While in many ways Java is similar to C and C++, it is quite different
in its treatment of arrays. C arrays are based on the idea of pointer
arithmetic, which would be incompatible with Java's security
requirements. Java arrays are true objects (array types inherit from
@code{java.lang.Object}). An array-valued variable is one that
contains a reference (pointer) to an array object.
Referencing a Java array in C++ code is done using the
@code{JArray} template, which is defined as follows:
@example
class __JArray : public java::lang::Object
@{
public:
int length;
@};
template<class T>
class JArray : public __JArray
@{
T data[0];
public:
T& operator[](jint i) @{ return data[i]; @}
@};
@end example
There are a number of @code{typedef}s which correspond to @code{typedef}s
from the @acronym{JNI}. Each is the type of an array holding objects
of the relevant type:
@example
typedef __JArray *jarray;
typedef JArray<jobject> *jobjectArray;
typedef JArray<jboolean> *jbooleanArray;
typedef JArray<jbyte> *jbyteArray;
typedef JArray<jchar> *jcharArray;
typedef JArray<jshort> *jshortArray;
typedef JArray<jint> *jintArray;
typedef JArray<jlong> *jlongArray;
typedef JArray<jfloat> *jfloatArray;
typedef JArray<jdouble> *jdoubleArray;
@end example
@deftypemethod {template<class T>} T* elements (JArray<T> @var{array})
This template function can be used to get a pointer to the elements of
the @code{array}. For instance, you can fetch a pointer to the
integers that make up an @code{int[]} like so:
@example
extern jintArray foo;
jint *intp = elements (foo);
@end example
The name of this function may change in the future.
@end deftypemethod
@deftypefun jobjectArray JvNewObjectArray (jsize @var{length}, jclass @var{klass}, jobject @var{init})
Here @code{klass} is the type of elements of the array and
@code{init} is the initial value put into every slot in the array.
@end deftypefun
@subsection Creating arrays
For each primitive type there is a function which can be used to
create a new array of that type. The name of the function is of the
form:
@example
JvNew@var{Type}Array
@end example
@noindent For example:
@example
JvNewBooleanArray
@end example
@noindent can be used to create an array of Java primitive boolean types.
@noindent The following function definition is the template for all such functions:
@deftypefun jbooleanArray JvNewBooleanArray (jint @var{length})
Creates an array @var{length} elements long.
@end deftypefun
@deftypefun jsize JvGetArrayLength (jarray @var{array})
Returns the length of the @var{array}.
@end deftypefun
@node Methods
@section Methods
Java methods are mapped directly into C++ methods.
The header files generated by @code{gcjh}
include the appropriate method definitions.
Basically, the generated methods have the same names and
@emph{corresponding} types as the Java methods,
and are called in the natural manner.
@subsection Overloading
Both Java and C++ provide method overloading, where multiple
methods in a class have the same name, and the correct one is chosen
(at compile time) depending on the argument types.
The rules for choosing the correct method are (as expected) more complicated
in C++ than in Java, but given a set of overloaded methods
generated by @code{gcjh} the C++ compiler will choose
the expected one.
Common assemblers and linkers are not aware of C++ overloading,
so the standard implementation strategy is to encode the
parameter types of a method into its assembly-level name.
This encoding is called @dfn{mangling},
and the encoded name is the @dfn{mangled name}.
The same mechanism is used to implement Java overloading.
For C++/Java interoperability, it is important that both the Java
and C++ compilers use the @emph{same} encoding scheme.
@subsection Static methods
Static Java methods are invoked in @acronym{CNI} using the standard
C++ syntax, using the @code{::} operator rather
than the @code{.} operator.
@noindent For example:
@example
jint i = java::lang::Math::round((jfloat) 2.3);
@end example
@noindent C++ method definition syntax is used to define a static native method.
For example:
@example
#include <java/lang/Integer.h>
java::lang::Integer*
java::lang::Integer::getInteger(jstring str)
@{
...
@}
@end example
@subsection Object Constructors
Constructors are called implicitly as part of object allocation
using the @code{new} operator.
@noindent For example:
@example
java::lang::Integer *x = new java::lang::Integer(234);
@end example
Java does not allow a constructor to be a native method.
This limitation can be worked around, however, because a constructor
can @emph{call} a native method.
@subsection Instance methods
Calling a Java instance method from a C++ @acronym{CNI} method is done
using the standard C++ syntax, e.g.:
@example
// @r{First create the Java object.}
java::lang::Integer *x = new java::lang::Integer(234);
// @r{Now call a method.}
jint prim_value = x->intValue();
if (x->longValue() == 0)
...
@end example
@noindent Defining a Java native instance method is also done the natural way:
@example
#include <java/lang/Integer.h>
jdouble
java::lang::Integer::doubleValue()
@{
return (jdouble) value;
@}
@end example
@subsection Interface methods
In Java you can call a method using an interface reference. This is
supported, but not completely. @xref{Interfaces}.
@node Strings
@section Strings
@acronym{CNI} provides a number of utility functions for
working with Java @code{String} objects.
The names and interfaces are analogous to those of @acronym{JNI}.
@deftypefun jstring JvNewString (const jchar* @var{chars}, jsize @var{len})
Returns a Java @code{String} object with characters from the array of
Unicode characters @var{chars} up to the index @var{len} in that array.
@end deftypefun
@deftypefun jstring JvNewStringLatin1 (const char* @var{bytes}, jsize @var{len})
Returns a Java @code{String} made up of @var{len} bytes from @var{bytes}.
@end deftypefun
@deftypefun jstring JvNewStringLatin1 (const char* @var{bytes})
As above but the length of the @code{String} is @code{strlen(@var{bytes})}.
@end deftypefun
@deftypefun jstring JvNewStringUTF (const char* @var{bytes})
Returns a @code{String} which is made up of the UTF encoded characters
present in the C string @var{bytes}.
@end deftypefun
@deftypefun jchar* JvGetStringChars (jstring @var{str})
Returns a pointer to an array of characters making up the @code{String} @var{str}.
@end deftypefun
@deftypefun int JvGetStringUTFLength (jstring @var{str})
Returns the number of bytes required to encode the contents of the
@code{String} @var{str} in UTF-8.
@end deftypefun
@deftypefun jsize JvGetStringUTFRegion (jstring @var{str}, jsize @var{start}, jsize @var{len}, char* @var{buf})
Puts the UTF-8 encoding of a region of the @code{String} @var{str} into
the buffer @code{buf}. The region to fetch is marked by @var{start} and @var{len}.
Note that @var{buf} is a buffer, not a C string. It is @emph{not}
null terminated.
@end deftypefun
@node Mixing with C++
@section Interoperating with C/C++
Because @acronym{CNI} is designed to represent Java classes and methods it
cannot be mixed readily with C/C++ types.
One important restriction is that Java classes cannot have non-Java
type instance or static variables and cannot have methods which take
non-Java types as arguments or return non-Java types.
@noindent None of the following is possible with CNI:
@example
class ::MyClass : public java::lang::Object
@{
char* variable; // @r{char* is not a valid Java type.}
@};
uint
::SomeClass::someMethod (char *arg)
@{
.
.
.
@} // @r{@code{uint} is not a valid Java type, neither is @code{char*}}
@end example
@noindent Of course, it is ok to use C/C++ types within the scope of a method:
@example
jint
::SomeClass::otherMethod (jstring str)
@{
char *arg = ...
.
.
.
@}
@end example
Because this restriction can be inconvenient, @acronym{CNI} includes the
@code{GcjRaw} class. The @code{GcjRaw} class is a @dfn{non-scanned reference}
type. In other words variables declared of type @code{GcjRaw} can
contain any data and are not checked by the compiler in any way.
This means that you can put C/C++ data structures (including classes)
in your @acronym{CNI} classes, as long as you use the appropriate cast.
@noindent Here are some examples:
@example
class ::MyClass : public java::lang::Object
@{
GcjRaw string;
MyClass ();
GcjRaw getText ();
void printText ();
@};
::MyClass::MyClass ()
@{
char* text = ...
string = text;
@}
GcjRaw
::MyClass::getText ()
@{
return string;
@}
void
::MyClass::printText ()
@{
printf("%s\n", (char*) string);
@}
@end example
@node Exception Handling
@section Exception Handling
While C++ and Java share a common exception handling framework,
things are not yet perfectly integrated. The main issue is that the
run-time type information facilities of the two
languages are not integrated.
Still, things work fairly well. You can throw a Java exception from
C++ using the ordinary @code{throw} construct, and this
exception can be caught by Java code. Similarly, you can catch an
exception thrown from Java using the C++ @code{catch}
construct.
@noindent Here is an example:
@example
if (i >= count)
throw new java::lang::IndexOutOfBoundsException();
@end example
Normally, G++ will automatically detect when you are writing C++
code that uses Java exceptions, and handle them appropriately.
However, if C++ code only needs to execute destructors when Java
exceptions are thrown through it, GCC will guess incorrectly. Sample
problematic code:
@example
struct S @{ ~S(); @};
extern void bar(); // @r{Is implemented in Java and may throw exceptions.}
void foo()
@{
S s;
bar();
@}
@end example
The usual effect of an incorrect guess is a link failure, complaining of
a missing routine called @code{__gxx_personality_v0}.
You can inform the compiler that Java exceptions are to be used in a
translation unit, irrespective of what it might think, by writing
@code{#pragma GCC java_exceptions} at the head of the
file. This @code{#pragma} must appear before any
functions that throw or catch exceptions, or run destructors when
exceptions are thrown through them.
@node Synchronization
@section Synchronization
Each Java object has an implicit monitor.
The Java VM uses the instruction @code{monitorenter} to acquire
and lock a monitor, and @code{monitorexit} to release it.
The corresponding CNI macros are @code{JvMonitorEnter} and
@code{JvMonitorExit} (JNI has similar methods @code{MonitorEnter}
and @code{MonitorExit}).
The Java source language does not provide direct access to these primitives.
Instead, there is a @code{synchronized} statement that does an
implicit @code{monitorenter} before entry to the block,
and does a @code{monitorexit} on exit from the block.
Note that the lock has to be released even when the block is abnormally
terminated by an exception, which means there is an implicit
@code{try finally} surrounding synchronization locks.
From C++, it makes sense to use a destructor to release a lock.
@acronym{CNI} defines the following utility class:
@example
class JvSynchronize
@{
jobject obj;
public:
JvSynchronize(jobject o) @{ obj = o; JvMonitorEnter(o); @}
~JvSynchronize() @{ JvMonitorExit(obj); @}
@};
@end example
So this Java code:
@example
synchronized (OBJ)
@{
CODE
@}
@end example
@noindent might become this C++ code:
@example
@{
JvSynchronize dummy (OBJ);
CODE;
@}
@end example
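@noindent The reason a destructor is a good fit can be sketched without any
CNI headers. In the following, @code{FakeMonitor} and @code{ScopedMonitor}
are hypothetical names invented for this illustration: the first merely
counts entries in place of a real Java monitor, and the second plays the
role that @code{JvSynchronize} plays in CNI code. The point is that the
destructor runs, and the monitor is released, even when the block is left
by an exception:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a Java object's monitor: it only counts entries.
struct FakeMonitor
{
  int depth = 0;
  void enter() { ++depth; }
  void exit() { --depth; }
};

// Hypothetical analogue of JvSynchronize: enter on construction,
// exit on destruction.
class ScopedMonitor
{
  FakeMonitor &monitor_;
public:
  explicit ScopedMonitor(FakeMonitor &m) : monitor_(m) { monitor_.enter(); }
  ~ScopedMonitor() { monitor_.exit(); }
  ScopedMonitor(const ScopedMonitor &) = delete;
  ScopedMonitor &operator=(const ScopedMonitor &) = delete;
};

// Leaves a guarded block by throwing, then reports the monitor depth.
int depth_after_exception()
{
  FakeMonitor m;
  try
    {
      ScopedMonitor guard(m);
      assert(m.depth == 1);  // The monitor is held inside the block.
      throw std::runtime_error("abnormal exit");
    }
  catch (const std::runtime_error &)
    {
      // The guard's destructor already ran during stack unwinding.
    }
  return m.depth;
}
```

This mirrors the implicit @code{try finally} that surrounds a Java
@code{synchronized} block.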
Java also has methods with the @code{synchronized} attribute.
This is equivalent to wrapping the entire method body in a
@code{synchronized} statement.
(Alternatively, an implementation could require the caller to do
the synchronization. This is not practical for a compiler, because
each virtual method call would have to test at run-time if
synchronization is needed.) Since in @code{gcj}
the @code{synchronized} attribute is handled by the
method implementation, it is up to the programmer
of a synchronized native method to handle the synchronization
(in the C++ implementation of the method).
In other words, you need to manually add @code{JvSynchronize}
in a @code{native synchronized} method.
@node Reflection
@section Reflection
Reflection is possible with CNI code; it functions similarly to how it
functions with JNI@.
@c clean this up... I mean, what are the types jfieldID and jmethodID in JNI?
The types @code{jfieldID} and @code{jmethodID}
are as in JNI@.
@noindent The functions:
@itemize
@item @code{JvFromReflectedField},
@item @code{JvFromReflectedMethod},
@item @code{JvToReflectedField}, and
@item @code{JvToReflectedMethod}
@end itemize
@noindent will be added shortly, as will other functions corresponding to JNI@.
@node Resources
@chapter Resources
......