As the computer revolution
progresses, “unsafe” programming has become one of the major
culprits that make programming expensive.
Two of these safety issues are
initialization and cleanup. Many C bugs occur when the programmer
forgets to initialize a variable. This is especially true with libraries when
users don’t know how to initialize a library component, or even that they
must. Cleanup is a special problem because it’s easy to forget about an
element when you’re done with it, since it no longer concerns you. Thus,
the resources used by that element are retained and you can easily end up
running out of resources (most notably, memory).
C++ introduced the concept of a
constructor, a special method automatically called when an object is
created. Java also adopted the constructor, and in addition has a garbage
collector that automatically releases memory resources when they’re no
longer being used. This chapter examines the issues of initialization and
cleanup, and their support in
Java.
You can imagine creating a method called
initialize( ) for every class you write. The name is a hint that it
should be called before using the object. Unfortunately, this means the user
must remember to call the method. In Java, the class designer can guarantee
initialization of every object by providing a special method called a
constructor. If a class has
a constructor, Java automatically calls that constructor when an object is
created, before users can even get their hands on it. So initialization is
guaranteed.
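The difference can be sketched as follows. This is a minimal illustration rather than code from the original text: the class names ManualCounter and SafeCounter, and the slots field, are hypothetical. The first class relies on the user remembering to call initialize( ); the second lets the constructor do the work.

```java
// Hypothetical classes for illustration only.
class ManualCounter {
    private int[] slots;                      // stays null until initialize() is called

    void initialize() { slots = new int[4]; }

    int capacity() { return slots.length; }   // throws NullPointerException if not initialized
}

class SafeCounter {
    private final int[] slots;

    SafeCounter() { slots = new int[4]; }     // runs automatically as part of "new"

    int capacity() { return slots.length; }
}

public class InitializeVersusConstructor {
    // Returns true if using a ManualCounter without calling initialize() fails.
    static boolean forgettingInitializeFails() {
        ManualCounter m = new ManualCounter();
        try {
            m.capacity();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Forgot initialize(), failure: "
            + forgettingInitializeFails());
        System.out.println("Constructor, capacity: "
            + new SafeCounter().capacity());
    }
}
```

With SafeCounter there is no way to obtain an object whose slots field has not been set, which is the guarantee described above.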
The next challenge is what to name this
method. There are two issues. The first is that any name you use could clash
with a name you might like to use as a member in the class. The second is that
because the compiler is responsible for calling the constructor, it must always
know which method to call. The C++ solution seems the easiest and most logical,
so it’s also used in Java: the name of the constructor
is the same as the name of the class. It makes sense that
such a method will be called automatically on initialization.
Here’s a simple class with a
constructor:
//: c04:SimpleConstructor.java
// Demonstration of a simple constructor.

class Rock {
  Rock() { // This is the constructor
    System.out.println("Creating Rock");
  }
}

public class SimpleConstructor {
  public static void main(String[] args) {
    for(int i = 0; i < 10; i++)
      new Rock();
  }
} ///:~
Now, when an object is created:
new Rock();
storage is allocated and the constructor
is called. It is guaranteed that the object will be properly initialized before
you can get your hands on it.
Note that the coding style of making the
first letter of all methods lowercase does not apply to constructors, since the
name of the constructor must match the name of the class
exactly.
Like any method, the constructor can have
arguments to allow you to specify
how an object is created. The above example can easily be changed so the
constructor takes an argument:
//: c04:SimpleConstructor2.java
// Constructors can have arguments.

class Rock2 {
  Rock2(int i) {
    System.out.println(
      "Creating Rock number " + i);
  }
}

public class SimpleConstructor2 {
  public static void main(String[] args) {
    for(int i = 0; i < 10; i++)
      new Rock2(i);
  }
} ///:~
Constructor arguments provide you with a
way to provide parameters for the initialization of an object. For example, if
the class Tree has a constructor that takes a single integer argument
denoting the height of the tree, you would create a Tree object like
this:
Tree t = new Tree(12); // 12-foot tree
If Tree(int) is your only
constructor, then the compiler won’t let you create a Tree object
any other way.
Constructors eliminate a large class of
problems and make the code easier to read. In the preceding code fragment, for
example, you don’t see an explicit call to some initialize( )
method that is conceptually separate from definition. In Java, definition and
initialization are unified concepts—you can’t have one without the
other.
The constructor is an unusual type of
method because it has no return
value. This is distinctly
different from a void return value, in which the method returns nothing
but you still have the option to make it return something else. Constructors
return nothing and you don’t have an option. If there were a return value,
and if you could select your own, the compiler would somehow need to know what
to do with that return
value.
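One practical consequence is worth a small sketch. The classes Pebble and Impostor below are hypothetical and not part of the original text. Java accepts a method that has the same name as its class and a void return type, but that makes it an ordinary method and not a constructor, so it is never called by new.

```java
// Hypothetical classes for illustration only.
class Pebble {
    boolean constructed;
    Pebble() { constructed = true; }          // constructor: no return type at all
}

class Impostor {
    boolean constructed;
    // Legal Java, but the "void" makes this an ordinary method, so "new"
    // never calls it. The compiler supplies an empty constructor instead.
    void Impostor() { constructed = true; }
}

public class NoReturnType {
    public static void main(String[] args) {
        System.out.println("Pebble constructed: " + new Pebble().constructed);
        System.out.println("Impostor constructed: " + new Impostor().constructed);
    }
}
```

The Impostor field stays false after new Impostor( ), because no code the programmer wrote runs during creation.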
One of the important features in any
programming language is the use of names. When you create an object, you give a
name to a region of storage. A method is a name for an action. By using names to
describe your system, you create a program that is easier for people to
understand and change. It’s a lot like writing prose—the goal is to
communicate with your readers.
You refer to all objects and methods by
using names. Well-chosen names make it easier for you and others to understand
your code.
A problem arises when mapping the concept
of nuance in human language onto a programming language. Often, the same word
expresses a number of different meanings—it’s overloaded.
This is useful, especially when it comes to trivial differences. You say
“wash the shirt,” “wash the car,” and “wash the
dog.” It would be silly to be forced to say, “shirtWash the
shirt,” “carWash the car,” and “dogWash the dog”
just so the listener doesn’t need to make any distinction about the action
performed. Most human languages are redundant, so even if you miss a few words,
you can still determine the meaning. We don’t need unique
identifiers—we can deduce meaning from context.
Most programming languages (C in
particular) require you to have a unique identifier for each function. So you
could not have one function called print( ) for printing integers
and another called print( ) for printing floats—each function
requires a unique name.
In Java (and C++), another factor forces
the overloading of method names: the
constructor. Because the
constructor’s name is predetermined by the name of the class, there can be
only one constructor name. But what if you want to create an object in more than
one way? For example, suppose you build a class that can initialize itself in a
standard way or by reading information from a file. You need two constructors,
one that takes no arguments (the default constructor, also called the
no-arg constructor), and one that takes a String as an argument,
which is the name of the file from which to initialize the object. Both are
constructors, so they must have the same name—the name of the class. Thus,
method overloading is essential to allow the same method name to be used
with different argument types. And although method overloading is a must for
constructors, it’s a general convenience and can be used with any method.
Here’s an example that shows both
overloaded constructors and overloaded ordinary methods:
//: c04:Overloading.java
// Demonstration of both constructor
// and ordinary method overloading.
import java.util.*;

class Tree {
  int height;
  Tree() {
    prt("Planting a seedling");
    height = 0;
  }
  Tree(int i) {
    prt("Creating new Tree that is "
        + i + " feet tall");
    height = i;
  }
  void info() {
    prt("Tree is " + height
        + " feet tall");
  }
  void info(String s) {
    prt(s + ": Tree is "
        + height + " feet tall");
  }
  static void prt(String s) {
    System.out.println(s);
  }
}

public class Overloading {
  public static void main(String[] args) {
    for(int i = 0; i < 5; i++) {
      Tree t = new Tree(i);
      t.info();
      t.info("overloaded method");
    }
    // Overloaded constructor:
    new Tree();
  }
} ///:~
A Tree object can be created
either as a seedling, with no argument, or as a plant grown in a nursery, with
an existing height. To support this, there are two constructors, one that takes
no arguments (we call constructors that take no arguments
default
constructors[28])
and one that takes the existing height.
You might also want to call the
info( ) method in more than one way. For example, with a
String argument if you have an extra message you want printed, and
without if you have nothing more to say. It would seem strange to give two
separate names to what is obviously the same concept. Fortunately, method
overloading allows you to use the same name for
both.
If the methods have the same name, how
can Java know which method you mean? There’s a simple rule: each
overloaded method must take a unique list of argument types.
If you think about this for a second, it
makes sense: how else could a programmer tell the difference between two methods
that have the same name, other than by the types of their
arguments?
Even differences in the ordering of
arguments are sufficient to distinguish two methods (although you don’t
normally want to take this approach, as it produces difficult-to-maintain
code):
//: c04:OverloadingOrder.java
// Overloading based on the order of
// the arguments.

public class OverloadingOrder {
  static void print(String s, int i) {
    System.out.println(
      "String: " + s +
      ", int: " + i);
  }
  static void print(int i, String s) {
    System.out.println(
      "int: " + i +
      ", String: " + s);
  }
  public static void main(String[] args) {
    print("String first", 11);
    print(99, "Int first");
  }
} ///:~
The two print( ) methods have
identical arguments, but the order is different, and that’s what makes
them
distinct.
A primitive can be automatically promoted
from a smaller type to a larger one and this can be slightly confusing in
combination with overloading. The following example demonstrates what happens
when a primitive is handed to an overloaded method:
//: c04:PrimitiveOverloading.java
// Promotion of primitives and overloading.

public class PrimitiveOverloading {
  // boolean can't be automatically converted
  static void prt(String s) {
    System.out.println(s);
  }

  void f1(char x) { prt("f1(char)"); }
  void f1(byte x) { prt("f1(byte)"); }
  void f1(short x) { prt("f1(short)"); }
  void f1(int x) { prt("f1(int)"); }
  void f1(long x) { prt("f1(long)"); }
  void f1(float x) { prt("f1(float)"); }
  void f1(double x) { prt("f1(double)"); }

  void f2(byte x) { prt("f2(byte)"); }
  void f2(short x) { prt("f2(short)"); }
  void f2(int x) { prt("f2(int)"); }
  void f2(long x) { prt("f2(long)"); }
  void f2(float x) { prt("f2(float)"); }
  void f2(double x) { prt("f2(double)"); }

  void f3(short x) { prt("f3(short)"); }
  void f3(int x) { prt("f3(int)"); }
  void f3(long x) { prt("f3(long)"); }
  void f3(float x) { prt("f3(float)"); }
  void f3(double x) { prt("f3(double)"); }

  void f4(int x) { prt("f4(int)"); }
  void f4(long x) { prt("f4(long)"); }
  void f4(float x) { prt("f4(float)"); }
  void f4(double x) { prt("f4(double)"); }

  void f5(long x) { prt("f5(long)"); }
  void f5(float x) { prt("f5(float)"); }
  void f5(double x) { prt("f5(double)"); }

  void f6(float x) { prt("f6(float)"); }
  void f6(double x) { prt("f6(double)"); }

  void f7(double x) { prt("f7(double)"); }

  void testConstVal() {
    prt("Testing with 5");
    f1(5);f2(5);f3(5);f4(5);f5(5);f6(5);f7(5);
  }
  void testChar() {
    char x = 'x';
    prt("char argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  void testByte() {
    byte x = 0;
    prt("byte argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  void testShort() {
    short x = 0;
    prt("short argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  void testInt() {
    int x = 0;
    prt("int argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  void testLong() {
    long x = 0;
    prt("long argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  void testFloat() {
    float x = 0;
    prt("float argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  void testDouble() {
    double x = 0;
    prt("double argument:");
    f1(x);f2(x);f3(x);f4(x);f5(x);f6(x);f7(x);
  }
  public static void main(String[] args) {
    PrimitiveOverloading p =
      new PrimitiveOverloading();
    p.testConstVal();
    p.testChar();
    p.testByte();
    p.testShort();
    p.testInt();
    p.testLong();
    p.testFloat();
    p.testDouble();
  }
} ///:~
If you view the output of this program,
you’ll see that the constant value 5 is treated as an int, so if an
overloaded method is available that takes an int it is used. In all other
cases, if you have a data type that is smaller than the argument in the method,
that data type is promoted. char produces a slightly different effect,
since if it doesn’t find an exact char match, it is promoted to
int.
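The promotion rules can be condensed into a smaller sketch. The class PromotionSketch and its methods f( ) and g( ) are hypothetical names chosen for this illustration. Each overload returns the name of its parameter type, so you can see which one the compiler selected.

```java
// Hypothetical class for illustration only.
public class PromotionSketch {
    static String f(int x)    { return "int"; }
    static String f(long x)   { return "long"; }
    static String f(double x) { return "double"; }

    // No int overload here, so an int argument must be promoted.
    static String g(long x)   { return "long"; }
    static String g(float x)  { return "float"; }

    public static void main(String[] args) {
        char c = 'x';
        byte b = 0;
        System.out.println("f(char)  -> " + f(c));     // no char overload: promoted to int
        System.out.println("f(byte)  -> " + f(b));     // promoted to the nearest wider type, int
        System.out.println("f(5)     -> " + f(5));     // the constant 5 is an int
        System.out.println("f(1.0f)  -> " + f(1.0f));  // float is promoted to double
        System.out.println("g(5)     -> " + g(5));     // int is promoted to long, not float
    }
}
```

In every case the compiler picks the nearest wider type for which an overload exists.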
What happens if your argument is
bigger than the argument expected by the overloaded method? A
modification of the above program gives the answer:
//: c04:Demotion.java
// Demotion of primitives and overloading.

public class Demotion {
  static void prt(String s) {
    System.out.println(s);
  }

  void f1(char x) { prt("f1(char)"); }
  void f1(byte x) { prt("f1(byte)"); }
  void f1(short x) { prt("f1(short)"); }
  void f1(int x) { prt("f1(int)"); }
  void f1(long x) { prt("f1(long)"); }
  void f1(float x) { prt("f1(float)"); }
  void f1(double x) { prt("f1(double)"); }

  void f2(char x) { prt("f2(char)"); }
  void f2(byte x) { prt("f2(byte)"); }
  void f2(short x) { prt("f2(short)"); }
  void f2(int x) { prt("f2(int)"); }
  void f2(long x) { prt("f2(long)"); }
  void f2(float x) { prt("f2(float)"); }

  void f3(char x) { prt("f3(char)"); }
  void f3(byte x) { prt("f3(byte)"); }
  void f3(short x) { prt("f3(short)"); }
  void f3(int x) { prt("f3(int)"); }
  void f3(long x) { prt("f3(long)"); }

  void f4(char x) { prt("f4(char)"); }
  void f4(byte x) { prt("f4(byte)"); }
  void f4(short x) { prt("f4(short)"); }
  void f4(int x) { prt("f4(int)"); }

  void f5(char x) { prt("f5(char)"); }
  void f5(byte x) { prt("f5(byte)"); }
  void f5(short x) { prt("f5(short)"); }

  void f6(char x) { prt("f6(char)"); }
  void f6(byte x) { prt("f6(byte)"); }

  void f7(char x) { prt("f7(char)"); }

  void testDouble() {
    double x = 0;
    prt("double argument:");
    f1(x);f2((float)x);f3((long)x);f4((int)x);
    f5((short)x);f6((byte)x);f7((char)x);
  }
  public static void main(String[] args) {
    Demotion p = new Demotion();
    p.testDouble();
  }
} ///:~
Here, the methods take narrower primitive
values. If your argument is wider, you must
cast it to the necessary type by placing the type name in
parentheses. If you don’t do this, the compiler will issue an error
message.
You should be aware that this is a
narrowing conversion, which
means you might lose information during the cast. This is why the compiler
forces you to do it—to flag the narrowing conversion.
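A short sketch shows what “losing information” can look like in practice. The class NarrowingSketch and its helper methods are hypothetical names used only for this illustration.

```java
// Hypothetical class for illustration only.
public class NarrowingSketch {
    // Each helper performs one explicit narrowing cast.
    static byte toByte(int value)   { return (byte) value; }
    static short toShort(int value) { return (short) value; }
    static int toInt(double value)  { return (int) value; }

    public static void main(String[] args) {
        int wide = 300;
        // byte b = wide;            // does not compile: the cast is required
        System.out.println(toByte(wide));     // only the low 8 bits survive: 44
        System.out.println(toShort(70000));   // only the low 16 bits survive: 4464
        System.out.println(toInt(3.99));      // the fraction is discarded: 3
    }
}
```

None of these casts produces an error at run time; the value is silently changed, which is why the compiler insists that you write the cast yourself.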
It is common to wonder “Why only
class names and method argument lists? Why not distinguish between methods based
on their return values?” For example, these two methods, which have the
same name and arguments, are easily distinguished from each
other:
void f() {}
int f() {}
This works fine when the compiler can
unequivocally determine the meaning from the context, as in int x =
f( ). However, you can call a method and ignore the return value; this
is often referred to as calling a method for its side
effect since you don’t care about the return value but instead want
the other effects of the method call. So if you call the method this
way:
f();
how can Java determine which
f( ) should be called? And how could someone reading the code see
it? Because of this sort of problem, you cannot use return value types to
distinguish overloaded
methods.
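The following sketch illustrates the point about side effects. The class IgnoredReturn and its calls counter are hypothetical. The same method is called once for its return value and once only for its side effect, and nothing at the second call site tells the compiler which return type is wanted.

```java
// Hypothetical class for illustration only.
public class IgnoredReturn {
    static int calls = 0;

    // Returns a value, and also has a side effect: it counts calls.
    static int f() { calls++; return 42; }

    // A second f() that differs only in its return type, such as
    //   static void f() { calls++; }
    // is rejected by the compiler as a duplicate definition.

    public static void main(String[] args) {
        int x = f();   // return value used
        f();           // return value ignored: called only for its side effect
        System.out.println("x = " + x + ", calls = " + calls);
    }
}
```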
As mentioned previously, a default
constructor (a.k.a. a “no-arg” constructor)
is one without arguments, used to
create a “vanilla object.” If you create a class that has no
constructors, the compiler will automatically create a default constructor for
you. For example:
//: c04:DefaultConstructor.java

class Bird {
  int i;
}

public class DefaultConstructor {
  public static void main(String[] args) {
    Bird nc = new Bird(); // default!
  }
} ///:~
The line
new Bird();
creates a new object and calls the
default constructor, even though one was not explicitly defined. Without it we
would have no method to call to build our object. However, if you define any
constructors (with or without arguments), the compiler will not
synthesize one for you:
class Bush {
  Bush(int i) {}
  Bush(double d) {}
}
Now if you say:
new Bush();
the compiler will complain that it cannot
find a constructor that matches. It’s as if when you don’t put in
any constructors, the compiler says “You are bound to need some
constructor, so let me make one for you.” But if you write a constructor,
the compiler says “You’ve written a constructor so you know what
you’re doing; if you didn’t put in a default it’s because you
meant to leave it
out.”
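One way to observe the compiler’s behavior is to ask the compiled classes which constructors they declare. The sketch below reuses the Bird and Bush classes from the text; the class SynthesizedConstructor and its helper methods are hypothetical, and the use of reflection is an assumption made purely for demonstration, not a technique the text relies on.

```java
import java.lang.reflect.Constructor;

class Bird { int i; }                 // no constructors written

class Bush {
    Bush(int i) {}
    Bush(double d) {}
}

// Hypothetical helper class for illustration only.
public class SynthesizedConstructor {
    // Counts the constructors that the compiled class declares.
    static int constructorCount(Class<?> type) {
        return type.getDeclaredConstructors().length;
    }

    // True if the class declares a constructor that takes no arguments.
    static boolean hasNoArg(Class<?> type) {
        for (Constructor<?> c : type.getDeclaredConstructors())
            if (c.getParameterCount() == 0) return true;
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Bird: " + constructorCount(Bird.class)
            + " constructor(s), no-arg = " + hasNoArg(Bird.class));
        System.out.println("Bush: " + constructorCount(Bush.class)
            + " constructor(s), no-arg = " + hasNoArg(Bush.class));
        // new Bush();                // does not compile: no matching constructor
    }
}
```

Bird ends up with one constructor that nobody wrote, while Bush has exactly the two that were written and nothing more.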
If you have two objects of the same type
called a and b, you might wonder how it is that you can call a
method f( ) for both those objects:
class Banana { void f(int i) { /* ... */ } }
Banana a = new Banana(), b = new Banana();
a.f(1);
b.f(2);
If there’s only one method called
f( ), how can that method know whether it’s being called for
the object a or b?
To allow you to write the code in a
convenient object-oriented syntax in which you “send a message to an
object,” the compiler does some undercover work for you. There’s a
secret first argument passed to the method f( ), and that argument
is the reference to the object that’s being manipulated. So the two method
calls above become something like:
Banana.f(a,1);
Banana.f(b,2);
This is internal and you can’t
write these expressions and get the compiler to accept them, but it gives you an
idea of what’s happening.
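Although you cannot write the internal form directly, you can imitate it by hand. In the sketch below, the peeled field, the method fExplicit( ), and the class HiddenArgument are hypothetical additions to the Banana example. The static method takes the object as an explicit first argument and does the same work as the ordinary method.

```java
class Banana {
    int peeled;                               // hypothetical field, for illustration

    // Ordinary method: the object reference arrives implicitly.
    void f(int i) { peeled += i; }

    // Hand-written imitation: the object is an explicit first argument.
    static void fExplicit(Banana self, int i) { self.peeled += i; }
}

// Hypothetical driver class for illustration only.
public class HiddenArgument {
    public static void main(String[] args) {
        Banana a = new Banana(), b = new Banana();
        a.f(1);                               // the compiler passes "a" for you
        Banana.fExplicit(b, 2);               // here "b" is passed by hand
        System.out.println("a.peeled = " + a.peeled
            + ", b.peeled = " + b.peeled);
    }
}
```

Each call changes only the object it was given, whether that object was passed implicitly or explicitly.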
Suppose you’re inside a method and
you’d like to get the reference to the current object. Since that
reference is passed secretly by the compiler, there’s no identifier
for it. However, for this purpose there’s a keyword: this. The
this keyword—which can be used only inside a method—produces
the reference to the object the method has been called for. You can treat this
reference just like any other object reference. Keep in mind that if
you’re calling a method of your class from within another method of your
class, you don’t need to use this; you simply call the method. The
current this reference is automatically used for the other method. Thus
you can say:
class Apricot {
  void pick() { /* ... */ }
  void pit() { pick(); /* ... */ }
}
Inside pit( ), you
could say this.pick( ) but there’s no need to. The
compiler does it for you automatically. The this keyword is used only for
those special cases in which you need to explicitly use the reference to the
current object. For example, it’s often used in return statements
when you want to return the reference to the current object:
//: c04:Leaf.java
// Simple use of the "this" keyword.

public class Leaf {
  int i = 0;
  Leaf increment() {
    i++;
    return this;
  }
  void print() {
    System.out.println("i = " + i);
  }
  public static void main(String[] args) {
    Leaf x = new Leaf();
    x.increment().increment().increment().print();
  }
} ///:~
Because increment( ) returns
the reference to the current object via the this keyword, multiple
operations can easily be performed on the same object.
When you write several constructors for a
class, there are times when you’d like to call one constructor from
another to avoid duplicating code. You can do this using the this
keyword.
Normally, when you say this, it is
in the sense of “this object” or “the current object,”
and by itself it produces the reference to the current object. In a constructor,
the this keyword takes on a different meaning when you give it an
argument list: it makes an explicit call to the constructor that matches that
argument list. Thus you have a straightforward way to call other
constructors:
//: c04:Flower.java
// Calling constructors with "this."

public class Flower {
  int petalCount = 0;
  String s = new String("null");
  Flower(int petals) {
    petalCount = petals;
    System.out.println(
      "Constructor w/ int arg only, petalCount= "
      + petalCount);
  }
  Flower(String ss) {
    System.out.println(
      "Constructor w/ String arg only, s=" + ss);
    s = ss;
  }
  Flower(String s, int petals) {
    this(petals);
//!    this(s); // Can't call two!
    this.s = s; // Another use of "this"
    System.out.println("String & int args");
  }
  Flower() {
    this("hi", 47);
    System.out.println(
      "default constructor (no args)");
  }
  void print() {
//!    this(11); // Not inside non-constructor!
    System.out.println(
      "petalCount = " + petalCount + " s = "+ s);
  }
  public static void main(String[] args) {
    Flower x = new Flower();
    x.print();
  }
} ///:~
The constructor Flower(String s, int
petals) shows that, while you can call one constructor using this,
you cannot call two. In addition, the constructor call must be the first thing
you do or you’ll get a compiler error message.
This example also shows another way
you’ll see this used. Since the name of the argument s and
the name of the member data s are the same, there’s an ambiguity.
You can resolve it by saying this.s to refer to the member data.
You’ll often see this form used in Java code, and it’s used in
numerous places in this book.
In print( ) you can see that
the compiler won’t let you call a constructor from inside any method other
than a constructor.
With the this keyword in mind, you
can more fully understand what it means to make a
method static. It means
that there is no this for that particular method. You cannot call
non-static methods from inside static
methods[29]
(although the reverse is possible), and you can call a static method for
the class itself, without any object. In fact, that’s primarily what a
static method is for. It’s as if you’re creating the
equivalent of a global function (from C). Except global functions are not
permitted in Java, and putting the static method inside a class allows it
access to other static methods and to static
fields.
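A small sketch makes the rules concrete. The Temperature class and the StaticSketch driver are hypothetical and do not come from the original text. The static method is called on the class itself, the non-static method calls the static one freely, and the commented-out line shows the direction that the compiler rejects.

```java
// Hypothetical classes for illustration only.
class Temperature {
    private final double celsius;

    Temperature(double celsius) { this.celsius = celsius; }

    // static: there is no "this", so no object is needed to call it.
    static double toFahrenheit(double c) { return c * 9.0 / 5.0 + 32.0; }

    // Non-static: may call the static method directly.
    double fahrenheit() { return toFahrenheit(celsius); }

    // static double broken() { return fahrenheit(); }
    // The line above would not compile: a static method has no "this"
    // on which to call the non-static fahrenheit().
}

public class StaticSketch {
    public static void main(String[] args) {
        // Called for the class itself, without creating any object:
        System.out.println(Temperature.toFahrenheit(100.0));
        // Called for a particular object:
        System.out.println(new Temperature(0.0).fahrenheit());
    }
}
```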
Some people argue that static
methods are not object-oriented since they do have the semantics of a global
function; with a static method you don’t send a message to an
object, since there’s no this. This is probably a fair argument,
and if you find yourself using a lot of static methods you should
probably rethink your strategy. However, statics are pragmatic and there
are times when you genuinely need them, so whether or not they are “proper
OOP” should be left to the theoreticians. Indeed, even
Smalltalk has the equivalent in its “class
methods.”
Programmers know about the importance of
initialization, but often forget the importance of cleanup. After all, who needs
to clean up an int? But with libraries, simply “letting go”
of an object once you’re done with it is not always safe. Of course, Java
has the garbage collector to reclaim the memory of
objects that are no longer used. Now consider a very unusual case. Suppose your
object allocates “special” memory without using
new. The garbage collector knows only how to
release memory allocated with new, so it won’t know how to
release the object’s “special” memory. To handle this case,
Java provides a method called finalize( )
that you can define for your class. Here’s how it’s supposed
to work. When the garbage collector is ready to release the storage used for
your object, it will first call finalize( ), and only on the next
garbage-collection pass will it reclaim the object’s memory. So if you
choose to use finalize( ), it gives you the ability to perform some
important cleanup at the time of garbage collection.
This is a potential programming pitfall
because some programmers, especially C++ programmers, might initially mistake
finalize( ) for the destructor in C++,
which is a function that is always called when an object is destroyed. But it is
important to distinguish between C++ and Java here, because in C++ objects
always get destroyed (in a bug-free program), whereas in Java objects do not
always get garbage-collected. Or, put another way:
Garbage collection is not
destruction.
If you remember this, you will stay out
of trouble. What it means is that if there is some activity that must be
performed before you no longer need an object, you must perform that activity
yourself. Java has no destructor or similar concept, so you must create an
ordinary method to perform this cleanup. For example, suppose in the process of
creating your object it draws itself on the screen. If you don’t
explicitly erase its image from the screen, it might never get cleaned up. If
you put some kind of erasing functionality inside finalize( ), then
if an object is garbage-collected, the image will first be removed from the
screen, but if it isn’t, the image will remain. So a second point to
remember is:
Your objects might not get
garbage-collected.
You might find that the storage for an
object never gets released because your program never nears the point of running
out of storage. If your program completes and the garbage collector never gets
around to releasing the storage for any of your objects, that storage will be
returned to the operating system en masse as the program exits. This is a
good thing, because garbage collection has some overhead, and if you never do it
you never incur that
expense.
You might believe at this point that you
should not use finalize( ) as a general-purpose cleanup method. What
good is it?
A third point to remember
is:
Garbage collection is only about
memory.
That is, the sole reason for the
existence of the garbage collector is to recover memory that your program is no
longer using. So any activity that is associated with garbage collection, most
notably your finalize( ) method, must also be only about memory and
its deallocation.
Does this mean that if your object
contains other objects finalize( ) should explicitly release those
objects? Well, no—the garbage collector takes care of the release of all
object memory regardless of how the object is created. It turns out that the
need for finalize( ) is limited to special cases, in which your
object can allocate some storage in some way other than creating an object. But,
you might observe, everything in Java is an object so how can this
be?
It would seem that
finalize( ) is in place because of the possibility that you’ll
do something C-like by allocating memory using a mechanism other than the normal
one in Java. This can happen primarily through native methods, which are
a way to call non-Java code from Java. (Native methods are discussed in Appendix
B.) C and C++ are the only languages currently supported by native methods, but
since they can call subprograms in other languages, you can effectively call
anything. Inside the non-Java code, C’s malloc( ) family of
functions might be called to allocate storage, and unless you call
free( ) that storage will not be released, causing a memory leak. Of
course, free( ) is a C and C++ function, so you’d need to call
it in a native method inside your finalize( ).
After reading this, you probably get the
idea that you won’t use finalize( ) much. You’re
correct; it is not the appropriate place for normal cleanup to occur. So where
should normal cleanup be
performed?
To clean up an object, the user of that
object must call a cleanup method at the point the
cleanup is desired. This sounds pretty straightforward, but it collides a bit
with the C++ concept of the destructor. In C++, all
objects are destroyed. Or rather, all objects should be destroyed. If the
C++ object is created as a local (i.e., on the stack—not possible in
Java), then the destruction happens at the closing curly brace of the scope in
which the object was created. If the object was created using new (like
in Java) the destructor is called when the programmer calls the C++ operator
delete (which doesn’t exist in Java). If the C++ programmer forgets
to call delete, the destructor is never called and you have a memory
leak, plus the other parts of the object never get cleaned up. This kind of bug
can be very difficult to track down.
In contrast, Java doesn’t allow you
to create local objects—you must always use new. But in Java,
there’s no “delete” to call to release the object since the
garbage collector releases the storage for you. So from a simplistic standpoint
you could say that because of garbage collection, Java has no destructor.
You’ll see as this book progresses, however, that the presence of a
garbage collector does not remove the need for or utility
of destructors. (And you should never call
finalize( ) directly, so that’s not an
appropriate avenue for a solution.) If you want some kind of cleanup performed
other than storage release you must still explicitly call an appropriate
method in Java, which is the equivalent of a C++ destructor without the
convenience.
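As a minimal sketch of such an explicit cleanup method, consider the following. The Shape class, its onScreen counter (a stand-in for “drawn on the screen”), and the ExplicitCleanup driver are all hypothetical. Wrapping the use of the object in try and finally is one common way to make sure the call is not skipped; it is an assumption of this sketch, not something the text above prescribes.

```java
// Hypothetical classes for illustration only.
class Shape {
    static int onScreen = 0;     // stand-in for "number of shapes drawn on the screen"
    private boolean drawn;

    Shape() { drawn = true; onScreen++; }     // "draws" itself when created

    // Ordinary method that the client must call; Java never calls it for you.
    void cleanup() {
        if (drawn) {             // guard so a second call does nothing
            drawn = false;
            onScreen--;
        }
    }
}

public class ExplicitCleanup {
    static void useShape() {
        Shape s = new Shape();
        try {
            System.out.println("While in use, on screen: " + Shape.onScreen);
        } finally {
            s.cleanup();         // runs even if the code above throws
        }
    }

    public static void main(String[] args) {
        useShape();
        System.out.println("After cleanup, on screen: " + Shape.onScreen);
    }
}
```

The cleanup happens at a point the programmer chooses, independent of whether or when the garbage collector reclaims the object’s storage.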
One of the things finalize( )
can be useful for is observing the process of garbage collection. The following
example shows you what’s going on and summarizes the previous descriptions
of garbage collection:
//: c04:Garbage.java
// Demonstration of the garbage
// collector and finalization

class Chair {
  static boolean gcrun = false;
  static boolean f = false;
  static int created = 0;
  static int finalized = 0;
  int i;
  Chair() {
    i = ++created;
    if(created == 47)
      System.out.println("Created 47");
  }
  public void finalize() {
    if(!gcrun) {
      // The first time finalize() is called:
      gcrun = true;
      System.out.println(
        "Beginning to finalize after " +
        created + " Chairs have been created");
    }
    if(i == 47) {
      System.out.println(
        "Finalizing Chair #47, " +
        "Setting flag to stop Chair creation");
      f = true;
    }
    finalized++;
    if(finalized >= created)
      System.out.println(
        "All " + finalized + " finalized");
  }
}

public class Garbage {
  public static void main(String[] args) {
    // As long as the flag hasn't been set,
    // make Chairs and Strings:
    while(!Chair.f) {
      new Chair();
      new String("To take up space");
    }
    System.out.println(
      "After all Chairs have been created:\n" +
      "total created = " + Chair.created +
      ", total finalized = " + Chair.finalized);
    // Optional arguments force garbage
    // collection & finalization:
    if(args.length > 0) {
      if(args[0].equals("gc") ||
         args[0].equals("all")) {
        System.out.println("gc():");
        System.gc();
      }
      if(args[0].equals("finalize") ||
         args[0].equals("all")) {
        System.out.println("runFinalization():");
        System.runFinalization();
      }
    }
    System.out.println("bye!");
  }
} ///:~
The above program creates many
Chair objects, and at some point after the garbage collector begins
running, the program stops creating Chairs. Since the garbage collector
can run at any time, you don’t know exactly when it will start up, so
there’s a flag called gcrun to indicate whether the garbage
collector has started running yet. A second flag f is a way for
Chair to tell the main( ) loop that it should stop making
objects. Both of these flags are set within finalize( ), which is
called during garbage collection.
Two other static variables,
created and finalized, keep track of the number of objects
created versus the number that get finalized by the garbage collector. Finally,
each Chair has its own (non-static) int i so it can
keep track of what number it is. When Chair number 47 is finalized, the
flag is set to true to bring the process of Chair creation to a
stop.
All this happens in main( ),
in the loop
while(!Chair.f) {
  new Chair();
  new String("To take up space");
}
You might wonder how this loop could ever
finish, since there’s nothing inside the loop that changes the value of
Chair.f. However, the finalize( ) process will, eventually,
when it finalizes number 47.
The creation of a String object
during each iteration is simply extra storage being allocated to encourage the
garbage collector to kick in, which it will do when it starts to get nervous
about the amount of memory available.
When you run the program, you can provide a
command-line argument of “gc,” “finalize,” or
“all.” The “gc” argument will call the
System.gc( ) method (to request a run of
the garbage collector). Using the “finalize” argument calls
System.runFinalization( ) which, in
theory, will cause any unfinalized objects to be finalized. And
“all” causes both methods to be called.
The behavior of this program and the
version in the first edition of this book shows that the whole issue of garbage
collection and finalization has been evolving, with much of the evolution
happening behind closed doors. In fact, by the time you read this, the behavior
of the program may have changed once again.
If System.gc( ) is called,
then finalization happens to all the objects. This was not necessarily the case
with previous implementations of the JDK, although the documentation claimed
otherwise. In addition, you’ll see that it doesn’t seem to make any
difference whether System.runFinalization( ) is
called.
However, you will see that only if
System.gc( ) is called after all the objects are created and
discarded will all the finalizers be called. If you do not call
System.gc( ), then only some of the objects will be finalized. In
Java 1.1, a method System.runFinalizersOnExit( ) was introduced that
caused programs to run all the finalizers as they exited, but the design turned
out to be buggy and the method was deprecated. This is yet another clue that the
Java designers were thrashing about trying to solve the garbage collection and
finalization problem. We can only hope that things have been worked out in Java
2.
The preceding program shows that the
promise that finalizers will always be run holds true, but only if you
explicitly force it to happen yourself. If you don’t cause
System.gc( ) to be called, you’ll get an output like
this:
Created 47
Beginning to finalize after 3486 Chairs have been created
Finalizing Chair #47, Setting flag to stop Chair creation
After all Chairs have been created:
total created = 3881, total finalized = 2684
bye!
Thus, not all finalizers get called by
the time the program completes. If System.gc( ) is called, it will
finalize and destroy all the objects that are no longer in use up to that point.
Remember that neither garbage collection
nor finalization is guaranteed. If the Java Virtual Machine (JVM) isn’t
close to running out of memory, then it will (wisely) not waste time recovering
memory through garbage collection.
In general, you
can’t rely on finalize( ) being called, and you must create
separate “cleanup” functions and call them explicitly. So it appears
that finalize( ) is only useful for obscure memory cleanup that most
programmers will never use. However, there is a very interesting use of
finalize( ) which does not rely on it being called every time. This
is the verification of the death
condition[30]
of an object.
At the point that you’re no longer
interested in an object—when it’s ready to be cleaned up—that
object should be in a state whereby its memory can be safely released. For
example, if the object represents an open file, that file should be closed by
the programmer before the object is garbage-collected. If any portions of the
object are not properly cleaned up, then you have a bug in your program that
could be very difficult to find. The value of finalize( ) is that it
can be used to discover this condition, even if it isn’t always called. If
one of the finalizations happens to reveal the bug, then you discover the
problem, which is all you really care about.
Here’s a simple example of how you
might use it:
//: c04:DeathCondition.java
// Using finalize() to detect an object that
// hasn't been properly cleaned up.

class Book {
  boolean checkedOut = false;
  Book(boolean checkOut) {
    checkedOut = checkOut;
  }
  void checkIn() {
    checkedOut = false;
  }
  public void finalize() {
    if(checkedOut)
      System.out.println("Error: checked out");
  }
}

public class DeathCondition {
  public static void main(String[] args) {
    Book novel = new Book(true);
    // Proper cleanup:
    novel.checkIn();
    // Drop the reference, forget to clean up:
    new Book(true);
    // Force garbage collection & finalization:
    System.gc();
  }
} ///:~
The death condition is that all
Book objects are supposed to be checked in before they are
garbage-collected, but in main( ) a programmer error doesn’t
check in one of the books. Without finalize( ) to verify the death
condition, this could be a difficult bug to find.
Note that System.gc( ) is
used to force finalization (and you should do this during program development to
speed debugging). But even if it isn’t, it’s highly probable that
the errant Book will eventually be discovered through repeated executions
of the program (assuming the program allocates enough storage to cause the
garbage collector to execute).
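The death-condition check complements, but does not replace, the explicit cleanup methods recommended earlier. As a minimal sketch of that advice, the hypothetical Resource class below has a close( ) method that the caller invokes in a finally clause. The class and method names are assumptions made for illustration; the try-with-resources form in main( ) requires Java 7 or later.

```java
// Hypothetical resource with an explicit cleanup method.
class Resource implements AutoCloseable {
  private boolean open = true;
  boolean isOpen() { return open; }
  @Override public void close() { open = false; }
}

class ExplicitCleanup {
  // Returns the resource so the caller can observe that it was closed.
  static Resource useAndClose() {
    Resource r = new Resource();
    try {
      // ... work with r ...
    } finally {
      r.close(); // Runs whether or not the work above throws
    }
    return r;
  }
  public static void main(String[] args) {
    System.out.println("open after use: " + useAndClose().isOpen());
    // try-with-resources calls close() automatically at the end of the block:
    try (Resource r2 = new Resource()) {
      System.out.println("open inside block: " + r2.isOpen());
    }
  }
}
```

Because cleanup happens at a known point in the program, it does not depend on when, or whether, the garbage collector runs.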
If you come from a programming language
where allocating objects on the heap is expensive, you may naturally assume that
Java’s scheme of allocating everything (except primitives) on the heap is
expensive. However, it turns out that the garbage collector can have a
significant impact on increasing the speed of object creation. This might
sound a bit odd at first—that storage release affects storage
allocation—but it’s the way some JVMs work and it means that
allocating storage for heap objects in Java can be nearly as fast as creating
storage on the stack in other languages.
For example, you can think of the C++
heap as a yard where each object stakes out its own piece of turf. This real
estate can become abandoned sometime later and must be reused. In some JVMs, the
Java heap is quite different; it’s more like a conveyor belt that moves
forward every time you allocate a new object. This means that object storage
allocation is remarkably rapid. The “heap pointer” is simply moved
forward into virgin territory, so it’s effectively the same as C++’s
stack allocation. (Of course, there’s a little extra overhead for
bookkeeping but it’s nothing like searching for storage.)
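The conveyor-belt idea can be sketched as a toy allocator over a byte array. This is only an illustration of the pointer-bumping principle, not how any particular JVM is implemented; BumpAllocator and allocate( ) are hypothetical names.

```java
// A toy "conveyor belt" allocator over a fixed byte array.
class BumpAllocator {
  private final byte[] heap;
  private int next = 0; // The "heap pointer"
  BumpAllocator(int size) { heap = new byte[size]; }
  // Returns the start offset of the new block, or -1 if the heap is full
  // (the point at which a real collector would have to step in).
  int allocate(int bytes) {
    if (bytes < 0 || bytes > heap.length - next) return -1;
    int start = next;
    next += bytes; // Allocation is just a pointer bump
    return start;
  }
  public static void main(String[] args) {
    BumpAllocator a = new BumpAllocator(64);
    System.out.println(a.allocate(16)); // 0
    System.out.println(a.allocate(16)); // 16
    System.out.println(a.allocate(40)); // -1: only 32 bytes remain
  }
}
```

No search for a free block is needed; the cost of an allocation is one comparison and one addition.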
Now you might observe that the heap
isn’t in fact a conveyor belt, and if you treat it that way you’ll
eventually start paging memory a lot (which is a big performance hit) and later
run out. The trick is that the garbage collector steps in and while it collects
the garbage it compacts all the objects in the heap so that you’ve
effectively moved the “heap pointer” closer to the beginning of the
conveyor belt and further away from a page fault. The garbage collector
rearranges things and makes it possible for the high-speed, infinite-free-heap
model to be used while allocating storage.
To understand how this works, you need to
get a little better idea of the way the different garbage collector (GC) schemes
work. A simple but slow GC technique is reference counting. This means that each
object contains a reference counter, and every time a reference is attached to
an object the reference count is increased. Every time a reference goes out of
scope or is set to null, the reference count is decreased. Thus, managing
reference counts is a small but constant overhead that happens throughout the
lifetime of your program. The garbage collector moves through the entire list of
objects and when it finds one with a reference count of zero it releases that
storage. The one drawback is that if objects circularly refer to each other they
can have nonzero reference counts while still being garbage. Locating such
self-referential groups requires significant extra work for the garbage
collector. Reference counting is commonly used to explain one kind of garbage
collection but it doesn’t seem to be used in any JVM
implementations.
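The circular-reference drawback is easy to reproduce in a toy model. The sketch below keeps the counts by hand (the Counted class and its attach( ) and detach( ) methods are hypothetical) and shows that two objects referring to each other never reach a count of zero, even after the program has let go of both.

```java
// A toy model of reference counting; real JVMs do not work this way.
class Counted {
  final String name;
  int refCount = 0;     // Number of references currently attached
  Counted other;        // An outgoing reference to another object
  Counted(String name) { this.name = name; }
  void attach() { refCount++; }
  void detach() { refCount--; }
  boolean collectable() { return refCount == 0; }
}

class RefCountDemo {
  // Builds a two-object cycle, drops the outside references, and
  // returns the counts that remain.
  static int[] cycleCounts() {
    Counted a = new Counted("a"), b = new Counted("b");
    a.attach(); b.attach();          // References held by the program
    a.other = b; b.attach();         // a -> b
    b.other = a; a.attach();         // b -> a
    a.detach(); b.detach();          // The program lets go of both
    return new int[] { a.refCount, b.refCount };
  }
  public static void main(String[] args) {
    int[] c = cycleCounts();
    // Both counts stay at 1, so neither object is ever released
    System.out.println(c[0] + " " + c[1]); // 1 1
  }
}
```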
In faster schemes, garbage collection is
not based on reference counting. Instead, it is based on the idea that any
nondead object must ultimately be traceable back to a reference that lives
either on the stack or in static storage. The chain might go through several
layers of objects. Thus, if you start in the stack and the static storage area
and walk through all the references you’ll find all the live objects. For
each reference that you find, you must trace into the object that it points to
and then follow all the references in that object, tracing into the
objects they point to, etc., until you’ve moved through the entire web
that originated with the reference on the stack or in static storage. Each
object that you move through must still be alive. Note that there is no problem
with detached self-referential groups—these are simply not found, and are
therefore automatically garbage.
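The tracing idea can be sketched with a small object graph. Here a list of root nodes stands in for the references on the stack and in static storage; Node, Tracer, and live( ) are hypothetical names, and a real collector works on raw memory rather than Java collections.

```java
import java.util.*;

// A toy object graph.
class Node {
  final String name;
  final List<Node> refs = new ArrayList<>();
  Node(String name) { this.name = name; }
}

class Tracer {
  // Walks outward from the roots and returns every object that
  // can be reached by following references.
  static Set<Node> live(List<Node> roots) {
    Set<Node> seen = new HashSet<>();
    Deque<Node> work = new ArrayDeque<>(roots);
    while (!work.isEmpty()) {
      Node n = work.pop();
      if (seen.add(n)) work.addAll(n.refs);
    }
    return seen;
  }
  public static void main(String[] args) {
    Node root = new Node("root"), child = new Node("child");
    root.refs.add(child);
    Node x = new Node("x"), y = new Node("y"); // Detached cycle
    x.refs.add(y);
    y.refs.add(x);
    Set<Node> reachable = live(Arrays.asList(root));
    System.out.println(reachable.size());      // 2
    System.out.println(reachable.contains(x)); // false
  }
}
```

The detached pair x and y refer to each other, yet the walk never visits them, which is exactly why the cycle is not a problem for this scheme.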
In the approach described here, the JVM
uses an adaptive garbage-collection scheme, and what it does with the
live objects that it locates depends on the variant currently being used. One of
these variants is stop-and-copy. This means that—for reasons that
will become apparent—the program is first stopped (this is not a
background collection scheme). Then, each live object that is found is copied
from one heap to another, leaving behind all the garbage. In addition, as the
objects are copied into the new heap they are packed end-to-end, thus compacting
the new heap (and allowing new storage to simply be reeled off the end as
previously described).
Of course, when an object is moved from
one place to another, all references that point at (i.e., that reference)
the object must be changed. The reference that goes from the stack or the static
storage area to the object can be changed right away, but there can be other
references pointing to this object that will be encountered later during the
“walk.” These are fixed up as they are found (you could imagine a
table that maps old addresses to new ones).
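As a rough sketch of stop-and-copy with such a table, the toy collector below copies every reachable object into a new list (standing in for the new heap) and uses a map from old objects to their copies to fix up references. Cell, CopyCollector, and toSpace are hypothetical names chosen for this illustration.

```java
import java.util.*;

// Toy stop-and-copy collector.
class Cell {
  final String name;
  final List<Cell> refs = new ArrayList<>();
  Cell(String name) { this.name = name; }
}

class CopyCollector {
  // Maps each old object to its copy in the new heap
  private final Map<Cell, Cell> forwarding = new IdentityHashMap<>();
  final List<Cell> toSpace = new ArrayList<>(); // Packed end to end

  // Copies everything reachable from old and returns the new copy.
  Cell copy(Cell old) {
    Cell moved = forwarding.get(old);
    if (moved != null) return moved; // Already copied: reuse the new address
    moved = new Cell(old.name);
    forwarding.put(old, moved);      // Record before recursing (handles cycles)
    toSpace.add(moved);
    for (Cell r : old.refs) moved.refs.add(copy(r)); // Fix up references
    return moved;
  }
  public static void main(String[] args) {
    Cell a = new Cell("a"), b = new Cell("b"), junk = new Cell("junk");
    a.refs.add(b);
    b.refs.add(a);    // Cycle between live objects
    junk.refs.add(a); // Garbage may point at live objects
    CopyCollector gc = new CopyCollector();
    Cell newA = gc.copy(a);
    System.out.println(gc.toSpace.size());                    // 2
    System.out.println(newA.refs.get(0).refs.get(0) == newA); // true
  }
}
```

The garbage object is never visited, so it is simply left behind in the old heap.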
There are two issues that make these
so-called “copy collectors” inefficient. The first is the idea that
you have two heaps and you slosh all the memory back and forth between these two
separate heaps, maintaining twice as much memory as you actually need. Some JVMs
deal with this by allocating the heap in chunks as needed and simply copying
from one chunk to another.
The second issue is the copying. Once
your program becomes stable it might be generating little or no garbage. Despite
that, a copy collector will still copy all the memory from one place to another,
which is wasteful. To prevent this, some JVMs detect that no new garbage is
being generated and switch to a different scheme (this is the
“adaptive” part). This other scheme is called mark and sweep,
and it’s what earlier versions of Sun’s JVM used all the time. For
general use, mark and sweep is fairly slow, but when you know you’re
generating little or no garbage it’s fast.
Mark and sweep follows the same logic of
starting from the stack and static storage and tracing through all the
references to find live objects. However, each time it finds a live object that
object is marked by setting a flag in it, but the object isn’t collected
yet. Only when the marking process is finished does the sweep occur. During the
sweep, the dead objects are released. However, no copying happens, so if the
collector chooses to compact a fragmented heap it does so by shuffling objects
around.
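The two phases can be sketched over a list that stands in for the heap. This is a simplified illustration under assumed names (Obj, MarkSweep, collect( )); it frees objects in place and does not attempt compaction.

```java
import java.util.*;

// Toy mark-and-sweep collector.
class Obj {
  final String name;
  boolean marked = false; // The flag set during the mark phase
  final List<Obj> refs = new ArrayList<>();
  Obj(String name) { this.name = name; }
}

class MarkSweep {
  static void mark(Obj o) {
    if (o.marked) return;
    o.marked = true;
    for (Obj r : o.refs) mark(r);
  }
  // Marks from the roots, then removes every unmarked object from the
  // heap list in place. Survivors are not moved. Returns the number freed.
  static int collect(List<Obj> heap, List<Obj> roots) {
    for (Obj r : roots) mark(r);
    int before = heap.size();
    heap.removeIf(o -> !o.marked);
    for (Obj o : heap) o.marked = false; // Reset for the next collection
    return before - heap.size();
  }
  public static void main(String[] args) {
    Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
    a.refs.add(b);
    List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c));
    System.out.println(collect(heap, Arrays.asList(a))); // 1
    System.out.println(heap.size());                     // 2
  }
}
```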
“Stop-and-copy” refers to
the idea that this type of garbage collection is not done in the
background; instead, the program is stopped while the GC occurs. In the Sun
literature you’ll find many references to garbage collection as a
low-priority background process, but it turns out that the GC was not
implemented that way, at least in earlier versions of the Sun JVM. Instead, the
Sun garbage collector ran when memory got low. In addition, mark-and-sweep
requires that the program be stopped.
As previously mentioned, in the JVM
described here memory is allocated in big blocks. If you allocate a large
object, it gets its own block. Strict stop-and-copy requires copying every live
object from the source heap to a new heap before you could free the old one,
which translates to lots of memory. With blocks, the GC can typically use dead
blocks to copy objects to as it collects. Each block has a generation
count to keep track of whether it’s alive. In the normal case, only
the blocks created since the last GC are compacted; all other blocks get their
generation count bumped if they have been referenced from somewhere. This
handles the normal case of lots of short-lived temporary objects. Periodically,
a full sweep is made—large objects are still not copied (just get their
generation count bumped) and blocks containing small objects are copied and
compacted. The JVM monitors the efficiency of GC and if it becomes a waste of
time because all objects are long-lived then it switches to mark-and-sweep.
Similarly, the JVM keeps track of how successful mark-and-sweep is, and if the
heap starts to become fragmented it switches back to stop-and-copy. This is
where the “adaptive” part comes in, so you end up with a mouthful:
“adaptive generational stop-and-copy
mark-and-sweep.”
There are a number of additional speedups
possible in a JVM. An especially important one involves the operation of the
loader and Just-In-Time (JIT) compiler. When a class must be loaded (typically,
the first time you want to create an object of that class), the .class
file is located and the byte codes for that class are brought into memory. At
this point, one approach is to simply JIT all the code, but this has two
drawbacks: it takes a little more time, which, compounded throughout the life of
the program, can add up; and it increases the size of the executable (byte codes
are significantly more compact than expanded JIT code) and this might cause
paging, which definitely slows down a program. An alternative approach is
lazy evaluation, which means that the code is not JIT compiled until
necessary. Thus, code that never gets executed might never get JIT
compiled.
Java goes out of its way to guarantee
that variables are properly initialized before they are used. In the case of
variables that are defined locally to a method, this guarantee comes in the form
of a compile-time error. So if you say:
void f() { int i; i++; }
you’ll get an error message that
says that i might not have been initialized. Of course, the compiler
could have given i a default value, but it’s more likely that this
is a programmer error and a default value would have covered that up. Forcing
the programmer to provide an initialization value is more likely to catch a
bug.
If a
primitive
is a data member of a class, however, things are a bit different. Since any
method can initialize or use that data, it might not be practical to force the
user to initialize it to its appropriate value before the data is used. However,
it’s unsafe to leave it with a garbage value, so each primitive data
member of a class is guaranteed to get an initial value. Those values can be
seen here:
//: c04:InitialValues.java
// Shows default initial values.

class Measurement {
  boolean t;
  char c;
  byte b;
  short s;
  int i;
  long l;
  float f;
  double d;
  void print() {
    System.out.println(
      "Data type Initial value\n" +
      "boolean " + t + "\n" +
      "char " + c + "\n" +
      "byte " + b + "\n" +
      "short " + s + "\n" +
      "int " + i + "\n" +
      "long " + l + "\n" +
      "float " + f + "\n" +
      "double " + d);
  }
}

public class InitialValues {
  public static void main(String[] args) {
    Measurement d = new Measurement();
    d.print();
    /* In this case you could also say:
    new Measurement().print();
    */
  }
} ///:~
The output of this program
is:
Data type Initial value
boolean false
char
byte 0
short 0
int 0
long 0
float 0.0
double 0.0
The char value is a zero, which
doesn’t print.
You’ll see later that when you
define an object reference inside a class without initializing it to a new
object, that reference is given a special value of null (which is a Java
keyword).
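A short sketch makes the null default concrete. Holder is a hypothetical class introduced only for this illustration; it shows that an uninitialized reference field is null and that using it produces a NullPointerException.

```java
// Holder is a hypothetical class used only for illustration.
class Holder {
  String label;  // Object reference: defaults to null
  int count;     // Primitive: defaults to 0
}

class NullDefault {
  public static void main(String[] args) {
    Holder h = new Holder();
    System.out.println(h.label == null); // true
    System.out.println(h.count);         // 0
    try {
      h.label.length(); // Using the null reference
    } catch (NullPointerException e) {
      System.out.println("NullPointerException");
    }
  }
}
```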
You can see that even though the values
are not specified, they automatically get initialized. So at least there’s
no threat of working with uninitialized
variables.
What happens if you want to give a
variable an initial value? One direct way to do this is simply to assign the
value at the point you define the variable in the class. (Notice that you could
not do this in C++ before C++11, although C++ novices always tried.) Here the
field definitions in class Measurement are changed to provide initial values:
class Measurement {
  boolean b = true;
  char c = 'x';
  byte B = 47;
  short s = 0xff;
  int i = 999;
  long l = 1;
  float f = 3.14f;
  double d = 3.14159;
  //. . .
You can also initialize nonprimitive
objects in this same way. If Depth is a class, you can insert a variable
and initialize it like so:
class Measurement {
  Depth o = new Depth();
  boolean b = true;
  // . . .
If you haven’t given o an
initial value and you try to use it anyway, you’ll get a run-time error
called an exception (covered in Chapter 10).
You can even call a method to provide an
initialization value:
class CInit {
  int i = f();
  //...
}
This method can have arguments, of
course, but those arguments cannot be other class members that haven’t
been initialized yet. Thus, you can do this:
class CInit {
  int i = f();
  int j = g(i);
  //...
}
But you cannot do this:
class CInit {
  int j = g(i);
  int i = f();
  //...
}
This is one place in which the compiler,
appropriately, does complain about
forward referencing, since this
has to do with the order of initialization and not the way the program is
compiled.
This approach to initialization is simple
and straightforward. It has the limitation that every object of type
Measurement will get these same initialization values. Sometimes this is
exactly what you need, but at other times you need more
flexibility.
The constructor can be used to perform
initialization, and this gives you greater flexibility in your programming since
you can call methods and perform actions at run-time to determine the initial
values. There’s one thing to keep in mind, however: you aren’t
precluding the automatic initialization, which happens before the constructor is
entered. So, for example, if you say:
class Counter {
  int i;
  Counter() { i = 7; }
  // . . .
then i will first be initialized
to 0, then to 7. This is true with all the primitive types and with object
references, including those that are given explicit initialization at the point
of definition. For this reason, the compiler doesn’t try to force you to
initialize elements in the constructor at any particular place, or before they
are used—initialization is already
guaranteed[31].
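The two-step initialization of i can be made visible with a small experiment. In this sketch the field seen and the method peek( ) are hypothetical additions to Counter: the initializer for seen runs before the constructor body and captures the value that i holds at that moment.

```java
// A sketch that makes the automatic initialization observable.
class Counter {
  int seen = peek(); // Field initializers run before the constructor body
  int i;             // Automatically initialized to 0
  Counter() { i = 7; }
  int peek() { return i; }
}

class CounterDemo {
  public static void main(String[] args) {
    Counter c = new Counter();
    System.out.println(c.seen); // 0
    System.out.println(c.i);    // 7
  }
}
```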
Within a class, the order of
initialization is determined by the order that the variables are defined within
the class. The variable definitions may be scattered throughout and in between
method definitions, but the variables are initialized before any methods can be
called—even the constructor. For example:
//: c04:OrderOfInitialization.java
// Demonstrates initialization order.

// When the constructor is called to create a
// Tag object, you'll see a message:
class Tag {
  Tag(int marker) {
    System.out.println("Tag(" + marker + ")");
  }
}

class Card {
  Tag t1 = new Tag(1); // Before constructor
  Card() {
    // Indicate we're in the constructor:
    System.out.println("Card()");
    t3 = new Tag(33); // Reinitialize t3
  }
  Tag t2 = new Tag(2); // After constructor
  void f() {
    System.out.println("f()");
  }
  Tag t3 = new Tag(3); // At end
}

public class OrderOfInitialization {
  public static void main(String[] args) {
    Card t = new Card();
    t.f(); // Shows that construction is done
  }
} ///:~
In Card, the definitions of the
Tag objects are intentionally scattered about to prove that they’ll
all get initialized before the constructor is entered or anything else can
happen. In addition, t3 is reinitialized inside the constructor. The
output is:
Tag(1)
Tag(2)
Tag(3)
Card()
Tag(33)
f()
Thus, the t3 reference gets
initialized twice, once before and once during the constructor call. (The first
object is dropped, so it can be garbage-collected later.) This might not seem
efficient at first, but it guarantees proper initialization—what would
happen if an overloaded constructor were defined that did not initialize
t3 and there wasn’t a “default” initialization for
t3 in its definition?
When the
data is static the same
thing happens; if it’s a primitive and you don’t initialize it, it
gets the standard primitive initial values. If it’s a reference to an
object, it’s null unless you create a new object and attach your
reference to it.
If you want to place initialization at
the point of definition, it looks the same as for non-statics.
There’s only a single piece of storage for a static, regardless of
how many objects are created. But the question arises of when the static
storage gets initialized. An example makes this question clear:
//: c04:StaticInitialization.java
// Specifying initial values in a
// class definition.

class Bowl {
  Bowl(int marker) {
    System.out.println("Bowl(" + marker + ")");
  }
  void f(int marker) {
    System.out.println("f(" + marker + ")");
  }
}

class Table {
  static Bowl b1 = new Bowl(1);
  Table() {
    System.out.println("Table()");
    b2.f(1);
  }
  void f2(int marker) {
    System.out.println("f2(" + marker + ")");
  }
  static Bowl b2 = new Bowl(2);
}

class Cupboard {
  Bowl b3 = new Bowl(3);
  static Bowl b4 = new Bowl(4);
  Cupboard() {
    System.out.println("Cupboard()");
    b4.f(2);
  }
  void f3(int marker) {
    System.out.println("f3(" + marker + ")");
  }
  static Bowl b5 = new Bowl(5);
}

public class StaticInitialization {
  public static void main(String[] args) {
    System.out.println(
      "Creating new Cupboard() in main");
    new Cupboard();
    System.out.println(
      "Creating new Cupboard() in main");
    new Cupboard();
    t2.f2(1);
    t3.f3(1);
  }
  static Table t2 = new Table();
  static Cupboard t3 = new Cupboard();
} ///:~
Bowl allows you to view the
creation of a class, and Table and Cupboard create static
members of Bowl scattered through their class definitions. Note that
Cupboard creates a non-static Bowl b3 prior to the
static definitions. The output shows what happens:
Bowl(1)
Bowl(2)
Table()
f(1)
Bowl(4)
Bowl(5)
Bowl(3)
Cupboard()
f(2)
Creating new Cupboard() in main
Bowl(3)
Cupboard()
f(2)
Creating new Cupboard() in main
Bowl(3)
Cupboard()
f(2)
f2(1)
f3(1)
The static initialization occurs
only if it’s necessary. If you don’t create a Table object
and you never refer to Table.b1 or Table.b2, the static Bowl b1
and b2 will never be created. They are initialized only when
the first Table object is created (or the first static
access occurs). After that, the static objects are not
reinitialized.
The order of initialization is
statics first, if they haven’t already been initialized by a
previous object creation, and then the non-static objects. You can see
the evidence of this in the output.
Java allows you to group other
static initializations inside a special
“static construction
clause” (sometimes called a static block)
in a class. It looks like this:
class Spoon {
  static int i;
  static {
    i = 47;
  }
  // . . .
It appears to be a method, but it’s
just the static keyword followed by a method body. This code, like other
static initializations, is executed only once, the first time you make an
object of that class or the first time you access a static member
of that class (even if you never make an object of that class). For
example:
//: c04:ExplicitStatic.java
// Explicit static initialization
// with the "static" clause.

class Cup {
  Cup(int marker) {
    System.out.println("Cup(" + marker + ")");
  }
  void f(int marker) {
    System.out.println("f(" + marker + ")");
  }
}

class Cups {
  static Cup c1;
  static Cup c2;
  static {
    c1 = new Cup(1);
    c2 = new Cup(2);
  }
  Cups() {
    System.out.println("Cups()");
  }
}

public class ExplicitStatic {
  public static void main(String[] args) {
    System.out.println("Inside main()");
    Cups.c1.f(99);  // (1)
  }
  // static Cups x = new Cups();  // (2)
  // static Cups y = new Cups();  // (2)
} ///:~
The static initializers for
Cups run when either the access of the static object c1
occurs on the line marked (1), or if line (1) is commented out and the lines
marked (2) are uncommented. If both (1) and (2) are commented out, the
static initialization for Cups never occurs. Also, it
doesn’t matter if one or both of the lines marked (2) are uncommented; the
static initialization only occurs once.
Java provides a similar syntax for
initializing non-static variables for each object. Here’s an
example:
//: c04:Mugs.java
// Java "Instance Initialization."

class Mug {
  Mug(int marker) {
    System.out.println("Mug(" + marker + ")");
  }
  void f(int marker) {
    System.out.println("f(" + marker + ")");
  }
}

public class Mugs {
  Mug c1;
  Mug c2;
  {
    c1 = new Mug(1);
    c2 = new Mug(2);
    System.out.println("c1 & c2 initialized");
  }
  Mugs() {
    System.out.println("Mugs()");
  }
  public static void main(String[] args) {
    System.out.println("Inside main()");
    Mugs x = new Mugs();
  }
} ///:~
You can see that the instance
initialization clause:
{
  c1 = new Mug(1);
  c2 = new Mug(2);
  System.out.println("c1 & c2 initialized");
}
looks exactly like the static
initialization clause except for the missing static keyword. This syntax
is necessary to support the initialization of anonymous inner classes
(see Chapter
8).
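To hint at why anonymous inner classes need this syntax, here is a minimal sketch. An anonymous class has no name and therefore cannot declare a constructor, so any setup code goes in an instance initialization clause. The Greeter interface and the make( ) method are hypothetical names used only for this example.

```java
// Greeter is a hypothetical interface used only for illustration.
interface Greeter {
  String greet();
}

class AnonymousInit {
  static Greeter make(final String name) {
    return new Greeter() {
      private String message;
      // An anonymous class cannot declare a constructor, so the
      // instance initialization clause does the setup instead:
      {
        message = "Hello, " + name;
      }
      public String greet() { return message; }
    };
  }
  public static void main(String[] args) {
    System.out.println(make("Mugs").greet()); // Hello, Mugs
  }
}
```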
Initializing arrays in C is error-prone
and tedious. C++ uses aggregate initialization to make it much
safer[32]. Java has
no “aggregates” like C++, since everything is an object in Java. It
does have arrays, and these are supported with
array
initialization.
An array is simply a sequence of either
objects or primitives, all the same type and packaged together under one
identifier name. Arrays are defined and used with the square-brackets
indexing
operator [ ]. To define an array you simply follow your type
name with empty square brackets:
int[] a1;
You can also put the square brackets
after the identifier to produce exactly the same meaning:
int a1[];
This conforms to expectations from C and
C++ programmers. The former style, however, is probably a more sensible syntax,
since it says that the type is “an int array.” That style
will be used in this book.
When you define the array reference, the compiler doesn’t allow you to
tell it how big the array is. This brings us back to that issue of
“references.” All that you have at this point is a reference to an
array, and there’s been no space allocated for the array. To create
storage for the array you must write an initialization expression. For arrays,
initialization can appear anywhere in your code, but you can also use a special
kind of initialization expression that must occur at the point where the array
is created. This special initialization is a set of values surrounded by curly
braces. The storage allocation (the equivalent of using new) is taken
care of by the compiler in this case. For example:
int[] a1 = { 1, 2, 3, 4, 5 };
So why would you ever define an array
reference without an array?
int[] a2;
Well, it’s possible to assign one
array to another in Java, so you can say:
a2 = a1;
What you’re really doing is copying
a reference, as demonstrated here:
//: c04:Arrays.java
// Arrays of primitives.

public class Arrays {
  public static void main(String[] args) {
    int[] a1 = { 1, 2, 3, 4, 5 };
    int[] a2;
    a2 = a1;
    for(int i = 0; i < a2.length; i++)
      a2[i]++;
    for(int i = 0; i < a1.length; i++)
      System.out.println(
        "a1[" + i + "] = " + a1[i]);
  }
} ///:~
You can see that a1 is given an
initialization value while a2 is not; a2 is assigned
later—in this case, to another array.
There’s something new here: all
arrays have an intrinsic member (whether they’re arrays of objects or
arrays of primitives) that you can query—but not change—to tell you
how many elements there are in the array. This member is
length. Since arrays in
Java, like C and C++, start counting from element zero, the largest element you
can index is length - 1. If you go out of
bounds, C and C++ quietly accept
this and allow you to stomp all over your memory, which is the source of many
infamous bugs. However, Java protects you against such problems by causing a
run-time error (an exception, the subject of Chapter 10) if you step out
of bounds. Of course, checking every array access costs time and code and
there’s no way to turn it off, which means that array accesses might be a
source of inefficiency in your program if they occur at a critical juncture. For
Internet security and programmer productivity, the Java designers thought that
this was a worthwhile trade-off.
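The bounds check is easy to observe. In this sketch (BoundsCheck and outOfBounds( ) are hypothetical names), indexing one element past the end raises an ArrayIndexOutOfBoundsException rather than reading whatever happens to be in memory.

```java
class BoundsCheck {
  // Returns true if reading a[index] throws instead of touching
  // memory outside the array.
  static boolean outOfBounds(int[] a, int index) {
    try {
      int unused = a[index];
      return false;
    } catch (ArrayIndexOutOfBoundsException e) {
      return true;
    }
  }
  public static void main(String[] args) {
    int[] a = { 1, 2, 3 };
    System.out.println(outOfBounds(a, a.length - 1)); // false
    System.out.println(outOfBounds(a, a.length));     // true
  }
}
```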
What if you don’t know how many
elements you’re going to need in your array while you’re writing the
program? You simply use new to create the elements in the array. Here,
new works even though it’s creating an array
of primitives (new won’t create a nonarray
primitive):
//: c04:ArrayNew.java
// Creating arrays with new.
import java.util.*;

public class ArrayNew {
  static Random rand = new Random();
  static int pRand(int mod) {
    // nextInt(mod) is in [0, mod), so the result is in [1, mod].
    // (Math.abs(rand.nextInt()) can be negative for Integer.MIN_VALUE.)
    return rand.nextInt(mod) + 1;
  }
  public static void main(String[] args) {
    int[] a;
    a = new int[pRand(20)];
    System.out.println(
      "length of a = " + a.length);
    for(int i = 0; i < a.length; i++)
      System.out.println(
        "a[" + i + "] = " + a[i]);
  }
} ///:~
Since the size of the array is chosen at
random (using the pRand( ) method), it’s clear that array
creation is actually happening at run-time. In addition, you’ll see from
the output of this program that array elements of primitive types are
automatically initialized to “empty” values. (For numerics and
char, this is zero, and for boolean, it’s
false.)
Of course, the array could also have been
defined and initialized in the same statement:
int[] a = new int[pRand(20)];
If you’re dealing with an array of
nonprimitive objects, you must always use new. Here, the reference issue
comes up again because what you create is an array of references. Consider the
wrapper type Integer, which is a class and not a
primitive:
//: c04:ArrayClassObj.java
// Creating an array of nonprimitive objects.
import java.util.*;

public class ArrayClassObj {
  static Random rand = new Random();
  static int pRand(int mod) {
    // nextInt(mod) is in [0, mod), so the result is in [1, mod].
    // (Math.abs(rand.nextInt()) can be negative for Integer.MIN_VALUE.)
    return rand.nextInt(mod) + 1;
  }
  public static void main(String[] args) {
    Integer[] a = new Integer[pRand(20)];
    System.out.println(
      "length of a = " + a.length);
    for(int i = 0; i < a.length; i++) {
      a[i] = new Integer(pRand(500));
      System.out.println(
        "a[" + i + "] = " + a[i]);
    }
  }
} ///:~
Here, even after new is called to
create the array:
Integer[] a = new Integer[pRand(20)];
it’s only an array of references,
and not until the reference itself is initialized by creating a new
Integer object is the initialization complete:
a[i] = new Integer(pRand(500));
If you forget to create the object,
however, the array location still holds null, and you’ll get an exception
at run-time when you try to use the object that you expected to find there.
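The following sketch shows what an unfilled slot looks like in practice. EmptySlot and failsOnUse( ) are hypothetical names; the example uses the Integer.valueOf( ) factory method (available since Java 5) in place of the constructor. Reading the slot simply yields null, and the exception appears only when a method is called through that null reference.

```java
class EmptySlot {
  // Returns true if using the element at index i fails because no
  // Integer object was ever created for that slot.
  static boolean failsOnUse(Integer[] a, int i) {
    try {
      a[i].intValue();
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }
  public static void main(String[] args) {
    Integer[] a = new Integer[3]; // Three null references
    a[0] = Integer.valueOf(42);   // Only slot 0 gets an object
    System.out.println(a[1]);              // null
    System.out.println(failsOnUse(a, 0));  // false
    System.out.println(failsOnUse(a, 1));  // true
  }
}
```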
Take a look at the formation of the
String object inside the print statements. You can see that the reference
to the Integer object is automatically converted to produce a String
representing the value inside the object.
It’s also possible to initialize
arrays of objects using the curly-brace-enclosed list. There are two
forms:
//: c04:ArrayInit.java
// Array initialization.

public class ArrayInit {
  public static void main(String[] args) {
    Integer[] a = {
      new Integer(1),
      new Integer(2),
      new Integer(3),
    };

    Integer[] b = new Integer[] {
      new Integer(1),
      new Integer(2),
      new Integer(3),
    };
  }
} ///:~
This is useful at times, but it’s
more limited since the size of the array is determined at compile-time. The
final comma in the list of initializers is optional. (This feature makes for
easier maintenance of long lists.)
The second form of array initialization
provides a convenient syntax to create and call methods that can produce the
same effect as C’s
variable argument lists
(known as “varargs” in C). These can include an unknown quantity of
arguments as well as unknown types. Since all classes are ultimately inherited
from the common root class Object (a subject you will learn more about as
this book progresses), you can create a method that takes an array of
Object and call it like this:
//: c04:VarArgs.java
// Using the array syntax to create
// variable argument lists.

class A { int i; }

public class VarArgs {
  static void f(Object[] x) {
    for(int i = 0; i < x.length; i++)
      System.out.println(x[i]);
  }
  public static void main(String[] args) {
    f(new Object[] {
        new Integer(47), new VarArgs(),
        new Float(3.14), new Double(11.11) });
    f(new Object[] {"one", "two", "three" });
    f(new Object[] {new A(), new A(), new A()});
  }
} ///:~
At this point, there’s not much you
can do with these unknown objects, and this program uses the automatic
String conversion to do something useful with each Object. In
Chapter 12, which covers run-time type identification (RTTI),
you’ll learn how to discover the exact type of such objects so that you
can do something more interesting with
them.
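For readers on Java 5 or later, the language itself now supports variable argument lists with the Object... syntax, which builds the array on the caller’s behalf. The sketch below is not part of the book’s example set; VarArgsSketch and count( ) are hypothetical names.

```java
class VarArgsSketch {
  // The compiler packs the arguments into an Object[] for you
  static int count(Object... x) {
    for (Object o : x)
      System.out.println(o);
    return x.length;
  }
  public static void main(String[] args) {
    System.out.println(count(47, 3.14f, "one")); // Prints the 3 values, then 3
    System.out.println(count());                 // 0
    // An explicit array is still accepted and passed through as is:
    System.out.println(count(new Object[] { "a", "b" }));
  }
}
```

Inside the method the parameter is an ordinary Object array, so the body looks the same as f( ) in VarArgs.java.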
Java also allows you to create multidimensional arrays:
//: c04:MultiDimArray.java
// Creating multidimensional arrays.
import java.util.*;

public class MultiDimArray {
  static Random rand = new Random();
  static int pRand(int mod) {
    // nextInt(mod) is in [0, mod), so the result is in [1, mod].
    // (Math.abs(rand.nextInt()) can be negative for Integer.MIN_VALUE.)
    return rand.nextInt(mod) + 1;
  }
  static void prt(String s) {
    System.out.println(s);
  }
  public static void main(String[] args) {
    int[][] a1 = {
      { 1, 2, 3, },
      { 4, 5, 6, },
    };
    for(int i = 0; i < a1.length; i++)
      for(int j = 0; j < a1[i].length; j++)
        prt("a1[" + i + "][" + j +
            "] = " + a1[i][j]);
    // 3-D array with fixed length:
    int[][][] a2 = new int[2][2][4];
    for(int i = 0; i < a2.length; i++)
      for(int j = 0; j < a2[i].length; j++)
        for(int k = 0; k < a2[i][j].length; k++)
          prt("a2[" + i + "][" + j + "][" + k +
              "] = " + a2[i][j][k]);
    // 3-D array with varied-length vectors:
    int[][][] a3 = new int[pRand(7)][][];
    for(int i = 0; i < a3.length; i++) {
      a3[i] = new int[pRand(5)][];
      for(int j = 0; j < a3[i].length; j++)
        a3[i][j] = new int[pRand(5)];
    }
    for(int i = 0; i < a3.length; i++)
      for(int j = 0; j < a3[i].length; j++)
        for(int k = 0; k < a3[i][j].length; k++)
          prt("a3[" + i + "][" + j + "][" + k +
              "] = " + a3[i][j][k]);
    // Array of nonprimitive objects:
    Integer[][] a4 = {
      { new Integer(1), new Integer(2)},
      { new Integer(3), new Integer(4)},
      { new Integer(5), new Integer(6)},
    };
    for(int i = 0; i < a4.length; i++)
      for(int j = 0; j < a4[i].length; j++)
        prt("a4[" + i + "][" + j +
            "] = " + a4[i][j]);
    Integer[][] a5;
    a5 = new Integer[3][];
    for(int i = 0; i < a5.length; i++) {
      a5[i] = new Integer[3];
      for(int j = 0; j < a5[i].length; j++)
        a5[i][j] = new Integer(i*j);
    }
    for(int i = 0; i < a5.length; i++)
      for(int j = 0; j < a5[i].length; j++)
        prt("a5[" + i + "][" + j +
            "] = " + a5[i][j]);
  }
} ///:~
The printing loops use length so that they don’t depend on fixed array
sizes.
The first example shows a
multidimensional array of primitives. You delimit each vector in the array with
curly braces:
int[][] a1 = {
  { 1, 2, 3, },
  { 4, 5, 6, },
};
Each set of square brackets moves you
into the next level of the array.
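As a small illustration of that point, the sketch below indexes the same kind of array one level at a time. The class name ArrayLevels and the helper row( ) are hypothetical names introduced only for this example, and java.util.Arrays is used here purely for printing.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the chapter's listings.
class ArrayLevels {
  // One set of brackets applied to an int[][] yields an int[] (a row).
  static int[] row(int[][] a, int i) {
    return a[i];
  }
  public static void main(String[] args) {
    int[][] a1 = {
      { 1, 2, 3 },
      { 4, 5, 6 },
    };
    // No brackets: the outer array, which holds two rows.
    System.out.println(a1.length);                    // 2
    // One set of brackets: a single row.
    System.out.println(Arrays.toString(row(a1, 1)));  // [4, 5, 6]
    // Two sets of brackets: a single int.
    System.out.println(a1[1][2]);                     // 6
  }
}
```

With no brackets you have the whole array, with one set you have a row, and with two sets you reach an individual value.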
The second example shows a
three-dimensional array allocated with new. Here, the whole array is
allocated at once:
int[][][] a2 = new int[2][2][4];
But the third example shows that each
vector in the arrays that make up the matrix can be of any
length:
int[][][] a3 = new int[pRand(7)][][];
for(int i = 0; i < a3.length; i++) {
  a3[i] = new int[pRand(5)][];
  for(int j = 0; j < a3[i].length; j++)
    a3[i][j] = new int[pRand(5)];
}
The first new creates an array whose first dimension has a random
length, with the remaining dimensions left undetermined. The second
new, inside the for loop, fills out the elements of the first dimension but
leaves the third dimension undetermined until you reach the third new.
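Because the random lengths make the output different on every run, it can help to see the same staged allocation with predictable sizes. The following is only a sketch: the class RaggedArray and its method triangle( ) are hypothetical names, and the "row i has i + 1 elements" rule is an assumption chosen to make the shape easy to check.

```java
// Hypothetical demo class; not part of the chapter's listings.
class RaggedArray {
  // Builds a triangular array in which row i has i + 1 elements.
  static int[][] triangle(int rows) {
    // Only the first dimension is fixed by this new:
    int[][] t = new int[rows][];
    for(int i = 0; i < t.length; i++)
      t[i] = new int[i + 1]; // Each row gets its own length
    return t;
  }
  public static void main(String[] args) {
    int[][] t = triangle(4);
    for(int i = 0; i < t.length; i++)
      System.out.println("t[" + i + "].length = " + t[i].length);
  }
}
```

Each row is a separate array object, which is why the rows are free to differ in length.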
You will see from the output that the values in an array of primitives
are automatically initialized to zero if you don’t give them an
explicit initialization value. (The elements of an array of object
references start out as null instead.)
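A minimal sketch of these defaults follows. The class ArrayDefaults and the helper allZero( ) are hypothetical names used only here. The example also shows that the elements of an array of object references, and the rows of a partially allocated array, start out as null rather than zero.

```java
// Hypothetical demo class; not part of the chapter's listings.
class ArrayDefaults {
  // True if every element of a two-dimensional int array is zero.
  static boolean allZero(int[][] a) {
    for(int i = 0; i < a.length; i++)
      for(int j = 0; j < a[i].length; j++)
        if(a[i][j] != 0)
          return false;
    return true;
  }
  public static void main(String[] args) {
    int[][] counts = new int[2][3];   // Primitives start at zero
    Integer[] boxed = new Integer[2]; // References start as null
    int[][] rows = new int[2][];      // Unallocated rows are null
    System.out.println(allZero(counts));     // true
    System.out.println(boxed[0]);            // null
    System.out.println(rows[1] == null);     // true
  }
}
```

The practical consequence is that an array of objects must be filled in before its elements are used, which is what the fourth and fifth examples in the listing do.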
You can deal with arrays of nonprimitive
objects in a similar fashion, which is shown in the fourth example,
demonstrating the ability to collect many new expressions with curly
braces:
Integer[][] a4 = {
  { new Integer(1), new Integer(2)},
  { new Integer(3), new Integer(4)},
  { new Integer(5), new Integer(6)},
};
The fifth example shows how an array of
nonprimitive objects can be built up piece by piece:
Integer[][] a5;
a5 = new Integer[3][];
for(int i = 0; i < a5.length; i++) {
  a5[i] = new Integer[3];
  for(int j = 0; j < a5[i].length; j++)
    a5[i][j] = new Integer(i*j);
}
This seemingly elaborate mechanism for
initialization, the constructor, should give you a strong hint about the
critical importance placed on initialization in the language. As Stroustrup was
designing C++, one of the first observations he made about productivity in C was
that improper initialization of variables causes a significant portion of
programming problems. These kinds of bugs are hard to find, and similar issues
apply to improper cleanup. Because constructors allow you to guarantee
proper initialization and cleanup (the compiler will not allow an object to be
created without the proper constructor calls), you get complete control and
safety.
In C++, destruction is quite important
because objects created with new must be explicitly destroyed. In Java,
the garbage collector automatically releases the memory for all objects, so the
equivalent cleanup method in Java isn’t necessary much of the time. In
cases where you don’t need destructor-like behavior, Java’s garbage
collector greatly simplifies programming, and adds much-needed safety in
managing memory. Some garbage collectors can even clean up other resources like
graphics and file handles. However, the garbage collector does add a run-time
cost, the expense of which is difficult to put into perspective because of the
overall slowness of Java interpreters at this writing. As this changes,
we’ll be able to discover if the overhead of the garbage collector will
preclude the use of Java for certain types of programs. (One of the issues is
the unpredictability of the garbage collector.)
Because of the guarantee that all objects
will be constructed, there’s actually more to the constructor than what is
shown here. In particular, when you create new classes using either
composition or inheritance the guarantee of construction also
holds, and some additional syntax is necessary to support this. You’ll
learn about composition, inheritance, and how they affect constructors in future
chapters.
[28]
In some of the Java literature from Sun they instead refer to these with the
clumsy but descriptive name “no-arg constructors.” The term
“default constructor” has been in use for many years and so I will
use that.
[29]
The one case in which this is possible occurs if you pass a reference to an
object into the static method. Then, via the reference (which is now
effectively this), you can call non-static methods and access
non-static fields. But typically if you want to do something like this
you’ll just make an ordinary, non-static method.
[30]
A term coined by Bill Venners (www.artima.com) during a seminar that he and I
were giving together.
[31]
In contrast, C++ has the constructor initializer list that causes
initialization to occur before entering the constructor body, and is enforced
for objects. See Thinking in C++, 2nd edition (available on
this book’s CD ROM and at www.BruceEckel.com).
[32]
See Thinking in C++, 2nd edition for a complete description
of C++ aggregate initialization.