Objects Have Class, References Have Type
Well, you have arrived. If you have survived to this point, you are beginning
to realize both the power and complexity of programming using objects and inheritance.
Much of the confusion about references, types, classes and objects is cleared up
by the following statement: "Objects have class, references have type." The
more technically correct version may be "Objects have class, reference
variables have type." As long as it is understood that reference variables "contain" references
to objects, the former statement is superior in its simplicity.
What is a Type?
There are a number of definitions of a type, but I have struggled to find
an illuminating statement.
According to Grady Booch (Object Oriented Analysis and Design) type is "The
definition of the domain of allowable values that an object may possess and
the set of operations that may be performed upon the object. The terms class
and type are usually (but not always) interchangeable; a type is a slightly
different concept than a class, in that it emphasizes the importance of conformance
to a common protocol... Typing lets us express our abstractions so that the
programming language in which we implement them can be made to enforce design
decisions."
According to Bertrand Meyer (Object-Oriented Software Construction) "A
type is the static description of certain dynamic objects: the various data
elements that will be processed during execution of a software system...The
notion of type is a semantic concept, since every type directly influences
the execution of a software object by defining the form of the objects that
the system will create and manipulate at run time."
According to the "Gang of Four" (Design Patterns) "The set
of all signatures defined by an object's operations is called the interface
to the object... A type is a name used to denote a particular interface....
An object may have many types, and widely different objects can share a type....
An object's interface says nothing about its implementation--different objects
are free to implement requests differently."
It is the last definition that emphasizes the concept of programming to an
interface, one or more public views of an object. You can look at an interface
as a contract between the designer of the class and the caller of the class.
The interface determines what public fields and methods are visible to a caller
of a class. C# provides direct support for this kind of contract with
the keyword interface. In C#, a class can implement multiple interfaces but
inherits from a single implementation hierarchy. Thus a class can expose one or more distinct
public contracts. Both Booch and Meyer allude to the fact that types allow
the compiler to enforce design decisions and influence the execution of software.
Why is this so important? The answer is that references imply restricted access.
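As a rough sketch of one class exposing more than one public contract, consider the following. The names IPrintable, IResettable and Counter are hypothetical and were invented for this illustration; they are not part of the .NET class library.

```csharp
using System;

// Hypothetical contracts, invented for this sketch.
public interface IPrintable
{
    string Print();
}

public interface IResettable
{
    void Reset();
}

// One implementation hierarchy (System.Object) plus two interface contracts.
public class Counter : IPrintable, IResettable
{
    private int count;

    public void Increment() { count++; }
    public string Print() { return "count=" + count; }
    public void Reset() { count = 0; }
}

public static class ContractDemo
{
    public static string Run()
    {
        Counter counter = new Counter();
        counter.Increment();

        // The same object, seen through two different contracts.
        IPrintable printable = counter;
        IResettable resettable = counter;

        resettable.Reset();        // only Reset() (and the Object members) are reachable here
        return printable.Print();  // only Print() (and the Object members) are reachable here
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "count=0"
    }
}
```

Each interface reference reaches only the members named in its own contract, even though both refer to the same Counter object.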
References Have Type
You will recall that methods and properties of an object have access modifiers
such as public, protected and private. The calling context and access modifiers
determine the visibility of fields and methods of an object. Thus, public
fields and methods are visible to callers outside of the class. However,
there is another level of access control that is much more difficult to
grasp and explain. In a nutshell, when you declare a reference variable,
the type of the reference variable restricts access to one of the object's public
contracts.
You use a reference variable to touch an object's fields and properties. The
key concept is that the type of the reference variable determines which of
the object's public contracts (interfaces) is accessible using the reference
variable. If an object had only one distinct public contract, this discussion
would be moot. But in fact an object can have many public contracts. For instance,
every object in the .NET Framework derives from the base class Object. Let's look
again at the inheritance hierarchy for TextBox:
System.Object
System.Web.UI.Control
System.Web.UI.WebControls.WebControl
System.Web.UI.WebControls.TextBox
The TextBox control inherits from WebControl which inherits from Control which
inherits from Object. Each of these four classes defines an interface or public
view, a type. If you declare a reference variable of type Object, only the
public members defined in System.Object are visible using the reference variable.
object tb= new TextBox();
If you use the Visual Studio IDE and enter "tb." the IDE will automatically
display the only four methods that can be touched using "tb":
Equals()
GetHashCode()
GetType()
ToString()
Thus the type of the reference variable "tb", object, limits access
to a subset of the public methods and fields of the instance of class TextBox.
Only one of the many public interfaces can be touched with the reference variable "tb".
You could also declare "tb" as type Control:
Control tb= new TextBox();
Now if you enter "tb.", a larger number of public fields and methods
are automatically displayed by the IDE including:
Equals()
GetHashCode()
GetType()
ToString()
...
DataBind()
Dispose()
Finally, if you declare a reference variable of type TextBox all of the public
fields and members of the class TextBox are visible using the reference variable "tb" including:
Equals()
GetHashCode()
GetType()
ToString()
DataBind()
Dispose()
...
Text
All I can suggest is try it! Declare the variable "tb" using different
types and see how the type of the reference variable restricts access to the
public interface of the object. Note that an object's set of possible interfaces
(types) can be defined by the object's class hierarchy, as in the preceding example,
or through the implementation of multiple interfaces. For example, here is the declaration
of the class Hashtable, which implements six distinct interfaces:
public class Hashtable : IDictionary, ICollection, IEnumerable, ISerializable,
IDeserializationCallback, ICloneable
IEnumerable hash=new Hashtable();
The reference variable hash is of type IEnumerable and is restricted to the
public methods and fields of the IEnumerable interface and System.Object:
Equals()
GetHashCode()
GetType()
ToString()
GetEnumerator()
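The restriction can be tried out in a small console program. The sketch below uses Hashtable rather than TextBox so that it runs without a reference to System.Web; the class name ReferenceTypeDemo and the method SameObject are made up for this illustration.

```csharp
using System;
using System.Collections;

public static class ReferenceTypeDemo
{
    public static bool SameObject()
    {
        Hashtable table = new Hashtable();
        table.Add("key", "value");

        // Three reference variables, three types, one object on the heap.
        IEnumerable asEnumerable = table;
        ICollection asCollection = table;
        object asObject = table;

        // int n = asEnumerable.Count;        // would not compile: Count is not part of IEnumerable
        // var it = asObject.GetEnumerator(); // would not compile: not part of System.Object
        int count = asCollection.Count;       // compiles: Count is part of ICollection

        return count == 1
            && object.ReferenceEquals(asEnumerable, asCollection)
            && object.ReferenceEquals(asCollection, asObject);
    }

    public static void Main()
    {
        Console.WriteLine(SameObject()); // prints "True"
    }
}
```

Uncommenting either of the two commented lines should produce a compile-time error, which is the compiler enforcing the contract selected by the type of the reference variable.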
Objects Have Class
Well, this should be pretty self-explanatory. Remember, an object is an instance
of a class and goes on the heap. A local reference variable goes on the
stack and is used to touch an object. The full interface of the object
is defined in the class, so an object has class. The reference variable, on the other
hand, is restricted by its type and can only be used to touch fields and
properties defined in its type (or in System.Object). The compiler enforces the design
decisions. Be clear: when you created a reference variable of type object
in the previous example, you still created an instance of class TextBox
on the heap.
object tb= new TextBox();
This line of code creates a local reference variable of type object on the
stack and an object of class TextBox on the heap.
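One way to see that the object keeps its class no matter what type the reference variable has is to ask the object itself with GetType(). This is a minimal sketch that again substitutes Hashtable for TextBox so it runs as a plain console program; ObjectClassDemo and ClassOf are hypothetical names.

```csharp
using System;
using System.Collections;

public static class ObjectClassDemo
{
    public static string ClassOf(object reference)
    {
        // GetType() reports the class of the object, not the type of the variable.
        return reference.GetType().Name;
    }

    public static void Main()
    {
        object o = new Hashtable();      // reference of type object
        IEnumerable e = new Hashtable(); // reference of type IEnumerable

        Console.WriteLine(ClassOf(o)); // prints "Hashtable"
        Console.WriteLine(ClassOf(e)); // prints "Hashtable"
    }
}
```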
Casting About
Since you can only touch an object using a reference variable, at times the
need arises to view the object through a reference of a different type. You can do this by "casting" the
reference to another valid type. The target type
must be in the domain of types (public interfaces) defined by the class of
the object. If you try to cast a reference to a type that is not supported
by the class of the object, a runtime exception will be thrown. "Casting
up" the class hierarchy from a subtype to a base type is always safe.
The converse is not true. "Casting down" the class hierarchy is
not always safe and can result in a runtime InvalidCastException. Here is an
example of casting down the class hierarchy:
object tb= new TextBox();
((TextBox)tb).Text= "Hello";
By explicitly casting the reference variable to the type TextBox, you are
telling the compiler that you want to access the full interface of the
class TextBox.
A more common task is to create another reference variable like this:
TextBox textBox= (TextBox)tb;
This creates a separate local reference variable on the stack of type TextBox.
Both variables, "tb" and "textBox", contain references
to the same object on the heap of class TextBox, but each variable has a
different type and hence different access to the public fields and methods
of the single object on the heap.
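To see both outcomes of casting down, here is a small sketch under the same assumption as before: Hashtable and ArrayList stand in for the web controls so the code runs in a console program. CastDemo and TryDownCast are hypothetical names.

```csharp
using System;
using System.Collections;

public static class CastDemo
{
    // Attempts to cast down to Hashtable and reports what happened.
    // The caller is expected to pass a non-null reference.
    public static string TryDownCast(object reference)
    {
        try
        {
            Hashtable table = (Hashtable)reference; // checked at run time
            return "ok, Count=" + table.Count;
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    public static void Main()
    {
        object good = new Hashtable(); // casting up: always safe, no cast syntax needed
        object bad = new ArrayList();  // also an object, but not a Hashtable

        Console.WriteLine(TryDownCast(good)); // prints "ok, Count=0"
        Console.WriteLine(TryDownCast(bad));  // prints "InvalidCastException"
    }
}
```

The compiler accepts both casts because either could be valid for a reference of type object; only the run-time class of the object decides whether the cast succeeds.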
A Twisted Analogy
Now if this is just not making sense to you, let me try a twisted analogy of
types. Just as an object can have different public views, a complex person
can wear different "hats" or play different roles in life. A complex
person can have multiple defined roles or types. When a complex person is
wearing the "hat" of say a police officer, he or she is restricted
by the implied contract between the clients (the populace) and the role of
a police officer. When in uniform, the complex person has a type of police
officer. When that same complex person goes home, he or she may put on the "hat" of
a parent and is expected to exhibit behavior consistent with the role of
a parent. At home, off duty, the complex person is expected to have the type
of a parent. In an emergency, the off duty complex person is expected to
assume the role of a police officer. You can "cast" an off duty
officer of type parent to a police officer. You should not cast a twisted
programmer of type parent to the type police officer. This will definitely
result in a runtime exception!
IS and AS
Since casting down the class hierarchy is not "safe", you must code
defensively. The C# language provides the keywords is and as, which greatly
simplify writing exception-safe code. The keyword is returns true if the
object referred to by the left-hand operand supports the type of the right-hand
operand. According to the IDE:
The is operator is used to check whether the run-time type of an object is
compatible with a given type. The is operator is used in an expression of the
form:
expression is type
Where:
expression
An expression of a reference type.
type
A type.
Remarks
An is expression evaluates to true if both of the following conditions are
met:
- expression is not null.
- expression can be cast to type. That is, a cast expression of the form
(type)(expression) will complete without throwing an exception. For more
information, see 7.6.6 Cast expressions.
A compile-time warning will be issued if the expression expression
is type is known to always be true or always be false.
The is operator cannot be overloaded.
Here is an example using is:
private TextBox Cast(object o)
{
    if (o is TextBox)
    {
        return (TextBox)o;
    }
    else
    {
        return null;
    }
}
object tb = new TextBox();
TextBox textBox = Cast(tb);
In this sample using is, if the cast is invalid the result is null. Alternatively,
you can use the keyword as. According to the IDE:
The as operator is used to perform conversions between compatible types. The
as operator is used in an expression of the form:
expression as type
where:
expression
An expression of a reference type.
type
A reference type.
Remarks
The as operator is like a cast except that it yields null on conversion failure
instead of raising an exception. More formally, an expression of the form:
expression as type
is equivalent to:
expression is type ? (type)expression : (type)null
except that expression is evaluated only once.
As in the previous example using is, if the cast is not valid, the result
is null. I prefer using is. I would argue that code using is, is more readable
than code using as.
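For comparison, here are the two styles side by side. This is only a sketch: it uses Hashtable in place of TextBox so that it runs without System.Web, and the names IsAsDemo, CastWithIs and CastWithAs are invented for the example.

```csharp
using System;
using System.Collections;

public static class IsAsDemo
{
    // Test with "is", then cast explicitly.
    public static Hashtable CastWithIs(object o)
    {
        if (o is Hashtable)
        {
            return (Hashtable)o;
        }
        return null;
    }

    // "as" performs the test and the conversion in one step.
    public static Hashtable CastWithAs(object o)
    {
        return o as Hashtable;
    }

    public static void Main()
    {
        object table = new Hashtable();
        object list = new ArrayList();

        Console.WriteLine(CastWithIs(table) != null); // prints "True"
        Console.WriteLine(CastWithAs(table) != null); // prints "True"
        Console.WriteLine(CastWithIs(list) == null);  // prints "True"
        Console.WriteLine(CastWithAs(list) == null);  // prints "True"
    }
}
```

Both helpers return null instead of throwing when the object does not support the requested type, so the caller must still check the result for null before using it.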
Well I hope this discussion of references, types, classes and objects has
clarified things for you. In a nutshell, when you declare a reference variable,
the type of the reference variable restricts access to one of the object's
public contracts. When in doubt, repeat the mantra "Objects have class,
references have type."