Sunday, October 4, 2009

Strongly typed programming language

In computer science and computer programming, the term strong typing is used to describe those situations where programming languages specify one or more restrictions on how operations involving values having different data types can be intermixed. Its antonym is weak typing. However, these terms have been given such a wide variety of meanings over the short history of computing that it is often difficult to know, out of context, what an individual author means when using them.

Interpretation

Most generally, "strong typing" implies that the programming language places severe restrictions on the intermixing that is permitted to occur, preventing the compiling or running of source code which uses data in what is considered to be an invalid way. For instance, an integer division operation may not be used upon strings; a procedure which operates upon linked lists may not be used upon numbers. However, the nature and strength of these restrictions is highly variable.
Benjamin C. Pierce, author of Types and Programming Languages and Advanced Types and Programming Languages, says, "I spent a few weeks... trying to sort out the terminology of "strongly typed," "statically typed," "safe," etc., and found it amazingly difficult.... The usage of these terms is so various as to render them almost useless." [1] Luca Cardelli's article Typeful Programming describes strong typing simply as the absence of unchecked run-time type errors.[2] In other writing, the absence of unchecked run-time errors is referred to as safety or type safety; Tony Hoare's early papers call this property security.

Meanings in computer literature

Some of the factors which writers have qualified as "strong typing" include:
  • Strong guarantees about the run-time behavior of a program before program execution, whether provided by static analysis, the execution semantics of the language or another mechanism.
  • Type safety; that is, at compile or run time, the rejection of operations or function calls which attempt to disregard data types. In a more rigorous setting, type safety is proved about a formal language by proving progress and preservation.
  • The guarantee that a well-defined error or exceptional behavior (as opposed to an undefined behavior) occurs as soon as a type-matching failure happens at runtime, or, as a special case of that with even stronger constraints, the guarantee that type-matching failures would never happen at runtime (which would also satisfy the constraint of "no undefined behavior" after type-matching failures, since the latter would never happen anyway).
  • The mandatory requirement, by a language definition, of compile-time checks for type constraint violations. That is, the compiler ensures that operations only occur on operand types that are valid for the operation.
  • Fixed and invariable typing of data objects. The type of a given data object does not vary over that object's lifetime. For example, class instances may not have their class altered.
  • The absence of ways to evade the type system. Such evasions are possible in languages that allow programmer access to the underlying representation of values, i.e., their bit-pattern.
  • Omission of implicit type conversion, that is, conversions that are inserted by the compiler on the programmer's behalf. For these authors, a programming language is strongly typed if type conversions are allowed only when an explicit notation, often called a cast, is used to indicate the desire of converting one type to another.
  • Disallowing any kind of type conversion. Values of one type cannot be converted to another type, explicitly or implicitly.
  • A complex, fine-grained type system with compound types.

Variation across programming languages

Note that some of these definitions are contradictory, others are merely orthogonal, and still others are special cases (with additional constraints) of other, more "liberal" (less strong) definitions. Because of the wide divergence among these definitions, it is possible to defend claims about most programming languages that they are either strongly or weakly typed. For instance:
  • Java, Pascal and C require all variables to have a defined type and support the use of explicit casts of arithmetic values to other arithmetic types. Java and Pascal are often said to be more strongly typed than C, a claim that is probably based on the fact that C supports more kinds of implicit conversions than Pascal, and C also allows pointer values to be explicitly cast while Java and Pascal do not. Java itself may be considered more strongly typed than Pascal as manners of evading the static type system in Java are controlled by the Java Virtual Machine's dynamic type system.
  • Smalltalk, Ruby, Python, Self, and the LISP family of languages are all "strongly typed" in the sense that typing errors are prevented at runtime, but these languages make no use of static type checking: the compiler does not check or enforce type constraint rules. The term duck typing is now used to describe the dynamic typing paradigm used by the languages in this group.
  • Standard ML, OCaml and Haskell have purely static type systems, in which the compiler automatically infers a precise type for all values. These languages (along with most functional languages) are considered to have stronger type systems than Java, as they permit no implicit type conversions. While OCaml's libraries allow one form of evasion (Object magic), this feature remains unused in most applications.
  • Visual BASIC is a hybrid language. In addition to including statically typed variables, it includes a "Variant" data type that can store data of any type. Its implicit casts are fairly liberal where, for example, one can sum string variants and pass the result into an integer literal.
  • Assembly language and Forth have been said to be untyped. There is no type checking; it is up to the programmer to ensure that data given to functions is of the appropriate type. Any type conversion required is explicit.
  • Ada is statically and strongly typed.
For this reason, writers who wish to write unambiguously about type systems often eschew the term "strong typing" in favor of specific expressions such as "static typing" or "type safety".

No comments:

Post a Comment