Many modern programming languages allow a programmer to allocate and reclaim memory for data whose lifetime is not determined by the lexical scope of the routine that allocates the data. Memory of this type is said to be “dynamically” allocated. Dynamic memory may be manually created and destroyed by the programmer through, for example, explicit use of memory management library routines. Alternatively, dynamic memory may be managed by a program's run-time system. In this latter approach, while the programmer must request the dynamic allocation of memory for data, she does not need to determine when that memory is no longer needed—the program's run-time system does this automatically.
It is a generally recognized practice in computer programming to use what is known as a heap to provide for the dynamic creation (“allocation”) and recovery (“deallocation”) of regions of memory known variously as nodes, blocks, cells, or objects. Several heaps may be associated with a single program. Determining when a node is no longer referenced elsewhere in a program is often a very difficult task and is, therefore, a source of errors and of excess memory use caused by unused nodes that are not deallocated properly or promptly.
One technique to provide memory management is referred to as “reference counting.” Reference counting memory management is based on counting the number of references to each cell or node from other, active cells or nodes. When the number of references to a cell or node is zero, it may be reclaimed and made available for use in subsequent memory allocation operations.
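By way of illustration only, the counting scheme just described might be sketched in C as follows. The type Node and the routines node_create( ), node_retain( ) and node_release( ) are hypothetical names chosen for this sketch, and the live_nodes counter exists only so that reclamation can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted node; all names are illustrative only. */
typedef struct Node {
    unsigned refcount;  /* number of references currently held to this node */
    int value;
} Node;

int live_nodes = 0;  /* bookkeeping for demonstration purposes only */

/* Allocate a node that starts with one reference, held by the caller. */
Node *node_create(int value) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->refcount = 1;
    n->value = value;
    live_nodes++;
    return n;
}

/* Record one additional reference to the node. */
void node_retain(Node *n) {
    assert(n != NULL);
    n->refcount++;
}

/* Drop one reference; reclaim the node once no references remain. */
void node_release(Node *n) {
    assert(n != NULL && n->refcount > 0);
    if (--n->refcount == 0) {
        live_nodes--;
        free(n);
    }
}
```

In this sketch a node created with node_create( ) is reclaimed by the node_release( ) call that brings its count to zero, after which its memory is available for later allocations.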
Another technique to provide dynamic memory management uses a garbage collected heap. In this approach, node deallocation is performed by runtime code rather than explicitly by program code. Many runtime-based languages provide this facility so that code written in these languages does not have to manage the complexity of determining when dynamically allocated nodes can be deallocated. Prior art garbage collection technology is discussed in Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins, published by John Wiley & Sons, Copyright 1996. This reference is indicative of the prior art.
Prior art approaches to memory management use one library to support reference counting programs and a separate library to support garbage collected programs. Thus, a prior art runtime environment that supports the execution of both reference count and garbage collected applications requires that both sets of libraries (one for reference count operations and one for garbage collected operations) be loaded into a computer system's main memory. Since each shared library is typically very large (e.g., 100+ megabytes, MB), such an approach consumes a great deal of the system's memory resources. As used herein, the term “shared library” refers to a library whose code segments are shared across processes such that each process using the library does not need a private copy of the same code segment. Accordingly, it would be beneficial to provide a mechanism that supports both reference count and garbage collected memory management operations without incurring the memory overhead of separate and distinct library implementations.
In one embodiment the invention provides a method to use a dual-use library. The method includes: receiving a first instruction that, when executed, invokes a first routine; determining the first instruction's required memory management scheme; and executing reference-count specific or garbage collection specific instructions based on whether the first instruction requires reference count or garbage collection memory management.
In another embodiment, the invention provides a dynamic memory management method. The method comprises the acts of receiving a call to a first routine in a runtime library from an executing process and performing a first one or more instructions in the first routine associated with reference count memory management if the process requires reference count memory management, otherwise performing a second one or more instructions in the first routine associated with garbage collection memory management if the process requires garbage collection memory management.
Methods in accordance with the invention may be implemented as computer executable instructions stored in any media (e.g., a program storage device) that may be read by a computer system.
The following description is presented to enable any person skilled in the art to make and use the invention as claimed and is provided in the context of a computer system executing the Mac OS® X operating system. (MAC OS is a registered trademark of Apple Inc.) Variations will, of course, be readily apparent to those skilled in the art. Accordingly, the claims appended hereto are not intended to be limited by the disclosed embodiments, but are to be accorded their widest scope consistent with the principles and features disclosed herein.
A shared library is generally formatted to identify itself as a shared library. The use of shared libraries is a well known practice that allows the sharing of read-only data across several processes in a multi-process system. The binary form of processor instructions constituting compiled higher level language constructs generally makes up the majority of shared memory. A Mac OS X based system has over 100 MB of shared processor instructions in its libraries. Libraries also generally include routines that are only invoked by other library routines. A library in accordance with the invention is one that includes routines for simultaneously supporting both reference count and garbage collected (generational and full) memory management; it is a “dual-use” library.
When a programmer creates a library for use with a reference count only memory management scheme, they will use reference counting operations to track the number of references to each object subject to dynamic memory management. Most often, these operations are embodied in routines that the programmer explicitly calls. This is not the only approach, however. For example, a programmer could include code to directly manipulate an object's counter or the programmer's compiler application could be modified to include the necessary counter operations each time an object assignment operator is present. While different programming environments may use different names, addReference( ) and removeReference( ) routines will be used herein to represent these actions.
In contrast, a programmer creating a library for use with a garbage collected only memory management scheme will not include any extra program code to track an object's reference count.
A dual-use library in accordance with the invention must be able to receive and correctly handle calls from programs requiring either reference count or garbage collected memory management. As described herein, this capability may be provided by introducing a new assignment routine, assign( ), and instrumenting the addReference( ) and removeReference( ) routines. In one embodiment, dual-use library source code for the assign( ), addReference( ) and removeReference( ) routines is shown in
Implementation of a dual-use library in accordance with
In one embodiment of the invention, a programmer developing a dual-use library would explicitly use the assign( ) routine. That is, rather than coding an assignment in the conventional way (e.g., fred→slot1=wilma), they would use an assignment routine such as that shown in FIG. 4—e.g., assign(fred→slot1, wilma). This approach requires no changes to the developer's compiler application. It does, however, require the programmer to use the assign( ) routine. An implementation in accord with this approach modifies the programming language (e.g., C, C++, Objective-C or Objective-C++) to introduce a new storage type which can be used to annotate a pointer type object. For example, a programmer may create a pointer and assign it a type that restricts its use to pointing to garbage collected heap memory, or to stack memory or to global memory. In embodiments which use this approach, every time a garbage collected pointer is assigned (e.g., a pointer of a type that is restricted to point to garbage collected memory such as, for example, garbage collected heap memory), garbage collected library routines may be invoked at run-time. If an assignment of a non-garbage collected pointer is made (e.g., involving a pointer to global memory), garbage collected memory management library routines are not called.
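The explicit use of an assignment routine may be sketched in C as follows. This is a minimal illustration under stated assumptions and is not the routine of the referenced figure: the MemoryMode type, the current_mode variable and the barrier_notifications counter are hypothetical, and the action taken on the garbage collected path (here, merely counting the store as a stand-in for notifying a collector) is assumed. Because C passes arguments by value, the sketch takes the address of the slot being written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; only assign() is named in the text. */
typedef enum { MM_REFERENCE_COUNT, MM_GARBAGE_COLLECTED } MemoryMode;

typedef struct Object {
    struct Object *slot1;
} Object;

/* Scheme required by the calling program (assumed to be set at start-up). */
MemoryMode current_mode = MM_REFERENCE_COUNT;

/* Number of stores reported to the collector; stands in for a real
   collector notification, which is not specified here. */
unsigned long barrier_notifications = 0;

/* Store value into *slot. On the garbage collected path the store is also
   reported to the collector (assumed behaviour); on the reference count
   path the plain store is all this sketch performs. */
void assign(Object **slot, Object *value) {
    assert(slot != NULL);
    if (current_mode == MM_GARBAGE_COLLECTED) {
        barrier_notifications++;
    }
    *slot = value;
}
```

With these definitions, a library developer would write assign(&fred->slot1, wilma) in place of a direct store, and the same compiled routine would serve callers requiring either scheme.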
In another embodiment of the invention, a compiler can be provided that would automatically substitute all standard assignment operations (e.g., fred→slot1=wilma) with the newly defined assignment routine—e.g., assign(fred→slot1, wilma). This approach does not require the library developer to change how they program. It does, however, require a compiler application that has been modified to know about the assignment routine. In embodiments which use this approach, a compiler modified as described here would be invoked with a special flag. One value of this flag would indicate the program should be compiled to use garbage collected memory management. Another value of this flag would indicate the program should be compiled to use reference counting memory management. In this embodiment, the application programmer is tasked with calling the appropriate library routine (i.e., garbage collected or not) such that the proper routines are “compiled into” the final object code.
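The effect of such a flag can be approximated, for illustration only, with the C preprocessor. A macro is used here as a stand-in for the modified compiler, which is not shown; USE_GC, ASSIGN, assign_gc( ), assign_rc( ) and gc_stores are hypothetical names, and the body of assign_gc( ) merely counts stores in place of whatever a garbage collected library routine would do.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Object {
    struct Object *slot1;
} Object;

unsigned long gc_stores = 0;  /* stores routed through the collector-aware path */

/* Collector-aware store: a placeholder for the garbage collected routine. */
void assign_gc(Object **slot, Object *value) {
    assert(slot != NULL);
    gc_stores++;
    *slot = value;
}

/* Plain store used when the program is built for reference counting. */
void assign_rc(Object **slot, Object *value) {
    assert(slot != NULL);
    *slot = value;
}

/* USE_GC is a hypothetical build-time flag, e.g. defined as -DUSE_GC=1.
   It selects which routine every ASSIGN in the program expands to. */
#if defined(USE_GC) && USE_GC
#define ASSIGN(slot, value) assign_gc(&(slot), (value))
#else
#define ASSIGN(slot, value) assign_rc(&(slot), (value))
#endif
```

A store written as ASSIGN(fred->slot1, wilma) is thus bound to one routine or the other when the program is built, rather than being decided at run-time.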
It will be recognized that there is a small set of coding patterns that work under non-garbage collected memory management schemes but do not work under garbage collected schemes. One such pattern centers around the difference in deallocation versus finalization order for a subgraph of objects whose last reference has been removed. In general, there is an ordering (e.g., top-down) of deallocation operations under non-garbage collected schemes. Under garbage collection operations, however, an object's graph is traversed in an arbitrary order when finalize calls are issued (if implemented). In such environments, a new finalize call into the dual-use library may be provided to account for the difference in deallocation patterns.
Another such pattern involves the use of object allocation caches. In general, object allocation caches don't work under garbage collection schemes whereas they do in reference counting schemes. It will be recognized that references to objects within objects or by global variables may be stored without the use of an addReference( ) routine to, in particular, avoid creating a reference cycle among a set of objects. It is a best practice to make these locations known to the garbage collector so that it preserves the logical ownership pattern that exists in a reference counting design. This can be done with storage annotation made visible to the compiler. It will also be recognized that references to objects may be held in traditional heap memory at the same time a garbage collected heap is provided. Such references need to be made with addExternal( ) and removeExternal( ) calls. These routines can be instrumented such that they invoke the addReference( ) and removeReference( ) routines if they are called by a program requiring reference count memory management.
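The instrumentation of the external-reference routines described above may be sketched in C as follows. The forwarding to addReference( ) and removeReference( ) on the reference count path follows the description; the MemoryMode type, the current_mode variable, the external_refs field and the behaviour on the garbage collected path are assumptions made for this sketch. Reclamation of an object whose count reaches zero is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MM_REFERENCE_COUNT, MM_GARBAGE_COLLECTED } MemoryMode;

typedef struct Object {
    unsigned refcount;       /* used only under reference counting */
    unsigned external_refs;  /* hypothetical: references held from
                                traditional (non-collected) heap memory */
} Object;

/* Scheme required by the calling program (assumed to be set at start-up). */
MemoryMode current_mode = MM_REFERENCE_COUNT;

void addReference(Object *obj) {
    assert(obj != NULL);
    obj->refcount++;
}

void removeReference(Object *obj) {
    assert(obj != NULL && obj->refcount > 0);
    obj->refcount--;  /* reclamation at zero is not shown */
}

/* Note a reference held outside the garbage collected heap. */
void addExternal(Object *obj) {
    assert(obj != NULL);
    if (current_mode == MM_REFERENCE_COUNT) {
        addReference(obj);     /* forward to the counting routine */
    } else {
        obj->external_refs++;  /* assumed: collector keeps obj while > 0 */
    }
}

/* Drop a reference previously noted with addExternal(). */
void removeExternal(Object *obj) {
    assert(obj != NULL);
    if (current_mode == MM_REFERENCE_COUNT) {
        removeReference(obj);
    } else {
        assert(obj->external_refs > 0);
        obj->external_refs--;
    }
}
```

In this arrangement a caller uses the same pair of calls under either scheme, and only the branch taken inside the library differs.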
Various changes in the components and circuit elements, as well as in the details of the illustrated operational methods and pseudo-code, are possible without departing from the scope of the following claims. For example, the pseudo-code described herein is exemplary only. One of ordinary skill in the art of computer programming in general, and programming language and operating system design in particular, will recognize that the functionality of the described routines may be combined into fewer routines or divided into a larger number of routines. It will further be recognized that a dual-use library in accordance with the invention may be embodied in compiled code, assembly language code or an intermediate form of program code such as, for example, Java® byte codes. (JAVA is a registered trademark of Sun Microsystems, Inc.) Further, acts in accordance with pseudo code
The invention relates generally to computer program memory management and, more particularly, to a runtime library that supports both reference count and garbage collected memory management. This disclosure is also related to U.S. patent application Ser. No. 11/608,345, entitled “Dynamic Memory Management,” filed 8 Dec. 2006, which is hereby incorporated by reference.