From a72172a2a40a97168a737b1f3c6121990a03b48c Mon Sep 17 00:00:00 2001
From: "Guillermo J. Rozas"
Date: Sat, 29 Sep 1990 23:00:31 +0000
Subject: [PATCH] Initial revision

---
 v7/src/compiler/documentation/safety.txt | 217 +++++++++++++++++++++++
 1 file changed, 217 insertions(+)
 create mode 100644 v7/src/compiler/documentation/safety.txt

diff --git a/v7/src/compiler/documentation/safety.txt b/v7/src/compiler/documentation/safety.txt
new file mode 100644
index 000000000..07608c03f
--- /dev/null
+++ b/v7/src/compiler/documentation/safety.txt
@@ -0,0 +1,217 @@
+-*- Text -*-
+
+$Header: /Users/cph/tmp/foo/mit-scheme/mit-scheme/v7/src/compiler/documentation/safety.txt,v 1.1 1990/09/29 23:00:31 jinx Exp $
+
+                     COMPILER SAFETY INFORMATION
+                     Liar versions 4.77 and later
+
+This article describes how to control the compilation process in order
+to achieve the desired mix of safety, debuggability, and speed.
+
+The task of the native-code compiler is to translate a source (or
+Scode) program into the native machine language in order to make the
+program run faster than when interpreted.
+
+Although a straightforward translation speeds the program
+significantly, much of the achievable performance comes from
+optimizations that the compiler can perform after statically analyzing
+the program text. There is a limit, however, to the extent of the
+information that can be collected statically, and, in order to achieve
+higher performance (often desired, occasionally necessary), the
+compiler can be directed to assume additional information that is not
+apparent after analyzing the program text.
+
+Compilation switches are (global) variables whose value when the
+compiler is run determines how the compilation proceeds. Some of the
+switches provide information that cannot be deduced statically and
+allow the relaxation of some runtime consistency checking and of the
+collection of information to be displayed when an error is detected
+and signalled.
+Relaxing the runtime constraints often makes the generated code
+smaller and faster, but may cause problems if the program being
+compiled has not been fully debugged, or is invoked with
+inappropriate arguments at runtime.
+
+Safety (correctness) can primarily be compromised by eliminating
+checks that the program should perform at runtime. These checks
+are divided into a few categories:
+
+- Heap availability checks. Programs need to invoke the storage
+manager (garbage collector) when they need more memory than is
+available. Each time that storage is needed, its availability should
+be checked. If this is not done, the system may be damaged.
+
+- Stack availability checks. Storage is divided into a heap used to
+allocate objects with indefinite extent, and a stack used for
+procedure call frames with dynamic extent
+(call-with-current-continuation copies the stack when invoked).
+Availability of storage must be checked in the appropriate region. A
+very deep recursion may cause the stack to overflow, and this
+condition must be checked in order to avoid overwriting other regions
+of memory.
+
+- Type checks. Scheme is a strongly (albeit dynamically) typed
+language. Operations are only defined on certain types of objects,
+and a program is in error if it attempts to operate on the wrong type
+of data.
+
+- Range checks. The type of some arguments to a procedure may be
+correct, but there may be further restrictions on them which may not
+be satisfied. For example, vector and string indices must be
+non-negative integers smaller than the length of the vector or string,
+filenames represented as strings must denote existing files with the
+appropriate protection when the files are going to be opened for
+reading, etc.
+
+These checks obviously require extra code compared to the code
+that could be generated assuming that no violations will occur at
+runtime.
+This code requires space and time to execute and, furthermore,
+may cause other performance degradation with respect to the version
+in which violations are guaranteed not to occur. This additional
+performance degradation arises from preventing the compiler from
+making better register assignments or reusing the results of previous
+computations.
+
+For a translation to be safe, i.e. completely correct, all these checks
+must be performed at runtime except in those situations when the
+compiler can prove that violations cannot occur at runtime. These
+situations are very rare, so for most programs, most checks would be
+included in the code generated by the compiler.
+
+The MIT Scheme compiler treats each of these consistency checks as
+follows:
+
+- Heap availability checks. Heap availability is currently not checked
+on every allocation, but instead is checked when allocating large
+blocks of storage, and otherwise checked frequently, typically on
+entry to procedures and continuations. The storage manager reserves a
+block of storage past the logical end of storage in order to allow
+this scheme to work. This scheme is, however, unsafe. It is
+possible, but unlikely, to write programs that, after being compiled,
+will overflow the heap and cause the system to crash at runtime. The
+current heuristic has not been observed to fail, but future versions
+of the compiler will improve matters by allowing more careful code
+generation, and/or limiting the amount of allocation between checks to
+the size of the storage manager's overflow buffer.
+
+- Stack availability checks: Stack availability is currently not
+checked at all by compiled code. A very deep or infinite recursion
+will cause the system to crash. This WILL be fixed in the near
+future.
+
+- Type checks and range checks: A Scheme program can be considered to
+be a set of calls to primitive operations and some higher-level glue
+that pieces them together.
+The higher-level glue does not directly manipulate objects, but
+instead passes them around to the various primitives in a controlled
+fashion. Thus type and range checks are not needed in the
+higher-level glue, but only in the primitives themselves. There are
+various switches that control how primitives are treated by the
+compiler, and they provide the main form of user control of the
+safety of the compilation process.
+
+         Control of the open coding (in-lining) of primitives
+
+Primitives may be open-coded or called out of line. The out-of-line
+versions are safe, i.e. they perform all pertinent consistency checks.
+The compilation switches listed below control how the primitives are
+open coded.
+
+Some important considerations:
+
+- Under all possible settings of the switches described below, any
+generated code corresponding to a primitive call, whether open coded
+or not, will operate correctly on correct inputs.
+
+- The compiler will not make an unsafe program safe, i.e. safe
+translation does not compensate for unsafe programs.
+
+This article describes whether and when the translation of the program
+into native code will reduce the safety of the program (as compared to
+the interpreted version), but there is no realistic way to increase
+its safety. A program may be inherently unsafe if it uses inherently
+unsafe primitives inappropriately.
+
+Some primitives of the MIT Scheme system are inherently unsafe. They
+are used for system maintenance and low-level system operation, but,
+like everything else in the system, they are available to users.
+Their use should be avoided except on rare occasions. Using them
+arbitrarily may cause the system to crash, or worse, damage it in
+subtle ways that will produce spurious wrong results or later crashes.
+There is nothing the compiler can effectively do to prevent this,
+since any other action might change the meaning of the program on
+correct inputs.
+
+- The switches listed below are not orthogonal.
+Their meaning sometimes depends on the settings of the other
+switches.
+
+The following compilation switches affect the open-coding of
+primitives:
+
+
+        COMPILER:OPEN-CODE-PRIMITIVES?
+
+This N-ary switch can take several values as described below. Two of
+the values (true and false) are booleans, the rest symbols.
+
+Note that if a primitive call is open coded when a switch setting is
+used, it will also be open coded with settings that appear below in
+the list.
+
+The possible values for this switch are:
+
+-- false: No primitive calls are open-coded. All primitives are
+called out-of-line and the code is fully safe.
+
+-- CORRECT: Open code only those primitive calls whose corresponding
+code is always correct, and therefore safe.
+
+-- INNOCUOUS: Open code primitive calls whose corresponding code is
+correct when given appropriate arguments, and will not crash
+immediately when given inappropriate arguments. Primitive calls may
+return values when they should have signalled an error, but the values
+returned are relatively innocuous: they are guaranteed to be valid
+Scheme objects. The overall program or the system may still fail,
+since these incorrect values may cause the program to take the wrong
+branches later and end up in unsafe or unexpected code that it would
+never have executed had the errors been signalled. Damage to the
+system is unlikely.
+
+-- ALLOW-READS: Open code even if arbitrary memory locations may be
+read with inappropriate arguments. This may cause a memory trap if
+the location is read-protected by the Operating System, or the
+resulting address is not valid (e.g. not aligned properly), and may
+cause the garbage collector or other parts of the program and system
+to crash if the data stored at the location read is not a valid object
+but looks like one. If the extracted data is only used temporarily
+and never stored in long-lived data structures or environments,
+damage to the system is unlikely.
+
+-- ALLOW-WRITES: Open code even if arbitrary memory locations may be
+written. This may cause an immediate failure if the location is not
+writable, or other problems if the integrity of some data is destroyed
+causing (often obscure) errors or crashes later.
+
+-- true: Open code all primitive calls (that the compiler is capable of
+open-coding) without regard for safety.
+
+        COMPILER:GENERATE-TYPE-CHECKS?
+        COMPILER:GENERATE-RANGE-CHECKS?
+
+These boolean switches control whether type or range checks should be
+issued. The generated code is longer and slower when they are. Note
+that a primitive call that would not fall in the CORRECT setting of
+COMPILER:OPEN-CODE-PRIMITIVES? if these checks were not issued, might
+very well fall in it when they are. For most intents and purposes,
+turning both of these switches on bumps COMPILER:OPEN-CODE-PRIMITIVES?
+to ALLOW-WRITES unless it is false.
+
+
+        COMPILER:PRIMITIVE-ERROR-RESTARTABLE?
+
+This boolean switch controls how errors will be signalled if they are
+detected at runtime due to incorrect arguments found by checks in the
+open coding of primitive calls. If set to true, the code will be
+longer and slower, but will provide the maximum amount of debugging
+information, and in addition, the primitive call may be bypassed and
+the computation restarted as if it had completed successfully. If set
+to false, the code may be noticeably smaller and faster, but there may
+be less debugging information and some restarting ability may be lost.
-- 
2.25.1