Chris Hanson [Fri, 16 Jan 2004 19:04:38 +0000 (19:04 +0000)]
Pass the shared objects database as an argument to all the handlers,
rather than using a dynamically-bound variable. Pass an additional
argument to indicate when close-paren and close-bracket are allowed.
Fix long-standing bug in handling of unmatched close parens at top
level: the port comparison was never true because of encapsulation.
Chris Hanson [Sun, 11 Jan 2004 07:18:05 +0000 (07:18 +0000)]
Eliminate INPUT-BUFFER/DISCARD-CHAR, which couldn't be used with
non-blocking input ports because there was no way to tell whether the
char was discarded. Use INPUT-BUFFER/READ-CHAR in its place, which is
only slightly slower and does provide this indication.
Chris Hanson [Tue, 6 Jan 2004 06:22:37 +0000 (06:22 +0000)]
Implement SRFI 27, except for RANDOM-SOURCE-PSEUDO-RANDOMIZE!. While
I agree that this could be useful, it effectively mandates a
particular PRNG, and I don't want to be forced to use it.
Chris Hanson [Mon, 5 Jan 2004 21:04:38 +0000 (21:04 +0000)]
Rewrite the code that converts the output of the RNG to usable
numbers. The old methods didn't work; instead we now use the
rejection method, which is the only known good method.
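For readers unfamiliar with the rejection method, here is a minimal
sketch in Python (not the Scheme code from this commit). The names
random_below, next_element and base are hypothetical, and the sketch
only covers a modulus that fits in a single generator output.

```python
import random


def random_below(modulus, next_element, base):
    """Return a uniform integer in [0, modulus), for 0 < modulus <= base.

    next_element is a zero-argument callable returning a uniform integer
    in [0, base). All names here are illustrative assumptions.
    """
    if not 0 < modulus <= base:
        raise ValueError("modulus must satisfy 0 < modulus <= base")
    # Largest multiple of modulus that is <= base. Elements below this
    # limit fall into complete blocks of `modulus` consecutive values.
    limit = base - (base % modulus)
    while True:
        element = next_element()
        if element < limit:
            # Accept: each residue occurs equally often below the limit.
            return element % modulus
        # Reject: the element lies in the incomplete final block, so
        # using it would favor the small residues. Draw again.


# Usage: a die roll in [0, 6) from 32-bit elements.
roll = random_below(6, lambda: random.getrandbits(32), 2**32)
```

The loop terminates with probability 1, and because limit is more than
half of base whenever modulus <= base, fewer than two draws are needed
on average.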
uid67408 [Mon, 29 Dec 2003 05:07:54 +0000 (05:07 +0000)]
Fix bug: when parsing bracketed content, signal an appropriate error
when the content contains an illegal character, rather than just
failing to match.
Chris Hanson [Wed, 26 Nov 2003 07:00:40 +0000 (07:00 +0000)]
Fix broken behavior of RANDOM when given a modulus that exceeds B. The
old implementation just scaled a random element (uniformly distributed
integer between 0 and B-1 inclusive) into the given range; this
strategy works fine for a modulus <= B but breaks pretty badly for
larger moduli.
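To illustrate why scaling a single element cannot cover a larger
range, here is a toy sketch in Python. The exact scaling formula of
the old implementation is not given here, so scale_element is an
assumption, and B is shrunk to 4 so the effect is visible by hand.

```python
def scale_element(element, base, modulus):
    """Stretch one element from [0, base) into [0, modulus) by scaling.

    Assumed formula for illustration only.
    """
    return element * modulus // base


def reachable_values(base, modulus):
    """List every result that scaling can produce for this base."""
    return sorted({scale_element(e, base, modulus) for e in range(base)})


# With only 4 possible elements, at most 4 of the 8 results can occur.
print(reachable_values(4, 8))
```

A single element carries only B distinct values, so any modulus
larger than B necessarily leaves some results unreachable.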
In addition, RANDOM now generates an error if the modulus is a real
number but neither an exact positive integer nor an inexact real. The
old behavior in this case was arbitrary, not terribly useful, and
likely to be at odds with the user's expectations.
Here are some tests using the "ent" program that show the problem with
the old RANDOM implementation.
The first example is a 128MB file generated by repeatedly calling
(RANDOM (EXPT 2 64)), slicing each random number into bytes, and
writing the bytes to the file. The result is appalling:
Entropy = 7.500650 bits per byte.
Optimum compression would reduce the size
of this 134217728 byte file by 6 percent.
Chi square distribution for 134217728 samples is 515675588.87, and
randomly would exceed this value 0.01 percent of the times.
Arithmetic mean value of data bytes is 111.9516 (127.5 = random).
Monte Carlo value for Pi is 3.365650585 (error 7.13 percent).
Serial correlation coefficient is -0.031868 (totally uncorrelated = 0.0).
In contrast, here is the result from a file of the same length
generated using (RANDOM 256). This throws away 75% of each random
element, but shows the quality of the underlying generator:
Entropy = 7.999999 bits per byte.
Optimum compression would reduce the size
of this 134217728 byte file by 0 percent.
Chi square distribution for 134217728 samples is 235.11, and
randomly would exceed this value 75.00 percent of the times.
Arithmetic mean value of data bytes is 127.5060 (127.5 = random).
Monte Carlo value for Pi is 3.141120183 (error 0.02 percent).
Serial correlation coefficient is -0.000131 (totally uncorrelated = 0.0).
The new design uses enough random elements to guarantee a uniform
distribution, no matter what the size of the modulus, by iteratively
adding and scaling the elements. This preserves the quality of the
underlying generator, as shown by this result:
Entropy = 7.999999 bits per byte.
Optimum compression would reduce the size
of this 134217728 byte file by 0 percent.
Chi square distribution for 134217728 samples is 263.59, and
randomly would exceed this value 50.00 percent of the times.
Arithmetic mean value of data bytes is 127.5114 (127.5 = random).
Monte Carlo value for Pi is 3.141132700 (error 0.01 percent).
Serial correlation coefficient is -0.000044 (totally uncorrelated = 0.0).
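The "adding and scaling" step described above can be sketched as
follows, in Python rather than Scheme. Both function names are
hypothetical, and the sketch stops at building a uniform integer wide
enough for the modulus; how the real code maps that value onto the
final range is not shown here.

```python
def elements_needed(modulus, base):
    """Count the elements required so that base**count >= modulus."""
    count, span = 1, base
    while span < modulus:
        span *= base
        count += 1
    return count


def combine_elements(elements, base):
    """Treat the elements as base-`base` digits of one larger integer.

    If each element is uniform in [0, base) and they are independent,
    the result is uniform in [0, base**len(elements)).
    """
    value = 0
    for element in elements:
        if not 0 <= element < base:
            raise ValueError("element out of range")
        value = value * base + element  # scale what we have, add the next
    return value


# A modulus of 2**64 needs two 32-bit elements.
print(elements_needed(2**64, 2**32))
```

Because every distinct sequence of elements yields a distinct
integer, no value in the combined range is favored over another.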
Chris Hanson [Tue, 25 Nov 2003 23:55:33 +0000 (23:55 +0000)]
Several changes to ISO-8601 time:
1. Allow space to separate date and time on input.
2. Generate space as separator rather than T.
3. Allow seconds to be omitted on input.
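A rough Python sketch of the three behaviors listed above. The
function names are hypothetical, and the sketch ignores time zones
and every other part of ISO 8601 that the real parser may handle.

```python
import re
from datetime import datetime

_TIMESTAMP = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"       # date: YYYY-MM-DD
    r"[T ]"                           # separator: T or a single space
    r"(\d{2}):(\d{2})(?::(\d{2}))?"   # time: hh:mm, seconds optional
    r"\Z"
)


def parse_timestamp(text):
    """Parse 'YYYY-MM-DD hh:mm[:ss]', accepting T or space as separator."""
    match = _TIMESTAMP.match(text)
    if match is None:
        raise ValueError("not a recognized timestamp: %r" % text)
    year, month, day, hour, minute, second = match.groups()
    # Omitted seconds are treated as zero.
    return datetime(int(year), int(month), int(day),
                    int(hour), int(minute), int(second or 0))


def format_timestamp(moment):
    """Write the timestamp with a space, not T, between date and time."""
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(format_timestamp(parse_timestamp("2003-11-25T23:55")))
```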
Chris Hanson [Tue, 11 Nov 2003 01:31:28 +0000 (01:31 +0000)]
Signal an error if ADD-TO-GC-FINALIZER! or REMOVE-FROM-GC-FINALIZER!
is passed a finalized object. In REMOVE-ALL-FROM-GC-FINALIZER!,
finalize each object even if the object is already gone.
Chris Hanson [Tue, 30 Sep 2003 04:33:46 +0000 (04:33 +0000)]
Second draft: this one uses a fully lazy copy of the XML structure so
that the algorithms are concise _and_ efficient. This design also
allows EQ? to be used when comparing nodes.
Chris Hanson [Fri, 26 Sep 2003 05:35:43 +0000 (05:35 +0000)]
Restrict attribute values to be strings rather than lists of strings
and entity references. In cases where we used to insert an entity
reference into an attribute value or into content, signal an error.
Create named accessors for the name and value of an attribute. Soon I
will change the representation.
Chris Hanson [Fri, 26 Sep 2003 03:56:58 +0000 (03:56 +0000)]
Major update to rationalize naming structure. The implementation of
names has been moved to its own file. There are now fully fleshed-out
XML-QNAME and XML-NMTOKEN abstractions, so that it's possible to talk
about all those names that aren't affected by namespaces (e.g.
everything in the DTD).
Chris Hanson [Wed, 24 Sep 2003 22:39:12 +0000 (22:39 +0000)]
Implement abstraction for null namespace prefix and default namespace
URI, then change their representations to be something other than #F.
Change references to namespace "URI" to be "IRI" instead. Make some
changes to enhance support for namespace declaration parsing.
Chris Hanson [Tue, 23 Sep 2003 03:37:16 +0000 (03:37 +0000)]
Use quoting so that subprocess arguments can include spaces. This
won't work with cygwin programs, but it should work fine for
alternative shells such as 4NT.
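As a generic illustration of quoting arguments that contain spaces,
here is a simplified Python sketch. The quoting rules actually used
by this commit are not stated, so quote_argument and
build_command_line are hypothetical, and the sketch deliberately
ignores the backslash edge cases of real Windows command lines.

```python
def quote_argument(argument):
    """Wrap an argument in double quotes when it needs protection."""
    if argument and not any(ch in argument for ch in ' \t"'):
        return argument  # nothing to protect, pass through unchanged
    # Escape embedded double quotes, then wrap the whole argument.
    return '"' + argument.replace('"', '\\"') + '"'


def build_command_line(arguments):
    """Join already-split arguments into one command-line string."""
    return " ".join(quote_argument(argument) for argument in arguments)


print(build_command_line(["copy", "my file.txt", "dest"]))
```

Programs that split their command line by different rules will read
such a string differently, which fits the caveat above about cygwin
programs.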