birchwood-abbey.net Git - mit-scheme.git/commit

author	Chris Hanson <org/chris-hanson/cph>
	Mon, 12 Apr 2021 05:03:47 +0000 (22:03 -0700)
committer	Chris Hanson <org/chris-hanson/cph>
	Mon, 12 Apr 2021 05:03:47 +0000 (22:03 -0700)
commit	41967043bf1019564f0dbcfa61a92773d1e37037
tree	465f446ad86ca3a103ba11ddc12a5e9c94f0e730	tree \| snapshot
parent	ca769bc6e85027156c60c4d699fb581859dd53bc	commit \| diff

Fix wastefulness in grapheme/word breaks.

The solution has two parts: the first part is to generate the transitions at a
finer grain and using hash consing for new states.  The second part is to
optimize the resulting diagram by collapsing identical states, which are an
unfortunate side effect of the first part.

The end result is a much smaller diagram, in which there is never more than one
speculative branch introduced for any input code.  I haven't measured the
performance, but this can't help but be faster just on the basis of the amount
of data being manipulated.  Now that we have a limit on the speculative
branches, it should be possible to optimize the NFA further by supporting at
most two branches rather than a list of them.