Implement character replacement on ill-formed octet sequences.
- (utf8->string bv start end #t) now replaces ill-formed octet
sequences by U+FFFD.
Existing behaviour of (utf8->string bv [start end]) is unchanged so
that utf8->string will fail noisily rather than quietly fail to be
invertible by string->utf8 on certain inputs.
- Generic I/O input now replaces ill-formed octet sequences by U+FFFD.
TODO: Add (port/set-coding-error port <action>) for <action> =
replace or <action> = error, perhaps.
TODO: This does not exactly implement the replacement algorithm
recommended as a best practice by Unicode 9, §3.9, pp. 127-129. That
algorithm is inconvenient because our decoder is factored into (a)
claiming a length based on the first code unit, and then (b)
consuming exactly that many bytes; the algorithm requires us to
refactor it so that part (b) can say `never mind' and consume fewer
bytes than (a) requested.
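
A rough sketch, in Python rather than Scheme, of what the refactoring in
the second TODO might look like. It is not the code in this commit.
Phase (a) still claims a length from the first code unit, but phase (b)
stops at the first byte that cannot continue a well-formed sequence, so
only the validated prefix is replaced by U+FFFD and decoding resumes at
the offending byte. The names claimed_length, second_byte_range, and
decode_replacing are hypothetical; the byte ranges are taken from
Table 3-7 of the Unicode standard.

```python
REPLACEMENT = "\ufffd"


def claimed_length(b0):
    """Phase (a): length claimed by the first code unit, 0 if it cannot lead."""
    if b0 <= 0x7F:
        return 1
    if 0xC2 <= b0 <= 0xDF:
        return 2
    if 0xE0 <= b0 <= 0xEF:
        return 3
    if 0xF0 <= b0 <= 0xF4:
        return 4
    return 0  # stray continuation byte, C0, C1, or F5..FF


def second_byte_range(b0):
    """Allowed range for the second byte, which depends on the lead byte."""
    if b0 == 0xE0:
        return (0xA0, 0xBF)  # excludes overlong 3-byte forms
    if b0 == 0xED:
        return (0x80, 0x9F)  # excludes surrogates
    if b0 == 0xF0:
        return (0x90, 0xBF)  # excludes overlong 4-byte forms
    if b0 == 0xF4:
        return (0x80, 0x8F)  # excludes code points above U+10FFFF
    return (0x80, 0xBF)


def decode_replacing(data):
    """Decode bytes as UTF-8, replacing each ill-formed prefix by U+FFFD."""
    out = []
    i = 0
    n = len(data)
    while i < n:
        b0 = data[i]
        length = claimed_length(b0)
        if length == 0:
            out.append(REPLACEMENT)
            i += 1
            continue
        # Phase (b): may say "never mind" and stop before `length` bytes.
        j = i + 1
        ok = True
        while j < i + length:
            lo, hi = second_byte_range(b0) if j == i + 1 else (0x80, 0xBF)
            if j >= n or not (lo <= data[j] <= hi):
                ok = False
                break
            j += 1
        if ok:
            out.append(data[i:j].decode("utf-8"))
        else:
            out.append(REPLACEMENT)
        i = j  # resume at the first byte that was not consumed
    return "".join(out)
```

For example, in decode_replacing(b"\xe2\x82A") the lead byte E2 claims
three bytes, but phase (b) gives up at "A", so the two-byte prefix E2 82
becomes a single U+FFFD and "A" is decoded normally.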