summaryrefslogtreecommitdiff
path: root/app/tools/halibut/charset/README
diff options
context:
space:
mode:
Diffstat (limited to 'app/tools/halibut/charset/README')
-rw-r--r--app/tools/halibut/charset/README37
1 files changed, 37 insertions, 0 deletions
diff --git a/app/tools/halibut/charset/README b/app/tools/halibut/charset/README
new file mode 100644
index 0000000..8eb7c25
--- /dev/null
+++ b/app/tools/halibut/charset/README
@@ -0,0 +1,37 @@
+This subdirectory contains a general character-set conversion
+library, used in Timber, and available for use in other software if
+it should happen to be useful.
+
+I intend to use this same library in other programs at some future
+date. (A cut-down version of it is already in use in some ports of
+PuTTY.) It is therefore a _strong_ design goal that this library
+should remain perfectly general, and not tied to particulars of
+Timber. It must not reference any code outside its own subdirectory;
+it should not have Timber-specific helper routines added to it
+unless they can be documented in a general manner which might make
+them useful in other circumstances as well.
+
+There are some multibyte character encodings which this library does
+not currently support. Those that I know of are:
+
+ - Johab. There is no reason why we _shouldn't_ support this, but it
+ wasn't immediately necessary at the time I did the initial
+ coding. If anyone needs it, it shouldn't be too hard. The Unicode
+ mapping table for the encoding is available at
+ http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/KSC/JOHAB.TXT
+
+ - ISO-2022-JP-1 (RFC 2237), and ISO-2022-JP-2 (RFC 1554). These
+ should be even easier if required - we already have the ISO 2022
+ machinery in place, and support all the underlying character
+ sets.
+
+ - ISO-2022-CN and ISO-2022-CN-EXT (RFC 1922). These are a little tricky
+ as they allow use of both GB2312 (simplified Chinese) and CNS 11643
+ (traditional Chinese), so we may need some way to specify which to
+ prefer.
+
+ - The Hong Kong (HKSCS) extension to Big5. Again, mapping tables
+ are available in the Unihan database.
+
+ - Other Big Five extensions, which I don't have mapping tables for
+ at all.