From 255436113adb73659428e18fd3f8e54858025ffe Mon Sep 17 00:00:00 2001
From: "B. Watson"
Date: Sat, 21 Dec 2024 04:34:10 -0500
Subject: update man page

---
 uxd.1   | 41 +++++++++++++++++++++++++----------------
 uxd.rst | 33 ++++++++++++++++++---------------
 2 files changed, 43 insertions(+), 31 deletions(-)

diff --git a/uxd.1 b/uxd.1
index cb36f29..735845a 100644
--- a/uxd.1
+++ b/uxd.1
@@ -315,13 +315,13 @@ changed with the \fB\-c\fP option (see above).
 .B \fBgreen\fP, \fByellow\fP
 Printable characters (except the space, U+0020) alternate between green and yellow.
 .TP
-.B \fBpurple\fP
+.B \fBpurple\fP, \fBcyan\fP
 Spaces and unprintable characters ("control" characters, newlines,
-tabs, etc). These are printed as "visible" characters, e.g. ␣ for
-the space, ↵ for a newline. Hopefully this is an improvement over
-the usual practice of printing these as periods, like standard hex
-dumpers do. The Unicode BOM (byte order marker, U+FEFF) is printed
-as a purple letter B.
+tabs, etc) alternate between purple and cyan. These are printed as
+"visible" characters, e.g. ␣ for the space, ↵ for a newline.
+Hopefully this is an improvement over the usual practice of printing
+these as periods, like standard hex dumpers do. The Unicode BOM
+(byte order marker, U+FEFF) is printed as a purple letter B.
 .TP
 .B \fBred\fP
 Invalid UTF\-8 sequences. These are rendered as � (U+0FFD) with
@@ -332,11 +332,20 @@ sequences are:
 .INDENT 0.0
 .IP \(bu 2
 Prefix bytes (>= 0x80) which are not followed by the correct number of continuation
-bytes (with their high 2 bits set to \fB10\fP). Rendered as \fB�\fP\&.
+bytes (with their high 2 bits set to \fB10\fP).
 .IP \(bu 2
-Continuation bytes that aren\(aqt preceded by a valid prefix byte. Rendered as \fB�\fP\&.
+Continuation bytes that aren\(aqt preceded by a valid prefix byte.
 .IP \(bu 2
-Truncated UTF\-8 sequence at EOF. Rendered as \fB�\fP\&.
+Truncated UTF\-8 sequence at EOF.
+.UNINDENT
+.UNINDENT
+.UNINDENT
+.sp
+Also, there are sequences that are valid UTF\-8 encodings, but not valid Unicode.
+These are normally rendered with a red background.
+.INDENT 7.0
+.INDENT 3.5
+.INDENT 0.0
 .IP \(bu 2
 UTF\-16 surrogates (codepoints U+D800 to U+DFFF) [\fB*\fP]. Rendered as \fBS\fP\&.
 .IP \(bu 2
@@ -346,15 +355,15 @@ Rendered as \fB>\fP\&.
 Overlong encodings (e.g. codepoints U+0000 to U+007F encoded as 2 or more bytes) [\fB*\fP].
 Rendered as \fBO\fP\&.
 .UNINDENT
-.sp
-For items marked with [\fB*\fP], the \fB\-j\fP, \fB\-p\fP, and/or \fB\-w\fP
-options can disable error highlighting for this type of error. They
-will be displayed in purple rather than red.
+.UNINDENT
+.UNINDENT
 .sp
 Each error\-highlighted sequence will increment the "Bad
 sequences" count, if the \fB\-i\fP option is used.
-.UNINDENT
-.UNINDENT
+.sp
+For items marked with [\fB*\fP], the \fB\-j\fP, \fB\-p\fP, and/or \fB\-w\fP
+options can disable error highlighting for this type of error. They
+will be displayed in purple or cyan rather than red.
 .UNINDENT
 .SH TERMINAL SUPPORT
 .sp
@@ -373,7 +382,7 @@ Known \fBnot\fP to work: rxvt (doesn\(aqt support Unicode at all), and its
 derivatives such as aterm.
 .sp
 \fBuxd\fP also builds and runs correctly on a Mac running a recent
-version of OSX (though I\(aqm not sure what terminal was used).
+version of OSX with Terminal.app.
 .SH FONTS
 .sp
 For the human\-readable column to display correctly, you\(aqll need a font
diff --git a/uxd.rst b/uxd.rst
index 6069e4e..7e35440 100644
--- a/uxd.rst
+++ b/uxd.rst
@@ -260,13 +260,13 @@ changed with the **-c** option (see above).
 **green**, **yellow**
   Printable characters (except the space, U+0020) alternate between green and yellow.
 
-**purple**
+**purple**, **cyan**
   Spaces and unprintable characters ("control" characters, newlines,
-  tabs, etc). These are printed as "visible" characters, e.g. ␣ for
-  the space, ↵ for a newline. Hopefully this is an improvement over
-  the usual practice of printing these as periods, like standard hex
-  dumpers do. The Unicode BOM (byte order marker, U+FEFF) is printed
-  as a purple letter B.
+  tabs, etc) alternate between purple and cyan. These are printed as
+  "visible" characters, e.g. ␣ for the space, ↵ for a newline.
+  Hopefully this is an improvement over the usual practice of printing
+  these as periods, like standard hex dumpers do. The Unicode BOM
+  (byte order marker, U+FEFF) is printed as a purple letter B.
 
 **red**
   Invalid UTF-8 sequences. These are rendered as � (U+0FFD) with
@@ -274,11 +274,14 @@ changed with the **-c** option (see above).
   sequences are:
 
   - Prefix bytes (>= 0x80) which are not followed by the correct number of continuation
-    bytes (with their high 2 bits set to **10**). Rendered as **�**.
+    bytes (with their high 2 bits set to **10**).
 
-  - Continuation bytes that aren't preceded by a valid prefix byte. Rendered as **�**.
+  - Continuation bytes that aren't preceded by a valid prefix byte.
 
-  - Truncated UTF-8 sequence at EOF. Rendered as **�**.
+  - Truncated UTF-8 sequence at EOF.
+
+  Also, there are sequences that are valid UTF-8 encodings, but not valid Unicode.
+  These are normally rendered with a red background.
 
   - UTF-16 surrogates (codepoints U+D800 to U+DFFF) [**\***]. Rendered as **S**.
 
@@ -288,12 +291,12 @@ changed with the **-c** option (see above).
   - Overlong encodings (e.g. codepoints U+0000 to U+007F encoded as 2 or more bytes) [**\***].
     Rendered as **O**.
 
-  For items marked with [**\***], the **-j**, **-p**, and/or **-w**
-  options can disable error highlighting for this type of error. They
-  will be displayed in purple rather than red.
+  Each error-highlighted sequence will increment the "Bad
+  sequences" count, if the **-i** option is used.
 
-  Each error-highlighted sequence will increment the "Bad
-  sequences" count, if the **-i** option is used.
+  For items marked with [**\***], the **-j**, **-p**, and/or **-w**
+  options can disable error highlighting for this type of error. They
+  will be displayed in purple or cyan rather than red.
 
 TERMINAL SUPPORT
 ================
@@ -313,7 +316,7 @@ Known **not** to work: rxvt (doesn't support Unicode at all), and its
 derivatives such as aterm.
 
 **uxd** also builds and runs correctly on a Mac running a recent
-version of OSX (though I'm not sure what terminal was used).
+version of OSX with Terminal.app.
 
 FONTS
 =====
-- 
cgit v1.2.3