trim README, minor man update.

author: B. Watson <urchlay@slackware.uk> 2024-12-12 06:26:19 -0500
committer: B. Watson <urchlay@slackware.uk> 2024-12-12 06:26:19 -0500
commit: c41ec69dbc29465ee338125f4f08168e6fcdee86
tree: 3e9d17113ff300fa1ed596ea45cb52de0d47b9aa
parent: 4df7bb4d762ff945fb7a823cb4c153cab7e3c273
download: uxd-c41ec69dbc29465ee338125f4f08168e6fcdee86.tar.gz
3 files changed, 11 insertions, 53 deletions
diff --git a/README b/README
index dead264..755501e 100644
--- a/README
+++ b/README
@@ -1,51 +1,7 @@
-uxd (Unicode-aware Hex Dumper)
+uxd (UTF-8-aware Hex Dumper)
 
-Hex dump utility that uses color to indicate multi-byte UTF-8
-sequences.
+uxd is a hex dump utility that's aware of UTF-8 multibyte sequence
+semantics, and uses colorized output to indicate which byte
+sequences go with which human-readable characters.
 
-As usual for hex dumps, output is columnar. The rightmost column
-(which would be ASCII in a regular hex dump) shows one Unicode
-character for each UTF-8 sequence in the dump.
-
-Unicode sequences in the hex column are color-coded to match their
-character in the right column. Colors alternate between a set of 4,
-to help keep track of which character goes with with byte sequence.
-
-Sample output:
-
-00000000: 41 e2 98 af e2 98 ae c2 bf c3 a1 e2 88 9e 42 0a  A☯☮¿á∞B↵
-[colors]  1  2        3        4     1        2     3  5   12341235
-
-;   0 black (don't use)
-5 = 1 red
-1 = 2 green
-4 = 3 yellow
-;   4 blue (don't use)
-2 = 5 purple
-3 = 6 cyan
-;   7 white (don't use)
-
-Colors 1 to 4 are used for successive Unicode characters. For
-instance, color 3 is used for the ☮ character, and also for its hex
-representation "e2 98 ae" in the dump. Note that the "A" and "B" are
-in the ASCII subset of Unicode, and are treated as one-byte sequences.
-If there's a BOM, it'll be in reverse video color 1 (green), and the
-printable form of it will likely be "BOM".
-
-Color 5 is for unprintable characters, with Unicode codepoints below
-0x20 (aka "control characters"), plus a few others like 0x7f (delete).
-↵ is used for newlines... note that an actual ↵ character will
-also be displayed as ↵, but in one of the 4 alternating colors.
-
-Not shown in the dump: byte sequences that have the high bit(s) set,
-but are not valid UTF-8, will be shown in color 5 (red), but in
-reverse video.
-
-Usage: uxd [options] [<filename> ...]
-
-Options should be based on xxd(1) options, though not all of them will
-be supported. If uxd-specific options exist, they should ideally use
-letters that xxd doesn't, to avoid confusion.
-
-Ideas:
-support other encodings for Unicode, like UTF-16?
+See uxd.rst for full documentation, or (after installation), "man uxd".
diff --git a/uxd.1 b/uxd.1
index 87886b3..68c4554 100644
--- a/uxd.1
+++ b/uxd.1
@@ -36,7 +36,8 @@ uxd [\fIfile\fP | \fI\-\fP]
 .SH DESCRIPTION
 .sp
 \fBuxd\fP is a hex dump utility that\(aqs aware of UTF\-8 multibyte sequence
-semantics.
+semantics, and uses colorized output to indicate which byte
+sequences go with which human\-readable characters.
 .sp
 Input is read from \fIfile\fP, or standard input if \fIfile\fP is missing or
 given as \fB\-\fP\&. The input is treated as UTF\-8 encoded Unicode. Since
@@ -66,10 +67,10 @@ There are no options yet.
 It\(aqs hard to give a proper example, since man pages don\(aqt support
 color. You\(aqll have to use your imagination. Also, this section of
 the man page requires your man command to support UTF\-8 embedded in
-the man page. If the example looks mangled, try viewing the source
+the man page. If the examples looks mangled, try viewing the source
 (uxd.rst) in a text editor.
 .sp
-Japanese characters:
+Japanese text example:
 .INDENT 0.0
 .INDENT 3.5
 .sp
diff --git a/uxd.rst b/uxd.rst
index e5a8fff..f6f3bd3 100644
--- a/uxd.rst
+++ b/uxd.rst
@@ -23,7 +23,8 @@ DESCRIPTION
 ===========
 
 **uxd** is a hex dump utility that's aware of UTF-8 multibyte sequence
-semantics.
+semantics, and uses colorized output to indicate which byte
+sequences go with which human-readable characters.
 
 Input is read from *file*, or standard input if *file* is missing or
 given as **-**. The input is treated as UTF-8 encoded Unicode. Since