NAME
    eoconv - Convert text files between various Esperanto encodings

USAGE
    eoconv [-q] --from=*encoding* --to=*encoding* [file ...]

     Options:
       --from       specify input encoding (see below)
       --to         specify output encoding (see below)
       -q, --quiet  suppress warnings

       --help       detailed help message
       --man        full documentation
       --version    display version information

     Valid encodings:
       post-h post-H post-x post-X post-caret pre-caret latex
       html-hex html-dec iso-8859-3 utf-7 utf-8 utf-16 utf-32

DESCRIPTION
    eoconv will read the given input files (or stdin if no files are
    specified) containing Esperanto text in the encoding specified by
    --from, and then output it in the encoding specified by --to.

OPTIONS
    --from=*encoding*
                     Specify character encoding for input

    --to=*encoding*  Specify character encoding for output

    -q --quiet       Suppress non-essential warning messages

    -? --help        Print a brief help message and exit.

    --man            Print the manual page and exit.

    --version        Print version information and exit.

  CHARACTER ENCODINGS
    *post-h*         ASCII postfix h notation

    *post-H*         ASCII postfix H notation

    *post-x*         ASCII postfix x notation

    *post-X*         ASCII postfix X notation

    *post-caret*     ASCII postfix caret (^) notation

    *pre-caret*      ASCII prefix caret (^) notation

    *latex*, *LaTeX* ASCII LaTeX sequences

    *html-hex*, *HTML-hex*
                     ASCII HTML hexadecimal entities

    *html-dec*, *HTML-dec*
                     ASCII HTML decimal entities

    *iso-8859-3*, *ISO-8859-3*, *latin3*, *latin-3*, *Latin3*, *Latin-3*
                     ISO-8859-3

    *utf-7*, *UTF-7*, *utf7*, *UTF7*
                     Unicode UTF-7

    *utf-8*, *UTF-8*, *utf8*, *UTF8*
                     Unicode UTF-8

    *utf-16*, *UTF-16*, *utf16*, *UTF16*
                     Unicode UTF-16

    *utf-32*, *UTF-32*, *utf32*, *UTF32*
                     Unicode UTF-32

ESPERANTO ORTHOGRAPHY
    Esperanto is written in an alphabet of 28 letters. However, only 22 of
    these letters can be found in the standard ASCII character set. The
    remaining six -- `c', `g', `h', `j', and `s' with circumflex, and `u'
    with breve -- are not available in ASCII; neither are they among the
    characters available in the common 8-bit ISO-8859-1 character encoding.
    Therefore, while the six special Esperanto characters pose no problem
    for handwritten texts, they were impossible to represent on standard
    typewriters, and are somewhat problematic even on modern-day computers.
    Various encoding systems have been developed to represent Esperanto text
    in printed and typed text.

  POSTFIX-h NOTATION
    This was the solution proposed by the creator of Esperanto, L. L.
    Zamenhof. He recommended using `u' for `u-breve' and appending an `h' to
    a letter to indicate that it should have a circumflex. However, the
    letters `u' and `h' are already part of the Esperanto alphabet, so using
    them for another purpose invites ambiguity and mispronunciation. It also
    makes conversion of Esperanto text to postfix-h notation `lossy' or
    one-way; it is generally not possible to convert from postfix-h notation
    via automated means. This notation suffers from the additional drawback
    that the text cannot be sorted with standard rules for ASCII text.

  POSTFIX-H NOTATION
    This is the same as postfix-h notation, except that `H' is used instead
    of `h' following a capital letter.

  POSTFIX-x NOTATION
    This is the most common ASCII notation encountered today. It involves
    appending an `x' to a letter to indicate that it should have an accent
    (be it circumflex or breve). Since `x' is not a letter in the Esperanto
    alphabet, no ambiguity results. However, ASCII sorting algorithms still
    fail with postfix-x text.

  POSTFIX-X NOTATION
    This is the same as postfix-x notation, except that `X' is used instead
    of `x' following a capital letter.

  PREFIX- AND POSTFIX-CARET NOTATION
    Two slightly less popular ASCII encodings are to prepend or append a
    caret (`^') to a letter to indicate that it should have an accent.

  ISO-8859-3 (LATIN-3)
    ISO 8859-3, also known as Latin-3 or South European, is an 8-bit
    character encoding for Esperanto. High-bit characters are used to encode
    the accented Esperanto letters. ISO-8859-3 can also be used for encoding
    English, Finnish, German, Italian, Latin, Maltese, Turkish, and
    Portuguese, making it useful for texts which mix Esperanto with one or
    more of these languages.

  UNICODE (ISO/IEC 10646)
    Unicode is a standard for matching every character of every human
    language to a specific code. The mapping methods are known as Unicode
    Transformation Formats (UTF). Among them are UTF-32, UTF-16, UTF-8 and
    UTF-7, where the numbers indicate the number of bits in one unit.

  LaTeX SEQUENCES
    The popular LaTeX typesetting package is capable of representing
    virtually any accented character. Note that conversion from LaTeX
    sequences assumes that characters to be accented are enclosed in braces
    -- for example, `\^{C}' will be recognized as `C' with circumflex, but
    `\^C' will not be.

  HTML ENTITIES
    Unicode codes for Esperanto characters can be escaped in HTML documents
    by using HTML entities. The codes can be represented in either decimal
    (base-10) or hexadecimal (base-16) notation; the two are functionally
    equivalent.

BUGS AND LIMITATIONS
    Because the postfix-h and postfix-H notations are inherently ambiguous,
    conversion from postfix-h or -H text is unlikely to result in coherent
    text. Use at your own risk, and carefully proofread the results.

    Report bugs to <psychonaut@nothingisreal.com>.

AUTHOR
    Tristan Miller <psychonaut@nothingisreal.com>

SEE ALSO
    charsets(7), ascii(7), iso_8859-3(7), unicode(7), utf-8(7), latex(1)

LICENSE AND COPYRIGHT
    Copyright (C) 2004-2013 Tristan Miller.

    Permission is granted to make and distribute verbatim or modified copies
    of this manual provided the copyright notice and this permission notice
    are preserved on all copies.

