$translator = new Convert::Translit($result_chset);
$translator = new Convert::Translit($orig_chset, $result_chset);
$translator = new Convert::Translit($orig_chset, $result_chset, $verbose);
$result_st = $translator->transliterate($orig_st);
$result_st = Convert::Translit::transliterate($orig_st);
Convert::Translit::build_substitutes();
Export_OK Functions:
transliterate() returns a string in $result_chset for an argument string in
$orig_chset, transliterating by the map that new() composed.
build_substitutes() rebuilds the file ``substitutes'', which holds the
approximate substitutions used when a character in $orig_chset
isn't defined in $result_chset. For example, ``Latin capital A'' may be
substituted for ``Latin capital A with ogonek''. It takes a long time to
rebuild this file, but you should never need to. Its only source of
information is the file ``rfc1345''.
new() creates an object that transliterates from $orig_chset to
$result_chset, these being names (or aliases) of 8-bit character sets
defined in RFC 1345. If only one argument is given, $orig_chset is
assumed to be ``ascii''. If three arguments are given, the third is a
verbosity flag. Verbose output lists approximate substitutions and other
compromises.
Convert/Translit/rfc1345 (IETF RFC 1345, June 1992)
Convert/Translit/substitutes
$orig_chset characters are
translated to a chosen indicator character. Transliteration is not
guaranteed to be reversible when substitutions were required. An
$orig_chset defined as 7-bit is assumed to be repeated to make
an 8-bit set (in the style of ``extended ascii''); no such adjustment is
made for $result_chset. The few mistakes in the RFC document are corrected
in the module.
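The caveats above about indicator characters and lossy round trips can be illustrated with a minimal, self-contained sketch. It does not use Convert::Translit itself: the two-entry table %to_ascii, the '?' indicator, and the helper to_ascii_sketch() are hypothetical stand-ins for the maps the module derives from the rfc1345 file, and the module's actual indicator character may differ.

```perl
use strict;
use warnings;

# Hypothetical substitution table (an assumption for illustration only):
# two Latin2 bytes and their approximate ASCII substitutes.
my %to_ascii = (
    "\xA1" => 'A',    # Latin capital A with ogonek -> Latin capital A
    "\xB1" => 'a',    # Latin small a with ogonek   -> Latin small a
);
my $indicator = '?';  # assumed indicator for bytes with no substitute

# Map a Latin2 byte string to ASCII: keep 7-bit bytes, substitute where
# the table has an entry, otherwise emit the indicator character.
sub to_ascii_sketch {
    my ($st) = @_;
    my $out = '';
    for my $ch (split //, $st) {
        if    (ord($ch) < 0x80)       { $out .= $ch }
        elsif (exists $to_ascii{$ch}) { $out .= $to_ascii{$ch} }
        else                          { $out .= $indicator }
    }
    return $out;
}

my $latin2 = "z\xB1b \xFF";             # one substitutable byte, one absent from the table
my $ascii  = to_ascii_sketch($latin2);  # substitution and indicator applied

# ASCII is a subset of Latin2, so the trip back changes nothing and the
# original accented byte cannot be recovered.
my $again = $ascii;

print "round trip ", ($again eq $latin2 ? "preserved" : "lost information"), "\n";
```

Once a substitute or the indicator has replaced a byte, nothing in the result records which byte it was, which is why the reverse transliteration cannot restore the original string.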
Convert Russian language text from IBM to ASCII encoding:
$xxx = new Convert::Translit("EBCDIC-Cyrillic", "Cyrillic");
$ascii_cyr_st = $xxx->transliterate($ibm_cyr_st);
Convert from plain ASCII (default $orig_chset) to Latin2 (Central European):
$yyy = new Convert::Translit("Latin2");
$cnt_eur_st = $yyy->transliterate($ascii_st);
Since plain ASCII is a subset of Latin2, nothing is lost in transliteration.
But going the other direction requires numerous simplifications:
$zzz = new Convert::Translit("Latin2", "ascii");
$ascii_st = $zzz->transliterate($cnt_eur_st);
Back to Latin2 again, although substitutions probably mean ($again ne $cnt_eur_st):
$again = $yyy->transliterate($ascii_st);
The example.pl script converts a Polish language phrase from Latin2 to EBCDIC-US.
Enjoy in good health. Cieszcie sie dobrym zdrowiem. Que gozen con salud. Benutze es heilsam gern! Genki dewa, yorokobi nasai.
Chris Leach, author of EBCDIC.pm
Keld Simonsen, author of RFC 1345