==== NAME ====

html2dbk - convert XHTML to DocBook.

==== VERSION ====

This describes version 0.03 of html2dbk.

==== DESCRIPTION ====

This script (and module) converts an XHTML file into DocBook, using both XSLT and heuristics (as XSLT alone can't do everything). It converts "*filename*.html" into "*filename*.xml".

By default, the input file is expected to be correct XML. Other programs, such as HTML Tidy (http://tidy.sourceforge.net/), can correct files for you; this script does not. If you give the --html option, the script will attempt to parse the file as HTML instead.

Note also that the conversion is very simple; it doesn't deal with things like
or which it has no way of guessing the meaning of.

This does not merge multiple XHTML files into a single document, so it converts each XHTML file into a , with each header being a section (sect1 to sect5). The tag is used for the chapter title.

There are likely to be validity errors, depending on how good the original HTML was. There may be broken links, <xref> elements that should be <link>s, and overuse of <emphasis> and <emphasis role="bold">.

==== REQUIRES ====

Getopt::Long
Pod::Usage
Getopt::ArgvFile
HTML::ToDocBook
Cwd
File::Basename
File::Spec
XML::LibXML
XML::LibXSLT
HTML::SimpleParse

==== AUTHOR ====

Kathryn Andersen (RUBYKAT)
perlkat AT katspace dot com
http://www.katspace.org/tools

==== COPYRIGHT AND LICENCE ====

Copyright (c) 2006 by Kathryn Andersen

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.