<!doctype linuxdoc system>
<article>
<title>html2sgml documentation
<author>Peter Antman
<date>Tue Aug 26 12:26:46 MET DST 1997
<sect>README
<p>
html2sgml is a program which converts html to sgml according to linuxdoc.dtd.
With a file in linuxdoc.dtd format you can create nicely typeset books, well
structured html-documents and so forth. linuxdoc.dtd is the format used in the Linux
HOWTOs, for example.

html2sgml is tuned to work well with Applix HTML, and will convert any
footnotes present in the Applix Word file that was used to produce the
html.

To use html2sgml you need Perl. To use the image conversion routines you also
need giftopnm, ppmtopgm and pnmtops.

To do something useful with the resulting file you also need linuxdoc-sgml or
its successor, sgml-tools <htmlurl url="http://www.xs4all.nl/~cg/sgmltools/"
name="http://www.xs4all.nl/~cg/sgmltools/">.

<sect1> Getting html2sgml
<p>
The homepage of html2sgml is <htmlurl
url="http://www.abc.se/~m9339/prog/html2sgml.html"
name="http://www.abc.se/~m9339/prog/html2sgml.html">.
It can also be fetched by FTP from <htmlurl
url="ftp://ftp.mc.hik.se/pub/users/mia95anp/html2sgml/"
name="ftp://ftp.mc.hik.se/pub/users/mia95anp/html2sgml/">.
It has also been uploaded to <htmlurl url="ftp://ftp.redhat.com"
name="ftp://ftp.redhat.com"> and <htmlurl url="ftp://sunsite.unc.edu/pub/linux/"
name="ftp://sunsite.unc.edu/pub/linux/">.
<sect1> Installation
<p>
To install html2sgml, unpack the tar file and cd into the distribution directory. Type
<verb>
make install
</verb>
It will install the programs html2sgml and mkbook, along with some files in the specified
documentation directory. These include <em>extras</em>, a couple of scripts that show
how you can merge several html-files into one for use with html2sgml. A
manual page will be installed too.

Edit the makefile to change the install location and the path to Perl on your system.
The defaults are /usr/bin/perl and prefix = /usr/local.

<sect1> Usage
<p>
See the manual page below.

<sect> Manual page
<p>
<sect1> NAME
<p>
html2sgml &mdash convert html to sgml according to linuxdoc.dtd

<sect1> SYNOPSIS
<p>
html2sgml <em>file.html</em>

<sect1> DESCRIPTION
<p>
<em>html2sgml</em> is a file converter that converts html-files to sgml-files
according to linuxdoc.dtd. It will output a file with the same name
as the specified file but with the ending html changed to sgml.
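<p>
The naming rule above can be illustrated with a small Python sketch. It is not
taken from the html2sgml source; the function name <em>sgml_name</em> is
hypothetical, and rejecting names that do not end in html is an assumption,
since the manual does not describe that case.

```python
from pathlib import PurePosixPath


def sgml_name(html_path: str) -> str:
    """Return the output name: same name, with the ending html changed to sgml."""
    path = PurePosixPath(html_path)
    if path.suffix != ".html":
        # Assumption: other endings are not described in the manual, so refuse them.
        raise ValueError(f"expected a name ending in .html, got {html_path!r}")
    return str(path.with_suffix(".sgml"))


print(sgml_name("report.html"))  # prints report.sgml
```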
<p>
It will not work on every html-file because of the
free format of html. It is tuned to work well with html produced by the
<em>Applix HTML-editor</em>. If it finds an Applix Word file in the same directory
and with the same name as the specified file, it will include any
<em>footnotes</em>
from the aw-file in the produced sgml-file.
<p>
<em>html2sgml</em> will also try to convert all included images of type gif to
postscript.
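<p>
As a rough illustration of the gif to postscript step, the Python sketch below
builds a shell pipeline from the three Netpbm tools listed in the README. The
order of the stages, the output name ending in ps, and the function name
<em>gif_to_ps_command</em> are assumptions; how html2sgml really calls these
tools is not documented here.

```python
import shlex
from pathlib import PurePosixPath


def gif_to_ps_command(gif_path: str) -> str:
    """Build a shell pipeline that turns one GIF file into a PostScript file."""
    ps_path = str(PurePosixPath(gif_path).with_suffix(".ps"))
    src, dst = shlex.quote(gif_path), shlex.quote(ps_path)
    # Assumed stage order: GIF -> PNM -> greyscale PGM -> PostScript.
    return f"giftopnm {src} | ppmtopgm | pnmtops > {dst}"


print(gif_to_ps_command("figure1.gif"))
# prints: giftopnm figure1.gif | ppmtopgm | pnmtops > figure1.ps
```

With the Netpbm tools installed, the returned string could be handed to
subprocess.run(cmd, shell=True, check=True) to perform the conversion.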
<p>
By default html2sgml produces a document of type <em>article</em>. To change
it to <em>book</em> you can use the script <em>mkbook</em>. It also fills
in a dummy name. If there is a title tag in the html-file it will use that
as the title for the sgml-file. To change this you have to edit the
sgml-file by hand.
<p>
If there is more than one <em>H1</em> tag, H1 is used as the top-level section:
everything marked H1 will become a <em>sect</em> in sgml, <em>H2</em>
will become sect1 and so forth. If there is only one or no H1, H2 will
be used as the top level instead. If there are no H* tags, then the document is broken by
design :-)
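<p>
The heading rule can be sketched in Python as follows. This is only an
illustration of the rule as described, not the program's own code. The name
<em>section_tag</em> is hypothetical, and two points are assumptions: headings
nested deeper than sect4 are clamped to sect4, and a lone H1 gets no section
tag, because the manual does not say what happens to it.

```python
from typing import Optional

MAX_DEPTH = 4  # assumption: linuxdoc offers sect, sect1 ... sect4


def section_tag(level: int, h1_count: int) -> Optional[str]:
    """Map an HTML heading level (1 for H1, 2 for H2, ...) to a linuxdoc tag."""
    # With more than one H1, H1 is the top level; otherwise H2 takes its place.
    top = 1 if h1_count > 1 else 2
    depth = level - top
    if depth < 0:
        # A lone H1 above the top level: not covered by the manual.
        return None
    depth = min(depth, MAX_DEPTH)  # assumption: clamp deeper headings
    return "sect" if depth == 0 else f"sect{depth}"


for level in (1, 2, 3):
    print(f"H{level}:", section_tag(level, h1_count=3), section_tag(level, h1_count=1))
```

The loop prints, for H1 to H3, the tag chosen when the file has three H1
headings next to the tag chosen when it has only one.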
<p>
The resulting sgml-file can then be used by <em>sgml-tools</em> (formerly linuxdoc-sgml) to produce
various other file formats, e.g. latex, info and rtf.

<sect1> TIPS
<p>
<em>html2sgml</em> should work fine with straight html, that is, when no special
layout formatting has been done. For example, it can handle html table tags, but it
cannot handle them well if they are used to produce layout.
<p>
It works best with <em>Applix html</em>. You can either
write directly in Applix Word or import a document into Applix Word. Try to use
predefined styles for your document. You can create heading1, heading2, pre,
quote and so forth. Open Applix HTML and use <em>File->Import words document
</em>. You will then get the chance to tell Applix which html-tags your defined
styles should match, e.g. heading1 -> html_h1. Then use <em>Format -> HTML document
setting</em>, where you can fill in the title; here you can also select the
option to export Applix images as gif files. This is worth doing because
html2sgml can convert the gif files to ps-files, which can be used when/if
converting to latex.

<sect1> BUGS AND FEATURES
<p>
<em>html2sgml</em> is still under development and will most probably contain
bugs. It also contains some features. Not all HTML and sgml tags are
implemented. Unimplemented HTML tags will show up in the sgml file, where you
have to edit them away by hand. Some sgml tags are also unsupported; more specifically,
no math tags are implemented. You can check the resulting sgml file with the command
<em>sgmlcheck</em> to discover any leftover tags.
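<p>
For a quick look before running sgmlcheck, leftover tags can also be listed
with a few lines of Python. This is a rough sketch and not a substitute for
sgmlcheck, which validates against the DTD. The function name
<em>leftover_tags</em> is hypothetical, and the set of known tags below is a
small illustrative subset, not the full linuxdoc element list.

```python
import re

# Matches the name of an opening or closing tag, e.g. "<em" or "</em".
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def leftover_tags(sgml_text, known_tags):
    """Return (line number, tag name) pairs for tags not listed in known_tags."""
    known = {name.lower() for name in known_tags}
    found = []
    for lineno, line in enumerate(sgml_text.splitlines(), start=1):
        for match in TAG_RE.finditer(line):
            name = match.group(1).lower()
            if name not in known:
                found.append((lineno, name))
    return found


# Illustrative subset only; a real check needs every element in the DTD.
KNOWN = {"article", "title", "author", "date", "sect", "sect1",
         "p", "em", "verb", "htmlurl"}

sample = "<sect>Intro\n<p>\nSome <em>text</em> and a <font size=2>stray</font> tag.\n"
print(leftover_tags(sample, KNOWN))  # prints [(3, 'font'), (3, 'font')]
```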
<p>
I have concentrated on making it work in English and in Swedish. This means that
there are a lot of characters that probably will not work OK, especially when
converting Applix footnotes. Look in the source and try to put in the missing
characters if you have any problems. And please send the new improved
version to me.

<sect1> AUTHOR
<p>
Peter Antman (peter.antman@abc.se)

<sect1> SEE ALSO
<p>
sgml2latex(1), sgml2html(1), sgml2txt(1), sgml2info(1), sgml2rtf(1), sgml2lyx(1)
</article> 