Internet file formats can be divided into a few groups.
First, we have the file transfer (or communication) file formats, for which the uuencode/uudecode scheme was invented a long time ago, followed by xxencode/xxdecode.
This later evolved into the base64 encoding and MIME messaging scheme that a lot of mailers use today.
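To make the idea of a transfer encoding concrete, here is a minimal sketch of a base64 round trip. It uses Python purely for brevity (that choice is an assumption for illustration, not the language used elsewhere in this chapter), and the helper names to_mail_safe and from_mail_safe are hypothetical. It only shows how arbitrary bytes become plain ASCII text and back again; it does not build a complete MIME message.

```python
import base64

def to_mail_safe(data: bytes) -> str:
    """Encode raw bytes as base64 text that consists of plain ASCII only."""
    return base64.b64encode(data).decode("ascii")

def from_mail_safe(text: str) -> bytes:
    """Decode base64 text back into the original raw bytes."""
    return base64.b64decode(text)

original = b"Hello, world!"
encoded = to_mail_safe(original)
decoded = from_mail_safe(encoded)

print(encoded)              # SGVsbG8sIHdvcmxkIQ==
print(decoded == original)  # True
```

Every three input bytes become four output characters, which is why the encoded text is roughly a third larger than the original.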
A second type of internet file format is the HyperText Markup Language (HTML), which, with all its versions and (often browser-specific) enhancements, is a true group in itself.
The third group of internet file formats is again more of an interface or communication protocol: the Common Gateway Interface (CGI), of which we can identify standard (or console) CGI and Windows CGI or WinCGI.
HTML
HTML stands for HyperText Markup Language, which is the basic language of any static page on the world-wide web.
An HTML page is plain ASCII text containing HTML tags placed between "<" and ">" (often in pairs), which can be used to add special attributes to the contents.
Internet browsers such as Netscape Navigator and Internet Explorer are just interpreters of the HTML codes in the pages they display.
Using special HTML codes, you can do just about anything with your homepage, including things like headers, bold and italic text attributes, images and even frames and tables.
The following table shows a few of the basic HTML tags we'll be seeing in more detail in this chapter.
HTML tag | text effect |
<HTML>...</HTML> | entire HTML page |
<HEAD>...</HEAD> | header section |
<TITLE>...</TITLE> | document title |
<BODY>...</BODY> | actual contents (the text section) |
<H1>...</H1> | heading (possible levels 1..6) |
<B>...</B> | bold text |
<I>...</I> | italic text |
<BR> | line break |
<HR> | horizontal rule |
<P> | paragraph |
<A HREF="URL">...</A> | link to other page or URL |
An HTML page always starts with <HTML> and ends with </HTML>.
The actual contents are put between <BODY> and </BODY> tags.
Multiple line feeds and whitespace characters are ignored (and replaced by a single space), and this is why we need special tags like <BR> and <P> in the first place.
A simple HTML page with a one-line heading and a link is as follows:
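The whitespace rule can be approximated with a small sketch. This is written in Python for brevity (an assumption for illustration only), and the function name collapse_whitespace is hypothetical. It is only a rough model of what a browser does: real browsers apply further rules, and stripping the leading and trailing whitespace is a simplification of my own.

```python
import re

def collapse_whitespace(text: str) -> str:
    """Replace every run of spaces, tabs and line feeds with a single space."""
    # \s+ matches one or more whitespace characters, including line feeds.
    return re.sub(r"\s+", " ", text).strip()

source = "Hello,\n\n\n   world!\t\tHow   are\nyou?"
print(collapse_whitespace(source))  # Hello, world! How are you?
```

Because all the line feeds in the source collapse into single spaces, the only way to force a new line or paragraph in the displayed page is an explicit tag.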
<HTML>
<BODY>
<H1>Hello, world!</H1>
<P>
<HR>
<ADDRESS>
<A HREF="http://www.drbob42.com">Dr.Bob's Delphi Clinic</A>
</ADDRESS>
</BODY>
</HTML>
Note the <ADDRESS> tag, which we can use to put address information and a link to a homepage or e-mail address, for example.
Most browsers display this information in italics.
The <A> tags are part of the foundation of HTML; they form the syntax for hyperlinks, in this case to another webpage (my homepage) at http://www.drbob42.com.
For this simple HTML page, any web browser (such as Netscape Navigator) will show one page with a heading and a link.
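To illustrate how a program can "interpret" the tags in the page above, here is a minimal sketch that extracts the hyperlink from it. It uses Python's standard html.parser module for brevity (an assumption for illustration, not the tooling used in this chapter); the class name LinkCollector and the PAGE constant are hypothetical. Note that this parser reports tag and attribute names in lowercase, so <A HREF=...> arrives as "a" and "href".

```python
from html.parser import HTMLParser

# The sample page from the text, stored as a string.
PAGE = """<HTML>
<BODY>
<H1>Hello, world!</H1>
<P>
<HR>
<ADDRESS>
<A HREF="http://www.drbob42.com">Dr.Bob's Delphi Clinic</A>
</ADDRESS>
</BODY>
</HTML>"""

class LinkCollector(HTMLParser):
    """Collect (url, link text) pairs for every <A HREF=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (url, text) pairs
        self._href = None  # URL of the <A> element currently open, if any
        self._text = []    # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None

collector = LinkCollector()
collector.feed(PAGE)
collector.close()
print(collector.links)
```

A browser does essentially the same kind of scanning, except that it renders what it finds instead of collecting it in a list.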
For more on HTML, I can recommend a good book (such as "Netscape & HTML Explorer" from The Coriolis Group); a general peek at how others write their HTML pages will be of help as well.
More HTML coverage can be found in the Databases on the Web chapter, which will, among other things, describe a Table-to-HTML converter.