How summ.org works

For those curious about the nuts and bolts of putting together a web site like this, I wrote up the following explanation.

Site Organization

Basically, the site consists of collections of articles and some index pages. I keep a (hopefully exact) copy of the website on the laptop's hard drive. As I finish an article, I copy it from my working directory to the correct place in the on-disk website. I then modify the various index pages to include the new article, so that I end up with a complete version of the updated website on disk.

Articles

   articles
      index.htm
      article-1
         index.htm
         images
            thumbs
         slides
      ...
      article-N
Each article is a self-contained directory tree which can be copied around as a unit. This is theoretically inefficient in that articles may not share images (without arranging some independent location for them) but for my purposes the extra space is a reasonable price to pay for ease of management. As a practical matter, the current website weighs in at about 60 MB and my basic web hosting agreement provides for 250 MB, so disk space is not a concern. And articles don't share that many images anyway.

The index pages do reference images (thumbnails) from the articles, but as I won't need to delete articles for another couple of years (at the current rate of site growth), I don't have concerns about keeping the links up to date.

HTML

I would love to use some nifty wysiwyg-ish sort of editor to write HTML but every one I try (most recently the one with Star Office) produces HTML which is so baroque, so spectacularly dysfunctional, that I always fall back on hand editing HTML with vi (a text editor). I'll usually have a browser up on the page I'm working on so that I can check formatting as I go. Also, wysiwyg editors usually manage to make it difficult to incorporate generated HTML code, with the result that things like incorporating images become an incredibly time-consuming series of identical dialog interactions.

The HTML necessary for the formatting I use is usually pretty small and simple relative to the content, the notable exception being images, for which I have a Perl script. The (desirable) net effect is that I spend very little time on the web site doing anything other than writing and editing pictures.

I aim for HTML 3.2 compliance, but I don't make a religion out of it. If Explorer and Netscape both render it correctly, it is good enough for me. Things like <b> and <center> are too handy and style sheets, a particularly fuck-witted attempt to turn HTML into PDF, are too broken.

Tables

Aside from <p> and the various headers (<h1>...</h1> and so on), tables are the most common HTML feature that I use for layout. I think you could come pretty close to Mondrian with tables, background colors and a lot of patience.

To get colors, I either fiddle directly with RGB values or use the Photoshop color picker if I can't home in on what I want with RGB.

Side Bars

A side bar is just a right-aligned table with a single cell:

<table cellpadding="15%" width="40%" align="right"><tr><td bgcolor="#80FCFF">
<center>
<h4>side bar title</h4>
</center>

...

</td></tr></table>
You need to specify a percentage width in order to get predictable layout results.

Floating Images Right

As a general rule, I float images out to the right margin with an align="right" in the img element. Browsers tend to screw this up if the images are too close together which, of course, depends on the size of the browser window.

Anytime I have a bunch of images close together, I use a table to hold the images and float the table to the right margin:

<table align="right">
   <tr>
      <td>
         <a href="..." ><img src="..." ... ></a>
      </td>
   </tr>
   <tr>
      <td>
         <a href="..." ><img src="..." ... ></a>
      </td>
   </tr>
   <tr>
      <td>
         <a href="..." ><img src="..." ... ></a>
      </td>
   </tr>
</table>
This keeps the browser from jumbling the images on top of each other.
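
For a long run of images, the table above is repetitive enough to generate. The following is a minimal sketch in Python (not the Perl used for the site's own scripts); the function name image_column and its (href, src) input format are assumptions made for illustration.

```python
from html import escape


def image_column(images, align="right"):
    """Build a one-column table of linked images, one image per row.

    `images` is a list of (href, src) pairs (an illustrative format).
    """
    rows = []
    for href, src in images:
        rows.append(
            "   <tr>\n"
            "      <td>\n"
            f'         <a href="{escape(href)}"><img src="{escape(src)}"></a>\n'
            "      </td>\n"
            "   </tr>"
        )
    # Floating the whole table keeps the images stacked in one column.
    return f'<table align="{escape(align)}">\n' + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(image_column([
        ("slides/a.htm", "images/thumbs/a.jpg"),
        ("slides/b.htm", "images/thumbs/b.jpg"),
    ]))
```

The width, height and title attributes are left out here to keep the sketch short; a real page would want them for the reasons given under Images.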

Page Layout

The page layout that gives the site most of its "look and feel" is just a table with one row and two cells, one for the "margin" which contains the navigation index and one for the page contents.

<table cellpadding="5%">
   <tr>
      <td width="15%" bgcolor="#FFC010" valign="top">
         <!-- navigation links here -->
      </td>
      <td>
         <div style="margin-left: 2em">

         <!-- page contents here -->

         </div>

      </td>
   </tr>
</table>
The margin-left glop is about the only style sheet stuff I can bring myself to use. Browsers seem to deal with it a little more gracefully than baling wire alternatives such as lists with no items.

The page layout stuff gets handled by a script as detailed later.

Images

Incorporating images into a document in a way that is friendly to low bandwidth users takes a little doing.

  • Images should be linked via thumbnail, a smaller version of the image. A maximum dimension of 128 pixels seems to be a reasonable size for thumbnails.
  • Thumbnail images (all images really) should specify the image size so that the browser can finish the page layout without waiting for the image to download.
  • A textual title helps the user decide if they really want to see the larger version.
  • The size of the referenced image should appear in the title so that the user will know how much of a wait to expect when viewing the full-sized image.

Here's the HTML to do all that as well as the end product:

<a
   name="amedee.jpg"
   href="slides/amedee.htm"
   title="The Amedee light, lower leading beacon shows us to be in the middle of the Passe de Boulari (52K)">
   <img src="images/thumbs/amedee.jpg"
      width=128
      height=106>
</a>
You'd go nuts typing all that for every image, so I hacked together a little Perl script, slides.pl, to do all the image grunt work. It uses ImageMagick to do the thumbnailing and extract image information. A Windows version of Perl is available from ActiveState.

In article/images I put all the images associated with an article. Running slides.pl in article does the following:

  • Generates thumbnails in article/images/thumbs.
  • Creates or updates the file article/images/idb.txt, which allows one to associate titles with images and to order images arbitrarily.
  • Creates slide pages and a slide index in article/slides.
  • Writes template.htm, a template HTML file with HTML similar to that above for each image.

To include an image, all one needs to do is open the generated template.htm, copy over the relevant line of HTML, and add any desired embellishments such as align="right".
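
The source of slides.pl is not shown here, so the following is only a sketch, in Python rather than Perl, of the two pieces of arithmetic and formatting such a script needs: scaling an image to a 128-pixel maximum dimension and emitting the link HTML shown above. The function names thumb_size and thumb_link are hypothetical, and the image's pixel size and file size are passed in by the caller rather than read with ImageMagick.

```python
from html import escape


def thumb_size(width, height, max_dim=128):
    """Scale (width, height) so the longer side is at most max_dim pixels."""
    scale = max_dim / max(width, height)
    if scale >= 1:
        return width, height  # already small enough, leave as-is
    return max(1, round(width * scale)), max(1, round(height * scale))


def thumb_link(name, title, width, height, size_kb):
    """Return the thumbnail link HTML for one image file such as 'foo.jpg'."""
    tw, th = thumb_size(width, height)
    stem = name.rsplit(".", 1)[0]  # 'foo.jpg' -> 'foo'
    return (
        "<a\n"
        f'   name="{escape(name)}"\n'
        f'   href="slides/{escape(stem)}.htm"\n'
        f'   title="{escape(title)} ({size_kb}K)">\n'
        f'   <img src="images/thumbs/{escape(name)}"\n'
        f"      width={tw}\n"
        f"      height={th}>\n"
        "</a>"
    )


if __name__ == "__main__":
    # Hypothetical image: 800x600 pixels, 52K on disk.
    print(thumb_link("foo.jpg", "An example title", 800, 600, 52))
```

For an 800x600 image, thumb_size returns (128, 96), so the thumbnail keeps the original aspect ratio.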

The images are nearly all taken with a Nikon Coolpix 5700, cropped and adjusted with Photoshop. The Coolpix (I hope somebody at Nikon regrets that name.) came with some "cool" slide-tray type software which I stopped using in favor of just copying the images directly off the USB drive which appears when one plugs the camera (can't bear to say "coolpix" again) into the laptop's USB port. For the most part, I size them about 800x600 which should work well on most displays and usually results in images less than 100k in size, 5 - 10 seconds of download time on a 56K dialup line.
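
As a rough check on those download times, here is a back-of-the-envelope sketch in Python. It assumes that "K" means 1024 bytes and that the line delivers its full nominal 56,000 bits per second with no protocol overhead; real dialup connections are slower, so these are best-case figures. The function name download_seconds is made up for the example.

```python
def download_seconds(size_kb, line_kbps=56):
    """Best-case seconds to fetch size_kb kilobytes over a line_kbps modem."""
    bits = size_kb * 1024 * 8          # assumes 1K = 1024 bytes
    return bits / (line_kbps * 1000)   # assumes the full nominal line rate


if __name__ == "__main__":
    for size in (52, 100):
        print(f"{size}K: {download_seconds(size):.1f} s")
```

Under these assumptions a 52K image takes roughly 7.6 seconds, and a full 100K image nearly 15, so the 5 to 10 second figure fits images of about 35K to 70K.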

Site Style

I gradually evolved a simple style for pages, orange navigation bar to the left, title centered at the top. After maintaining it by hand became onerous, I put together a perl/sed script to take care of changes and updates. The basic idea is to use sed scripts of this form:

/^<!-- begin-nav -->$/,/^<!-- end-nav -->$/c\
<!-- begin-nav -->\
      <a href="http://www.summ.org/">summ.org</a><br>\
      <div style="margin-left: 2em">\
         <a href="http://www.summ.org/links.htm">links</a><br>\
         <a href="http://www.summ.org/travels/index.htm">travels</a><br>\
         <a href="http://www.summ.org/reflections/index.htm">reflections</a><br>\
         <a href="http://www.summ.org/articles/index.htm">articles</a><br>\
      </div>\
<!-- end-nav -->
to manage common chunks of HTML. The script replaces everything between the two HTML comments, begin-nav and end-nav, with whatever you want, in this case, the site outline. Any multi-line hunk of HTML can be encapsulated this way. One could also extend this mechanism to intra-line replacements, but I haven't found the need.
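
For readers without sed, the same begin/end replacement can be sketched in Python. This is an illustration of the technique, not the script used for the site; the function name replace_block is an assumption. Like the sed address range, it only matches markers that sit alone on their own line.

```python
import re


def replace_block(text, name, new_body):
    """Replace everything from <!-- begin-NAME --> through <!-- end-NAME -->.

    The two marker comments are kept so the page can be updated again later.
    Text without the markers is returned unchanged.
    """
    begin = f"<!-- begin-{name} -->"
    end = f"<!-- end-{name} -->"
    pattern = re.compile(
        rf"^{re.escape(begin)}$.*?^{re.escape(end)}$",
        re.DOTALL | re.MULTILINE,
    )
    replacement = f"{begin}\n{new_body}\n{end}"
    # A function is used as the replacement so that backslashes in the
    # new HTML are inserted literally rather than treated as escapes.
    return pattern.sub(lambda match: replacement, text)


if __name__ == "__main__":
    page = "<html>\n<!-- begin-nav -->\nold links\n<!-- end-nav -->\n<p>body</p>\n"
    print(replace_block(page, "nav", '<a href="http://www.summ.org/">summ.org</a><br>'))
```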

In actuality I use this sed script for the navigation links, and two other chunks of code, a page prefix and a page postfix, which expand into the HTML table code which is used to implement the orange side bar.

A simple perl script recursively applies all the sed scripts to a directory hierarchy.
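
The driver itself is not shown, so this is a sketch of the idea in Python under a few assumptions: pages end in .htm, they are readable as UTF-8, and the edit is supplied as a function from page text to page text (standing in for one sed script). The name apply_to_tree is hypothetical.

```python
from pathlib import Path


def apply_to_tree(root, transform, pattern="*.htm"):
    """Apply transform(text) -> text to every matching file under root.

    Files are rewritten only when the text actually changes. Returns the
    list of paths that were rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        old = path.read_text(encoding="utf-8")  # assumption: UTF-8 pages
        new = transform(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Dry demonstration on the current directory with a no-op transform.
    print(apply_to_tree(".", lambda text: text))
```

Skipping unchanged files keeps their modification times intact, which makes it easier to see which pages need to be uploaded afterwards.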

A somewhat similar idea, though a bit more complex to implement would be to allow something like:

<!-- thumb-right: foo -->
meaning: "insert the template code to include a link to the picture images/foo.jpg, right aligned following this comment." This would be better than manually copying out of the template file, because changed picture titles would come into the page automatically instead of requiring the user to recopy them out of the template file. This rarely seems to be a problem so I haven't bothered to implement it. FWIW, the obvious implementation is to have slides.pl generate the relevant sed scripts which could then be applied by the recursive-sed driver.
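
Since this feature was never implemented, the following is purely a sketch of one way it might work, written directly in Python instead of going through generated sed scripts. The function name expand_thumbs and the templates dictionary (image name mapped to its link HTML) are assumptions for illustration.

```python
import re

# Matches a thumb-right comment that sits alone on its line.
THUMB_RIGHT = re.compile(r"^<!-- thumb-right: ([\w-]+) -->$", re.MULTILINE)


def expand_thumbs(page, templates):
    """Insert right-aligned thumbnail HTML after each thumb-right comment.

    `templates` maps an image name such as "foo" to its <a><img></a>
    snippet. Comments naming an unknown image are left untouched.
    """
    def expand(match):
        snippet = templates.get(match.group(1))
        if snippet is None:
            return match.group(0)
        aligned = snippet.replace("<img ", '<img align="right" ', 1)
        return f"{match.group(0)}\n{aligned}"

    return THUMB_RIGHT.sub(expand, page)


if __name__ == "__main__":
    templates = {"foo": '<a href="slides/foo.htm"><img src="images/thumbs/foo.jpg"></a>'}
    print(expand_thumbs("<p>text</p>\n<!-- thumb-right: foo -->\n<p>more</p>\n", templates))
```

As written, running this twice on the same page would insert the snippet twice. A usable version would need a closing marker after the inserted HTML, in the manner of begin-nav and end-nav, so that reruns replace the old snippet instead of adding another.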

Web Hosting

There are about a zillion web hosting companies. I chose Kanganet because it was used by a couple of other cruising yacht web sites that I had been following. And it was cheap. For about $50 a quarter (it depends on the exchange rate because Kanganet is Australian) you get 250 MB of server space hooked up to the domain name of your choice. Also, as many email addresses as you care to configure.

Of course, at that price you don't get phone support.

There are lots of different gradations of web hosting support ranging from "you get a dedicated machine in our facility" to "you get a subdirectory in our server farm." Since summ.org has no interactive features, the available server-side programming facilities aren't of much interest.

I bought the domain name, summ.org, from Network Solutions for 9 years at something like $10/year. When you buy a domain name, you get an account and password at Network Solutions which allows you to log on and supply them with the authoritative domain server IP address, which will be provided by your web hosting company.

Updates

To update the website on the server, I get an FTP connection (open ftp://summ.org/ in Internet Explorer) and then copy new or modified files across in a bottom-up fashion so that there are never broken links on the server. For example, first the new article-N directory and contents, then the new articles index, then the site index and new pages. I do this by hand. If things ever get hopelessly out of sync, I just delete everything on the server and copy the whole site back up. There are fancier FTP clients which can automatically update a hierarchy, but I haven't found the update process onerous enough to incent me to use one.
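
The bottom-up order can be approximated mechanically by sending the most deeply nested files first. The sketch below, in Python, only computes the order and does no FTP; the function name upload_order is an assumption, and directory depth is a stand-in for "what links to what", which holds for this site's layout because index pages sit above the articles they reference.

```python
from pathlib import PurePosixPath


def upload_order(paths):
    """Order site-relative paths deepest first.

    A page is then uploaded only after everything beneath it in the tree,
    so a new index never points at an article that is not there yet.
    Ties are broken alphabetically to keep the order stable.
    """
    return sorted(paths, key=lambda p: (-len(PurePosixPath(p).parts), p))


if __name__ == "__main__":
    changed = [
        "index.htm",
        "articles/index.htm",
        "articles/article-N/index.htm",
        "articles/article-N/images/foo.jpg",
        "articles/article-N/images/thumbs/foo.jpg",
    ]
    for path in upload_order(changed):
        print(path)
```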

Since I don't have a fixed address, I get an internet connection by taking my laptop to an internet cafe. Laptops have become common enough that most cafes will now let one do this. Usually the configuration is handled by DHCP but in a pinch one just copies down the setup from one of their machines before stealing its network cable.