Articles on Web Development

Note: To reduce the overlap between sections, I've removed all the articles that are focused on Ruby on Rails from the Web Development section. So be sure to check the Ruby on Rails section as well.

Ruby on Rails Resource Site Launched

After several months of development, we have finally taken the wraps off the BuildingWebApps site!

Note: I won’t be publishing any more Ruby on Rails or web development articles on this blog. This will be more of a personal blog, with everything related to web technology going on the new site. To continue receiving the Rails articles that I and my colleagues will be writing (and we have plans for lots more articles than I’ve been able to publish here), please subscribe to the BuildingWebApps articles feed.

You can also subscribe to the BuildingWebApps blog, where I’ll be writing about the process of building the site and the business.

State of the site

BuildingWebApps.com is still very much a work in progress, but we’ve gone ahead and opened it up so we can get feedback. Please take a look and let me know what you think.

We worked with Josh Woodlander and Ethan Allen at Raspberry Media on the visual and interaction design, and I think they did a fantastic job. It was a joy to work with people who have such a strong sense of graphic design and who think deeply about the challenges of effectively presenting lots of information. And Ethan is a wizard at making all the browsers behave reasonably. (Any oddities you see are probably the result of my modifications.) If you’re looking for a team to take on a significant web design project, I highly recommend them.

My goal is to build this site into a valuable resource for the Ruby on Rails community, and for people who want to learn Ruby on Rails. Over time, we’ll increase our coverage of other web-related technology as well.

The site includes both original articles and an annotated, organized set of links to hundreds of other resources around the web. We’re in the early phases of building up the content, so I realize that it will seem a little thin in places (and if you’ve read the Rails-related articles on this blog, some of it will seem very familiar). But there’s lots of great stuff to come.

I’m really interested in feedback on the usability of the site. You can also submit suggestions on the site to help us build up the content.

How is this a business?

In case you’re curious about the business model: we plan to make all the core content free indefinitely. At some point, we’ll add some premium content that will require membership or a one-time payment. We’ll probably have advertising. And we’re offering seminars.

Longer term, we believe that the application we’re building to power this site will be applicable to many other knowledge domains and communities. Blogs and wikis are all the rage, but they both have huge limitations that we believe our platform will overcome.

Don’t forget to change channels

Once again, please note that I won’t be publishing any more Ruby on Rails or web development articles on this blog. I will continue this blog to write about a variety of topics beyond web development, so please keep your subscription here if you’re interested in my random thoughts. But if you subscribed to this blog primarily for web development information, it’s time to move on:

Thanks!

BoatingSF's 15 minutes of fame

I wrote my previous post, Tracking the Cosco Busan, just after I published the animation of the ship’s track as it hit the Bay Bridge. At that point, the site was already doing more than twice its usual volume just with the increased general interest in the Bay, and in current ship positions.

Over the next few days, as half a dozen major news articles mentioned my Track of the Cosco Busan animation, the site’s traffic spiked to an all-time high.

The peak day was more than 120,000 page views, which is about what the site does in a typical month.

I was pleased to find that the VPS this site runs on seemed to have enough headroom to handle the load reasonably well. I stored an alternate version of the flash movie and its data files on S3, which provided an alternative for visitors who experienced load problems.

It was also nice to see my AdSense earnings skyrocket, however briefly. Click-through rates fell to about 50% of normal, but since traffic was 30 times normal, the short-term increase was sizable. If there were this kind of interest in ship tracks every day, my boating site could be a full-time job…

Here are some links to news articles that reference my animation of the accident:

Tracking the Cosco Busan

As you probably know, on Wednesday morning a 908-foot cargo ship, the Cosco Busan, ran into one of the San Francisco Bay Bridge towers, creating the worst oil spill in more than a decade in San Francisco Bay.

There are all sorts of questions being raised about the speed of response. In time, I suspect we’ll learn a lot. For now, speculation on just how quickly the response was mustered, whether it could have been done more quickly, and what caused any delays is just that: speculation.

Even more puzzling is how this could happen to begin with. There’s a huge opening between the bridge towers—this is not a tight fit, even for a huge cargo ship.

It just so happens that one of my sites, BoatingSF.com, tracks ship movements on the bay. Anyone who looked at my real-time ship tracking page within an hour of the accident could have seen an instant replay of the ship’s track. And it should have been reasonably simple for me to access the historical data.

My perfect storm of server data ugliness

This data comes from two AIS receivers I operate, which receive VHF signals that all commercial ships are required to send that encode their position, speed, destination, name, and dimensions. These reports, which arrive at the rate of a few per minute, are streamed up to my server, where I decode them with some custom PHP code (this predates my involvement with Ruby and Rails) and stuff them into a database. Every five minutes, a cron job extracts a summary from the database and generates an XML file. The web page has a Flash movie that reads this XML file to control the animation.

Unfortunately, for performance reasons, I don’t keep decoded position information that’s more than one hour old. I’m going to rearchitect the solution to make this possible, but when I designed this almost two years ago, it was all I could do to get it working, and other projects got in the way of further optimization.

I do archive the raw AIS data stream, so I can go back and process it later to get at historical data. Several private and government agencies have used this data for various kinds of analysis projects. Until recently, I stored this raw stream in the database.

A couple weeks ago, something went wrong, and my simplistic scheme began to torment me. My simple database configuration on an old VPS account didn’t deal well with tens of millions of records.

I had the system set up to send me an email when a database failure occurred—and I started getting 100,000 such emails a day!

There’s another article to be written here, but suffice it to say that you shouldn’t do this (and I don’t any more)—write the errors to a log file, use logrotate or somesuch to keep the files from getting too big, and then use something like logwatch to warn you when the logs have errors in them.

The fix creates a new problem

In my hurry to stop the mail deluge, I changed the code to put the raw AIS stream into a log file instead of into the database. And in the rush, I forgot that the raw AIS data lacks a built-in time stamp. So when I went to dig out the data that would show the Cosco Busan accident, I found that I had no timestamps for any of the position reports! This meant I couldn’t just look for 8:30 am Wednesday but had to analyze ship movements to find the accident.
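One way to avoid this trap when logging a raw stream is to prefix each line with the time it was received. The following is only a minimal sketch, assuming the stream arrives line by line on standard input; the function name stamp_lines, the source command, and the log path are hypothetical and are not the code that runs on the site:

```shell
#!/bin/sh
# Prefix every line read from standard input with the UTC time it was received.
# stamp_lines is a hypothetical helper name, not part of any AIS toolkit.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$line"
    done
}

# Example use (the source command and log path are placeholders):
#   some_ais_source | stamp_lines >> /var/log/ais/raw.log
```

Because the timestamp is added at write time, the log can later be searched for a specific window without having to reconstruct timing from ship movements.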

I also had to write new code to pull the reports from the log file, decode them, and stuff them into the database for further analysis. And I wanted a different zoom region for the display, which took additional work.

About 12 hours later, I had an animation of the accident completed. It doesn’t show actual time, since I didn’t have any timestamps to work with, but it animates on the assumption that the pace of ship reports is roughly constant (which it should be).

I had expected that the ship would have been heading for the space between the towers, and veered a bit off course. The reality is that the ship was going nearly parallel to the bridge, until it turned sharply and headed straight for the center tower! And there was a tug following closely behind. Unless there was a catastrophic steering failure, which seems unlikely given that the ship continued on under apparently good control, there are some people with a lot of explaining to do.

Next up: a new architecture

This debacle (my server’s, not the ship’s) has spurred me to begin thinking about a new architecture for the system. I want to be able to pull any past window of time, and zoom in on any region, without custom programming.

Perhaps I’ll move it over to Ruby while I’m at it. One of the reasons it took me so long to create the Cosco Busan animation is that it had been almost a year since I had touched the PHP code, and it is an ugly thing! It is hard to remember to put semicolons at the end of every line, and empty parentheses after function calls that take no arguments, and so forth, now that I’m accustomed to a language that doesn’t have such requirements.

Making it even more complex is the convoluted Flash code that creates the animation, which requires dealing in yet another language and the vagaries of the Flash timeline interface. I’m not sure I see a way out of having the Flash code, but at least I can get rid of the PHP.

Cleaning up a Subversion Mess in Windows

Subversion is a wonderful tool. Even on projects where I’m the only person creating code, it’s great to be able to go back to any version at any time, have an easy way to synchronize both my desktop and notebook machines to the same code base, know that I always have an off-site backup, and, of course, be able to deploy with Capistrano, which pulls code from the repository.

But Subversion can also be confusing and frustrating. On Windows, the TortoiseSVN client, which integrates into Windows as a shell extension, makes it pretty painless, as long as everything is working. And both Aptana and NetBeans have built-in Subversion clients.

Subversion Frustrations

But every now and then, I find my projects in a state in which I just can’t get files to check into the repository. A couple days ago I had a project that hung the Subversion client whenever I tried to commit the new files. It happened again the next day with a fresh set of files and a new project in the repository. I’m not sure what went wrong here, but I seem to be having a little more trouble than usual after moving projects from Aptana to NetBeans.

There’s one way in which I know I have messed up on more than one occasion. When I’m creating a new project and adding in files from other projects, I sometimes get sloppy and move a folder using Windows, or the IDE’s file manager, which brings along the .svn files from the “alien” project, which makes Subversion quite unhappy. I had thought, at first, that the IDEs with built-in Subversion support would take care of doing the right thing if you used their file browser to move a folder, but they don’t; they happily move the .svn files and leave your working copy in an unacceptable state.

Emergency Treatment for Windows SVN Files

When all else fails, I’ve found this little bit of Windows shell code to be a savior:

for /r YOURPATH %f in (.svn) do rd /s /q "%f" 

Enter this in a Windows shell, substituting the top-most folder you want to affect for YOURPATH, and it will strip every vestige of Subversion from your files (or replace YOURPATH with a period (.) and it will operate on the current folder and all those below it). If you put the command in a batch file rather than typing it at the prompt, write %%f in place of %f. If you do this with an entire project, you can start fresh and import it as a new project if you want, though you lose all the change history up to that point. You can also apply this to a single folder if you inadvertently added .svn files that relate to another repository, as when moving a folder from another project. Once you strip all the .svn files, you can add the new files to the project.
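For readers working in a Unix-style shell (Linux, Mac OS X, or Cygwin on Windows) rather than the Windows command prompt, a rough equivalent is sketched below. The function name strip_svn is made up for illustration; find with -prune and -exec is standard. Like the Windows command, this is destructive, so try it on a copy first.

```shell
#!/bin/sh
# Remove every .svn directory at or below the given folder (default: current folder).
# strip_svn is a hypothetical helper name.
strip_svn() {
    # -prune keeps find from descending into a .svn directory it is about to delete.
    find "${1:-.}" -type d -name .svn -prune -exec rm -rf {} +
}

# Example: strip_svn "/path/to/my project"
```

If the working copy is still healthy, svn export is a gentler alternative: it writes a clean copy of the tree without any .svn folders and leaves the original untouched.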

Credits: I got this bit of shell code from Wyatt Preul’s Blog, where he discusses using PowerShell as well as the regular shell. Note that he does not put the final %f parameter in quotes, which will cause the command to fail if there are any directories that have spaces in their names.

Keeping Subversion Happy

I’d appreciate any tips anyone can add about how to keep Subversion happy, and how to troubleshoot it when commit attempts just hang.

Implementing SSL

When you’re dealing with personal or otherwise confidential information, you want to know that snoops can’t listen in on your communications, and that the site you’re communicating with is the one you think it is. On the web, SSL (Secure Sockets Layer) is the standard way to do this.

Converting all or part of a site to run securely isn’t too difficult, but it does involve a number of steps. This article focuses on SSL certificates; follow-on articles will describe how to install the certificate, and how to set up Apache and Rails to use it.

What an SSL Certificate Does for You

An SSL certificate provides two separate functions:

  • It certifies, to varying degrees, the identity of the person or business that controls the web site.
  • It provides the public and private keys for encrypting communications with the site.

You can create your own certificate to achieve the second of these functions, using what is called a “self-signed” certificate. This gives you the private and public key pair that your web site and your visitors’ browsers use to encrypt the communications.

If you create your own certificate, though, it doesn’t certify anything about who you are. Each browser has a list of trusted authorities, and if your certificate wasn’t issued by a company linked to that list, a visitor’s browser will present a warning dialog asking the user if they want to trust this certificate. If the visitor accepts the certificate, it will work fine, but you probably don’t want to take the risk that they won’t, or raise any confusion in your users’ minds about how secure the communication may be. And if your users will accept a self-signed certificate, then anyone can theoretically spoof your site.
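As an illustration, a self-signed certificate and key pair can be generated with the OpenSSL command-line tool. This is a sketch only: the function name, file names, host name, key size, and one-year lifetime are assumptions chosen for the example, not recommendations from this article.

```shell
#!/bin/sh
# Generate a private key and a self-signed certificate for one host name.
# make_self_signed is a hypothetical helper; the output file names are placeholders.
make_self_signed() {
    host=$1
    # -x509 emits a certificate instead of a signing request;
    # -nodes leaves the private key unencrypted so a server can read it unattended.
    openssl req -x509 -newkey rsa:2048 -nodes \
        -keyout "$host.key" -out "$host.crt" \
        -days 365 -subj "/CN=$host"
}

# Example: make_self_signed www.example.com
# Inspect the result with:
#   openssl x509 -in www.example.com.crt -noout -subject -dates
```

A certificate made this way encrypts traffic just as well as a purchased one, but browsers will still show the trust warning described above.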

SSL Certificate Types and Suppliers

That’s why most everyone buys SSL certificates from a recognized Certificate Authority (CA), which is known to the visitor’s browser as a trusted source. The CA not only issues the certificate, which after all you can do yourself, but it verifies, to some degree, that you are who you say you are.

Prices vary greatly among certificate issuers, but there are no differences in the encryption, and differences in the identity verification probably aren’t significant (since your visitors almost surely won’t know the difference). VeriSign is perhaps the best known; they charge very high prices ($400/year to $1500/year) for certificates whose added value is the VeriSign seal, which consumers may recognize, and a bunch of other things they throw in to increase the value—insurance, security analysis, etc.

At the other extreme, a “Turbo SSL” certificate from GoDaddy costs $19.99/year, or $14.99/year if purchased along with a domain name. The encryption it supports is just as secure as the expensive VeriSign certificate; it just doesn’t do nearly as much to prove who you are, and it doesn’t come with the VeriSign brand. All you have to do to prove who you are is be able to receive mail at the domain owner’s email address (as listed in the domain registration records).

These low-cost certificates only identify the domain name that owns the certificate. If you step up to a “high assurance” certificate ($89.99/year at GoDaddy), then the certificate also identifies the business name and location. This requires that humans at the certificate issuing authority actually check to see that your business exists, independently of its domain registration; you may have to fax documents to them that “prove” you are who you say.

If you step up to an Extended Validation certificate ($500/year at GoDaddy, $1500/year at VeriSign), then you get to go through even more paperwork to prove who you are, and you must be a corporation. The browser can identify an Extended Validation certificate and indicate in some way that this is a “safe” transaction. Currently, IE 7 provides a green background for the address bar in this case. Other browsers are likely to follow. Is this worth $500 to $1500 a year? If you’re a bank, yes; otherwise, perhaps not.

Once you’ve purchased the certificate, you need to generate it, install it on your server, and configure your software to use it. Stay tuned for details on how to do this.

SXSW Podcasts

There are so many web-related conferences these days, you have to travel a lot to make it to just a fraction of them. One that I regret missing is SXSW Interactive (South by Southwest). It is a huge event, in Austin, Texas, with many thousands of attendees and dozens of panel discussions, presentations, and parties. It draws an impressive crowd of thought leaders in web design and other interactive media. There are overlapping film and music events that draw even bigger crowds.

Fortunately for those of us who missed it, many of the sessions were recorded and are available as podcasts. It’s definitely not the same as being there—not only do you miss all the personal interaction, of course, but you also don’t get to see the slides. But I’ve found many of the SXSW podcasts well worth listening to.

Here are a few of the ones I found interesting:

  • Will Wright Keynote—A fascinating talk by the creator of the Sims, about storytelling and game playing and how they are similar and different, and how games can improve their storytelling. (The second half of the talk is a demo of the new game he’s developing, which is not so interesting as audio…)
  • Tag. You’re It—Interesting anecdotes about the different ways tags are used and how they support and influence user behavior.
  • Online Publishers & Ad Networks—Targeted advertising for web publishers. Networks that connect advertising to publishers have become remarkably sophisticated.
  • A Decade of Style—A panel of web standards advocates discusses the evolution of CSS design.

Using External Web Site Monitoring Services

When your site goes down, you want to know about it as soon as possible. Ideally, you’d detect that something was going badly before a failure occurred, using monit or similar software running on your server. (See my article on Installing monit for Server Peace of Mind for details.) But some failures just aren’t detectable in advance. Furthermore, if the failure brings down your server entirely, or is in the connectivity to the server, no software running on your server will be able to notify you.

Fortunately, there’s a simple and inexpensive solution: third-party monitoring services that periodically retrieve a page from your server and send you an alert if it fails. There are many such services, most of which have a limited free option and a variety of paid plans.

My current favorite is site24x7. It has a rich set of features and flexible, reasonable pricing. The free plan will test two URLs every 60 minutes. With the paid plans, you can test as frequently as every 5 minutes (for $4/month per URL). The faster you poll, the more it costs, so you can decide for each site how critical it is to know right away, and how much you’re willing to pay to find out quickly. You can also get status reports via RSS.

Assuming you’re running a dynamic site, it’s not enough to know that the server responded to the page request; you want to ensure that the database and the application server are running. The simple way to do this is to have the service look for certain keywords on the retrieved page. As long as these keywords come from the database, this test gives you pretty good assurance that your application is running and is able to talk to the database. site24x7 offers this ability even on the free plan. On the paid plan, you can also test a web application using a series of URLs to step through a sequence of operations. It will even provide login credentials to the application if needed.
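The keyword test is easy to approximate yourself, which can help when deciding what keyword to give a monitoring service. The following is a minimal sketch, assuming curl is installed; the function names, URL, and keyword are placeholders, and this is not how any particular service implements its check:

```shell
#!/bin/sh
# Succeed only if the text on standard input contains the keyword (matched literally).
body_has_keyword() {
    grep -qF -- "$1"
}

# Fetch a page and check it for a keyword that should come from the database.
# --fail makes curl report an error for HTTP status 400 and above;
# --max-time bounds the whole request, in seconds.
check_page() {
    url=$1
    keyword=$2
    curl --silent --fail --max-time 10 "$url" | body_has_keyword "$keyword"
}

# Example:
#   check_page http://www.example.com/ "Latest articles" || echo "site check failed"
```

Pick a keyword that only appears when the database query succeeds; text from a static template or an error page would make the check pass when it shouldn’t.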

You can receive notifications in a variety of ways. There’s no charge for email alerts, and you can always view the status via the web. You can also view a variety of reports on the web, including uptime statistics and response time data. If you want an SMS alert sent to your phone, it costs $0.20 per message.

Hyperspin is another very capable service. Hyperspin offers a rich set of features and will test connections via a variety of protocols, not just http and https. It will also poll as frequently as once a minute, for $12/month per URL (most services have a minimum interval of 5 minutes). Monitoring every 15 minutes costs only $2/month per URL. It monitors from ten locations spread around the world, so you know that your site is available everywhere.

There are many other such services. Here is a partial list:

Google for “site monitoring” or “site uptime” for even more services.

Almost all of these services go beyond just telling you if your site is down. They also report response time and provide averages and plots over time, so you can see how your site is performing and how consistent it is.

Most services provide some sort of public stats page as an option, so you can use it to demonstrate to customers the uptime that you’re providing for their site. You can also use it to document any problems your hosting company may be having, which can be especially useful if you believe they’re not meeting their service level agreement (SLA). Some services even provide uptime badges that you can put on a site to show users what the uptime percentage is.

If you’re not using any site monitoring service, set up one of the free plans to become familiar with it. Then shop the various alternatives for the specific features you’re looking for. Think about how to verify that the critical parts of your application are functioning properly, decide how quickly you want to be notified, and whether you want email, SMS, or an RSS feed. Take a look at how many locations the service checks from, what their reports look like, and whether they can handle any special protocols or authentication you need. Then compare pricing among those that offer the features you want.

For a real belt-and-suspenders approach, use one company to perform a full, frequent test on a paid plan, and use one of the free services for backup in case the paid service slips up.

Extraordinary New Web Design Books

I have a shelf full of CSS and Web design books. Many were helpful in learning the technology, and several are helpful references. I thought I had enough books on this topic, but being the book junkie that I am, I couldn’t resist two new books, and they’ve delivered well beyond my expectations. I highly recommend both of them.

Transcending CSS

Andy Clarke’s Transcending CSS: The Fine Art of Web Design is an extraordinary book. It explores territory that virtually none of the other books cover, and at the same time is visually stunning itself.

Transcending CSS is written for web designers with some experience, who understand the rudiments of CSS and HTML and Web layout and want to take their skills to the next level. If you’re new to CSS, there are better books to start with. But if you’ve mastered the basics, you’ll find this book invaluable. It will help you write better CSS code, with less pain, while producing better-looking sites.

CSS, as Andy says with great understatement, is not designer-friendly. Even basic layout tasks, such as multiple columns with a footer below, are full of surprising complexity and require choosing from a myriad of approaches. Then there’s the challenge of supporting multiple browser versions—which ones do you design for? And what hacks do you use to support ill-behaved browsers?

Andy encourages readers to use modern CSS and semantic HTML markup, while also elevating visual design to a high standard. In his own words, from the book’s first paragraph:

“Transcendent CSS is more than a plea to use the latest, coolest CSS. It’s a quest to use the lessons you’re learning in CSS as a means to becoming the finest artist and designer you can be. Transcendent CSS asks you to embrace the new rather than the old and to stimulate new ways to find inspiration, create more agile and appropriate workflows for Web design, and encourage yourself to constantly learn more about both the design and the technical issues with which you work.”

The book covers a range of topics that include not only a modern approach to HTML markup and CSS styling, but also the designer/developer workflow, how to prototype a design, the use of grids, and finding design inspirations. It also includes a section on CSS3, which may be more forward-looking than most designers need right now but is nevertheless interesting.

You can think of this either as a design book with an unusual amount of coding, or a coding book mixed with great design advice. Either way, if you’ve moved beyond the basics of Web design, you won’t regret buying this book.

The Principles of Beautiful Web Design

The Principles of Beautiful Web Design, by Jason Beaird, is a very different, but equally valuable, new book. It is a pure design book and ignores coding almost entirely. This gives the author more room to show, as the title says, what makes Web sites beautiful.

As a unifying theme, the author works through the design of an example site that is used throughout the book. He also shows examples from a number of other sites to illustrate various design approaches and techniques.

The sections of the book are:

  • Layout and Composition
  • Color
  • Texture
  • Typography
  • Imagery

Experienced designers may find this book too basic, but it’s a perfect match for new designers and for developers seeking to move beyond sites that look like the developer designed them.

If you spend a few hours with each of these books, you’re almost sure to end up creating better designs in the future.

Installing monit for Server Peace of Mind

As discussed in my previous article on server monitoring, if you care about the uptime of your server, you need to have automated monitoring in place.

monit is a great open-source server monitoring tool, which you install on your Linux server. It can:

  • Constantly monitor any service or system status
  • Restart services when needed
  • Send alerts
  • Provide a monitoring console

Note: These instructions reflect the Rails Machine system configuration, and you’ll need to make adjustments for other Linux variants and directory arrangements. Thanks to Bradley Taylor for providing the instructions from which I started.

Install monit

First, install monit:

wget http://dag.wieers.com/rpm/packages/monit/monit-4.9-1.el4.rf.i386.rpm
sudo rpm -i monit-4.9-1.el4.rf.i386.rpm

(These file names reflect the current version as of this writing; you’ll need to update them as new versions are released.)

Configure monit

monit is highly configurable, and it takes a little work to get it started. First, make a backup of the original config file:

sudo cp /etc/monit.conf /etc/monit.conf.orig

Now edit /etc/monit.conf and make the following changes:

  • uncomment the line “set daemon 120”, so monit will run every two minutes
  • uncomment the line that starts “set alert” and put in the email address to which you want alerts sent
  • uncomment the line “include /etc/monit.d/*.conf”

This last change causes monit to include all .conf files from /etc/monit.d as part of the configuration. This allows you to use a separate config file for each service, which makes it easier to maintain the configuration as things change.

Configure Services to be Monitored

Now create files in /etc/monit.d to configure each service to be monitored. The file names don’t matter, since all files in this directory that end in “.conf” will be included as part of the configuration.

Monitoring Memory, Disk, and CPU Usage

The following configuration, stored in /etc/monit.d/system.conf, monitors the basic health of the server:

check system redtail.mzslater.railsmachina.com
  if loadavg (1min) > 4 then alert
  if loadavg (5min) > 2 then alert
  if memory usage > 85% then alert

check device rootfs with path /dev/sda1
  if space usage > 85% then alert

Monitoring Apache

This is my Apache monitoring configuration, which is stored in /etc/monit.d/httpd.conf:

check process httpd with pidfile /var/log/httpd/httpd.pid
   group apache
   start program = "/etc/init.d/httpd start" 
   stop  program = "/etc/init.d/httpd stop" 
   if failed host 127.0.0.1 port 80
        protocol HTTP request /monit/token then restart
   if 5 restarts within 5 cycles then timeout

Monitoring MySQL

The MySQL configuration goes in the file /etc/monit.d/mysqld.conf:

check process mysql with pidfile /var/run/mysqld/mysqld.pid
  group database
  start program = "/etc/init.d/mysqld start" 
  stop program = "/etc/init.d/mysqld stop" 
  if failed host 127.0.0.1 port 3306 protocol mysql then restart
  if 5 restarts within 5 cycles then timeout

Monitoring Mongrel Cluster

If you’re running a Rails application, you’ll also want to monitor whatever you’re using as the Rails server. In my case, it is two Mongrel instances. Here’s my basic configuration, stored in /etc/monit.d/mongrel.conf:

check process mongrel-8000 with pidfile /var/www/apps/[appname]/shared/log/mongrel.8000.pid
     start program = "/usr/bin/mongrel_rails cluster::start -C /etc/mongrel_cluster/[appname].conf" 
     stop program  = "/usr/bin/mongrel_rails cluster::stop -C /etc/mongrel_cluster/[appname].conf" 
     # restart if memory usage too high
     if totalmem is greater than 60.0 MB for 5 cycles then restart   
     # send an email to admin if cpu load too high
     if cpu is greater than 50% for 2 cycles then alert                  
     # restart if hung process
     if cpu is greater than 80% for 3 cycles then restart
     if loadavg(5min) greater than 10 for 8 cycles then restart
     # if problems repeat, call the sys-admin    
     if 3 restarts within 5 cycles then timeout                                         
     # check for response
     if failed port 8000 protocol http                                                         
        with timeout 10 seconds
        then restart
     group mongrel

Replace [appname] with the name of your application, and 8000 with the port number that your first mongrel instance is using. The paths may also be different for your installation.

Repeat this configuration code for each mongrel instance, updating the port number for each.
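Rather than copying the block by hand, the per-port stanzas can be generated. The loop below is a sketch under assumptions: mongrel_stanza is a made-up helper, the application name myapp and the ports 8000 and 8001 are illustrative, and only a few of the tests from the configuration above are reproduced to keep it short.

```shell
#!/bin/sh
# Print a monit stanza for one Mongrel instance.
# mongrel_stanza is a hypothetical helper; paths follow the configuration shown above.
mongrel_stanza() {
    app=$1
    port=$2
    cat <<EOF
check process mongrel-$port with pidfile /var/www/apps/$app/shared/log/mongrel.$port.pid
     start program = "/usr/bin/mongrel_rails cluster::start -C /etc/mongrel_cluster/$app.conf"
     stop program  = "/usr/bin/mongrel_rails cluster::stop -C /etc/mongrel_cluster/$app.conf"
     if totalmem is greater than 60.0 MB for 5 cycles then restart
     if failed port $port protocol http
        with timeout 10 seconds
        then restart
     if 3 restarts within 5 cycles then timeout
     group mongrel

EOF
}

# Emit one stanza per port; redirect the output to a file in /etc/monit.d to use it.
for port in 8000 8001; do
    mongrel_stanza myapp "$port"
done
```

Generating the stanzas from one template keeps the instances from drifting apart when a threshold is changed later.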

Configure Apache to Not Log monit Requests

Since monit is going to be hitting your web site regularly, you don’t want it cluttering up your logs and distorting your statistics. To stop Apache from logging monit’s requests, add the following lines to the end of your httpd.conf file (in my case, this file is in /etc/httpd/conf).

SetEnvIf Request_URI "^\/monit\/token$" dontlog
CustomLog logs/access.log common env=!dontlog

You may need to modify the second line to reflect where you’re putting your logs. And, of course, if you’re using a server other than Apache, you’ll need to use that server’s configuration syntax to disable logging of requests to /monit/token.

Since we’ve changed its configuration, reload Apache so the change takes effect:

sudo /sbin/service httpd reload

Create a test page

Create an empty file for Monit to request from Apache to test that the web server is alive:

mkdir /var/www/html/monit
touch /var/www/html/monit/token

Start Monit

Finally, enable and start Monit:

sudo /sbin/chkconfig monit on
sudo /sbin/service monit start

We're Back: Colocation Provider Bungles for Two Days

We are (we hope) finally beyond the worst downtime in this site’s short history. Alas, my customers’ sites were down as well, for two periods on Friday and for most of the day Saturday.

What went wrong? There is a chain of helplessness that appeared to end at AtlantaNAP. The company that provides my hosting, Rails Machine, was as helpless as I was to fix the problem. Rails Machine operates about 20 servers in a hosting facility called SiteSouth. SiteSouth, as it turns out, operates a cage full of server racks in a large facility called AtlantaNAP.

Update: In the original version of this article, I laid the blame squarely upon AtlantaNAP. I got a call from one of their representatives on Monday morning stating that the problem was, in fact, SiteSouth’s, and that AtlantaNAP had done what they could to help them out. From my vantage point, I can’t tell who was really at fault here. I’ve updated the article to reflect this ambiguity.

Either AtlantaNAP or SiteSouth had some sort of router problem that apparently caused some or all of the servers in SiteSouth’s cage, including all those operated by Rails Machine, to lose connectivity. AtlantaNAP has many connections to the Internet to provide redundancy, since reliable connectivity is one of the core attributes for a hosting provider. But if a router between those multiple connections and your site fails or is misconfigured, then it doesn’t matter how many connections to the Internet there are on the other side of the router.

That this kind of failure can happen is understandable. For a colocation facility to take an hour and 45 minutes to correct it is bad, but tolerable if it’s an extremely rare event. But for the failure to recur twice more that same day, and then for many hours the next day, is really unforgivable.

I suspect some finger-pointing may continue between SiteSouth and AtlantaNAP. Whoever was to blame, both companies are likely to lose some business over this. Within an hour of the first outage, Bradley Taylor at Rails Machine was looking for a new hosting facility, at least as a supplement, and surely that effort intensified as the outages kept repeating, almost unbelievably.

Here’s the log from the Site24x7 monitoring service, showing the start time and total downtime for each outage:

  • April 6, 7:51 AM: 1 Hrs 46 Mins
  • April 6, 9:55 AM: 1 Hrs 17 Mins
  • April 6, 2:14 PM: 13 Mins 30 Secs
  • April 7, 9:52 AM: 4 Hrs 21 Mins
  • April 7, 3:47 PM: 13 Mins 51 Secs
  • April 7, 5:33 PM: 1 Hrs 29 Mins
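
To put those numbers in perspective, here is a small Python sketch that totals the downtime for each day. The durations are copied from the log above; the names are just for illustration.

```python
from collections import defaultdict
from datetime import timedelta

# (day, duration) pairs copied from the monitoring log above
OUTAGES = [
    ("April 6", timedelta(hours=1, minutes=46)),
    ("April 6", timedelta(hours=1, minutes=17)),
    ("April 6", timedelta(minutes=13, seconds=30)),
    ("April 7", timedelta(hours=4, minutes=21)),
    ("April 7", timedelta(minutes=13, seconds=51)),
    ("April 7", timedelta(hours=1, minutes=29)),
]


def downtime_by_day(outages):
    """Sum the outage durations for each day."""
    totals = defaultdict(timedelta)
    for day, duration in outages:
        totals[day] += duration
    return dict(totals)


if __name__ == "__main__":
    for day, total in downtime_by_day(OUTAGES).items():
        print(day, total)
    print("Total", sum((d for _, d in OUTAGES), timedelta()))
```

That works out to 3 hours 16 minutes on April 6 and just over 6 hours on April 7, or a bit more than 9 hours 20 minutes in all.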

What Happened?

I don’t know yet, and may never know, what really happened. Here’s what I could see from the outside.

Here’s the tracert output for the path from my provider (Comcast) to AtlantaNAP, where the trace bounced back and forth repeatedly between two IP addresses (it repeated more times than I’ve shown here) but never reached my server:

  6    11 ms     9 ms    21 ms  te-8-1-ar01.sfsutro.ca.sfba.comcast.net [68.87.192.137]
  7    10 ms     9 ms     9 ms  68.86.143.9
  8     *        *       14 ms  68.86.90.165
  9    11 ms    11 ms    11 ms  64.215.30.201
 10    71 ms    68 ms    70 ms  NLAYER-COMMUNICATIONS-INC.ge-4-1-0.410.ar4.ATL1.gblx.net [206.41.25.230]
 11    69 ms    71 ms    69 ms  atl-core-a-tgi2-1.gnax.net [209.51.149.105]
 12    70 ms    69 ms    69 ms  63.247.69.182
 13    69 ms    69 ms    70 ms  209.51.156.5
 14    69 ms    69 ms    71 ms  209.51.156.6
 15    69 ms    69 ms    69 ms  209.51.156.5
 16    71 ms    69 ms    69 ms  209.51.156.6
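
The telltale sign in this trace is an address that shows up more than once: a healthy route never visits the same hop twice. A script can spot that pattern. This is a minimal sketch, assuming you have already pulled the hop addresses out of the tracert output (the parsing isn't shown, and the function name is my own):

```python
def find_routing_loop(hops):
    """Return the first address that reappears later in the path, or None."""
    seen = set()
    for addr in hops:
        if addr in seen:
            return addr
        seen.add(addr)
    return None


# Hops 11 through 16 from the failing trace above
failing = [
    "209.51.149.105",
    "63.247.69.182",
    "209.51.156.5",
    "209.51.156.6",
    "209.51.156.5",
    "209.51.156.6",
]

if __name__ == "__main__":
    print(find_routing_loop(failing))
```

For the failing trace, the function reports 209.51.156.5, the first address the packets come back to.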

When the problem was finally (I hope!) fixed Saturday evening, the route changed, getting quickly to my server. All the changes occur in AtlantaNAP’s (or SiteSouth’s) routing. Here’s the current route:

  6    15 ms     9 ms     *     te-8-1-ar01.sfsutro.ca.sfba.comcast.net [68.87.192.137]
  7    11 ms    11 ms     9 ms  68.86.143.9
  8     *       12 ms     *     68.86.90.165
  9    11 ms    12 ms    11 ms  64.215.30.201
 10    69 ms    69 ms    69 ms  NLAYER-COMMUNICATIONS-INC.ge-4-1-0.410.ar4.ATL1.gblx.net [206.41.25.230]
 11    84 ms    72 ms    71 ms  atl-core-a-tgi2-1.gnax.net [209.51.149.105]
 12    70 ms    69 ms    69 ms  63.247.69.182
 13    70 ms    71 ms    68 ms  207.210.123.118

I hope to have some news in the next few days, in part about what went wrong but, more important at this point, about Rails Machine’s plan to move beyond SiteSouth and AtlantaNAP.

Automated Website Monitoring

When you’re using a shared host, you’re generally depending on the hosting provider to monitor your system and keep it healthy. But if you’re running your own server, whether a VPS (virtual private server) or a dedicated server, you’re the one responsible for setting up monitoring and paying attention to the results (unless you’re paying for a fully managed server).

Automated monitoring is a crucial technique for increasing the reliability of your systems. If you can catch a problem while it is developing, you can resolve it proactively instead of waiting for it to take your server down and create a panic.

Manually monitoring the server is far too much work to do on an ongoing basis. If this is how you’re checking up on your site, it’s only a matter of time before something goes wrong and you don’t notice it until it’s been a problem for some time. Monitoring is a task that scripts can perform much better than humans.

Fortunately, there’s a wide assortment of free and low-cost tools that automate the vast majority of the work and, once set up, make it almost painless to closely monitor the operation of your server. You can get emails or SMS alerts when any sort of threshold is crossed, so you’ll know as soon as possible if there’s something that should be looked into. And failed services can be restarted automatically.

No one tool does everything. You need to look at the system in different ways, from different angles, and with different timescales. So it takes a handful of tools providing complementary functions.

Note: There are many different programs and services available for each of these tasks. The ones I list here are the ones I ended up using, and they worked for me. If you’re using something else that you like, please add a comment. I’m introducing the products briefly in this post, and will go into more detail on each in a dedicated post.

Monit

Monit is a program that you install on your server and configure for the services that you want it to monitor. For each service, you can specify a series of conditions (such as memory or CPU usage rising above a threshold) that will cause an alert to be generated, and other conditions that will cause an automatic attempt to restart the service. You can also configure it to provide a password-protected web page for viewing the statistics for all the monitored services.

Logwatch

Logwatch is a program you install on your server and typically set to run once a day. It scours the multitude of system and application logs and creates a daily summary, which you can receive by email. The program is highly configurable but comes with a set of defaults that will get you started quickly. (I’m not entirely sure that I feel better, however, knowing that there were 612 attempts to log in as root yesterday, and 485 different common names were attempted as logins.)

Site24x7

Site24x7 is a hosted service that requests pages from your web sites at regular intervals, generates alerts for slow or failing loads, and logs performance statistics. It can look for specific words on the page, so you can ensure that the database is working and the application is functioning at some level. You can set it up to send an SMS message to your cell phone if a site goes down or crosses a performance threshold.
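
To make the idea concrete, here is a rough Python sketch of what one such check involves: load a page, confirm a keyword is present, and flag slow responses. This is not how the hosted service is implemented; the function names, thresholds, and messages are my own assumptions.

```python
import time
import urllib.request


def evaluate(status, body, elapsed, keyword, max_seconds):
    """Return a list of problems found in one page load (empty means healthy)."""
    problems = []
    if status != 200:
        problems.append(f"unexpected HTTP status {status}")
    if keyword not in body:
        problems.append(f"keyword {keyword!r} missing from page")
    if elapsed > max_seconds:
        problems.append(f"slow response: {elapsed:.1f}s")
    return problems


def check_site(url, keyword, max_seconds=5.0, timeout=10.0):
    """Fetch url once and evaluate it; network errors are reported as problems."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            status = resp.status
    except OSError as exc:  # URLError, HTTPError, and socket timeouts are all OSErrors
        return [f"request failed: {exc}"]
    return evaluate(status, body, time.monotonic() - start, keyword, max_seconds)
```

Pick a keyword that only appears when the page has been rendered from the database, so a static error page can't pass the check. The advantage of a hosted service over running a script like this yourself is that it checks from outside your network, so it still works when your own server or its connection is down.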

Google Analytics

Google Analytics you no doubt already know about. It is a great, free, hosted analytics service. It’s more for looking at the traffic patterns than ensuring that everything is running, but a regular check on the analytics can be another source of hints about issues that need investigation.

Each of these tools needs a post of its own to do it justice. Those posts will be forthcoming over the next few days.

Getting Ready to Deploy

I’ve not had time to do many blog posts lately, largely because I’ve been in the thick of getting my first business-critical Rails app ready to deploy. This application will serve a six-doctor medical practice by allowing patients to submit requests for appointments, referrals, prescription refills, and records transfers via the web. In addition to the public-facing site, there’s an admin site for the office staff to track and process the requests.

This site won’t be high-traffic on the scale of a general consumer site, but it needs to be robust and have very little downtime. The app itself is in pretty good shape, and I’ve been working on enhancing my hosting infrastructure. In particular, I’ve been working on:

  • Moving from a single VPS to a staging VPS and a production VPS.
  • Implementing SSL for secure pages.
  • Automating frequent database backups.
  • Installing monitoring software.

Since I haven’t come from a Unix background, along the way I’ve had to engage in a crash course in Linux system administration.

In the next few days, this application will be launched, and then I’ll have a series of posts describing what I learned in each of these areas. Stay tuned.

In Praise of Akismet

The spam nightmare continues, but thanks to Akismet, it’s been reduced to a minor nuisance on this blog.

Akismet was created by Matt Mullenweg of WordPress. Thankfully, the folks who run WordPress didn’t keep this to themselves, but opened it up to all types of blogs—and even other applications, such as forums. Anything that accepts user comments should be using this.

Here’s how it works. When a comment is received on your blog, or a post on your forum, or whatever, your software first submits the comment to Akismet via its open API. Akismet does its magic and tells you whether the comment is spam or not. If Akismet blesses it, then your software goes ahead and posts the comment. If not, it puts it in a holding pen, where you can double-check that it is really spam before deleting it. If you’re using WordPress, you can just download the plug-in. If you’re using Mephisto (the Ruby on Rails application that runs this blog), then it is built in. There’s a wide assortment of libraries and plug-ins for other platforms as well.
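
For the curious, the round trip looks roughly like this in Python. Treat it as a sketch based on my understanding of Akismet's comment-check call (a form-encoded POST that is answered with the text "true" or "false"); check the endpoint and field names against the current API documentation before relying on it. The function names and the User-Agent string are my own.

```python
import urllib.parse
import urllib.request


def build_comment_check(api_key, blog_url, user_ip, user_agent, author, content):
    """Build the URL and form-encoded body for an Akismet comment-check call."""
    # ASSUMPTION: key-as-subdomain endpoint; verify against the current docs.
    url = f"https://{api_key}.rest.akismet.com/1.1/comment-check"
    fields = {
        "blog": blog_url,
        "user_ip": user_ip,
        "user_agent": user_agent,
        "comment_type": "comment",
        "comment_author": author,
        "comment_content": content,
    }
    return url, urllib.parse.urlencode(fields).encode("utf-8")


def is_spam(response_text):
    """Interpret the response body: 'true' means spam, 'false' means not spam."""
    answer = response_text.strip().lower()
    if answer not in ("true", "false"):
        raise ValueError(f"unexpected Akismet response: {response_text!r}")
    return answer == "true"


def check_comment(api_key, **comment):
    """Submit one comment and return True if Akismet flags it as spam."""
    url, body = build_comment_check(api_key, **comment)
    request = urllib.request.Request(
        url, data=body, headers={"User-Agent": "ExampleBlog/1.0"}
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return is_spam(resp.read().decode("utf-8"))
```

Your software would call check_comment before publishing and route anything flagged as spam to the holding pen.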

Understandably, the Akismet folks don’t disclose just how they decide what’s spam, but in my experience, it has been 100% accurate. They do have a vast volume of messages to learn from: since the service started, they’ve detected a staggering 643,803,210 spam posts, and they see millions a day. A revolting 94% of all posts submitted are spam.

The spammers are getting a little more clever, but Akismet is one step ahead of them. The 20 or so posts a day I’ve been getting for male dysfunction remedies are linking back not to the site of any spam company, but instead are linking to posts on other blogs and forums where the spam has been posted. So the link is to a legitimate place, which is unknowingly hosting the spam message. These posts come in bursts of three to five, each with a different email name attached and with a slight variation of the text, but clearly they all come from the same place. Akismet has gotten them all, so you never see them, and all I have to do is do a quick daily scan of the quarantined posts and click “delete all.”

Once you’ve installed a plug-in or integrated one of the libraries, you need to get an API key. This identifies each user of the service and helps the WordPress folks monitor use of the system and control abuse. An API key is free for non-commercial bloggers (which they define as anyone making less than $500/month from their blog). If you’re a “pro blogger,” you can get a key for just $5/month, which is well worth it. Enterprise subscriptions start at $50/month for 5 blogs. Non-profits can use the service for free if they provide some back-links to help promote the service, or for half-off the enterprise prices if not.

With this service available, there’s really no excuse, other than the need to implement the API interface, for any software to be posting spam. If we can eliminate the ability to post spam, we can take the upside out of this dirty business and send the scum who post spam comments off to some other misguided pursuit.

Great Free Web Developer Tools

During a five-year stint at Adobe, which I recently left, I became accustomed to the world of expensive software. Now that I’m doing web development on my own, I’ve been amazed at the diversity and quality of free tools. I think this has some serious implications for the future of the software business, but that’s a topic for another post… right now, I just want to rave about a few of these great tools.

Firebug—A Must-Have Web Developer Tool

The Firebug extension for Firefox is a life-changing product for web developers. If you’re doing web development and you’re not already using it, go get it right now! Among many other capabilities, you can:

  • Edit the HTML and CSS for a page you’re viewing and see the results instantly. This is very nice when you’re tweaking stuff to get it just right. It doesn’t save the files, so when you’re done you still have to go fix your source files, but it makes quick iterative experiments much easier.
  • View the source for the page in a pane, either the HTML, or the CSS, or the JavaScript.
  • Hover over an item on the page and see the corresponding HTML code highlighted, or vice versa, and see the cascade of styles applied to any element.
  • See the timing for how the page was loaded, file by file.
  • Debug the JavaScript code.

And lots more. Don’t work on web development without this.

IE Tab—Internet Explorer within Firefox

While we’re on the subject of Firefox extensions, I find IE Tab to be a great help. Browser rendering issues are a major nightmare in web development. Install the IE Tab Firefox extension, and with one click you can switch the rendering engine for any tab back and forth between IE and Firefox. My only wish is that it also would let me switch between IE 6 and IE 7.

RadRails—Ruby on Rails IDE

RadRails is a cross-platform Ruby on Rails development environment based on Eclipse. You get not only a nice Ruby editor, but also an integrated Subversion client and database browser. There’s integrated support for creating Mongrel servers for each app, and an integrated browser for viewing them. There’s also GUI access to the various Rails scripts, RI, and RDoc, though I’ve found this part less compelling.

Cygwin—Linux utilities for Windows

Cygwin is a tremendous collection of GNU tools ported to Windows. I know, if you have a “real” computer you don’t need this… but for those of us using Windows machines and also doing Linux server work, it’s a great convenience.

SQLyog—GUI for MySQL Database Management

SQLyog is a very nice GUI for managing MySQL databases. The free community edition unfortunately lacks some key features, like SSH tunneling, but it is still quite useful. And the Enterprise version is less than $50 for a single user (and even less for non-commercial use).

PuTTY—SSH Client for Windows

PuTTY is a Telnet and SSH client for Windows, and an essential tool for working with remote Linux servers from Windows. Also part of the full PuTTY package is Pageant, which enables you to use SSH keys on a Windows box for authentication with remote systems. (See my article on Using SSH Keys to Speed Login.)

WinSCP—SFTP Client for Windows

WinSCP is an open-source SFTP client for Windows. This makes it easy to securely browse remote filesystems.

JungleDisk—Cross-Platform Client for Amazon S3

JungleDisk turns Amazon’s S3 storage service into an easy-to-use virtual drive on a PC, Mac, or Linux box. This is a very inexpensive way to get easy remote storage. Great for backups or for moving files between systems.

Painless CSS Navigation Buttons

Creating web site navigation using HTML lists and styling them with CSS has a certain elegance to it, and I’ve used it for all the sites I’ve built in the past two or three years. But I still find the CSS styling painful and often struggle to get just the effect I want. Like so much of CSS, it has just enough complexity to take something that seems reasonably simple, make it not quite intuitive, and provide a multitude of subtle ways to go wrong.

I recently came across the Listamatic page at the excellent CSS tutorial site from maxdesign, a site that provides several dozen quality examples of horizontal and vertical navigation buttons styled with CSS. For each version, the site provides a simple and clean presentation, with a working illustration, the corresponding HTML and CSS code, and a link to the originator’s site. There are also tutorial sections on the box model, browser issues, and the other arcana that make this simple task just a little complicated. I’ve found it to be an excellent resource.

SSH Keys for Subversion

It turns out that my article on using SSH keys to speed login isn’t quite complete, assuming you’re using Subversion. You need to take one more step to enable Subversion (SVN) to use the private key generated by PuTTYgen: adding a line to Subversion’s configuration file.

(Note: I’ve now updated that article to include this item.)

Subversion’s configuration file is located in the Application Data directory under your user account. The full path is:

C:\Documents and Settings\{your windows user name}\Application Data\Subversion\config

Note that Application Data is a hidden folder, so to locate this file you must have Windows set to show hidden files and folders.

Open the config file in any plain text editor (such as Notepad) and add the following line to the [tunnels] section:

ssh = $SVN_SSH plink.exe

plink.exe is the command-line connection tool that is included with PuTTY.

You’ll also need to make sure that the PuTTY directory is listed in your system’s Path.

Using SSH Keys to Speed Login

(Updated: added tip on stopping pageant DOS window from popping up, and integrated formerly separate post on using Subversion)

In a previous post, I described how to set up SSH access from a Windows system to a remote Linux server. With this basic setup, you have to enter your password every time you log in to the server, which is not unreasonable from a security perspective. But if you want to automate tasks and use deployment tools such as Capistrano, you’ll end up typing that password over and over again, even for a single deployment process. Fortunately, there is a mechanism to avoid this while still preserving good security. But, as with most such things in Windows, it takes a little effort to set it up.

SSH authentication uses public key cryptography, in which you have a private key available only to you on your local system, and a matching public key that can be published on your server. Authentication software can confirm that the public and private keys match, but hackers cannot derive your private key from your public key. Once you set up a public-private key pair, these keys can be used to authenticate your SSH sessions, and you won’t ever have to type your password again.

There are a couple of different programs you can use to accomplish this; I’m going to explain how to do it with PuTTY and its associated programs, PuTTYgen and Pageant. If you installed the full PuTTY package as recommended in my previous article, you’ll have all three programs already installed. If not, download the installer and run it now. (Be sure to get the full package, under the heading “A Windows installer for everything except PuTTYtel,” and not just putty.exe.)

Creating Your Keys with PuTTYgen

To create your public-private key pair, run PuTTYgen. There are several types of keys, but SSH-2 RSA is the most common and is selected by default. (If this doesn’t work, you’ll need to check with your host to see what type of key their SSH server is expecting.) The number of bits defaults to 1024, which is fine. So all you have to do in the PuTTYgen window is click the Generate button, and then wiggle the mouse around a bit. The mouse movements generate random data that ensures that your key is unique.

When PuTTYgen is done creating the key, it will show a long string of characters that make up the public key. Select this text, copy it, and paste it into a file named something like id.pub (using Notepad or any simple text editor). I made a folder at the root level of my C drive called SSH to store these keys and other related info, but you can put it anywhere you can find it later. (Note: you can also click the Save Public Key button and enter a file name, but this file won’t work as an alternative to the id.pub file we created with copy-and-paste. It includes line break characters that confuse the server-side SSH code.)

Now you need to save your private key. If you just click the Save Private Key button, PuTTYgen will ask if you really want to save it without a passphrase, because we didn’t enter one. Here you have a choice to make between convenience and security.

The passphrase is essentially a password for accessing the key. Once you have your public key uploaded to your server (which we’ll do shortly), anyone who has access to your private key will have access to your server. If you use password protection on your PC, and you’re the only one with access to it, you might be comfortable going without a passphrase. But it is safest to use a passphrase, and we’ll soon see how you can make it so you only need to enter it once each time you boot your system. So to set a passphrase and save the private key:

  • Enter it twice, once in the Key Passphrase field and once in the Confirm Passphrase field. Keep in mind that this passphrase is essentially the key to accessing your server, so make it a robust password.
  • Click the Save Private Key button, and enter a file name (no extension) for your private key. The .ppk extension is automatically appended.

You now have your key pair and are done with PuTTYgen. Next you need to upload your public key to your server and set up your PC to access your private key.

Uploading Your Public Key

The details of uploading your public key may vary depending on the server configuration. The instructions below are for Rails Machine and are derived from the Mac and Linux oriented instructions they provide.

Open an SSH session to your server (using PuTTY, or another client if you prefer, as described in my previous post). You probably have more than one user account; in my case, following the recommended practices from the Rails Machine folks, I have a root account that I never log into directly, and regular user accounts of Michael and Deploy. The Deploy account is the one I use for almost all communication with the server. So log into that account, or its equivalent for your setup. You’ll have to manually enter the password one last time.

Now, in the shell window that is connected to your server, create a directory for the public key file:

mkdir ~/.ssh

This creates a directory named .ssh within your home directory, which is where the SSH server will look for the public key.

Now set the permissions for this directory so you, but only you, have all privileges:

chmod 700 ~/.ssh

Now you have a directory on your server to hold your public key, and you need to move the key up there. There are various tools you can use to do this. One tool you should become comfortable with is scp, or secure copy. It is not built in to Windows, but there is a version of it that comes with PuTTY, called pscp. If you add the path to the PuTTY program directory to your system path, you’ll be able to use pscp in any command window. (You may also want to install a set of Unix-style utilities; you can install the entire Cygwin environment, or if you want something lighter weight just for SSH-related tasks, get just the OpenSSH utilities. In either case, make sure to add to your Windows system path the folder in which these programs are stored, so you can use them from any command window without having to type their full path.)

To copy the public key, follow these steps:

  • Open a Windows shell in the folder in which you’ve stored your public key. (If you installed the Command Here utility as I recommended in the previous article, you can just right-click the folder and choose Open Command Window Here.)
  • In the command window, type

pscp id.pub username@hostname.com:~/.ssh/authorized_keys

(Of course, you’ll need to replace “username” with your actual user name, and “hostname.com” with the name of your server. If you’ve named your public key something other than id.pub, replace that name as well. Finally, if you’re using scp from OpenSSH instead of PuTTY’s pscp, drop the p in the command name.) This will copy your public key to a file called authorized_keys in the .ssh directory in your home directory, replacing any authorized_keys file that is already there.

Finally, to make the key file a little more secure, go back to your SSH window (remember, we started there but then switched to the Windows console), and type:

chmod 600 ~/.ssh/authorized_keys

This ensures that only the owner of this file (that’s the user name you began your SSH session with) can read or write it.
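
If the account already has an authorized_keys file with other keys in it, copying a new file over it will wipe them out. In that case you can append your key on the server instead. Here is a minimal Python sketch of that alternative, assuming Python is available on the server; the function name is my own, and it applies the same permissions as the chmod commands above.

```python
import os
from pathlib import Path


def install_public_key(key_line, home=None):
    """Append key_line to ~/.ssh/authorized_keys, keeping any existing keys."""
    ssh_dir = Path(home or Path.home()) / ".ssh"
    ssh_dir.mkdir(exist_ok=True)
    os.chmod(ssh_dir, 0o700)  # same as: chmod 700 ~/.ssh

    auth_file = ssh_dir / "authorized_keys"
    existing = auth_file.read_text().splitlines() if auth_file.exists() else []
    key_line = key_line.strip()
    if key_line not in existing:  # don't add the same key twice
        existing.append(key_line)
    auth_file.write_text("\n".join(existing) + "\n")
    os.chmod(auth_file, 0o600)  # same as: chmod 600 ~/.ssh/authorized_keys
    return auth_file
```

You would upload id.pub to your home directory under its own name, then call install_public_key with the contents of that file.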

Making Your Private Key Available in Windows

OK, we’re almost there. Now we need to enable Windows programs making SSH connections to access your private key file. You could set PuTTY to use the key file, but that doesn’t buy you much, since it will ask for the passphrase every time you open a connection, and it won’t be available to other programs (such as Capistrano). So, you need to use another program called Pageant, which is installed along with PuTTY, to load the key into memory and make it available to other programs.

You can run Pageant directly via Start > All Programs > PuTTY > Pageant, and then you can tell Pageant to load your private key. But assuming you want the private key to always be available, you want it to load automatically upon startup. To do so, create a text file called load_private_key.bat (or whatever), with the following contents:

start "Pageant" "c:\Program Files\PuTTY\Pageant.exe" c:\ssh\id.ppk

Note that you’ll need to change the path to Pageant.exe if you didn’t install PuTTY in its default location. The id.ppk file is the private key file that you generated from PuTTYgen. (Using the “start” command, rather than simply providing the path to Pageant directly, prevents a DOS window from being left on the screen. Thanks to Tim Jervis for this tip.)

Finally, add this batch file to your startup tasks (click Start > All Programs, right-click Startup and choose Open, then drag the load_private_key.bat file into the Startup folder with the right mouse button and choose Create Shortcut from the menu that appears when you release the mouse).

Now, when you reboot your system, the batch file will run, Pageant will load your private key, and you’ll be prompted for the passphrase that you specified when you created the key. Enter this passphrase just this once, and your private key is now available to all SSH functions. When you shut your computer down, everything is secure again.

Setting up Subversion

If you’re using Subversion, you need to take one more step to enable it to use the private key generated by PuTTYgen: adding a line to Subversion’s configuration file.

Subversion’s configuration file is located in the Application Data directory under your user account. The full path is:

C:\Documents and Settings\{your windows user name}\Application Data\Subversion\config

Note that Application Data is a hidden folder, so to locate this file you must have Windows set to show hidden files and folders.

Open the config file in any plain text editor (such as Notepad) and add the following line to the [tunnels] section:

ssh = $SVN_SSH plink.exe

plink.exe is the command-line connection tool that is included with PuTTY.

You’ll also need to make sure that the PuTTY directory is listed in your system’s Path.

Unfortunately, plink insists on popping up a DOS window, which is annoying. If anyone knows how to stop it from doing this, please let me know!

You’re Done!

That was simple, wasn’t it? :-) This may seem like a lot of trouble to go to just to avoid having to type your password, but once you’ve set this up, you’re done. And if you’re using an automated deployment tool such as Capistrano, you’d otherwise have to type your password multiple times for a single deployment (since one deployment involves multiple SSH commands and other actions); with this setup, it can be fully automated.

Remote Linux Admin for Windows Users

All the cool kids in the web world these days seem to be using Macs, which have hearts of Unix so are natural complements to Linux-based servers. Others are running Linux desktops. So a lot of the remote server administration information on the web assumes that you’re on either a Mac or a Linux box.

For historical reasons, however, I have a collection of Windows systems, and they’re what I’m comfortable with. I also have some things, like my collection of 60,000 photos managed in the Photoshop Elements Organizer, that aren’t easily moved to a Mac, and I have lots of Windows applications that I own and am familiar with. So while I don’t have any religious feelings about it (please, spare me the Mac evangelism), I’m using Windows systems to remotely administer my web servers.

This really isn’t a problem, as there are ample tools available to make Windows do most everything a Linux system does, or at least everything you need to do to administer one remotely. But it does take a little more effort, at times, to track down the right tools and figure out how to apply them. If you’re early in this process, this article may help. (If, on the other hand, you’re a grey-beard Linux hacker or a Mac die-hard, you can stop reading now.)

Although there are GUI interfaces for Linux, remote administration is done predominantly from the command line. And if you want to follow the well-greased paths for deploying Rails applications, you’re going to be living in a command-line world. This is, of course, rather alien in the Windows environment.

There are really two command-line environments you need to use: the Windows command shell, for taking actions on your local machine, and a Linux shell, for interacting directly with your server. The Windows shell is essentially a grown-up version of the old DOS prompt. Linux shells come in a variety of versions, with Bash being the most common.

Enhancing the Windows Command Window

You need to use the Windows shell to control your local development environment, and with some extensions (to be described in a later post), you can use it for some tasks that involve your remote server as well.

To open a Windows command shell, you can select Run from the Start menu and then enter cmd and click OK. But there’s a better way: Microsoft offers a free add-on that lets you open a command window by right-clicking on any folder and choosing a new option that the add-on installs, “Open Command Window Here.” Aside from being quicker than the Start > Run > cmd approach, it opens the command window with the current directory set to the folder upon which you right-clicked. Download the command window PowerToy. Installation is entirely painless and will make your life just a little bit simpler.

Now you should customize your command window settings, as the defaults are pathetic. The window has no menu, so it may not be immediately clear how one customizes it. The secret is to open any command window, right-click on the title bar, and choose Properties. Once in the properties dialog, here are some things you might want to change:

  • In the Options tab, check the boxes to enable Quick Edit Mode and Insert Mode. This enables you to copy and paste text with the mouse (you can’t use Ctrl-C and Ctrl-V like you can in a GUI environment). To copy, select the text and then click the right mouse button. To paste, just click the right mouse button.
  • Also in the Options tab, change the Buffer Size to 999, and the Number of Buffers to 5. This gives you more memory for past commands. At any command prompt, press the up arrow repeatedly to move back through previous commands. This can save a lot of typing.
  • In the Layout tab, increase the Screen Buffer Height to 2500, so you’ll have more text you can scroll back through after it scrolls off the top of the window. Increase the Screen Height to provide a window as tall as you’d like; I prefer 75 for my 1200-pixel-high monitors.
  • In the Colors tab, change the text and background colors if you’d like. White text on a black background is traditional and has a retro appeal, but I prefer black text on a white background.

When you’re done making changes, click OK, and then choose Save Properties for Future Windows in the dialog that appears. From now on, you’ll have a much nicer command window to work with.
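If you’d rather see the settings above written down than click through the dialog, here is a minimal sketch in Python. It rests on an assumption that is not part of the walkthrough above: that Windows keeps console defaults as DWORD values under HKEY_CURRENT_USER\Console, with ScreenBufferSize and WindowSize each packing the height into the high 16 bits and the width into the low 16 bits. The 80-column width is also my assumption (the dialog’s usual default), and the function names are just illustrative. The sketch only computes and prints the values; it doesn’t touch the registry.

```python
def pack_size(width, height):
    """Pack a console size into one DWORD: height in the high word, width in the low word."""
    if not (0 < width <= 0xFFFF and 0 < height <= 0xFFFF):
        raise ValueError("width and height must each fit in 16 bits")
    return (height << 16) | width


def console_settings(buffer_width=80, buffer_height=2500, window_height=75):
    """Return the dialog choices from this post as (assumed) registry value names."""
    return {
        "QuickEdit": 1,                  # Options tab: Quick Edit Mode
        "InsertMode": 1,                 # Options tab: Insert Mode
        "HistoryBufferSize": 999,        # Options tab: Buffer Size
        "NumberOfHistoryBuffers": 5,     # Options tab: Number of Buffers
        "ScreenBufferSize": pack_size(buffer_width, buffer_height),  # Layout tab
        "WindowSize": pack_size(buffer_width, window_height),        # Layout tab
    }


# Print each value in the hexadecimal form a registry editor would display.
for name, value in console_settings().items():
    print(f"{name} = 0x{value:08x}")
```

If you do decide to apply values like these yourself, export the existing key first so you can restore it; the Properties dialog remains the safer route.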

Get Set Up for SSH

Although you can use the Windows command prompt to act upon your remote server, the primary method used to access Linux systems remotely is SSH (Secure Shell). There’s no SSH client built into Windows, but good free clients are available. The most popular is PuTTY. Download the PuTTY installer package. Choose the download labeled “A Windows installer for everything except PuTTYtel”, which will get you the complete set of PuTTY utilities, some of which you’ll want later.

Run PuTTY, and you’ll see a deceptively simple window. There are actually lots of options here, which you can explore by clicking the categories on the left. But you can get started by using all the defaults and simply entering the name of your host (or its IP address) in the Host Name field and clicking Open. (To save yourself a little typing in the future, you can enter a name under Saved Sessions and click Save, and then the next time you can just double-click this name in the Saved Sessions list.)

Assuming PuTTY is able to connect to your host, you’ll then see another of those lovely white-text-on-black windows (you can change these settings in the initial PuTTY dialog), with a Login: prompt. At this prompt, enter the user name your host assigned you, and then you’ll get a password prompt. Enter the correct password, and you’ll be online talking to your server, with essentially all the control that a user sitting at the machine has. All data sent back and forth is securely encrypted, so no one will be able to sniff your network traffic and figure out how to get into your server (unlike FTP, in which not only your files but also your user name and password are sent in clear text).

If you aren’t able to connect to your server (even to the point of getting a Login prompt), then check the following:

  • Make sure your host has enabled SSH access. If you have a shared hosting account, it might not be offered, or you might have to ask for it.
  • Make sure you have the host name right. This should be simply the domain of your web site. If it is a new account and you haven’t set the DNS yet, you can use the IP address.
  • If all else fails, check with your host to see if they’ve moved SSH to a port other than the standard 22. Some companies are doing this to reduce brute-force attacks. You can enter any port number in the PuTTY dialog.
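The checks in this list can be roughed out as a small Python script. This is only a sketch under my own assumptions: the function name probe_ssh and its result strings are made up for illustration, and it relies on the fact that an SSH server announces itself with a line beginning "SSH-" as soon as you connect. It tells you whether the name resolves, whether anything answers on the port, and whether what answers looks like SSH; it does not attempt to log in.

```python
import socket


def probe_ssh(host, port=22, timeout=5.0):
    """Return a short diagnosis for an SSH endpoint (hypothetical helper)."""
    # Step 1: does the host name resolve? (An IP address passes straight through.)
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror:
        return "dns-failure"

    # Step 2: is anything listening on the port, and does it send a greeting?
    try:
        with socket.create_connection((address, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            banner = conn.recv(255)
    except OSError:  # covers refused connections and timeouts
        return "no-connection"

    # Step 3: an SSH server identifies itself with a banner starting "SSH-".
    if banner.startswith(b"SSH-"):
        return "ssh-ok"
    return "not-ssh"
```

For example, probe_ssh("example.com") checks the standard port, and probe_ssh("example.com", 2222) checks an alternate one; both the host name and the port 2222 here are placeholders. A "dns-failure" result points at the host name, "no-connection" suggests SSH isn’t enabled or has been moved to another port, and "ssh-ok" means the problem is more likely your login details.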

If, on the other hand, you get the login prompt but it doesn’t accept your user name or password, double-check that you have these exactly correct. For some hosts, you may need to use “name@domain.com” and not just “name” for your login name. Check the signup material you received when you opened the hosting account.

Once you have these two command-line environments in place, you have the essential tools to both control your local development environment and to administer your server. Now you just need to know what to type into these windows :-). More on that in future posts…

Moving to More Serious Web Hosting

I’ve been building web sites for almost 10 years, but I’ve only recently made the transition into more robust hosting accounts. For years, I got by with shared hosting, in which the host runs hundreds or thousands of web sites on a single server. These accounts are by far the most pervasive because they’re inexpensive (some as low as a few dollars a month, and most less than $20 per month) and they don’t burden the user with any of the complexities of server administration.

I began to outgrow shared hosting as my San Francisco Boating site, BoatingSF.com, started to become popular, and I started seeing failures because I was hitting database connection limits. I also added a feature that displays real-time boat positions, and to make this work I needed to be able to run background tasks and make socket connections. So last spring I made the leap to a virtual private server (VPS).

Virtual Private Hosts

A VPS is like having your own dedicated server, but instead of having the entire server to yourself, at a cost of at least $100/month and typically more like $200/month or even higher, you get a fraction of a physical server. Virtualization software, typically either Virtuozzo or Xen, makes the single machine appear as multiple, independent Linux boxes, one for each VPS account. Each VPS account has its own Linux installation, with its own Apache server, MySQL server, and so forth. You get root access to the (virtual) machine, so you have nearly complete control.

With this control, of course, comes more responsibility. The fact that you have root access means that not only can you change anything, you can also break anything. Hosts vary in how much assistance they provide, but in general they expect you to be able to administer the server (though they will typically handle installation of the OS, MySQL, and other major applications).

Many VPS installations offer some kind of control panel as an option; cPanel and Plesk are the two most common options. These control panels give you a GUI interface, accessed via your web browser, that allows you to do most administration without knowing much about Linux or having to use a command line. The control panel makes it easy to add accounts and domains without having to know how to edit the Apache configuration file and administer user accounts. But I found that the control panel sometimes made it hard to understand what was really going on, and as I strayed from the straight and narrow there were things I couldn’t figure out how to do without getting down and dirty with Linux.

Hosting Rails

Moving to Rails added another complication. Rails hosting is a bit trickier than PHP, especially if you want the best performance and reliability. So while I left my PHP sites on my initial VPS, I got a Rails-oriented shared hosting account from Rails Playground. That’s where this blog is hosted as I write this. But I’m now preparing to work on several Rails sites, some of which I hope will have modest usage levels, and some of which will be deployed on behalf of businesses that expect high reliability. Even with this blog’s light loads, I’ve found page load times to be highly variable, sometimes extending to several seconds, and FeedBurner sometimes times out fetching the feed.

In pursuit of these goals, I decided to get another VPS account from a Rails-oriented host that I felt could provide good performance and reliability, and that would also have some help documents to guide me through the process of doing things right. My VPS for my PHP sites is at ServInt, and while they’ve been reliable and did eventually install Rails for me, they don’t really support it and certainly didn’t have much to offer in terms of advice.

I read some good reviews of Media Temple’s Grid Server, which is sort of like shared hosting but across a large grid of machines instead of just one. They offer a Rails “container” at a reasonable price ($45/month with 256 MB of RAM), so I gave it a try. But I quickly became disenchanted because of a few issues. First, as I read more blog posts, I discovered that the Grid was experiencing significant downtime. While it should, in theory, be more reliable than a single server, in practice it is a complicated setup that clearly is not quite mature. Second, because of the grid arrangement there are a number of things you have to do specially, and you don’t have the level of control that a VPS solution provides. And finally, I found that they only offered a single Mongrel process for each Rails application, which is not recommended for sites that have significant load. So I decided to bail on Media Temple, even though the price was attractive. Once this solution matures a bit, it may be worth a second look.

Next, I decided to go for a solution that I was confident would be an excellent, robust Rails implementation. One of the best-known Rails hosts is TextDrive. To get beyond their shared hosting accounts, however, you’re looking at a minimum of $100/month, and they are not VPS accounts, so some of the things I want to do may be problematic. Engine Yard is another high-end host that seems to really know what it is doing when it comes to serious Rails performance, but their packages start at $250/month.

My Answer: Rails Machine

In the end, I decided to go with Rails Machine. At $75/month for their basic VPS account, it cost about two-thirds more than Media Temple’s $45/month Rails container, with similar specs, but I was a lot more confident about how it would perform. Rails Machine is a relatively new company that is focused on simple, robust Rails hosting. It is run by Bradley Taylor, who is the author of mongrel_cluster and created the railsmachine gem to streamline deployment. I like the idea of having this kind of talent at the hosting company, especially at a company small enough that Bradley answers support requests personally.

In the next couple of weeks, I hope to have this blog switched over to Rails Machine. This is not a control-panel environment, so after 30 years in computing without significant Linux experience, I’m climbing up that learning curve. Much of the available hosting administration reference material assumes that you already know your way around Linux pretty well. And the Linux administration books typically aren’t oriented toward remote administration of an already configured machine and have lots of detail that you don’t need to know. The knowledge you need is spread across a dozen books and sites, with each containing lots of information you don’t need and none covering the whole spectrum of information you do need. But since I have time now to dive into this learning process, it’s a fun challenge.

I’ll be following up with a series of posts on what I’ve learned along the way, in the hope that I might be able to simplify the process for others.