
$100 Drupal Site Series: Part 4 - Platforms

So far in this series we have covered a potential target market and business plan, resources and infrastructure, and the tools required to deliver Drupal sites with a sale price of $100 per site. In this post I'll cover some of the considerations when building Drupal platforms or distributions.

The sites which customers deploy will need to be based on a custom Drupal distribution or "distro". The distro should be modular and primarily driven by Features.

Customers shouldn't have to know anything about administering Drupal when they first buy their site. A customer should be able to turn functionality on and off as they want, through a simple user interface.

Features

The platform should contain a good collection of Features. The following list is an example of what you might offer customers:

  • Contact Form
  • Image Gallery
  • Products
  • Services
  • "Static" Pages
  • Blog
  • News
  • Mailing Lists
  • Social Network Integration
  • Office / Store Locations
  • Staff Profiles

When developing your list of things to include in the site, think in terms of functionality a small business would want, not what modules you should be using. The list of modules should be derived from the functionality, not the other way around.

As the features included in the platform will be modular and generically useful, you should consider releasing them publicly, via your own features server or drupal.org as full modules.

On top of the features listed above you will probably need to include some custom glue code to enhance the user experience. In the first post in this series I discussed the target audience not having high-level computer skills, so the user interface should take this into account. Some of the language may need to be changed, form options modified to use sane defaults, and some options hidden from the user altogether.

Security

As each server may have hundreds or even thousands of sites running on it, security will be an important consideration. As with all servers, ensure it is properly locked down and only running the services you need. Apache should be configured to block access to most things except index.php, the relevant client-side files (images, CSS, JS) and the files directory. At the Drupal level, make sure that things like the PHP filter module aren't enabled and that secure coding practices are adhered to. The account given to your customer shouldn't be user 1; they should be user 2, with restricted permissions that give them access only to what they need.
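
As a rough sketch of the Apache side, one option is to deny direct access to every PHP file except index.php (scripts such as cron.php and update.php then need to be handled another way, for example via drush or an allowed IP range):

<FilesMatch "\.php$">
    Order allow,deny
    Deny from all
</FilesMatch>
<Files index.php>
    Order allow,deny
    Allow from all
</Files>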

I strongly recommend that you read Cracking Drupal by Greg Knaddison.

Sales and Support

In order to attract customers you will need a site to promote the service and allow customers to sign up and hand over their credit card details. Drupal now has three ecommerce projects, Drupal e-Commerce, Drupal Commerce and Ubercart; you should investigate which of these best suits your needs. The sales system will need some custom code to hook into Aegir, which will be managing the actual site deployments. The sales and support platform(s) should be managed in a similar manner to the customer sites.

Once you have paying customers, you will also need to provide them with resources such as detailed documentation, video walkthroughs, forums and possibly a ticketing system. These resources can either be part of the sales site or live on a separate site. In the next instalment I'll cover support in more detail.

Deploying Platforms

The whole process needs to be highly automated; CPU cycles are a lot cheaper than workers. Building and deploying platforms should involve a few clicks or tweaking a configuration file. For example, platforms could be built as Debian (or Ubuntu) packages (aka debs) using an automated build process that runs a test suite against the code base before building a deb. The debs could then be deployed using Puppet, and a post-installation script could notify Aegir that the platform has been installed successfully. The whole process should involve very little human interaction. Migrating client sites to upgraded platforms could also be automated using a simple script which adds a migrate task for each site.
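
As a very rough sketch, assuming the platform code lives in a git repository that already contains Debian packaging, the build step might look something like this (the repository URL is a placeholder, and the test and Aegir steps are left as comments):

#!/bin/bash -e
# Check out the latest platform code (the repository URL is a placeholder).
git clone git://git.example.com/platform.git platform-build
cd platform-build
# Run your test suite here (e.g. simpletest via drush against a throwaway
# site) and let any failure abort the build thanks to "bash -e".
# Build the Debian package, assuming a debian/ directory lives in the repo.
dpkg-buildpackage -us -uc -b
# Push the resulting .deb to your apt repository. Puppet then installs it on
# each web server, and a post-installation script notifies Aegir, for
# example by queuing a verify/import task for the new platform.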

What's Next?

Now that we have the service almost ready to go, we should look into how we are going to get customers to part with their cash and how we will support them once they have paid.

$100 Drupal Site Series: Part 3 - Tools

In the previous instalment of my $100 Drupal site series I covered resources and infrastructure. In this post I will be covering the development tools I think you need in order to build and sell Drupal sites at the $100 price point. Given that Drupal 7 is close to release, it is assumed that the sites will be built using D7. I don't believe that it is smart to invest heavily in Drupal 6 for new long term projects, given D8 could be out in 18 months and D6 would then be unsupported.

git

You are going to need a version control system for storing all of the code. There are many different version control systems available, but I think git is the most flexible and powerful option. Git allows for distributed development and offline commits, and, best of all, Drupal is switching to git. Gitorious is an open source clone of GitHub which allows you to browse your git repositories and manage the integration of code from all of your developers.
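
A sketch of one possible approach is to keep the whole platform in a single repository and tag each release that gets built (the remote URL is a placeholder):

# Turn an existing platform checkout into a git repository.
cd platform
git init
git add .
git commit -m "Initial platform code"
# Publish it to your Gitorious (or other) server and tag the first release.
git remote add origin git@git.example.com:platform.git
git push origin master
git tag -a 1.0.0 -m "First production platform"
git push origin 1.0.0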

If you are new to git I strongly recommend you get a dead tree copy of the book Pro Git.

drush

Drush is the DRUpal SHell, a command line interface for Drupal. It is an invaluable tool for developing and maintaining Drupal sites. Everything from simple things like running "drush cc all" to clear all caches while developing a site, through to syncing databases from one server (or continent) to another, saves many hours. Even if you're only running a handful of sites, you should have drush installed on all of your servers.
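
A few examples of the kind of commands you will use every day (the @live and @staging site aliases are assumed to be defined in a drush aliases file):

# Clear all caches on the current site.
drush cc all
# Download and enable a contrib module.
drush dl views && drush en -y views
# Pull the live database down to a staging copy using site aliases.
drush sql-sync @live @staging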

Modules

There are a few modules which you should be using if you want to be able to create polished Drupal based platforms. My list includes:

  • Features module allows developers to package up configuration as a Drupal module, including access to all the normal hooks that a module can implement. Features is version-control friendly and includes the ability to reset a site to a known good state, either via the Drupal web GUI or by using drush on the command line (see the example commands after this list).

  • Strongarm exports values from Drupal's variables table, allowing sane defaults to be set. This means your site will always be configured the way you want it to be.

  • Context allows you to manage contextual conditions and reactions for different portions of your site. It gives you control over displaying blocks, breadcrumb hierarchy, themes and other context-sensitive configuration options.

  • Admin is a menu module which makes it quick and easy to access administrative items. (Disclaimer: I co-maintain the module)

  • Coder can perform static analysis of your code to ensure it is secure and complies with the Drupal coding standards. The module can also be used to upgrade code from one major version of Drupal to the next.

  • Devel is a collection of tools to help developers create Drupal sites and modules. Devel integrates with Admin to provide easy access to the devel actions.

  • WYSIWYG provides a pluggable framework for integrating rich text editors into a Drupal site. Given the target market for the $100 sites, we can't expect our users to hand craft HTML, they'll be expecting "something like Word".

  • Skinr provides a way for end users to apply predefined CSS to designated parts of a site, without the user having to understand CSS.
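
As a taste of how Features fits into day to day work, here are a few of the drush commands the module provides (the feature name is hypothetical):

# List all features on the site and whether they are overridden.
drush features-list
# Re-export a feature after tweaking it through the admin UI.
drush features-update my_blog
# Revert an overridden feature back to the state defined in code.
drush features-revert my_blog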

Installation Profiles

When most people install Drupal for the first time they are blissfully unaware that they are using an installation profile. An installation profile sets out the base configuration for a site when Drupal is installed. Until recently installation profiles contained a list of modules to install and a large chunk of hand-crafted PHP code which set up all of the configuration for a site. These days most of the configuration can live in Features, so an installation profile can be very lightweight. It is worth reviewing some of the more popular installation profiles, such as Open Atrium, Managing News or Drupal Commons, for inspiration. Installation profiles have changed a lot in Drupal 7, so you will need to port the ideas to the new way of doing things.
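
In Drupal 7 a lightweight profile can be little more than a .info file listing the features to enable, plus a near-empty .profile file. A sketch of the .info file (the profile name and dependency list are made up for illustration):

name = Hundred Dollar Site
description = Base configuration for $100 customer sites.
core = 7.x
dependencies[] = features
dependencies[] = strongarm
dependencies[] = context
; Hide the other profiles from the installer.
exclusive = 1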

Drush Make

Drush Make is a tool for building Drupal platforms. Drush Make allows developers to specify all the components of their site in a text file, then use the command line tool to "make" the site. It is possible to specify a version of Drupal core (including Pressflow), contrib modules and themes, third party components, patches and external libraries.
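
As a sketch, a minimal make file might look something like the following (the module selection, library name and URLs are illustrative only):

api = 2
core = 7.x

; Drupal core.
projects[] = drupal

; Contrib modules, placed in sites/all/modules/contrib.
projects[views][subdir] = contrib
projects[features][subdir] = contrib
projects[context][subdir] = contrib

; Apply a patch to a contrib project (the URL is a placeholder).
; projects[views][patch][] = http://example.com/some-fix.patch

; A third party library.
libraries[colorbox][download][type] = get
libraries[colorbox][download][url] = http://example.com/colorbox.zip

The platform is then built by pointing drush at the make file and a target directory, for example "drush make platform.make /var/aegir/platforms/example", after which Aegir can verify and use the new platform.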

Aegir

Aegir is a Drupal site deployment and management tool built using a collection of Drupal modules. With Aegir you can manage your DNS, HTTP (web) and database servers from a common UI. It is possible to move a batch of sites from one server to another in a matter of minutes, not hours, with the downtime measured in seconds. When security fixes are released, testing can involve a few minutes of work, then a few more clicks to deploy the fix to all of your sites.

Workflow

Instead of me explaining development workflows, I'll defer to Miguel Jacq (aka mig5). Miguel's blog post entitled "Drupal deployments & workflows with version control, drush_make, and Aegir" is considered by many to be the key work on modern Drupal development workflows.

What's Next

Many of the tools I've covered today should be in your Drupal developer's tool bag. My next post will cover what I think are the important considerations in building the platforms for the service.

Travelling, Speaking, Scaling and Aegiring

The next couple of months are going to be a crazy ride. I will be visiting at least 7 countries, speaking on 8 or more days in a 5 week period. The talks will be focused on Drupal and Aegir. My schedule is below.

Horizontally Scaling Drupal - Melbourne

On 7 August I'll be running a 1 day workshop around the theme of horizontally scaling Drupal. The content is built on the knowledge I developed building, deploying and managing around 2100 sites for a client. This event has very limited capacity and has almost sold out.

DrupalCon - Denmark

Denmark is hosting the European leg of DrupalCon this year. I will be attending the full conference. I won't be presenting, but I will be getting involved with some of the BoFs. I had a ball at DrupalCon San Francisco earlier in the year.

Efficiently Managing Many Drupal Sites - Slovakia

After spending a couple of days recovering from DrupalCon, I'll be teaming up with the crew at Sven Creative in Bratislava to run a 2 day intensive workshop on horizontally scaling Drupal and development workflows. For more information check out the workshop website.

Free Software Balkans - Albania

On the weekend of 11-12 September, the inaugural Free Software Balkans Conference will be held at the University of Vlore, Albania. I'll be there speaking about Drupal and Aegir. In addition I will be running half day "build your first Drupal site" workshops around the country. The dates and locations for the workshops are still being finalised.

OSI Days - India

On my way back to Australia I will be taking a side trip to Chennai, via Delhi, for OSI Days 2010, Asia's largest open source conference. I will be presenting sessions on Aegir and Drupal. This looks like it will be a huge event.

Other Events

I've launched a new site workshops.davehall.com.au to list my training and speaking engagements. As dates are locked in I'll be adding them to the site.

If you would like to meet with me while I'm on the road, add me to your tripit network, follow me on identi.ca or twitter or add me to your network on LinkedIn.

Multi Core Apache Solr on Ubuntu 10.04 for Drupal with Auto Provisioning

Apache Solr is an excellent full text search engine based on Lucene. Solr is increasingly being used in the Drupal community for search. I use it for search on a lot of my projects. Recently Steve Edwards at Drupal Connect blogged about setting up a multi core Solr server on Ubuntu 9.10 (aka Karmic). Ubuntu 10.04 LTS was released a couple of months ago and it makes the process a bit easier, as Apache Solr 1.4 has been packaged. An additional advantage of using 10.04 LTS is that it is supported until April 2015, whereas support for 9.10 ends in 10 months - April 2011.

As an added bonus, by the end of this howto you will be able to auto provision Solr cores just by calling the right URL.

In this tutorial I will be using Jetty rather than Tomcat, which some tutorials recommend, as Jetty performs well and generally uses fewer resources.

Install Solr and Jetty

Installing Jetty and Solr requires a single command:

$ sudo apt-get install solr-jetty openjdk-6-jdk

This will pull down Solr and all of its dependencies, which can be a lot if you have a very stripped down base server.

Configuring Jetty

Configuring Jetty is very straightforward. First we back up the existing /etc/default/jetty file like so:

sudo cp -a /etc/default/jetty /etc/default/jetty.bak

Then simply change your /etc/default/jetty to look like this (the key changes are NO_START=0 and the JETTY_HOST entry):

# Defaults for jetty see /etc/init.d/jetty for more

# change to 0 to allow Jetty to start
NO_START=0
#NO_START=1

# change to 'no' or uncomment to use the default setting in /etc/default/rcS 
VERBOSE=yes

# Run Jetty as this user ID (default: jetty)
# Set this to an empty string to prevent Jetty from starting automatically
#JETTY_USER=jetty

# Listen to connections from this network host (leave empty to accept all connections)
#Uncomment to restrict access to localhost
#JETTY_HOST=$(uname -n)
JETTY_HOST=solr.example.com

# The network port used by Jetty
#JETTY_PORT=8080

# Timeout in seconds for the shutdown of all webapps
#JETTY_SHUTDOWN=30

# Additional arguments to pass to Jetty    
#JETTY_ARGS=

# Extra options to pass to the JVM         
#JAVA_OPTIONS="-Xmx256m -Djava.awt.headless=true"

# Home of Java installation.
#JAVA_HOME=

# The first existing directory is used for JAVA_HOME (if JAVA_HOME is not
# defined in /etc/default/jetty). Should contain a list of space separated directories.
#JDK_DIRS="/usr/lib/jvm/default-java /usr/lib/jvm/java-6-sun"

# Java compiler to use for translating JavaServer Pages (JSPs). You can use all
# compilers that are accepted by Ant's build.compiler property.
#JSP_COMPILER=jikes

# Jetty uses a directory to store temporary files like unpacked webapps
#JETTY_TMP=/var/cache/jetty

# Jetty uses a config file to setup its boot classpath
#JETTY_START_CONFIG=/etc/jetty/start.config

# Default for number of days to keep old log files in /var/log/jetty/
#LOGFILE_DAYS=14

If you don't include the JETTY_HOST entry, Jetty will only bind to the local loopback interface, which is all you need if your Drupal web server is running on the same machine. If you do set JETTY_HOST, make sure you configure your firewall to restrict access to the Solr server.

Configuring Solr

I am assuming you have already installed the Apache Solr module for Drupal somewhere. If you haven't, do that now, as you will need some config files which ship with it.

First we enable the multicore support in Solr by creating a file called /usr/share/solr/solr.xml with the following contents:

<solr persistent="true" sharedLib="lib">
 <cores adminPath="/admin/cores" shareSchema="true" adminHandler="au.com.davehall.solr.plugins.SolrCoreAdminHandler">
 </cores>
</solr>

You need to make sure the file is owned by the jetty user if you want it to be dynamically updated. Otherwise, change persistent="true" to persistent="false", don't include the adminHandler attribute and don't run the commands below. Also, if you want to auto provision cores you will need to download the jar file attached to this post and drop it into the /usr/share/solr/lib directory (which you'll need to create).

sudo chown jetty:jetty /usr/share/solr
sudo chown jetty:jetty /usr/share/solr/solr.xml
sudo chmod 640 /usr/share/solr/solr.xml
sudo mkdir /usr/share/solr/cores
sudo chown jetty:jetty /usr/share/solr/cores

To keep your configuration centralised, symlink the file from /usr/share/solr to /etc/solr. Don't do it the other way around; Solr will ignore the symlink.

sudo ln -s /usr/share/solr/solr.xml /etc/solr/

Solr needs to be configured for Drupal. First we back up the existing config files, just in case, like so:

sudo mv /etc/solr/conf/schema.xml /etc/solr/conf/schema.orig.xml
sudo mv /etc/solr/conf/solrconfig.xml /etc/solr/conf/solrconfig.orig.xml

Now we copy the Drupal Solr config files from wherever you installed the module:

sudo cp /path/to/drupal-install/sites/all/modules/contrib/apachesolr/{schema,solrconfig}.xml /etc/solr/conf/

Solr needs the path to exist for each core's data files, so we create them with the following commands:

sudo mkdir -p /var/lib/solr/cores/{,subdomain_}example_com/{data,conf}
sudo chown -R jetty:jetty /var/lib/solr/cores/{,subdomain_}example_com

Each of the cores needs its own configuration files. We could implement some hacks to use a common set of configuration files, but that would make life more difficult if we ever have to migrate some of the cores. Just copy the common configuration into each of the cores:

sudo bash -c 'for core in /var/lib/solr/cores/*; do cp -a /etc/solr/conf/ $core/; done'

If everything is configured correctly, we should just be able to start Jetty like so:

sudo /etc/init.d/jetty start

If you visit http://solr.example.com:8080/solr/admin/cores?action=STATUS you should get some xml that looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
	<lst name="responseHeader">
		<int name="status">0</int>
		<int name="QTime">0</int>
	</lst>
	<lst name="status"/>
</response>

If you get the above output, everything is working properly.

If you enabled auto provisioning of Solr cores, you should now be able to create your first core. Point your browser at http://solr.example.com:8080/solr/admin/cores?action=CREATE&name=test1&i... If it works you should get output similar to the following:

<?xml version="1.0" encoding="UTF-8"?>
<response>
	<lst name="responseHeader">
		<int name="status">0</int>
		<int name="QTime">1561</int>
	</lst>
	<str name="core">test1</str>
	<str name="saved">/usr/share/solr/solr.xml</str>
</response>

I would recommend using identifiable names for your cores, so for davehall.com.au I would call the core "davehall_com_au" so I can easily find it later on.

Security Note: As anyone who can access your server can now provision Solr cores, make sure you restrict access to port 8080 to trusted IP addresses only.
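
For example, with iptables you could allow only your web servers to reach Solr (the address below is a placeholder):

# Allow the Drupal web server to reach Solr.
sudo iptables -A INPUT -p tcp --dport 8080 -s 192.0.2.10 -j ACCEPT
# Drop all other traffic to the Solr port.
sudo iptables -A INPUT -p tcp --dport 8080 -j DROP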

For more information on the commands available, refer to the Solr Core Admin API documentation on the Solr wiki.

Next in this series will be how to use this auto provisioning setup to allow Aegir to provision Solr cores as sites are created.