Blogs

$100 Drupal Site Series: Part 2 - Resources and Infrastructure

In my previous post in this series on the $100 Drupal site I outlined a possible target market and set out why I thought very low cost sites could be a viable business model. Today I will cover the resources and infrastructure you'd need to consider to build such a service.

I am not proposing that the business is built on the premise of working for $5 per hour to build new sites for each client. The whole business should be automated. There should be no need for human interaction to close the sale and deploy the site. As soon as a worker is added to the process costs start to rise. At the same time we need some skilled professionals to build the platform and the cookie cutters.

Developers

Building a very basic Drupal site doesn't involve that much skill, it is possible for someone to signup with a web host that uses an automated installer script, such as Fantastico, and then start building a site using stock Drupal core. Building great Drupal sites takes skills and knowledge. In order to build modular Drupal sites that can be used as part of installation profiles and deployed using automated processes, you will need someone who knows Drupal pretty well. The person driving development of the project should be passionate about Drupal as a platform. They are likely to have an opinion on the Drupal specific religious wars - Panels vs Context, CKEditor vs tinyMCE, which Slideshow / Carousel module to use etc.

Most competent PHP developers can be taught how to develop for Drupal. A second, junior, developer can be used to do a lot of the build work. They can start being a "click monkey" developer and as their skills grow they can get their hands dirty writing glue modules and patches. This person should be good at driving the Drupal GUI and working with contrib.

Designer / Themer

Although customers will only be paying $100 for their site, they won't be very impressed if they can only use Garland as their theme. They will expect a choice of themes. The themes should be designed from the ground up to be flexible Drupal themes. The designer should have experience creating designs that will be come Drupal themes. They should understand flexible regions, module themeing and how the Color module can be used to provide control to the user over their theme. Ideally the designer should be able to craft HTML and CSS, and even perform basic Drupal themeing.

System Administrator

Even though the plan is to keep costs low, it doesn't make sense to run the $100 sites on cheap shared hosting. You need to spend money on your own solid hosting infrastructure, and if you have your own hosting infrastructure you'll need someone to manage it. The sys admin will be responsible for keeping everything running, adding new servers, testing backups etc.

You will need a system administrator who has some experience with administering LAMP servers, previous Drupal experience is a big plus. You want someone in this role who hates downtime.

Infrastructure

One of the assumptions for this project is that 95% of the sites will have less than 50 page views per day. At first there will be very little traffic for the sites, so you can start off pretty small, but as the business grows you need to have a plan in place to handle that growth.

I wouldn't recommend spending a lot of money on buying servers outright and putting them in a data centre. Start off with some virtual machines from a cloud hosting provider, such as slicehost, linode, rackspace cloud or amazon AWS (if you don't care about liberty). By using cloud hosting providers you can scale your infrastructure up as you need.

To get started you will probably need 3 production virtual machines, one for serving up content, one for the db and one for search using Apache Solr. Each of the VMs should have 1G of RAM and be located in the same data centre with the same hosting company. When things start to grow you can scale up the VMs as you need and even add more servers. Each time you scale things you will also have to retune your configuration. An opcode cache such as APC or xcache will allow you to serve more requests on the same hardware. You should include some kind of monitoring service such as nagios or zabbix to make sure things are running as you want.

Once you have more than a handful of machines to manage, you will want a centralised configuration management tool. This will allow you to add new servers and have them configured as you want them with no (or very little) human intervention. I would recommend puppet for this job. Given the effort involved in retro fitting puppet in an existing environment, it will be easier to add it from the start. You may want to consider using libcloud to even handle the deployment of the VMs too.

To ensure everything can be tested properly before deployment, you should seriously consider having a replicated testing environment. Instead of having 3 more VMs, you can just run up a Linux box with 4G RAM and a relatively modern CPU and run 3 VMs on it using KVM or Xen.

What's Next?

Now we have the business plan, the market, the team and servers in place, it is time to start looking at some of the tools we'll be using to make all of this possible. The next post in this $100 Drupal site series will cover the tools needed to build a sustainable platform for the business.

$100 Drupal Site Series: Part 1 - Is it Possible?

In late October Gdzine posed the question "$100 CMS web site feasible? What do you think?" on LinkedIn, and the question was also posted on groups.drupal.org. These posts lead to lengthy discussion threads. Some people accused Gdzine of trolling and others claimed that it wasn't possible, but a few of us argued it was possible to build a Drupal site for $100.

Over the next week or so I'll be blogging how I would go about delivering $100 Drupal sites. $100 is in United States Dollars. I won't be providing a complete blueprint, but there should be enough information to help get you started.

I have some experience building large numbers of production sites using Drupal for a small price per site. In 2009 I built, deployed and managed 2086 sites for a European client. For most of this year I have been offering training and consulting around some of the tools and techniques this blog series will cover.

Is it Feasible?

Several companies already offer low cost Drupal site solutions. Acquia's Drupal Gardens is probably the most well known cheap Drupal site building service with its freemium model. Wedful offers a Drupal based site for couples getting married for only 95USD for the first year, then 25USD every following year including hosting. Spoon Media's Pagebuild service offers a customised Drupal platform for 30USD per month. I am sure there are others operating in the same space. wordpress.com offers a similar service using WordPress. I have no idea how financially viable these businesses are, but I think it is safe to assume that they've done some research and planning to get to this point.

These services all rely on making their money on the turnover rather than the margin. Most consultants, myself included, make our money by charging a good hourly rate, but we only get paid for the time we work. These services rely on investing up front then waiting for the long tail revenue. For example if you invest 50,000USD up front into building the service, then you have to sell 500 sites just to break even.

Target Market

In order to make these services viable, you have to target a particular market segment. Customers for a $100 site are likely to be "mum and dad" businesses, they don't have a lot of money to spend and are also unlikely to have a lot knowledge about the web. A lot of the customers are likely to think "the internet" is that blue e on their desktop or facebook. I know of several small businesses who think that Yellow Pages advertising is not giving the return on investment they want, but can't afford 1500-2000USD for a decent quality brochure site. These are the people this service should be seeking to attract.

Why Bother?

Most Drupal developers aspire to work for switched on clients who want high quality sites. Working with well known brands is always a bonus. No one is really going to be interested in hearing about how you built the site for "Joe and Jo's Diner". There is still a lot of problem solving involved in building a service like this, but these problems are very different to those found in large scale site builds. Also many of the people seeking a $100 site are likely to be high needs clients who undervalue the skills involved in building a site.

What's Next?

All posts in this series will be tagged with "100 drupal site". In my next post I will cover what I think you need in terms of infrastructure and resources to make something like this work.

I have proposed a session for DrupalCon Chicago on this topic, please consider voting for it.

Kicking Javascript to the Footer in Drupal 8?

As a platform, Drupal has excellent javascript support. Drupal 7 will ship with jQuery 1.4.2 and jQuery UI 1.8, which will make it even easier to build rich user interactions with Drupal.

Drupal supports aggregating javascript files to reduce the number of network connections a browser must open to load a page. It is common practice for Drupal themes to put the <script> tag in the <head> section of the page. Unfortunately this has a performance impact, as all browsers will stop processing the page and start loading and processing the referenced javascript file. For this reason, both Yahoo! and Microsoft recommend placing all javascript just before the closing </body> tag in a page so it is loaded and processed after the content.

Making this change in Drupal is a pretty straight forward process. It is already possible to do this in Drupal 6 or 7. My site places the $scripts variable at the end of the page. Unfortunately some modules rely on javascript being in the <head>er, and some even place <script>s in the body to allow inline function calls.

It is too late to implement this change in Drupal 7, but the transition can occur now. Documentation can be updated to inform theme developers that they can place the $script variable at the end of the page, just above where the $closure variable is placed. The module development guide can be updated to strongly recommend against relying on the value 'header' for the 'scope' element of the $options array for drupal_add_js() meaning that the javascript will end up in the header and to not place any inline javascript code in themes or modules. In Drupal 8 the scope element for the $options array can be dropped.

If theme and module developers adopt this best practice approach for their Drupal 7 releases there should be minimal transition work for this change in the version 8 release cycle.

I am hoping to discuss this at the Core Developers Summit at DrupalCon Copenhagen later this month.

Travelling, Speaking, Scaling and Aegiring

The next couple of months are going to be a crazy ride. I will be visiting at least 7 countries, speaking on 8 or more days in a 5 week period. The talks will be focused on Drupal and Aegir. My schedule is below.

Horizontally Scaling Drupal - Melbourne

On 7 August I'll be running a 1 day workshop around the theme of horizontally scaling Drupal. The content is built on the knowledge I developed building, deploying and managing around 2100 sites for a client. This event has very limited capacity and has almost sold out.

DrupalCon - Denmark

Denmark is hosting the European leg of DrupalCon this year. I will be attending the full conference. I won't be presenting, but I will be getting involved with some of the BoFs. I had a ball at DrupalCon San Francisco earlier in the year.

Efficiently Managing Many Drupal Sites - Slovakia

After spending a couple of days recovering from DrupalCon, I'll be teaming up with the crew at Sven Creative in Bratslavia, to run a 2 day intensive workshop on horizontally scaling Drupal and development workflows. For more information check out the workshop website.

Free Software Balkans - Albania

On the weekend of 11-12 September, the inaugural Free Software Balkans Conference will be held at the University of Vlore, Albania. I'll be there speaking about Drupal and Aegir. In addition to this I will be running half day build your first Drupal site workshops around the country. The dates and locations for the workshops are still being finalised.

OSI Days - India

On my way back to Australia I will be taking a side trip to Chennai, via Delhi, for OSI Days 2010, Asia's largest open source conference. I will be presenting sessions on Aegir and Drupal. This looks like it will be a huge event.

Other Events

I've launched a new site workshops.davehall.com.au to list my training and speaking engagements. As dates are locked in I'll be adding them to the site.

If you would like to meet with me while I'm on the road, add me to your tripit network, follow me on identi.ca or twitter or add me to your network on LinkedIn.

Western Digital to fix Licensing?

Over the last few months months I've been corresponding with Dennis Ulrich of Western Digital (WDC) about my concerns with the EULA for the My Book World Edition (MBWE) and their obligations under the GPL. To say it has been a drawn out process is an understatement.

It has taken some time to get WDC to understand the situation. There has been confusing messages about what the situation is with the EULA, the GPL and what license covers what pieces of code. The bottom line is that currently users must check the header of each file to ascertain which license applies to it, even though the downloads are marked as GPL.

Although WDC is moving slowly, they do seem to be commited to making the situation clearer in the next iteration of their MBWE product line. Based on a recent phone call with Dennis, the legal and engineering teams are working together to ensure various licenses complied with and their software engineers are aware of their obligations.

The next version of the MBWE is likely to ship with a revised EULA and properly inform the users of their rights under the GPL. This text is still being developed. At the same time it is still unclear if WDC will backport their changes to their existing NAS products. As it is a simple string change it shouldn't be too hard for them to dedicate a few resources to audit the code and update the strings.

It is unfortunate that WDC appears to have little interest in developing their MBWE product range as a hacker friendly FOSS product. It appears the licensing fix that will be implemented by WDC will more clearly delinate the FOSS and non free components of the MBWE firmware. Clarity is always an important legal consideration, but doesn't help foster a community. WDC seem to have little interest in fostering a hacker community around their products. This is an unfortunate decision by the company.

Many manufacturers of embedded devices only start releasing source for their firmware after being caught out for violating the GPL. WDC is to be commended for complying with the requirements of the GPL from the start. Altough there is no legal requirement for them to make the web gui code and other non free components available, WDC already does. It would be disappointing if they chose to take a backward step and stopped distributing parts of the firmware.

It is possible, and perfectly legal, for WDC to stop distributing the source for the proprietary components. At the same time it would not take much effort for them to release the whole platform as GPL or another FOSS friendly license. WDC is already required to do a code drop every time they release a new version of their hardware or firmware. I suspect it would be faster and easier to push all code to a public git repository than pick through and dump selected components as tarballs on a website.

WDC already have their support team dealing with customer bug reports. Maintaining a mailing list, a bug tracker, a wiki and maybe a public source code repository on somewhere like gitorious is likely to take less than 1 full time employee. The benefit for WDC would be great.

Not only is the hacker community likely to contribute bugs fixes and propose or even develop new features, they can help increase sales. I'm sure the good will generated by the switch to truly open approach to the MBWE product line would outweigh the cost of the additional resources required.

Let's hope Western Digital's fix is a FOSS friendly one. I will post more news as things progress.

Hello Planet Ubuntu Users and Ubuntu Universe

One of my new (financial) year resolutions was to increase the readership of my blog. As part of this plan I have been trying to get my blog syndicated on relevant planets. The latest on my list has been Planet Ubuntu Users and the associated Ubuntu Universe. About 24 hours ago my blog was added to both aggregators. Thanks Tiago!

I run Ubuntu on my Dell D830 laptop (my primary machine). I run various flavours and versions of Ubuntu on desktops, servers and VMs both for my business and clients.

In 2009 I converted the local community run internet cafe to Ubuntu. When they were running XP there were problems almost weekly, since the switch, the only real issue they have had was a failed automatic update for firefox last week.

As for off the clock Linux time, my kids associate tux with computers, while my partner is more than happy running LTS releases on her laptop. The only desktop OS my mother in law has used is Ubuntu Linux.

I have been active in the Australian Ubuntu LoCo for quite a few years.

My Ubuntu related posts are generally technical howtos. Most of the stuff is system administration related with a few Ubuntu home brew packaging and webapp development tips.

The non tagged feed of this blog contains a good mix of geeky stuff about whatever we are working on at DHC, rants about annoying things and the occasional thing from real life. I hope there is something there that you find useful. Happy reading.

Hello Slicehost Planet

Hi, I'm Dave and it has been 10 days since I ran up a new slice.

I have been using slicehost for over 2 years to host various VMs. I run a Free / Open Source focused IT Consulting and programming business based in Victoria, Australia.

This blog contains a good mix of geeky stuff about whatever we are working on at DHC, rants about annoying things / companies along with the occasional thing from "real life". A lot of the stuff that gets blogged about is running on slicehost VMs or at least has been tested on them.

This blog is now being syndicated on the Slicehost Planet. The future of the Slicehost planet is unclear, support have suggested that it suffer the same fate as USA-193, while others insist it is staying. For now at least they are adding new blogs, so long as they hosted on their VMs. If you have a blog running on a slice, email support at slicehost and ask them to add your feed to the planet.

Multi Core Apache Solr on Ubuntu 10.04 for Drupal with Auto Provisioning

Apache Solr is an excellent full text index search engine based on Lucene. Solr is increasingly being used in the Drupal community for search. I use it for search for a lot of my projects. Recently Steve Edwards at Drupal Connect blogged about setting up a mutli core Solr server on Ubuntu 9.10 (aka Karmic). Ubuntu 10.04LTS was released a couple of months ago and it makes the process a bit easier, as Apache Solr 1.4 has been packaged. An additional advantage of using 10.04LTS is that it is supported until April 2015, whereas suppport for 9.10 ends in 10 months - April 2011.

As an added bonus in this howto you will be able to auto provision solr cores just by calling the right URL.

In this tutorial I will be using Jetty rather than tomcat which some tutorials recommend, as Jetty performs well and generally uses less resources.

Install Solr and Jetty

Installing jetty and Solr just requires a simple command

$ sudo apt-get install solr-jetty openjdk-6-jdk

This will pull down Solr and all of the dependencies, which can be alot if you have a very stripped down base server.

Configuring Jetty

Configuring Jetty is very straight forward. First we backup the existing /etc/default/jetty file like so:

sudo cp -a /etc/default/jetty /etc/default/jetty.bak

Then simply change your /etc/default/jetty to be like this (the changes are highlighted):

# Defaults for jetty see /etc/init.d/jetty for more

# change to 0 to allow Jetty to start
NO_START=0
#NO_START=1

# change to 'no' or uncomment to use the default setting in /etc/default/rcS 
VERBOSE=yes

# Run Jetty as this user ID (default: jetty)
# Set this to an empty string to prevent Jetty from starting automatically
#JETTY_USER=jetty

# Listen to connections from this network host (leave empty to accept all connections)
#Uncomment to restrict access to localhost
#JETTY_HOST=$(uname -n)
JETTY_HOST=solr.example.com

# The network port used by Jetty
#JETTY_PORT=8080

# Timeout in seconds for the shutdown of all webapps
#JETTY_SHUTDOWN=30

# Additional arguments to pass to Jetty    
#JETTY_ARGS=

# Extra options to pass to the JVM         
#JAVA_OPTIONS="-Xmx256m -Djava.awt.headless=true"

# Home of Java installation.
#JAVA_HOME=

# The first existing directory is used for JAVA_HOME (if JAVA_HOME is not
# defined in /etc/default/jetty). Should contain a list of space separated directories.
#JDK_DIRS="/usr/lib/jvm/default-java /usr/lib/jvm/java-6-sun"

# Java compiler to use for translating JavaServer Pages (JSPs). You can use all
# compilers that are accepted by Ant's build.compiler property.
#JSP_COMPILER=jikes

# Jetty uses a directory to store temporary files like unpacked webapps
#JETTY_TMP=/var/cache/jetty

# Jetty uses a config file to setup its boot classpath
#JETTY_START_CONFIG=/etc/jetty/start.config

# Default for number of days to keep old log files in /var/log/jetty/
#LOGFILE_DAYS=14

If you don't include the JETTY_HOST entry Jetty will only bind to the local loopback interface, which is all you need if your drupal webserver is running on the same machine. If you set the JETTY_HOST make sure you configure your firewall to restrict access to the Solr server.

Configuring Solr

I am assuming you have already installed the Apache Solr module for Drupal somewhere. If you haven't, do that now, as you will need some config files which ship with it.

First we enable the multicore support in Solr by creating a file called /usr/share/solr/solr.xml with the following contents:

<solr persistent="true" sharedLib="lib">
 <cores adminPath="/admin/cores" shareSchema="true" adminHandler="au.com.davehall.solr.plugins.SolrCoreAdminHandler">
 </cores>
</solr>

You need to make sure the file is owned by the jetty user if you want it to be dymanically updated, otherwise change persistent="true" to persistent="false", don't include the adminHandler attribute and don't run the commands below. Also if you want to auto provision cores you will need to download the jar file attached to this post and drop it into the /usr/share/solr/lib directory (which you'll need to create).

sudo chown jetty:jetty /usr/share/solr
sudo chown jetty:jetty /usr/share/solr/solr.xml
sudo chmod 640 /usr/share/solr/solr.xml
sudo mkdir /usr/share/solr/cores
sudo chown jetty:jetty /usr/share/solr/cores

To keep your configuration centralised, symlink the file from /usr/share/solr to /etc/solr. Don't do it the other way, Solr will ignore the symlink.

sudo ln -s /usr/share/solr/solr.xml /etc/solr/

Solr needs to be configured for Drupal. First we backup the existing config file, just in case, like so:

sudo mv /etc/solr/conf/schema.xml /etc/solr/conf/schema.orig.xml
sudo mv /etc/solr/conf/solrconfig.xml /etc/solr/conf/solrconfig.orig.xml

Now we copy the Drupal Solr config files from where you installed the module

sudo cp /path/to/drupal-install/sites/all/modules/contrib/apachesolr/{schema,solrconfig}.xml /etc/solr/conf/

Solr needs the path to exist for each core's data files, so we create them with the following commands:

sudo mkdir -p /var/lib/solr/cores/{,subdomain_}example_com/{data,conf}
sudo chown -R jetty:jetty /var/lib/solr/cores/{,subdomain_}example_com

Each of the cores need their own configuration files. We could implement some hacks to use a common set of configuration files, but that will make life more difficult if we ever have to migrate some of cores. Just copy the common configuration for all the cores:

sudo bash -c 'for core in /var/lib/solr/cores/*; do cp -a /etc/solr/conf/ $core/; done'

If everything is configured correctly, we should just be able to start Jetty like so:

sudo /etc/init.d/jetty start

If you visit http://solr.example.com:8080/solr/admin/cores?action=STATUS you should get some xml that looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
	<lst name="responseHeader">
		<int name="status">0</int>
		<int name="QTime">0</int>
	</lst>
	<lst name="status"/>
</response>

If you get the above output everything is working properly

If you enabled auto provisioning of Solr cores, you should now be able to create your first core. Point your browser at http://solr.example.com:8080/solr/admin/cores?action=CREATE&name=test1&i... If it works you should get output similar to the following:

<?xml version="1.0" encoding="UTF-8"?>
<response>
	<lst name="responseHeader">
		<int name="status">0</int>
		<int name="QTime">1561</int>
	</lst>
	<str name="core">test1</str>
	<str name="saved">/usr/share/solr/solr.xml</str>
</response>

I would recommend using identifiable names for your cores, so for davehall.com.au I would call the core, "davehall_com_au" so I can easily find it later on.

Security Note: As anyone who can access your server can now provision solr cores, make sure you restrict access to port 8080 to only allow access from trusted IP addresses.

For more information on the commands available, refer to the Solr Core Admin API documenation on the Solr wik.

Next in this series will be how to use this auto provisioning setup to allow aegir to provision solr cores as sites are created.

Site Refresh

Our site hasn't changed very much over the last 4 years, but the business has changed a lot. The biggest change was the (uneventful and long overdue) upgrade to Drupal 6 a few months ago.

During the last week or so the site has been updated and refocused. The major changes include:

This also signals our return to regular blogging. There are a few posts in the pipeline. There should be a good mix of drupal and sys admin posts in the coming weeks.

As always, feedback is welcome.

eBook Review: Theming Drupal: A First Timer’s Guide

My experience themeing Drupal, like most of my coding skills, have been developed by digging up useful resources on line and some trail and error. I have an interest in graphic design, but never really studied it. I can turn out sites which look good, but my "designs" don't have the polish of a professionally designed site. I own quite a few (dead tree) books on development and project management. Generally I like to read when I am sick of sitting in front of a screen. The only ebooks I consider reading are short ones.

Emma Jane Hogbin offered her Drupal theming ebook Theming Drupal: A First Timer’s Guide to her mailing list subscribers for free. I am not a big fan of vendor mailing lists, most of the time I scan the messages and hit delete before the bottom. In the case of Emma, rumour has it that it is really worthwhile to subscribe to her list - especially if you are a designer interested in themeing Drupal. Emma also offered free copies of her ebook to those who begged, so I subscribed and I begged.

The first thing I noticed about the book was the ducks on the front cover, I'm a sucker for cute animal pics. The ebook is derived from Emma's training courses and the book she coauthored with Konstantin Kaefer, Front End Drupal. Readers are assumed to have some experience with HTML, CSS and PHP. The book is pitched at designers and programmers who want to get into building themes for Drupal.

The reader is walked through building a complete Drupal theme. The writing is detailed and includes loads of references for obtaining additional information. It covers building a page theme, content type specific themeing and the various base themes available for Druapl. The book is a very useful resource for anyone working on a Drupal theme.

Although I have themed quite a few Drupal sites, Emma's guide taught me a few things. The book is a good read for anyone who wants to improve their knowledge of Drupal themeing. Now to finish reading Front End Drupal ...