Supporting Drupal Websites: A Timeline

Luke

Due to the increasing number of clients we support, we recently undertook a series of updates to our support systems. Before discussing the latest changes, I thought I'd share some of the history that has brought us to this point.

I feel that there are three main aspects of supporting a Drupal install which I will talk about as we go along:

  1. managing updates to modules and core;
  2. managing issues and tickets;
  3. reporting to the client.

There was a time before Agile Collective, which I won't talk too much about here, when I first had experience of supporting sites in Drupal 5. Back in those days my colleagues and I were new to Drupal and had all kinds of ideas about how to go about maintaining Drupal installs, but no direct experience. The bad ideas (keeping a single module repo in CVS and checking it out on multiple sites) and the good ideas (scheduling and assigning maintenance deployments to sync up with security releases) gradually coalesced into a system that formed the basis of how we supported sites when Agile Collective was formed.

In that first main phase of supporting sites, we handled each aspect in the following ways:

  • Updates: Developers who had developed a project inherited it as a maintenance project, and they were responsible for performing security updates when they came out. This meant that everyone who worked at Agile Collective had maintenance projects and the whole team would generally sit down at the same time each month to perform those tasks. The process was also in no way automated - individuals would check the site for updates, then carry them out on their local machines, test them, push them up to stage, test again, then deploy.
  • Issues: We were using Atrium 1.0 (a Drupal distribution) to manage development projects, so once a project was over, clients who had been using it could raise support tickets in the same place. Some support issues would also come in via email and would be copied into tickets.
  • Reporting: At the start of each month, the assigned maintainer for each project would write a report to the client detailing the hours spent.

After some time we realised that the system was not performing as effectively as we felt it should and could, so I volunteered to oversee and manage the support service in an attempt to standardise and improve the processes. Specifically I saw the following issues:

  • Updates: Because developers inherited projects, the work was shared out quite flatly, and some people liked it more than others. On top of that, people at different stages of development would be doing the same tasks regardless of their skill or experience. While this was in one sense egalitarian, in others it was not, and it also wasn't terribly efficient.
  • Issues: Individuals to whom the tickets were assigned were potentially the only ones aware of the issues. This meant that some people could end up with lots of issues assigned compared to others, and there was no centralised management of issues. Also, clients needed to create the tickets carefully, and to notify people properly on tickets, otherwise none of us would know the tickets had been raised.
  • Reporting: Individuals would write reports in slightly different ways, which could be confusing and led to a level of inconsistency.

After a review with the full team, we decided to make a number of changes:

  • Updates: We would reduce the number of people doing them down to 3 or 4. This way we had a dedicated team who could focus on the task and others would be able to get on with other things.
  • Issues: We created a spreadsheet into which all incoming issues were logged. Someone would then go through them all and bump them on a weekly basis. We also made ticket creation auto-notify our support email box. Someone was designated to monitor the box during working hours to guarantee a first-level response time to either tickets or emails.
  • Reporting: The report writing was standardised and the number of people writing reports was reduced to just 1 or 2.

That system worked well for a long time: efficiency increased as people had the same regular responsibilities, and there was less stress as people spent less time context switching between a project and support. However, as the number of sites we supported grew and grew, it became clear that we needed to improve and perhaps re-invent our processes to handle the increased volume. We identified that we could develop an automated updates system that would reduce the amount of time spent doing security updates by 50 to 70%, and enable us to react to critical releases far more swiftly.

Our system now comprises the following:

  • Updates: While individuals still retain responsibility for clients' updates, the updates are performed via Jenkins. Jenkins will poll for security updates, and then apply them to a git hotfix and push that out to a version of the site on a tertiary staging server. The individual now just needs to check and test that staged site, then if things work they can finalise the hotfix, push it to the primary stage site and then production.
  • Issues: We now have a dedicated support system for tickets, based on Atrium 2.0, which offers a vastly improved UX while still retaining our ability to tailor it to our evolving needs. Instead of a spreadsheet, we use a custom dashboard for the support team that shows how many hours each client has available, which tickets are assigned to you, which tickets have not been active for 7 days and need bumping, and some global statistics such as the number of open tickets and total hours spent in the month. Earlier this year, we hired our first dedicated support developer, Dario, who has taken to the role like a duck to water and allowed me to focus more on the management of our support service without having to worry about resolving tickets on top of that.
  • Reporting: We log our time in Harvest, and our team has set up a system to export the data and create a time report in the Atrium notebook with the time usage details for that month. We also use a suite of custom drush commands to update a little widget that shows how much time has been spent / is remaining in the month. We still review the reports manually and also create a ticket to notify the client, just to make sure the information is correct and to allow us to make adjustments as necessary.
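
To give a feel for the kind of checks a support dashboard like this performs, here is a minimal Python sketch. It is an illustration only: the real system is built on Atrium 2.0 with data from Harvest, and the `Ticket` structure, field names and function names below are hypothetical, chosen just to show the "inactive for 7 days" rule and the "hours remaining" sum.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Tickets with no activity for this long are flagged for bumping.
STALE_AFTER = timedelta(days=7)


@dataclass
class Ticket:
    # Hypothetical fields; a real ticket would carry far more.
    client: str
    title: str
    last_activity: datetime
    is_open: bool = True


def stale_tickets(tickets, now):
    """Return open tickets whose last activity was 7 or more days ago."""
    return [
        t for t in tickets
        if t.is_open and now - t.last_activity >= STALE_AFTER
    ]


def hours_remaining(monthly_allowance, hours_logged):
    """Hours left this month; a negative result means over the allowance."""
    return monthly_allowance - sum(hours_logged)
```

A caller would pass in the current time explicitly (for example `stale_tickets(tickets, datetime.now())`), which keeps the rule easy to test against fixed dates.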

We have by no means finished; there is always more to do and ways to make things better, but it is very satisfying to look back and reflect on the improvements we have made as our service has grown. As it continues to do so, we know we will need to respond to the changing needs of our clients to ensure a first-rate service - a challenge we very much look forward to!