State of the World

After several months of behind-the-scenes work, we’ve moved the last of our services to our new infrastructure. You will hopefully have seen no impact at all, although a couple of people reported problems on Thursday when we moved our front-end servers.

The new infrastructure gives us much more resilience than before, and also allows us to scale up and down according to demand. When we’ve done this in the past, it’s been a manual process.

Now that everything has been successfully migrated, we’re going to start on rolling out the new version of OpenTrainTimes. This will happen slowly and gradually, and there won’t be any ‘big bang’ change. More about that in weeks to come.

We’ve almost caught up with the backlog of map updates – here are some things we’ve done recently:

There are a lot of support tickets with fixes to implement. We’ll be working through them over the next few weeks and making more frequent releases.

That’s your lot for now – we’ll be back soon with more details of the new version and how we’re going to roll it out. Stay tuned!

Service degradation over the past weeks

The last week hasn’t been great in terms of availability of our data feeds. We’ve had numerous outages that are outside our control:

  • On 3rd June, we had an outage of all our data feeds from 0320 to around 0333. This was followed by a period of high latency (the delay between a message being generated by a train describer and it being received by our systems – sometimes known as lag) between 0446 and 0512, then from 0518 to 0524. From 0631 to around 0735, data was being received by our systems between 15 seconds and 2 minutes late.
  • On 4th June, we had an outage of all our data feeds from 2342 (on 3rd June) to around 0912. This resulted in our live track diagrams freezing for the duration of the outage, and for some hours afterwards, maps showing out-of-date data and the location screens not reporting any train movements.
  • On 5th June, we had an outage of all our data feeds from 0941 to 1103 – again resulting in our maps showing out-of-date data for some time after the feeds returned, due to the nature of the feeds.
  • On 6th June, we had a partial outage on our train movement data feed from 1740 until around 2115 on 7th June. Our monitoring systems should have picked up this issue but didn’t – we are investigating why. On the same day, from 1747 to around 2130, we had an outage of all our data feeds – both feeds returning to normal afterwards, but again resulting in our site showing incomplete and out-of-date data.
  • On 7th June, we had an outage of all our data feeds from 0846 to 0851, although the maps will have updated quickly afterwards due to the volume of trains running at the end of the morning peak period.
  • On 8th June, we had an outage of our train describer feed between 1155 and 1201, from 1215 to 1311, from 1314 to 1328, from 1343 to 1350, from 1358 to 1407, from 1415 to 1430, and from 1438 to 1440. During this time, the train movement feed was operating normally, but this issue caused our maps to freeze repeatedly and intermittently for nearly two and a half hours.
  • On 9th June, we had an outage of all our feeds between 1320 and 1350, a partial outage between 1428 and 1440, then a full outage from 1440 to 1453, followed by a further outage between 1601 and 1620.
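To put a number on one of these days, the train describer outages listed above for 8th June can be added up. This is only a rough sketch in Python: the helper name `minutes_between` is made up for illustration, and the times are copied straight from the list above.

```python
from datetime import datetime

# Train describer feed outages on 8th June, as (start, end) in HHMM,
# copied from the list above.
OUTAGES_8_JUNE = [
    ("1155", "1201"),
    ("1215", "1311"),
    ("1314", "1328"),
    ("1343", "1350"),
    ("1358", "1407"),
    ("1415", "1430"),
    ("1438", "1440"),
]


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two HHMM times on the same day."""
    fmt = "%H%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Minutes during which the feed was actually down.
total_outage = sum(minutes_between(s, e) for s, e in OUTAGES_8_JUNE)

# Minutes from the first drop to the final recovery.
window = minutes_between(OUTAGES_8_JUNE[0][0], OUTAGES_8_JUNE[-1][1])

print(total_outage)  # 109
print(window)        # 165
```

The outage minutes themselves add up to 109, spread over a 165-minute window from the first drop at 1155 to the final recovery at 1440.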

Quite rightly, many of you have voiced your concern over our inability to provide you with a useful service over the past week. We have no control over the Open Data feeds, which are available to anyone at https://datafeeds.networkrail.co.uk/, and we are not happy at the huge number of outages and the combined total downtime over the past week. These issues have been happening for months, if not years, with seemingly no resolution in sight.

We are facing a difficult decision: do we continue to use the public feeds to operate the OpenTrainTimes public website, or do we move to more reliable feeds which aren’t available to, or suitable for, many other developers and websites? Those feeds aren’t quick to set up, and aren’t free to use when you consider the additional hardware and technical work required. The answer to this isn’t simple, and we continue to hope that the Open Data feeds will be fixed so that everyone can benefit from them.

What’s new – 10th January 2019

Happy New Year from all of us at OpenTrainTimes!

We’ve just released a slightly delayed set of updates to the site. These were due to go live about a week ago, but due to illness we didn’t have time to test everything to our satisfaction.

In this release:

  • A new map of Mossley Hill to Runcorn, covering the new workstations at Manchester ROC, as well as Ditton signal box. Garston Intermodal Freight Terminal is also drawn
  • Improved functionality from the new control system at York ROC means we can now show routes and signal aspects for the whole of the York area, covering York station and parts of the surrounding routes
  • The Hull area map now contains route indications

Minor fixes include adding a missing signal at Cogload Junction, L394 signal at Seven Kings, F456 signal at Feltham, W367 signal at Southfields, 1999 signal at Gilberdyke, and new platform berths at Skipton, Ilkley and Bradford Forster Square.

Unfortunately, at least three maps – the Exeter, Liverpool Street and Airedale maps – became corrupted during deployment. Rather than rolling back the whole release, we’ll fix them as soon as we can.

What’s new – 3rd January 2018

Happy New Year to you all – we’ve been hard at work over the Christmas break updating several maps, which are now live:

As well as those major pieces of work, there are a handful of minor things that have been fixed:

  • Between Winsford and Crewe, some signals were placed out-of-order
  • At Lydney, some berths were missing from the loops
  • On the Exeter map at Norton Fitzwarren, signal E627 was mis-drawn as a ground position light signal, rather than a main aspect
  • Signal 7125 at Metheringham failed to show a train description
  • Routes from signal S136 at Sheffield were drawn incorrectly

There is still competition back at OpenTrainTimes HQ between updating the public site and working on projects which bring in the money we need to keep the public site running. We have a rather large backlog of support tickets waiting to be answered – if you’ve logged something and not had a reply, we’re sorry – and we’ll get around to responding as soon as we can.

Until next time (which will be sooner than three months since last time!), enjoy the maps!

What’s new – 10th September 2017

I’m happy to say that we’re back up to full speed! Summer is behind us, and the wet weather has given us the perfect excuse to start attacking the support tickets and emails which continue to flood in.

Fifteen or so support tickets have been reviewed and sorted out this weekend, fixing the following bugs:

We’re going to start working on some new maps in the coming weeks, and preparing for further engineering works happening later in the year, along with other projects to bring you even more detail on maps.

Until next time, keep watching the trains!

Post-Incident Review

We had some problems with OpenTrainTimes earlier today. Although the public site is not operated for profit, we take uptime seriously and we’ve produced this review of what happened.

If you use OpenTrainTimes as part of your job and you’re interested in a commercially supported version of the site, including freight data and integrations with your stock and crew systems, please drop us a mail at hello@opentraintimes.com.

What happened?

Earlier this evening, we had multiple users reporting that maps on OpenTrainTimes were lagging.

Upon investigation, we found an unusually large number of users on the site for the time of day, combined with a 45-minute backlog on our train describer feed.

We temporarily disabled the train movement feed and turned off logging for the real-time maps in order to process this backlog. Once the backlog had cleared, we turned the train movement feed back on and monitored the service whilst the backlog of TRUST messages cleared.

The site returned to normal operation by about 2045.

More detail

OpenTrainTimes is a very popular site, and several hundred users are usually viewing multiple maps at the same time. This figure grows steadily and gradually over time, and we review our capacity every few months to make sure we’re not caught out. Each time we release a new map, the base load on our servers increases as we have anything up to 500 new pieces of signalling data to process – and then there are the extra users that the maps attract.

But that wasn’t the issue – the problem was caused not by the number of users, but by the type of users!

Briefly, when a user’s web browser connects to our map server, it either uses a long-lived connection over which map data is sent, or it requests map data every few seconds. Several things influence which is chosen, but it’s usually down to whether the device is behind a proxy server – not all proxy servers allow, or support, long-lived connections over websockets.
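The choice between the two transports can be pictured roughly like this. It’s a sketch only, not our actual client code: the names `Transport`, `choose_transport` and `requests_per_minute`, and the five-second polling interval, are made up for illustration.

```python
from enum import Enum


class Transport(Enum):
    WEBSOCKET = "websocket"  # one long-lived connection; map data is pushed over it
    POLLING = "polling"      # the browser asks for map data every few seconds


def choose_transport(websocket_handshake_ok: bool) -> Transport:
    """Prefer a websocket; fall back to polling if the handshake failed,
    for example because a proxy in between doesn't support websockets."""
    return Transport.WEBSOCKET if websocket_handshake_ok else Transport.POLLING


def requests_per_minute(transport: Transport, poll_interval_s: int = 5) -> int:
    """New requests one client makes per minute (the interval is an assumption)."""
    if transport is Transport.WEBSOCKET:
        return 0  # updates arrive over the connection that is already open
    return 60 // poll_interval_s


blocked = choose_transport(websocket_handshake_ok=False)
print(blocked.value)                 # polling
print(requests_per_minute(blocked))  # 12
```

With these assumed numbers, every client that falls back to polling makes twelve separate requests a minute, where a websocket client makes none after its initial connection.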

This evening, we noticed a larger than normal number of polling users. Since a large percentage of OpenTrainTimes users are coming from a mobile device, we think this may be because a change was made at one of the mobile network providers which meant our websocket implementation couldn’t be used by clients.

Normally, this is OK – but the gradual and continual increase in users each week, coupled with a gradual surge in the number of connections that our server was logging, meant there was insufficient CPU time available to process all of the data coming in to us from Network Rail.

The first thing we did was to turn off logging – we don’t really need it day-to-day, and it bought us some time. We then switched off processing TRUST messages, allowing them to queue whilst we allocated the rest of the server’s capacity to processing the backlog of train describer (TD) messages. It took about 20 minutes to process the TD messages, after which we turned TRUST messages back on. Processing the backlog of those messages, plus the remaining TD messages, took about another hour.
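The order in which we worked through the queues looked roughly like the sketch below. It is illustrative only: the function name `drain_in_priority_order` and the message values are invented, and it ignores the fact that new TD messages kept arriving while the backlog was being cleared.

```python
from collections import deque


def drain_in_priority_order(td_queue, trust_queue, handle):
    """Process every queued TD message first, then the TRUST messages
    that built up while TRUST processing was switched off."""
    processed = []
    for feed, queue in (("TD", td_queue), ("TRUST", trust_queue)):
        while queue:
            message = queue.popleft()
            handle(message)
            processed.append((feed, message))
    return processed


# Made-up messages standing in for the two backlogs.
td = deque(["td-1", "td-2"])
trust = deque(["trust-1"])

order = drain_in_priority_order(td, trust, handle=lambda message: None)
print(order)  # [('TD', 'td-1'), ('TD', 'td-2'), ('TRUST', 'trust-1')]
```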

What we’re going to do about it

First of all, we’re sorry that we missed a trick and took too long to respond to the initial reports of a problem.

We’re going to add some new health checks to our monitoring system, one of which will enable us to monitor the size of any message backlog.
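As a rough idea of what a backlog health check might look like, here’s a sketch. It measures the backlog by the age of the oldest unprocessed message, and the thresholds and the names `check_backlog`, `warn_after_s` and `critical_after_s` are assumptions for illustration, not the configuration of our monitoring system.

```python
from datetime import datetime, timedelta


def check_backlog(oldest_message_time, now, warn_after_s=60, critical_after_s=300):
    """Return 'ok', 'warning' or 'critical' based on the age of the oldest
    unprocessed message. With nothing queued, the check passes."""
    if oldest_message_time is None:
        return "ok"
    age_s = (now - oldest_message_time).total_seconds()
    if age_s >= critical_after_s:
        return "critical"
    if age_s >= warn_after_s:
        return "warning"
    return "ok"


now = datetime(2018, 1, 1, 20, 0, 0)  # arbitrary example time

print(check_backlog(now - timedelta(minutes=45), now))  # critical
print(check_backlog(now - timedelta(seconds=90), now))  # warning
print(check_backlog(now - timedelta(seconds=10), now))  # ok
```

A check along these lines would have flagged this evening’s 45-minute backlog long before it got that far.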

We’re also going to look at scaling out our servers to cope with the extra demand and leave more breathing room – but this means our costs will double, so we’ll need to make sure this is sustainable.

And finally, we’re going to press forward with the new version of OpenTrainTimes which builds on the six years’ experience we’ve had working with railway data, and will be quicker and better than the current version.

So, sorry for the problems this evening.

Peter Hicks
Director, OpenTrainTimes Ltd.

What’s new – Sunday 25th October 2015

Summer’s officially over – the mornings aren’t as light, and the evenings are darker. To make up for this, there are four new or updated maps!

  • The York map has been extended to Thirsk
  • The Darlington map covers Thirsk, Northallerton, Darlington and Durham, plus the Northallerton to Eaglescliffe and Darlington to Eaglescliffe routes
  • The Newcastle map covers Durham to Newcastle, plus Dunston, Manors and Heworth
  • The Sileby to Langley Mill map has been redrawn, covering Sileby to Sheet Stores Junction, some of the route to Stenson Junction and Derby, plus Toton and Beeston

As always, I’ve been hard at work fixing the smaller bugs that you’ve been reporting – the highlights are:

  • The Crewe map includes ‘train ready to start’ indications for the platforms at Crewe
  • The Exeter map has some fixes for problems which caused trains to disappear around Dawlish (no, not in to the sea!)
  • Sometimes, map elements would appear with large yellow sections due to a code problem – that’s been fixed

I have a well-earned holiday coming up shortly, so expect some non-map related features in the next release, probably in a fortnight’s time!

Solstice Release

Hello again, after a two week break. It feels like I’ve been flat out over the last couple of months, churning out maps like there’s no tomorrow.

I’ve managed to fix a handful of small problems with the maps:

There are also some other bugs that I know about and am trying to fix:

  • On the Charing Cross, Cannon Street and London Bridge to Forest Hill map, trains never step out of the berth for signal L156, and are (seemingly) manually interposed in to the berth for signal L148. This could be a fault with the train describer itself, which I’ll have to work around
  • There are occasional problems with some trains being linked to schedules which are not the right ones
  • Some signals and TRTS indications always show as on when they’re not – I believe this is linked to the problem above

I’m working hard to try and get them fixed, so please accept my apologies for the fact it’s taking a while.

That’s it for this week – the next maps are a surprise (read that as "I haven’t decided which ones to tackle next").

// Peter

Happy Birthday, OpenTrainTimes

On Monday 10th January 2012, I launched OpenTrainTimes.  I believe it marked a turning point in opening up Great Britain’s railway data – leading the way and showing that it can be done, and that the outcome would be positive.

We’ve come a long way in those three years.  Network Rail opened up detailed real-time data through their Data Feeds platform and have opened up their timetable, fares and associated data through their Rail Industry Data portal.

Most recently, National Rail Enquiries opened up their Live Departure Boards web service and loosened their terms and conditions so that nearly anyone can work with the data.

In the coming months, the pièce de résistance will be unveiled – open and scaleable access to Darwin – one of the most important systems that produces a ‘single source of truth’, whose information is distributed to websites, mobile phones, station departure boards and numerous other technology platforms.

If you’re familiar with OpenTrainTimes, you’ll realise that it only uses two or three of the available data sources.  There are two very good reasons for that.  First, it’s a more compelling argument when you can appraise somebody else’s work as a talking point and reason to open up data than when you’re presenting your own.  Secondly, OpenTrainTimes started off as an experiment and I never expected it to be as popular as it is.  The architecture has run in to a number of scalability issues which can only be fixed with a lot of behind-the-scenes work.

I’ve been putting in many hours of work each week, outside the time I spend working for Rockshore to improve the railway’s real-time systems, to re-build the entire site and make it even more successful than it is.

I am not quite at the end yet – the majority of the heavy lifting’s already been done and the scaffolding’s starting to be taken down.  There are a few more weeks of testing I need to do to iron out bugs and make sure the new site performs much better than the other did.  Please get in touch if you’re interested in helping test the new site.

When I launch the new site, probably in a month or so’s time, it’ll include real-time data from more feeds at Network Rail and, once the Darwin feed from National Rail Enquiries launches, it’ll include forward-looking predictions that mirror what you see on other systems powered by Darwin.  No “Your site says X, but the National Rail site says Y, how do I know who’s right?” – consistency trumps accuracy in predicting when a train will turn up.

Thank you to everyone who’s helped me out – especially to the numerous industry people who have kept my enthusiasm up and made helpful suggestions on where to go next.

Watch this space – a new OpenTrainTimes is around the corner!

Scheduled downtime – Saturday 21st June 2014

OpenTrainTimes is a popular site – so much so that it’s necessary to upgrade one of the servers that runs the site.

To do this, there will be an outage from approximately 0900 to 1300 on the morning of Saturday 21st June, and the site will be offline during this time.

After this upgrade is complete, there will be plenty of spare capacity on the site to allow the next set of exciting features to be developed… watch this space!