Machine intelligence has emerged in recent years as one of the most exciting new frontiers in technology, powering innovative approaches to problems across almost every industry, from autonomous vehicles to agriculture and medicine. At Atomico we’re big believers that technology is not only capable of solving some of humanity’s biggest problems, but is indeed necessary to do so. Software, however, relies on hardware to deliver its potential, and when it comes to AI, current hardware is proving a poor fit for the task.

The design underlying the processors in use today emerged from the realms of general computing and graphics acceleration. They can perform large amounts of processing on dense blocks of data, and in doing so post impressive numbers on basic benchmarks such as teraflops. Critically, however, those numbers can only be realized on dense data problems like, for instance, rendering polygons for graphics. When it comes to rapid iteration over large sparse datasets, such as the diverse set of features used in training a neural network, using a Graphics Processing Unit (or GPU – the current industry vanguard) is rather like fitting a square peg into a round hole. Making it work involves workarounds and hacks that bloat the memory required, leading to solutions that are sub-optimal, power-hungry, costly and highly complex despite relatively simple underlying algorithms.
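To make the dense/sparse distinction concrete, here is a minimal illustrative sketch (not Graphcore code, and deliberately simplified): a mostly-zero weight matrix stored densely carries every zero, while a sparse encoding keeps only the nonzero entries. Hardware tuned for the dense layout spends most of its work on zeros.

```python
# Illustrative sketch: dense vs. sparse storage of a mostly-zero matrix.

def dense_storage(matrix):
    # Dense layout: every entry is stored, zeros included.
    return sum(len(row) for row in matrix)

def sparse_storage(matrix):
    # Sparse layout: store a (row, col, value) triple per nonzero entry.
    return sum(3 for row in matrix for v in row if v != 0)

# A 4x4 matrix with just two nonzero weights.
m = [
    [0, 0, 0, 5],
    [0, 0, 0, 0],
    [0, 7, 0, 0],
    [0, 0, 0, 0],
]

print(dense_storage(m))   # 16 slots held, mostly zeros
print(sparse_storage(m))  # 6 numbers describe the same information
```

Real neural-network feature matrices are vastly larger, so the gap between the two representations – and the waste when dense-oriented hardware processes sparse data – grows accordingly.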

This mismatch creates three key problems:

  • The prohibitive cost of the hardware needed for new research effectively locks all but the best-funded corporations and institutions out of AI research and the scaled application of these methods. Concentrating the capacity to innovate in the hands of a few slows the overall pace of progress.
  • In an era where machine learning researchers are in short supply and small, highly motivated teams can make large steps forward, how quickly you can run a model really matters. It can take days or weeks to re-run a model after simple tweaks like adding new dataset features, and this long lag destroys momentum, concentration and enthusiasm. Cutting days or weeks down to minutes or hours would fundamentally change the pace and quality of experimentation for AI researchers and data scientists.
  • Computational power continues to grow exponentially, in keeping with Moore’s law, yet at the cutting edge of AI research it remains a huge bottleneck. Delivering the equivalent of 5–10 years’ worth of advances in one fell swoop may get us to a highly promising future that much quicker.

These aren’t abstract issues.

Imagine the effect of accelerating the pace of innovation: autonomous driving arriving a decade earlier could mean millions of lives saved; new AI-powered forms of cancer detection or drug discovery arriving 5–10 years earlier could reduce mortality, pain and suffering for tens of millions; machine-intelligence-based models of the global climate could allow us to finally settle the debate over humanity’s impact and get on with doing something proportionate to the challenge.

At Atomico we set out to find a solution to this increasingly obvious problem, and encouragingly found a great deal of work going on in the area. One company, however, stood apart, both in the ambition, elegance and generality of its approach and in its early performance. Perhaps most crucially, it had a world-class team of founders who could deliver.

Founded in Bristol in 2016 by serial entrepreneurs Nigel Toon and Simon Knowles, Graphcore has developed a new machine learning co-processor they call an Intelligent Processing Unit (IPU).

Unlike traditional von Neumann architectures, where memory is a major bottleneck, the IPU is a completely different processor design built for graph algorithms, of which neural networks are perhaps the most significant current example. Full models are loaded into a combined memory-compute architecture and then trained very rapidly on large datasets, resulting in strikingly lower power consumption and hugely improved performance. Better still, since most modern machine learning frameworks internally represent their models as graphs, it’s a relatively straightforward task to take advantage of this new hardware. A developer using TensorFlow, or another standard framework supported at launch, should only need to recompile their code to get the basic speedup, with advanced changes reserved for power users.
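The idea that frameworks already represent models as graphs is what makes retargeting possible. Here is a minimal, hypothetical sketch (not TensorFlow or Graphcore code) of such a computation graph: each node names an operation and its inputs, and evaluation walks the structure. A compiler for new hardware can remap this same structure to different silicon without the user changing their model code.

```python
# Illustrative sketch: a model expressed as a computation graph.
# Node -> (operation, inputs). Here the model is y = x * w + b.

graph = {
    "x":   ("input", []),        # placeholder fed at run time
    "w":   ("const", [2.0]),     # weight
    "b":   ("const", [1.0]),     # bias
    "mul": ("mul", ["x", "w"]),  # x * w
    "y":   ("add", ["mul", "b"]),# x * w + b
}

def evaluate(graph, node, feed):
    """Recursively evaluate `node`, reading placeholder values from `feed`."""
    op, args = graph[node]
    if op == "input":
        return feed[node]
    if op == "const":
        return args[0]
    vals = [evaluate(graph, a, feed) for a in args]
    if op == "mul":
        return vals[0] * vals[1]
    if op == "add":
        return vals[0] + vals[1]
    raise ValueError(f"unknown op: {op}")

print(evaluate(graph, "y", {"x": 3.0}))  # 3.0 * 2.0 + 1.0 = 7.0
```

Because the graph is data rather than imperative code, the same description can be compiled for a CPU, a GPU, or a graph-native processor like the IPU.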

We’ve been extremely impressed by Nigel and Simon, both seasoned founders with an impressive track record – at Icera, Element14 and Picochip – of developing complex chips and successfully bringing them to market. What they’re doing requires a skill set few teams in the world possess, and getting to know the confidence-inspiring group of people they have brought together was critical to our investment decision.

We’re delighted to announce today that Atomico is leading Graphcore’s $30m Series B funding round, and consider it highly relevant that several industry luminaries are joining the round as angel investors: Demis Hassabis of DeepMind, Greg Brockman and Ilya Sutskever of OpenAI, and the notable academic and now Uber chief scientist Zoubin Ghahramani.

Atomico Partner Siraj Khaliq, himself a computer scientist and former entrepreneur, will join the board, while Atomico’s Growth Acceleration team, including Partner Niall Wass, will be on hand to support the Graphcore team as they bring the world’s first IPU to market.

We believe Graphcore has the potential to be a game-changer for AI and machine learning, and we’re thrilled to have the opportunity to support Nigel and Simon through this exciting next phase of the company’s journey.