Graphcore has created a completely new processor, the Intelligence Processing Unit (IPU), specifically designed for artificial intelligence. The IPU’s unique architecture means developers can run current machine learning models orders of magnitude faster. More importantly, it lets AI researchers undertake entirely new types of work, not possible using current technologies, to drive the next great breakthroughs in general machine intelligence.
We believe our IPU technology will become the worldwide standard for artificial intelligence compute. The performance of Graphcore’s IPU is going to be transformative across all industries and sectors, whether you are a medical researcher, a roboticist or someone building autonomous cars.
Our team is at the forefront of the artificial intelligence revolution, enabling innovators from all industries and sectors to expand human potential with technology. What we do really makes a difference.
Graphcore’s chip team has created a diverse set of in-house tooling to manage both its front-end and back-end tool flows. As the DevOps Engineer, you will be instrumental in leading the debugging of problems within this infrastructure, as well as adding new functionality and improving performance within our environment.
As a DevOps Engineer embedded within the chip team at Graphcore, you will be responsible for the team’s dedicated compute resource. It must be kept constantly ready to accept large HPC-like workloads, dispatching and processing them in the most efficient manner. You will be able to create monitoring software, both bespoke and built on Ansible or Puppet, to keep yourself and the team informed of status and to spot any bottlenecks or misconfigurations.
Graphcore operates multiple data centre sites. You will be skilled in remote management of hardware and software, not only in our on-site data centre but also in data centres located in other geographical regions.
You will also manage the installed software tooling, adding new packages and service packs as they become available or upon request.
You will work closely with the IT team while being embedded within the chip team, servicing its requests through support tickets.
This is a challenging yet rewarding role that requires in-depth knowledge across a diverse set of domains.
Developing and maintaining software infrastructure for the chip team.
Planning maintenance of compute farm hardware and software infrastructure both on and off site.
Evaluation, specification and planning for hardware upgrade cycles.
Working closely with the IT department to ensure the best possible availability of compute resources.
Installing tools, e.g. Python and libraries, LLVM, EDA tools from Cadence/Mentor/Synopsys.
Be highly motivated, a self-starter, and a team player
Good communication and negotiation skills
Ability to work across teams and programming languages
Experience in a software infrastructure environment
Excellent programming skills in Python, C++, Bash
Remote hardware administration with IPMI
Configuration and management of
SGE/Univa, Slurm, LSF or other DRMS
Puppet, Ansible, Nagios
DVCS e.g. Git
AWS, Azure, Google Cloud
XML and XPath/XSLT
We welcome people of different backgrounds and experiences and are committed to building an inclusive work environment that makes Graphcore a great home for everyone. We are an equal opportunity employer and want to build a work environment where everyone is happy, productive and respectful so they can do their best work. If you have a disability or additional need that requires accommodation, just let us know.
Please note, we are only considering candidates who have an established right to work in the UK for roles based in Cambridge, UK.
Founded just over three years ago, Graphcore's growth and impact have been little short of staggering. We believe we're in a unique position as a new wave of machine learning technology begins to emerge. We see a world where technology enhances human potential and takes us into a new era of intelligence and progress that everyone can benefit from.