02 Jul, 2019

Data Engineer

  • DataCareer GmbH
  • London, UK
Big Data

Job description

Job purpose
The M&S Food Data Science team was set up in April 2018. Since its inception, the team has grown rapidly, both in the number of team members and in the areas of the business where we are involved and making an impact. Our long-term goal is to build a framework of sophisticated machine learning models which can optimise the entire food supply chain.
We have built a number of models, trialled their outputs, and demonstrated their business value. We would now like to scale up these models across the food estate. To do this, we need to productionise our analytical pipelines. This has two aspects: (1) working with our IT data team to set up automated data extraction, transformation, and loading processes, and (2) deploying models within applications and integrating them with other business systems. As a data engineer within the Food Data Science team, you will lead on developing our capabilities in this area.

Key accountabilities and measures

  • Leading on the design and implementation of automated data ingestion, cleaning, and transformation pipelines for machine learning models
  • Designing and implementing the scheduling and deployment of models, and their integration with other applications
  • Coordinating and championing DevOps practices related to developing and deploying analytical solutions
  • Engaging with key technical stakeholders and teams within the business to deliver the above points
  • Supporting the upskilling of other members of the team and the wider analytical community through development activities
  • Remaining up to date on new developments within the data engineering space and embedding them in the team where appropriate

Key skills

  • Demonstrable experience as a Data Engineer (experience working within supply chain is beneficial but not essential)
  • Experience working within a multidisciplinary analytical team
  • Ability to collaborate with other technical/IT functions to deliver projects


  • Advanced knowledge of SQL, including stored procedures, views and query optimisation
  • Use of Parquet files or HDFS for data storage, including optimising partitions
  • Use of scheduling tools, containers, and APIs to deploy analytical services such as machine learning models
  • Use of DevOps techniques such as unit testing and continuous integration


  • Use of Spark/Databricks for data ingestion and manipulation and delivery of analysis, including cluster architecture design
  • Advanced knowledge of at least one of Python or Scala
  • Use of existing statistical or machine learning techniques to automate data cleaning
  • Experience using graph databases for data storage and querying

Key relationships and stakeholders

  • Food Data Science Team
  • Food Analytics Team
  • IT Product Teams
  • Big Data Team
  • Analytical Community
