2021-08-10

The workflow show

A carpenter has a workshop, a cook has a kitchen, and a data scientist has a computer. Regardless of profession and environment, however, you still need the right tools for your practice. In this meetup we will explore some of the tools that make data science possible.

After all, we need the right packages and pipelines to get the most out of our data. We'll dive into our toolbox and examine some of the equipment at our disposal.

During the meetup we will host two practitioners active in this domain. Both speakers will give a 30-minute presentation, with some time for questions afterwards. Between the talks there will be a brief break, and afterwards we will have room for further discussion and socialising.

Details:

Please let us know if we can do anything to make the event (more) accessible to you. We will try our best to accommodate any needs you may have. You can reach us through organisation@onehot.nl.

Continuous Machine Learning

In the software engineering world, CI/CD practices have proven to be a reliable and effective approach to automating recurring tasks, such as running tests, performing code analysis checks, and even delivering final products to production.

In this talk, we will present how to automate ML processes using GitHub Actions or GitLab CI/CD together with the Continuous Machine Learning (CML) library, so that ML specialists can focus on research. CML will take care of:

  • transferring large datasets to CI runners,
  • managing GPU/CPU resources for computations, and
  • generating an ML model report with metrics and plots right in the GitHub pull request.

Keywords: CML, continuous integration, continuous delivery, machine learning, GitHub Actions

About Paweł Redzyński

Software engineer at Iterative.ai, and one of the core maintainers of DVC. I am interested in machine learning from the perspective of project maintenance. In my free time I like to train with kettlebells, hike and learn new things.

Iterative.ai DVC GitHub LinkedIn

A talk by Jessica Forde

The title and details will be announced soon

About Jessica Forde

I’m a machine learning researcher focusing on the empirical study of deep learning models to improve their reliability in high-stakes domains such as healthcare. At Brown University, I work with my PhD advisor, Michael Littman, studying the inductive bias of overparameterized models. For the past two summers, I collaborated with Michela Paganini at Facebook AI Research on model pruning. Prior to starting my PhD, I was an active core maintainer at Project Jupyter, which maintains open source projects such as the Jupyter Notebook. I also worked as a Data Scientist, collaborating with colleagues at organizations such as McKinsey and DARPA.

I believe that the open science movement is important for improving transparency and accountability in machine learning. At Project Jupyter I co-maintained reproducibility tools such as Binder and repo2docker. I am also a co-organizer of the Machine Learning Reproducibility Challenge and Machine Learning Retrospectives.

GitHub Google Scholar LinkedIn Twitter