Manifold Blog

Jason Carpenter

Recent Posts

Using Dask in Machine Learning: Preprocessing

Posted by Jason Carpenter on Apr 25, 2019 6:00:00 AM

Introduction

This is the second post in a five-part series about using Dask in machine learning workflows:

  • Using Dask in Machine Learning: Best Practices
  • Using Dask in Machine Learning: Preprocessing
  • Using Dask in Machine Learning: Feature Engineering
  • Using Dask in Machine Learning: Model Training
  • Using Dask in Machine Learning: Model Evaluation

Starting with this post, each installment includes data snapshots and code snippets that illustrate the problem we are working on. The code lives in a public, self-contained GitHub repo; you can clone it and run the code yourself to follow along more closely.

Topics: Data engineering, Machine learning

Using Dask in Machine Learning: Best Practices

Posted by Jason Carpenter on Jan 31, 2019 6:00:00 AM

Introduction

The Python ecosystem offers many useful open source tools for data scientists and machine learning (ML) practitioners. One such tool is Dask, available from Anaconda. At Manifold, we have used Dask extensively to build scalable ML pipelines.

Topics: Data science, Data engineering, Machine learning
