Manifold Blog

Using Dask in Machine Learning: Preprocessing

Posted by Jason Carpenter on Apr 25, 2019 6:00:00 AM


This is the second post in a five-part series about using Dask in machine learning workflows:

  • Using Dask in Machine Learning: Best Practices
  • Using Dask in Machine Learning: Preprocessing
  • Using Dask in Machine Learning: Feature Engineering
  • Using Dask in Machine Learning: Model Training
  • Using Dask in Machine Learning: Model Evaluation

Starting with this post, each installment includes data snapshots and code snippets that illustrate the problem we are working on. The code lives in a public, self-contained GitHub repo, so you can pull it, run it yourself, and follow along more closely.

Read More

Topics: Data engineering, Machine learning

How to Apply and Optimize Your Algorithm When You're Ready to Run With AI

Posted by Sourav Dey on Mar 29, 2019 5:50:00 AM

Amazon’s recently launched SageMaker artificial intelligence service is an exciting new development, but the program doesn’t do it all. There’s a distinct gap between the innovative AI technology that exists and the AI solutions that will help drive business results in your specific case. Using products such as SageMaker is like having a brand-new Tesla Model S: It’s an awesome car, but it’s a giant electric paperweight if you don’t know how to drive.

We discussed "walking" with AI in a prior Entrepreneur article; now it’s time to hit the ground running.

Read More

Topics: Data science

Walking With AI: How to Spot, Store and Clean the Data You Need

Posted by Sourav Dey on Mar 28, 2019 5:46:10 AM

Last August, data science leader Monica Rogati unveiled a new way for entrepreneurs to think about artificial intelligence. Modeled after psychologist Abraham Maslow's five-tier hierarchy of psychological needs, her AI hierarchy of needs has become a conference favorite for illustrating how to incorporate AI into a business.

Despite entrepreneurs' excitement around AI, Rogati's hierarchy makes an uncomfortable point. Few companies are ready to adopt AI. Most are struggling to fulfill fundamental needs, such as reliable data flow and storage. The truth is that data literacy is lacking at most companies hoping to reap the rewards of AI.

Read More

Topics: Data science

This Is How to Get Started With AI When the Only Thing You Know Is the Acronym

Posted by Sourav Dey on Mar 27, 2019 5:42:46 AM

Unless you’ve been living under a rock, you’ve heard the buzz around artificial intelligence. So it might surprise you to learn that, according to a 2017 survey published by McKinsey Global Institute, out of 3,000 AI-aware executives, only one in five are using any AI-related technology in core areas of their businesses.

Why aren't entrepreneurs and executives jumping on what they know to be a market-changing technology? In a word, uncertainty. With AI still young, leaders aren't sure where to apply it, how to ensure a return on their investment or, most of all, how to implement it.

Read More

Topics: Computer vision

Your Project Needs a Data Readiness Audit

Posted by Vinay Seth Mohta on Mar 21, 2019 6:00:00 AM

In the early phase of a new project, we dive into the “Understand” step of our Lean AI framework. There are two main forms of understanding we aim for — business understanding and data understanding.

Read More

Topics: Data engineering

We Need to Build Interactive Computer Vision Systems

Posted by Ajay Mishra on Feb 28, 2019 6:00:00 AM

You hear strong proclamations about how AI is taking over the world. And then, you read about how sophisticated AI models are easily fooled by small perturbations in input (see the figure below).

Strike (with) a Pose: Neural Networks Are Easily Fooled by Strange Poses of Familiar Objects

Read More

Topics: Computer vision

How to Quickly Build a Gesture Recognition System

Posted by Rajendra Koppula on Feb 14, 2019 7:00:00 AM

Gesture recognition is a key part of the future of design, and is poised to become the next inflection point in how we interact with devices.

Gesture-based interactions are already prevalent in AR and VR devices; for example, here are some available interactions from Microsoft HoloLens. But gestures have the potential to make a far-reaching impact beyond these specialized use cases: imagine interacting with everyday objects and machines with just a motion of your hand, instead of pushing buttons or turning knobs. This future may not be as far off as it seems. The principal driver behind the progress in this space is state-of-the-art computer vision technology that enables machines to recognize human gestures.

Read More

Topics: Data science, Computer vision

Efficient Data Engineering

Posted by Jakov Kucan on Feb 7, 2019 7:32:34 AM

A typical data engineering problem, often referred to as extract, transform and load (ETL), consists of the following:

  1. take data in one place (extract)
  2. change its form (transform)
  3. move it to a new place, in this new form (load)

This process gets interesting when data volumes are large, and you have to consider performance. Long turnaround time (e.g., a run taking several hours or days) makes the typical serially iterative software engineering approach inefficient. In this article, we offer some tips on re-structuring the software engineering process and leveraging the cloud to make iteration more efficient.
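
The three ETL steps listed above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the approach from the article itself: the sample data, the Fahrenheit-to-Celsius transform, and the `readings` table are all hypothetical stand-ins for a real source, transformation, and destination.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read from files, a database, or an API.
RAW_CSV = "sensor_id,reading_f\na,68.0\nb,212.0\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert each Fahrenheit reading to Celsius."""
    return [
        (row["sensor_id"], round((float(row["reading_f"]) - 32) * 5 / 9, 2))
        for row in rows
    ]


def load(records, conn):
    """Load: write the transformed records into a (hypothetical) SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, reading_c REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()


# Run the three steps end to end against an in-memory database.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
rows = conn.execute(
    "SELECT sensor_id, reading_c FROM readings ORDER BY sensor_id"
).fetchall()
```

Keeping each step as its own function makes it possible to rerun or time one stage in isolation, which matters once a full run takes hours.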

Read More

Topics: Data engineering

Using Dask in Machine Learning: Best Practices

Posted by Jason Carpenter on Jan 31, 2019 6:00:00 AM


The Python ecosystem offers a number of incredibly useful open source tools for data scientists and machine learning (ML) practitioners. One such tool is Dask, available from Anaconda. At Manifold, we have used Dask extensively to build scalable ML pipelines.

Read More

Topics: Data science, Data engineering, Machine learning

Before Optimizing Industrial Equipment with AI, Optimize Your Data

Posted by Sourav Dey on Jan 21, 2019 7:00:00 AM

These days, just about everything is "smart," from IoT toasters to internet-connected toilet paper dispensers. The existence of such devices points to the increasing availability of resources that enable more important pursuits. The related costs are decreasing, meaning it's possible to collect vast amounts of data from sensors attached to expensive equipment like oil and gas rigs, earth-moving tools, and factory machinery.

Read More

Topics: AI at the edge
