We take machine learning models from experiments in a Jupyter notebook to production in the cloud. We follow a structured MLOps process across the entire modeling lifecycle: Dockerized ML development, parallelized backtesting, ML API patterns, model explainability, model performance monitoring, and infrastructure-as-code modules for rapid deployment.
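As one concrete illustration, the sketch below shows a minimal ML API of the kind such a lifecycle might serve: a FastAPI endpoint wrapping a serialized scikit-learn model. The model path, request schema, and endpoint name are hypothetical assumptions made for illustration, not a fixed part of the process described above.

```python
# Minimal sketch of an ML API pattern: a FastAPI service that loads a
# trained scikit-learn model at startup and serves JSON predictions.
# The model path and feature schema below are illustrative assumptions.
import joblib
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="model-api")

MODEL_PATH = "model.joblib"  # hypothetical artifact produced by training
model = joblib.load(MODEL_PATH)

class PredictRequest(BaseModel):
    features: list[float]  # one flat feature vector per request

class PredictResponse(BaseModel):
    prediction: float

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    # Reshape to the (n_samples, n_features) layout scikit-learn expects.
    x = np.asarray(req.features, dtype=float).reshape(1, -1)
    y = model.predict(x)
    return PredictResponse(prediction=float(y[0]))
```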
We develop models to solve hard problems, from unsupervised anomaly detection in multivariate time series to dynamic system identification using deep learning. We take a heterodox approach to data science: we start from first principles with the mathematical formulation of the problem and then experiment with the relevant methods, from modern hierarchical Bayesian methods and Gaussian processes to deep learning, and sometimes more classical techniques like Kalman filters. As with all good science, we start simple, experiment a lot, and iterate our way to the best possible solution.
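In that start-simple spirit, here is a sketch of a first-pass baseline for unsupervised anomaly detection on a multivariate time series: sliding windows scored with a scikit-learn IsolationForest. The window width, contamination rate, and synthetic data are assumptions chosen only to make the example self-contained.

```python
# Simple first-pass baseline for unsupervised anomaly detection on a
# multivariate time series: flatten sliding windows into rows and score
# them with an IsolationForest. Window width and contamination rate are
# illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

def sliding_windows(series: np.ndarray, width: int) -> np.ndarray:
    """Stack overlapping (width, n_channels) windows as flat rows."""
    n_steps, _ = series.shape
    n_windows = n_steps - width + 1
    return np.stack([series[i : i + width].ravel() for i in range(n_windows)])

# Synthetic 3-channel series with an injected anomaly, for illustration.
rng = np.random.default_rng(0)
series = rng.normal(size=(1000, 3))
series[700:710] += 6.0  # a burst the detector should flag

X = sliding_windows(series, width=20)
detector = IsolationForest(contamination=0.02, random_state=0)
labels = detector.fit_predict(X)  # -1 marks anomalous windows

anomalous_starts = np.flatnonzero(labels == -1)
print("anomalous window start indices:", anomalous_starts[:10])
```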
We engineer modern data infrastructure: purpose-built enterprise data platforms, complex data pipelines, batch and streaming data processing, sensitive-data handling, serverless technologies, and more. In data science and machine learning, more and better data almost always beats better algorithms.
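To make the batch-versus-streaming distinction concrete, the sketch below applies one hypothetical cleaning step in both styles: once over a materialized batch, and once lazily over an unbounded stream via a generator. The record schema and clipping rule are illustrative assumptions, not a description of any particular pipeline.

```python
# Sketch: the same transformation applied in batch and streaming style.
# The record schema and cleaning rule here are illustrative assumptions.
from typing import Iterable, Iterator

Record = dict  # e.g. {"sensor_id": "a1", "value": 3.2}

def clean(record: Record) -> Record:
    # Hypothetical cleaning rule: clip readings to a plausible range.
    return {**record, "value": max(0.0, min(record["value"], 100.0))}

def run_batch(records: list[Record]) -> list[Record]:
    # Batch: materialize the whole dataset and transform it in one pass.
    return [clean(r) for r in records]

def run_stream(records: Iterable[Record]) -> Iterator[Record]:
    # Streaming: yield each cleaned record as it arrives, in bounded memory.
    for record in records:
        yield clean(record)

if __name__ == "__main__":
    data = [{"sensor_id": "a1", "value": v} for v in (3.2, -1.0, 250.0)]
    print(run_batch(data))
    for rec in run_stream(iter(data)):
        print(rec)
```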