
Scalable Hyper-parameter Optimization using RAPIDS and AWS

Presented at PyBay 2020. Slides here. Recording yet to be uploaded.

Scaling HPO on CPUs

You can find the instructions here. The demo covers:

  • Single-node, multi-CPU workflow.
  • Setting up the Docker container.
  • Mounting the code from the host system into the Docker container.
  • Installing Papermill and related dependencies.
  • Running the parameterized XGBoost demo notebook both sequentially and in parallel using Python's multiprocessing and Papermill (see the sketch below).
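
As a rough illustration of the parallel step, here is a minimal sketch (not the demo code itself) of running a parameterized notebook over several hyper-parameter settings with Papermill and multiprocessing; the notebook file name and parameter names are hypothetical.

```python
# Minimal sketch: run a parameterized XGBoost notebook for several
# hyper-parameter settings in parallel with Papermill + multiprocessing.
# The notebook name (xgboost_demo.ipynb) and parameter names are hypothetical.
from multiprocessing import Pool

import papermill as pm

# Small grid of hyper-parameter settings, one notebook execution each.
param_grid = [
    {"max_depth": d, "learning_rate": lr}
    for d in (4, 6, 8)
    for lr in (0.05, 0.1)
]

def run_notebook(params):
    out = "output_depth{max_depth}_lr{learning_rate}.ipynb".format(**params)
    # Papermill injects `params` into the notebook's "parameters" cell
    # and saves the executed notebook to `out`.
    pm.execute_notebook("xgboost_demo.ipynb", out, parameters=params)
    return out

if __name__ == "__main__":
    # Sequential version: [run_notebook(p) for p in param_grid]
    with Pool(processes=4) as pool:
        outputs = pool.map(run_notebook, param_grid)
    print(outputs)
```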

Scaling HPO on GPUs

You can find the instructions here: rapidsai/cloud-ml-examples. The notebook shown in the talk is here. The demo covers:

  • Single-CPU, multi-CPU, single-GPU, and multi-GPU workflows.
  • Building the ML workflow using RAPIDS and Dask.
  • Building the SageMaker Estimator.
  • Running HPO on AWS SageMaker (see the sketch below).
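
For a feel of the SageMaker step, here is a minimal sketch of launching an HPO job with the SageMaker Python SDK; the ECR image URI, IAM role, S3 path, hyper-parameter names, and metric regex are placeholders, not values from the demo notebook.

```python
# Minimal sketch: launching an HPO job on AWS SageMaker.
# The image URI, role, S3 paths, hyper-parameters, and metric definition
# below are hypothetical placeholders.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.tuner import (ContinuousParameter, HyperparameterTuner,
                             IntegerParameter)

session = sagemaker.Session()

# Training container (e.g. a RAPIDS-based image pushed to ECR).
estimator = Estimator(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/rapids-hpo:latest",
    role="<SageMakerExecutionRoleArn>",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    sagemaker_session=session,
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="final-score",
    metric_definitions=[{"Name": "final-score",
                         "Regex": "final-score: ([0-9\\.]+)"}],
    hyperparameter_ranges={
        "max_depth": IntegerParameter(4, 12),
        "learning_rate": ContinuousParameter(0.01, 0.3),
    },
    max_jobs=10,
    max_parallel_jobs=2,
)

# SageMaker runs up to `max_parallel_jobs` training jobs at a time and
# parses the objective metric from each job's logs.
tuner.fit({"train": "s3://<bucket>/train/"})
```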

Suggested Reading

Talk Abstract

“Not sure if my novel ML model is giving the best accuracy it can!”

“Can I find the best hyperparameters for my talk demo 4 days before the deadline?”

If this sounds like you, then you might want to attend this talk.

Distributed computing in machine learning is becoming the norm, driven largely by the computational requirements of machine learning applications. However, building distributed applications today requires a great deal of expertise, and finding optimal sets of hyper-parameters is often a time- and resource-consuming process.

We want something that can search for hyper-parameters (hyper-parameter optimization, aka HPO) in a distributed manner, both on-prem and in the cloud. We expect the approach to intelligently pick which combinations from the search space will give us the best results (for example, the best accuracy). In this talk, we cover a technique you can use with Jupyter Notebooks (an interactive Python environment): parameterize them with Papermill (a Python library) and distribute the hyper-parameter search. Later in the talk we explore how to scale this search up to the cloud, specifically using AWS SageMaker as the orchestrator for HPO.
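
As a concrete illustration of the Papermill part: the notebook keeps its defaults in a single cell tagged `parameters`, and Papermill overrides those values at execution time. A hypothetical version of such a cell:

```python
# Cell tagged "parameters" in the demo notebook (hypothetical names/defaults).
# When Papermill executes the notebook with `parameters={...}`, it injects a
# new cell right after this one that overrides these defaults.
max_depth = 6
learning_rate = 0.1
n_estimators = 100
```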

About Speakers

Srishti Yadav

Srishti is currently a graduate research assistant at the Networked Robotics and Sensing Laboratory at Simon Fraser University, Canada. Her work revolves around the intersection of computer vision and machine learning, where she actively uses PyTorch, TensorFlow, Python, MATLAB, NumPy, SciPy, OpenCV, Matplotlib, GDAL, and the rest of the scientific stack, as well as cloud services like AWS. She is a founder of several developer community groups in Vancouver. In the past, she has given talks at Microsoft Open Source events, PyLadies, and other machine learning groups. As a strong proponent of tech and diversity, her involvement goes beyond local community work. Recently she was one of the chairs of the Women in Computer Vision workshop co-hosted with CVPR 2020 and was on the committee of the Women in Machine Learning workshop, 2019.

Website: srishti.dev / Email me: [email protected] / GitHub: @copperwiring

Akshit Arora

Akshit is a deep learning solutions architect at NVIDIA focused on deploying machine learning and deep learning platforms at scale. As an architect, he helps accelerate deep learning pipelines using NVIDIA GPUs at various tech companies. Previously, at CU Boulder, he developed deep learning models to understand how students learn on an online learning platform. His work also includes predicting weather using LSTMs and automatically completing a painting in virtual reality using sketch-RNN. He is interested in creative applications of machine learning and deep learning and the wide set of possibilities they present.

Website: aroraakshit.github.io / Email me: [email protected] / Follow me on Twitter: @_AkshitArora / GitHub: @aroraakshit


If you find any bug in the code, please raise an issue here or send a Pull Request here.
