Career Profile

I am currently looking for a software engineering position in backend, infrastructure, or machine learning systems (ML infrastructure). I received my PhD from New York University in September 2019, advised by Prof. Jinyang Li. My research interest is distributed systems. More specifically, my PhD research focused on building machine learning systems that help AI developers distribute algorithms across a multi-GPU machine or a large-scale cluster. Before my PhD, I worked for four years at MediaTek in Taiwan as a software engineer, developing system software (e.g., hardware drivers and management software) for multi-core mobile phone processors.

Experience

Research Assistant

2013 - 2019
New York University, New York

The following are the three projects I worked on during my PhD study.

  1. SwapAdvisor: Support Large Deep Learning Models via Smart Swapping

    SwapAdvisor automatically swaps temporarily unused tensors from GPU memory to CPU memory so that larger DNN models can run. To minimize communication overhead, SwapAdvisor analyzes the dataflow graph of the given DNN model and uses a custom-designed genetic algorithm to optimize operator scheduling and memory allocation. Based on the optimized operator schedule and memory allocation, SwapAdvisor determines what to swap and when to swap it to achieve good performance.

  2. Tofu: Distributing Tensor Computation for Large-scale Deep Learning

    Tofu partitions very large DNN models across multiple GPU devices to reduce the per-GPU memory footprint while achieving good performance. To discover the feasible partition methods for every operator, Tofu provides a simple domain-specific language for describing an operator's semantics. Tofu analyzes the semantics of each operator in the target DNN model and applies a recursive search algorithm to minimize the total communication cost.

  3. Spartan: Distributed Array Programming Framework with Smart Tiling

    Spartan is a distributed array framework built on top of a set of higher-order dataflow operators. Based on these operators, Spartan provides a collection of NumPy-like array APIs. To achieve good performance for distributed applications, Spartan analyzes the communication pattern of the dataflow graph captured through the operators and applies a greedy strategy to find a partition scheme that minimizes the communication cost.

Software Engineer Intern

2016
Google Inc., Mountain View

Research Intern

2015
IBM Inc., Yorktown Heights

Senior Software Engineer

2011 - 2012
MediaTek Inc., Taiwan

Software Engineer

2008 - 2011
MediaTek Inc., Taiwan

Publications

SwapAdvisor: Push Deep Learning Beyond the GPU Memory Limit via Smart Swapping
Chien-Chin Huang, Gu Jin, Jinyang Li
Under submission
Support Very Large Models using Dataflow Graph Partitioning
Minjie Wang, Chien-Chin Huang, Jinyang Li
EuroSys, 2019
Unifying Data, Model and Hybrid Parallelism in Deep Learning via Tensor Tiling
Minjie Wang, Chien-Chin Huang, Jinyang Li
arXiv, 2018
Spartan: A Distributed Array Framework with Smart Tiling
Chien-Chin Huang, Qi Chen, Zhaoguo Wang, Russell Power, Jorge Ortiz, Jinyang Li, Zhen Xiao
Usenix ATC, 2015
Get More With Less: Near Real-Time Image Clustering on Mobile Phones
Jorge Ortiz, Chien-Chin Huang, Supriyo Chakraborty
arXiv, 2015
Garbage Collection for Multiversion Index in Flash-based Embedded Databases
Po-Chun Huang, Yuan-Hao Chang, Kam-Yiu Lam, Jian-Tao Wang, Chien-Chin Huang
ACM Transactions on Design Automation of Electronic Systems, 2014
Enhancing Microkernel Performance on VLIW DSP Processors via Multiset Context Switch
Brian K. Hsieh, Yung-Chia Lin, Chien-Chin Huang, Jenq Kuen Lee
Journal of Signal Processing Systems, Vol. 51
Integrating Compiler and System Toolkit Flow for Embedded VLIW DSP Processors
Chi Wu, Kun-Yuan Hsieh, Yung-Chia Lin, Chung-Ju Wu, Wen-Li Shih, Shih-Chang Chen, Chung-Kai Chen, Chien-Ching Huang, Yi-Ping You, Jenq Kuen Lee
IEEE RTCSA, 2006

Skills & Proficiency

Python

C/C++

Linux

Bash/Zsh Scripting

Java