About me

Welcome! I am a Deep Learning Library Performance Software Engineer at NVIDIA Corporation. I work on CUTLASS, a collection of CUDA C++ template abstractions for implementing high-performance matrix-matrix multiplication (GEMM) and related computations at all levels and scales within CUDA.

I earned my PhD in Electrical and Computer Engineering from the University of California, Santa Barbara, supervised by Prof. Yuan Xie. I received my B.Eng and B.Mgt from Tsinghua University. My academic work focuses on Deep Learning, Systems for Machine Learning, and GPU parallel programming.