Ion Stoica

Distributed Probabilistic Inference on Ray

To enable the use of probabilistic models at scale at Amazon, we propose to exploit synergies between the Clay probabilistic language and Ray, the distributed computation effort currently under way at UC Berkeley's RISELab.

Update

...


NumS: NumPy API-Compatible Framework backed by Ray

Runtime improvements to nd-array and tensor operations increasingly rely on parallelism, and the scientific computing community has a growing need to train on more data. Systems and machine learning research has focused mostly on scaling machine learning workloads by designing scalable solutions for specific machine learning problems, such as data...


Graph Data Augmentation for Computer Systems

Graphs are the most common state representation for structured input problems, including molecule property prediction, code representation learning, and computer systems. Learning algorithms embed graph structures using graph neural networks (GNNs). However, many domains lack large training datasets because samples are expensive to acquire; for example, Mirhoseini et al. trained chip placement policies from a dataset of only 20 examples, owing to the complexity of designing new chips. In data-scarce settings, augmentation is widely used to improve generalization. Simple transformations like...
