
SIGMOD

2023

ST4ML: Machine Learning Oriented Spatio-Temporal Data Processing at Scale

Kaiqi Liu (Nanyang Technological University)*; Panrong Tong (Alibaba Group); Mo Li (Nanyang Technological University); Yue Wu (Damo Academy, Alibaba Group); Jianqiang Huang (Alibaba Group)

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement

Xiaonan Nie (Peking University)*; Xupeng Miao (Carnegie Mellon University); Zilong Wang (Microsoft); Zichao Yang (Carnegie Mellon University); Jilong Xue (Microsoft Research); Lingxiao Ma (Microsoft Research); Gang Cao (BAAI); Bin Cui (Peking University)

Automating and Optimizing Data-Centric What-If Analyses on Native Machine Learning Pipelines

Stefan Grafberger (University of Amsterdam); Paul Groth (University of Amsterdam); Sebastian Schelter (University of Amsterdam);

GoodCore: Coreset Selection over Incomplete Data for Data-effective and Data-efficient Machine Learning

Chengliang Chai (Beijing Institute of Technology); Jiabin Liu (Tsinghua University); Nan Tang (Qatar Computing Research Institute, HBKU); Ju Fan (Renmin University of China); Dongjing Miao (Harbin Institute of Technology); Jiayi Wang (Tsinghua University); Yuyu Luo (Tsinghua University); Guoliang Li (Tsinghua University);

Scalable and Efficient Full-Graph GNN Training for Large Graphs

Xinchen Wan (HKUST); Kaiqiang Xu (HKUST); Xudong Liao (HKUST); Yilun Jin (The Hong Kong University of Science and Technology); Kai Chen (HKUST); Xin Jin (Peking University);

DeltaBoost: Gradient Boosting Decision Trees with Efficient Machine Unlearning

Zhaomin Wu (National University of Singapore); Junhui Zhu (National University of Singapore); Qinbin Li (National University of Singapore); Bingsheng He (National University of Singapore);

DUCATI: A Dual-Cache Training System for Graph Neural Networks on Giant Graphs with GPU

Xin Zhang (Hong Kong University of Science and Technology); Yanyan Shen (Shanghai Jiao Tong University); Yingxia Shao (BUPT); Lei Chen (Hong Kong University of Science and Technology);

Caerus: A Caching-based Framework for Scalable Temporal Graph Neural Networks

Yiming Li (Hong Kong University of Science and Technology)*; Yanyan Shen (Shanghai Jiao Tong University); Lei Chen (Hong Kong University of Science and Technology); Mingxuan Yuan (Huawei)

EARLY: Efficient and Reliable Graph Neural Network for Dynamic Graphs

Haoyang Li (The Hong Kong University of Science and Technology); Lei Chen (Hong Kong University of Science and Technology);

FEC: Efficient Deep Recommendation Model Training with Flexible Embedding Communication

Kaihao Ma (The Chinese University of Hong Kong); Xiao Yan (Southern University of Science and Technology)*; Zhenkun Cai (The Chinese University of Hong Kong); Yuzhen Huang (Meta); Yidi Wu (Meta Platforms, Inc); James Cheng (CUHK)

2022

Nautilus: An Optimized System for Deep Transfer Learning over Evolving Training Datasets

Supun C Nakandala (University of California, San Diego)*; Arun Kumar (University of California, San Diego)

Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines

Alexander Isenko (Technical University of Munich)*; Ruben Mayer (Technical University of Munich); Jeffery Jedele (Technical University of Munich); Hans-Arno Jacobsen (University of Toronto)

Complaint-Driven Training Data Debugging at Interactive Speeds

Lampros Flokas (Columbia University)*; Weiyuan Wu (Simon Fraser University); Yejia Liu (Simon Fraser University); Jiannan Wang (Simon Fraser University); Nakul Verma (Columbia University); Eugene Wu (Columbia University)

HET-GMP: a Graph-based System Approach to Scaling Large Embedding Model Training

Xupeng Miao (Peking University)*; Yining Shi (Peking University); Hailin Zhang (Peking University); Xin Zhang (Peking University); Xiaonan Nie (Peking University); Zhi Yang (Peking University); Bin Cui (Peking University)

Camel: Managing Data for Efficient Stream Learning

Yiming Li (Hong Kong University of Science and Technology)*; Yanyan Shen (Shanghai Jiao Tong University); Lei Chen (Hong Kong University of Science and Technology)

In-Database Machine Learning with CorgiPile: Stochastic Gradient Descent without Full Data Shuffle

Lijie Xu (ETH Zurich)*; Shuang Qiu (University of Chicago); Binhang Yuan (ETH Zurich); Jiawei Jiang (ETH Zurich); Cedric Renggli (ETH Zurich); Shaoduo Gan (ETH Zurich); Kaan Kara (ETHZ); Guoliang Li (Tsinghua University); Ji Liu (Kwai Inc.); Wentao Wu (Microsoft Research); Jieping Ye (Didi Chuxing & University of Michigan); Ce Zhang (ETH)

NuPS: A Parameter Server for Machine Learning with Non-Uniform Parameter Access

Alexander Renz-Wieland (Technische Universität Berlin)*; Rainer Gemulla (Universität Mannheim); Zoi Kaoudi (TU Berlin); Volker Markl (Technische Universität Berlin)

Sommelier: Curating DNN Models for the Masses

Peizhen Guo (Yale University)*; Bo Hu (Yale University); Wenjun Hu (Yale University)

Lightweight and Accurate Cardinality Estimation by Neural Network Gaussian Process

Kangfei Zhao (The Chinese University of Hong Kong)*; Jeffrey Xu Yu (Chinese University of Hong Kong); Zongyan He (The Chinese University of Hong Kong); Rui Li (The Chinese University of Hong Kong); Hao Zhang (Chinese University of Hong Kong)

2021

Efficient Deep Learning Pipelines for Accurate Cost Estimations Over Large Scale Query Workload

Johan Zhi Kang Kok (Grab)*; Gaurav Gaurav (Grab); Sienyi Tan (Grab); Feng Cheng (Grab); Shixuan Sun (National University of Singapore); Bingsheng He (National University of Singapore)

VF^2Boost: Very Fast Vertical Federated Gradient Boosting for Cross-Enterprise Learning

Fangcheng Fu (Peking University)*; Yingxia Shao (BUPT); Lele Yu (Peking University); Jiawei Jiang (ETH Zurich); Huanran Xue (Tencent Inc.); Yangyu Tao (Tencent); Bin Cui (Peking University)

ALG: Fast and Accurate Active Learning Framework for Graph Convolutional Networks

Wentao Zhang (Peking University)*; Yu Shen (Peking University); Yang Li (Peking University); Lei Chen (Hong Kong University of Science and Technology); Zhi Yang (Peking University); Bin Cui (Peking University)

VLDB

2023

MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud

Zhen Zhang, Shuai Zheng, Yida Wang, Justin Chiu, George Karypis, Trishul A Chilimbi, Mu Li, Xin Jin

Scalable Graph Convolutional Network Training on Distributed-Memory Systems

Gunduz Vehbi Demirci, Aparajita Haldar, Hakan Ferhatosmanoglu

Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning

Jiayi Wang, Chengliang Chai, Nan Tang, Jiabin Liu, Guoliang Li

SHiFT: An Efficient, Flexible Search Engine for Transfer Learning

Cedric Renggli, Xiaozhe Yao, Luka Kolar, Luka Rimanic, Ana Klimovic, Ce Zhang

FastFlow: Accelerating Deep Learning Model Training with Smart Offloading of Input Data Pipeline

Taegeon Um, Byungsoo Oh, Byeongchan Seo, Minhyeok Kweun, Goeun Kim, Woo-Yeon Lee

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui

2022

ANN Softmax: Acceleration of Extreme Classification Training

Kang Zhao (Alibaba)*; Liuyihan Song (Alibaba Group); Yingya Zhang (Alibaba Group); Pan Pan (Alibaba Group); Xu Yinghui (Alibaba Group); Rong Jin (Alibaba Group)

Accelerating Recommendation System Training by Leveraging Popular Choices

Muhammad Adnan (University of British Columbia); Yassaman Ebrahimzadeh Maboud (University of British Columbia); Divya Mahajan (Microsoft)*; Prashant Nair (University of British Columbia)

HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework

Xupeng Miao (Peking University)*; Hailin Zhang (Peking University); Yining Shi (Peking University); Xiaonan Nie (Peking University); Zhi Yang (Peking University); Yangyu Tao (Tencent); Bin Cui (Peking University)

COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression

Sian Jin (Washington State University); Chengming Zhang (Washington State University); Xintong Jiang (McGill University); Yunhe Feng (University of Washington); Hui Guan (University of Massachusetts, Amherst); Guanpeng Li (University of Iowa); Shuaiwen Song (University of Sydney); Dingwen Tao (Washington State University)*

TGL: A General Framework for Temporal GNN Training on Billion-Scale Graphs

Hongkuan Zhou (University of Southern California)*; Da Zheng (Amazon); Israt Nisa (Amazon); Vassilis N. Ioannidis (Amazon Web Services); Xiang Song (Amazon); George Karypis (Amazon)

Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching

Yeonhong Park (Seoul National University)*; Sunhong Min (Seoul National University); Jae W. Lee (Seoul National University)

Distributed Learning of Fully Connected Neural Networks using Independent Subnet Training

Binhang Yuan (Rice University); Cameron Wolfe (Rice University)*; Chen Dun (Rice University); Yuxin Tang (Rice University); Anastasios Kyrillidis (Rice University); Chris Jermaine (Rice University)

Optimizing Machine Learning Inference Queries with Correlative Proxy Models

Zhihui Yang (Zhejiang Lab)*

Demonstration of Accelerating Machine Learning Inference Queries with Correlative Proxy Models

Zhihui Yang (Zhejiang Lab)*; Yicong Huang (UC Irvine); Zuozhi Wang (UC Irvine); Feng Gao (Zhejiang Lab); Yao Lu (Microsoft Research); Chen Li (UC Irvine); X. Sean Wang (Fudan University)

ByteGNN: Efficient Graph Neural Network Training at Large Scale

Chenguang Zheng (CUHK)*; Hongzhi Chen (ByteDance); Yuxuan Cheng (ByteDance Inc); Zhezheng Song (CUHK); Yifan Wu (Peking University); Changji Li (CUHK); James Cheng (CUHK); Hao Yang (ByteDance); Shuai Zhang (ByteDance)

Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale

Yang Li (Peking University)*; Yu Shen (Peking University); Huaijun Jiang (Peking University); Wentao Zhang (Peking University); Jixiang Li (Kuaishou Inc.); Ji Liu (Kwai Inc.); Ce Zhang (ETH); Bin Cui (Peking University)

2021

Hindsight Logging for Model Training

Rolando Garcia (UC Berkeley), Erick Liu (UC Berkeley), Vikram Sreekanti (UC Berkeley), Bobby Yan (UC Berkeley), Anusha Dandamudi (UC Berkeley), Joseph Gonzalez (UC Berkeley), Joseph M Hellerstein (UC Berkeley), Koushik Sen (University of California, Berkeley)

ParaX: Boosting Deep Learning for Big Data Analytics on Many-Core CPUs

Lujia Yin (NUDT), Yiming Zhang (NUDT), Zhaoning Zhang (NUDT), Yuxing Peng (NUDT), Peng Zhao (Intel)

Analyzing and Mitigating Data Stalls in DNN Training

Jayashree Mohan (UT Austin), Amar Phanishayee (Microsoft Research), Ashish Raniwala (Microsoft), Vijay Chidambaram (UT Austin and VMware)

Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning

Side Li (University of California, San Diego), Arun Kumar (University of California, San Diego)

Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture

Seung Won Min (University of Illinois at Urbana-Champaign), Kun Wu (University of Illinois at Urbana-Champaign), Sitao Huang (University of Illinois at Urbana-Champaign), Mert Hidayetoglu (University of Illinois at Urbana-Champaign), Jinjun Xiong (IBM Thomas J. Watson Research Center), Eiman Ebrahimi (NVIDIA), Deming Chen (University of Illinois at Urbana-Champaign), Wen-mei Hwu (NVIDIA Corporation)

ICDE

2022

PSP: Progressive Space Pruning for Efficient Graph Neural Architecture Search

Guanghui Zhu (Nanjing University)*; Wenjie Wang (Nanjing University); Zhuoer Xu (Nanjing University); Feng Cheng (Nanjing University); Mengchuan Qiu (Nanjing University); Chunfeng Yuan (Nanjing University); Yihua Huang (Nanjing University)

TSplit: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting

Xiaonan Nie (Peking University)*; Xupeng Miao (Peking University); Zhi Yang (Peking University); Bin Cui (Peking University)

HybridGNN: Learning Hybrid Representation in Multiplex Heterogeneous Networks

Tiankai Gu (Tsinghua University); Chaokun Wang (Tsinghua University)*; Cheng Wu (Tsinghua University); Yunkai Lou (Tsinghua University); Jingcao Xu (Tsinghua University); Changping Wang (Kuaishou Inc); Kai Xu (Kuaishou); Can Ye (Kuaishou Inc); Yang Song (Kuaishou Inc)

Dynamic Model Tree for Interpretable Data Stream Learning

Johannes Haug (University of Tuebingen)*; Klaus Broelemann (Schufa Holding AG); Gjergji Kasneci (University of Tuebingen)

2021

HuGE: An Entropy-driven Approach to Efficient and Scalable Graph Embeddings

Peng Fang; Fang Wang; Zhan Shi; Hong Jiang; Dan Feng; Lei Yang