Papers-2019.md
December 2019

  • PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection - [Arxiv] [QA]
  • RC-DARTS: Resource Constrained Differentiable Architecture Search - [Arxiv] [QA]
  • NAS evaluation is frustratingly hard - [Arxiv] [QA]
  • Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks - [Arxiv] [QA]
  • Improving Knowledge-aware Dialogue Generation via Knowledge Base Question Answering - [Arxiv] [QA]
  • Image Processing Using Multi-Code GAN Prior - [Arxiv] [QA]
  • ClusterFit: Improving Generalization of Visual Representations - [Arxiv] [QA]
  • Self-Supervised Visual Terrain Classification from Unsupervised Acoustic Feature Learning - [Arxiv] [QA]
  • Infinite products and zero-one laws in categorical probability - [Arxiv] [QA]
  • Generating Videos of Zero-Shot Compositions of Actions and Objects - [Arxiv] [QA]
  • 15 Keypoints Is All You Need - [Arxiv] [QA]
  • 12-in-1: Multi-Task Vision and Language Representation Learning - [Arxiv] [QA]
  • Prioritized Unit Propagation with Periodic Resetting is (Almost) All You Need for Random SAT Solving - [Arxiv] [QA]
  • Self-Supervised Learning of Pretext-Invariant Representations - [Arxiv] [QA]
  • Lost-customers approximation of semi-open queueing networks with backordering -- An application to minimise the number of robots in robotic mobile fulfilment systems - [Arxiv] [QA]
  • Just Go with the Flow: Self-Supervised Scene Flow Estimation - [Arxiv] [QA]

November 2019

  • ASR is all you need: cross-modal distillation for lip reading - [Arxiv] [QA]
  • Single Headed Attention RNN: Stop Thinking With Your Head - [Arxiv] [QA]
  • Binarized Neural Architecture Search - [Arxiv] [QA]
  • Breaking the cycle -- Colleagues are all you need - [Arxiv] [QA]
  • Region Normalization for Image Inpainting - [Arxiv] [QA]
  • All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting - [Arxiv] [QA]
  • Automatic Text-based Personality Recognition on Monologues and Multiparty Dialogues Using Attentive Networks and Contextual Embeddings - [Arxiv] [QA]
  • Generating Persona Consistent Dialogues by Exploiting Natural Language Inference - [Arxiv] [QA]
  • Momentum Contrast for Unsupervised Visual Representation Learning - [Arxiv] [QA]
  • A Pre-training Based Personalized Dialogue Generation Model with Persona-sparse Data - [Arxiv] [QA]
  • Effectiveness of self-supervised pre-training for speech recognition - [Arxiv] [QA]
  • Contextualized Sparse Representations for Real-Time Open-Domain Question Answering - [Arxiv] [QA]
  • Fast Transformer Decoding: One Write-Head is All You Need - [Arxiv] [QA]

October 2019

  • Attention Is All You Need for Chinese Word Segmentation - [Arxiv] [QA]
  • Multi-Stage Document Ranking with BERT - [Arxiv] [QA]
  • Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning - [Arxiv] [QA]
  • Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters - [Arxiv] [QA]
  • Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders - [Arxiv] [QA]
  • Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram - [Arxiv] [QA]
  • Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer - [Arxiv] [QA]
  • Generative Pre-Training for Speech with Autoregressive Predictive Coding - [Arxiv] [QA]
  • KnowIT VQA: Answering Knowledge-Based Questions about Videos - [Arxiv] [QA]
  • Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis - [Arxiv] [QA]
  • Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video - [Arxiv] [QA]
  • Understanding Deep Networks via Extremal Perturbations and Smooth Masks - [Arxiv] [QA]
  • ALOHA: Artificial Learning of Human Attributes for Dialogue Agents - [Arxiv] [QA]
  • Reverse derivative categories - [Arxiv] [QA]
  • Understanding the Limitations of Variational Mutual Information Estimators - [Arxiv] [QA]
  • Self-supervised Label Augmentation via Input Transformations - [Arxiv] [QA]
  • vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations - [Arxiv] [QA]
  • A cost-effective method for improving and re-purposing large, pre-trained GANs by fine-tuning their class-embeddings - [Arxiv] [QA]
  • Explaining image classifiers by removing input features using generative models - [Arxiv] [QA]
  • Probability, valuations, hyperspace: Three monads on Top and the support as a morphism - [Arxiv] [QA]
  • Bayesian open games - [Arxiv] [QA]
  • MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis - [Arxiv] [QA]
  • Continual Learning in Neural Networks - [Arxiv] [QA]
  • ZeRO: Memory Optimizations Toward Training Trillion Parameter Models - [Arxiv] [QA]
  • Is Fast Adaptation All You Need? - [Arxiv] [QA]

September 2019

  • Interpretations are useful: penalizing explanations to align neural networks with prior knowledge - [Arxiv] [QA]
  • Visual Explanation for Deep Metric Learning - [Arxiv] [QA]
  • Joint-task Self-supervised Learning for Temporal Correspondence - [Arxiv] [QA]
  • UNITER: UNiversal Image-TExt Representation Learning - [Arxiv] [QA]
  • High Fidelity Speech Synthesis with Adversarial Networks - [Arxiv] [QA]
  • Improving Generative Visual Dialog by Answering Diverse Questions - [Arxiv] [QA]
  • On Model Stability as a Function of Random Seed - [Arxiv] [QA]
  • Understanding and Robustifying Differentiable Architecture Search - [Arxiv] [QA]
  • Self-Training for End-to-End Speech Recognition - [Arxiv] [QA]
  • Pose-aware Multi-level Feature Network for Human Object Interaction Detection - [Arxiv] [QA]
  • Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism - [Arxiv] [QA]
  • An Internal Learning Approach to Video Inpainting - [Arxiv] [QA]
  • Learning to Deceive with Attention-Based Explanations - [Arxiv] [QA]
  • Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset - [Arxiv] [QA]
  • Specifying Object Attributes and Relations in Interactive Scene Generation - [Arxiv] [QA]
  • CTRL: A Conditional Transformer Language Model for Controllable Generation - [Arxiv] [QA]
  • ACUTE-EVAL: Improved Dialogue Evaluation with Optimized Questions and Multi-turn Comparisons - [Arxiv] [QA]
  • Image Inpainting with Learnable Bidirectional Attention Maps - [Arxiv] [QA]
  • Identifying Personality Traits Using Overlap Dynamics in Multiparty Dialogue - [Arxiv] [QA]
  • All You Need is Ratings: A Clustering Approach to Synthetic Rating Datasets Generation - [Arxiv] [QA]

August 2019

  • Copy-and-Paste Networks for Deep Video Inpainting - [Arxiv] [QA]
  • Accelerating Large-Scale Inference with Anisotropic Vector Quantization - [Arxiv] [QA]
  • Onion-Peel Networks for Deep Video Completion - [Arxiv] [QA]
  • VL-BERT: Pre-training of Generic Visual-Linguistic Representations - [Arxiv] [QA]
  • Efficient Deep Neural Networks - [Arxiv] [QA]
  • A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics - [Arxiv] [QA]
  • Unsupervised Learning of Landmarks by Descriptor Vector Exchange - [Arxiv] [QA]
  • Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training - [Arxiv] [QA]
  • StructureFlow: Image Inpainting via Structure-aware Appearance Flow - [Arxiv] [QA]
  • Approximating the Convex Hull via Metric Space Magnitude - [Arxiv] [QA]
  • ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks - [Arxiv] [QA]
  • On the Existence of Simpler Machine Learning Models - [Arxiv] [QA]
  • Smooth Grad-CAM++: An Enhanced Inference Level Visualization Technique for Deep Convolutional Neural Network Models - [Arxiv] [QA]
  • Generative Image Inpainting with Submanifold Alignment - [Arxiv] [QA]

July 2019

  • On Mutual Information Maximization for Representation Learning - [Arxiv] [QA]
  • Benchmarking Attribution Methods with Relative Feature Importance - [Arxiv] [QA]
  • Forward-Backward Decoding for Regularizing End-to-End TTS - [Arxiv] [QA]
  • Compositional Deep Learning - [Arxiv] [QA]
  • PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search - [Arxiv] [QA]
  • Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning - [Arxiv] [QA]
  • Generative Counterfactual Introspection for Explainable Deep Learning - [Arxiv] [QA]
  • Large Scale Adversarial Representation Learning - [Arxiv] [QA]
  • Generalizing from a few environments in safety-critical reinforcement learning - [Arxiv] [QA]
  • Learnable Gated Temporal Shift Module for Deep Video Inpainting - [Arxiv] [QA]

June 2019

  • Self-Supervised Dialogue Learning - [Arxiv] [QA]
  • Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty - [Arxiv] [QA]
  • Improving performance of deep learning models with axiomatic attribution priors and expected gradients - [Arxiv] [QA]
  • Unsupervised State Representation Learning in Atari - [Arxiv] [QA]
  • Sample-Efficient Neural Architecture Search by Learning Action Space - [Arxiv] [QA]
  • One Epoch Is All You Need - [Arxiv] [QA]
  • Stand-Alone Self-Attention in Vision Models - [Arxiv] [QA]
  • Contrastive Multiview Coding - [Arxiv] [QA]
  • Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index - [Arxiv] [QA]
  • Factorized Mutual Information Maximization - [Arxiv] [QA]
  • Topology-Preserving Deep Image Segmentation - [Arxiv] [QA]
  • Self-Supervised Learning for Contextualized Extractive Summarization - [Arxiv] [QA]
  • Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis - [Arxiv] [QA]
  • HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips - [Arxiv] [QA]
  • Selfie: Self-supervised Pretraining for Image Embedding - [Arxiv] [QA]
  • XRAI: Better Attributions Through Regions - [Arxiv] [QA]
  • Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers - [Arxiv] [QA]
  • Image Synthesis with a Single (Robust) Classifier - [Arxiv] [QA]
  • Automated Machine Learning: State-of-The-Art and Open Challenges - [Arxiv] [QA]
  • Learning Representations by Maximizing Mutual Information Across Views - [Arxiv] [QA]
  • Zero-Shot Semantic Segmentation - [Arxiv] [QA]
  • Rethinking Loss Design for Large-scale 3D Shape Retrieval - [Arxiv] [QA]
  • Latent Retrieval for Weakly Supervised Open Domain Question Answering - [Arxiv] [QA]
  • Learning to Generate Grounded Visual Captions without Localization Supervision - [Arxiv] [QA]

May 2019

  • Attention Is (not) All You Need for Commonsense Reasoning - [Arxiv] [QA]
  • MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms - [Arxiv] [QA]
  • Align-and-Attend Network for Globally and Locally Coherent Video Inpainting - [Arxiv] [QA]
  • Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets - [Arxiv] [QA]
  • Why do These Match? Explaining the Behavior of Image Similarity Models - [Arxiv] [QA]
  • Countering Noisy Labels By Learning From Auxiliary Clean Labels - [Arxiv] [QA]
  • Data-Efficient Image Recognition with Contrastive Predictive Coding - [Arxiv] [QA]
  • FastSpeech: Fast, Robust and Controllable Text to Speech - [Arxiv] [QA]
  • Deeper Text Understanding for IR with Contextual Neural Language Modeling - [Arxiv] [QA]
  • PEPSI++: Fast and Lightweight Network for Image Inpainting - [Arxiv] [QA]
  • Evolving Rewards to Automate Reinforcement Learning - [Arxiv] [QA]
  • Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization - [Arxiv] [QA]
  • Deep Flow-Guided Video Inpainting - [Arxiv] [QA]
  • Frame-Recurrent Video Inpainting by Robust Optical Flow Inference - [Arxiv] [QA]
  • Characterizing the invariances of learning algorithms using category theory - [Arxiv] [QA]
  • Deep Video Inpainting - [Arxiv] [QA]
  • Unsupervised Pre-Training of Image Features on Non-Curated Data - [Arxiv] [QA]
  • Scaling and Benchmarking Self-Supervised Visual Representation Learning - [Arxiv] [QA]
  • Visualizing Deep Networks by Optimizing with Integrated Gradients - [Arxiv] [QA]
  • Full-Gradient Representation for Neural Network Visualization - [Arxiv] [QA]

April 2019

  • Segmentation is All You Need - [Arxiv] [QA]
  • A critical analysis of self-supervision, or what we can learn from a single image - [Arxiv] [QA]
  • TVQA+: Spatio-Temporal Grounding for Video Question Answering - [Arxiv] [QA]
  • DynamoNet: Dynamic Action and Motion Network - [Arxiv] [QA]
  • Free-form Video Inpainting with 3D Gated Convolution and Temporal PatchGAN - [Arxiv] [QA]
  • GraphNAS: Graph Neural Architecture Search with Reinforcement Learning - [Arxiv] [QA]
  • Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring - [Arxiv] [QA]
  • SelFlow: Self-Supervised Learning of Optical Flow - [Arxiv] [QA]
  • Self-Supervised Audio-Visual Co-Segmentation - [Arxiv] [QA]
  • Understanding Neural Networks via Feature Visualization: A survey - [Arxiv] [QA]
  • Document Expansion by Query Prediction - [Arxiv] [QA]
  • Deep Fusion Network for Image Completion - [Arxiv] [QA]
  • Semantically Aligned Bias Reducing Zero Shot Learning - [Arxiv] [QA]
  • HARK Side of Deep Learning -- From Grad Student Descent to Automated Machine Learning - [Arxiv] [QA]
  • Understanding the Behaviors of BERT in Ranking - [Arxiv] [QA]
  • Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting - [Arxiv] [QA]
  • Counterfactual Visual Explanations - [Arxiv] [QA]
  • The Geometry of Bayesian Programming - [Arxiv] [QA]
  • Focus Is All You Need: Loss Functions For Event-based Vision - [Arxiv] [QA]
  • CEDR: Contextualized Embeddings for Document Ranking - [Arxiv] [QA]
  • VORNet: Spatio-temporally Consistent Video Inpainting for Object Removal - [Arxiv] [QA]
  • wav2vec: Unsupervised Pre-training for Speech Recognition - [Arxiv] [QA]
  • ThumbNet: One Thumbnail Image Contains All You Need for Recognition - [Arxiv] [QA]
  • On zero-shot recognition of generic objects - [Arxiv] [QA]
  • Leveraging the Invariant Side of Generative Zero-Shot Learning - [Arxiv] [QA]
  • Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics - [Arxiv] [QA]
  • Detecting Human-Object Interactions via Functional Generalization - [Arxiv] [QA]
  • Data Shapley: Equitable Valuation of Data for Machine Learning - [Arxiv] [QA]
  • VideoBERT: A Joint Model for Video and Language Representation Learning - [Arxiv] [QA]
  • Creativity Inspired Zero-Shot Learning - [Arxiv] [QA]

March 2019

  • Interpreting Black Box Models via Hypothesis Testing - [Arxiv] [QA]
  • Wasserstein Dependency Measure for Representation Learning - [Arxiv] [QA]
  • Self-Supervised Learning via Conditional Motion Propagation - [Arxiv] [QA]
  • Simple Applications of BERT for Ad Hoc Document Retrieval - [Arxiv] [QA]
  • Generalized Convolution and Efficient Language Recognition - [Arxiv] [QA]
  • sharpDARTS: Faster and More Accurate Differentiable Architecture Search - [Arxiv] [QA]
  • Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set - [Arxiv] [QA]
  • Learning Correspondence from the Cycle-Consistency of Time - [Arxiv] [QA]
  • A Deep Look into Neural Ranking Models for Information Retrieval - [Arxiv] [QA]
  • Turbo Learning Framework for Human-Object Interactions Recognition and Human Pose Estimation - [Arxiv] [QA]
  • All You Need is a Few Shifts: Designing Efficient Convolutional Neural Networks for Image Classification - [Arxiv] [QA]
  • Pluralistic Image Completion - [Arxiv] [QA]
  • Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image - [Arxiv] [QA]
  • CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog - [Arxiv] [QA]
  • Self-Supervised Learning of 3D Human Pose using Multi-view Geometry - [Arxiv] [QA]
  • High-Fidelity Image Generation With Fewer Labels - [Arxiv] [QA]
  • Learning Latent Plans from Play - [Arxiv] [QA]
  • Lenses and Learners - [Arxiv] [QA]
  • Change Detection with the Kernel Cumulative Sum Algorithm - [Arxiv] [QA]
  • Stabilizing the Lottery Ticket Hypothesis - [Arxiv] [QA]
  • Differentiable Causal Computations via Delayed Trace - [Arxiv] [QA]
  • Semantic-Guided Multi-Attention Localization for Zero-Shot Learning - [Arxiv] [QA]

February 2019

  • Multi-Stage Self-Supervised Learning for Graph Convolutional Networks on Graphs with Few Labels - [Arxiv] [QA]
  • A Theoretical Analysis of Contrastive Unsupervised Representation Learning - [Arxiv] [QA]
  • From open learners to open games - [Arxiv] [QA]
  • Evaluating the Search Phase of Neural Architecture Search - [Arxiv] [QA]
  • Predicting city safety perception based on visual image content - [Arxiv] [QA]
  • SC-FEGAN: Face Editing Generative Adversarial Network with User's Sketch and Color - [Arxiv] [QA]
  • CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model - [Arxiv] [QA]
  • Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey - [Arxiv] [QA]
  • LS-Tree: Model Interpretation When the Data Are Linguistic - [Arxiv] [QA]
  • Towards Automatic Concept-based Explanations - [Arxiv] [QA]
  • Depthwise Convolution is All You Need for Learning Multiple Visual Domains - [Arxiv] [QA]
  • Collaborative Sampling in Generative Adversarial Networks - [Arxiv] [QA]
  • Parameter-Efficient Transfer Learning for NLP - [Arxiv] [QA]

January 2019

  • Compositionality for Recursive Neural Networks - [Arxiv] [QA]
  • Personalized Dialogue Generation with Diversified Traits - [Arxiv] [QA]
  • On the (In)fidelity and Sensitivity for Explanations - [Arxiv] [QA]
  • Revisiting Self-Supervised Visual Representation Learning - [Arxiv] [QA]
  • Diffusion Variational Autoencoders - [Arxiv] [QA]
  • Self-Supervised Generalisation with Meta Auxiliary Learning - [Arxiv] [QA]
  • Improving Sequence-to-Sequence Learning via Optimal Transport - [Arxiv] [QA]
  • Foreground-aware Image Inpainting - [Arxiv] [QA]
  • Passage Re-ranking with BERT - [Arxiv] [QA]
  • Automated Rationale Generation: A Technique for Explainable AI and its Effects on Human Perceptions - [Arxiv] [QA]
  • Detecting Overfitting of Deep Generative Networks via Latent Recovery - [Arxiv] [QA]
  • A Comprehensive Survey on Graph Neural Networks - [Arxiv] [QA]
  • Visualizing Deep Similarity Networks - [Arxiv] [QA]
  • EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning - [Arxiv] [QA]
  • A Theoretical Analysis of Deep Q-Learning - [Arxiv] [QA]