Sumit Chopra

sumit [at] imagentechnologies [dot] com


Background

I am the head of AI Research at Imagen Technologies, a well-funded stealth startup working to transform healthcare using artificial intelligence. I am interested in advancing AI research, with a particular focus on deep learning and healthcare.

Before Imagen, I was a research scientist at Facebook AI Research (FAIR), where I worked on natural language understanding. I graduated with a Ph.D. in computer science from New York University under the supervision of Prof. Yann LeCun. My thesis proposed a first-of-its-kind neural network model for relational regression and served as the conceptual foundation for a startup modeling residential real estate prices. Following my Ph.D., I joined AT&T Labs – Research as a scientist in the machine learning department. There I focused on building novel deep learning models for speech recognition, natural language processing, computer vision, and other areas of machine learning, such as recommender systems, computational advertising, and ranking.

Google Scholar
LinkedIn



Publications

Selected Publications

Abstractive sentence summarization with attentive recurrent neural networks
Sequence level training with recurrent neural networks
Towards AI-complete question answering: A set of prerequisite toy tasks
Memory Networks
Discovering the Hidden Structure of House Prices with a Non-Parametric Latent Manifold Model
A Tutorial on Energy-Based Learning
Efficient Learning of Sparse Overcomplete Representations with Energy-Based Model
Learning a Similarity Measure Discriminatively with Applications to Face Verification

All Publications

Sam Wiseman, Sumit Chopra, Marc'Aurelio Ranzato, Arthur Szlam, Ruoyu Sun, Soumith Chintala, and Nicolas Vasilache. Training Language Models Using Target-Propagation. arXiv:1702.04770

Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, and Jason Weston. Learning Through Dialogue Interactions. To appear in International Conference on Learning Representations (ICLR) 2017. [Code+Data]

Jiwei Li, Alexander H. Miller, Sumit Chopra, Marc'Aurelio Ranzato, and Jason Weston. Dialogue Learning With Human-In-The-Loop. To appear in International Conference on Learning Representations (ICLR) 2017. [Code+Data]

Sumit Chopra, Michael Auli, and Alexander M. Rush. Abstractive sentence summarization with attentive recurrent neural networks. Proceedings of NAACL-HLT 2016, pages 93-98.

Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, and Jason Weston. Evaluating prerequisite qualities for learning end-to-end dialog systems. International Conference on Learning Representations (ICLR) 2016. [Data].

Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, and Wojciech Zaremba. Sequence level training with recurrent neural networks. International Conference on Learning Representations (ICLR) 2016.

Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston. The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations. International Conference on Learning Representations (ICLR) 2016. [Data].

Alexander M. Rush, Sumit Chopra, and Jason Weston. A neural attention model for abstractive sentence summarization. EMNLP 2015. [Code].

Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. Large-scale simple question answering with memory networks. arXiv:1506.02075. [Data].

Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, and Tomas Mikolov. Towards AI-complete question answering: A set of prerequisite toy tasks. International Conference on Learning Representations (ICLR) 2015. [Code+Data].

Tomas Mikolov, Armand Joulin, Sumit Chopra, Michael Mathieu, and Marc'Aurelio Ranzato. Learning longer memory in recurrent neural networks. International Conference on Learning Representations (ICLR) 2015.

Marc'Aurelio Ranzato, Arthur Szlam, Joan Bruna, Michael Mathieu, Ronan Collobert, and Sumit Chopra. Video (language) modeling: a baseline for generative models of natural videos. International Conference on Learning Representations (ICLR) 2015.

Jason Weston, Sumit Chopra, and Antoine Bordes. Memory Networks. International Conference on Learning Representations (ICLR) 2015.

Antoine Bordes, Sumit Chopra, and Jason Weston. Question answering with subgraph embeddings. EMNLP 2014.

Jason Weston, Sumit Chopra, and Keith Adams. #TagSpace: Semantic embeddings from hashtags. EMNLP 2014.

Sumit Chopra, Suhrid Balakrishnan, and Raghuraman Gopalan. DLID: Deep Learning for Domain Adaptation by Interpolating between Domains. ICML 2013 Workshop on Representation Learning, Atlanta, Georgia, USA, 2013.

Suhrid Balakrishnan and Sumit Chopra. Collaborative Ranking. Proceedings of the Fifth ACM International Conference on Web Search and Data Mining (WSDM) 2012.

Suhrid Balakrishnan, Sumit Chopra, David Applegate, and Simon Urbanek. Computational television advertising. IEEE 12th International Conference on Data Mining (ICDM) 2012, pages 71-80.

Suhrid Balakrishnan and Sumit Chopra. Two of a kind or the ratings game? Adaptive pairwise preferences and latent factor models. Frontiers of Computer Science 6(2), pages 197-208.

Piotr Mirowski, Sumit Chopra, Suhrid Balakrishnan, and Srinivas Bangalore. Feature-rich continuous language models for speech recognition. IEEE Spoken Language Technology Workshop (SLT) 2010, pages 241-246.

Diego Aragon, Andrew Caplin, Sumit Chopra, John Leahy, Yann LeCun, M. Scoffier, and J. Tracy. Reassessing FHA risk. National Bureau of Economic Research, March 2010.

Suhrid Balakrishnan, Sumit Chopra, and I. Dan Melamed. The business next door: Click-through rate modeling for local search. NIPS Workshop on Machine Learning in Online Advertising.

Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, and Yann LeCun. Factor Graphs for Relational Regression. Technical Report: TR2007-906, January 2007.

Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, and Yann LeCun. Discovering the Hidden Structure of House Prices with a Non-Parametric Latent Manifold Model. 13th International Conference on Knowledge Discovery and Data Mining (KDD), San Jose, CA, August 2007.

Yann LeCun, Sumit Chopra, Marc'Aurelio Ranzato, and Jie Huangfu. Energy-Based Models in Document Recognition and Computer Vision. Proceedings of the International Conference on Document Analysis and Recognition (ICDAR) 2007.

Marc'Aurelio Ranzato, Y-Lan Boureau, Sumit Chopra, and Yann LeCun. A Unified Energy Based Framework for Unsupervised Learning. Proceedings of the 2007 Conference on Artificial Intelligence and Statistics (AISTATS) 2007.

Neelima Gupta and Sumit Chopra. Output-Sensitive Algorithms for Optimally Constructing Upper Envelope of Straight Line Segments in Parallel. Journal of Parallel and Distributed Computing 2007.

Yann LeCun, Sumit Chopra, Raia Hadsell, Jie Huangfu, and Marc'Aurelio Ranzato. A Tutorial on Energy-Based Learning. Predicting Structured Outputs, Bakir et al. (eds), MIT Press 2006.

Marc'Aurelio Ranzato, Christopher Poultney, Sumit Chopra, and Yann LeCun. Efficient Learning of Sparse Overcomplete Representations with Energy-Based Model. Advances in Neural Information Processing Systems 19, in Scholkopf et al. (eds), MIT Press, Cambridge, MA, 2006.

Raia Hadsell, Sumit Chopra, and Yann LeCun. Dimensionality Reduction by Learning an Invariant Mapping. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), New York City, NY, June 2006.

Sumit Chopra, Raia Hadsell, and Yann LeCun. Learning a Similarity Measure Discriminatively with Applications to Face Verification. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, June 2005.

T. Agarwal, A. Agarwal, S. Chopra, A. Feldman, N. Kammenhuber, P. Krysta, and B. Vöcking. An Experimental Study of Different Strategies of DNS-Based Load Balancing. Proceedings of Euro-Par, Klagenfurt, Austria, August 2003.

Neelima Gupta, Sumit Chopra, and Sandeep Sen. Optimal Output-Sensitive Algorithms for Constructing Upper Envelope of Line Segments in Parallel. Proceedings of Foundations of Software Technology and Theoretical Computer Science (FSTTCS), Bangalore, India, December 2001.

Code

Training Language Models Using Target-Propagation
A neural attention model for abstractive sentence summarization
Learning Through Dialogue Interactions
Dialogue Learning With Human-In-The-Loop
bAbI Tasks and Other Datasets