The IU Computer Vision Lab investigates and develops advanced statistical and machine learning techniques for automatically analyzing, understanding, and organizing visual information. Our applications include recognizing objects in consumer images, analyzing human activity in video, discovering patterns in large scientific datasets, reconstructing 3-d models of world landmarks, and even studying visual attention in toddlers.
Selected Recent Papers
News and Updates
David J. Crandall
Director of Graduate Studies, Computer Science
Grant Thornton Scholar
Luddy School of Informatics, Computing, and Engineering
Mailing address: 901 E Tenth St
Physical address: 611 N Park Ave
(one block west of Informatics West)
Bloomington, IN 47408
Fall 2019 office hours:
Mondays 5-6pm, 611 N Park
Sundays 10pm-11pm, online
- Sept 12, 2011: Awarded IU Faculty Research Support Program grant to use first-person cameras for studying autistic and typically-developing children, with Chen Yu
- June 22, 2011: Our paper on 3d reconstruction won runner-up best paper at CVPR 2011!
- Dec 21, 2010: Our PNAS paper on inferring social ties from geo-tags featured in New Scientist and BBC World Service
- Aug 24, 2010: Awarded grant from the IU D2I Center for mining social media for ecology
- Aug 1, 2010: Joining the faculty of Indiana University School of Informatics and Computing!
I work in computer vision, the area of computer science concerned with automatically inferring semantic meaning from images -- teaching computers to "see." More generally, I am interested in problems that involve analyzing and modeling large amounts of uncertain data, like mining data from social networking websites.
I'm an Associate Professor in the School of Informatics, Computing, and Engineering at Indiana University, where I direct the IU Computer Vision Lab. I'm on the faculty of the Computer Science, Informatics, Cognitive Science, and Data Science programs, and I'm adjunct in the Department of Statistics. My work has been funded by grants and contracts from the National Science Foundation (including through a CAREER award), Google, Kodak, the Department of Defense, and Indiana University.
Before joining IU, I was a graduate student and postdoc in Computer Science at Cornell University. I worked with Professor Dan Huttenlocher on statistical part-based object recognition algorithms, and with Professor Jon Kleinberg on modeling and mining data from online social networks. I completed the Ph.D. degree in August 2008.
Before Cornell, I spent two years in the research labs of Eastman Kodak Company. There I worked mostly on image understanding and enhancement algorithms for both consumer and medical images.
As an undergraduate and Masters student at Penn State University, I worked with Professor Rangachar Kasturi on content-based video indexing, and specifically on detecting, tracking and recognizing text in video (both captions and text appearing naturally in a scene).
- "SfM with MRFs: Discrete-Continuous Optimization for Large-Scale Structure from Motion," in PAMI 2013 (with A. Owens, N. Snavely, D. Huttenlocher) [pdf] [project website]
- "Discrete-Continuous Optimization for Large-Scale Structure from Motion," in CVPR 2011 (with A. Owens, N. Snavely, D. Huttenlocher) [pdf] [project website] Runner-up best paper!
First-person and opportunistic imagery
- "Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos," in ECCV 2018 (with M. Xu, C. Fan, Y. Wang, M. Ryoo) [pdf]
- "From Coarse Attention to Fine-Grained Gaze: A Two-stage 3D Fully Convolutional Network for Predicting Eye Gaze in First Person Video," in BMVC 2018 (with Z. Zhang, S. Bambach, C. Yu) [pdf]
- "Estimating Head Motion from Egocentric Vision," in ICMI 2018 (with S. Tsutsui, S. Bambach, C. Yu)
- "Identifying first-person camera wearers in third-person videos," in CVPR 2017 (with C. Fan, J. Lee, M. Xu, K.K. Singh, Y.J. Lee, M. Ryoo) [pdf]
- "DeepDiary: Automatically Captioning Lifelogging Image Streams," in ECCV EPIC 2016 (with C. Fan) [pdf] [www with code and data]
- "Enhancing Lifelogging Privacy by Detecting Screens," in CHI 2016 (with M. Korayem, R. Templeman, D. Chen, A. Kapadia) [pdf] [www] Best paper honorable mention!
- "Lending A Hand: Detecting Hands and Recognizing Activities in Complex Egocentric Interactions," in ICCV 2015 (with S. Bambach, S. Lee, C. Yu) [pdf] [www] [dataset]
- "Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View," in ICMI 2015 (with S. Bambach, C. Yu) [pdf] [dataset]
- "PlaceAvoider: Steering First-Person Cameras away from Sensitive Spaces," in NDSS 2014 (with R. Templeman, M. Korayem, A. Kapadia) [pdf]
- "This Hand is My Hand: A Probabilistic Approach to Hand Disambiguation in Egocentric Video," in CVPR Workshop on Egocentric Vision 2014 (with S. Lee, S. Bambach, J. Franchak, C. Yu) [pdf] Best paper!
- "PlaceRaider: Virtual Theft in Physical Spaces with Smartphones," in NDSS 2013 (with R. Templeman, Z. Rahman, A. Kapadia) [pdf] [project website]
- "Fully-Coupled Two-Stream Spatiotemporal Networks for Extremely Low Resolution Action Recognition," in WACV 2018 (with M. Xu, A. Sharghi, X. Chen)
- "Vehicle Recognition with Constrained Multiple Instance SVMs," in WACV 2014 (with K. Duan, L. Marchesotti) [pdf]
- "Discovering localized attributes for fine-grained recognition," in CVPR 2012 (with K. Duan, D. Parikh, K. Grauman) [pdf] [project website]
- "A Multi-layer Composite Model for Human Pose Estimation," in BMVC 2012 (with K. Duan, D. Batra) [pdf] [project website]
- "Landmark classification in large-scale image collections," in ICCV 2009 (with Y. Li and D. Huttenlocher) [pdf] [expanded book chapter]
- "Composite models of objects and scenes for category recognition," in CVPR 2007 (with D. Huttenlocher) [pdf]
- "Weakly-supervised learning of part-based spatial models for visual object recognition," in ECCV 2006 (with D. Huttenlocher) [pdf] [source code]
- "Spatial Priors for Part-Based Recognition using Statistical Models," in CVPR 2005 (with P. Felzenszwalb and D. Huttenlocher) [pdf] [expanded LNCS version] [source code]
- "Robust Color Object Detection using Spatial-Color Joint Probability Functions," in IEEE Transactions on Image Processing, 2006 (with J. Luo) [CVPR 2004 version]
- "Part-based statistical models for visual object class recognition," 2008 (Ph.D. thesis) [pdf]
Mining and modeling large-scale photo collections
- "A Unified Model for Near and Remote Sensing," in ICCV 2017 (with S. Workman, M. Zhai, N. Jacobs) [pdf]
- "Tracking Natural Events through Social Media and Computer Vision," in ACM Multimedia 2016 (with J. Wang, M. Korayem, S. Blanco) [pdf]
- "Linking Past to Present: Discovering Style in Two Centuries of Architecture," in ICCP 2015 (with S. Lee, N. Maisonneuve, A. Efros, J. Sivic)
See also an online demo of some results.
- "Predicting Geo-informative Attributes in Large-scale Image Collections using Convolutional Neural Networks," in WACV 2015 (with S. Lee, H. Zhang) [pdf]
- "Multimodal Learning in Loosely-organized Web Images," in CVPR 2014 (with K. Duan, D. Batra) [pdf]
- "Observing the natural world with Flickr," in ICCV Workshop on Computer Vision for Converging Perspectives, 2013 (with J. Wang, M. Korayem) [pdf] Best paper!
- "Modeling people and places with internet photo collections," in Communications of the ACM (with N. Snavely) [pdf]
- "Mining photo-sharing websites to study ecological phenomena," in WWW 2012 (with H. Zhang, M. Korayem, G. LeBuhn) [pdf] [project website]
- "Beyond co-occurrence: Discovering and visualizing tag relationships from geo-spatial and temporal similarities," in WSDM 2012 (with H. Zhang, M. Korayem, E. You) [pdf] [project website]
- "Mapping the World's Photos," in WWW 2009 (with L. Backstrom, D. Huttenlocher, J. Kleinberg) [pdf] Runner-up best paper!
See also a gallery of automatically-generated maps.
Mining and modeling online social networks
- "Utilizing remote sensing and big data to quantify conflict intensity: The Arab Spring as a case study," in Applied Geography 2018 (with N. Levin, S. Ali) [pdf]
- "Where have all the people gone? Enhancing global conservation using night lights and social media," in Ecological Applications 2015 (with N. Levin, S. Kark) [pdf]
- "De-anonymizing users across heterogeneous social computing platforms," in ICWSM 2013 (with M. Korayem) [pdf] [project website]
- "Inferring Social Ties from Geographic Coincidences," in Proc. National Academy of Sciences (PNAS), 8 December 2010 (with L. Backstrom, D. Cosley, S. Suri, D. Huttenlocher, J. Kleinberg) [pdf]
- "Feedback Effects between Similarity and Social Influence in Online Communities," in KDD 2008 (with D. Cosley, J. Kleinberg, D. Huttenlocher, S. Suri) [pdf]
Image privacy attitudes and applications
- "Viewer Experience of Obscuring Scene Elements in Photos to Enhance Privacy," in CHI 2018 (with R. Hasan, E. Hassan, Y. Li, K. Caine, R. Hoyle, A. Kapadia) [pdf]
- "Cartooning for enhanced privacy in lifelogging and streaming video," in CVPR CV-COPS 2017 (with E. Hassan, R. Hasan, P. Shaffer, A. Kapadia) [pdf]
- "Addressing physical safety, security, and privacy for people with visual impairments," in SOUPS 2016 (with T. Ahmed, P. Shaffer, K. Connelly, A. Kapadia) [pdf]
- "Sensitive Lifelogs: A Privacy Analysis of Photos from Wearable Cameras," in CHI 2015 (with R. Hoyle, R. Templeman, D. Anthony, A. Kapadia) [pdf]
- "Privacy Concerns and Behaviors of People with Visual Impairments," in CHI 2015 (with T. Ahmed, R. Hoyle, K. Connelly, A. Kapadia) [pdf]
- "Privacy Behaviors of Lifeloggers using Wearable Cameras," in Ubicomp 2014 (with R. Hoyle, R. Templeman, S. Armes, D. Anthony, A. Kapadia) [pdf]
Human vision and learning
- "Exploring inter-observer differences in first-person object views using deep learning models," in ICCV MBCCV workshop, 2017 (with S. Bambach, Z. Zhang, C. Yu) [pdf]
- "An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers," in ICDL 2017 (with S. Bambach, L. Smith, C. Yu) [pdf]
- "Active Viewing in Toddlers Facilitates Visual Object Learning: An Egocentric Vision Approach," in CogSci 2016 (with S. Bambach, L. Smith, C. Yu) [pdf]
- "Objects in the Center: How the Infant's Body Constrains Infant Scenes," in ICDL 2016 (with S. Bambach, L. Smith, C. Yu) [pdf] Best paper!
- "Detecting Hands in Children's Egocentric Views to Understand Embodied Attention during Social Interaction," in CogSci 2014 (with S. Bambach, J. Franchak, C. Yu) [pdf]
- "Understanding embodied visual attention in child-parent interaction," in ICDL 2013 (with S. Bambach, C. Yu) [pdf]
- "Psychophysical study of image orientation perception," in Spatial Vision, 2003, pp. 429-457. (with J. Luo, A. Singhal, M. Boutell and R. Gray.) [pdf]
- "Diverse Beam Search for Improved Description of Complex Scenes," AAAI 2018 (with A. Vijayakumar, M. Cogswell, R. Selvaraju, Q. Sun, S. Lee, D. Batra)
- "Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles," NIPS 2016 (with S. Lee, S. Purushwalkam, M. Cogswell, V. Ranjan, D. Batra) [pdf]
- "Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models," arXiv 2016 (with A. Vijayakumar, M. Cogswell, R. Selvaraju, Q. Sun, S. Lee, D. Batra) [arXiv link]
- "Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks," arXiv 2015 (with S. Lee, S. Purushwalkam, M. Cogswell, D. Batra) [arXiv link]
Computer vision for polar science
- "Multi-Task Spatiotemporal Neural Networks for Structured Surface Reconstruction," in WACV 2018 (with M. Xu, C. Fan, J. Paden, G. Fox) [pdf]
- "Automatic estimation of ice bottom surfaces from radar imagery," in ICIP 2017 (with M. Xu, G. Fox, J. Paden) [pdf]
- "Estimating bedrock and surface layer boundaries and confidence intervals in ice sheet radar imagery using MCMC," in ICIP 2014 (with S. Lee, J. Mitchell, G. Fox) [pdf] [project website]
- "Layer-finding in radar echograms using probabilistic graphical models," in ICPR 2012 (with G. Fox, J. Paden) [pdf] [project website]
Text and document analysis
- "A Data Driven Approach for Compound Figure Separation Using Convolutional Neural Networks," in ICDAR 2017 (with S. Tsutsui) [pdf] [project website]
- "Extraction of special effects caption text events from digital video," in International Journal on Document Analysis and Recognition, 2002 (with S. Antani and R. Kasturi) [pdf] [M.S. thesis version] [ICDAR01 version]
Other interests :)
- "A deep study into the history of web design," in WebSci 2017 (with B. Doosti, N. Su) [pdf] [project website]
- "Understanding the Aesthetic Evolution of Websites: Towards a Notion of Design Periods," in CHI 2017 (with W. Chen, N. Su) [pdf]
- "From funding agencies to scientific agency: Collective allocation of science funding as an alternative to peer review," in EMBO Reports, 2014 (with J. Bollen, D. Junk, Y. Ding, K. Borner) [pdf]
- "Learning visual features for the Avatar Captcha Recognition Challenge," in ICMLA 2012 (with M. Korayem, A. Mohammed, R. Yampolskiy) [pdf]
- "Subjectivity and Sentiment Analysis of Arabic: A Survey," in AMLTA, 2012 (with M. Korayem, M. Abdul-Mageed) [pdf]
- CS B657, Computer Vision (online and hybrid sections)
- CS B551, Elements of Artificial Intelligence (Fall 2015, Fall 2016, Spring 2017, Fall 2017, Fall 2018)
- CS B657, Computer Vision (Fall 2010, Spring 2012, Spring 2014, Spring 2016, Spring 2017, Spring 2018, Spring 2019)
- CS B490/B659, Image processing and recognition (Spring 2015)
- Info I210, Information Infrastructure I (Fall 2014)
- Info I399, Research Methods for Undergraduates (Fall 2013)
- CS B553, Probabilistic Approaches to Artificial Intelligence (Spring 2013)
- Info I427, Search Informatics (Fall 2011, Spring 2011, Fall 2012, Fall 2013, Fall 2014, Fall 2015)
- Info 2950, Mathematical Methods for Information Science, Fall 2008 (at Cornell)
- CS 113, Introduction to C, Fall 2007 and Spring 2008 (at Cornell)
- CS 211, Algorithms and Data Structures in Java, Summer 2007 (at Cornell)
- T.A. for CS 664, Computer Vision, Fall 2003 (at Cornell)
- CSE 275, Digital Design Lab, Fall 2000 and Spring 2001 (at Penn State)
- Automatic accent restoration in Spanish text, Spring 2005 course project for CS 674
- An Inverse Discrete Cosine Transform (IDCT) unit in VLSI, Fall 1999 course project for CSE 477
- A rudimentary hardware MPEG video decoder, Spring 1999 course project for CSE 471
My current preferred winter activity is swimming in IU's fantastic pools. My unofficial hobby is to swim in a public lap pool in as many cities of the world as possible (so far: Amsterdam, Barcelona, Beijing, Blacksburg, Bloomington, Boulder, Chicago, Columbus, Copenhagen, Dallas, D.C., Denver, Detroit, Erie, Fort Collins, Glasgow, Hong Kong, Honolulu, Indianapolis, Ithaca, Jamestown, Jerusalem, Kailua-Kona, Las Vegas, Los Angeles, Madrid, Minneapolis, Montreal, Munich, Nashville, New Orleans, Newcastle upon Tyne, Paris, Phoenix, Quebec City, Salt Lake City, San Diego, San Jose, Santiago, Seoul, Shanghai, State College, Sydney, Tokyo, Vancouver, Venice).
I took piano for 12 years as a child and recently started picking it back up again. I've also been trying to learn organ. Since I can't produce great music myself, I'm happy to live just a few blocks from the amazing IU Jacobs School of Music.
I'm on a perpetual quest to become fluent in Spanish. In addition to taking classes in high school and throughout college and graduate school, I've studied abroad in Spain and Mexico, and I've also traveled to Puerto Rico, Chile, and El Salvador. Despite all that effort, my Spanish is still remarkably lousy. :)
If you like pretty pictures, you might check out my dad's amazing bird photos. My own household flock includes Pichu, Ron, Sally, Betsy, and Sebastian; alumni are Toto, Zuma, Rameses, Rameses II, and, of course, Cocoa.