Egocentric computer vision

Ego4D: Around the world in 3,000 hours of egocentric video
PAMI 2025
See also: [CVPR 2022 version]





Situated Cameras, Situated Knowledges: Towards an Egocentric Epistemology for Computer Vision
CVPR Ego4D/EPIC Workshop 2023
[paper]





Can Privacy Be Satisfying? On Improving Viewer Satisfaction for Privacy-Enhanced Photos Using Aesthetic Transforms
CHI 2019


DeepDiary: Lifelogging image captioning and summarization
Journal of Visual Communication and Image Representation 2018
See also: [ECCVW 2016 version]




From Coarse Attention to Fine-Grained Gaze: A Two-stage 3D Fully Convolutional Network for Predicting Eye Gaze in First Person Video
BMVC 2018
[website]

An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers
ICDL 2017




Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View
ICMI 2015
[poster]

Challenges in Running Wearable Camera-Related User Studies
ACM CSCW Workshop on The Future of Networked Privacy 2015

Privacy Behaviors of Lifeloggers using Wearable Cameras
UbiComp 2014
[video]


Reactive Security: Responding to Visual Stimuli from Wearable Cameras
UbiComp UPSIDE Workshop 2014

This Hand is My Hand: A Probabilistic Approach to Hand Disambiguation in Egocentric Video
CVPR Egovision Workshop 2014

Understanding Embodied Visual Attention in Child-Parent Interaction
ICDL 2013

3D scene modeling


HVPUNet: Hybrid-Voxel Point-cloud Upsampling Network
ICCV 2025

Deep learning-based 3D reconstruction from multiple images: A survey
Neurocomputing 2024





SfM with MRFs: Discrete-Continuous Optimization for Large-Scale Structure from Motion
PAMI 2013
[website]

Discrete-Continuous Optimization for Large-scale Structure from Motion
CVPR 2011
[website]

Privacy and security applications and implications


Modular Anti-counterfeit Tags Formed by Template-Assisted Self-Assembly of Plasmonic Nanocrystals and Authenticated by Machine Learning
Advanced Functional Materials 2024
[paper]


Plasmonic Anti-counterfeit Tags with High Encoding Capacity Rapidly Authenticated with Deep Machine Learning
ACS Nano 2021

Deep Neural Network-based Detection and Verification of Microelectronic Images
Journal of Hardware and Systems Security 2020

Privacy Norms and Preferences for Photos Posted Online
ACM Transactions on Computer-Human Interaction 2020


Can Privacy Be Satisfying? On Improving Viewer Satisfaction for Privacy-Enhanced Photos Using Aesthetic Transforms
CHI 2019

Viewer Experience of Obscuring Scene Elements in Photos to Enhance Privacy
CHI 2018


Understanding the Physical Safety, Security, and Privacy Concerns of People with Visual Impairments
IEEE Internet Computing 2017

Addressing supply chain risks of microelectronic devices through computer vision
AIPR 2017



Considering Privacy Implications of Assistive Devices for People with Visual Impairments
WISH 2016

Privacy Behaviors of Lifeloggers using Wearable Cameras
UbiComp 2014
[video]

mFingerprint: Privacy-preserving user modeling with multimodal mobile device footprints
SBP 2014

Reactive Security: Responding to Visual Stimuli from Wearable Cameras
UbiComp UPSIDE Workshop 2014

De-anonymizing users across heterogeneous social computing platforms
ICWSM 2013
[poster]

AI for Health

Preliminary findings regarding the association between patient demographics and ED experience scores across a regional health system: A cross sectional study using natural language processing of patient comments
International Journal of Medical Informatics 2025

Bridging Human Intuition and AI in Colorful Food Assessment
CHI 2025

Computer Vision for Dietary Assessment
CHI Workshop on Realizing AI in Healthcare: Challenges Appearing in the Wild 2021
[video]


Autonomous perception and navigation


Few-shot segmentation and Semantic Segmentation for Underwater Imagery
IROS 2023




Real-Time, Cloud-based Object Detection for Unmanned Aerial Vehicles
IEEE Robotic Computing 2017

Cloud-based parallel implementation of SLAM for mobile robots
ISC 2015

Image and video segmentation


Few-shot segmentation and Semantic Segmentation for Underwater Imagery
IROS 2023

Zero-Shot Video Object Segmentation with Co-Attention Siamese Networks
PAMI 2022

Segmenting Objects from Relational Visual Data
PAMI 2022

Automatic Annotation for Semantic Segmentation in Indoor Scenes
IROS 2019


Action and activity recognition in video





Zero-Shot Video Object Segmentation with Co-Attention Siamese Networks
PAMI 2022






Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View
ICMI 2015
[poster]

A Framework for Reliable Text-Based Indexing of Video
Symposium on Document Image Understanding Technology 2001


Evaluation of Methods for Detection and Localization of Text from Video
IAPR Workshop on Document Analysis Systems 2000

Computational models of human development




Learning the generative principles of a symbol system from limited examples
Cognition 2020


A View of Their Own: Capturing the Egocentric View of Infants and Toddlers with Head-Mounted Cameras
Journal of Visualized Experiments 2019

How do infants start learning object names in a sea of clutter?
CogSci 2019

Towards Detecting Dyslexia in Children’s Handwriting Using Neural Networks
ICML Workshop on AI for Social Good 2019
[website]



An Egocentric Perspective on Active Vision and Visual Object Learning in Toddlers
ICDL 2017




Understanding Embodied Visual Attention in Child-Parent Interaction
ICDL 2013

Vision-language models


How do infants start learning object names in a sea of clutter?
CogSci 2019

DeepDiary: Lifelogging image captioning and summarization
Journal of Visual Communication and Image Representation 2018
See also: [ECCVW 2016 version]

Diverse Beam Search for Improved Description of Complex Scenes
AAAI Conference on Artificial Intelligence 2018
[paper]

Observing and modeling human behavior


Observing Pianist Accuracy and Form with Computer Vision
WACV 2019

Utilizing remote sensing and big data to quantify conflict intensity: The Arab Spring as a case study
Applied Geography 2018

From Coarse Attention to Fine-Grained Gaze: A Two-stage 3D Fully Convolutional Network for Predicting Eye Gaze in First Person Video
BMVC 2018
[website]

mFingerprint: Privacy-preserving user modeling with multimodal mobile device footprints
SBP 2014

This Hand is My Hand: A Probabilistic Approach to Hand Disambiguation in Egocentric Video
CVPR Egovision Workshop 2014

Networks of Landmarks, Photos, and People
Leonardo 2011



Psychophysical study of image orientation perception
Spatial Vision 2003

Neurosymbolic approaches to AI

Run Like a Neural Network, Explain Like k-Nearest Neighbor
IJCAI 2025

Extracting Features with Deep Learning for Ensemble-Driven Case-Based Classification
ICCBR 2025

Learning Case Features with Proxy-Guided Deep Neural Networks
ICCBR 2025

Network Implementation of CBR: Case Study of a Neural Network K-NN
ICCBR 2024

Extracting Indexing Features for CBR from Deep Neural Networks: A Transfer Learning Approach
ICCBR 2024

Examining the Impact of Network Architecture on Extracted Feature Quality for CBR
ICCBR 2023

Enhancing Case-Based Reasoning with Neural Networks
Compendium of Neurosymbolic Artificial Intelligence 2023

Case Adaptation with Neural Networks: Capabilities and Limitations
ICCBR 2022

Extracting Case Indices from Convolutional Neural Networks: A Comparative Study
ICCBR 2022

Generation and Evaluation of Creative Images from Limited Data: A Class-to-Class VAE Approach
ICCC 2022
[video]

Generating Counterfactual Images: Toward a C2C-VAE Approach
ICCBR Workshop on CBR for Explanation of Intelligent Systems 2022

Learning Adaptations for Case-Based Classification: A Neural Network Approach
ICCBR 2021

On Combining Knowledge-Engineered and Network-Extracted Features for Retrieval
ICCBR 2021
[video]

Applying the Case Difference Heuristic to Learn Adaptations from Deep Network Features
IJCAI Workshop on Deep Learning, Case-based Reasoning, and AutoML 2021

Supporting Case-Based Reasoning with Neural Networks: An Illustration for Case Adaptation
AAAI-MAKE 2021

Bringing Case Based Reasoning to Deep Learning
International Conference on Case-Based Reasoning Special Track on Challenges and Promises 2020

Image and object recognition









Recognizing landmarks in large-scale social image collections
Visual Analysis and Geolocalization of Large Scale Imagery 2016
See also: [ICCV 2009 version]

Human pose estimation through composite multi-layer models
Signal Processing 2015
[website]
See also: [BMVC 2012 version]



This Hand is My Hand: A Probabilistic Approach to Hand Disambiguation in Egocentric Video
CVPR Egovision Workshop 2014


Color Object Detection using Spatial-Color Joint Probability Functions
TIP 2006
See also: [CVPR 2004 version]


Spatial Priors for Part-Based Recognition using Statistical Models
CVPR 2005
See also: [Expanded LNCS version]

Extraction of special effects caption text events from digital video
IJDAR 2002
[paper]
See also: [ICDAR 2001 version]

Survey papers


Deep learning-based 3D reconstruction from multiple images: A survey
Neurocomputing 2024

Contributions to High-Performance Big Data Computing
Future Trends of HPC in a Disruptive Scenario 2019

Sentiment/Subjectivity Analysis Survey for Languages other than English
Social Network Analysis and Mining 2016
[paper]

Networks of Landmarks, Photos, and People
Leonardo 2011

AI and Art, Humanities, and Music


“It Was Really All About Books:” Speech-like Techno-Masculinity in the Rhetoric of Dot-Com Era Web Design Books
ACM Transactions on Computer-Human Interaction 2023
[video]

Investigating the Homogenization of Web Design: A Mixed-Methods Approach
CHI 2021
[video]
See also: [The Conversation article]

Studying Empirical Color Harmony in Design
CVPR Workshop on Computer Vision for Fashion, Art, and Design 2020

Observing Pianist Accuracy and Form with Computer Vision
WACV 2019

A deep study into the history of web design
WebSci 2017
[website]

Understanding the Aesthetic Evolution of Websites: Towards a Notion of Design Periods
CHI 2017

AI for Ecology and Polar Science

Automated Ice-Bottom Tracking of 2D and 3D Ice Radar Imagery Using Viterbi and TRW-S
JSTARS 2019

World Heritage in danger: Big data and remote sensing can help protect sites in conflict zones
Global Environmental Change 2019
[paper]


Crossover Analysis and Automated Layer-Tracking Assessment of the Extracted DEM of the Basal Topography of the Canadian Arctic Archipelago Ice-Cap
IEEE Radar Conference 2018

Automatic estimation of ice bottom surfaces from radar imagery
ICIP 2017
[website]

Where have all the people gone? Enhancing global conservation using night lights and social media
Ecological Applications 2015


Estimating Bedrock and Surface Layer Boundaries and Confidence Intervals in Ice Sheet Radar Imagery using MCMC
ICIP 2014
[website]


Automatic Near Surface Estimation from Snow Radar Imagery
IGARSS 2013

Layer-finding in radar echograms using probabilistic graphical models
ICPR 2012
[website]

Mining Photo-sharing Websites to Study Ecological Phenomena
WWW 2012
[website]

Accessibility

Conveying Situational Information to People with Visual Impairments
CHI Workshop on Situationally-Induced Impairments and Disabilities in Mobile Interaction 2019
[paper]

Understanding the Physical Safety, Security, and Privacy Concerns of People with Visual Impairments
IEEE Internet Computing 2017


Considering Privacy Implications of Assistive Devices for People with Visual Impairments
WISH 2016

Science of Science



An efficient system to fund science: from proposal review to peer-to-peer distributions
Scientometrics 2017

Challenges in Running Wearable Camera-Related User Studies
ACM CSCW Workshop on The Future of Networked Privacy 2015

Document and text recognition



Extraction of special effects caption text events from digital video
IJDAR 2002
[paper]
See also: [ICDAR 2001 version]

A Framework for Reliable Text-Based Indexing of Video
Symposium on Document Image Understanding Technology 2001


Evaluation of Methods for Detection and Localization of Text from Video
IAPR Workshop on Document Analysis Systems 2000

Machine learning

Controlling the Quality of Distillation in Response-Based Network Compression
AAAI International Workshop on Practical Deep Learning in the Wild 2022
[paper]


Generalized Capsule Networks with Trainable Routing Procedure
International Conference on Machine Learning Workshop on Generalization 2019
[paper]

Education and AI

Hello Research! Developing an Intensive Research Experience for Undergraduate Women
SIGCSE 2019
[website]

Social robotics and human-robot interaction