About


I. Motivation

Large-scale machine learning and data mining often require sifting through sizeable quantities of data to identify relevant records before any experimentation can be done. This preprocessing task can be exceedingly costly if the data is not well organized or easily accessible. Our case study for this project is the IU Computer Vision Lab's image data collection. The lab maintains a collection of millions of images and associated metadata downloaded from Flickr and uses these images in machine learning experiments. Often only a subset of the images has attributes appropriate for a given experiment (e.g. images in the geographic region of interest or with a certain text tag). Selecting these images from the whole collection is currently done with specialized scripts, written on a case-by-case basis, that navigate the file system. In many experiments, images also need to be annotated with additional labels that depend on the study. Two examples of such labelings are "contains snow" and "is in a city". Streamlining these processes would allow for more agile experimentation.

II. Project Statement

The Image Collaborative Labelling And Semantic Search System seeks to improve the organization and usability of the image data collection for the IU Computer Vision Lab. The lab uses this collection of millions of Flickr images and associated metadata in machine learning experiments and data mining, and identifying the relevant records for each experiment is costly when the data is not well organized or easily accessible.

The application lets users upload image data to a database together with the attributes relevant to a given experiment (e.g. geographic location or text tags). It improves the organization and usability of the image data collection by letting users add new image data to the dataset, modify existing image data, search by metadata and labelings, download selected images, create new labelings, and label existing images. The framework also supports collaborative labeling of images.

III. Development Team Members

  • Stefan Lee - steflee@indiana.edu
  • Rosy Agarwal - rosyagar@indiana.edu
  • Nikita Pandey - npandey@indiana.edu
  • Tanghong Qiu - taqiu@indiana.edu