Defense Event

Reducing End-User Burden in Everyday Data Organization

Li Eric Qian

 
Tuesday, April 23, 2013
10:00am - 12:00pm
3725 BBB

 

About the Event

As digital data permeates every aspect of daily life, end-users find it appealing to organize their everyday data electronically. People are already used to managing personal data such as contact books and calendars on electronic devices, and the desire to organize more information on the computer is growing. Rather than record shopping lists and recipes on notes stuck to the refrigerator, a household would prefer to store this information on their smartphones to bring to the supermarket. As online structured data sources such as Freebase and BigTable flourish, end-users would also like to leverage these sources to create their own data collections, such as favorite movie libraries and travel wishlists.
However, there is a major barrier to end-users organizing their everyday data electronically. The user must first design a database for her original data, and then continuously incorporate new data sources into that database. This process involves various cognitive and operational burdens. First, when designing her data collection, the user must abstract her mental model of her real-life data into a reasonable database schema. Second, when incorporating external data sources, she must understand the semantics of the external data and transform the data from those sources into her own collection. Third, if the user wants to filter the data, she must understand and specify the selection condition. Finally, when existing sources are updated or additional sources are added, she must understand these updates and fuse them into her data collection. This dissertation introduces several approaches that reduce these burdens for end-users organizing their everyday data.
To ease the birth pains of creating a new database, the dissertation proposes a system with a direct manipulation interface and user-friendly operators with which the end-user can easily design and evolve her data schema. To facilitate the incorporation of external data sources, a sample-driven schema mapping approach is introduced, also with a direct manipulation interface. Using this approach, the user simply provides sample instances in her collection, and the system automatically deduces the desired schema mapping from the external sources to her own collection. In a similar vein, the dissertation proposes an example-driven approach that helps the user derive selection conditions. Finally, to help the user fuse source data updates into her own collection, the dissertation proposes a technique that automatically updates the user’s data collection in response to external source changes by performing efficient incremental information integration.

Additional Information

Sponsor: H V Jagadish

Open to: Public