|
Project Information
|
Usher

According to the abstract: Data quality is a critical problem in modern databases. Data entry forms present the first and arguably best opportunity for detecting and mitigating errors, but there has been little research into automatic methods for improving data quality at entry time. Usher is an end-to-end system for form design, entry, and data quality assurance. Using previous form submissions, Usher learns a probabilistic model over the questions of the form. Usher then applies this model at every step of the data entry process to improve data quality. Before entry, it induces a form layout that captures the most important data values of a form instance as quickly as possible, and simplifies complex error-prone questions. During entry, it dynamically adapts the form to the values being entered by providing real-time interface feedback, re-asking questions with dubious responses, and simplifying questions via reformulation. After entry, it revisits question responses that it deems likely to have been entered incorrectly by re-asking the question or a reformulation of the question. Usher can improve data quality considerably at a reduced cost when compared to current practice.

Academic papers
Authors |