The next Information Science Breakfast meeting will be held this coming Friday, October 13. Our speaker will be John Abowd, Professor of ILR and IS, and a member of the Getting Connected project of the Institute for Social Sciences (ISS).
- Who: John Abowd, Professor of ILR and IS
- When: Friday, October 13, 10-11am (coffee and bagels included)
- Where: 301 College Ave., large conference room
Title: Synthetic Data: We Made the Numbers Up, but It's OK to Use Them
Abstract: Synthetic data, the practice of releasing scientific data that are subject to privacy and confidentiality protections by sampling from the joint posterior predictive distribution of the confidential data, is gaining traction in the official statistical community. Visit http://lehdmap.dsd.census.gov/ to see the U.S. Census Bureau's first official synthetic data product, which I helped to design. There is a fundamental tension between the extent of confidentiality protection and the analytic usefulness of the released data. Economists model this tradeoff by using a production possibility frontier for the two "goods"--privacy protection and information content. (Statisticians call this the risk-utility tradeoff, but I'll explain why I prefer to reserve "utility" to describe the data user's preferences and not the data provider's technology.) I've got lots of examples, but you should feel free to suggest some of your own problems. If you want more background, visit INFO 747 at http://www.vrdc.cornell.edu/info747/.
For more information, see http://www.cs.cornell.edu/~gl/is_breakfast_fall_2006.htm