Scale
06/11/2018 - 16:30 to 17:10
Maschinenhaus
long talk (40 min)
Intermediate
Session abstract:
Before releasing a public dataset, practitioners need to strike a balance between the data's utility and the protection of individuals. In this talk we'll move from theory to real-life practice while handling massive public datasets. We'll showcase newly available tools that help with PII detection, and put concepts like k-anonymity and l-diversity to practical use.
Related research: "Considerations for Sensitive Data within Machine Learning Datasets" - https://cloud.google.com/solutions/sensitive-data-and-ml-datasets
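For readers unfamiliar with the two concepts named in the abstract, here is a minimal sketch, not taken from the talk. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records, and (in the simple "distinct" form) l-diverse when each such group contains at least l different sensitive values. The function names, column names, and sample rows below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the
    same quasi-identifier values. Assumes rows is non-empty."""
    group_sizes = defaultdict(int)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        group_sizes[key] += 1
    return min(group_sizes.values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values found in
    any quasi-identifier group (distinct l-diversity). Assumes rows is
    non-empty."""
    group_values = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        group_values[key].add(row[sensitive])
    return min(len(values) for values in group_values.values())


# Hypothetical, already-generalized sample records.
rows = [
    {"zip": "101**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "101**", "age": "30-39", "diagnosis": "asthma"},
    {"zip": "101**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "102**", "age": "40-49", "diagnosis": "diabetes"},
    {"zip": "102**", "age": "40-49", "diagnosis": "diabetes"},
]

k = k_anonymity(rows, ["zip", "age"])
l = l_diversity(rows, ["zip", "age"], "diagnosis")
print(f"k = {k}, l = {l}")
```

In this sample the second group has two records but only one diagnosis, so the data is 2-anonymous yet only 1-diverse: knowing someone is in that group reveals their diagnosis, which is the gap l-diversity is meant to expose.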