List of implementations of differentially private analyses

Since the advent of differential privacy, a number of systems supporting differentially private data analyses have been implemented and deployed. This article tracks real-world deployments, production software packages, and research prototypes.

Real-world deployments

Name | Organization | Year Introduced | Notes | Still in use?
--- | --- | --- | --- | ---
OnTheMap: Interactive tool for exploration of US income and commute patterns.[1][2] | US Census Bureau | 2008 | First deployment of differential privacy | Yes
RAPPOR in Chrome Browser to collect security metrics[3][4] | Google | 2014 | First widespread use of local differential privacy | No
Analytics to improve QuickType and emoji suggestions, Spotlight deep link suggestions, and Lookup Hints in Notes; health type usage estimates; Safari energy drain statistics; Autoplay intent detection (also in Safari)[5] | Apple | 2017 | | Yes
Application telemetry[6] | Microsoft | 2017 | Application usage statistics in Microsoft Windows. | Yes
Flex: A SQL-based system developed for internal Uber analytics[7][8] | Uber | 2017 | | Unknown
2020 Census[9] | US Census Bureau | 2018 | | Yes
Audience Engagement API[10] | LinkedIn | 2020 | | Yes
Labor Market Insights[11] | LinkedIn | 2020 | | Yes
COVID-19 Community Mobility Reports[12] | Google | 2020 | | Unknown
Advertiser Queries[13] | LinkedIn | 2020 | |
U.S. Broadband Coverage Data Set[14] | Microsoft | 2021 | | Unknown
College Scorecard Website | IRS and Dept. of Education | 2021 | | Unknown
Ohm Connect[15] | Recurve | 2021 | |
Live Birth Dataset[16][17] | Israeli Ministry of Health | 2024 | | Yes

Production software packages

These software packages purport to be usable in production systems. They are split into two categories: those focused on answering statistical queries with differential privacy, and those focused on training machine learning models with differential privacy.
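
To make the first category concrete, the following is a minimal sketch of a differentially private count using the Laplace mechanism, a standard textbook technique. It is not taken from any package listed here: the function names `laplace_noise` and `dp_count` and the sample data are hypothetical, and the sketch ignores concerns such as floating-point safety and privacy budget accounting.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent Exponential(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages of 8 people; count how many are 40 or older.
    ages = [23, 35, 41, 52, 67, 29, 44, 38]
    rng = random.Random()
    print(dp_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy and a noisier answer; each call draws fresh noise, so repeated queries on the same data consume additional privacy budget.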

Statistical analyses

Name | Developer | Year Introduced | Notes | Still maintained?
--- | --- | --- | --- | ---
Google's differential privacy libraries[18] | Google | 2019 | Building block libraries in Go, C++, and Java; end-to-end framework in Go.[19] | Yes
OpenDP[20] | Harvard, Microsoft | 2020 | Core library in Rust,[21] SDK in Python with an SQL interface. | Yes
Tumult Analytics[22] | Tumult Labs[23] | 2022 | Python library, running on Apache Spark. | Yes
PipelineDP[24] | Google, OpenMined[25] | 2022 | Python library, running on Apache Spark, Apache Beam, or locally. | Yes
PSI (Ψ): A Private data Sharing Interface | Harvard University Privacy Tools Project[26] | 2016 | | No
TopDown Algorithm[27] | United States Census Bureau | 2020 | Production code used in the 2020 US Census. | No

Machine learning

Name | Developer | Year Introduced | Notes | Still maintained?
--- | --- | --- | --- | ---
Diffprivlib[28] | IBM[29] | 2019 | Python library. | Yes
TensorFlow Privacy[30][31] | Google | 2019 | Differentially private training in TensorFlow. | Yes
Opacus[32] | Meta | 2020 | Differentially private training in PyTorch. | Yes

Research projects and prototypes

Name | Citation | Year Published | Notes
--- | --- | --- | ---
PINQ: An API implemented in C#. | [33] | 2010 |
Airavat: A MapReduce-based system implemented in Java hardened with SELinux-like access control. | [34] | 2010 |
Fuzz: Time-constant implementation in Caml Light of a domain-specific language. | [35] | 2011 |
GUPT: Implementation of the sample-and-aggregate framework. | [36] | 2012 |
εKTELO: A framework and system for answering linear counting queries. | [37] | 2018 |

References

  1. ^ "OnTheMap". onthemap.ces.census.gov. Retrieved 29 March 2023.
  2. ^ Machanavajjhala, Ashwin; Kifer, Daniel; Abowd, John; Gehrke, Johannes; Vilhuber, Lars (April 2008). "Privacy: Theory meets Practice on the Map". 2008 IEEE 24th International Conference on Data Engineering. pp. 277–286. doi:10.1109/ICDE.2008.4497436. ISBN 978-1-4244-1836-7. S2CID 5812674.
  3. ^ Erlingsson, Úlfar. "Learning statistics with privacy, aided by the flip of a coin".
  4. ^ Erlingsson, Úlfar; Pihur, Vasyl; Korolova, Aleksandra (November 2014). "RAPPOR: Randomized Aggregatable Privacy-Preserving Ordinal Response". Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. pp. 1054–1067. arXiv:1407.6981. Bibcode:2014arXiv1407.6981E. doi:10.1145/2660267.2660348. ISBN 9781450329576. S2CID 6855746.
  5. ^ Differential Privacy Team (December 2017). "Learning with Privacy at Scale". Apple Machine Learning Journal. 1 (8).
  6. ^ Ding, Bolin; Kulkarni, Janardhan; Yekhanin, Sergey (December 2017). "Collecting Telemetry Data Privately". 31st Conference on Neural Information Processing Systems: 3574–3583. arXiv:1712.01524. Bibcode:2017arXiv171201524D.
  7. ^ Tezapsidis, Katie (Jul 13, 2017). "Uber Releases Open Source Project for Differential Privacy".
  8. ^ Johnson, Noah; Near, Joseph P.; Song, Dawn (January 2018). "Towards Practical Differential Privacy for SQL Queries". Proceedings of the VLDB Endowment. 11 (5): 526–539. arXiv:1706.09479. doi:10.1145/3187009.3177733.
  9. ^ Abowd, John M. (August 2018). "The U.S. Census Bureau Adopts Differential Privacy". Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. p. 2867. doi:10.1145/3219819.3226070. hdl:1813/60392. ISBN 9781450355520. S2CID 51711121.
  10. ^ Rogers, Ryan; Subramaniam, Subbu; Peng, Sean; Durfee, David; Lee, Seunghyun; Santosh Kumar Kancha; Sahay, Shraddha; Ahammad, Parvez (2020). "LinkedIn's Audience Engagements API: A Privacy Preserving Data Analytics System at Scale". arXiv:2002.05839 [cs.CR].
  11. ^ Rogers, Ryan; Adrian Rivera Cardoso; Mancuhan, Koray; Kaura, Akash; Gahlawat, Nikhil; Jain, Neha; Ko, Paul; Ahammad, Parvez (2020). "A Members First Approach to Enabling LinkedIn's Labor Market Insights at Scale". arXiv:2010.13981 [cs.CR].
  12. ^ Aktay, Ahmet; Bavadekar, Shailesh; Cossoul, Gwen; Davis, John; Desfontaines, Damien; Fabrikant, Alex; Gabrilovich, Evgeniy; Gadepalli, Krishna; Gipson, Bryant; Guevara, Miguel; Kamath, Chaitanya; Kansal, Mansi; Lange, Ali; Mandayam, Chinmoy; Oplinger, Andrew; Pluntke, Christopher; Roessler, Thomas; Schlosberg, Arran; Shekel, Tomer; Vispute, Swapnil; Vu, Mia; Wellenius, Gregory; Williams, Brian; Royce J Wilson (2020). "Google COVID-19 Community Mobility Reports: Anonymization Process Description (Version 1.1)". arXiv:2004.04145 [cs.CR].
  13. ^ Rogers, Ryan; Subbu, Subramaniam; Peng, Sean; Durfee, David; Lee, Seunghyun; Kancha, Santosh Kumar; Sahay, Shraddha; Ahammad, Parvez (2020). "LinkedIn's Audience Engagements API: A Privacy Preserving Data Analytics System at Scale". arXiv:2002.05839 [cs.CR].
  14. ^ Pereira, Mayana; Kim, Allen; Allen, Joshua; White, Kevin; Juan Lavista Ferres; Dodhia, Rahul (2021). "U.S. Broadband Coverage Data Set: A Differentially Private Data Release". arXiv:2103.14035 [cs.CR].
  15. ^ "EDP". EDP. Retrieved 29 March 2023.
  16. ^ Hod, Shlomi; Canetti, Ran (2024). "Differentially Private Release of Israel's National Registry of Live Births". arXiv:2405.00267 [cs.CR].
  17. ^ "Live Birth Dataset (Hebrew)". data.gov.il. Retrieved 2 May 2024.
  18. ^ "Google's differential privacy libraries". GitHub. 3 February 2023.
  19. ^ "Differential-privacy/Privacy-on-beam at main · google/Differential-privacy". GitHub.
  20. ^ "OpenDP". opendp.org. Retrieved 29 March 2023.
  21. ^ "OpenDP Library". GitHub.
  22. ^ "Tumult Analytics". www.tmlt.dev. Retrieved 29 March 2023.
  23. ^ "Tumult Labs | Privacy Protection Redefined". www.tmlt.io. Retrieved 29 March 2023.
  24. ^ "PipelineDP". pipelinedp.io. Retrieved 29 March 2023.
  25. ^ "OpenMined". www.openmined.org. Retrieved 29 March 2023.
  26. ^ Gaboardi, Marco; Honaker, James; King, Gary; Nissim, Kobbi; Ullman, Jonathan; Vadhan, Salil; Murtagh, Jack (June 2016). "PSI (Ψ): a Private data Sharing Interface".
  27. ^ "DAS 2020 Redistricting Production Code Release". GitHub. 22 June 2022.
  28. ^ "Diffprivlib v0.5". GitHub. 17 October 2022.
  29. ^ Holohan, Naoise; Braghin, Stefano; Pól Mac Aonghusa; Levacher, Killian (2019). "Diffprivlib: The IBM Differential Privacy Library". arXiv:1907.02444 [cs.CR].
  30. ^ Radebaugh, Carey; Erlingsson, Ulfar (March 6, 2019). "Introducing TensorFlow Privacy: Learning with Differential Privacy for Training Data".
  31. ^ "TensorFlow Privacy". GitHub. 2019-08-09.
  32. ^ "Opacus · Train PyTorch models with Differential Privacy". opacus.ai. Retrieved 29 March 2023.
  33. ^ McSherry, Frank (1 September 2010). "Privacy integrated queries" (PDF). Communications of the ACM. 53 (9): 89–97. doi:10.1145/1810891.1810916. S2CID 52898716.
  34. ^ Roy, Indrajit; Setty, Srinath T.V.; Kilzer, Ann; Shmatikov, Vitaly; Witchel, Emmett (April 2010). "Airavat: Security and Privacy for MapReduce" (PDF). Proceedings of the 7th Usenix Symposium on Networked Systems Design and Implementation (NSDI).
  35. ^ Haeberlen, Andreas; Pierce, Benjamin C.; Narayan, Arjun (2011). "Differential Privacy Under Fire". 20th USENIX Security Symposium.
  36. ^ Mohan, Prashanth; Thakurta, Abhradeep; Shi, Elaine; Song, Dawn; Culler, David E. "GUPT: Privacy Preserving Data Analysis Made Easy". Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. pp. 349–360. doi:10.1145/2213836.2213876. S2CID 2135755.
  37. ^ Zhang, Dan; McKenna, Ryan; Kotsogiannis, Ios; Hay, Michael; Machanavajjhala, Ashwin; Miklau, Gerome (June 2018). "EKTELO: A Framework for Defining Differentially-Private Computations". Proceedings of the 2018 International Conference on Management of Data. pp. 115–130. arXiv:1808.03555. doi:10.1145/3183713.3196921. ISBN 9781450347037. S2CID 5033862.