Research, Projects & Publications
Our research develops and disseminates educational content, including courses, case studies, and online resources, that fosters ethical literacy and equips students and data professionals with the knowledge and tools they need to navigate ethical considerations in AI, machine learning, and data-driven systems.
Current Research Areas
AI and Data Ethics Case Studies
With the rapid rise of AI tools and technologies, it is crucial that current and future policymakers, business leaders, and data/technology professionals have a strong ethical foundation. Our research team has embarked on a project to aggregate and analyze real-world case studies related to AI ethics, algorithmic bias, privacy, and other emerging issues. By curating a database of case studies, we aim to develop educational materials that will prepare students in technical and non-technical fields to grapple with complex sociotechnical trade-offs.
We've designed our Ethics in Data Science course to touch on several themes in AI and data ethics: algorithmic bias and fairness; privacy and data security; transparency and explainable AI; automation; the future of work; academic integrity and truth; copyright and intellectual property; environmental impact; and other ethical dilemmas unique to generative models. As part of the course, we've designed a set of case studies to provoke thoughtful discussion of each of these issues. The case studies come in several forms, including standard short-form case studies about real-world events and role-play case studies in which each student takes on a role and analyzes the issues from that specific perspective. Some case studies also include optional exercises at the end for students to complete on their own.
Case Studies
The following case studies were developed by Robert Clements and Hadley Dixon as part of a project to create extensible case studies for both lay and technical audiences.
Facial Recognition for Policing: Real-life Negative Consequences of Biased Algorithms
Themes: Algorithmic bias and fairness; privacy
Voice Assistants and Biometric Data: Amazon's Violations of Children's Privacy Rights
Themes: Privacy and data security
Racial Bias in Healthcare Data Solutions: Perpetuating Disparities in Medical Treatment Nationwide
Themes: Algorithmic bias and fairness; social impact
Failures of LLM-Generated Text Detection: False Positives Dilute the Efficacy of AI Detection
Themes: Academic integrity and truth
When No One is Driving: Navigating Accountability via Cruise's Driverless Vehicles
Themes: Transparency and explainable AI; automation; future of work
AI Art: Assistants, Replacements and Grifters
Themes: Future of work; copyright and intellectual property
Other Resources
Grants
Craig Newmark Fund for Data Ethics, 2019
Established by a generous donation from Craig Newmark, an internet entrepreneur best known as the founder of Craigslist, who was awarded an honorary degree from USF in 2009. The Center for AI & Data Ethics (formerly CADE) aligns with Craig's priorities of strengthening the foundations of a trustworthy press and expanding access for women in technology.
Projects
Every year, our faculty and graduate students in data science collaborate with organizations worldwide to tackle real-world data science and data engineering challenges. Below is a select list of projects with direct ethical considerations in AI and data science.
Along with working on these projects, some of our students have written up formal ethical assessments as part of the Ethics in Data Science course, a few of which we've included below.
ACLU of Northern California
Student Team: Ian Duke, Ho Nam Tong
Faculty Mentor: Robert Clements
Company Liaison: Dylan Verner-Crist
Project Outcomes: Students employed an array of data science methods to automate body camera review for a class action case related to the racial profiling of drivers in Siskiyou County, California. Using computer vision in Python, they created a program that automatically links body camera videos with written police reports. Using machine learning and natural language processing, they developed models to identify interactions containing police misconduct characteristic of pretextual stops. In partnership with USF, the ACLU's Lead Investigator was able to review large amounts of body camera footage in real time, a task that would have been impossible with manual review alone.
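To make the linking step concrete, here is a minimal sketch of one plausible approach: match each written report to the body camera video whose extracted text (for example, a transcript or OCR'd overlay) is most similar under TF-IDF cosine similarity. The reports, filenames, and text below are invented, and this is not the team's actual pipeline.

```python
# Hypothetical illustration: match written police reports to body camera
# videos by comparing text extracted from each video against report text
# using TF-IDF cosine similarity. All inputs below are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

reports = {
    "report_001": "Traffic stop on Highway 97, driver cited for expired tags.",
    "report_002": "Vehicle search conducted after stop near Yreka, no citation issued.",
}
video_text = {
    "cam_A.mp4": "stop near yreka vehicle search conducted no citation",
    "cam_B.mp4": "highway 97 traffic stop expired tags citation issued",
}

vectorizer = TfidfVectorizer()
corpus = list(reports.values()) + list(video_text.values())
matrix = vectorizer.fit_transform(corpus)

report_vecs = matrix[: len(reports)]
video_vecs = matrix[len(reports):]
scores = cosine_similarity(report_vecs, video_vecs)

# Link each report to its highest-scoring video.
video_names = list(video_text)
for i, report_id in enumerate(reports):
    best = scores[i].argmax()
    print(f"{report_id} -> {video_names[best]} (similarity {scores[i][best]:.2f})")
```

In practice the video text would come from speech-to-text or OCR of on-screen metadata, and candidate matches would still be reviewed by an investigator.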
Boston Children's Hospital
Mindful Machine Learning: Ethical Considerations for Data-Driven Epilepsy Research by Amadeo Cabanela
California Academy of Sciences
A reflection on bias, fairness, and environmental impact during my two projects at the California Academy of Sciences by Maricela Abarca
Candid
Student Team: Zemin Cai, Harrison Jinglun Yu
Faculty Mentor: Shan Wang
Company Liaison: Cathleen Clerkin
Project Outcomes: Candid's Insights department engaged students in impactful research projects in data ethics. These projects included an examination of diversity, equity, and inclusion within nonprofits, an exploration of nonprofits' societal impact, and an investigation into real-time grantmaking data, particularly in relation to issues like racial equity. Students were tasked with identifying factors influencing organizations' willingness to share demographic data and analyzing data to predict nonprofits' societal impact. Additionally, they explored methodologies to provide real-time insights into philanthropic trends while addressing potential biases and confounding factors. These projects harnessed various data science techniques and underscored the importance of ethical considerations in data analysis.
Kidas Inc.
Student Team: Raghavendra Kommavarapu
Faculty Mentor: Mustafa Hajij
Company Liaison: Amit Yungman
Project Outcomes: The student optimized point-of-interest detection algorithms, including hate speech and sexual content detection, using data and metadata. They attempted age detection in audio and text, emotion detection in audio and text, and voice changer detection in audio. Additionally, they worked on displaying data visualizations on personal pages based on user activity and algorithm results, using Python.
YLabs (Youth Development Labs)
Student Team: Tejaswi Dasari
Faculty Mentor: Diane Woodbridge
Company Liaison: Robert On
Project Outcomes: In the CyberRwanda project, which focuses on enhancing the well-being and prospects of urban teenagers through digital education, the student used various technologies and techniques to measure project progress and effectiveness. They employed Google Analytics to track engagement metrics and designed KPI dashboards for automatic data generation. Challenges included manual data tracking, discrepancies between Google Analytics versions, and gaps in tracking product pick-ups. Integrating and using data from different sources to support decision-making was identified as a crucial goal.
ACLU
Our Team: Joleena Marshall
Faculty Mentor: Michael Ruddy
Company Liaisons: Linnea Nelson, Tedde Simon, Brandon Greene
Project Outcomes: The team developed a tool in Python to acquire and preprocess publicly available data related to the Oakland Unified School District to investigate whether OUSD's allocation of resources results in inequities between schools. The team also provided an updated data analysis, including data visualizations, of educational outcomes for Indigenous students in a select number of Humboldt County unified school districts.
California Forward
Our Team: Evie Klaassen
Faculty Mentor: Michael Ruddy
Company Liaison: Patrick Atwater
Project Outcomes: The team built a tool in Python to determine where high-wage jobs are located in California. The tool extends existing data tools created and maintained by the organization. The team also developed a pipeline to clean and prepare new public data when it is released, so that the tool's outputs stay up to date as new data becomes available.
ACLU Criminal Justice
Our Team: Qianyun Li
Goal: At the ACLU, the student identified potential discrimination in school suspensions by performing feature importance analysis with machine learning models and statistical tests.
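As a rough illustration of this kind of analysis (not the student's actual data or code), the sketch below pairs a chi-squared test of independence between suspensions and a demographic attribute with random-forest feature importances; all columns and values are synthetic.

```python
# Hypothetical sketch: test whether suspensions are independent of group
# membership (chi-squared) and rank predictive features with a random
# forest. The data is synthetic; this is not the student's analysis.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], size=n),
    "absences": rng.poisson(4, size=n),
    "grade_level": rng.integers(6, 13, size=n),
})
# Synthetic outcome with a built-in disparity for group "B".
p = 0.05 + 0.10 * (df["group"] == "B") + 0.01 * df["absences"]
df["suspended"] = rng.random(n) < p

# Chi-squared test: is suspension independent of group membership?
table = pd.crosstab(df["group"], df["suspended"])
chi2, pval, _, _ = chi2_contingency(table)
print(f"chi2={chi2:.1f}, p-value={pval:.4f}")

# Feature importances from a random forest fit on the same features.
X = pd.get_dummies(df[["group", "absences", "grade_level"]], drop_first=True)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, df["suspended"])
for name, imp in sorted(zip(X.columns, clf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```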
ACLU Micromobility
Our Team: Max Shinnerl
Goal: At the ACLU, the student analyzed data on the equitable distribution of COVID-19 vaccines. They developed interactive maps with Leaflet to visualize shortcomings of the distribution algorithm and automated the cleaning of legislative record data. They also developed a pipeline for storing data to enable remote SQL queries using Amazon RDS and S3.
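The mapping component could be approximated in Python with folium, a wrapper around Leaflet. The sites and coverage values below are invented, and this is only a sketch of the idea, not the project's code.

```python
# Hypothetical sketch of a Leaflet-style interactive map via folium.
# Sites and coverage rates are made up for illustration.
import folium

# (latitude, longitude, vaccination coverage) for a few invented sites.
sites = [
    (37.8044, -122.2712, 0.62),   # Oakland (hypothetical value)
    (37.7749, -122.4194, 0.71),   # San Francisco (hypothetical value)
    (38.5816, -121.4944, 0.55),   # Sacramento (hypothetical value)
]

m = folium.Map(location=[37.9, -122.0], zoom_start=8)
for lat, lon, coverage in sites:
    folium.CircleMarker(
        location=[lat, lon],
        radius=6 + 20 * coverage,          # scale marker size by coverage
        popup=f"coverage: {coverage:.0%}",
        fill=True,
    ).add_to(m)

m.save("vaccine_coverage_map.html")  # open in a browser to explore
```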
Human Rights Data Analysis Group (HRDAG)
Our Team: Bing Wang
Goal: At the Human Rights Data Analysis Group (HRDAG), Bing extracted critical location-of-death information from unstructured Arabic text fields using Google Translate and Python pandas, adding identifiable records to the Syrian conflict data. She wrote R scripts and Makefiles to create blocks of similar records on killings in the Sri Lankan conflict, reducing the size of the search space in the semi-supervised machine learning record linkage (database deduplication) process.
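Blocking of this kind groups records by an inexpensive key so that the costly pairwise comparisons only happen within each block. The sketch below illustrates the idea in Python on invented records; HRDAG's actual pipeline used R scripts and Makefiles.

```python
# Illustrative blocking for record linkage: group records by a cheap key
# (here, district plus year of death) so that expensive pairwise
# comparisons are only made within each block. Records are invented.
from collections import defaultdict
from itertools import combinations

records = [
    {"id": 1, "name": "A. Perera",  "district": "Jaffna", "year": 2009},
    {"id": 2, "name": "A. Pereira", "district": "Jaffna", "year": 2009},
    {"id": 3, "name": "S. Kumar",   "district": "Mannar", "year": 2008},
    {"id": 4, "name": "S. Kumar",   "district": "Jaffna", "year": 2009},
]

def blocking_key(rec):
    """Cheap key: records in different blocks are never compared."""
    return (rec["district"], rec["year"])

blocks = defaultdict(list)
for rec in records:
    blocks[blocking_key(rec)].append(rec)

# Candidate pairs to send to the (expensive) classifier or manual review.
candidate_pairs = [
    (a["id"], b["id"])
    for block in blocks.values()
    for a, b in combinations(block, 2)
]
print(candidate_pairs)   # [(1, 2), (1, 4), (2, 4)] instead of all 6 pairs
```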
Publications
Research activities and publications by our faculty, accomplished fellows, and affiliates.
- "Toward Realignment: Big Tech, Organized Labor, and the Politics of the Future of Work" by Nantina Vgontzas, Sage Journals
- “Inside DeepMind's Secret Plot to Break Away From Google”, Business Insider
- “Social Media Content Moderation Is Not Neutral, USF Researcher Says”, SF Public Press
- “How to poison the data that Big Tech uses to surveil you”, Technology Review
- "To Live in Their Utopia: Why Algorithmic Systems Create Absurd Outcomes" by Ali Alkhatib at [CHI 2021] - also available as a video summary on YouTube
- "The politicization of face masks in the American public sphere during the COVID-19 pandemic" by Scoville, C., McCumber, A., Amironesei R., Jeon, J. at the American Sociological Association
- "On the Genealogy of Machine Learning Datasets: A Critical History of ImageNet" by Denton, E., Hanna, A., Amironesei, R., Smart, A., Nicole, H. at Big Data and Society
- "Notes on Problem Formulation" by Amironesei, R., Denton, E., Hanna, A. at IEEE Technology and Society Magazine Journal
- "Algorithmic Conservation in a Changing Climate" by Scoville, C., Chapman, M., Amironesei, R., and Boettiger, C. at Current Opinion in Sustainability Journal
- "'You Can’t Sit With Us': Exclusionary Pedagogy in AI Ethics Education" by Raji, I.D., Scheuerman, M.K., Amironesei, R. at FAccT
- "Genealogy, Archeology, Hermeneutics: Techniques of Interpretation in Machine Learning Datasets," by Amironesei, R., Denton, E., Hanna, A. at [IEEESSIT]
- "Bringing the People Back In: Contesting Benchmark Machine Learning Datasets" by Denton, E., Hanna, A., Amironesei, R., Smart, A., Nicole, H., Scheuerman, M.K. at arXiv
- “The Dark Side of Big Tech’s Funding for AI Research”
- “Google workers reject company's account of AI researcher's exit as anger grows”
- “Is your boss spying on you while you work remotely?”
- “Open Letters by Tech Industry, Google Employees Criticize Google’s Lack of Transparency in AI Research”
- "Bridging Data Science with Ignatian Spirituality"
Make A Gift
Your support plays a pivotal role in shaping the trajectory of ethical discussions and practices within the field, empowering us to lead the way in responsible AI and data ethics.