Verantwortung der Informatik – Accountability In AI (AIAI 2023)

Prof. Dr. Andreas Polze
Weronika Wrazen
E-Mail: {firstname.lastname}@hpi.de
Dates: Tuesday, 13:30-15:00, K-1.03

In this seminar, we will talk about the accountability of computer science in the area of artificial intelligence. Each week, a student will give a presentation that introduces different perspectives on accountability, ethics, fairness, transparency, auditability, explainability, interpretability, and regulation. After each presentation, a group discussion about the presented topic will take place. The presentation should be based on literature and statements from recognized domain experts; however, it should also include an assessment of the arguments and the presenter's own opinion. Each 45-minute presentation should be prepared by a single participant using primary and secondary literature. In preparation for the presentation, each participant will schedule a consultation with the supervisors and email a draft of the slides one week before the date of the presentation.

Prerequisites

Students are expected to have basic knowledge in the areas of statistics, machine learning, and deep learning.

Grading

To earn 3 ECTS points, students must hand in their slides (including notes) after the presentation. Grading will be based on the quality of the presentation (and notes). Active participation in the weekly discussions is highly encouraged.

Schedule

The following schedule is preliminary and subject to change during the semester. All updates will also be announced on the course mailing list.

Topics

We suggest the following list of topics to choose from; however, students are free to suggest and choose their own topic.

This presentation will delve into the ethical dimensions surrounding ChatGPT and other Large Language Models (LLMs), shedding light on their multifaceted impacts. Students are encouraged to critically analyze six key aspects:

  1. Energy Consumption: Investigate the environmental footprint of training and deploying LLMs. How does their computational demand intersect with sustainability goals?
  2. Dual Use Concerns: Explore potential misuse scenarios, including spam and fraud. What safeguards can be implemented to mitigate these risks, ensuring responsible use?
  3. GDPR Compliance: Examine the use of user-generated content within the framework of General Data Protection Regulation. How can LLMs be employed while respecting individuals' privacy and data rights?
  4. Plagiarism: Investigate how LLMs may influence content originality and attribution. What are the ethical implications for academic and creative integrity?
  5. Global Communication Shifts: Analyze the broader societal effects on communication worldwide. How do LLMs impact information dissemination, linguistic diversity, and accessibility?
  6. Hallucination: Examine instances where LLMs generate false or misleading information. What are the ethical implications of such hallucinations, and how can they be addressed in real-world applications?
Literature:

In this topic, you will examine how the rise of generative AI impacts professional artists. As generative AI systems, such as GANs and other neural networks, become increasingly proficient at producing art that rivals human creations, a host of intricate challenges emerges for artists. You'll explore how generative AI blurs the traditional lines of authorship and ownership, posing fundamental questions about who holds the rights to AI-generated artworks. This shift leads to copyright concerns and economic challenges for professional artists as their work faces new competition from AI counterparts. You are encouraged to research the dilemmas that the rise of generative AI poses for professional artists and society, to consider the cultural, ethical, legal, and economic implications, including the diminishing uniqueness of artworks, and to explore potential solutions that ensure a fair and harmonious coexistence between human creativity and AI.

Literature:

In this topic, you will delve into the concept of "Panic as a Service" (PaaS), a term that has emerged in the age of technology-driven anxiety. PaaS refers to the increasing prevalence of platforms, services, and technologies that exploit, amplify, or profit from societal fears, uncertainties, and doubts. You are encouraged to explore the dynamics of this phenomenon, examining the ways in which technology, media, and other entities leverage panic for various purposes. AI plays a significant role in PaaS by amplifying, enabling, and sometimes even perpetuating the mechanisms behind this phenomenon. You'll analyze the different aspects of how AI affects PaaS. Furthermore, you can focus on AI itself as the central subject of panic. As we witness the rapid advancements in artificial intelligence and its increasing integration into our daily lives, questions and concerns surrounding the impact of AI on society's collective anxiety come to the forefront.

Literature:

In understanding the urgency for AI regulation, it is essential to assess the risks and benefits that call for a carefully structured approach. This entails striking a balance between fostering innovation and upholding ethical considerations. To delve into the realm of AI regulation, one must recognize the diverse array of stakeholders shaping this landscape. Governments, industry leaders, academia, and civil society all play integral roles in influencing policy and ensuring compliance. Exploring the methodologies and strategies behind crafting effective AI regulation unveils a complex interplay of technical, legal, and ethical considerations that form the bedrock of regulatory frameworks. Defining the parameters and scope of AI regulation is a critical endeavor. This involves discerning the specific technologies, applications, and industries that warrant focused regulatory attention. An in-depth analysis of present efforts in AI regulation, particularly the European Union's proposed AI Act, provides invaluable insights. This scrutiny includes an evaluation of its strengths, discernible limitations, and potential ramifications on a global scale. Engaging in a thorough debate regarding the pros and cons of AI regulation is paramount. This discourse necessitates contemplation of how regulation may both spur innovation and safeguard societal interests, while also being mindful of its potential inadvertent hindrance to progress.

Literature:

Biases in data can lead to models that behave differently across ethnic and/or socio-demographic subpopulations. Especially for applications in medicine and healthcare, this can have critical effects on underrepresented groups. Which metrics or methods can be used to detect biases in data collection, pre-processing and/or model performance? What can be done to remove biases? What consequences can biases have?
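As a starting point for the question of metrics, the following minimal sketch compares model performance across subpopulations. It is only an illustration: the function names, the toy labels, and the groups "A" and "B" are assumptions made for this example, and the positive-rate gap shown here (often called demographic parity difference) is just one of many possible bias metrics.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return accuracy and positive-prediction rate for each group."""
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["correct"] += int(t == p)
        c["positive"] += int(p == 1)
    return {
        g: {
            "accuracy": c["correct"] / c["n"],
            "positive_rate": c["positive"] / c["n"],
        }
        for g, c in counts.items()
    }


def demographic_parity_difference(rates):
    """Largest gap in positive-prediction rate between any two groups."""
    positive_rates = [r["positive_rate"] for r in rates.values()]
    return max(positive_rates) - min(positive_rates)


# Hypothetical toy data: binary labels, binary predictions, group membership.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
gap = demographic_parity_difference(rates)

for name, r in sorted(rates.items()):
    print(name, r)
print("positive-rate gap:", gap)
```

In this toy data, the model is perfectly accurate for group A but misses the only positive case in group B, which is the kind of disparity a per-group evaluation is meant to surface.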

Literature:

To ensure the quality of a product, so-called audits are carried out to check whether the product meets certain standards and regulations. For AI for health, the requirements for safety and efficacy are particularly strict and difficult to evaluate. These evaluations include the analysis of model performance, privacy, robustness, fairness and bias, and interpretability, among other things. Which processes could the evaluation part of an audit include? Which regulations or guidelines are available? How does AI for health differ from other applications regarding the evaluations?

Literature:

Most popular (social media) websites and apps use recommender systems to individually filter content and provide users with suggestions of movies to watch, news articles to read, music to listen to, etc. Suggestions are based on previous user interactions and optimized to match the users' interests, maximizing user engagement. By providing a constant feed of interesting content, recommender systems can lead to the excessive use (addiction) of internet applications. In addition, recommender systems create echo chambers that reinforce the opinions of users by only showing content that agrees with a user's preexisting opinion. These virtual echo chambers or "bubbles" make critical discourse much harder, because a ground truth that both parties can agree on no longer exists. How can recommender systems be built to be less addictive while still providing relevant content? How can the "bubbles" be popped?

Literature:

Big, state-of-the-art neural networks need big datasets for training. How does this fit with the principles of the European General Data Protection Regulation (GDPR)? How can we ensure data protection and progress in AI at the same time? What restrictions are already imposed on AI research and products by the GDPR? How can requirements such as the principle of data economy (Grundsatz der Datensparsamkeit) and the right to be forgotten be implemented? How can we ensure the privacy of uninvolved third parties such as relatives, who share parts of their genome?

Literature:

Machine learning (ML) techniques are becoming increasingly popular because of their performance: they can model complex phenomena with high accuracy. One limitation of many ML models is that they output only a point estimate (for example, the mean of the predicted value), so we do not know how confident the model is in its result. Consequently, we, the end users, cannot be sure whether the model is trustworthy and safe. To overcome this problem, the uncertainty of the prediction can be estimated. Uncertainty tells us, for a given probability, how far from the predicted mean the real value is expected to lie. Uncertainty estimation allows users to decide whether they can trust the model and when an additional human decision is needed.
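One simple way to attach an uncertainty to a prediction is a bootstrap ensemble, sketched below under clearly stated assumptions: the model is a plain least-squares line, the data are made up, and all function names are hypothetical. The spread of the ensemble reflects only how much the fitted model changes with the training sample; it is not a complete treatment of uncertainty, and the 95% band relies on a rough normal approximation.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_prediction(xs, ys, x_new, n_models=200, seed=0):
    """Return mean and standard deviation of predictions at x_new
    over models fitted to bootstrap resamples of the data."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples with a single x value
        by = [ys[i] for i in idx]
        slope, intercept = fit_line(bx, by)
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.stdev(preds)


# Hypothetical data: roughly y = 2x + 1 with small fixed perturbations.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1]
xs = [float(x) for x in range(10)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

mean_near, std_near = bootstrap_prediction(xs, ys, x_new=5.0)
mean_far, std_far = bootstrap_prediction(xs, ys, x_new=30.0)

for label, m, s in [("x=5", mean_near, std_near), ("x=30", mean_far, std_far)]:
    low, high = m - 1.96 * s, m + 1.96 * s  # approximate 95% band
    print(f"{label}: prediction {m:.2f}, band [{low:.2f}, {high:.2f}]")
```

The band is wider at x=30 than at x=5 because the model extrapolates far beyond its training data there; a wide band is the signal that a human should take a second look.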

Literature:

Artificial Intelligence (AI) is attracting the attention of more and more specialists, not only computer scientists and statisticians but also medical personnel, engineers, and economists. AI enables the modeling of complex phenomena with high accuracy. One of its limitations is that AI models are "black boxes": it is difficult to explain the relationship between the input features and the output. Consequently, we, the end users, cannot be sure whether the model is trustworthy and safe. To overcome this problem, different algorithms and approaches have been developed to explain predictions. These methods also allow estimating which features have the highest impact on the result. As a result, users can assess the outcome against state-of-the-art domain knowledge and discover new patterns in the analyzed phenomenon.
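Permutation feature importance is one such approach for estimating which features have the highest impact. The sketch below is a minimal illustration, not a reference implementation: `black_box` is a hypothetical stand-in for a trained model, the data are randomly generated, and the function names are chosen for this example. The idea is to shuffle one feature column at a time and measure how much the model's error increases.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions on rows X with targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only feature j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def black_box(row):
    """Hypothetical model: depends on feature 0 only, ignores feature 1."""
    return 3.0 * row[0] + 0.0 * row[1]


# Hypothetical data: two random features, target driven by feature 0 only.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(50)]
y = [3.0 * row[0] for row in X]

importances = permutation_importance(black_box, X, y)
print("importance per feature:", importances)
```

Shuffling feature 0 destroys the model's accuracy, while shuffling feature 1 changes nothing, so the method correctly reports that only feature 0 matters to this model. Note that this explains the model's behaviour, not necessarily the underlying phenomenon.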

Literature:

"Dual use goods are products and technologies normally used for civilian purposes but which may have military applications." [1] Like many technologies, AI has the potential to be used in military applications, be it object detection used in autonomous killer drones to localize targets or deepfakes used to spread misinformation online. How can military use be avoided? Which areas should not be researched? Do we need a ban on "weaponizable" AI such as facial recognition?

Literature:

Training deep neural networks is energy-intensive. It is estimated that training GPT-3 required about 190 000 kWh. [2] Are advances in AI worth this environmental cost? Which experiments are worth running? How can we reduce the energy consumption or the number of experiments that need to be run?

Literature:

Synthetic media is a catch-all term for the artificial production, manipulation, and modification of data and media by automated means. Synthetic media as a field has grown rapidly since the creation of generative adversarial networks, primarily through the rise of deepfakes as well as music synthesis, text generation, human image synthesis, speech synthesis, and more. While this technology may have positive effects, such as cutting production costs, its possibilities are clearly concerning, ranging from misrepresenting well-known politicians, and thus misinforming the public, to social hacking. So far, potential users needed at least some proficiency in coding, but apps like Reface show that this technology is becoming more accessible to non-experts. How doomed are we?

Literature:

Presentation

Your presentation should contain the following parts:

  • What is the topic?
      • How is it defined?
      • Are there multiple, different definitions?
      • Why is it important?
  • Present a method/paper/tool which addresses the problem
      • Check topic description/literature section for some suggestions
      • Explain the main idea
      • Highlight benefits and potential shortcomings
  • Provide 2-3 points/questions to start the interactive discussion