AI Safety Discussion Group

Spring 2024 Application Form Open

In 2023, many of the world’s leading AI scientists, leaders, and thinkers came together to sign a statement on AI safety: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

Artificial intelligence (AI), particularly artificial general intelligence (AGI), may be one of the most impactful technologies developed this century. However, there are many open problems in present-day machine learning (ML) systems that could be exacerbated as those systems become more capable.

Finding acceptable solutions to these problems ahead of time will require a concerted effort from researchers, policymakers and others in the decades to come.

In this discussion group, you will learn what problems advanced ML systems pose and how current research is addressing them.


Discussion Group Overview

New technologies present new opportunities for good, yet they are often also associated with novel risks. Advanced artificial intelligence (AI), particularly the prospect of artificial general intelligence (AGI), may be an important turning point for humanity. Given the open problems in the growing field of AI alignment, the development of AGI could pose extreme risks, including scenarios in which humanity loses control of its future. We expect that solving these problems ahead of time will require a concerted effort from researchers, policymakers and others in the decades to come.

During the programme, you will explore open research questions like:

  • How can we reliably control intelligent systems whose decision-making processes are too complex for humans to comprehend?

  • How do we govern the development of AI in a way that maximises our chances of safely transitioning to a world with advanced AI?

We begin by introducing the idea of AGI and examining the progress towards it so far (such as foundation models). We then investigate fundamental alignment problems such as reward misspecification and goal misgeneralization, look at examples of each, and discuss why they might lead to catastrophic outcomes. The latter half of the course covers four techniques that aim to prevent misalignment, along with their limitations, followed by research that tries to understand ML systems at a deeper level, including interpretability and agent foundations.
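As a rough intuition for why a misspecified reward can go wrong, here is a toy sketch (hypothetical names and numbers, not taken from the course materials): the designer wants a robot to clean a room, but the training reward only measures how much dust the robot's own sensor reports, so a policy that simply blinds the sensor scores just as well as one that actually cleans.

    # Toy illustration of reward misspecification (hypothetical example).
    # Intended objective: remove the dust that is actually in the room.
    # Proxy reward: reduce the dust the robot's own sensor reports.

    def proxy_reward(dust_seen_by_sensor: int) -> float:
        """Reward used during training: less dust seen by the sensor is better."""
        return -float(dust_seen_by_sensor)

    def intended_reward(dust_actually_present: int) -> float:
        """What the designer really cares about: less dust left in the room."""
        return -float(dust_actually_present)

    # Two candidate policies, summarised by their outcomes.
    outcomes = {
        "clean_the_room":   {"dust_seen": 0, "dust_present": 0},   # does the intended task
        "cover_the_sensor": {"dust_seen": 0, "dust_present": 50},  # games the proxy
    }

    for name, o in outcomes.items():
        print(f"{name}: proxy={proxy_reward(o['dust_seen'])}, "
              f"intended={intended_reward(o['dust_present'])}")

    # Both policies receive the maximal proxy reward (0.0), but only one achieves the goal.

The point of the sketch is only that optimising a proxy can diverge from the designer's intent; the readings discuss more realistic examples of this failure mode.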

By the end of this discussion group, you should understand a range of research agendas in AI alignment and be able to make informed decisions about your next steps for engaging with the field.

Discussion Format

We expect each participant to spend around 3-4 hours per week engaging with the materials. Each week, there will be a set of core and optional readings and, for some weeks, exercises to help you think through the topics yourself. These ideas are then solidified in the weekly discussion groups, facilitated by experienced alumni.

Application Process

Interested participants need to fill out the application, due February 10th. Please spend less than 15 minutes on the required questions; this is not meant to be a long application!

Interviews will be conducted after the application period.

If you have questions about anything here, or about the programme in general, join our Slack and ask away!

Who are we?

We think that reducing risks from advanced artificial intelligence is potentially one of the most pressing problems of our time. Through events ranging from introductory discussion groups to speaker sessions, we aim to find talented UT Austin students interested in helping and connect them with opportunities, researchers, and other motivated students.

Our members have gone on to work at places such as SERI MATS and CAIS, collaborate with top AI researchers worldwide, and conduct self-directed research in AI safety.

What is AI Safety?

In the field of artificial intelligence (AI), AI alignment research aims to steer AI systems towards humans’ intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system is competent at advancing some objectives, but not the intended ones.
— Wikipedia

This goal - developing aligned rather than misaligned systems - turns out to be a very difficult technical problem. We will discuss it in depth within our discussion group, and we talk about AI safety frequently within our club. Join our Slack and say hi!