From 1 September, the University of Amsterdam is offering UvA AI Chat to students and lecturers. This page provides information on how UvA AI Chat can contribute to the dialogue on responsible use of AI, on privacy-related issues, on the involvement of external parties, and on the handling of (personal) data by both the University and users. Finally, this page explains the current content moderation filters and the considerations that motivate their use.

Enabling the dialogue on Responsible use of AI (RAI)

By offering UvA AI Chat, the UvA ensures compliance with existing legislation and its own policy frameworks. This provides a basis for dialogue on how the academic community can shape the responsible use of AI. Within this dialogue, UvA AI Chat and its project team can contribute by exploring how theoretical frameworks and guidelines can be brought to fruition through practical applications in higher education.

Read more on related legal and policy frameworks.

Privacy, third party involvement and data handling

Data processed through UvA AI Chat, including chat content, are never accessible to parties outside the UvA and are not used to train models. Regardless of the model selected, all data are anonymised before transmission so that they cannot be linked to any specific individual.

Frequently asked questions

  • How does the data flow within UvA AI Chat generally work?
    1. The user submits a prompt in UvA AI Chat.
    2. Identifiers are removed or replaced, anonymising the prompt before data are forwarded to a model.
    3. The request is forwarded to the model endpoint (the network address to which a prompt is sent for processing). Within UvA AI Chat, several models are available; which endpoint is used depends on the model selected by the user. Depending on the provider of the model, the following applies:
      1. If the user selects a model offered by SURF, the request is processed on infrastructure under its direct management. (What is SURF?)
        As of writing, all models are still processed via Azure. However, we expect to be able to offer models via SURF very soon. 
      2. If the user selects a model whose processing takes place on Microsoft Azure servers, the request is forwarded to an endpoint approved for use within the UvA. This means that all data processing meets the same security requirements as other Microsoft services used by the UvA. The next section provides more information on what this entails.
    4. The model generates a result, which is returned to the user through an encrypted connection.

    Throughout this entire process, chat messages or other uploaded data are never used to train models. Nonetheless, existing regulations and policies on privacy and personal data remain fully applicable, including those related to Research Data Management and, in particular, the various legal requirements in health sciences practice, including but not limited to the WGBO (External link, Dutch) and the WMO (External link, Dutch). For example, never upload documents containing (lists of) names of students, patients or research participants. In case of doubt, always consult the information page on policy and regulations or contact your faculty’s designated data steward.
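
The data flow above can also be illustrated with a brief sketch. It is a simplified, hypothetical illustration only: the function names, the endpoint placeholders and the anonymisation rule are assumptions made for this page and do not reflect the actual UvA AI Chat implementation.

```python
import re

# Hypothetical mapping from the model chosen by the user to its endpoint
# (step 3); the URLs are placeholders, not real addresses.
MODEL_ENDPOINTS = {
    "openai-model": "https://<azure-endpoint-approved-by-the-uva>",  # Microsoft Azure
    "open-source-model": "https://<surf-endpoint>",                  # SURF (planned)
}

def pseudonymise(prompt: str) -> str:
    """Step 2: remove or replace identifiers before data leave the UvA.
    A real anonymisation step is far more sophisticated; replacing e-mail
    addresses here only illustrates the idea."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "<email>", prompt)

def send_request(endpoint: str, payload: str) -> str:
    """Placeholder for the encrypted (HTTPS) call to the model endpoint."""
    raise NotImplementedError("illustration only")

def handle_chat_message(prompt: str, model: str) -> str:
    anonymised = pseudonymise(prompt)          # step 2: anonymise the prompt
    endpoint = MODEL_ENDPOINTS[model]          # step 3: provider-dependent endpoint
    return send_request(endpoint, anonymised)  # step 4: result returned to the user
```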


  • How do different providers handle data processing, such as Microsoft?

    The UvA itself determines which models are available in UvA AI Chat and is not dependent on a single provider. Requests for certain open-source models, such as Llama, are, for example, processed on SURF servers in the Netherlands. These data are not accessible to third parties or commercial actors.

    In addition, UvA AI Chat makes use of Microsoft Azure, among others, to process chat requests, for example when using OpenAI models. This does not mean that Microsoft or OpenAI has access to or ownership of the data. It is important to distinguish between infrastructure hosting and model inference (the generation of content): Microsoft supplies computing capacity via Azure, while the UvA manages the software services, such as language models, data storage and web hosting. This means that the content of the data is not visible to Microsoft or OpenAI. By way of comparison: the University already uses Microsoft 365 and OneDrive. Microsoft provides the platform, but documents remain private to the user and the University. Processing requests in Azure for UvA AI Chat operates within the same privacy frameworks.

    When, for example, an OpenAI model is used within UvA AI Chat, the anonymised prompt is sent to the corresponding inference service. Unlike on chatgpt.com, this service does not run on OpenAI servers but on Azure servers that comply with the ‘EU Data Boundary’. The EU Data Boundary is a safeguard implemented by Microsoft to ensure that personal data and customer data of European users are stored and processed exclusively within the EU. It exists to comply with European privacy legislation (such as the GDPR) and to prevent data from being accessible outside the EU region. This guarantees that neither Microsoft nor OpenAI has access to the content of conversations.

    Finally, outside OpenAI’s own environment and API, OpenAI models are currently only available through Microsoft Azure. Microsoft is a major investor in OpenAI’s commercial branch and holds broad licensing rights to OpenAI’s intellectual property.

  • Are ‘big tech’ firms involved?

    As noted above, ‘big tech’ companies currently provide various managed model services, for example the OpenAI models offered through Microsoft. For other functionality, such as search integration, it is difficult for the UvA to avoid relying on any third party, since building such a service entirely in-house is not feasible for the UvA alone. It remains important to emphasise that where services are procured from third parties or commercial providers, it is ultimately the UvA that decides. UvA AI Chat has been developed in-house and is not, in the longer term, tied to any single (commercial) provider.

Basic system prompt and content filters: providing a balanced starting point

Academic freedom is a fundamental value for the University; the risk of censorship is therefore a constant concern. Developing a comprehensive legal or theoretical framework, and translating it into the implementation of UvA AI Chat, requires care. Arriving at a finalised, or even a continually evolving, formulation of such a framework is a long-term process. At the same time, the availability of UvA AI Chat as a practical environment is necessary to test and shape such a framework. For this reason, the University applies certain filters from the launch of UvA AI Chat in order to provide a safe yet workable starting point for this service. The explanation below provides transparency on the current filters in UvA AI Chat and sets out the rationale for these choices, weighing safety, academic freedom and the risk of censorship.

System prompt and personas

UvA AI Chat uses a more minimal ‘system prompt’ than many commercial services. A system prompt is the initial, usually hidden instruction that defines the role, behaviour, conditions and context of a GenAI model, and thus largely directs its responses during a conversation. As a result, the same model within UvA AI Chat may produce different information or terminology than in other environments. For instance, an OpenAI model within UvA AI Chat may respond differently to certain topics than it does on chatgpt.com. The system prompt is kept less restrictive so that limitations introduced for commercial purposes do not place unnecessary constraints on academic use.

The current approach to system prompts serves as a pragmatic starting point and is not final. For those who wish to experiment and explore what can be done using prompting, a ‘Persona’ feature is available within UvA AI Chat. With Personas, users can create predefined conversational roles that guide the tone and behaviour of a model. This feature can, for example, be used to experiment with forming base prompts that more closely align with research or educational objectives. Users can also personalise UvA AI Chat in terms of conversation style, and set custom instructions that apply to all chats outside Personas via the Settings menu (see: ‘Conversation Style’ and ‘Use Custom Instructions’ under Settings > Personalization).
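
To make the role of the system prompt and Personas more concrete, the sketch below shows how a minimal base instruction and a user-defined Persona could be combined into the messages sent to a model. The prompt texts, field names and structure are illustrative assumptions, not the actual configuration of UvA AI Chat.

```python
# Illustrative sketch only: the base prompt, Persona text and message format
# below are assumptions, not UvA AI Chat's actual configuration.

MINIMAL_SYSTEM_PROMPT = "You are an assistant for academic use at the UvA."

def build_messages(user_prompt: str, persona_instruction: str | None = None) -> list[dict]:
    """Combine the hidden instructions (base prompt plus optional Persona)
    with the visible user prompt into a chat-style message list."""
    system_parts = [MINIMAL_SYSTEM_PROMPT]
    if persona_instruction:
        system_parts.append(persona_instruction)
    return [
        {"role": "system", "content": "\n\n".join(system_parts)},
        {"role": "user", "content": user_prompt},
    ]

# Example: a Persona that turns the model into a critical methodology tutor.
messages = build_messages(
    "Review my research design for a survey study.",
    persona_instruction="Respond as a critical methodology tutor who asks probing follow-up questions.",
)
```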

See the UvA AI Chat user manual on how to use Personas

Content filters

Alongside the aim of placing as few restrictions as possible on academic freedom, UvA AI Chat still limits what can be done with the models through input and output filters: to adhere to existing policy frameworks and to avoid other risks, not everything is allowed from the outset. In addition to standard safeguards (for example on copyright and cybersecurity), the University applies restrictions on four themes that are closely linked to the implementation of responsible use. The following sections explain these filters, indicate the strength level of each, and provide the rationale for these choices.

Filter overview

Content filters are configurable safety filters that screen content in four areas according to different levels of severity: hate, sexual content, self-harm and violence. The aim of these filters is to ensure that UvA AI Chat supports academic use while preventing the production of harmful or inappropriate material.

  • Hate: detects insults or demeaning content about protected groups.
  • Sexual: detects explicit or pornographic material; allows non-explicit romance and clinical terms.
  • Self-harm: detects requests for methods, encouragement, or instructions; allows supportive, non-instructional discussion.
  • Violence: detects requests or instructions for harm; allows neutral, non-graphic historical summary.

Each filter has a ‘severity threshold’. For both incoming and outgoing messages, the severity of the content is assessed per theme (classified as low, medium or high). Depending on this classification, the filter is either activated or not. The filters operate as follows:

  • Low blocks low, medium, and high severity content.
  • Medium blocks medium and high, and allows low severity content.
  • High is the most permissive: it blocks only high severity content and allows the rest.

In short: the high threshold allows more, the low threshold allows less; it is a threshold for when the filter is activated, not for the content itself.
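
As an illustration of how such a threshold works, the sketch below encodes the blocking rule described above. The severity labels follow the text; the code itself is only a simplified illustration, not the actual filtering service.

```python
# Simplified illustration of the severity thresholds described above.
SEVERITY = {"low": 1, "medium": 2, "high": 3}

def is_blocked(content_severity: str | None, threshold: str) -> bool:
    """A filter blocks content whose assessed severity is at or above its
    threshold; content that does not reach any severity level is not blocked."""
    if content_severity is None:
        return False
    return SEVERITY[content_severity] >= SEVERITY[threshold]

assert is_blocked("medium", "low")         # a low threshold also blocks medium content
assert not is_blocked("low", "medium")     # a medium threshold lets low-severity content pass
assert not is_blocked("medium", "high")    # a high threshold is the most permissive
```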

Input vs. output filters

  • Input filters check what a user pastes or asks before the model processes it. If the input breaches the threshold, the request is blocked at the start. Example: an explicit sexual request is blocked at the input stage.
  • Output filters check what the model is about to produce. If the draft response would breach the threshold, it is stopped or replaced with a safer alternative. Example: even if a lecturer pastes a hateful passage for analysis, the output filter prevents the model from generating slurs.

Separating input and output lets staff analyse sensitive source material where appropriate, while still preventing the system from generating harmful content.
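
The order of operations can be sketched as follows, reusing is_blocked() from the previous example. Here classify_severity() and generate() stand in for the content classifier and the selected language model; both are placeholders rather than real components of UvA AI Chat.

```python
def classify_severity(text: str) -> str | None:
    """Placeholder: a real deployment uses a dedicated classification service."""
    return None

def generate(prompt: str) -> str:
    """Placeholder for the selected language model."""
    return "(model response)"

def handle_prompt(prompt: str, input_threshold: str | None, output_threshold: str | None) -> str:
    # 1. Input filter: check the user's message before the model processes it.
    if input_threshold is not None and is_blocked(classify_severity(prompt), input_threshold):
        return "Blocked by the input filter."

    # 2. The model drafts a response.
    draft = generate(prompt)

    # 3. Output filter: check the draft before it reaches the user.
    if output_threshold is not None and is_blocked(classify_severity(draft), output_threshold):
        return "Withheld by the output filter."

    return draft
```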


Motivation and explanation per filter

  • Hate

    • Input: not enabled, so no threshold applies. A hateful prompt can still be pasted and analysed; the output remains governed by the output rules.
    • Output: enabled and blocking at threshold Low. Triggers (example): hateful insults toward a protected class in the model’s output. Passes (example): neutral discussion of why hate speech is harmful, without slurs.

    Motivation

    Staff and students can still quote or paste primary sources that include slurs for analysis in teaching and research. The model itself remains barred from producing hateful output, which helps ensure a safe study environment.

    What if altered by the UvA

    • If input were enabled at low, pasting source texts that contain slurs would be blocked, which could hinder critical analysis in linguistics, history, law and media studies.
    • If output were relaxed to medium or disabled, the model could generate or amplify hateful content, creating compliance, safeguarding and reputational risks.
  • Sexual

    • Input: enabled and blocking at threshold Medium. Triggers (example): explicit or erotic user input. Passes (example): romance without explicit detail; clinical terms.
    • Output: enabled and blocking at threshold Medium. Triggers (example): generating explicit sexual content. Passes (example): non-explicit intimacy; high-level sexual health information.

    Motivation

    This allows legitimate work with literature, art, and sexual health materials while blocking pornographic or explicit content. It supports teaching and research that need neutral summaries or clinical discussion without sexual(ised) detail.

    What if altered by the UvA

    • If thresholds in both flows were tightened to low, even mild romance or standard sexual health language could be blocked, possibly disrupting teaching in literature, classics, medicine and social sciences.
    • If thresholds in both flows were raised to high or the filters disabled, explicit material and sexualised descriptions could be processed or produced. Although such material is not always irrelevant for academic purposes, allowing these uses carries a risk of violating existing policy standards.
  • Self-harm

    • Input: enabled and blocking at threshold Medium. Triggers (example): requests for methods, intent, or encouragement. Passes (example): support or awareness questions without instructions.
    • Output: enabled and blocking at threshold Medium. Triggers (example): any instructional or encouraging self-harm output. Passes (example): supportive, non-instructional guidance and signposting.

    Motivation

    Safeguards prevent acquisition or generation of methods for self-harm while still allowing discussion of epidemiology, prevention, ethics and policy, and enabling supportive, non-clinical signposting language where appropriate.

    What if altered by the UvA

    • If thresholds in both flows were relaxed to high or filters disabled, requests for methods could slip through, which poses clear safety and duty-of-care issues.
    • If thresholds in both flows were tightened to low, benign academic queries about prevention programmes or the history of public health campaigns might be blocked unnecessarily.
  • Violence

    • Input: enabled and blocking at threshold Low. Triggers (example): any violent or instructive content, even mild. Passes (example): historical discussion of events without violent instructions.
    • Output: enabled and blocking at threshold Low. Triggers (example): any violent output, even mild. Passes (example): non-graphic historical summary without instructions.

    Motivation

    The setting prevents the model from accepting or producing instructions for harm while still allowing neutral, high-level treatment of violent events in history, politics, anthropology or literature.

    What if altered by the UvA

    • If thresholds were raised to medium or filters disabled, queries or outputs could include practical guidance for violence or weaponisation, which is inappropriate and unsafe.
    • If tightened further beyond low, even neutral historical requests might be blocked, possibly impeding legitimate teaching and research.
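
Taken together, the settings described above can be summarised in a single overview. The sketch below expresses them as an illustrative configuration; the field names and structure are assumptions, and only the values (which flows are enabled and at which threshold) come from the descriptions above.

```python
# Illustrative summary of the current settings; None means the filter is not
# enabled for that flow.
CONTENT_FILTERS = {
    "hate":      {"input": None,     "output": "low"},
    "sexual":    {"input": "medium", "output": "medium"},
    "self_harm": {"input": "medium", "output": "medium"},
    "violence":  {"input": "low",    "output": "low"},
}
```

Read together with the earlier sketches, the hate entry shows why a hateful source text can still be pasted for analysis (no input filter) while the output filter continues to block hateful generations at the low threshold.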