From 1d0ada0697a7134dfa79495e7d0eec0bd78c8fbe Mon Sep 17 00:00:00 2001 From: Korey Stegared-Pace Date: Thu, 26 Oct 2023 22:31:58 +0200 Subject: [PATCH] Added Chris's Edits --- .../README.md | 70 ++++++++++++++----- 04-prompt-engineering-fundamentals/README.md | 59 +++++++++------- 07-building-chat-applications/README.md | 46 ++++++++++-- 08-building-search-applications/README.md | 16 ++--- .../README.md | 61 ++++++++++------ 12-designing-ux-for-ai-applications/README.md | 36 +++++++++- 6 files changed, 207 insertions(+), 81 deletions(-) diff --git a/02-exploring-and-comparing-different-llms/README.md b/02-exploring-and-comparing-different-llms/README.md index 7da6b4a8f..3c4f87cdb 100644 --- a/02-exploring-and-comparing-different-llms/README.md +++ b/02-exploring-and-comparing-different-llms/README.md @@ -6,22 +6,22 @@ ## Introduction -With the previous lesson, we have seen how Generative AI is changing the technology landscape, how Large Language Models (LLMs) work and how a business - like the Edu4All startup - can apply them to their use cases and grow! +With the previous lesson, we have seen how Generative AI is changing the technology landscape, how Large Language Models (LLMs) work and how a business - like our startup - can apply them to their use cases and grow! In this chapter, we're looking to compare and contrast different types of large language models, LLMs to understand their pros and cons. The next step in our startup's journey is exploring the current landscape of Large Language Models (LLMs) and understanding which are suitable for our use case. This lesson will cover: -- Different types of LLMs in the current landscape -- Testing, iterating, and comparing different models for your use case in Azure -- How to deploy an LLM +- Different types of LLMs in the current landscape. +- Testing, iterating, and comparing different models for your use case in Azure. +- How to deploy an LLM. 
## Learning Goals After completing this lesson, you will be able to: -- Select the right model for your use case -- Understand how to test, iterate, and improve performance of your model -- Know how Business deploy models +- Select the right model for your use case. +- Understand how to test, iterate, and improve performance of your model. +- Know how businesses deploy models. ## Understand different types of LLMs @@ -30,10 +30,9 @@ Large Language Models (LLMs) can have multiple categorizations based on their ar ### Foundation Models versus LLMs The term Foundation Model was [coined by Stanford researchers](https://arxiv.org/abs/2108.07258) and defined as an AI model that follows some criteria, such as: -- They are trained using unsupervised learning or self-supervised learning, meaning they are trained on unlabeled multimodal data, and they do not require human annotation or labeling of data for their training process. -- They are very large models, based on very deep neural networks trained on billions of parameters. -- They are normally intended to serve as a ‘foundation’ for other models, meaning they can be used as a starting point for other models to be built on top of, which can be done by fine-tuning. -Now, since foundation models have taken shape most strongly in the natural language processing domain, it’s common to use the terms foundation model and LLM interchangeably. However, to be precise, LLMs are a type of foundation model, usually trained on text data, that could be specialized for specific use cases, such as text summarization, translation, or question answering. In other words, not all foundation models are LLMs and LLMs can be seen as language-focused foundation models. +- **They are trained using unsupervised learning or self-supervised learning**, meaning they are trained on unlabeled multimodal data, and they do not require human annotation or labeling of data for their training process. 
+- **They are very large models**, based on very deep neural networks with billions of parameters. +- **They are normally intended to serve as a ‘foundation’ for other models**, meaning they can be used as a starting point for other models to be built on top of, which can be done by fine-tuning. ![Foundation Models versus LLMs](./images/FoundationModel.png) @@ -115,13 +114,36 @@ Most of the models we mentioned in previous paragraphs (OpenAI models, open sour ![Model deployment](./images/Llama4.png) -## Deploying LLMs +## Selecting appropriate LLM model within GPT model family +There are many different types of LLMs. Your choice of model depends on what you aim to use it for, your data, how much you're ready to pay and more. + +Depending on whether you aim to use the model for text, audio, video, image generation and so on, you might opt for a different type of model. + +- **Audio and speech recognition**. For this purpose, Whisper-type models are a great choice as they're general-purpose and aimed at speech recognition. They're trained on diverse audio and can perform multilingual speech recognition. Learn more about [Whisper type models here](https://platform.openai.com/docs/models/whisper). + +- **Image generation**. For image generation, DALL-E and Midjourney are two well-known choices. DALL-E is offered by Azure OpenAI. [Read more about DALL-E here](https://platform.openai.com/docs/models/dall-e) and also in Chapter 9 of this curriculum. + +- **Text generation**. Most models are trained on text generation and you have a large variety of choices from GPT-3, GPT-3.5 to GPT-4. As an example, within GPT-3 you can use everything from a cheaper but capable model like curie to the more costly but more performant davinci. They come at different costs with GPT-4 being the most expensive. It's worth looking into the [Azure Open AI playground](https://oai.azure.com/portal/playground) to evaluate which models best fit your needs in terms of capability and cost. 
+ +Selecting a model gives you some basic capabilities, but those might not be enough. Often you have company-specific data that you somehow need to tell the LLM about. There are a few different ways to approach that, covered in the upcoming section. + +## Improving LLM results We’ve explored with the Edu4All team different kinds of LLMs and a Cloud Platform (Azure Machine Learning) enabling us to compare different models, evaluate them on test data, improve performance and deploy them on inference endpoints. But when shall they consider fine-tuning a model rather than using a pre-trained one? Are there other approaches to improve model performance on specific workloads? + + +There are several approaches a business can use to get the results they need from an LLM. You can select different types of models with different degrees of training and deploy an LLM in production with different levels of complexity, cost, and quality. Here are some different approaches: + +- **Prompt engineering with context**. The idea is to provide enough context when you prompt to ensure you get the responses you need. + +- **Retrieval Augmented Generation, RAG**. Your data might exist in a database or web endpoint, for example. To ensure this data, or a subset of it, is included at the time of prompting, you can fetch the relevant data and make it part of the user's prompt. + +- **Fine-tuned model**. Here, you train the model further on your own data, which leads to the model being more exact and responsive to your needs but might be costly. -There are several approaches a business can use to deploy an LLM in production, with different levels of complexity, cost, and quality. Let’s look at them. 
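The Retrieval Augmented Generation approach listed above can be sketched in a few lines. This is only an illustration under loud assumptions: the `retrieve` and `augment_prompt` helpers, the word-overlap scoring, and the sample documents are all made up for this example, and a real system would more likely use embeddings and a vector store.

```python
def retrieve(question, documents, top_k=1):
    """Rank documents by how many lowercase words they share with the question."""
    question_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def augment_prompt(question, documents, top_k=1):
    """Fetch the most relevant documents and make them part of the prompt."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical company data, invented for this sketch.
documents = [
    "Course enrollment closes on the last Friday of each month.",
    "Refunds are available within 14 days of purchase.",
]
prompt = augment_prompt("When does enrollment close?", documents)
print(prompt)
```

The resulting string would then be sent to the model as the user's prompt, so the model can ground its answer in the fetched data instead of relying only on what it saw during training.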
![LLMs deployment](./images/Deploy.png) @@ -130,6 +152,7 @@ Img source: [Four Ways that Enterprises Deploy LLMs | Fiddler AI Blog](https://w ### Prompt Engineering with Context Pre-trained LLMs work very well on generalized natural language tasks, even by calling them with a short prompt, like a sentence to complete or a question – the so-called “zero-shot” learning. + However, the more the user can frame their query, with a detailed request and examples – the Context – the most accurate and closest to user’s expectations the answer will be. In this case, we talk about “one-shot” learning if the prompt includes only one example and “few shot learning” if it includes multiple examples. Prompt engineering with context is the most cost-effective approach to kick-off with. @@ -140,14 +163,29 @@ This can be overcome through RAG, a technique that augments prompt with external This technique is very helpful when a business doesn’t have enough data, enough time, or resources to fine-tune an LLM, but still wishes to improve performance on a specific workload and reduce risks of hallucinations, i.e., mystification of reality or harmful content. ### Fine-tuned model + Fine-tuning is a process that leverages transfer learning to ‘adapt’ the model to a downstream task or to solve a specific problem. Differently from few-shot learning and RAG, it results in a new model being generated, with updated weights and biases. It requires a set of training examples consisting of a single input (the prompt) and its associated output (the completion). This would be the preferred approach if: -- A business would like to use fine-tuned less capable models (like embedding models) rather than high performance models, resulting in a more cost effective and fast solution. -- Latency is important for a specific use-case, so it’s not possible to use very long prompts or the number of examples that should be learnt from the model doesn’t fit with the prompt length limit. 
-- A business has a lot of high-quality data and ground truth labels and the resources required to maintain this data up to date over time. + +- **Using fine-tuned models**. A business would like to use fine-tuned less capable models (like embedding models) rather than high performance models, resulting in a more cost effective and fast solution. + +- **Considering latency**. Latency is important for a specific use-case, so it’s not possible to use very long prompts or the number of examples that should be learnt from the model doesn’t fit with the prompt length limit. + +- **Staying up to date**. A business has a lot of high-quality data and ground truth labels and the resources required to maintain this data up to date over time. + ### Trained model Training an LLM from scratch is without a doubt the most difficult and the most complex approach to adopt, requiring massive amounts of data, skilled resources, and appropriate computational power. This option should be considered only in a scenario where a business has a domain-specific use case and a large amount of domain-centric data. +## Knowledge check + +Q1 For the following use case, what could be a good approach to improve LLM completion results? + + 1. Prompt engineering with context + 1. A2: RAG + 1. A3: Fine-tuned model + +A:3, if you have the time and resources and high quality data, fine-tuning is the better option to stay up to date. However, if you're looking at improving things and you're lacking time it's worth considering RAG first. + ## Great Work, Continue Your Learning! 
diff --git a/04-prompt-engineering-fundamentals/README.md b/04-prompt-engineering-fundamentals/README.md index 0dc4760b5..19806b801 100644 --- a/04-prompt-engineering-fundamentals/README.md +++ b/04-prompt-engineering-fundamentals/README.md @@ -3,29 +3,35 @@ [![Prompt Engineering Fundamentals ](./img/genai_course_4[10].png)](https://youtu.be/r2ItK3UMVTk) +How you write your prompt to the LLM matters: a carefully crafted prompt can achieve a better result than one that isn't. But what are these concepts, prompt and prompt engineering, and how do I improve what I send to the LLM? Questions like these are what this chapter and the upcoming chapter are looking to answer. + + _Generative AI_ is capable of creating new content (e.g., text, images, audio, code etc.) in response to user requests. It achieves this using _Large Language Models_ (LLMs) like OpenAI's GPT ("Generative Pre-trained Transformer") series that are trained for using natural language and code. Users can now interact with these models using familiar paradigms like chat, without needing any technical expertise or training. The models are _prompt-based_ - users send a text input (prompt) and get back the AI response (completion). They can then "chat with the AI" iteratively, in multi-turn conversations, refining their prompt till the response matches their expectations. "Prompts" now become the primary _programming interface_ for generative AI apps, telling the models what to do and influencing the quality of returned responses. "Prompt Engineering" is a fast-growing field of study that focuses on the _design and optimization_ of prompts to deliver consistent and quality responses at scale. -## 1.1 Learning Goals +## Learning Goals In this lesson, we learn what Prompt Engineering is, why it matters, and how we can craft more effective prompts for a given model and application objective. 
We'll understand core concepts and best practices for prompt engineering - and learn about an interactive Jupyter Notebooks "sandbox" environment where we can see these concepts applied to real examples. By the end of this lesson we will be able to: + 1. Explain what prompt engineering is and why it matters. 2. Describe the components of a prompt and how they are used. 3. Learn best practices and techniques for prompt engineering. 4. Apply learned techniques to real examples, using an OpenAI endpoint. -## 1.2 Learning Sandbox +## Learning Sandbox Prompt engineering is currently more art than science. The best way to improve our intuition for it is to _practice more_ and adopt a trial-and-error approach that combines application domain expertise with recommended techniques and model-specific optimizations. The Jupyter Notebook accompanying this lesson provides a _sandbox_ environment where you can try out what you learn - as you go, or as part of the code challenge at the end. To execute the exercises you will need: - 1. An OpenAI API key - the service endpoint for a deployed LLM. - 2. A Python Runtime - in which the Notebook can be executed. + +1. An OpenAI API key - the service endpoint for a deployed LLM. + +2. A Python Runtime - in which the Notebook can be executed. We have instrumented this repository with a _dev container_ that comes with a Python 3 runtime. Simply open the repo in GitHub Codespaces or on your local Docker Desktop, to activate the runtime automatically. Then open the notebook and select the Python 3.x kernel to prepare the Notebook for execution. @@ -33,7 +39,7 @@ The default notebook is setup for use with an OpenAI API Key. Simply copy the `. The notebook comes with _starter_ exercises - but you are encouraged to add your own _Markdown_ (description) and _Code_ (prompt requests) sections to try out more examples or ideas - and build your intuition for prompt design. 
-## 1.3 Our Startup +## Our Startup Now, let's talk about how _this topic_ relates to our startup mission to [bring AI innovation to education](https://educationblog.microsoft.com/2023/06/collaborating-to-bring-ai-innovation-to-education). We want to build AI-powered applications of _personalized learning_ - so let's think about how different users of our application might "design" prompts: @@ -53,19 +59,20 @@ Prompt Engineering. Define it and explain why it is needed. --> -## 1.4 What is Prompt Engineering? +## What is Prompt Engineering? We started this lesson by defining **Prompt Engineering** as the process of _designing and optimizing_ text inputs (prompts) to deliver consistent and quality responses (completions) for a given application objective and model. We can think of this as a 2-step process: - _designing_ the initial prompt for a given model and objective - _refining_ the prompt iteratively to improve quality of response This is necessarily a trial-and-error process that requires user intuition and effort for getting optimal results. So why is it important? To answer that question, we first need to understand three concepts: + - _Tokenization_ = how the model "sees" the prompt - _Base LLMs_ = how the foundation model "processes" a prompt - _Instruction-Tuned LLMs_ = how the model can now see "tasks" -### 1.4.1 Tokenization +### Tokenization An LLM sees prompts as a _sequence of tokens_ where different models (or versions of a model) can tokenize the same prompt in different ways. Since LLMs are trained on tokens (and not on raw text), the way prompts get tokenized has a direct impact on the quality of the generated response. 
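As a rough illustration of the point above, the sketch below tokenizes one prompt with two toy schemes and shows that the resulting sequences differ. Both tokenizers are hypothetical stand-ins written for this example; real models use learned subword vocabularies, which this sketch does not try to reproduce.

```python
import re


def word_tokenize(prompt: str) -> list[str]:
    """Toy scheme 1: split into words and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", prompt)


def chunk_tokenize(prompt: str, size: int = 4) -> list[str]:
    """Toy scheme 2: split into fixed-size character chunks."""
    return [prompt[i:i + size] for i in range(0, len(prompt), size)]


prompt = "Jupiter is the fifth planet from the Sun."
word_tokens = word_tokenize(prompt)
chunk_tokens = chunk_tokenize(prompt)

# Same prompt, two different token sequences of different lengths.
print(len(word_tokens), word_tokens)
print(len(chunk_tokens), chunk_tokens)
```

Because a model only ever sees the token sequence, two models that split the same text differently are effectively receiving different inputs, which is one reason the same prompt can behave differently across models.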
@@ -73,7 +80,7 @@ To get an intuition for how tokenization works, try tools like the [OpenAI Token ![Tokenization](./img/4.0-tokenizer-example.png) -### 1.4.2 Concept: Foundation Models +### Concept: Foundation Models Once a prompt is tokenized, the primary function of the ["Base LLM"](https://blog.gopenai.com/an-introduction-to-base-and-instruction-tuned-large-language-models-8de102c785a6) (or Foundation model) is to predict the token in that sequence. Since LLMs are trained on massive text datasets, they have a good sense of the statistical relationships between tokens and can make that prediction with some confidence. Not that they don't understand the _meaning_ of the words in the prompt or token; they just see a pattern they can "complete" with their next prediction. They can continue predicting the sequence till terminated by user intervention or some pre-established condition. @@ -83,7 +90,7 @@ But what if the user wanted to see something specific that met some criteria or ![Base LLM Chat Completion](./img/4.0-playground-chat-base.png) -### 1.4.3 Concept: Instruction Tuned LLMs +### Concept: Instruction Tuned LLMs An [Instruction Tuned LLM](https://blog.gopenai.com/an-introduction-to-base-and-instruction-tuned-large-language-models-8de102c785a6) starts with the foundation model and fine-tunes it with examples or input/output pairs (e.g., multi-turn "messages") that can contain clear instructions - and the response from the AI attempt to follow that instruction. @@ -98,7 +105,7 @@ See how the result is now tuned to reflect the desired goal and format? An educa ![Instruction Tuned LLM Chat Completion](./img/4.0-playground-chat-instructions.png) -## 1.5 Why do we need Prompt Engineering? +## Why do we need Prompt Engineering? Now that we know how prompts are processed by LLMs, let's talk about _why_ we need prompt engineering. 
The answer lies in the fact that current LLMs pose a number of challenges that make _reliable and consistent completions_ more challenging to achieve without putting effort into prompt construction and optimization. For instance: @@ -113,7 +120,7 @@ Let's see this in action in the OpenAI or Azure OpenAI Playground: - Use the same prompt with different LLM deployments (e.g, OpenAI , Azure OpenAI, Hugging Face) - did you see the variations?. - Use the same prompt repeatedly with the _same_ LLM deployment (e.g., Azure OpenAI playground) - how did these variations differ? -### 1.5.1 Hallucinations Example +### Hallucinations Example Want to get a sense of how hallucinations work? Think of a prompt that instructs the AI to generate content for a non-existent topic (to ensure it is not found in the training dataset). For example - I tried this prompt: > **Prompt:** generate a lesson plan on the Martian War of 2076. @@ -138,7 +145,7 @@ As expected, each model (or model version) produces slightly different responses Prompt engineering techniques like _metaprompting_ and _temperature configuration_ may reduce model hallucinations to some extent. New prompt engineering _architectures_ also incorporate new tools and techniques seamlessly into the prompt flow, to mitigate or reduce some of these effects. -## 1.6 Case Study: GitHub Copilot +## Case Study: GitHub Copilot Let's wrap this section by getting a sense for how prompt engineering is used in real-world solutions by looking at one Case Study: [GitHub Copilot](https://github.com/features/copilot). @@ -167,11 +174,11 @@ Prompt Design. Illustrated with examples. --> -## 1.7 Prompt Construction +## Prompt Construction We've seen why prompt engineering is important - now let's understand how prompts are _constructed_ so we can evaluate different techniques for more effective prompt design. -### 1.7.1 Basic Prompt +### Basic Prompt Let's start with the basic prompt: a text input sent to the model with no other context. 
Here's an example - when we send the first few words of the US national anthem to the OpenAI [Completion API](https://platform.openai.com/docs/api-reference/completions) it instantly _completes_ the response with the next few lines, illustrating the basic prediction behavior. @@ -181,7 +188,7 @@ Let's start with the basic prompt: a text input sent to the model with no other | Oh say can you see | It sounds like you're starting the lyrics to "The Star-Spangled Banner," the national anthem of the United States. The full lyrics are ... | -### 1.7.2 Complex Prompt +### Complex Prompt Now let's add context and instructions to that basic prompt. The [Chat Completion API](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt) lets us construct a complex prompt as a collection of _messages_ with - input/output pairs reflecting _user_ input and _assistant_ response @@ -202,7 +209,7 @@ response = openai.ChatCompletion.create( ) ``` -### 1.7.3 Instruction Prompt +### Instruction Prompt In the above examples, the user prompt was a simple text query that can be interpreted as a request for information. With _instruction_ prompts, we can use that text to specify a task in more detail, providing better guidance to the AI. Here's an example: @@ -212,7 +219,7 @@ In the above examples, the user prompt was a simple text query that can be inter |Write a description of the Civil War. Provide key dates and events and describe their significance | _returned a paragraph followed by a list of key event dates with descriptions_| Complex | |Write a description of the Civil War in 1 paragraph. Provide 3 bullet points with key dates and their significance. Provide 3 more bullet points with key historical figures and their contributions. Return the output as a JSON file | _returns more extensive details in a text box, formatted as JSON that you can copy-paste to a file and validate as needed_| Complex. Formatted. 
| -## 1.8 Primary Content +## Primary Content In the above examples, the prompt was still fairly open-ended, allowing the LLM to decide what part of its pre-trained dataset was relevant. With the _primary content_ design pattern, the input text is divided into two parts: - an instruction (action) @@ -224,7 +231,7 @@ Here's an example where the instruction is to "summarize this in 2 sentences". |:---|:---| | Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.
**Summarize this in 2 short sentences** | Jupiter, the fifth planet from the Sun, is the largest in the Solar System and is known for being one of the brightest objects in the night sky. Named after the Roman god Jupiter, it's a gas giant whose mass is two-and-a-half times that of all other planets in the Solar System combined. | -The primary content segment can be used in various ways to drive more effective instructions. +The primary content segment can be used in various ways to drive more effective instructions: - **Examples** - instead of telling the model what to do with an explicit instruction, give it examples of what to do and let it infer the pattern. - **Cues** - follow the instruction with a "cue" that primes the completion, guiding the model towards more relevant responses. @@ -232,7 +239,7 @@ The primary content segment can be used in various ways to drive more effective Let's explore these in action. -### 1.8.1 Using Examples +### Using Examples This is an approach where you use the primary content to "feed the model" some examples of the desired output for a given instruction, and let it infer the patter for the desired output. Based on the number of examples provided, we can have zero-shot prompting, one-shot prompting, few-shot prompting etc. @@ -251,7 +258,7 @@ The prompt now consists of three components: Note how we had to provide an explicit instruction ("Translate to Spanish") in zero-shot prompting, but it gets inferred in the one-shot prompting example. The few-shot example shows how adding more examples allows models to make more accurate inferences with no added instructions. -### 1.8.2 Prompt Cues +### Prompt Cues Another technique for using primary content is to provide _cues_ rather than examples. In this case, we are giving the model a nudge in the right direction by _starting it off_ with a snippet that reflects the desired response format. The model then "takes the cue" to continue in that vein. 
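The example and cue patterns described above come down to how the prompt text is assembled before it is sent to the model. Here is a minimal sketch of that assembly; the `build_prompt` helper, its parameters, and the `=>` separator are assumptions made for illustration and are not part of any SDK.

```python
def build_prompt(instruction="", content="", examples=(), cue=""):
    """Assemble a prompt from an optional instruction, example pairs,
    primary content, and a trailing cue that primes the completion."""
    parts = []
    if instruction:
        parts.append(instruction)
    for source, target in examples:
        parts.append(f"{source} => {target}")
    if content:
        parts.append(content)
    if cue:
        parts.append(cue)
    return "\n".join(parts)


# Zero-shot: an explicit instruction and the content, no examples.
zero_shot = build_prompt("Translate to Spanish:", content="The Sun is shining")

# Few-shot: no instruction; the model is left to infer the task from the pairs.
few_shot = build_prompt(
    examples=[("Good morning", "Buenos días"), ("Thank you", "Gracias")],
    cue="The Sun is shining =>",
)

# Cue: the prompt ends with the start of the desired answer.
cued = build_prompt(
    "Summarize this in 2 short sentences.",
    content="Jupiter is the fifth planet from the Sun and the largest in the Solar System.",
    cue="Top takeaway: Jupiter is",
)

print(few_shot)
```

Keeping the assembly in one function makes it easy to try the same content with and without examples or cues and compare the completions.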
@@ -263,7 +270,7 @@ Another technique for using primary content is to provide _cues_ rather than exa | | | | -### 1.8.3 Prompt Templates +### Prompt Templates A prompt template is a _pre-defined recipe for a prompt_ that can be stored and reused as needed, to drive more consistent user experiences at scale. In its simplest form, it is simply a collection of prompt examples like [this one from OpenAI](https://platform.openai.com/examples) that provides both the interactive prompt components (user and system messages) and the API-driven request format - to support reuse. @@ -272,7 +279,7 @@ In it's more complex form like [this example from LangChain](https://python.lang Finally, the real value of templates lies in the ability to create and publish _prompt libraries_ for vertical application domains - where the prompt template is now _optimized_ to reflect application-specific context or examples that make the responses more relevant and accurate for the targted user audience. The [Prompts For Edu](https://github.com/microsoft/prompts-for-edu) repository is a great example of this approach, curating a library of prompts for the education domain with emphasis on key objectives like lesson planning, curriculum design, student tutoring etc. -## 1.9 Supporting Content +## Supporting Content If we think about prompt construction as having a instruction (task) and a target (primary content), then _secondary content_ is like additional context we provide to **influence the output in some way**. It could be tuning parameters, formatting instructions, topic taxonomies etc. that can help the model _tailor_ its response to be suit the desired user objectives or expectations. @@ -296,11 +303,11 @@ What are some basic techniques for prompt engineering? Illustrate it with some exercises. --> -## 1.10 Prompting Best Practices +## Prompting Best Practices Now that we know how prompts can be _constructed_, we can start thinking about how to _design_ them to reflect best practices. 
We can think about this in two parts - having the right _mindset_ and applying the right _techniques_. -### 1.10.1 Prompt Engineering Mindset +### Prompt Engineering Mindset Prompt Engineering is a trial-and-error process so keep three broad guiding factors in mind: @@ -311,7 +318,7 @@ Prompt Engineering is a trial-and-error process so keep three broad guiding fact 3. **Iteration & Validation Matters.** Models are evolving rapidly, and so are the techniques for prompt engineering. As a domain expert, you may have other context or criteria _your_ specific application, that may not apply to the broader community. Use prompt engineering tools & techniques to jumpstart prompt construction, then iterate and validate the results using your own intuition and domain expertise. Record your insights and create a **knowledge base** (e.g, prompt libraries) that can be used as a new baseline by others, for faster iterations in future. -## 1.10.2 Best Practices +## Best Practices Now let's look at common best practices that are recommended by [Open AI](https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api) and [Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/prompt-engineering#best-practices) practitioners. @@ -342,7 +349,7 @@ Link to a copy of that Notebook with the prompts filled in and run, showing what --> -## 1.11 Code Challenge +## Code Challenge Congratulations! You made it to the end of the lesson! It's time to put some of those concepts and techniques to the test with real examples! diff --git a/07-building-chat-applications/README.md b/07-building-chat-applications/README.md index 4946427e2..750f4d235 100644 --- a/07-building-chat-applications/README.md +++ b/07-building-chat-applications/README.md @@ -5,16 +5,27 @@ *(Click the image above to view video of this lesson)* -Chat applications have become integrated into our daily lives, offering more than just a means of casual conversation. 
They're integral parts of customer service, technical support, and even sophisticated advisory systems. It's likely that you've gotten some help from a chat application not too long ago. As we integrate more advanced technologies like generative AI into these platforms, the complexity increases and so does the challenges. How do we efficiently build and seamlessly integrate these AI-powered applications for specific use cases? Once deployed, how can we monitor and ensure that the applications are operating at the highest level of quality, both in terms of functionality and adhering to the [six principles of responsible AI](https://www.microsoft.com/ai/responsible-ai)? +Now that we've seen how we can build text-generation apps, let's look into chat applications. + +Chat applications have become integrated into our daily lives, offering more than just a means of casual conversation. They're integral parts of customer service, technical support, and even sophisticated advisory systems. It's likely that you've gotten some help from a chat application not too long ago. As we integrate more advanced technologies like generative AI into these platforms, the complexity increases and so does the challenges. + +Some questions we need answered are: + +- **Building the app**. How do we efficiently build and seamlessly integrate these AI-powered applications for specific use cases? +- **Monitoring**. Once deployed, how can we monitor and ensure that the applications are operating at the highest level of quality, both in terms of functionality and adhering to the [six principles of responsible AI](https://www.microsoft.com/ai/responsible-ai)? As we move further into an age defined by automation and seamless human-machine interactions, understanding how generative AI transforms the scope, depth, and adaptability of chat applications becomes essential. 
This lesson will investigate the aspects of architecture that support these intricate systems, delve into the methodologies for fine-tuning them for domain-specific tasks, and evaluate the metrics and considerations pertinent to ensuring responsible AI deployment. +## Introduction + This lesson covers: + - Techniques for efficiently building and integrating chat applications. - How to apply customization and fine-tuning to applications. - Strategies and considerations to effectively monitor chat applications. ## Learning Goals + By the end of this lesson, you'll be able to: - Describe considerations for building and integrating chat applications into existing systems. @@ -68,7 +79,7 @@ The above example uses the GPT-3.5 Turbo model to complete the prompt, but notic AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = ', or you can set the environment variable OPENAI_API_KEY=). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = '. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details. ``` -### User Experience (UX) +## User Experience (UX) General UX principles apply to chat applications, but here's some additional considerations that become particularly important due to the machine learning components involved. @@ -85,7 +96,7 @@ This "profile" prompts ChatGPT to create a lesson plan on linked lists. 
Notice t ![A prompt in ChatGPT for a lesson plan about linked lists](img/lesson_plan_prompt.png) -#### Microsoft's System Message Framework for Large Language Models +### Microsoft's System Message Framework for Large Language Models [Microsoft has provided guidance](https://learn.microsoft.com/azure/ai-services/openai/concepts/system-message#define-the-models-output-format) for writing effective system messages when generating responses from LLMs broken down into 4 areas: @@ -106,9 +117,32 @@ This "profile" prompts ChatGPT to create a lesson plan on linked lists. Notice t ## Customization and Fine-tuning for Domain-Specific Language Models -Imagine a chat application that understands your company's jargon and anticipates the specific queries its user base commonly has. Domain-specific language models (DSL Models) can enhance user engagement and by providing specialized, contextually relevant interactions. It's a model that is trained or fine-tuned to understand and generate text related to a specific field, industry, or subject. Options for using a DSL model can vary from training one from scratch, to using pre-existing ones through SDKs and APIs. Another option is fine-tuning, which involves taking an existing pre-trained model and adapting it for a specific domain. +Imagine a chat application that understands your company's jargon and anticipates the specific queries its user base commonly has. There are a couple of approaches worth mentioning: + +- **Leveraging DSL models**. DSL stands for domain-specific language. You can leverage a so-called DSL model trained on a specific domain to understand its concepts and scenarios. +- **Apply fine-tuning**. Fine-tuning is the process of further training your model with specific data. + +### Using a DSL + +Leveraging a domain-specific language model (DSL model) can enhance user engagement by providing specialized, contextually relevant interactions.
It's a model that is trained or fine-tuned to understand and generate text related to a specific field, industry, or subject. Options for using a DSL model can vary from training one from scratch, to using pre-existing ones through SDKs and APIs. Another option is fine-tuning, which involves taking an existing pre-trained model and adapting it for a specific domain. + +### Apply fine-tuning + + +Fine-tuning is often considered when a pre-trained model falls short in a specialized domain or specific task. + +For instance, medical queries are complex and require a lot of context. When a medical professional diagnoses a patient, the diagnosis is based on a variety of factors such as lifestyle or pre-existing conditions, and they may even rely on recent medical journals to validate it. In such nuanced scenarios, a general-purpose AI chat application cannot be a reliable source. + +**Scenario: a medical application** + +Consider a chat application designed to assist medical practitioners by providing quick references to treatment guidelines, drug interactions, or recent research findings. + +While a general-purpose model might be adequate for answering basic medical questions or providing general advice, it may struggle with the following: + +- **Highly specific or complex cases**. For example, a neurologist might ask the application, "What are the current best practices for managing drug-resistant epilepsy in pediatric patients?" +- **Lacking recent advancements**. A general-purpose model could struggle to provide a current answer that incorporates the most recent advancements in neurology and pharmacology. -Fine-tuning is often considered when a pre-trained model falls short in a specialized domain or specific task. For instance, medical queries are complex and require a lot of context.
When a medical professional diagnoses a patient it's based on a variety of factors such as lifestyle or pre-existing conditions, and may even rely on recent medical journals to validate their diagnosis. In such nuanced scenarios, a general-purpose AI chat application cannot be a reliable source. Consider a chat application designed to assist medical practitioners by providing quick references to treatment guidelines, drug interactions, or recent research findings. While the original, general-purpose model might be adequate for answering basic medical questions or providing general advice, it may struggle with highly specific or complex cases. For example, a neurologist might ask the application, "What are the current best practices for managing drug-resistant epilepsy in pediatric patients?" A general-purpose model could struggle to provide a current answer that incorporates the most recent advancements in neurology and pharmacology. In instances such as these, fine-tuning the model with a specialized medical dataset can significantly improve its ability to handle these intricate medical inquiries more accurately and reliably. This requires access to a large and relevant dataset that represents the domain-specific challenges and questions that need to be addressed. +In instances such as these, fine-tuning the model with a specialized medical dataset can significantly improve its ability to handle these intricate medical inquiries more accurately and reliably. This requires access to a large and relevant dataset that represents the domain-specific challenges and questions that need to be addressed. ## Considerations for a High Quality AI-Driven Chat Experience @@ -147,6 +181,8 @@ Microsoft's approach to Responsible AI has identified six principles that should | Transparency | AI systems should be understandable. | Provide clear documentation and reasoning for AI responses. | Users are more likely to trust a system if they can understand how decisions are made. 
| | Accountability | People should be accountable for AI systems. | Establish a clear process for auditing and improving AI decisions. | Enables ongoing improvement and corrective measures in case of mistakes. | +## Coding Challenge +See the [assignment](./notebook-azure-openai.ipynb). It will take you through a series of exercises, from running your first chat prompts to classifying and summarizing text and more. diff --git a/08-building-search-applications/README.md b/08-building-search-applications/README.md index 629c934e0..f80c79b5c 100644 --- a/08-building-search-applications/README.md +++ b/08-building-search-applications/README.md @@ -6,7 +6,7 @@ There's more to LLMs than chatbots and text generation. It's also possible to build search applications using Embeddings. Embeddings are numerical representations of data also known as vectors, and can be used for semantic search for data. -In this lesson, you are going to build a search application for our startup. Our startup is a non-profit organization that provides free education to students in developing countries. Our startup has a large number of YouTube videos that students can use to learn about AI. Our startup wants to build a search application that allows students to search for a YouTube video by typing a question. +In this lesson, you are going to build a search application for our education startup Edu4All. Our startup is a non-profit organization that provides free education to students in developing countries. Our startup has a large number of YouTube videos that students can use to learn about AI. Our startup wants to build a search application that allows students to search for a YouTube video by typing a question. For example, a student might type in 'What are Jupyter Notebooks?'
or 'What is Azure ML' and the search application will return a list of YouTube videos that are relevant to the question, and better still, the search application will return a link to the place in the video where the answer to the question is located. @@ -23,10 +23,10 @@ In this lesson, we will cover: After completing this lesson, you will be able to: -- The difference between semantic and keyword search. +- Tell the difference between semantic and keyword search. - Explain what Text Embeddings are. -- Explain how to use Embeddings to search for data. -- Create an application that uses Embeddings to search for data. +- Create an application using Embeddings to search for data. + ## Why build a search application? @@ -36,7 +36,7 @@ The lesson includes an Embedding Index of the YouTube transcripts for the Micros The following is an example of a semantic query for the question 'can you use rstudio with azure ml?'. Check out the YouTube url, you'll see the url contains a timestamp that takes you to the place in the video where the answer to the question is located. -![](media/query_results.png) +![Semantic query for the question "can you use rstudio with Azure ML"](media/query_results.png) ## What is semantic search? @@ -92,19 +92,17 @@ Next, we're going to learn how to build a search application using Embeddings. T This solution was built and tested on Windows 11, macOS, and Ubuntu 22.04 using Python 3.10 or later. You can download Python from [python.org](https://www.python.org/downloads/). -## Assignment - let's enable students +## Assignment - building a search application, to enable students We introduced our startup at the beginning of this lesson. Now it's time to enable the students to build a search application for their assessments. -## Creating the Azure OpenAI Services - In this assignment, you will create the Azure OpenAI Services that will be used to build the search application. You will create the following Azure OpenAI Services. 
You'll need an Azure subscription to complete this assignment. ### Start the Azure Cloud Shell 1. Sign in to the [Azure portal](https://portal.azure.com/). 2. Select the Cloud Shell icon in the upper-right corner of the Azure portal. -3. Select Bash for the environment type. +3. Select **Bash** for the environment type. #### Create a resource group diff --git a/10-building-low-code-ai-applications/README.md b/10-building-low-code-ai-applications/README.md index 763f82829..967094d3f 100644 --- a/10-building-low-code-ai-applications/README.md +++ b/10-building-low-code-ai-applications/README.md @@ -7,11 +7,14 @@ ## Introduction +Now that we've learned how to build image generating applications, let's talk about low code. Generative AI can be used for a variety of different areas including low code, but what is low code and how can we add AI to it? + Building apps and solutions has become more easier for traditional developers and non-developers through the use of Low Code Development Platforms. Low Code Development Platforms enable you to build apps and solutions with little to no code. This is achieved by providing a visual development environment that enables you to drag and drop components to build apps and solutions. This enables you to build apps and solutions faster and with less resources. In this lesson, we dive deep into how to use Low Code and how to enhance low code development with AI using Power Platform. The Power Platform provides organizations with the opportunity to empower their teams to build their own solutions through an intuitive low-code or no-code environment. This environment helps simplify the process of building solutions. With Power Platform, solutions can be built in days or weeks instead of months or years. Power Platform consists of five key products: Power Apps, Power Automate, Power BI, Power Pages and Power Virtual Agents. 
This lesson covers: + - Introduction to Generative AI in Power Platform - Introduction to Copilot and how to use it - Using Generative AI to build apps and flows in Power Platform @@ -21,29 +24,42 @@ This lesson covers: By the end of this lesson, you will be able to: -- Understand how Copilot works in Power Platform +- Understand how Copilot works in Power Platform. + - Build a Student Assignment Tracker App for our education startup. -- Build an Invoice Processing Flow that uses AI to extract information from invoices -- Apply best practices when using the Create Text with GPT AI Model + +- Build an Invoice Processing Flow that uses AI to extract information from invoices. + +- Apply best practices when using the Create Text with GPT AI Model. The tools and technologies that you will use in this lesson are: -- Power Apps for the Student Assignment Tracker app, which provides a low-code development environment for building apps to track, manage and interact with data. -- Dataverse for storing the data for the Student Assignment Tracker app where Dataverse will provide a low-code data platform for storing the apps data. -- Power Automate for the Invoice Processing flow where you will have low-code development environment for building workflows to automate the Invoice Processing process. -- AI Builder for the Invoice Processing AI Model where you will use prebuilt AI Models to process the invoices for our startup. +- **Power Apps**, for the Student Assignment Tracker app, which provides a low-code development environment for building apps to track, manage and interact with data. + +- **Dataverse**, for storing the data for the Student Assignment Tracker app where Dataverse will provide a low-code data platform for storing the app's data. + +- **Power Automate**, for the Invoice Processing flow where you will have a low-code development environment for building workflows to automate the Invoice Processing process.
+ +- **AI Builder**, for the Invoice Processing AI Model where you will use prebuilt AI Models to process the invoices for our startup. ## Generative AI in Power Platform -Enhancing low-code development and application with generative AI is a key focus area for Power Platform. The goal is to enable everyone to build AI-powered apps, sites, dashboards and automate processes with AI, without requiring any data science expertise. This is achieved by integrating generative AI into the low-code development experience in Power Platform in the form of Copilot and AI Builder. +Enhancing low-code development and application with generative AI is a key focus area for Power Platform. The goal is to enable everyone to build AI-powered apps, sites, dashboards and automate processes with AI, *without requiring any data science expertise*. This goal is achieved by integrating generative AI into the low-code development experience in Power Platform in the form of Copilot and AI Builder. + +### How does this work? + +Copilot is an AI assistant that enables you to build Power Platform solutions by describing your requirements in a series of conversational steps using natural language. You can for example instruct your AI assistant to state what fields your app will use and it will create both the app and the underlying data model or you could specify how to set up a flow in Power Automate. -How does this work? Copilot is an AI assistant that enables you to build Power Platform solutions by describing your requirements in a series of conversational steps using natural language. You can use Copilot driven functionalities as feature in your app screens to enable users to uncover insights through conversational interactions. AI Builder is a low-code AI capability available in Power Platform that enables you to use AI Models to help you to automate processes and predict outcomes. 
With AI Builder you can bring AI to your apps and flows that connect to your data in Dataverse or in various cloud data sources, such as SharePoint, OneDrive or Azure. +You can use Copilot driven functionalities as a feature in your app screens to enable users to uncover insights through conversational interactions. + + +AI Builder is a low-code AI capability available in Power Platform that enables you to use AI Models to help you to automate processes and predict outcomes. With AI Builder you can bring AI to your apps and flows that connect to your data in Dataverse or in various cloud data sources, such as SharePoint, OneDrive or Azure. Copilot is available in all of the Power Platform products: Power Apps, Power Automate, Power BI, Power Pages and Power Virtual Agents. AI Builder is available in Power Apps and Power Automate. In this lesson, we will focus on how to use Copilot and AI Builder in Power Apps and Power Automate to build a solution for our education startup. ### Copilot in Power Apps -As part of the Power Platform, Power Apps provides a low-code development environment for building apps to track, manage and interact with data. It is a suite of app development services with a scalable data platform and the ability to connect to cloud services and on-premises data. Power Apps allows you to build apps that run on browsers, tablets, and phones, and can be shared with co-workers. Power Apps eases users into app development with a simple interface, so that every business user or pro developer can build custom apps. The app development experience is also enhanced with Generative AI through Copilot. +As part of the Power Platform, Power Apps provides a low-code development environment for building apps to track, manage and interact with data. It's a suite of app development services with a scalable data platform and the ability to connect to cloud services and on-premises data. 
Power Apps allows you to build apps that run on browsers, tablets, and phones, and can be shared with co-workers. Power Apps eases users into app development with a simple interface, so that every business user or pro developer can build custom apps. The app development experience is also enhanced with Generative AI through Copilot. The copilot AI assistant feature in Power Apps enables you to describe what kind of app you need and what information you want your app to track, collect, or show. Copilot then generates a responsive Canvas app based on your description. You can then customize the app to meet your needs. The AI Copilot also generates and suggests a Dataverse Table with the fields you need to store the data you want to track and some sample data. We will look at what Dataverse is and how you can use it in Power Apps on this lesson later. You can then customize the table to meet your needs using the AI Copilot assistant feature through conversational steps. This feature is readily available from the Power Apps home screen. @@ -53,7 +69,7 @@ As part of the Power Platform, Power Automate lets users create automated workfl The copilot AI assistant feature in Power Automate enables you to describe what kind of flow you need and what actions you want your flow to perform. Copilot then generates a flow based on your description. You can then customize the flow to meet your needs. The AI Copilot also generates and suggests the actions you need to perform the task you want to automate. We will look at what flows are and how you can use them in Power Automate on this lesson later. You can then customize the actions to meet your needs using the AI Copilot assistant feature through conversational steps. This feature is readily available from the Power Automate home screen. -## Use Copilot to Build a Solution for Our Startup +## Assignment: manage student assignments and invoices for our startup, using Copilot Our startup provides online courses to students. 
The startup has grown rapidly and is now struggling to keep up with the demand for its courses. The startup has hired you as a Power Platform developer to help them build a low code solution to help them manage their student assignments and invoices. Their solution should be able to help them track and manage student assignments through an app and automate the invoice processing process through a workflow. You have been asked to use Generative AI to develop the solution. @@ -69,17 +85,18 @@ You will build the app using Copilot in Power Apps following the steps below: 2. Use the text area on the home screen to describe the app you want to build. For example, ***I want to build an app to track and manage student assignments***. Click on the **Send** button to send the prompt to the AI Copilot. - ![](images/copilot-chat-prompt-powerapps.png) + ![Describe the app you want to build](images/copilot-chat-prompt-powerapps.png) + 3. The AI Copilot will suggest a Dataverse Table with the fields you need to store the data you want to track and some sample data. You can then customize the table to meet your needs using the AI Copilot assistant feature through conversational steps. > **Important**: Dataverse is the underlying data platform for Power Platform. It is a low-code data platform for storing the apps data. It is a fully managed service that securely stores data in the Microsoft Cloud and is provisioned within your Power Platform environment. It comes with built-in data governance capabilities, such as data classification, data lineage, fine-grained access control, and more. You can learn more about Dataverse [here](https://docs.microsoft.com/en-us/powerapps/maker/data-platform/data-platform-intro?WT.mc_id=academic-109639-somelezediko). - ![](images/copilot-dataverse-table-powerapps.png) + ![Suggested fields in your new table](images/copilot-dataverse-table-powerapps.png) 4. 
Educators want to send emails to the students who have submitted their assignments to keep them updated on the progress of their assignments. You can use Copilot to add a new field to the table to store the student email. For example, you can use the following prompt to add a new field to the table: ***I want to add a column to store student email***. Click on the **Send** button to send the prompt to the AI Copilot. - ![](images/copilot-new-column.png) +![Adding a new field](images/copilot-new-column.png) 5. The AI Copilot will generate a new field and you can then customize the field to meet your needs. @@ -89,7 +106,7 @@ You will build the app using Copilot in Power Apps following the steps below: 8. For educators to send emails to students, you can use Copilot to add a new screen to the app. For example, you can use the following prompt to add a new screen to the app: ***I want to add a screen to send emails to students***. Click on the **Send** button to send the prompt to the AI Copilot. - ![](images/copilot-new-screen.png) +![Adding a new screen via a prompt instruction](images/copilot-new-screen.png) 9. The AI Copilot will generate a new screen and you can then customize the screen to meet your needs. @@ -125,15 +142,15 @@ To create a table in Dataverse using Copilot, follow the steps below: 2. On the left navigation bar, select on **Tables** and then click on **Describe the new Table**. - ![](images/describe-new-table.png) +![Select new table](images/describe-new-table.png) 3. On the **Describe the new Table** screen, use the text area to describe the table you want to create. For example, ***I want to create a table to store invoice information***. Click on the **Send** button to send the prompt to the AI Copilot. - ![](images/copilot-chat-prompt-dataverse.png) +![Describe the table](images/copilot-chat-prompt-dataverse.png) 4. The AI Copilot will suggest a Dataverse Table with the fields you need to store the data you want to track and some sample data. 
You can then customize the table to meet your needs using the AI Copilot assistant feature through conversational steps. - ![](images/copilot-dataverse-table.png) +![Suggested Dataverse table](images/copilot-dataverse-table.png) 5. The finance team want to send an email to the supplier to update them with the current status of their invoice. You can use Copilot to add a new field to the table to store the supplier email. For example, you can use the following prompt to add a new field to the table: ***I want to add a column to store supplier email***. Click on the **Send** button to send the prompt to the AI Copilot. @@ -145,7 +162,7 @@ To create a table in Dataverse using Copilot, follow the steps below: AI Builder is a low-code AI capability available in Power Platform that enables you to use AI Models to help you to automate processes and predict outcomes. With AI Builder you can bring AI to your apps and flows that connect to your data in Dataverse or in various cloud data sources, such as SharePoint, OneDrive or Azure. -### Prebuilt AI Models vs Custom AI Models +## Prebuilt AI Models vs Custom AI Models AI Builder provides two types of AI Models: Prebuilt AI Models and Custom AI Models. Prebuilt AI Models are ready-to-use AI Models that are trained by Microsoft and available in Power Platform.These help you add intelligence to your apps and flows without having to gather data and then build, train and publish your own models. You can use these models to automate processes and predict outcomes. @@ -162,9 +179,9 @@ Some of the Prebuilt AI Models available in Power Platform include: With Custom AI Models you can bring your own model into AI Builder so that it can function like any AI Builder custom model, allowing you to train the model using your own data. You can use these models to automate processes and predict outcomes in both Power Apps and Power Automate. When using your own model there are limitations that apply. 
Read more on these [limitations](https://learn.microsoft.com/en-us/ai-builder/byo-model#limitations). -![](images/ai-builder-models.png) +![AI builder models](images/ai-builder-models.png) -### Build an Invoice Processing Flow for Our Startup +## Assignment #2 - Build an Invoice Processing Flow for Our Startup The finance team has been struggling to process invoices. They have been using a spreadsheet to track the invoices but this has become difficult to manage as the number of invoices has increased. They have asked you to build a workflow that will help them process invoices using AI. The workflow should enable them to extract information from invoices and store the information in a Dataverse table. The workflow should also enable them to send an email to the finance team with the extracted information. @@ -211,7 +228,7 @@ To build a workflow that will help the finance team process invoices using the I > **Your homework**: The flow you just built is a good start, now you need to think of how you can build an automation that will enable our finance team to send an email to the supplier to update them with the current status of their invoice. Your hint: the flow must run when the status of the invoice changes. -### Use a Text Generation AI Model in Power Automate +## Use a Text Generation AI Model in Power Automate The Create Text with GPT AI Model in AI Builder enables you to generate text based on a prompt and is powered by the Microsoft Azure OpenAI Service. With this capability, you can incorporate GPT (Generative Pre-Trained Transformer) technology into your apps and flows to build a variety of automations and insightful applications. 
diff --git a/12-designing-ux-for-ai-applications/README.md b/12-designing-ux-for-ai-applications/README.md index 3cbe750dd..8810a8a4e 100644 --- a/12-designing-ux-for-ai-applications/README.md +++ b/12-designing-ux-for-ai-applications/README.md @@ -4,34 +4,50 @@ *(Click the image abvoe to view video of this lesson)* +User experience is a very important aspect of building apps. Users need to be able to use your app in an efficient way to perform tasks. Being efficient is one thing, but you also need to design apps so that they can be used by everyone, to make them *accessible*. This chapter will focus on these areas so that you hopefully end up designing an app that people can and want to use. ### Introduction -User experience is how a user interacts with and uses a specific product or service be it a system, tool, or design. When developing AI applications, developers not only focus on ensuring the user experience is effective but also ethical. In this lesson, we cover how to build Artificial Intelligence (AI) applications that adresses user needs. The lesson will cover the following areas: + +User experience is how a user interacts with and uses a specific product or service be it a system, tool, or design. When developing AI applications, developers not only focus on ensuring the user experience is effective but also ethical. In this lesson, we cover how to build Artificial Intelligence (AI) applications that address user needs. + +The lesson will cover the following areas: * Introduction to User Experience and Undestanding User Needs * Designing AI Applications for Trust and Transparency * Designing AI Applications for Collaboration and Feedback +## Learning goals + +After taking this lesson, you'll be able to: + +- Build apps that are easy to use. +- Design apps that are accessible.
+ ### Prerequisite Take some time and read more about [user experience and design thinking.](https://learn.microsoft.com/en-us/training/modules/ux-design/) ## Introduction to User Experience and Understanding User Needs + In our fictitious education startup, we have two primary users, teachers and students. Each of the two users has unique needs. A user-centered design prioritizes the user ensuring the products are relevant and beneficial for those it is intended for. The application should be **useful, reliable, accessible and pleasant** to provide a good user experience. ### Usability + Being useful means that the application has functionality that matches its intended purpose, such as automating the grading process or generating flash cards for revision. An application that automates the grading process should be able to accurately and efficiently assign scores to students' work based on a predefined criteria. Similarly, an application that generates revision flash cards should be able to create relevant and diverse questions based on its data. ### Reliability + Being reliable means that the application can perform its task consistently and without errors. However, AI just like humans is not perfect and may be prone to errors. The applications may encounter errors or unexpected situations that require human intervention or correction. How do you handle errors? In the last section of this lesson, we will cover how AI systems and applications are designed for collaboration and feedback. ### Accessibility + Being accessible means extending the user experience to users with various abilities, including those with disabilities, ensuring no one is left out. By following accessibility guidelines and principles, AI solutions become more inclusive, usable, and beneficial for all users. ### Pleasant + Being pleasant means that the application is enjoyable to use. 
An appealing user experience can have positive impact on the user encouraging them to return to the application and increasing business revenue. ![image illustrating UX considerations in AI](images/uxinai.png) @@ -39,11 +55,13 @@ Being pleasant means that the application is enjoyable to use. An appealing user Not every challenge can be solved with AI. AI comes in to augment your user experience, be it automating manual tasks, or personalizing user experiences. ## Designing AI Applications for Trust and Transparency + Building trust is crtitical when designing AI applications. Trust ensures a user is confident that the application will get the work done, deliver results consistently and the results are what the user needs. A risk in this area is mistrust and overtrust. Mistrust occures when a user has little or no trust in an AI system, this leads to the user rejecting your application. Overtrust occurs when a user overestimates the capability of an AI system, leading to users trusting the AI system too much. For example, an automated grading system in the case of overtrust might lead the teacher not to proof through some of the papers to ensure the grading system works well. This could result in unfair or inaccurate grades for the students, or missed opportunities for feedback and improvement. - Two ways to ensure trust is put right at the centre of design is explainability and control. +Two ways to ensure trust is put right at the centre of design is explainability and control. ### Explainability + When AI helps inform decisions such as imparting knowledge to the future generations, it is critical for teachers and parents to understand how AI decisions are made. This is explainability - understanding how AI applications make decisions. Designing for explainability includes adding details of examples of what an AI application can do. For example, instead of "Get started with AI teacher", the system can use: "Summarize your notes for easier revision using AI." 
![an app landing page with clear illustration of explainability in AI applications](images/explanability-in-ai.png) @@ -57,6 +75,7 @@ One last key part in explainability is simplification of explanations. Students ![simplified explanations on AI capabilities](images/simplified-explanations.png) ### Control + Generative AI creates a collaboration between AI and the user, where for instance a user can modify prompts for different results. Additionally, once an output is generated, users should be able to modify the results, giving them a sense of control. For example, when using Bing, you can tailor your prompt based on format, tone and length. Additionally, you can add changes to your output and modify the output as shown below: ![](images/bing1.png) @@ -68,6 +87,7 @@ Another feature in Bing that allows a user to have control over the application > When designing AI applications, intentionality is key in ensuring users do not overtrust and set unrealistic expectations of their capabilities. One way to do this is by creating friction between the prompts and the results, reminding the user that this is AI and not a fellow human being. ## Designing AI Applications for Collaboration and Feedback + As mentioned earlier, generative AI creates a collaboration between the user and AI. Most engagements are with a user inputting a prompt and the AI generating an output. What if the output is incorrect? How does the application handle errors if they occur? Does the AI blame the user or take time to explain the error? AI applications should be built to receive and give feedback. This not only helps the AI system improve, but it also builds trust with the users. A feedback loop should be included in the design; an example is a simple thumbs up or down on the output. @@ -80,7 +100,17 @@ System errors are common with applications where the user might need assistance AI applications are not perfect, therefore, they are bound to make mistakes.
When designing your applications, you should ensure you create room for feedback from users and error handling in a way that is simple and easily explainable. -🚀 Challenge: create a user experience of how users would opt-in and opt-out data collection in the AI application. +## Assignment + +Take any app you've built so far and consider implementing the steps below: + +- **Pleasant**. Consider how you can make your app more pleasant. Are you adding explanations everywhere? Are you encouraging the user to explore? How are you wording your error messages? + +- **Usability**. If you are building a web app, make sure your app is navigable by both mouse and keyboard. + +- **Trust and transparency**. Don't trust the AI and its output completely; consider how you would add a human to the process to verify the output. Also consider and implement other ways to achieve trust and transparency. + +## Challenge: create a user experience of how users would opt in to and opt out of data collection in the AI application.