This solution deploys a web-based chat application with an AI agent running in Azure Container Apps.
The agent leverages the Azure AI Agent service and utilizes Azure AI Search for knowledge retrieval from uploaded files, enabling it to generate responses with citations. The solution also includes built-in monitoring capabilities with tracing to ensure easier troubleshooting and optimized performance.
This solution creates an Azure AI Foundry project and Azure AI services. More details about the resources can be found in the resources documentation. There are options to enable logging, tracing, and monitoring.
Instructions are provided for deployment through GitHub Codespaces, VS Code Dev Containers, and your local development environment.
The app code runs in Azure Container Apps, where it processes user input and generates a response. It leverages Azure AI projects and Azure AI services, including the model and agent.
- Knowledge Retrieval: The AI agent uses file search to retrieve knowledge from uploaded files.
- Customizable AI Model Deployment: The solution allows users to configure and deploy AI models, such as gpt-4o-mini, with options to adjust model capacity and knowledge retrieval methods.
- Built-in Monitoring and Tracing: Integrated monitoring capabilities, including Azure Monitor and Application Insights, enable tracing and logging for easier troubleshooting and performance optimization.
- Flexible Deployment Options: The solution supports deployment through GitHub Codespaces, VS Code Dev Containers, or local environments, providing flexibility for different development workflows.
- Agent Evaluation: This solution demonstrates how you can evaluate your agent's performance and quality during local development and incorporate evaluation into monitoring and CI/CD workflows.
- AI Red Teaming Agent: Facilitates the creation of an AI Red Teaming Agent that can run automated batch scans for safety and security on your agent solution to check your risk posture before deploying it into production.
Here is a screenshot showing the chatting web application with requests and responses between the system and the user:
GitHub Codespaces and Dev Containers both allow you to download and deploy the code for development. You can also continue with local development. Once you have selected your environment, follow the development and deployment guide.
After deployment, try these sample questions to test your agent.
Once you have the agents and the web app working, you are encouraged to try one of the following:
- Tracing and Monitoring: View console logs in the Azure portal and App Insights tracing in Azure AI Foundry for debugging and performance monitoring.
- Agent Evaluation: Evaluate your agent's performance and quality using built-in evaluators for local development, continuous monitoring, and CI/CD integration.
- AI Red Teaming Agent: Run automated security and safety scans on your agent solution to check your risk posture before production deployment.
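For the console logs mentioned under Tracing and Monitoring, the troubleshooting notes in this README suggest using Python's logging at INFO level and setting Azure HTTP logging to WARNING. The following is a minimal sketch of that setup, not the template's actual code. It assumes the app uses the standard `logging` module; the function name `configure_logging` and the logger name `app` are hypothetical, and the Azure logger name is the one commonly used by the Azure SDK for Python for HTTP request logging, so verify it against your SDK version.

```python
import logging


def configure_logging() -> logging.Logger:
    """Log application messages at INFO and quieten Azure HTTP request logs."""
    # force=True replaces any handlers already attached to the root logger.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,
    )
    # Assumed logger name for the Azure SDK's per-request HTTP logging.
    logging.getLogger(
        "azure.core.pipeline.policies.http_logging_policy"
    ).setLevel(logging.WARNING)
    return logging.getLogger("app")  # hypothetical application logger name


logger = configure_logging()
logger.info("Web app started")
```

With this configuration, application messages at INFO and above reach the console, while the SDK's HTTP logger only emits warnings and errors.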
To prevent incurring unnecessary charges, it's important to clean up your Azure resources after completing your work with the application.
- When to Clean Up:
  - After you have finished testing or demonstrating the application.
  - If the application is no longer needed or you have transitioned to a different project or environment.
  - When you have completed development and are ready to decommission the application.
- Deleting Resources: To delete all associated resources and shut down the application, run the following command:

  azd down

  Please note that this process may take up to 20 minutes to complete.
Pricing varies per region and usage, so it isn't possible to predict exact costs for your usage. The majority of the Azure resources used in this infrastructure are on usage-based pricing tiers.
You can try the Azure pricing calculator for the resources:
- Azure AI Foundry: Free tier.
- Azure Storage Account: Standard tier, LRS. Pricing is based on storage and operations.
- Azure AI Services: S0 tier, defaults to gpt-4o-mini. Pricing is based on token count.
- Azure Container Apps: Consumption tier with 0.5 CPU, 1 GiB memory/storage. Pricing is based on resource allocation, and each month allows for a certain amount of free usage.
- Log Analytics: Pay-as-you-go tier. Costs based on data ingested.
- Agent Evaluations: Incurs the cost of your provided model deployment used for local evaluations.
- AI Red Teaming Agent: Leverages Azure AI Risk and Safety Evaluations to assess attack success from the automated AI red teaming scan. Users are billed based on the consumption of Risk and Safety Evaluations as listed in our Azure pricing page. Click on the tab labeled “Complete AI Toolchain” to view the pricing details.
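Because the model cost is based on token count, a rough estimate is simple arithmetic once you know your rates. The sketch below is illustrative only: the function name `estimate_token_cost` is hypothetical, it assumes input and output tokens are priced separately per one million tokens, and the prices in the usage lines are placeholders rather than real Azure rates. Take the actual figures from the Azure pricing page for your region and model.

```python
def estimate_token_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate model cost from token counts and per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost


# Placeholder prices for illustration only, NOT real Azure rates.
cost = estimate_token_cost(
    input_tokens=200_000,
    output_tokens=50_000,
    input_price_per_million=1.0,
    output_price_per_million=2.0,
)
print(f"Estimated model cost: {cost:.2f}")
```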
To avoid unnecessary costs, take down your app when it is no longer in use by running azd down.
This template also uses Managed Identity for local development and deployment.
To ensure continued best practices in your own repository, we recommend that anyone creating solutions based on our templates ensure that the GitHub secret scanning setting is enabled.
You may want to consider additional security measures, such as:
- Enabling Microsoft Defender for Cloud to secure your Azure resources.
- Protecting the Azure Container Apps instance with a firewall and/or Virtual Network.
Important Security Notice
This template, the application code and configuration it contains, has been built to showcase Microsoft Azure specific services and tools. We strongly advise our customers not to make this code part of their production environments without implementing or enabling additional security features.
For a more comprehensive list of best practices and security recommendations for Intelligent Applications, visit our official documentation.
This template creates everything you need to get started with Azure AI Foundry:
Resource | Description |
---|---|
Azure AI Project | Provides a collaborative workspace for AI development with access to models, data, and compute resources |
Azure OpenAI Service | Powers the AI agents for conversational AI and intelligent search capabilities. The default model deployed is gpt-4o-mini, but any Azure AI model can be specified per the documentation |
Azure Container Apps | Hosts and scales the web application with serverless containers |
Azure Container Registry | Stores and manages container images for secure deployment |
Storage Account | Provides blob storage for application data and file uploads |
AI Search Service | Optional - Enables hybrid search capabilities combining semantic and vector search |
Application Insights | Optional - Provides application performance monitoring, logging, and telemetry for debugging and optimization |
Log Analytics Workspace | Optional - Collects and analyzes telemetry data for monitoring and troubleshooting |
- If you have an issue with timeouts or provisioning resources, changing the location of your resource group can help, as there may be availability constraints for resources. Call azd down to remove your current resources, and delete the .azure folder from your workspace. Then call azd up again and select a different region.
- You may debug further using azd commands. azd show displays information about your app and resources, and azd deploy --debug enables debugging and logging while deploying the application's code to Azure.
- Ensure that your az and azd tools are up to date.
- After fully deploying with azd, additional errors in the Azure Portal may indicate that your latest code has not been successfully deployed.
- If your ACA does not boot up, it is possible that your deployment has failed. This could be due to quota constraints, permission issues, or resource availability. Check failures in the deployment and container app logs in the Azure Portal.
- Console traces in ACA can be found in the Azure Portal, but they may be unreliable. Use Python’s logging with INFO level, and adjust Azure HTTP logging to WARNING.
- Once your ACA is deployed, use the browser debugger (F12) and clear the cache (CTRL+SHIFT+R). This can help debug the frontend for better traceability.
- If your agent is occasionally unresponsive, your model may have reached its rate limit. You can increase its quota by adjusting the bicep configuration or by editing the model in the Azure AI Foundry page for your project's model deployments.
- If your agent is crashing, confirm that you are using a model that you have deployed to your project.
- This application is designed to serve multiple users on multiple browsers. It uses cookies to ensure that the same thread is reused for conversations across multiple tabs in the same browser. If the browser is restarted, the old thread will continue to serve the user. However, if the application has a new agent after a server restart, or a thread is deleted, a new thread will be created without requiring a browser refresh or signaling to the users. When a user submits a message, the web server creates an agent and a thread, and streams back a reply. The response contains agent_id and thread_id in cookies, so each subsequent message sent to the web server also contains these IDs. As long as the same agent is being used in the system and the thread named in the cookie can be retrieved, the same thread will be used to serve the user.
- For document handling, use filename-based downloads to avoid storing files in dictionaries.
- Intermittent errors may arise when retrieving filenames for file IDs, which may be mitigated by using a single worker and fresh threads for each new agent.
- File citation can be enhanced by automatically including filenames to reduce manual steps.
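The cookie-based thread reuse described in the list above can be sketched as follows. This is a simplified illustration, not the application's actual code: `ThreadStore` is a hypothetical in-memory stand-in for the agent service, and `resolve_thread` is a hypothetical helper name. Only the `agent_id` and `thread_id` cookie names come from this README.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ThreadStore:
    """Hypothetical in-memory stand-in for the agent service."""
    agent_id: str
    threads: Set[str] = field(default_factory=set)

    def create_thread(self) -> str:
        thread_id = f"thread_{uuid.uuid4().hex[:8]}"
        self.threads.add(thread_id)
        return thread_id


def resolve_thread(cookies: Dict[str, str], store: ThreadStore) -> Dict[str, str]:
    """Return the agent_id/thread_id cookies to send back with the reply."""
    same_agent = cookies.get("agent_id") == store.agent_id
    thread_id: Optional[str] = cookies.get("thread_id")
    if same_agent and thread_id in store.threads:
        # Same agent and the thread still exists: continue the conversation.
        return {"agent_id": store.agent_id, "thread_id": thread_id}
    # First visit, new agent after a restart, or deleted thread: start fresh.
    return {"agent_id": store.agent_id, "thread_id": store.create_thread()}


store = ThreadStore(agent_id="agent_1")
first_reply = resolve_thread({}, store)            # first message, no cookies yet
second_reply = resolve_thread(first_reply, store)  # cookies sent back by the browser
print(first_reply == second_reply)                 # the same thread is reused
```

Checking both IDs on every request is what lets the server create a new thread silently when the agent has changed or the thread has been deleted, without the browser having to refresh.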
To the extent that the Software includes components or code used in or derived from Microsoft products or services, including without limitation Microsoft Azure Services (collectively, “Microsoft Products and Services”), you must also comply with the Product Terms applicable to such Microsoft Products and Services. You acknowledge and agree that the license governing the Software does not grant you a license or other right to use Microsoft Products and Services. Nothing in the license or this ReadMe file will serve to supersede, amend, terminate or modify any terms in the Product Terms for any Microsoft Products and Services.
You must also comply with all domestic and international export laws and regulations that apply to the Software, which include restrictions on destinations, end users, and end use. For further information on export restrictions, visit https://aka.ms/exporting.
You acknowledge that the Software and Microsoft Products and Services (1) are not designed, intended or made available as a medical device(s), and (2) are not designed or intended to be a substitute for professional medical advice, diagnosis, treatment, or judgment and should not be used to replace or as a substitute for professional medical advice, diagnosis, treatment, or judgment. Customer is solely responsible for displaying and/or obtaining appropriate consents, warnings, disclaimers, and acknowledgements to end users of Customer’s implementation of the Online Services.
You acknowledge the Software is not subject to SOC 1 and SOC 2 compliance audits. No Microsoft technology, nor any of its component technologies, including the Software, is intended or made available as a substitute for the professional advice, opinion, or judgement of a certified financial services professional. Do not use the Software to replace, substitute, or provide professional financial advice or judgment.
BY ACCESSING OR USING THE SOFTWARE, YOU ACKNOWLEDGE THAT THE SOFTWARE IS NOT DESIGNED OR INTENDED TO SUPPORT ANY USE IN WHICH A SERVICE INTERRUPTION, DEFECT, ERROR, OR OTHER FAILURE OF THE SOFTWARE COULD RESULT IN THE DEATH OR SERIOUS BODILY INJURY OF ANY PERSON OR IN PHYSICAL OR ENVIRONMENTAL DAMAGE (COLLECTIVELY, “HIGH-RISK USE”), AND THAT YOU WILL ENSURE THAT, IN THE EVENT OF ANY INTERRUPTION, DEFECT, ERROR, OR OTHER FAILURE OF THE SOFTWARE, THE SAFETY OF PEOPLE, PROPERTY, AND THE ENVIRONMENT ARE NOT REDUCED BELOW A LEVEL THAT IS REASONABLY, APPROPRIATE, AND LEGAL, WHETHER IN GENERAL OR IN A SPECIFIC INDUSTRY. BY ACCESSING THE SOFTWARE, YOU FURTHER ACKNOWLEDGE THAT YOUR HIGH-RISK USE OF THE SOFTWARE IS AT YOUR OWN RISK.