Azure-Samples/azure-ai-search-with-content-understanding-python

Azure AI Search with Content Understanding Samples (Python)

Welcome! Content Understanding analyzes and interprets diverse media types, including documents, images, audio, and video, and transforms that content into structured, organized, and searchable data. The samples in this repository show how to extract rich insights from your files with the Content Understanding API and then index those files with Azure Search to make them easier to search.

  • The samples in this repository default to the latest preview API version, 2024-12-01-preview.

Samples

File                               Description
search_with_document_layout.ipynb  Use Content Understanding layout analysis to extract content from documents, then index the file in Azure Search.
search_with_visual_document.ipynb  Extract custom fields with the Content Understanding API and use them to index the file in Azure Search.
search_with_video.ipynb            Extract custom fields with the Content Understanding API and use them to index the file in Azure Search.
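
The extract-then-index flow these notebooks follow can be pictured with a small sketch. It is illustrative only and is not code from the notebooks: the input shape (a "contents" list whose items carry "markdown" and "fields") and the output document layout (id, source, content, fields) are assumptions, and to_search_documents is a hypothetical helper name. Check the notebooks for the actual response format and index schema.

```python
import re


def to_search_documents(analysis, source_name):
    """Flatten an assumed analysis result into one search document per content item.

    `analysis` is assumed to look like {"contents": [{"markdown": ..., "fields": {...}}]}.
    """
    # Keep the id to letters, digits, dashes, and underscores so it is safe as a key.
    safe_name = re.sub(r"[^A-Za-z0-9_-]", "-", source_name)
    documents = []
    for index, content in enumerate(analysis.get("contents", [])):
        documents.append(
            {
                "id": f"{safe_name}-{index}",
                "source": source_name,
                "content": content.get("markdown", ""),
                "fields": dict(content.get("fields", {})),
            }
        )
    return documents


if __name__ == "__main__":
    # Made-up input, only to show the shape of the output.
    example = {"contents": [{"markdown": "# Invoice", "fields": {"total": "42"}}]}
    for document in to_search_documents(example, "invoice.pdf"):
        print(document)
```

Documents in this form would then be uploaded to a search index; the upload step is left out here because it depends on the index schema defined in each notebook.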

Getting started

GitHub Codespaces

You can run this repo virtually by using GitHub Codespaces, which will open a web-based VS Code in your browser.

Open in GitHub Codespaces

Local environment

  1. Make sure the following tools are installed:

  2. Make a new directory called azure-ai-search-with-content-understanding-python and clone this template into it using the azd CLI:

    azd init -t azure-ai-search-with-content-understanding-python

    You can also use git to clone the repository if you prefer.
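
Before running the commands in this README, it can help to confirm that the command-line tools they rely on are reachable. The sketch below checks only the three commands this README invokes (azd, git, az); it is not the full prerequisite list, and find_tools is a hypothetical helper name.

```python
import shutil

# Commands invoked elsewhere in this README; not a complete prerequisite list.
TOOLS = ["azd", "git", "az"]


def find_tools(names=TOOLS):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

A command reported as NOT FOUND needs to be installed, or added to PATH, before the matching step will work.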

Configure Azure AI service resource

(Option 1) Use azd commands to automatically create temporary resources for running the samples

  1. Make sure you have permission to grant roles in the subscription.
  2. Log in to Azure:
    azd auth login
  3. Set up the environment, following the prompts to choose a location:
    azd up

(Option 2) Manually create resources and set environment variables

  1. Create an Azure AI Services resource.
  2. In the resource's Access Control (IAM), grant yourself the Cognitive Services User role.
  3. Create an Azure OpenAI resource.
  4. Deploy a GPT model.
  5. Deploy an embedding model.
  6. In the resource's Access Control (IAM), grant yourself the Cognitive Services OpenAI User role.
  7. Create an Azure Search resource.
  8. In the resource's Access Control (IAM), grant yourself the Search Index Data Contributor role.
  9. In the resource's Access Control (IAM), grant yourself the Search Service Contributor role.
  10. Copy notebooks/.env.sample to notebooks/.env.
  11. Fill in the required values in .env from the resources you created.
  12. Log in to Azure:
    az login
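
A quick way to catch an incomplete .env before opening a notebook is to check it for empty values. The following is a minimal sketch using only the standard library. The key names in REQUIRED_KEYS are hypothetical placeholders, so replace them with the keys listed in notebooks/.env.sample; parse_env_text and missing_keys are likewise illustrative helper names.

```python
from pathlib import Path

# Hypothetical key names; use the keys listed in notebooks/.env.sample instead.
REQUIRED_KEYS = [
    "AZURE_AI_SERVICE_ENDPOINT",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_SEARCH_ENDPOINT",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path("notebooks/.env")
    if not env_path.exists():
        print(f"{env_path} not found; copy notebooks/.env.sample first")
    else:
        missing = missing_keys(parse_env_text(env_path.read_text(encoding="utf-8")))
        print("Missing values:", ", ".join(missing) if missing else "none")
```

Run it from the repository root so that the relative path notebooks/.env resolves.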

Open a Jupyter notebook and follow the step-by-step guidance

Navigate to the notebooks directory and select the sample notebook you are interested in. Since Codespaces is pre-configured with the necessary environment, you can directly execute each step in the notebook.

More Samples using Azure Content Understanding

Azure Content Understanding General Samples

Azure Content Understanding with OpenAI

Notes

  • Trademarks - This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third-party’s policies.

  • Data Collection - The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft’s privacy statement. Our privacy statement is located at https://go.microsoft.com/fwlink/?LinkID=824704. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices.
