aws-samples/sample-ollama-server

Ollama-Server

Description

Ollama allows users to run open-source large language models (LLMs), offering a streamlined command line experience for interacting with and experimenting with these models. Open WebUI is an extensible, feature-rich, and user-friendly web interface to Ollama. For best performance, a GPU is required.

This repo provides an AWS CloudFormation template to provision NVIDIA GPU EC2 instances with Ollama and Open WebUI, and includes access to Amazon Bedrock foundation models (FMs). The solution can be deployed as a website for LLM interaction through Open WebUI, or as an application development environment with Amazon DCV server.

Ollama with Amazon DCV

Demo

OpenWebUI-Ollama-Bedrock-Demo.mp4

Architecture Diagram

architecture

Overview of Features

Template provides the following features

The EC2 instance can be provisioned in an AWS Region that does not support Bedrock

Notice

Although this repository is released under the MIT-0 license, its CloudFormation template uses third party components which are released under the following respective licenses

Usage of Amazon DCV indicates acceptance of the DCV EULA. By using the template, you accept the license agreements of all software that is installed on the EC2 instance.

Requirements

Deploying using CloudFormation console

Download Ollama-Server.yaml. (Use Ollama-Server-noGPU.yaml if you do not want a GPU EC2 instance. Note that responses from local Ollama models will be slow.)

Log in to the AWS CloudFormation console. Choose Create Stack, Upload a template file, Choose File, select your .YAML file and choose Next. Enter a Stack name and specify parameter values.

CloudFormation Parameters

In most cases, the default values are sufficient. Do verify instance type availability. You will need to specify values for vpcID, subnetID, ec2KeyPair and albSubnets. For security reasons, configure ingressIPv4 and ingressIPv6 to your IP address.

Ollama

EC2 Instance

  • ec2Name: EC2 instance name
  • ec2KeyPair: EC2 key pair name. Create key pair if necessary
  • osVersion: Ubuntu/Ubuntu Pro 24.04/22.04 (x86_64/arm64). Default is Ubuntu 24.04 (x86_64).
  • instanceType: EC2 instance type. Do ensure type matches processor architecture (x86_64 or arm64). Default is g4dn.xlarge
  • ec2TerminationProtection: enable EC2 termination protection to prevent accidental deletion. Default is Yes

EC2 Network

  • vpcID: VPC with internet connectivity. Select default VPC if unsure
  • subnetID: subnet with internet connectivity. Select subnet in default VPC if unsure
  • displayPublicIP: set this to No if your EC2 instance will not receive public IP address. EC2 private IP will be displayed in CloudFormation Outputs section instead. Default is Yes
  • assignStaticIP: associates a static public IPv4 address using Elastic IP address. Default is Yes

EC2 Remote Administration

  • ingressIPv4: allowed IPv4 source prefix to remote administration services, e.g. 1.2.3.4/32. You can get your source IP from https://checkip.amazonaws.com. Default is 0.0.0.0/0.
  • ingressIPv6: allowed IPv6 source prefix to remote administration services. Use ::1/128 to block all incoming IPv6 access. Default is ::/0
  • allowSSHport: allow inbound SSH. Option does not affect EC2 Instance Connect access. Default is Yes
  • installDCV: install graphical desktop environment and Amazon DCV server. Default is No

SSH and DCV inbound access are restricted to ingressIPv4 and ingressIPv6 IP prefixes.
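
As a rough sketch of how a value for ingressIPv4 could be prepared, the helper below turns a bare IPv4 address into a /32 prefix. The function name `to_cidr32` is hypothetical and not part of the template; the lookup URL is the one mentioned above.

```shell
# Turn a bare IPv4 address into a /32 prefix suitable for ingressIPv4.
# to_cidr32 is an illustrative helper, not something the template provides.
to_cidr32() {
  local ip="$1" octet
  # Shape check: four dot-separated groups of 1 to 3 digits.
  if ! printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "not an IPv4 address: $ip" >&2
    return 1
  fi
  # Range check: every octet must be 255 or less.
  for octet in $(printf '%s' "$ip" | tr '.' ' '); do
    if [ "$octet" -gt 255 ]; then
      echo "octet out of range in: $ip" >&2
      return 1
    fi
  done
  printf '%s/32\n' "$ip"
}

# Usage (needs outbound internet access):
#   to_cidr32 "$(curl -s https://checkip.amazonaws.com)"
```

The printed value can be pasted into the ingressIPv4 parameter in place of the 0.0.0.0/0 default.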

EBS volume

Application Load Balancer (ALB)

  • enableALB: deploy Application Load Balancer with EC2 instance as target. Associated charges are listed on Elastic Load Balancing pricing page. Default is No
  • albSubnets#: subnets for ALB. Select at least 2 AZ subnets in EC2 VPC
  • albScheme: either internet-facing or internal. An internet-facing load balancer routes requests from clients to targets over the internet. An internal load balancer routes requests to targets using private IP addresses. Default is internet-facing
  • albIpAddressType: IP address type, either IPv4, IPv4-and-IPv6 or IPv6. Default is IPv4
  • albLogging: enable access logging to S3 bucket. Default is No

Select a subnet even if enableALB is No

ALB HTTPS listener

The above options only apply if enableALB is Yes

Amazon CloudFront

  • enableCloudFront: create an Amazon CloudFront distribution to your EC2 instance or ALB. Associated charges are listed on Amazon CloudFront pricing page. Default is No
  • originType: either Custom Origin or VPC Origin. Most AWS Regions support VPC Origins, which allow CloudFront to deliver content even if your EC2 instance is in a VPC private subnet. Default is Custom Origin
  • cloudFrontLogging: enable CloudFront standard logging to S3 bucket. Default is No

AWS Backup

  • enableBackup: EC2 data protection with AWS Backup. Associated charges are listed on AWS Backup pricing page. Default is Yes
  • scheduleExpression: start time of backup using CRON expression. Default is 1 am
  • scheduleExpressionTimezone: timezone in which the schedule expression is set. Default is Etc/UTC
  • deleteAfterDays: number of days after backup creation that a recovery point is deleted. Default is 35
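
For illustration, AWS schedule expressions use a six-field cron format (minutes, hours, day-of-month, month, day-of-week, year). The sketch below builds a daily expression for a given UTC hour. The helper name `daily_backup_cron` is hypothetical, and whether scheduleExpression expects the surrounding `cron(...)` wrapper or only the inner fields is an assumption to check against the parameter's default value.

```shell
# Build a daily AWS-style cron expression for the given hour (0-23).
# daily_backup_cron is an illustrative helper; the cron(...) wrapper is an
# assumption about what the scheduleExpression parameter accepts.
daily_backup_cron() {
  case "$1" in
    [0-9]|1[0-9]|2[0-3])
      # Fields: minutes hours day-of-month month day-of-week year
      printf 'cron(0 %s * * ? *)\n' "$1"
      ;;
    *)
      echo "hour must be a number from 0 to 23" >&2
      return 1
      ;;
  esac
}

# Usage:
#   daily_backup_cron 1     # a 1 am daily schedule
```

The hour is interpreted in the timezone given by scheduleExpressionTimezone.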

Continue Next with Configure stack options, Review Stack, and click Submit to launch your stack.

It may take more than 20 minutes to provision the EC2 instance. After your stack has been successfully created, its status changes to CREATE_COMPLETE.
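
The console steps above could also be approximated with the AWS CLI. This is a minimal sketch, not the repository's documented method: the function names are hypothetical, the parameter keys are the ones listed in this README, and the capabilities flags are an assumption based on the template creating IAM roles. Any further parameters (for example the ALB subnets) would be passed as extra arguments in the same form.

```shell
# Format one CloudFormation parameter for the --parameters option.
cfn_param() {
  printf 'ParameterKey=%s,ParameterValue=%s' "$1" "$2"
}

# Create the stack and wait for it to finish (hypothetical helper).
# Arguments: stack name, VPC ID, subnet ID, key pair name, ingress IPv4 prefix,
# then any additional pre-formatted parameters.
create_ollama_stack() {
  local stack="$1" vpc="$2" subnet="$3" keypair="$4" cidr="$5"
  shift 5
  # Assumption: the template is small enough to pass inline with
  # --template-body; larger templates must be uploaded to S3 and
  # referenced with --template-url instead.
  aws cloudformation create-stack \
    --stack-name "$stack" \
    --template-body file://Ollama-Server.yaml \
    --capabilities CAPABILITY_IAM CAPABILITY_NAMED_IAM \
    --parameters \
      "$(cfn_param vpcID "$vpc")" \
      "$(cfn_param subnetID "$subnet")" \
      "$(cfn_param ec2KeyPair "$keypair")" \
      "$(cfn_param ingressIPv4 "$cidr")" \
      "$@" || return 1
  # Block until creation succeeds or fails, then print the final status.
  aws cloudformation wait stack-create-complete --stack-name "$stack"
  aws cloudformation describe-stacks --stack-name "$stack" \
    --query 'Stacks[0].StackStatus' --output text
}

# Usage (all IDs below are placeholders):
#   create_ollama_stack ollama vpc-0abc subnet-0abc my-key 1.2.3.4/32
```

On success the last command prints CREATE_COMPLETE, matching the status shown in the console.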

CloudFormation Outputs

The following are available in the Outputs section

If installDCV is Yes

  • DCVwebConsole: DCV web browser client URL. Native DCV clients can be downloaded from https://www.amazondcv.com/. Default password is EC2 instance ID. Use SSM Session Manager or EC2 Instance Connect to set the ubuntu user password, and log in as ubuntu.

Open WebUI

If installWebUI is Yes**

  • WebUrl: Open WebUI URL

If enableALB is Yes

  • AlbConsole: ALB console URL
  • AlbDnsName: ALB domain name. Create a DNS CNAME or Route 53 alias to ALB domain name especially if you are using HTTPS listener

If enableCloudFront is Yes

  • CloudFrontConsole: CloudFront console URL link. Adjustment of your CloudFront distribution settings may be required.
  • CloudFrontURL: CloudFront distribution URL, e.g. https://d111111abcdef8.cloudfront.net

** Go to EC2, ALB, or CloudFront URL and create an administrative account immediately

Troubleshooting

To troubleshoot any installation issue, you can view contents of the following log files (if available)

  • /var/log/cloud-init-output.log
  • /var/log/install-cfn-helper.log
  • /var/log/install-sw.log
  • /var/log/install-dcv.log

Using Ollama and Open WebUI

Managing models

Refer to Starting With Ollama for model management instructions. Ollama site provides a listing of available language models and their size (e.g. DeepSeek). For best performance, ensure that model size is less than GPU memory size. You can refer to EC2 Accelerated Computing page for GPU memory size specifications.
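
The size guideline above can be checked roughly from the instance itself. In this sketch, `gpu_memory_mib` and `model_fits` are hypothetical helper names; `nvidia-smi` reports total GPU memory in MiB, and the model size is whatever the Ollama model listing shows in GB. The comparison treats GB and GiB as equal and ignores the extra memory a model needs at run time for its context, so a passing result is only a first check.

```shell
# Total memory of the first GPU in MiB, as reported by nvidia-smi.
gpu_memory_mib() {
  nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1
}

# Succeed if a model of the given size (GB, may be fractional) is smaller
# than the given GPU memory (MiB). Rough check only: run-time overhead such
# as context memory is not accounted for.
model_fits() {
  awk -v size_gb="$1" -v vram_mib="$2" \
    'BEGIN { exit !(size_gb * 1024 < vram_mib) }'
}

# Usage (on the GPU instance):
#   ollama list                                  # installed models and sizes
#   if model_fits 4.7 "$(gpu_memory_mib)"; then echo "should fit"; fi
```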

Change EC2 instance type

If you need a more powerful instance, you can change the instance type.

Disk space considerations

If you are running out of disk space to download models, increase the EBS volume size and extend the file system.
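
A minimal sketch of the file system step follows, to be run after the EBS volume itself has been enlarged (in the console, or with `aws ec2 modify-volume`). The helper names are hypothetical, and the sketch assumes the root file system is ext4 on a partitioned disk; device names differ between instance types, so the root device is looked up rather than hard-coded.

```shell
# Split a partition device into "disk partition-number", e.g.
# /dev/nvme0n1p1 -> "/dev/nvme0n1 1" and /dev/xvda1 -> "/dev/xvda 1".
split_partition() {
  case "$1" in
    /dev/nvme*|/dev/mmcblk*)
      printf '%s\n' "$1" | sed -E 's/^(.*[0-9])p([0-9]+)$/\1 \2/' ;;
    *)
      printf '%s\n' "$1" | sed -E 's/^(.*[^0-9])([0-9]+)$/\1 \2/' ;;
  esac
}

# Grow the root partition and then the file system on it.
# Assumption: ext4 root file system (XFS would need xfs_growfs instead).
grow_root_fs() {
  local part pair disk num
  part="$(findmnt -n -o SOURCE /)"
  pair="$(split_partition "$part")"
  disk="${pair% *}"
  num="${pair##* }"
  sudo growpart "$disk" "$num"
  sudo resize2fs "$part"
  df -h /
}

# Usage (on the EC2 instance, after resizing the volume):
#   grow_root_fs
```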

Configuration

Docker compose is used to run Open WebUI and LiteLLM. You can customise Open WebUI and LiteLLM Proxy Server configuration by modifying /opt/docker/compose.yaml.

Amazon Bedrock models

To add or remove Amazon Bedrock or Amazon SageMaker text or image models, modify /opt/docker/bedrock-models.yaml and /opt/docker/bedrock-image-models.yaml respectively.
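
Before editing these files, it may be worth keeping a copy, and changes usually need a container restart to take effect. The sketch below is an assumption-laden example: `backup_config` and `restart_services` are hypothetical helpers, and it assumes the Docker Compose v2 plugin (`docker compose`) and that the services read the edited files when they start.

```shell
# Copy a configuration file to a timestamped .bak file next to it and
# print the name of the copy.
backup_config() {
  local src="$1" dest
  dest="$src.$(date +%Y%m%d-%H%M%S).bak"
  sudo cp -p "$src" "$dest" && printf '%s\n' "$dest"
}

# Restart all services defined in the compose file so that edited
# configuration is read again (assumption: config is loaded at start-up).
restart_services() {
  sudo docker compose -f /opt/docker/compose.yaml restart
}

# Usage:
#   backup_config /opt/docker/bedrock-models.yaml
#   sudo nano /opt/docker/bedrock-models.yaml
#   restart_services
```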

Image generation model selection

To change default image generation model

  • In Open WebUI, navigate to Settings > Admin Settings > Images menu
  • In Set Default Model text box, enter one of the following
    • Nova Canvas
    • Stable Diffusion 3.5 Large
    • Stable Image Ultra 1.0
  • Click Save

Remote connectivity to underlying services

Nginx (/etc/nginx/sites-available/reverse-proxy) is used to provide HTTP and HTTPS access to Open WebUI which listens on TCP port 8080.

Ollama, LiteLLM (text) and LiteLLM (image) are configured to listen on the EC2 instance's network interface on TCP ports 11434, 4000 and 4100 respectively. To allow remote connections, modify the EC2 instance security group inbound rules to allow access from your IP address. You can use Nginx to provide HTTPS encryption. If ALB is provisioned (enableALB), you can create an HTTP or (preferably) HTTPS ALB listener to your EC2 instance.
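
Once the security group allows it, reachability of each service can be probed from a remote machine. This is a sketch under assumptions: `service_port` and `check_service` are hypothetical helpers, the short service names are made up for this example, and the probe assumes the services answer plain HTTP on the ports listed above.

```shell
# Map a short service name (names chosen for this example) to its TCP port.
service_port() {
  case "$1" in
    ollama)        echo 11434 ;;
    litellm-text)  echo 4000 ;;
    litellm-image) echo 4100 ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# Succeed if an HTTP connection to the service can be established.
# Any HTTP response counts, including error status codes.
check_service() {
  local host="$1" port
  port="$(service_port "$2")" || return 1
  curl -s -o /dev/null --connect-timeout 5 "http://$host:$port/"
}

# Usage (replace the address with your instance's IP or DNS name):
#   check_service 203.0.113.10 ollama && echo "Ollama reachable"
```

A failed check most often means the security group inbound rule for that port is missing or does not cover your source IP.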

Obtaining certificate for HTTPS

Amazon CloudFront (enableCloudFront) supports HTTPS. You can use AWS Certificate Manager to request a public certificate for your own domain and associate it with your CloudFront distribution.

The EC2 instance uses a self-signed certificate for HTTPS. You can use Certbot to obtain and install Let's Encrypt certificate on your web server.

Using Certbot

Ensure you have a domain name whose DNS entry resolves to your EC2 instance IP address. If you do not have a domain, you can register a new domain using Amazon Route 53 and create a DNS A and/or AAAA record.

Option 1: Nginx plugin

  • From terminal, run the below command and follow instructions.

    sudo certbot --nginx
    

    Nginx plugin uses HTTP-01 challenge, and requires HTTP port 80 to be accessible from public internet

Option 2: Route 53 plugin

  • The certbot-dns-route53 option requires your DNS to be hosted by Route 53. It supports wildcard certificates and domain names that resolve to private IP addresses. Ensure that Route 53 zone access is granted by specifying r53ZoneID value. From terminal, run the below command and follow instructions.

    sudo certbot --dns-route53 --installer nginx
    

Refer to Certbot site for help with the tool.

About EC2 instance

Updating software

Ubuntu unattended upgrades is enabled. To update Ollama, run /home/ubuntu/update-ollama script.

Open WebUI and LiteLLM are automatically updated by Watchtower, while a cron job runs docker image prune daily to remove unused images.

Restoring from backup

If you enable AWS Backup, you can restore your EC2 instance from recovery points (backups) in your backup vault. The CloudFormation template creates an IAM role that grants AWS Backup permission to restore your backups. The role name can be found in your CloudFormation stack Resources section as the Physical ID value whose Logical ID value is backupRestoreRole.

Monitoring

Amazon CloudWatch agent is installed on EC2 instance, and is configured to send disk, memory and GPU utilization metrics.

Securing

To further secure your EC2 instance, you may want to consider the following

Clean Up

To remove created resources, you will need to

  • Delete any recovery points in created backup vault
  • Disable EC2 instance termination protection (if enabled)
  • Delete CloudFormation stack
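
The clean-up steps above could be scripted with the AWS CLI along these lines. The function names are hypothetical, and the vault name, instance ID and stack name must be supplied by you (they can be found in the stack's Resources section). Deleting recovery points is irreversible, so treat this as a sketch to adapt rather than a script to run as-is.

```shell
# Delete every recovery point in the given backup vault.
delete_recovery_points() {
  local vault="$1" arn
  aws backup list-recovery-points-by-backup-vault \
      --backup-vault-name "$vault" \
      --query 'RecoveryPoints[].RecoveryPointArn' --output text |
  tr '\t' '\n' |
  while read -r arn; do
    if [ -n "$arn" ]; then
      aws backup delete-recovery-point \
        --backup-vault-name "$vault" --recovery-point-arn "$arn"
    fi
  done
}

# Turn off EC2 termination protection for the given instance.
disable_termination_protection() {
  aws ec2 modify-instance-attribute \
    --instance-id "$1" --no-disable-api-termination
}

# Delete the stack and wait until deletion has finished.
delete_stack() {
  aws cloudformation delete-stack --stack-name "$1"
  aws cloudformation wait stack-delete-complete --stack-name "$1"
}

# Usage (placeholders for your own values):
#   delete_recovery_points my-backup-vault
#   disable_termination_protection i-0123456789abcdef0
#   delete_stack ollama
```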

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
