Experience Google's latest open-weight model on your laptop with Ollama, Docker, Open WebUI, and GPU acceleration for optimal performance.

In this tutorial, we will learn to run Gemma 3 locally using Open WebUI, a user-friendly interface that simplifies deploying large language models on personal hardware. Open WebUI, alongside tools like Ollama, makes it possible to harness the power of Gemma 3 without requiring extensive technical expertise.
Gemma 3, Google’s latest open-weight large language model (LLM), represents a significant leap forward in AI capabilities. Designed for versatility and efficiency, Gemma 3 supports over 140 languages, processes both text and images, and excels in tasks like mathematical reasoning, coding, and instruction following. It was trained on trillions of tokens using Google TPUs and the JAX framework, ensuring state-of-the-art performance across its 1B, 4B, 12B, and 27B parameter variants.
1. Why Open WebUI and Ollama?
Open WebUI is a powerful and user-friendly web application that provides an intuitive interface for interacting with large language models (LLMs) like Gemma 3. Similar to ChatGPT, it allows users to harness advanced AI capabilities while running models locally. Open WebUI supports multimodal capabilities, extended context lengths, a code interpreter, and seamless integration with local hardware, making it a versatile tool for a wide range of applications.
Ollama complements Open WebUI by offering an efficient environment for running LLMs on your local machine. Even on older laptops, it delivers a smooth and reliable experience, making it an excellent choice for users who want to explore AI without relying on cloud-based solutions.
2. Installing Ollama and Docker
We will start by installing Ollama and Docker locally using a simple setup.
Install Ollama
Go to the Ollama official website to download and install the Ollama package with the default settings. If you have an NVIDIA GPU with CUDA drivers installed, Ollama will automatically use the GPU as the default accelerator instead of the CPU.
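Once the installation finishes, you can quickly confirm that Ollama is working by checking its version from a terminal:
ollama --version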
Install Docker
Next, visit the Docker official website to download Docker Desktop and install it using the default options. On Windows, the installer will prompt you to enable WSL (Windows Subsystem for Linux), which Docker uses to run Linux containers.
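To verify that Docker is working, check the version and run the standard hello-world test container from a terminal:
docker --version
docker run hello-world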
3. Downloading the Gemma 3 model
To download the Gemma 3 model, open a terminal (or PowerShell on Windows) and type the following command:
ollama pull gemma3
It will take a few minutes to download all the files, depending on your internet speed.
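The plain gemma3 tag pulls the default variant of the model; at the time of writing, Ollama also provides size-specific tags such as gemma3:1b, gemma3:12b, and gemma3:27b if you want a smaller or larger version. Once the download completes, you can already chat with the model directly from the terminal, before setting up Open WebUI:
ollama run gemma3
Type a prompt to start chatting, and /bye to exit the session.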
4. Setting up and running Open WebUI
Now we will download the Open WebUI Docker image and run it locally in a container using the following command:
docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda
Explanation of the command:
- -d: Runs the container in detached mode.
- -p 3000:8080: Maps port 8080 inside the container to port 3000 on your local machine.
- --gpus all: Enables GPU acceleration (if available).
- --add-host=host.docker.internal:host-gateway: Adds a host entry so the container can reach services running on your host machine, such as Ollama.
- -v open-webui:/app/backend/data: Mounts a named volume for persistent data storage.
- --name open-webui: Names the container “open-webui”.
- --restart always: Ensures the container restarts automatically if it stops.
- ghcr.io/open-webui/open-webui:cuda: Specifies the CUDA-enabled Docker image to use.
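If your machine does not have an NVIDIA GPU, a CPU-only setup works as well: drop the --gpus all flag and use the :main image tag instead of :cuda:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main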
Again, it might take about 10 minutes to set up everything, depending on your internet speed, as Docker downloads the image and all the necessary files.
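You can check the progress and confirm the container is up with:
docker ps --filter name=open-webui
docker logs -f open-webui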
Once everything is completed, open http://localhost:3000/ in your browser. You will land on the Open WebUI welcome page, which will prompt you to create an admin account.
Once completed, you will be taken to the main chat interface where Gemma 3 is already selected as the default model. You can now enter prompts and start using it like ChatGPT.
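Open WebUI finds the local Ollama server automatically through the host.docker.internal entry we added earlier. If the model dropdown ever shows up empty, you can confirm that Ollama is reachable and that Gemma 3 was downloaded by querying its REST API on the default port 11434:
curl http://localhost:11434/api/tags
curl http://localhost:11434/api/generate -d '{"model": "gemma3", "prompt": "Say hello in one sentence.", "stream": false}'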
5. Testing the Gemma 3 model
Now, we will test the various features of Open WebUI and Gemma 3. As you can see, we can use the speech-to-text option to dictate prompts, generate and run code using the code interpreter, and upload files and images.
We asked it to create a web application for calculating US income tax, and it generated the code while also displaying the web application within Open WebUI.
You can even run the code within the app to test it.
Gemma 3 is multimodal, meaning you can also upload images and ask questions about them.
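The same capability is available through the Ollama API, so you can script image questions as well. Here is a minimal sketch, assuming a Linux or macOS shell and an image saved as photo.jpg in the current directory (photo.jpg is just a placeholder name); Ollama's /api/generate endpoint accepts base64-encoded images in an images array:
# Base64-encode the image (use `base64 -i photo.jpg` on macOS)
IMG=$(base64 -w 0 photo.jpg)
# Ask Gemma 3 to describe the image
curl http://localhost:11434/api/generate -d "{
  \"model\": \"gemma3\",
  \"prompt\": \"What is in this image?\",
  \"images\": [\"$IMG\"],
  \"stream\": false
}"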
6. Conclusion
In this tutorial, we have learned a simple and intuitive way to run Gemma 3 locally. We installed Ollama and Docker, downloaded the model and the Open WebUI application, and started using it much like ChatGPT, with fast responses and GPU acceleration out of the box.
If you like my content, please let me know what topics you would like me to cover next in the comments. Thank you!