How to Use browser-use to Automate Your Browser with AI Agents

Description

Advances in artificial intelligence (AI) have made browser automation far more capable. With tools like browser-use, you can have AI agents perform tasks such as web scraping, form filling, and data extraction, making your workflow more efficient. In this post, we’ll explore how to use browser-use for AI-driven browser automation.

What is browser-use?

browser-use is a tool that allows AI agents to interact with web browsers, mimicking human-like browsing behavior. It enables automation of repetitive tasks such as:

Navigating websites
Clicking buttons and filling forms
Extracting data from web pages
Managing cookies and authentication

Getting Started with browser-use

Step 1: Prepare the environment

First, we recommend using uv to set up the Python environment:

uv venv --python 3.11

and activate it with:


# For Mac/Linux:
source .venv/bin/activate

# For Windows:
.venv\Scripts\activate
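
To confirm that the activation worked, a quick check from Python can help. The sketch below is not part of browser-use: `in_virtualenv` and `python_is_supported` are hypothetical helper names, and the 3.11 minimum simply mirrors the version passed to uv above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix points at the base Python installation.
    return sys.prefix != sys.base_prefix


def python_is_supported(minimum=(3, 11), version=None) -> bool:
    """Return True when the given (or running) Python version meets the minimum."""
    current = tuple(version) if version is not None else sys.version_info[:2]
    return current >= tuple(minimum)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python 3.11 or newer:", python_is_supported())
```

If the first line prints False, the activation command for your platform probably did not run in the current shell.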

Install the dependencies:

uv pip install browser-use

Then install the browsers that Playwright needs:

playwright install
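
Before moving on, it can be worth checking that the packages are importable from the active environment. This is a minimal sketch, assuming the installs above expose top-level modules named `browser_use` and `playwright`; `missing_modules` is a hypothetical helper and only handles top-level module names.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Modules the two install commands above are expected to provide.
    required = ["browser_use", "playwright"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks the Python packages. It does not verify that the browser binaries downloaded by `playwright install` are present.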

Step 2: Create an agent.py file in the project root


import asyncio
import os

from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import SecretStr

from browser_use import Agent
from browser_use.browser.browser import Browser, BrowserConfig

# Use the locally installed Chrome rather than a fresh browser instance.
browser = Browser(
    config=BrowserConfig(
        # NOTE: close any running Chrome windows first so this can open Chrome in debug mode.
        # This is the macOS path; change it to match where Chrome lives on your system.
        chrome_instance_path='/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
    )
)

# Read the Gemini API key from the GEMINI_API_KEY environment variable
# instead of hardcoding it in the script.
api_key = os.getenv('GEMINI_API_KEY')
if not api_key:
    raise ValueError('GEMINI_API_KEY is not set')
llm = ChatGoogleGenerativeAI(model='gemini-2.0-flash-exp', api_key=SecretStr(api_key))


async def main():
    # The task is a plain-language instruction that the agent carries out in the browser.
    agent = Agent(
        task='open google document and write a blog about latest tech trends',
        llm=llm,
        browser=browser,
    )

    await agent.run()
    await browser.close()

    # Keep the script alive until the user confirms.
    input('Press Enter to close...')


if __name__ == '__main__':
    asyncio.run(main())
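
The script above uses a macOS-specific Chrome path and needs a Gemini API key. The following sketch shows one way to make both a little more portable. It is not part of browser-use: `default_chrome_path` and `require_env` are hypothetical helpers, and the Windows and Linux paths are assumed typical install locations that may differ on your machine.

```python
import os
import sys

# The macOS path comes from the script above. The Windows and Linux
# paths are assumptions about common install locations.
DEFAULT_CHROME_PATHS = {
    "darwin": "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "win32": r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "linux": "/usr/bin/google-chrome",
}


def default_chrome_path(platform=None):
    """Return the assumed Chrome path for a sys.platform value, or None if unknown."""
    platform = sys.platform if platform is None else platform
    for prefix, path in DEFAULT_CHROME_PATHS.items():
        if platform.startswith(prefix):
            return path
    return None


def require_env(name, environ=None):
    """Return the stripped, non-empty value of an environment variable or raise."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running the agent.")
    return value
```

With these helpers in place, agent.py could pass `default_chrome_path()` as `chrome_instance_path` and `SecretStr(require_env('GEMINI_API_KEY'))` as `api_key`, so the key never has to appear in the source file.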