Advances in artificial intelligence (AI) have made browser automation far more capable. With tools like browser-use, you can have AI agents carry out tasks such as web scraping, form filling, and data extraction, which makes your workflow more efficient. In this post, we’ll explore how to use browser-use for AI-driven browser automation.
browser-use is a tool that allows AI agents to interact with web browsers, mimicking human-like browsing behavior. It is aimed at automating repetitive tasks such as web scraping, form filling, and data extraction.
First, we recommend using uv to set up the Python environment:
```bash
uv venv --python 3.11
```
and activate it with:
```bash
# For Mac/Linux:
source .venv/bin/activate

# For Windows:
.venv\Scripts\activate
```
Install the dependencies:
```bash
uv pip install browser-use
```
Then install the browsers that Playwright needs:
```bash
playwright install
```
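Before running an agent, it can help to confirm that the setup steps above took effect. The following is a minimal sketch that uses only the standard library. The helper name `check_environment` is hypothetical, and the import names `browser_use` and `playwright` are assumed to match the packages installed above.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 11), modules=("browser_use", "playwright")):
    """Return a dict describing whether the setup looks complete."""
    report = {
        # The environment above was created with `uv venv --python 3.11`
        "python_ok": sys.version_info[:2] >= min_version,
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be found
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing'}")
```

If any entry prints `missing`, re-run the corresponding step above with the virtual environment activated.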
With the environment ready, the following script opens your local Chrome installation and hands a task to a Gemini-backed agent:

```python
import asyncio
import os

from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import SecretStr

from browser_use import Agent
from browser_use.browser.browser import Browser, BrowserConfig

browser = Browser(
    config=BrowserConfig(
        # NOTE: close your Chrome browser first so that it can be reopened in debug mode
        chrome_instance_path='/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
    )
)

# Read the Gemini API key from the environment instead of hard-coding it
api_key = os.environ['GEMINI_API_KEY']
llm = ChatGoogleGenerativeAI(model='gemini-2.0-flash-exp', api_key=SecretStr(api_key))


async def main():
    agent = Agent(
        task='open google document and write a blog about latest tech trends',
        llm=llm,
        browser=browser,
    )

    # Let the agent work through the task, then shut the browser down
    await agent.run()
    await browser.close()

    input('Press Enter to close...')


if __name__ == '__main__':
    asyncio.run(main())
```
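The script above depends on two machine-specific values: the Gemini API key and the path to the Chrome executable. As a rough sketch, both can be resolved with small helpers. The names `require_env` and `default_chrome_path` are hypothetical, and the Windows and Linux paths are assumed common install locations rather than values from this post.

```python
import os
import platform

# The macOS path comes from the script above. The Windows and Linux paths are
# common default install locations and are assumptions; adjust for your machine.
DEFAULT_CHROME_PATHS = {
    "Darwin": "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "Windows": r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "Linux": "/usr/bin/google-chrome",
}


def require_env(name, env=None):
    """Return the value of an environment variable, or fail with a clear message."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the agent")
    return value


def default_chrome_path(system=None):
    """Return the assumed Chrome executable path for the given OS name."""
    system = system or platform.system()
    try:
        return DEFAULT_CHROME_PATHS[system]
    except KeyError:
        raise RuntimeError(f"No default Chrome path known for {system!r}") from None
```

In the script above, `default_chrome_path()` could be passed as `chrome_instance_path`, and `require_env('GEMINI_API_KEY')` could be wrapped in `SecretStr`, so a missing key fails early with a readable error.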
Copyright © 2025 | Hardik Desai | All Rights Reserved