How to Crawl Google Images with Python and Save Them Locally

Hello! In this blog post, I will show you how to crawl Google Images using Python and save the images locally.

To crawl Google, we will use the Requests and BeautifulSoup libraries. Requests is a Python library for sending HTTP requests, and BeautifulSoup is a library for parsing HTML and XML documents.
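
To get a feel for what BeautifulSoup does before touching the network, here is a minimal sketch that parses a small hand-written HTML snippet. The snippet and the example.com URL are made up for illustration and are not real Google output:

```python
from bs4 import BeautifulSoup

# A tiny hand-written HTML snippet standing in for a downloaded page (illustrative only).
sample_html = """
<html><body>
  <img src="https://example.com/a.jpg" alt="first">
  <img alt="no source attribute">
  <img src="data:image/gif;base64,R0lGODlhAQABAAAAACw=" alt="inline thumbnail">
</body></html>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# tag.get("src") returns None when the attribute is missing instead of raising an error.
sources = [tag.get("src") for tag in soup.find_all("img")]
print(sources)
```

Notice that an img tag may have no src at all, or may carry an inline "data:" URI instead of a regular web address. Both cases can show up on real pages, so it is worth checking the value before trying to download it.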

 

Here is an example Python script that searches Google Images for the term "Python" and saves the first 5 images it finds on the results page to your local machine:

import os

import requests
from bs4 import BeautifulSoup

# Define the search term
search_term = "Python"

# Google search endpoint; passing the query via params lets Requests URL-encode it
# (tbm=isch selects image search)
url = "https://www.google.com/search"
params = {"q": search_term, "tbm": "isch"}

# Set the User-Agent header to avoid being blocked by Google
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"}

# Send a GET request to the URL and stop early on an HTTP error
response = requests.get(url, params=params, headers=headers, timeout=10)
response.raise_for_status()

# Use BeautifulSoup to parse the response content
soup = BeautifulSoup(response.content, "html.parser")

# Collect the first 5 image URLs on the page.
# Tags without a src, relative paths, and inline "data:" thumbnails are skipped,
# because requests.get() cannot download them.
image_urls = []
for image_tag in soup.find_all("img"):
    src = image_tag.get("src")
    if src and src.startswith(("http://", "https://")):
        image_urls.append(src)
    if len(image_urls) == 5:
        break

# Create a folder to store the downloaded images (if it doesn't exist)
folder_name = f"{search_term}_images"
os.makedirs(folder_name, exist_ok=True)

# Download each image and save it to the folder
for i, image_url in enumerate(image_urls):
    # Send a GET request to the image URL and store the response content
    image_response = requests.get(image_url, headers=headers, timeout=10)
    image_response.raise_for_status()
    image_content = image_response.content

    # Save the image content to a file in the folder
    # (the .jpg extension is assumed; the actual image format may differ)
    file_name = f"{search_term}_image_{i+1}.jpg"
    file_path = os.path.join(folder_name, file_name)
    with open(file_path, "wb") as f:
        f.write(image_content)

# Print a message to indicate how many images were actually downloaded
print(f"{len(image_urls)} images downloaded to folder {folder_name}")
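
The script above names every file .jpg regardless of what the server actually sends. One possible refinement is to pick the extension from the Content-Type response header. This is only a sketch: extension_for and the small mapping below are hypothetical names introduced here for illustration, and the mapping covers just a few common image types.

```python
# Hypothetical helper (not part of the original script): choose a file extension
# from a Content-Type header value, falling back to .jpg when the type is unknown.
KNOWN_IMAGE_TYPES = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
}


def extension_for(content_type, default=".jpg"):
    """Return a file extension for the given Content-Type header value."""
    if not content_type:
        return default
    # Drop optional parameters such as "; charset=binary" and normalise case.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    return KNOWN_IMAGE_TYPES.get(mime_type, default)


print(extension_for("image/png"))
print(extension_for("image/jpeg; charset=binary"))
print(extension_for(None))
```

Inside the download loop, the value of image_response.headers.get("Content-Type") could be passed to this helper when building file_name.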

To use this code for your own Google image searches, simply change the value of the search_term variable to your desired search term.
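
One thing to watch out for: a search term that contains spaces or other special characters has to be URL-encoded before it goes into the query string. Here is a minimal sketch using only the standard library. The function name build_image_search_url is a hypothetical helper added for illustration, not something from the original script:

```python
from urllib.parse import urlencode


def build_image_search_url(search_term):
    """Build a Google Images search URL with a safely encoded query string."""
    # urlencode() escapes spaces and special characters in each value.
    query = urlencode({"q": search_term, "tbm": "isch"})
    return f"https://www.google.com/search?{query}"


print(build_image_search_url("red panda"))
# https://www.google.com/search?q=red+panda&tbm=isch
```

Passing a params dictionary to requests.get() has the same effect, since Requests encodes the values for you.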

I hope you find this code helpful!

 