Fetching Web Pages with Python's urllib Library for GET Requests
urllib Module Overview
The urllib package ships with Python's standard library and handles URL-based requests, including HTTP. In Python 3, the most commonly used submodules are urllib.request for opening URLs and urllib.parse for parsing and encoding them. Together they let a script imitate a browser when fetching pages for data extraction tasks.
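As a rough illustration of the browser simulation mentioned above, the sketch below attaches a browser-like User-Agent header to a GET request and decodes the response. The helper names build_get_request and fetch_text and the User-Agent string are assumptions made for this example; they are not taken from the original post.

```python
import urllib.request

# Hypothetical User-Agent string, chosen only for illustration.
USER_AGENT = "Mozilla/5.0 (compatible; example-fetcher/0.1)"


def build_get_request(url):
    """Return a GET Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(
        url, headers={"User-Agent": USER_AGENT}, method="GET"
    )


def fetch_text(url, encoding="utf-8", timeout=10):
    """Send the request and decode the response body as text."""
    request = build_get_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # errors="replace" keeps the sketch from failing on pages that
        # are not actually encoded as `encoding`.
        return response.read().decode(encoding, errors="replace")
```

Calling fetch_text with a page URL should return its HTML as a string, provided network access is available and the page really uses the assumed encoding.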
Practical Examples
Example 1: Retrieving Baidu Homepage C ...
Posted on Sun, 14 Jun 2026 16:53:18 +0000 by kosmidd
Essential Web Scraping Techniques using Urllib and Requests in Python
Utilizing the urllib Module for Web Requests
The urllib library is a built-in Python module used for handling URLs. It provides several ways to fetch data from the web, ranging from simple function calls to complex custom handlers.
Basic Web Access and Custom Openers
The simplest way to retrieve a webpage is using urlopen. For more advanced con ...
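The excerpt is cut off here, so the following is only a hedged sketch of the two approaches named in the heading: a plain urlopen call and a custom opener created with build_opener. The function names make_opener and fetch_bytes and the default User-Agent value are assumptions for illustration.

```python
import urllib.request


def make_opener(user_agent="example-opener/0.1"):
    """Build an OpenerDirector that sends a custom User-Agent header."""
    opener = urllib.request.build_opener(urllib.request.HTTPHandler())
    # addheaders replaces the default ("User-agent", "Python-urllib/x.y") entry.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch_bytes(url, opener=None, timeout=10):
    """Fetch raw bytes with urlopen, or with a custom opener when one is given."""
    open_url = opener.open if opener is not None else urllib.request.urlopen
    with open_url(url, timeout=timeout) as response:
        return response.read()
```

A custom opener is mainly useful when every request should share the same headers or handlers; for a one-off fetch, urlopen alone is usually enough.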
Posted on Tue, 26 May 2026 16:22:22 +0000 by andybrooke
Python Web Scraping Fundamentals: Request Handling and Network Operations
GET Requests with Dictionary Parameters
When making GET requests with query parameters, we can construct URLs dynamically using dictionaries:
import urllib.request
import urllib.parse
import string
def get_params():
    base_url = "http://www.baidu.com/s"
    params = {
        "query": "中文",
        " ...
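The listing above is truncated, so its remaining parameters are unknown. As a minimal self-contained sketch of the same idea, urllib.parse.urlencode can turn a dictionary into a percent-encoded query string. The helper name build_search_url is an assumption, and only the one parameter visible in the excerpt is used.

```python
import urllib.parse


def build_search_url(base_url, params):
    """Append percent-encoded query parameters to base_url."""
    # urlencode encodes non-ASCII values as UTF-8 before percent-encoding.
    query_string = urllib.parse.urlencode(params)
    return f"{base_url}?{query_string}"


url = build_search_url("http://www.baidu.com/s", {"query": "中文"})
print(url)  # http://www.baidu.com/s?query=%E4%B8%AD%E6%96%87
```

Encoding the dictionary rather than concatenating strings by hand avoids sending raw non-ASCII characters in the URL, which urlopen would reject.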
Posted on Sat, 23 May 2026 17:18:26 +0000 by tj71587
Managing Sessions, Errors, and HTTP Requests in Python Web Scraping
Handling HTTP Cookies and Session State
HTTP is inherently stateless, meaning each request is independent. To maintain user sessions across multiple requests, web servers rely on cookies. When building scrapers, managing these cookies correctly is essential for accessing authenticated or personalized content.
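One common way to keep cookies between requests with the standard library is to pair http.cookiejar.CookieJar with urllib.request.HTTPCookieProcessor. The sketch below assumes that approach; it may differ from the techniques the full post goes on to describe, and the helper names are made up for this example.

```python
import http.cookiejar
import urllib.request


def make_session_opener():
    """Return (opener, jar); the opener stores and resends cookies via the jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def cookie_names(jar):
    """List the names of the cookies currently held in the jar."""
    return [cookie.name for cookie in jar]
```

Every request sent through the returned opener shares the same jar, so a cookie set by a login response would be sent back automatically on later requests to the same site.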
Manual Cookie Injection
The simplest ...
Posted on Wed, 20 May 2026 07:53:20 +0000 by weknowtheworld
Quick Python Web Scraping Guide: Choose Your Meal in Minutes
This process isn't technically complex—it's more about patience and attention to detail. That’s why many people choose web scraping as a side job. Though it’s time-consuming, the technical barrier is relatively low. After this lesson, you won't think web scraping is hard anymore. You may later encounter challenges like session management or byp ...
Posted on Mon, 11 May 2026 06:37:04 +0000 by sahel
Building a Python Image Scraper with urllib and Regex
The core mechanism of an image scraper involves three steps: fetching a webpage, parsing its HTML to extract image URLs, and downloading each image file. Below are two practical examples—one for a general gallery site and another for a specific article. These scripts rely on Python's urllib.request and re modules and were originally written in ...
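The three steps can be sketched roughly as follows. This is an assumption-laden illustration rather than the post's own scripts: the regular expression, the function names, and the output directory are hypothetical, and a regex this simple will miss or mis-match some real-world markup.

```python
import os
import re
import urllib.parse
import urllib.request

# Simplified pattern: captures the src value of <img> tags (illustrative only).
IMG_SRC_PATTERN = re.compile(r'<img[^>]+src=["\']([^"\']+)["\']', re.IGNORECASE)


def extract_image_urls(html, page_url):
    """Step 2: pull src values out of <img> tags and make them absolute."""
    return [
        urllib.parse.urljoin(page_url, src)
        for src in IMG_SRC_PATTERN.findall(html)
    ]


def download_images(page_url, out_dir="images", timeout=10):
    """Steps 1 and 3: fetch the page, then save each image under out_dir."""
    with urllib.request.urlopen(page_url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, image_url in enumerate(extract_image_urls(html, page_url)):
        # Fall back to .jpg when the URL path has no file extension.
        path = urllib.parse.urlparse(image_url).path
        extension = os.path.splitext(path)[1] or ".jpg"
        target = os.path.join(out_dir, f"{index}{extension}")
        with urllib.request.urlopen(image_url, timeout=timeout) as image:
            with open(target, "wb") as output_file:
                output_file.write(image.read())
        saved.append(target)
    return saved
```

Keeping the extraction step separate from the network calls makes it possible to check the regex against saved HTML before downloading anything.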
Posted on Thu, 07 May 2026 09:05:51 +0000 by kula