C# HTTP Helper for Web Scraping with Automatic Encoding Detection and Cookie Support
This utility class simplifies making HTTP requests in C# while automatically handling character encoding, gzip-compressed responses, cookies, and common request headers. It is especially useful for web scraping scenarios where the target page's encoding is unknown or inconsistent.
Core Features
Automatic detection of page encoding from HTML me ...
Posted on Fri, 15 May 2026 16:37:01 +0000 by RDKL PerFecT
Scraping Douban Book Data with Scrapy
Scrapy is an asynchronous web crawling framework built on Twisted, enabling efficient and scalable data extraction in Python. To begin scraping book information from Douban’s website, first install Scrapy using pip:
pip install Scrapy -i https://pypi.tuna.tsinghua.edu.cn/simple
Create a new project named douban:
scrapy startproject douban
cd ...
Posted on Fri, 15 May 2026 10:30:34 +0000 by webmaster1
Parsing HTML and XML Data in Python with re, BeautifulSoup, and lxml
Regular Expressions with the re Module
The re module provides pattern matching operations for string processing, often used for data extraction and validation.
import re
# Extract all numeric sequences from a string
number_list = re.findall(r'\d+', 'ID: 12345, Code: 67890')
print(number_list)
# Use an iterator for memory-efficient matching
nu ...
Posted on Thu, 14 May 2026 21:47:22 +0000 by jjfletch
Extracting Audio Files with Python Web Scraping
To extract audio files from websites, Python's requests library can be used to send HTTP requests and retrieve data. The process involves identifying audio URLs from network requests and saving the files locally.
First, inspect the network activity of a target webpage using browser developer tools. For example, on a music site like gequbao.com, ...
Posted on Thu, 14 May 2026 15:19:10 +0000 by uatec
Quick Python Web Scraping Guide: Choose Your Meal in Minutes
This process isn't technically complex—it's more about patience and attention to detail. That’s why many people choose web scraping as a side job. Though it’s time-consuming, the technical barrier is relatively low. After this lesson, you won't think web scraping is hard anymore. You may later encounter challenges like session management or byp ...
Posted on Mon, 11 May 2026 06:37:04 +0000 by sahel
Parsing HTML Content with Beautiful Soup in Python
Beautiful Soup is a Python library for parsing HTML and XML documents. It builds parse trees that are helpful for extracting data from web pages and provides simple methods for navigating, searching, and modifying the parse tree.
Installation
pip install beautifulsoup4
Basic Usage
from bs4 import BeautifulSoup
html_doc = """
< ...
Posted on Sun, 10 May 2026 16:24:05 +0000 by cornix
Common Regular Expression Usage for Python Web Scraping
Regular Expression Patterns
Regex patterns use special syntax to define matching rules. The following table lists common special syntax elements. Note that some meanings may change when optional flags are used.
Pattern    Description
^          Matches the start of a string
$          Matches the end of a string
.          Matches any single character except newline ...
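The three patterns listed above, and the way optional flags can change their meaning, can be illustrated with a short sketch. The sample string and variable names are made up for this example.

```python
import re

# Hypothetical two-line sample string for illustration.
text = "cat\ncot"

# ^ anchors at the start of the string, so only the leading "cat" matches.
starts = re.findall(r"^c.t", text)

# $ anchors at the end of the string, so only the trailing "cot" matches.
ends = re.findall(r"c.t$", text)

# . does not match a newline by default, so "t.c" cannot span the "\n".
no_newline = re.search(r"t.c", text)

# With re.DOTALL, . also matches a newline.
with_dotall = re.search(r"t.c", text, re.DOTALL)

# With re.MULTILINE, ^ also matches at the start of each line.
multiline = re.findall(r"^c.t", text, re.MULTILINE)

print(starts, ends, no_newline, multiline)
```

The last two calls show the flag-dependent behavior: re.DOTALL widens what . accepts, and re.MULTILINE makes ^ (and $) apply per line rather than per string.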
Posted on Sun, 10 May 2026 08:12:24 +0000 by movieflick
Reversing Youdao Translate Sign Generation and Response Decryption in Python
Generating the sign Parameter
The sign parameter observed in Youdao's translate API appears as a 32-character hexadecimal string, indicative of MD5 hashing.
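The length clue can be checked quickly in Python: an MD5 digest is 128 bits, so its hexadecimal form is always 32 characters. The sketch below is only a sanity check with a placeholder input. The helper name md5_hex and the input string are assumptions for illustration and do not reproduce the actual sign payload.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a lowercase hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Placeholder input for illustration only, not the real sign payload.
sample = md5_hex("example-input")
print(len(sample))  # 32
```

A 32-character hex string is consistent with MD5 but does not prove it, since other 128-bit digests have the same length. Tracing the JavaScript is still needed to confirm the algorithm and its input.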
Trace the JavaScript logic to locate where sign is produced:
const clientId = "fanyideskweb";
const productTag = "webfanyi";
function md5Hex(input) {
return require(' ...
Posted on Sat, 09 May 2026 03:23:40 +0000 by explorer
Selenium Web Scraping and Flume Data Processing Implementation
Extracting Stock Market Data with Selenium
To retrieve financial information from dynamic web pages, Selenium is used to automate browser interactions, specifically targeting elements that load via JavaScript. The target involves extracting data from the Shanghai A-shares, Shenzhen A-shares, and aggregated boards. The data is persisted in a stru ...
Posted on Thu, 07 May 2026 11:10:01 +0000 by Ryan Sanders
Python Web Scraping for Animated Images Collection
Web Scraping Animated Images with Python
Automatically collecting animated images from websites can be useful when manual downloading is cumbersome, especially when websites restrict right-click functionality. This guide demonstrates how to create a Python script to extract GIFs from online sources.
We'll be scraping images from "FunnyGIFs", ...
Posted on Thu, 07 May 2026 09:14:37 +0000 by onlinegamesnz