Web Scraping with XPath: Extracting News Headlines from 36Kr
Having previously explored the BeautifulSoup library for HTML parsing and techniques for capturing HTTP requests with online tools, we now turn to another fundamental web scraping approach: XPath. XPath is a query language designed to navigate and select specific parts of XML documents. While originally deve ...
Posted on Fri, 12 Jun 2026 18:32:30 +0000 by 9mm
Advanced HTML Parsing Strategies with PyQuery
Core Overview
PyQuery provides a jQuery-style interface for querying and manipulating the DOM in Python. It uses lxml as its parsing backend to handle complex HTML structures.
Environment Setup
Installing PyQuery requires lxml, the parsing library it builds on.
pip install lxml pyquery
Initializing the Document Object
Processing begins by ...
Posted on Wed, 13 May 2026 21:53:15 +0000 by Eckstra