Web Scraping with XPath: Extracting News Headlines from 36Kr

Having previously explored the BeautifulSoup library for HTML parsing and techniques for capturing HTTP requests with online tools, we now turn to another fundamental web scraping approach: XPath. XPath is a query language for navigating and selecting specific parts of XML documents. While originally deve ...
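
To make the idea of selecting parts of a document concrete, here is a minimal sketch using only Python's standard library. The markup, the class names (news-list, title, ad) and the helper extract_headlines are hypothetical and are not taken from 36Kr's actual pages. Note also that xml.etree.ElementTree supports only a subset of XPath and requires well-formed input, so real-world HTML would usually need a more tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for a news listing page.
SAMPLE = """
<html>
  <body>
    <div class="news-list">
      <a class="title" href="/p/1">First headline</a>
      <a class="title" href="/p/2">Second headline</a>
      <a class="ad" href="/promo">Sponsored link</a>
    </div>
  </body>
</html>
"""


def extract_headlines(markup):
    """Return (text, href) pairs for every <a class="title"> element."""
    root = ET.fromstring(markup.strip())
    # ".//a" matches <a> elements at any depth below the root;
    # "[@class='title']" keeps only those whose class attribute is exactly "title".
    links = root.findall(".//a[@class='title']")
    return [(link.text.strip(), link.get("href")) for link in links]


if __name__ == "__main__":
    for text, href in extract_headlines(SAMPLE):
        print(text, href)
```

The predicate in square brackets is what filters out the sponsored link: it matches on the attribute value rather than on the element's position in the page, which tends to survive small layout changes better than positional paths.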

Posted on Fri, 12 Jun 2026 18:32:30 +0000 by 9mm

Advanced HTML Parsing Strategies with PyQuery

Core Overview: PyQuery provides an efficient interface for DOM manipulation in Python, mirroring the functionality of jQuery. It leverages the lxml parser backend to handle complex HTML structures. Environment Setup: Installation requires the core parsing libraries (pip install lxml pyquery). Initializing the Document Object: Processing begins by ...

Posted on Wed, 13 May 2026 21:53:15 +0000 by Eckstra