Web Scraping with XPath: Extracting News Headlines from 36Kr
Having previously explored the powerful BeautifulSoup library for HTML parsing and techniques for capturing HTTP requests through online tools, we now turn our attention to another fundamental web scraping approach: XPath. XPath is a query language designed to navigate and select specific portions of XML documents. While originally deve ...
Posted on Fri, 12 Jun 2026 18:32:30 +0000 by 9mm
Applying XPath Expressions with Python's lxml Library
Installation
Install the library using pip:
pip install lxml
XPath Core Concepts
Node Types
XPath defines seven node types: element, attribute, text, namespace, processing instruction, comment, and the document (root) node. An XML document is represented as a node tree, with the root of the tree being the document or root node.
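As a rough illustration of how several of these node types surface in lxml, the sketch below queries a tiny made-up document. The XML string, the element names, and the `render` processing instruction are all assumptions invented for this example, not taken from the post.

```python
from lxml import etree

# Hypothetical sample document, invented purely for illustration.
xml = b'<news><!-- top story --><?render fast?><item id="1">Headline</item></news>'
root = etree.fromstring(xml)

elements = root.xpath("//item")                  # element nodes
attributes = root.xpath("//item/@id")            # attribute nodes (returned as strings)
texts = root.xpath("//item/text()")              # text nodes (returned as strings)
comments = root.xpath("//comment()")             # comment nodes
pis = root.xpath("//processing-instruction()")   # processing-instruction nodes

print(elements[0].tag, attributes, texts)
print(comments[0].text, pis[0].target)
```

Note that lxml returns attribute and text results as string-like objects rather than separate node objects, which is usually convenient when scraping.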
Consider this s ...
Posted on Wed, 20 May 2026 18:13:14 +0000 by alego
Practical XPath Parsing with Python's lxml Library
The lxml library is a Pythonic wrapper around the C libraries libxml2 and libxslt, delivering fast parsing of HTML and XML documents. Its comprehensive support for XPath 1.0 makes it a good choice for targeted data extraction tasks.
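A minimal sketch of what targeted extraction with XPath 1.0 can look like, assuming lxml is already installed. The HTML snippet, the `title` class, and the link paths are hypothetical placeholders, not markup from any real site.

```python
from lxml import html

# Hypothetical page fragment, invented for illustration.
page = """<html><body>
<h2 class="title"><a href="/p/1">First headline</a></h2>
<h2 class="title"><a href="/p/2">Second headline</a></h2>
</body></html>"""

doc = html.fromstring(page)

# Select the link text and the href attribute of every matching heading.
titles = doc.xpath('//h2[@class="title"]/a/text()')
links = doc.xpath('//h2[@class="title"]/a/@href')

for title, link in zip(titles, links):
    print(title, link)
```

Each `xpath()` call returns a plain Python list, so the results can be zipped, filtered, or written out without further conversion.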
Setup
Install the package via pip:
pip install lxml
Initializing Parser O ...
Posted on Fri, 15 May 2026 15:42:04 +0000 by zackcez