Practical Ajax Data Scraping and MySQL Integration
Target Data Extraction
Extract each movie's title, alias, categories, regions, release date, duration, rating, and description from the Scrape | Movie site, then store the records in a MySQL database.
Ajax Request Analysis
By inspecting network requests from the target website, we identify the structured data endpoint pattern:
https://spa1.scrape.center/api/movie/{id}/
The response body is a JSON object. Once parsed, it maps directly onto a native Python dictionary.
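To make the conversion concrete, here is a minimal sketch that parses a hand-written sample payload with the standard json module. The sample string is an assumption shaped after the field names used later in this article, not a captured response from the site.

```python
import json

# Hypothetical sample payload; field names follow those used in this article.
sample = (
    '{"name": "Example", "alias": "Example Alias", '
    '"categories": ["Drama"], "regions": ["France"], '
    '"published_at": null, "minute": 120, "score": 9.0, '
    '"drama": "Plot summary."}'
)

movie = json.loads(sample)     # JSON null becomes Python None
print(type(movie).__name__)    # dict
print(movie["published_at"])   # None

# eval() on the same text fails, because `null` is not a Python name.
try:
    eval(sample)
except NameError as exc:
    print("eval failed:", exc)
```

A JSON parser is the safer choice for remote data in general, since it never executes the text it reads.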
Data Extraction Implementation
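We parse the response with response.json() rather than eval(). eval() executes whatever text the server returns as Python code, and it raises NameError on JSON literals such as null, true, and false. Fields are read with dict.get(), so a missing key yields None instead of raising KeyError. Request-level failures (network errors, non-2xx status codes) are raised here and caught by the try...except block in the main loop, so one bad record does not terminate the run.
import requests

def fetch_movie_data(url):
    # Request the JSON endpoint and raise on HTTP error status codes.
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    raw_data = response.json()
    # .get() returns None for a missing field instead of raising KeyError;
    # `or []` keeps join() working when a list field is missing or null.
    return {
        'title': raw_data.get('name'),
        'foreign_name': raw_data.get('alias'),
        'categories': ','.join(raw_data.get('categories') or []),
        'regions': ','.join(raw_data.get('regions') or []),
        'release_date': raw_data.get('published_at'),
        'duration': raw_data.get('minute'),
        'rating': raw_data.get('score'),
        'description': raw_data.get('drama'),
    }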
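Because release_date is later stored in a DATE column, it can help to validate the value before insertion. The helper below is a hypothetical addition, not part of the original code, and it assumes the endpoint returns published_at either as an ISO 'YYYY-MM-DD' string or as null.

```python
from datetime import date

def parse_release_date(value):
    """Return a date for an ISO 'YYYY-MM-DD' string, or None if missing or malformed."""
    if not value:
        return None
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        # Not a string, or not a valid ISO date: store NULL instead of failing.
        return None

print(parse_release_date("1993-07-26"))  # 1993-07-26
print(parse_release_date(None))          # None
```

A None result is written to MySQL as NULL, so a record with an unknown release date is still stored.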
MySQL Integration
Database Table Creation
Before storing data, create a table whose columns match the extracted fields:
def create_movie_table():
    # Requires `import pymysql` and the db_config dictionary (see Complete Implementation).
    connection = pymysql.connect(**db_config)
    try:
        with connection.cursor() as cursor:
            cursor.execute("""
                CREATE TABLE IF NOT EXISTS movies (
                    id INT AUTO_INCREMENT PRIMARY KEY,
                    title VARCHAR(100),
                    foreign_name VARCHAR(100),
                    categories VARCHAR(100),
                    regions VARCHAR(100),
                    release_date DATE,
                    duration VARCHAR(50),
                    rating VARCHAR(20),
                    description TEXT
                )
            """)
    finally:
        # Close the connection even if table creation fails.
        connection.close()
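The functions in this article unpack a db_config dictionary that is imported from a db_config module but never shown. The sketch below is one possible shape for that module. The environment variable names, the defaults, and the database name are all assumptions; only the keys (host, port, user, password, database, charset) are standard pymysql.connect() arguments.

```python
# db_config.py (hypothetical contents; adjust to your environment)
import os

db_config = {
    "host": os.environ.get("MYSQL_HOST", "localhost"),
    "port": int(os.environ.get("MYSQL_PORT", "3306")),
    "user": os.environ.get("MYSQL_USER", "root"),
    "password": os.environ.get("MYSQL_PASSWORD", ""),
    "database": os.environ.get("MYSQL_DATABASE", "spiders"),  # assumed name
    "charset": "utf8mb4",  # full Unicode, useful for non-Latin titles
}
```

Reading credentials from the environment keeps passwords out of the source file.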
Data Insertion
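Insert each record with a parameterized query. The values are passed separately to cursor.execute(), so the driver escapes them. The column names are interpolated into the SQL text, which is safe here only because they come from the fixed keys produced by fetch_movie_data().
def store_movie_data(data):
    connection = pymysql.connect(**db_config)
    try:
        with connection.cursor() as cursor:
            columns = ','.join(data.keys())
            placeholders = ','.join(['%s'] * len(data))
            query = f"INSERT INTO movies ({columns}) VALUES ({placeholders})"
            # Values travel as parameters, never as part of the SQL string.
            cursor.execute(query, tuple(data.values()))
        connection.commit()
    finally:
        # Close the connection even if the insert fails.
        connection.close()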
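The query construction can be checked without a database. The sketch below isolates it in a hypothetical helper, build_insert(), and adds an allowlist of column names as an extra guard. Both the helper and the allowlist are illustrative assumptions, not part of the original code. The table name is interpolated too, so it should only ever be a trusted constant.

```python
# Columns of the movies table that may be written (mirrors the schema above).
ALLOWED_COLUMNS = {
    "title", "foreign_name", "categories", "regions",
    "release_date", "duration", "rating", "description",
}

def build_insert(table, data):
    """Return (query, params) for a parameterized INSERT statement."""
    unknown = set(data) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unexpected columns: {sorted(unknown)}")
    columns = ", ".join(data)
    placeholders = ", ".join(["%s"] * len(data))
    query = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    return query, tuple(data.values())

query, params = build_insert("movies", {"title": "Example", "rating": 9.0})
print(query)   # INSERT INTO movies (title, rating) VALUES (%s, %s)
print(params)  # ('Example', 9.0)
```

The returned pair can be passed straight to cursor.execute(query, params).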
Complete Implementation
import requests
import pymysql
from db_config import db_config

# Table creation function (shown above)
# Data extraction function (shown above)
# Data storage function (shown above)

if __name__ == '__main__':
    create_movie_table()
    for i in range(1, 101):
        url = f'https://spa1.scrape.center/api/movie/{i}/'
        try:
            movie_data = fetch_movie_data(url)
            store_movie_data(movie_data)
        except Exception as e:
            # Log the failure and continue with the next movie.
            print(f'Error processing {url}: {e}')
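The main loop prints failures and moves on. If you want to retry failed URLs later, one option is to collect them instead. The sketch below is a hypothetical variant: scrape_all() takes the fetch and store functions as arguments, so it can be exercised with stand-ins and without network or database access.

```python
def scrape_all(ids, fetch, store):
    """Fetch and store each movie; return a list of (url, error message) failures."""
    failures = []
    for movie_id in ids:
        url = f"https://spa1.scrape.center/api/movie/{movie_id}/"
        try:
            store(fetch(url))
        except Exception as exc:
            failures.append((url, str(exc)))
    return failures

# Stand-in functions for a dry run: the fetch for id 2 fails on purpose.
stored = []

def fake_fetch(url):
    if url.endswith("/2/"):
        raise RuntimeError("simulated failure")
    return {"title": url}

failed = scrape_all(range(1, 4), fake_fetch, stored.append)
print(len(stored))  # 2
print(failed)       # [('https://spa1.scrape.center/api/movie/2/', 'simulated failure')]
```

In the real script the call would be scrape_all(range(1, 101), fetch_movie_data, store_movie_data).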