Business Overview
A small online retailer wants daily insight into sales performance—order volume, revenue, and payment-method splits. The operational system (MySQL schema itcast_shop) contains six transactional tables that must be moved into a dedicated analytics store (itcast_shop_bi) and then aggregated for reporting.
Operational Schema
| Table | Purpose |
|---|---|
| ityewu_areas | Province → city → district hierarchy |
| ityewu_goods | SKU master data |
| ityewu_goods_cats | Three-level product category tree |
| ityewu_orders | Order header |
| ityewu_order_goods | Order line items (1-to-many with orders) |
| ityewu_users | Customer master |
Architecture
ODS (Operational Data Store) → APP (pre-aggregated results) → FineBI dashboards.
ETL engine: Pentaho Data Integration (Kettle).
Repository: MySQL (shared DB repo for version control and scheduling). Repository Setup
- Open Spoon → Connect → Repository Manager → Add → Database Repository.
- Create a new JDBC connection to
itcast_shop_bi, test, and save. - Default admin/admin credentials let the team share jobs/transformations.
ODS Layer – Raw Copy
All source tables are copied 1-for-1 into ods_ tables with an extra etl_date column.
Incremental Strategy
| Source | ODS Table | Frequency | Mode |
|---|---|---|---|
| ityewu_areas | ods_ityewu_areas | Monthly | Insert/Update on areaId |
| ityewu_goods_cats | ods_ityewu_goods_cats | month | Insert/Update on catId |
| ityewu_goods | ods_ityewu_goods | daily | Insert/Update on goodsId |
| ityewu_orders | ods_ityewu_orders | daily | Insert/Update on orderId |
| ityewu_order_goods | ods_ityewu_order_goods | daily | Insert/Update on ogId |
| ityewu_users | ods_ityewu_users | daily | Insert/Update on userId |
Kettle Job Example – Monthly Static Tables
Start
|
+-- Transformation: areas_to_ods
| Table Input: SELECT *, CURDATE() AS etl_date FROM itcast_shop.ityewu_areas
| Insert/Update: target = ods_ityewu_areas, keys = areaId
|
+-- Transformation: cats_to_ods
Table Input: SELECT *, CURDATE() AS etl_date FROM itcast_shop.ityewu_goods_cats
Insert/Update: target = ods_ityewu_goods_cats, keys = catId
Schedule the job via Start → Repeat → Monthly → 1st 02:00.
APP Layer – Business Aggregates
All result tables are prefixed app_ and refreshed daily.
1. Daily Order Revenue & Count
CREATE TABLE app_order_total (
id INT AUTO_INCREMENT PRIMARY KEY,
dt DATE,
total_money DOUBLE,
total_cnt INT
);
INSERT INTO app_order_total(dt, total_money, total_cnt)
SELECT
DATE(createTime) AS dt,
SUM(realTotalMoney) AS total_money,
COUNT(1) AS total_cnt
FROM ods_ityewu_orders
GROUP BY dt;
2. Daily Distinct Buyers
CREATE TABLE app_order_user (
id INT AUTO_INCREMENT PRIMARY KEY,
dt DATE,
total_user_cnt INT
);
INSERT INTO app_order_user(dt, total_user_cnt)
SELECT
DATE(createTime) AS dt,
COUNT(DISTINCT userId)
FROM ods_ityewu_orders
GROUP BY dt;
3. Daily Payment-Method Split
CREATE TABLE app_order_paytype (
id INT AUTO_INCREMENT PRIMARY KEY,
dt DATE,
paytype VARCHAR(20),
total_money DOUBLE,
total_cnt INT
);
INSERT INTO app_order_paytype(dt, paytype, total_money, total_cnt)
SELECT
DATE(createTime) AS dt,
CASE payType
WHEN 1 THEN 'Alipay'
WHEN 2 THEN 'WeChat'
WHEN 3 THEN 'Card'
ELSE 'Other'
END AS paytype,
SUM(realTotalMoney),
COUNT(1)
FROM ods_ityewu_orders
GROUP BY dt, payType;
4. Top-5 Power Users by Order Count
CREATE TABLE app_order_user_top5 (
id INT AUTO_INCREMENT PRIMARY KEY,
dt DATE,
userid VARCHAR(20),
username VARCHAR(50),
total_cnt INT
);
INSERT INTO app_order_user_top5(dt, userid, username, total_cnt)
SELECT
DATE(createTime) AS dt,
userId,
userName,
COUNT(orderId) AS total_cnt
FROM ods_ityewu_orders
WHERE DATE(createTime) = CURDATE() - INTERVAL 1 DAY
GROUP BY dt, userId, userName
ORDER BY total_cnt DESC
LIMIT 5;
5. Revenue & Orders by Top-Level Category
Because the category tree is self-referencing, we first materialize a flattened helper:
CREATE TABLE tmp_goods_cat AS
SELECT
l3.catId AS cat_id_l3,
l2.catId AS cat_id_l2,
l1.catId AS cat_id_l1,
l1.catName AS cat_name_l1
FROM ods_ityewu_goods_cats l3
JOIN ods_ityewu_goods_cats l2 ON l3.parentId = l2.catId
JOIN ods_ityewu_goods_cats l1 ON l2.parentId = l1.catId
WHERE l3.cat_level = 3;
CREATE TABLE app_order_goods_cat (
id INT AUTO_INCREMENT PRIMARY KEY,
dt DATE,
cat_name VARCHAR(50),
total_money DOUBLE,
total_cnt INT
);
INSERT INTO app_order_goods_cat(dt, cat_name, total_money, total_cnt)
SELECT
DATE(o.createTime) AS dt,
c.cat_name_l1,
SUM(og.goodsNum * og.payPrice),
COUNT(DISTINCT og.orderId)
FROM tmp_goods_cat c
JOIN ods_ityewu_goods g ON g.goodsCatId = c.cat_id_l3
JOIN ods_ityewu_order_goods og ON og.goodsId = g.goodsId
JOIN ods_ityewu_orders o ON o.orderId = og.orderId
GROUP BY dt, c.cat_name_l1;
Automation
A single Kettle job orchestrates the daily flow:
- Run incremental transformations for
goods,orders,order_goods,users. - Execute the five SQL scripts above to refresh
app_tables. - Trigger FineBI cache refresh via REST call (optional).
Schedule: every day at 01:30 via Start → Repeat → Daily.
Dashboards
FineBI connects to itcast_shop_bi and reads the app_ tables directly, giving analysts drag-and-drop access to:
- Daily revenue trend
- Payment mix pie chart
- Top buyers leaderboard
- Category performance heat map