DataSophon DolphinScheduler 3.1.9 Integration and Upgrade Guide
Download and Extract Installation Package
DolphinScheduler Download
wget -O /opt/datasophon/DDP/packages/apache-dolphinscheduler-3.1.9-bin.tar.gz https://archive.apache.org/dist/dolphinscheduler/3.1.9/apache-dolphinscheduler-3.1.9-bin.tar.gz
cd /opt/datasophon/DDP/packages/
tar -xvf ./apache-dolphinscheduler-3.1.9-bin.tar.gz
Modify Installation Package Directory Name
The directory name must match the decompressPackageName value in service_ddl.json.
mv apache-dolphinscheduler-3.1.9-bin dolphinscheduler-3.1.9
Add jmx Folder
mkdir -p /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/jmx
cp -r /opt/datasophon/hadoop-3.3.3/jmx/jmx_prometheus_javaagent-0.16.1.jar /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/jmx/
vi /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/jmx/prometheus_config.yml
---
lowercaseOutputLabelNames: true
lowercaseOutputName: true
whitelistObjectNames: ["java.lang:type=OperatingSystem"]
blacklistObjectNames: []
rules:
  - pattern: 'java.lang<type=OperatingSystem><>(committed_virtual_memory|free_physical_memory|free_swap_space|total_physical_memory|total_swap_space)_size:'
    name: os_$1_bytes
    type: GAUGE
    attrNameSnakeCase: true
  - pattern: 'java.lang<type=OperatingSystem><>((?!process_cpu_time)\w+):'
    name: os_$1
    type: GAUGE
    attrNameSnakeCase: true
Modify Startup Commands in Scripts to Enable jmx
vim ./dolphinscheduler-3.1.9/alert-server/bin/start.sh
JAVA_OPTS=${JAVA_OPTS:-"-server -javaagent:$BIN_DIR/../../jmx/jmx_prometheus_javaagent-0.16.1.jar=12359:$BIN_DIR/../../jmx/prometheus_config.yml -Duser.timezone=${SPRING_JACKSON_TIME_ZONE} -Xms1g -Xmx1g -Xmn512m -XX:+PrintGCDetails -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dump.hprof"}
vim ./dolphinscheduler-3.1.9/api-server/bin/start.sh
JAVA_OPTS=${JAVA_OPTS:-"-server -javaagent:$BIN_DIR/../../jmx/jmx_prometheus_javaagent-0.16.1.jar=12356:$BIN_DIR/../../jmx/prometheus_config.yml -Duser.timezone=${SPRING_JACKSON_TIME_ZONE} -Xms1g -Xmx1g -Xmn512m -XX:+PrintGCDetails -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dump.hprof"}
vim ./dolphinscheduler-3.1.9/master-server/bin/start.sh
JAVA_OPTS=${JAVA_OPTS:-"-server -javaagent:$BIN_DIR/../../jmx/jmx_prometheus_javaagent-0.16.1.jar=12357:$BIN_DIR/../../jmx/prometheus_config.yml -Duser.timezone=${SPRING_JACKSON_TIME_ZONE} -Xms1g -Xmx1g -Xmn512m -XX:+PrintGCDetails -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dump.hprof"}
vim ./dolphinscheduler-3.1.9/worker-server/bin/start.sh
JAVA_OPTS=${JAVA_OPTS:-"-server -javaagent:$BIN_DIR/../../jmx/jmx_prometheus_javaagent-0.16.1.jar=12358:$BIN_DIR/../../jmx/prometheus_config.yml -Duser.timezone=${SPRING_JACKSON_TIME_ZONE} -Xms1g -Xmx1g -Xmn512m -XX:+PrintGCDetails -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dump.hprof"}
The JMX port numbers must match the jmxPort values in service_ddl.json:
- api-server: 12356
- master-server: 12357
- worker-server: 12358
- alert-server: 12359
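The four manual edits above could also be scripted. The following is a minimal sketch, not a tested part of the DataSophon tooling: the function name enable_jmx is hypothetical, and it assumes GNU sed and that each stock start.sh defines JAVA_OPTS on a single line beginning with JAVA_OPTS=${JAVA_OPTS:-"-server followed by a space.

```shell
#!/usr/bin/env bash
# Hypothetical helper: insert the JMX exporter agent into each start.sh.
# Assumes GNU sed and an unmodified JAVA_OPTS line that starts with
#   JAVA_OPTS=${JAVA_OPTS:-"-server
enable_jmx() {
  local pkg_dir="$1"
  # Single quotes keep $BIN_DIR literal; start.sh expands it at run time.
  local agent='$BIN_DIR/../../jmx/jmx_prometheus_javaagent-0.16.1.jar'
  local config='$BIN_DIR/../../jmx/prometheus_config.yml'
  local entry server port script

  # server:port pairs, taken from the port list in this guide
  for entry in api-server:12356 master-server:12357 worker-server:12358 alert-server:12359; do
    server="${entry%%:*}"
    port="${entry##*:}"
    script="$pkg_dir/$server/bin/start.sh"

    # Leave a script alone if it already loads a Java agent.
    if grep -q -- '-javaagent:' "$script"; then
      continue
    fi

    # "&" re-emits the matched prefix; the agent option is appended after it.
    sed -i "s|^JAVA_OPTS=\${JAVA_OPTS:-\"-server |&-javaagent:${agent}=${port}:${config} |" "$script"
  done
}

# Example call (path as used in this guide):
#   enable_jmx /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9
```

Running the function twice is harmless, because scripts that already contain a -javaagent option are skipped.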
Modify bin/dolphinscheduler-daemon.sh Script
In dolphinscheduler-3.1.9/bin/dolphinscheduler-daemon.sh, add an exit 1 line to the status branch near the bottom of the script, inside the check for $state == "STOP":
vi /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/bin/dolphinscheduler-daemon.sh
  (status)
    get_server_running_status
    if [[ $state == "STOP" ]]; then
      # font color - red
      state="[ \033[1;31m $state \033[0m ]"
      # Add a line to allow DataSophon to determine status based on return value
      exit 1
    else
      # font color - green
      state="[ \033[1;32m $state \033[0m ]"
    fi
    echo -e "$command $state"
    ;;
  (*)
    echo $usage
    exit 1
    ;;
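With this change, the status command exits with a non-zero code when the service is stopped, so a caller can branch on the return value instead of parsing the colored text. A minimal sketch of such a caller follows; the function name role_state is hypothetical, and DataSophon's own status handling may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a status command and map its exit code to a label.
role_state() {
  # "$@" is the full status command, for example:
  #   bin/dolphinscheduler-daemon.sh status api-server
  if "$@" >/dev/null 2>&1; then
    echo "RUNNING"
  else
    echo "STOPPED"
  fi
}

# Example call, run from the dolphinscheduler-3.1.9 directory:
#   role_state bin/dolphinscheduler-daemon.sh status api-server
```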
Add Required Driver Packages for DolphinScheduler
MySQL8
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/alert-server/libs/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/api-server/libs/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/master-server/libs/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/standalone-server/libs/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/tools/libs/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/worker-server/libs/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
For details, see the DolphinScheduler documentation: https://github.com/apache/dolphinscheduler/blob/3.1.9-release/docs/docs/zh/guide/howto/datasource-setting.md
commons-cli
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/alert-server/libs/commons-cli-1.9.0.jar https://repo1.maven.org/maven2/commons-cli/commons-cli/1.9.0/commons-cli-1.9.0.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/api-server/libs/commons-cli-1.9.0.jar https://repo1.maven.org/maven2/commons-cli/commons-cli/1.9.0/commons-cli-1.9.0.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/master-server/libs/commons-cli-1.9.0.jar https://repo1.maven.org/maven2/commons-cli/commons-cli/1.9.0/commons-cli-1.9.0.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/standalone-server/libs/commons-cli-1.9.0.jar https://repo1.maven.org/maven2/commons-cli/commons-cli/1.9.0/commons-cli-1.9.0.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/tools/libs/commons-cli-1.9.0.jar https://repo1.maven.org/maven2/commons-cli/commons-cli/1.9.0/commons-cli-1.9.0.jar
wget -O /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/worker-server/libs/commons-cli-1.9.0.jar https://repo1.maven.org/maven2/commons-cli/commons-cli/1.9.0/commons-cli-1.9.0.jar
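The twelve downloads above fetch the same two jars six times each. As an alternative sketch, each jar could be downloaded once and then copied into every module. The function name install_jar and the /tmp staging path are hypothetical; the module list is the one used in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one already-downloaded jar into the libs
# directory of every DolphinScheduler module listed in this guide.
install_jar() {
  local jar="$1" pkg_dir="$2" module
  for module in alert-server api-server master-server standalone-server tools worker-server; do
    # Stop at the first failure so a missing libs directory is noticed.
    cp "$jar" "$pkg_dir/$module/libs/" || return 1
  done
}

# Example usage (the /tmp path is an arbitrary staging location):
#   wget -O /tmp/mysql-connector-java-8.0.29.jar https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.29/mysql-connector-java-8.0.29.jar
#   install_jar /tmp/mysql-connector-java-8.0.29.jar /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9
```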
Disable Python Gateway [Optional]
The Python gateway service starts with the api-server by default. To disable it, set python-gateway.enabled to false in the api-server configuration file /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/api-server/conf/application.yaml.
vim /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/api-server/conf/application.yaml
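The same edit could be made non-interactively. The sketch below assumes GNU sed and that application.yaml contains a top-level python-gateway: key with an indented enabled: true line beneath it; check the file before relying on it. The function name disable_python_gateway is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set python-gateway.enabled to false.
# Assumed layout of the YAML file:
#   python-gateway:
#     enabled: true
disable_python_gateway() {
  local yaml="$1"
  # The address range starts at the python-gateway key and ends at the next
  # top-level line, so "enabled" keys in other sections are left untouched.
  sed -i '/^python-gateway:/,/^[^[:space:]#]/ s/^\([[:space:]]*enabled:\)[[:space:]]*true/\1 false/' "$yaml"
}

# Example call:
#   disable_python_gateway /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/api-server/conf/application.yaml
```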
Rebuild Installation Package
tar -zcf dolphinscheduler-3.1.9.tar.gz dolphinscheduler-3.1.9
md5sum dolphinscheduler-3.1.9.tar.gz > dolphinscheduler-3.1.9.tar.gz.md5
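Before continuing, it may be worth confirming that the archive is readable and that the checksum file matches it. A minimal sketch, assuming the .md5 file keeps the default md5sum output format produced by the command above; the function name verify_package is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check the checksum file and list the archive contents.
verify_package() {
  local dir="$1" archive="$2"
  # Run in a subshell so the caller's working directory is unchanged.
  # md5sum -c resolves the file name stored in the .md5 file relative to $dir.
  (
    cd "$dir" &&
    md5sum -c "$archive.md5" >/dev/null &&
    tar -tzf "$archive" >/dev/null
  )
}

# Example call:
#   verify_package /opt/datasophon/DDP/packages dolphinscheduler-3.1.9.tar.gz && echo "package OK"
```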
Update DS/service_ddl.json with the Following Parameters
vi /opt/datasophon/datasophon-manager-1.2.1/conf/meta/DDP-1.2.1/DS/service_ddl.json
{
"name": "DS",
"label": "DolphinScheduler",
"description": "Distributed and extensible visual workflow task scheduling platform",
"version": "3.1.9",
"sortNum": 14,
"dependencies":["ZOOKEEPER"],
"packageName": "dolphinscheduler-3.1.9.tar.gz",
"decompressPackageName": "dolphinscheduler-3.1.9",
"roles": [
{
"name": "ApiServer",
"label": "ApiServer",
"roleType": "master",
"cardinality": "1",
"logFile": "api-server/logs/dolphinscheduler-api.log",
"jmxPort": 12356,
"startRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"start",
"api-server"
]
},
"stopRunner": {
"timeout": "600",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"stop",
"api-server"
]
},
"statusRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"status",
"api-server"
]
},
"restartRunner": {
"timeout": "60",
"program": "control.sh",
"args": [
"restart",
"api-server"
]
},
"externalLink": {
"name": "DolphinScheduler Ui",
"label": "DolphinScheduler Ui",
"url": "http://${host}:12345/dolphinscheduler/ui"
}
},
{
"name": "MasterServer",
"label": "MasterServer",
"roleType": "master",
"cardinality": "1+",
"logFile": "master-server/logs/dolphinscheduler-master.log",
"jmxPort": 12357,
"startRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"start",
"master-server"
]
},
"stopRunner": {
"timeout": "600",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"stop",
"master-server"
]
},
"statusRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"status",
"master-server"
]
},
"restartRunner": {
"timeout": "60",
"program": "control.sh",
"args": [
"restart",
"master-server"
]
}
},
{
"name": "WorkerServer",
"label": "WorkerServer",
"roleType": "worker",
"cardinality": "1+",
"logFile": "worker-server/logs/dolphinscheduler-worker.log",
"jmxPort": 12358,
"startRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"start",
"worker-server"
]
},
"stopRunner": {
"timeout": "600",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"stop",
"worker-server"
]
},
"statusRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"status",
"worker-server"
]
},
"restartRunner": {
"timeout": "60",
"program": "control.sh",
"args": [
"restart",
"worker-server"
]
}
},
{
"name": "AlertServer",
"label": "AlertServer",
"roleType": "master",
"cardinality": "1",
"logFile": "alert-server/logs/dolphinscheduler-alert.log",
"jmxPort": 12359,
"startRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"start",
"alert-server"
]
},
"stopRunner": {
"timeout": "600",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"stop",
"alert-server"
]
},
"statusRunner": {
"timeout": "60",
"program": "bin/dolphinscheduler-daemon.sh",
"args": [
"status",
"alert-server"
]
},
"restartRunner": {
"timeout": "60",
"program": "control.sh",
"args": [
"restart",
"alert-server"
]
}
}
],
"configWriter": {
"generators": [
{
"filename": "dolphinscheduler_env.sh",
"configFormat": "custom",
"outputDirectory": "bin/env/",
"templateName": "dolphinscheduler_env.ftl",
"includeParams": [
"databaseUrl",
"username",
"password",
"zkUrls"
]
},
{
"filename": "common.properties",
"configFormat": "properties",
"outputDirectory": "api-server/conf,master-server/conf,worker-server/conf,alert-server/conf",
"includeParams": [
"data.basedir.path",
"resource.storage.type",
"resource.storage.upload.base.path",
"resource.aws.access.key.id",
"resource.aws.secret.access.key",
"resource.aws.region",
"resource.aws.s3.bucket.name",
"resource.aws.s3.endpoint",
"resource.alibaba.cloud.access.key.id",
"resource.alibaba.cloud.access.key.secret",
"resource.alibaba.cloud.region",
"resource.alibaba.cloud.oss.bucket.name",
"resource.alibaba.cloud.oss.endpoint",
"resource.hdfs.root.user",
"resource.hdfs.fs.defaultFS",
"hadoop.security.authentication.startup.state",
"java.security.krb5.conf.path",
"login.user.keytab.username",
"login.user.keytab.path",
"kerberos.expire.time",
"resource.manager.httpaddress.port",
"yarn.resourcemanager.ha.rm.ids",
"yarn.application.status.address",
"yarn.job.history.status.address",
"datasource.encryption.enable",
"datasource.encryption.salt",
"data-quality.jar.name",
"support.hive.oneSession",
"sudo.enable",
"setTaskDirToTenant.enable",
"development.state",
"alert.rpc.port",
"conda.path",
"task.resource.limit.state",
"ml.mlflow.preset_repository",
"ml.mlflow.preset_repository_version",
"custom.common.properties"
]
}
]
},
"parameters": [
{
"name": "databaseUrl",
"label": "DolphinScheduler Database URL",
"description": "",
"configType": "map",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "jdbc:mysql://${apiHost}:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&useSSL=false"
},
{
"name": "username",
"label": "DolphinScheduler Database Username",
"description": "",
"configType": "map",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "dolphinscheduler"
},
{
"name": "password",
"label": "DolphinScheduler Database Password",
"description": "",
"configType": "map",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "dolphinscheduler"
},
{
"name": "zkUrls",
"label": "ZooKeeper URLs",
"description": "",
"configType": "map",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "${zkUrls}"
},
{
"name": "data.basedir.path",
"label": "data.basedir.path",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "/tmp/dolphinscheduler"
},{
"name": "resource.storage.type",
"label": "resource.storage.type",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "NONE"
},{
"name": "resource.storage.upload.base.path",
"label": "resource.storage.upload.base.path",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "/dolphinscheduler"
},{
"name": "resource.aws.access.key.id",
"label": "resource.aws.access.key.id",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "minioadmin"
},{
"name": "resource.aws.secret.access.key",
"label": "resource.aws.secret.access.key",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "minioadmin"
},{
"name": "resource.hdfs.root.user",
"label": "resource.hdfs.root.user",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "hdfs"
},{
"name": "resource.hdfs.fs.defaultFS",
"label": "resource.hdfs.fs.defaultFS",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "${fs.defaultFS}"
},{
"name": "resource.manager.httpaddress.port",
"label": "resource.manager.httpaddress.port",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "8088"
},{
"name": "yarn.resourcemanager.ha.rm.ids",
"label": "yarn.resourcemanager.ha.rm.ids",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "${rmHost}"
},{
"name": "yarn.application.status.address",
"label": "yarn.application.status.address",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "http://ds1:%s/ws/v1/cluster/apps/%s"
},{
"name": "yarn.job.history.status.address",
"label": "yarn.job.history.status.address",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "http://ds1:19888/ws/v1/history/mapreduce/jobs/%s"
},{
"name": "datasource.encryption.enable",
"label": "datasource.encryption.enable",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "false"
},{
"name": "datasource.encryption.salt",
"label": "datasource.encryption.salt",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "!@#$%^&*"
},{
"name": "data-quality.jar.name",
"label": "data-quality.jar.name",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "dolphinscheduler-data-quality-dev-SNAPSHOT.jar"
},{
"name": "support.hive.oneSession",
"label": "support.hive.oneSession",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "false"
},{
"name": "sudo.enable",
"label": "sudo.enable",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "true"
},{
"name": "setTaskDirToTenant.enable",
"label": "setTaskDirToTenant.enable",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "false"
},{
"name": "development.state",
"label": "development.state",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "false"
},{
"name": "alert.rpc.port",
"label": "alert.rpc.port",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "50052"
},{
"name": "conda.path",
"label": "conda.path",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "/opt/anaconda3/etc/profile.d/conda.sh"
},{
"name": "task.resource.limit.state",
"label": "task.resource.limit.state",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "false"
},{
"name": "ml.mlflow.preset_repository",
"label": "ml.mlflow.preset_repository",
"description": "",
"required": true,
"type": "input",
"value": "",
"configurableInWizard": true,
"hidden": false,
"defaultValue": "https://github.com/apache/dolphinscheduler-mlflow"
},{
"name": "custom.common.properties",
"label": "Custom common.properties configuration",
"description": "Custom configuration",
"configType": "custom",
"required": false,
"type": "multipleWithKey",
"value": [],
"configurableInWizard": true,
"hidden": false,
"defaultValue": ""
}
]
}
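Because the jmxPort and decompressPackageName values in this file must agree with the start.sh scripts and the package directory name, a quick consistency check can help catch typos. The following is a sketch only: both function names are hypothetical, and the grep and sed patterns assume one key per line, as in the JSON above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the set of jmxPort values in
# service_ddl.json equals the set of agent ports found in */bin/start.sh.
check_jmx_ports() {
  local ddl="$1" pkg_dir="$2" ddl_ports script_ports
  ddl_ports=$(grep -o '"jmxPort"[[:space:]]*:[[:space:]]*[0-9][0-9]*' "$ddl" \
    | grep -o '[0-9][0-9]*$' | sort -u)
  script_ports=$(grep -ho 'javaagent[^=]*=[0-9][0-9]*' "$pkg_dir"/*/bin/start.sh \
    | grep -o '[0-9][0-9]*$' | sort -u)
  [ -n "$ddl_ports" ] && [ "$ddl_ports" = "$script_ports" ]
}

# Hypothetical helper: succeed only if a directory named after
# decompressPackageName exists under the packages directory.
check_package_dir() {
  local ddl="$1" packages_dir="$2" dir_name
  dir_name=$(sed -n 's/.*"decompressPackageName"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$ddl" | head -n 1)
  [ -n "$dir_name" ] && [ -d "$packages_dir/$dir_name" ]
}

# Example calls (paths as used in this guide):
#   ddl=/opt/datasophon/datasophon-manager-1.2.1/conf/meta/DDP-1.2.1/DS/service_ddl.json
#   check_jmx_ports "$ddl" /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9 || echo "JMX ports differ"
#   check_package_dir "$ddl" /opt/datasophon/DDP/packages || echo "package directory not found"
```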
Restart Services
Restart the DataSophon worker on each worker node:
sh /opt/datasophon/datasophon-worker/bin/datasophon-worker.sh restart worker
Restart the DataSophon API on the master node:
sh /opt/datasophon/datasophon-manager-1.2.1/bin/datasophon-api.sh restart api
Manually Create Database and Run SQL
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
use dolphinscheduler;
source /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9/tools/sql/sql/dolphinscheduler_mysql.sql
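The statements above are meant for an interactive mysql session. As a sketch, they could also be generated by a small function and piped to the mysql client. The function name ds_init_sql is hypothetical, and the example invocation assumes a local MySQL server, a root account, and a client that accepts the source command on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the initialization statements from this guide,
# with the SQL script path derived from the package directory.
ds_init_sql() {
  local pkg_dir="$1"
  cat <<EOF
CREATE DATABASE dolphinscheduler DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
use dolphinscheduler;
source ${pkg_dir}/tools/sql/sql/dolphinscheduler_mysql.sql
EOF
}

# Example (account and host are assumptions; adjust to your environment):
#   ds_init_sql /opt/datasophon/DDP/packages/dolphinscheduler-3.1.9 | mysql -uroot -p
```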
Install DolphinScheduler Service
Add Service
Select nodes for ApiServer, MasterServer, and AlertServer roles based on your environment. MasterServer can be deployed on multiple nodes.
Select nodes for WorkerServer role based on your environment. WorkerServer can be deployed on one or multiple nodes.
Modify relevant configurations as needed.
Verify Installation with JPS
| bigdata1 | bigdata2 | bigdata3 |
|---|---|---|
| MasterServer | MasterServer | |
| WorkerServer | WorkerServer | WorkerServer |
| ApiApplicationServer | | |
| AlertServer | | |
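The check against the table above can be scripted per node. The sketch below reads jps output on standard input and prints every expected process name that is absent; the function name missing_roles is hypothetical, and the expected names are the ones listed in the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each expected process name that does not appear
# in the second column of the jps output supplied on stdin.
missing_roles() {
  local jps_output role
  jps_output="$(cat)"
  for role in "$@"; do
    # awk exits 0 only if some line has the role as its second field.
    if ! printf '%s\n' "$jps_output" | awk -v r="$role" '$2 == r { found = 1 } END { exit !found }'; then
      echo "$role"
    fi
  done
}

# Example, on bigdata1 (expects all four roles):
#   jps | missing_roles MasterServer WorkerServer ApiApplicationServer AlertServer
```

An empty result means every expected process was found on that node.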