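DataX is the open-source version of Alibaba Cloud DataWorks Data Integration, an offline data synchronization tool/platform that is widely used inside Alibaba Group. DataX provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SQL Server, PostgreSQL, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), Hologres, DRDS, and OceanBase.
Open-source repository: alibaba/DataX (github.com)
DataX can exchange data between traditional databases and OceanBase, either directly or staged through data files (CSV).
This thread focuses on how to use it and on answering questions. Anyone who has used it is welcome to ask.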
Example 1: Export MySQL data to a file (CSV)
$ cat job/bmsql_oorder_mysql2csv.json
{
  "job": {
    "setting": {
      "speed": {
        "channel": 4
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.1
      }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "tpcc",
            "password": "Dg0gIexJAV",
            "column": [
              "*"
            ],
            "connection": [
              {
                "table": [
                  "bmsql_oorder"
                ],
                "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/tpccdb?useUnicode=true&characterEncoding=utf8"]
              }
            ]
          }
        },
        "writer": {
          "name": "txtfilewriter",
          "parameter": {
            "path": "/tmp/tpcc/bmsql_oorder",
            "fileName": "bmsql_oorder",
            "encoding": "UTF-8",
            "writeMode": "truncate",
            "dateFormat": "yyyy-MM-dd HH:mm:ss",
            "nullFormat": "\\N",
            "fileFormat": "csv",
            "fieldDelimiter": ","
          }
        }
      }
    ]
  }
}
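A job file like this is normally started through the bin/datax.py launcher that ships in the DataX install directory. Below is a minimal Python sketch of that step. It first parses the job file, so a JSON typo shows up before DataX starts, and then builds the launcher command. The install path passed in and the helper names (describe_job, datax_command, run_job) are assumptions made for illustration, and which Python version the launcher needs depends on the DataX release you have.

```python
import json
import subprocess
from pathlib import Path


def describe_job(job_path):
    """Parse the job file and return (reader name, writer name, channel count)."""
    with open(job_path, encoding="utf-8") as f:
        job = json.load(f)["job"]
    content = job["content"][0]
    channel = job["setting"]["speed"]["channel"]
    return content["reader"]["name"], content["writer"]["name"], channel


def datax_command(datax_home, job_path, python="python"):
    """Build the launcher command: <python> <datax_home>/bin/datax.py <job.json>."""
    return [python, str(Path(datax_home) / "bin" / "datax.py"), str(job_path)]


def run_job(datax_home, job_path):
    """Print a one-line summary of the job, run it, and return the exit code."""
    reader, writer, channel = describe_job(job_path)
    print(f"{reader} -> {writer}, channels: {channel}")
    # check=False: the caller inspects the exit code instead of catching an exception.
    result = subprocess.run(datax_command(datax_home, job_path), check=False)
    return result.returncode
```

For example, run_job("/opt/datax", "job/bmsql_oorder_mysql2csv.json") would run the export above, where /opt/datax stands in for wherever DataX is unpacked on your machine.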
Example 2: Load the CSV file into OceanBase
$ cat job/bmsql_oorder_csv2ob.json
{
  "job": {
    "setting": {
      "speed": {
        "channel": 4
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.1
      }
    },
    "content": [
      {
        "reader": {
          "name": "txtfilereader",
          "parameter": {
            "path": ["/tmp/tpcc/bmsql_oorder"],
            "fileName": "bmsql_oorder",
            "encoding": "UTF-8",
            "column": ["*"],
            "dateFormat": "yyyy-MM-dd HH:mm:ss",
            "nullFormat": "\\N",
            "fieldDelimiter": ","
          }
        },
        "writer": {
          "name": "oceanbasev10writer",
          "parameter": {
            "obWriteMode": "insert",
            "column": [
              "*"
            ],
            "preSql": [
              "truncate table bmsql_oorder"
            ],
            "connection": [
              {
                "jdbcUrl": "||_dsc_ob10_dsc_||obdemo:oboracle||_dsc_ob10_dsc_||jdbc:oceanbase://127.0.0.1:2883/tpcc?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
                "table": [
                  "bmsql_oorder"
                ]
              }
            ],
            "username": "tpcc",
            "password": "Dg0gIexJAV",
            "writerThreadCount": 10,
            "batchSize": 256,
            "memstoreThreshold": "0.9"
          }
        }
      }
    ]
  }
}
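Examples 1 and 2 stage the data through a CSV file, but the direct route mentioned at the top of the thread only needs the reader from Example 1 paired with the writer from Example 2. The Python sketch below generates such a job. It is an untested combination of the two configs above, not a job taken from the original post. The function name direct_job is hypothetical, and using one username and password for both sides is a simplifying assumption.

```python
import json


def direct_job(mysql_url, ob_url, table, username, password, channel=4):
    """Build a MySQL -> OceanBase job dict that skips the CSV staging step."""
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "username": username,
            "password": password,
            "column": ["*"],
            # As in Example 1, the reader's jdbcUrl is a list.
            "connection": [{"table": [table], "jdbcUrl": [mysql_url]}],
        },
    }
    writer = {
        "name": "oceanbasev10writer",
        "parameter": {
            "obWriteMode": "insert",
            "column": ["*"],
            "preSql": [f"truncate table {table}"],
            # As in Example 2, the writer's jdbcUrl is a plain string.
            "connection": [{"jdbcUrl": ob_url, "table": [table]}],
            "username": username,
            "password": password,
            "writerThreadCount": 10,
            "batchSize": 256,
            "memstoreThreshold": "0.9",
        },
    }
    setting = {
        "speed": {"channel": channel},
        "errorLimit": {"record": 0, "percentage": 0.1},
    }
    return {"job": {"setting": setting, "content": [{"reader": reader, "writer": writer}]}}


def dump_job(job):
    """Serialize the job dict to the JSON text that goes into a job file."""
    return json.dumps(job, indent=2)
```

Calling dump_job(direct_job(...)) with the two JDBC URLs from the examples and the table name bmsql_oorder gives the text for a job file such as job/bmsql_oorder_mysql2ob.json (a made-up file name).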
Example 3: Sync Oracle data to a `CSV` file
{
  "job": {
    "setting": {
      "speed": {
        "channel": 4
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0.1
      }
    },
    "content": [
      {
        "reader": {
          "name": "oraclereader",
          "parameter": {
            "username": "tpcc",
            "password": "********",
            "column": [
              "*"
            ],
            "connection": [
              {
                "table": [
                  "bmsql_oorder"
                ],
                "jdbcUrl": ["jdbc:oracle:thin:@172.17.0.5:1521:helowin"]
              }
            ]
          }
        },
        "writer": {
          "name": "txtfilewriter",
          "parameter": {
            "path": "/tmp/tpcc/bmsql_oorder",
            "fileName": "bmsql_oorder",
            "encoding": "UTF-8",
            "writeMode": "truncate",
            "dateFormat": "yyyy-MM-dd HH:mm:ss",
            "nullFormat": "\\N",
            "fileFormat": "csv",
            "fieldDelimiter": ","
          }
        }
      }
    ]
  }
}
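Before loading an export like this into OceanBase, it can help to check the row count and see how NULLs came out. The Python sketch below reads every file under the export directory whose name starts with the fileName value and turns the nullFormat marker back into None. It assumes that the writer's output file names begin with that prefix and that NULL columns are written as the literal marker, so check both against your actual output. The helper name read_export is hypothetical.

```python
import csv
import glob
import os

# The JSON string "\\N" in the job file decodes to a backslash followed by N.
NULL_MARKER = "\\N"


def read_export(directory, prefix, delimiter=","):
    """Read all exported files matching the prefix; map the NULL marker to None."""
    rows = []
    for path in sorted(glob.glob(os.path.join(directory, prefix + "*"))):
        with open(path, encoding="utf-8", newline="") as f:
            for record in csv.reader(f, delimiter=delimiter):
                rows.append([None if field == NULL_MARKER else field for field in record])
    return rows
```

With the settings above, len(read_export("/tmp/tpcc/bmsql_oorder", "bmsql_oorder")) gives the number of exported rows, which can be compared with a count(*) on the source table.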
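For incremental sync from MySQL to OceanBase, you can use canal (https://github.com/alibaba/canal/). For incremental sync from OceanBase to MySQL, wait for version 3.1.1, which will ship a CDC component.
In the Enterprise Edition, anything involving real-time incremental sync is handled by the OMS product.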
You can also take a look at our video:
https://www.bilibili.com/video/BV1db4y1h7XU?spmidfrom=333.999.0.0