aokunsang d1463d6ffa Fix the BsonNull issue | 3 months ago | |
---|---|---|
.mvn | 1 year ago | |
bin | 1 year ago | |
config | 1 year ago | |
docker-seatunnel-build | 5 months ago | |
docs | 1 year ago | |
plugins | 1 year ago | |
seatunnel-api | 1 year ago | |
seatunnel-common | 1 year ago | |
seatunnel-config | 1 year ago | |
seatunnel-connectors-v2 | 3 months ago | |
seatunnel-core | 1 year ago | |
seatunnel-dist | 1 year ago | |
seatunnel-e2e | 1 year ago | |
seatunnel-engine | 1 year ago | |
seatunnel-examples | 5 months ago | |
seatunnel-formats | 1 year ago | |
seatunnel-plugin-discovery | 1 year ago | |
seatunnel-shade | 1 year ago | |
seatunnel-transforms-v2 | 6 months ago | |
seatunnel-translation | 1 year ago | |
tools | 1 year ago | |
.asf.yaml | 1 year ago | |
.dlc.json | 1 year ago | |
.gitignore | 1 year ago | |
.gitmodules | 1 year ago | |
.licenserc.yaml | 1 year ago | |
.scalafmt.conf | 1 year ago | |
DISCLAIMER | 1 year ago | |
LICENSE | 1 year ago | |
NOTICE | 1 year ago | |
README.md | 1 year ago | |
docker-readme.txt | 5 months ago | |
generate_client_protocol.sh | 1 year ago | |
mvnw | 1 year ago | |
mvnw.cmd | 1 year ago | |
plugin-mapping.properties | 1 year ago | |
pom.xml | 1 year ago | |
release-note.md | 1 year ago |
SeaTunnel was formerly named Waterdrop and was renamed SeaTunnel on October 12, 2021.
SeaTunnel is an easy-to-use, high-performance distributed data integration platform that supports real-time synchronization of massive data volumes. It can stably and efficiently synchronize tens of billions of records per day, and it has been used in production by nearly 100 companies.
SeaTunnel focuses on data integration and data synchronization, and is mainly designed to solve common problems in the field of data integration:
The runtime process of SeaTunnel is shown in the figure above.
The user configures the job and selects the execution engine that the job is submitted to.
The Source Connector reads the data in parallel and sends it to the downstream Transform or directly to the Sink, and the Sink writes the data to the destination. Note that Source, Transform, and Sink can all be developed and extended by users.
The default engine used by SeaTunnel is SeaTunnel Engine. If you choose the Flink or Spark engine instead, SeaTunnel packages the Connector into a Flink or Spark program and submits it to Flink or Spark to run.
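The Source, Transform, and Sink flow described above can be pictured with a small Java sketch. It is only an illustration of the data flow: the class `PipelineSketch` and its `run` method are hypothetical names made up for this example and are not part of the SeaTunnel connector API, and the sketch runs in a single thread rather than in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Illustrative only: hypothetical types, NOT the SeaTunnel connector API.
class PipelineSketch {

    /**
     * Pulls every row from the source, applies the transform to it,
     * and hands the result to the sink. Returns the number of rows written.
     */
    static <I, O> int run(Iterable<I> source, Function<I, O> transform, Consumer<O> sink) {
        int written = 0;
        for (I row : source) {
            sink.accept(transform.apply(row));
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        // Source: an in-memory list standing in for an external system.
        List<String> source = Arrays.asList(" alice ", "bob ");
        // Sink: another in-memory list standing in for the destination.
        List<String> destination = new ArrayList<>();

        // Transform step: trim whitespace from each row.
        int written = run(source, String::trim, destination::add);
        System.out.println(written + " rows written: " + destination);
    }
}
```

To model the case where the Source sends data directly to the Sink, pass `Function.identity()` as the transform.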
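For the supported Source Connectors, check out the connector documentation.
For the supported Sink Connectors, check out the connector documentation.
For the supported Transforms, check out the transform documentation.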
Java runtime environment, Java >= 8
If you want to run SeaTunnel in a cluster environment, any of the following Spark cluster environments is usable:
If the data volume is small, or the goal is merely functional verification, you can also start in local mode without a cluster environment, because SeaTunnel supports standalone operation. Note: SeaTunnel 2.0 supports running on Spark and Flink.
Follow this document.
Download the ready-to-run binary package from: https://seatunnel.apache.org/download
SeaTunnel Engine https://seatunnel.apache.org/docs/start-v2/locally/quick-start-seatunnel-engine/
Spark https://seatunnel.apache.org/docs/start-v2/locally/quick-start-spark
Flink https://seatunnel.apache.org/docs/start-v2/locally/quick-start-flink
Weibo's business teams use an internally customized version of SeaTunnel, together with its sub-project Guardian (which monitors SeaTunnel on Yarn tasks), for hundreds of real-time streaming computing tasks.
Sina Data Operation Analysis Platform uses SeaTunnel to perform real-time and offline analysis of data operation and maintenance for Sina News, CDN, and other services, and writes the results into ClickHouse.
Sogou Qiqian System takes SeaTunnel as an ETL tool to help establish a real-time data warehouse system.
Qutoutiao Data Center uses SeaTunnel for MySQL-to-Hive offline ETL tasks and real-time Hive-to-ClickHouse backfill, which covers most of its offline and real-time task needs.
Yixia Technology, Yizhibo Data Platform
Yonghui Superstores Founders' Alliance-Yonghui Yunchuang Technology, Member E-commerce Data Analysis Platform
SeaTunnel provides real-time streaming and offline SQL computing of e-commerce user behavior data for Yonghui Life, a new retail brand of Yonghui Yunchuang Technology.
Shuidichou uses SeaTunnel for real-time streaming and scheduled offline batch processing on Yarn, handling an average of 3 to 4 TB of data per day and then writing the data to ClickHouse.
Various logs from business services are collected into Apache Kafka; some of the data in Apache Kafka is consumed and extracted through SeaTunnel and then stored in ClickHouse.
For more use cases, please refer to: https://seatunnel.apache.org/blog
This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please follow the REPORTING GUIDELINES to report unacceptable behavior.
Thanks to all developers!
Send an email to dev-subscribe@seatunnel.apache.org and follow the reply to subscribe to the mailing list.
SeaTunnel enriches the CNCF CLOUD NATIVE Landscape.
Various companies and organizations use SeaTunnel for research, production and commercial products. Visit our website to find the user page.