Apache Flink Ecosystem
The Flink ecosystem consists of the tools, services, APIs, and libraries that support analytical processing. The diagram below summarizes it:
Storage / Streaming
Unlike Hadoop, which includes HDFS as its storage system, Flink does not have a built-in storage component. Instead, Flink programs implement transformations on distributed collections. These collections are created from external sources, and the results are written to sinks, which can store data in distributed files or other storage systems.
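To make the source, transformation, sink flow concrete, here is a minimal sketch in plain Python. It is not Flink's API: the names `source`, `transform`, `count_by_key`, `ListSink`, and `run_pipeline` are all hypothetical, and an in-memory list stands in for both the external source and the external storage system. It only illustrates that the program itself holds no storage; data comes in from outside and results go back out.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def source(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical source: yields records read from an external collection."""
    for line in lines:
        yield line


def transform(records: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Transformation: split each record into lowercase words, emit (word, 1)."""
    for record in records:
        for word in record.lower().split():
            yield (word, 1)


def count_by_key(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Transformation: group the pairs by word and sum their counts."""
    counts: Dict[str, int] = {}
    for key, value in pairs:
        counts[key] = counts.get(key, 0) + value
    return counts


class ListSink:
    """Hypothetical sink: stands in for a file or another storage system."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, int]] = []

    def write(self, row: Tuple[str, int]) -> None:
        self.rows.append(row)


def run_pipeline(lines: Iterable[str], sink: ListSink) -> ListSink:
    """Wire source -> transformations -> sink; the pipeline stores nothing itself."""
    counts = count_by_key(transform(source(lines)))
    for word in sorted(counts):
        sink.write((word, counts[word]))
    return sink


if __name__ == "__main__":
    result = run_pipeline(["Flink reads streams", "Flink writes streams"], ListSink())
    for row in result.rows:
        print(row)
```

In a real Flink job the source and sink would be connectors to external systems, and the transformations would run in parallel across the cluster rather than in a single Python process.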
Deployment Layer
Flink programs can be deployed in multiple ways, depending on the execution environment. Flink supports:
- Local Mode: Runs on a single machine within a single JVM.
- Cluster Mode: Supports multi-node distributed execution across multiple JVM instances.
- Cloud Mode: Can be deployed on cloud platforms like AWS (Amazon Web Services) and GCP (Google Cloud Platform).
Kernel / Core Layer
Often referred to as the Kernel, this core layer is Flink’s runtime execution engine. It provides key functionalities, including:
- Reliable and fault-tolerant distributed processing
- Native iterative processing capabilities
- High scalability and real-time data streaming
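The fault-tolerance point above can be illustrated with a toy example. Flink recovers from failures by restoring state from periodic checkpoints; the sketch below imitates that idea in a single Python process and is not Flink's implementation or API. `CheckpointStore`, `process`, `checkpoint_every`, and `fail_at` are hypothetical names chosen for illustration, and the "state" is just a running sum plus the offset of the next record to read.

```python
from typing import List, Optional


class CheckpointStore:
    """Hypothetical durable store holding the last snapshot (offset, total)."""

    def __init__(self) -> None:
        self.offset = 0
        self.total = 0

    def save(self, offset: int, total: int) -> None:
        self.offset = offset
        self.total = total


def process(
    events: List[int],
    store: CheckpointStore,
    checkpoint_every: int = 2,
    fail_at: Optional[int] = None,
) -> int:
    """Sum events, snapshotting state every `checkpoint_every` records.

    On start, state is restored from the last snapshot, so records that were
    processed after that snapshot are replayed without being counted twice.
    """
    offset, total = store.offset, store.total  # restore from last checkpoint
    for i in range(offset, len(events)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated failure")  # crash before record i
        total += events[i]
        if (i + 1) % checkpoint_every == 0:
            store.save(i + 1, total)  # snapshot: next offset and running sum
    store.save(len(events), total)
    return total


if __name__ == "__main__":
    events = [1, 2, 3, 4, 5]
    store = CheckpointStore()
    try:
        process(events, store, checkpoint_every=2, fail_at=3)
    except RuntimeError:
        pass  # the job "crashed"; only the checkpointed state survives
    print(process(events, store, checkpoint_every=2))
```

The first run fails partway through, after one snapshot has been taken. The second run resumes from that snapshot and still produces the same total as an uninterrupted run, which is the property that checkpoint-based recovery is meant to provide.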
Together, these layers make Apache Flink a scalable, fault-tolerant framework for stream processing.