StreamingPro


Declarative workflows for building Spark Streaming applications

Spark Streaming
Spark Streaming is an extension of the core Spark API that enables stream processing from a variety of sources. Spark is an extensible and programmable framework for large-scale distributed processing of datasets, which it represents as Resilient Distributed Datasets (RDDs). Spark Streaming receives input data streams and divides the data into batches, which are then processed by the Spark engine to generate the results. Spark Streaming data is organized into DStreams (discretized streams), each represented internally as a sequence of RDDs.
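
The batching idea described above can be illustrated without Spark. The following is a minimal pure-Python sketch, not Spark's API: the names micro_batches, word_count and the sample events are hypothetical, and the 1-second batch interval is an assumption chosen for the example. Each yielded list plays the role of one batch (one RDD in a DStream), and the per-batch function stands in for the processing done by the Spark engine.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (arrival time in seconds, payload); events are assumed to be time-ordered.
Event = Tuple[float, str]


def micro_batches(events: Iterable[Event], batch_interval: float) -> Iterator[List[str]]:
    """Group time-ordered events into consecutive fixed-length intervals."""
    batch: List[str] = []
    window_end = batch_interval
    for arrival, payload in events:
        # Close every interval that ended before this event arrived.
        # Intervals without data still produce an (empty) batch.
        while arrival >= window_end:
            yield batch
            batch = []
            window_end += batch_interval
        batch.append(payload)
    # Emit the final, possibly partial, batch.
    yield batch


def word_count(batch: List[str]) -> Dict[str, int]:
    """Per-batch computation: count words in the lines of one batch."""
    counts: Dict[str, int] = {}
    for line in batch:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical input stream.
events = [(0.2, "a b"), (0.9, "a"), (1.5, "b c"), (3.1, "c")]

# One result per batch interval, in order.
results = [word_count(b) for b in micro_batches(events, batch_interval=1.0)]
print(results)
```

The same function is applied to every batch independently, which is the property that lets a batch engine be reused for stream processing.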

StreamingPro

StreamingPro is not a complete application, but rather an extensible and programmable framework for Spark Streaming (it also covers Spark and Storm) that can easily be used to build your streaming application.
StreamingPro also makes it possible to build a streaming program simply by assembling components (e.g. a SQL component) in a configuration file.
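
To make the idea of assembling components in a configuration file more concrete, here is a small pure-Python sketch. It is not StreamingPro's configuration schema or code: the JSON layout, the component names (source, sql, collect), the REGISTRY table and run_workflow are all assumptions made for illustration, and SQLite from the standard library stands in for Spark SQL.

```python
import json
import sqlite3
from typing import Any, Callable, Dict, List

# Hypothetical configuration: the keys and layout are illustrative only,
# not StreamingPro's actual format.
CONFIG = json.loads("""
{
  "workflow": [
    {"component": "source",
     "params": {"rows": [["GET", 200], ["GET", 500], ["POST", 200]]}},
    {"component": "sql",
     "params": {"query": "SELECT method, COUNT(*) FROM input GROUP BY method ORDER BY method"}},
    {"component": "collect", "params": {}}
  ]
}
""")


def source(_data: Any, params: Dict[str, Any]) -> List[tuple]:
    """Produce one batch of (method, status) records from the config."""
    return [tuple(row) for row in params["rows"]]


def sql(data: List[tuple], params: Dict[str, Any]) -> List[tuple]:
    """Load the batch into a temporary table named 'input' and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE input (method TEXT, status INTEGER)")
        conn.executemany("INSERT INTO input VALUES (?, ?)", data)
        return conn.execute(params["query"]).fetchall()
    finally:
        conn.close()


def collect(data: List[tuple], _params: Dict[str, Any]) -> List[tuple]:
    """Sink component: return the rows to the caller."""
    return list(data)


# Maps the component names used in the config to their implementations.
REGISTRY: Dict[str, Callable[[Any, Dict[str, Any]], Any]] = {
    "source": source,
    "sql": sql,
    "collect": collect,
}


def run_workflow(config: Dict[str, Any]) -> Any:
    """Run the configured components in order, feeding each one's output to the next."""
    data: Any = None
    for step in config["workflow"]:
        component = REGISTRY[step["component"]]
        data = component(data, step.get("params", {}))
    return data


output = run_workflow(CONFIG)
print(output)
```

In this sketch, changing the query or reordering the steps only requires editing the configuration, not the Python code, which is the property a declarative workflow aims for.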

Features

  • Pure Spark Streaming (or plain Spark) programs (Storm in the future)
  • No coding needed, only declarative workflows
  • REST API for interactive use
  • Support for SQL-oriented workflows
  • Data continuously streamed in and processed in near real time
  • Dynamic CRUD of workflows at runtime via the REST API
  • Flexible workflows (input, output, parsers, etc.)
  • High performance
  • Scalable

Documents

Architecture

[Image: Snip20160510_3.png]

Declarative workflows

[Image: Snip20160510_4.png]

Implementation

[Image: Snip20160510_1.png]