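This write-up comes from a personal learning exercise: I installed Superset on Windows and embedded it into a back-office system. Below I summarize the process and share the results for reference. If you have better ideas, feel free to leave a comment.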
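Python 3.4 or later is recommended; Python 2.7 runs into various encoding problems on Windows. https://www.python.org/downloads/release/python-350/
Download the "Windows x86-64 executable installer". The exe installer is all you need; during installation, tick the option that adds Python to the PATH environment variable.
Check: in CMD, run python -V and pip -V in turn. If either command is not found, add the Python installation directory (and its Scripts subdirectory, where pip lives) to the PATH environment variable.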
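This step is optional; to install directly, skip to step 4. Because Superset pulls in many components, it is best to give it its own Python environment with virtualenv.
When developing Python applications, the system has only one installed version of Python 3, and pip installs every third-party package into that Python 3's site-packages directory.
If we develop several applications at once, they all share that one system-wide Python 3. What if application A needs jinja 2.7 while application B needs jinja 2.6?
In that case each application needs its own "independent" Python runtime environment. virtualenv exists to create such an "isolated" Python runtime environment for an application.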
Install command:
pip install virtualenv
**2.3 Using virtualenv**
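First create the directory d:\pythonVir on drive D.
Then create the environment and activate it:
cd /d d:\pythonVir
virtualenv env
rem wait for initialization to finish, then activate:
env\Scripts\activate
After activation the prompt looks like the figure below. Note the (env) marker at the left of the command line: all subsequent operations take effect inside env and do not affect the system-wide Python environment.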
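Besides looking for the (env) marker, the active environment can be confirmed from inside Python. The following is only a minimal sketch; the function name `in_virtualenv` is made up for this illustration and is not part of virtualenv.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and the
    # standard-library venv module) make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("interpreter prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtualenv())
```

Run from the activated prompt, the printed prefix should point into d:\pythonVir\env rather than the system-wide installation.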
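Some of the libraries Superset depends on have to be compiled with Microsoft Visual C++ 2010.
According to the documentation, installing the Visual C++ 2015 Build Tools should also work (Python 3.5 and later are themselves built with Visual C++ 2015, so their extension modules need that toolchain):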
http://landinghub.visualstudio.com/visual-cpp-build-tools
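This is a big pitfall. My earlier attempts to install superset directly kept failing with an error that sasl.h could not be found.
The fix is to download the matching prebuilt wheel from http://www.lfd.uci.edu/~gohlke/pythonlibs/#sasl
For example, if the installed Python is 3.6 and the system is 64-bit, download sasl-0.2.1-cp36-cp36m-win_amd64.whl.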
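In addition, if "Failed building wheel for xxx" appears during installation, handle it as follows:
Cause: the corresponding whl file is missing.
Fix: download and install the matching whl file.
For example, if "Failed building wheel for python_geohash" appears, download the python_geohash file built for your Python version.
I use Python 3.6, so I located and downloaded python_geohash-0.8.5-cp36-cp36m-win_amd64.whl.
To install it:
pip install F:\python_geohash-0.8.5-cp36-cp36m-win_amd64.whl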
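Picking the wrong wheel file is an easy mistake, so here is a small sketch of how the file name suffix relates to the interpreter. The helper names `expected_wheel_suffix` and `wheel_matches` are invented for this example, and it only covers CPython on Windows.

```python
import struct
import sys


def expected_wheel_suffix(major, minor, is_64bit):
    """Build the CPython-on-Windows tag that a matching .whl file name ends with."""
    python_tag = "cp{}{}".format(major, minor)
    # CPython up to 3.7 appends "m" to the ABI tag; 3.8 and later do not.
    abi_tag = python_tag + ("m" if (major, minor) < (3, 8) else "")
    platform_tag = "win_amd64" if is_64bit else "win32"
    return "{}-{}-{}".format(python_tag, abi_tag, platform_tag)


def wheel_matches(filename, major, minor, is_64bit):
    """Check whether a wheel file name fits the given interpreter."""
    return filename.endswith(expected_wheel_suffix(major, minor, is_64bit) + ".whl")


if __name__ == "__main__":
    # A pointer is 8 bytes in a 64-bit interpreter and 4 bytes in a 32-bit one.
    is_64bit = struct.calcsize("P") == 8
    print(expected_wheel_suffix(sys.version_info.major, sys.version_info.minor, is_64bit))
```

Note that what counts is whether the Python interpreter is 32-bit or 64-bit, not the operating system: a 32-bit Python on 64-bit Windows still needs win32 wheels.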
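1) With the prerequisites in place, install superset.
pip install superset
Screen on success:
2) Create the administrator account:
fabmanager create-admin --app superset
The process looks like this:
3) Initialize the database (on Windows, first change into lib\site-packages\superset\bin under the Python installation directory, or under the virtualenv directory)
Run:
python superset db upgrade
4) Load the examples (all remaining commands must also be run from lib\site-packages\superset\bin)
python superset load_examples
5) Initialize roles and permissions
python superset init
6) Start the service. The default port is 8088; use -p to change it.
python superset runserver -d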
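Steps 3 to 6 all have to be run from superset's bin directory, which is easy to get wrong when a virtualenv is involved. As a rough aid, the sketch below prints where that directory should be for whichever interpreter runs the script. `superset_bin_dir` is a name made up for this example, and the superset\bin layout is taken from the steps above.

```python
import os
import sysconfig


def superset_bin_dir(site_packages=None):
    """Return the expected location of superset's bin directory."""
    if site_packages is None:
        # "purelib" is the site-packages directory of the running interpreter.
        site_packages = sysconfig.get_paths()["purelib"]
    return os.path.join(site_packages, "superset", "bin")


if __name__ == "__main__":
    path = superset_bin_dir()
    print(path)
    print("exists:", os.path.isdir(path))
```

Running it with the virtualenv activated gives the path under env; running it without gives the path under the system-wide Python.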
Superset uses SQLite by default. It supports the following databases:
| Database | Driver installation | SQLAlchemy URI prefix |
| --- | --- | --- |
| MySQL | pip install mysqlclient | mysql:// |
| Postgres | pip install psycopg2 | postgresql+psycopg2:// |
| Presto | pip install pyhive | presto:// |
| Oracle | pip install cx_Oracle | oracle:// |
| sqlite | included by default | sqlite:// |
| Redshift | pip install sqlalchemy-redshift | postgresql+psycopg2:// |
| MSSQL | pip install pymssql | mssql:// |
| Impala | pip install impyla | impala:// |
| SparkSQL | pip install pyhive | jdbc+hive:// |
| Greenplum | pip install psycopg2 | postgresql+psycopg2:// |
| Athena | pip install "PyAthenaJDBC>1.0.9" | awsathena+jdbc:// |
| Vertica | pip install sqlalchemy-vertica-python | vertica+vertica_python:// |
| ClickHouse | pip install sqlalchemy-clickhouse | clickhouse:// |
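Once the database driver has been installed with pip, the corresponding data source can be configured in the web UI.
For the connection string format, see: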
http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html#database-urls
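As a rough illustration of that format, the sketch below assembles a connection string of the form prefix://user:password@host:port/database. The function name `build_database_uri` and all sample values are hypothetical; the one detail worth noting is that special characters in the user name or password must be URL-encoded.

```python
from urllib.parse import quote_plus


def build_database_uri(prefix, user, password, host, port, database):
    """Assemble a SQLAlchemy-style database URL from its parts.

    prefix is the dialect part before "://", for example "mysql" or
    "postgresql+psycopg2".
    """
    return "{}://{}:{}@{}:{}/{}".format(
        prefix,
        quote_plus(user),      # escape characters such as "@" or "/"
        quote_plus(password),
        host,
        port,
        database,
    )


if __name__ == "__main__":
    # Sample values only; replace them with your own connection details.
    print(build_database_uri("mysql", "superset", "p@ss/word", "localhost", 3306, "demo"))
```

The resulting string is what goes into the SQLAlchemy URI field when adding a database in the web UI.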
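After logging in to superset, we can configure our own local data sources, then query and visualize the data.
Edit superset's config.py and change PUBLIC_ROLE_LIKE_GAMMA to True.
What the comment on that setting means:
Grant the Public role the same set of permissions as the GAMMA role.
This is useful when anonymous users should be able to view dashboards; access to the specific datasets shown on them still has to be granted explicitly.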
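For reference, the change amounts to a single line. The fragment below assumes it is placed in a superset_config.py override file on PYTHONPATH, which Superset loads on top of its packaged config.py. That is an alternative to editing config.py inside site-packages, not what was done in this write-up.

```python
# superset_config.py (override file on PYTHONPATH; an assumption of this sketch)

# Give the built-in Public role the same permissions as the Gamma role,
# so that anonymous visitors can open charts and dashboards.
PUBLIC_ROLE_LIKE_GAMMA = True
```

The role permissions are probably only synchronized when the init step from the installation section is run again, so it is worth rerunning it after changing the setting.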
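This is to avoid cross-site access problems with the iframe.
Of the permissions involved:
- can explore on Superset: exports the chart
- can explore json on Superset: exports the chart's JSON
- all database access on all_database_access: permission to access all databases; access can also be granted per database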
<iframe
width="600"
height="400"
seamless
frameBorder="0"
scrolling="no"
src="http://127.0.0.1:8088/superset/explore/?form_data=%7B%22datasource%22%3A%223__table%22%2C%22viz_type%22%3A%22line%22%2C%22slice_id%22%3A63%2C%22granularity_sqla%22%3A%22ds%22%2C%22time_grain_sqla%22%3Anull%2C%22since%22%3A%22100+years+ago%22%2C%22until%22%3A%22now%22%2C%22metrics%22%3A%5B%7B%22aggregate%22%3A%22SUM%22%2C%22column%22%3A%7B%22column_name%22%3A%22num_california%22%2C%22expression%22%3A%22CASE+WHEN+state+%3D+%27CA%27+THEN+num+ELSE+0+END%22%7D%2C%22expressionType%22%3A%22SIMPLE%22%2C%22label%22%3A%22SUM%28num_california%29%22%7D%5D%2C%22adhoc_filters%22%3Anull%2C%22groupby%22%3A%5B%22name%22%5D%2C%22limit%22%3A%2210%22%2C%22timeseries_limit_metric%22%3A%7B%22aggregate%22%3A%22SUM%22%2C%22column%22%3A%7B%22column_name%22%3A%22num_california%22%2C%22expression%22%3A%22CASE+WHEN+state+%3D+%27CA%27+THEN+num+ELSE+0+END%22%7D%2C%22expressionType%22%3A%22SIMPLE%22%2C%22label%22%3A%22SUM%28num_california%29%22%7D%2C%22order_desc%22%3Atrue%2C%22contribution%22%3Afalse%2C%22row_limit%22%3A50000%2C%22color_scheme%22%3A%22bnbColors%22%2C%22show_brush%22%3A%22auto%22%2C%22show_legend%22%3Atrue%2C%22rich_tooltip%22%3Atrue%2C%22show_markers%22%3Afalse%2C%22line_interpolation%22%3A%22linear%22%2C%22x_axis_label%22%3A%22%22%2C%22bottom_margin%22%3A%22auto%22%2C%22x_ticks_layout%22%3A%22auto%22%2C%22x_axis_format%22%3A%22smart_date%22%2C%22x_axis_showminmax%22%3Afalse%2C%22y_axis_label%22%3A%22%22%2C%22left_margin%22%3A%22auto%22%2C%22y_axis_showminmax%22%3Afalse%2C%22y_log_scale%22%3Afalse%2C%22y_axis_format%22%3A%22.3s%22%2C%22y_axis_bounds%22%3A%5Bnull%2Cnull%5D%2C%22rolling_type%22%3A%22None%22%2C%22time_compare%22%3A%5B%5D%2C%22num_period_compare%22%3A%22%22%2C%22period_ratio_type%22%3A%22growth%22%2C%22resample_how%22%3Anull%2C%22resample_rule%22%3Anull%2C%22resample_fillmethod%22%3Anull%2C%22annotation_layers%22%3A%5B%5D%2C%22compare_lag%22%3A%2210%22%2C%22compare_suffix%22%3A%22o10Y%22%2C%22markup_type%22%3A%22markdown%22%2C%22metric%22%3A%22sum__num%22%2C%22where%22%3A%22%22%2C%22url_params%22%3A%7B%7D%7D&standalone=true&height=400"
>
</iframe>
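The result looks like this:
Why redirect? The main purpose is to hide superset's chart links behind the back-office application, so that they cannot be found by scanning and then abused. The back-office application only needs to expose its own permission-controlled request URL that redirects to the superset chart link; this keeps the data from leaking.
Backend code:
The link /chart/getDemoDashboardUrl can then be placed under the back-office application's permission management.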
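The idea of a permission-checked redirect can be sketched as a small WSGI application using only the Python standard library. This is not the backend code of the original system: the role names, the `app.user_roles` environ key (assumed to be filled in by whatever authentication layer sits in front) and the abbreviated form_data are all assumptions made for illustration.

```python
import json
from urllib.parse import urlencode

SUPERSET_BASE_URL = "http://127.0.0.1:8088"   # host used in the iframe example
DEMO_FORM_DATA = {"slice_id": 63}             # abbreviated; use the full JSON in practice
ALLOWED_ROLES = {"admin", "analyst"}          # hypothetical role names


def superset_chart_url(form_data):
    """Build the standalone explore URL for the given form_data."""
    query = urlencode({
        "form_data": json.dumps(form_data),
        "standalone": "true",
        "height": 400,
    })
    return SUPERSET_BASE_URL + "/superset/explore/?" + query


def application(environ, start_response):
    """Redirect /chart/getDemoDashboardUrl to Superset for permitted users only."""
    if environ.get("PATH_INFO") != "/chart/getDemoDashboardUrl":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    # Assumption: an upstream authentication layer stored the user's roles here.
    roles = set(environ.get("app.user_roles", ()))
    if not roles & ALLOWED_ROLES:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"forbidden"]
    start_response("302 Found", [("Location", superset_chart_url(DEMO_FORM_DATA))])
    return [b""]
```

The iframe's src would then point at /chart/getDemoDashboardUrl on the back-office application instead of at Superset directly. For a quick local trial, the application can be served with wsgiref.simple_server.make_server from the standard library.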
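The steps above are enough to embed superset charts into a back-office application, but how can parameters be passed in? I have written up the procedure here; follow the experimental example to see how it works.
A look at the link that superset exposes for a chart shows that the parameters are passed as JSON, as follows: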
form_data={"datasource":"3__table","viz_type":"line","slice_id":63,"granularity_sqla":"ds","time_grain_sqla":null,"since":"100 years ago","until":"now","metrics":[{"aggregate":"SUM","column":{"column_name":"num_california","expression":"CASE WHEN state = 'CA' THEN num ELSE 0 END"},"expressionType":"SIMPLE","label":"SUM(num_california)"}],"adhoc_filters":[{"expressionType":"SIMPLE","subject":"gender","operator":"==","comparator":"boy","clause":"WHERE","sqlExpression":null,"fromFormData":true,"filterOptionName":"filter_gtzm93u9ocq_9sy5vd5ocfg"},{"expressionType":"SIMPLE","subject":"name","operator":"LIKE","comparator":"Aaron","clause":"WHERE","sqlExpression":null,"fromFormData":true,"filterOptionName":"filter_6cgdixdoh3_5wrgyuorwoa"}],"groupby":["name"],"limit":"10","timeseries_limit_metric":{"aggregate":"SUM","column":{"column_name":"num_california","expression":"CASE WHEN state = 'CA' THEN num ELSE 0 END"},"expressionType":"SIMPLE","label":"SUM(num_california)"},"order_desc":true,"contribution":false,"row_limit":50000,"color_scheme":"bnbColors","show_brush":"auto","show_legend":true,"rich_tooltip":true,"show_markers":false,"line_interpolation":"linear","x_axis_label":"","bottom_margin":"auto","x_ticks_layout":"auto","x_axis_format":"smart_date","x_axis_showminmax":false,"y_axis_label":"","left_margin":"auto","y_axis_showminmax":false,"y_log_scale":false,"y_axis_format":".3s","y_axis_bounds":[null,null],"rolling_type":"None","time_compare":[],"num_period_compare":"","period_ratio_type":"growth","resample_how":null,"resample_rule":null,"resample_fillmethod":null,"annotation_layers":[],"compare_lag":"10","compare_suffix":"o10Y","markup_type":"markdown","metric":"sum__num","where":"","url_params":{}}
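The highlighted part, the entries under adhoc_filters, is the parameter configuration of the filter conditions, so I extracted it and changed the code accordingly:
The modified page looks like this:
Enter the conditions (name: Amy, gender: girl) and click Query; the result is as follows: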
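To make the parameter passing concrete, here is a minimal sketch of how the filter values from a query form could be turned into adhoc_filters entries and a chart URL. The function names are invented for this example. The filter entries omit the filterOptionName and fromFormData keys seen in the JSON above, on the assumption that those are only bookkeeping for Superset's own UI; copy them over if your version turns out to need them.

```python
import json
from urllib.parse import urlencode


def simple_filter(subject, operator, comparator):
    """One adhoc_filters entry in the shape shown in the form_data above."""
    return {
        "expressionType": "SIMPLE",
        "subject": subject,
        "operator": operator,
        "comparator": comparator,
        "clause": "WHERE",
        "sqlExpression": None,
    }


def build_form_data(base_form_data, name=None, gender=None):
    """Copy the chart's form_data and set its filters from the query inputs."""
    form_data = dict(base_form_data)  # shallow copy; the template stays unchanged
    filters = []
    if gender:
        filters.append(simple_filter("gender", "==", gender))
    if name:
        filters.append(simple_filter("name", "LIKE", name))
    form_data["adhoc_filters"] = filters
    return form_data


def chart_url(form_data, host="http://127.0.0.1:8088"):
    """URL-encode form_data into a standalone explore link."""
    query = urlencode({
        "form_data": json.dumps(form_data),
        "standalone": "true",
        "height": 400,
    })
    return host + "/superset/explore/?" + query


if __name__ == "__main__":
    # Abbreviated template; in practice start from the chart's full form_data.
    template = {"datasource": "3__table", "viz_type": "line", "slice_id": 63}
    print(chart_url(build_form_data(template, name="Amy", gender="girl")))
```

The printed URL can be used as the iframe's src, or returned by the redirecting backend endpoint, so that each query reloads the chart with the new filters.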
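9. Conclusion
The above is a write-up of a preliminary study of the superset visualization tool, and the samples are all very rough. To use this in a real project, one would follow the principles described above to design an extensible, easy-to-use architecture and polish it into a configurable product tool. I will leave that as a teaser and not elaborate here. If you have good ideas, feel free to leave a comment.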