Summary: Added limited support for visualizations with Pinot via SQLAlchemy.

Pinot QL (PQL) is a bit weird and limited, and this patch hacks Superset to deal with that weirdness:

1. Pinot's grouping by time is best done as a long epoch. Grouping by a time string is really slow and times out.
2. Pinot's response does not respect column aliases, so columns are not named what they are expected to be. We remember the given column aliases and then stamp them back onto the dataframe.
3. Pinot's JSON REST call does not return the output types; everything is cast to string. So when grouping by time, the group key is integral and has to be treated specially when casting back to the dataframe's __timestamp column.
4. Finally, Pinot does support grouping by expressions, but those expressions cannot then appear in the select clause. They are returned in the response regardless. That is, 'select foo, count(*) from bar group by foo' is okay, but 'select expr(foo), count(*) from bar group by expr(foo)' is not. One must use 'select count(*) from bar group by expr(foo)'.

I also fixed a couple of things that looked like bugs to me: for example, the row-ordering-limit should always come at the end.

Test Plan: Tested with the modified pinotdb SQLAlchemy driver and an internal Pinot cluster. The pinotdb driver changes are in https://github.com/agrawaldevesh/pinot-dbapi.

Pinot does not support orderby-limit for aggregated queries. To annotate a query as an aggregate query, this patch adds a hint to the prepared select statement that the pinotdb SQLAlchemy driver then heeds.
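Points 2 and 3 of the commit summary (restoring column aliases and casting the string-typed time key) can be illustrated with a minimal sketch. This is not the code from the patch: the function name `restamp_rows`, the row layout, and the assumption that the epoch is expressed in seconds are all hypothetical. Only the `__timestamp` column name comes from the summary.

```python
from datetime import datetime, timezone


def restamp_rows(rows, aliases, time_alias="__timestamp"):
    """Re-attach the requested aliases to positional, string-typed rows.

    rows:    list of rows, each a list of strings (assumed response shape).
    aliases: column aliases in the order they were requested.
    """
    records = []
    for row in rows:
        # Stamp the remembered aliases back onto the positional values.
        record = dict(zip(aliases, row))
        if time_alias in record:
            # The time group key arrives as a string holding an integer
            # epoch; this sketch assumes seconds since 1970-01-01 UTC.
            record[time_alias] = datetime.fromtimestamp(
                int(record[time_alias]), tz=timezone.utc
            )
        records.append(record)
    return records


if __name__ == "__main__":
    print(restamp_rows([["1546300800", "42"]], ["__timestamp", "count"]))
```

Other columns are left as strings here; a fuller version would need type information from somewhere else, since the summary notes that the response does not carry it.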
Superset

Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application.
[This project was previously named Caravel, and before that Panoramix.]
Screenshots & Gifs
View Dashboards

Slice & dice your data

Query and visualize your data with SQL Lab

Visualize geospatial data with deck.gl

Choose from a wide array of visualizations

Apache Superset
Apache Superset is a data exploration and visualization web application.
Superset provides:
- An intuitive interface to explore and visualize datasets, and create interactive dashboards.
- A wide array of beautiful visualizations to showcase your data.
- Easy, code-free user flows to drill down and slice and dice the data underlying exposed dashboards. The dashboards and charts act as a starting point for deeper analysis.
- A state-of-the-art SQL editor/IDE exposing a rich metadata browser, and an easy workflow to create visualizations out of any result set.
- An extensible, high-granularity security model allowing intricate rules on who can access which product features and datasets. Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, ...)
- A lightweight semantic layer that lets you control how data sources are exposed to the user by defining dimensions and metrics
- Out-of-the-box support for most SQL-speaking databases
- Deep integration with Druid, which allows Superset to stay blazing fast while slicing and dicing large, real-time datasets
- Fast loading dashboards with configurable caching
Database Support
Superset speaks many SQL dialects through SQLAlchemy, a Python SQL toolkit and ORM that is compatible with most common databases.
Superset can be used to visualize data from most databases:
- MySQL
- Postgres
- Vertica
- Oracle
- Microsoft SQL Server
- SQLite
- Greenplum
- Firebird
- MariaDB
- Sybase
- IBM DB2
- Exasol
- MonetDB
- Snowflake
- Redshift
- Clickhouse
- Apache Kylin
- More! Look for the availability of a SQLAlchemy dialect for your database to find out whether it will work with Superset.
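As a rough way to act on the last item in the list above, the sketch below asks SQLAlchemy whether it can load a dialect for a given connection URI. The helper name `has_dialect` and the example URIs are illustrative assumptions. A loadable dialect does not guarantee that the underlying database driver is installed or that every Superset feature works with that database.

```python
from sqlalchemy.engine.url import make_url
from sqlalchemy.exc import NoSuchModuleError


def has_dialect(uri: str) -> bool:
    """Return True if SQLAlchemy can find a dialect for the URI's scheme."""
    try:
        # get_dialect() resolves the dialect class without connecting.
        make_url(uri).get_dialect()
    except NoSuchModuleError:
        return False
    return True


if __name__ == "__main__":
    # "nosuchdb" is a made-up scheme used only to show the negative case.
    for uri in ("sqlite://", "nosuchdb://localhost/example"):
        print(uri, has_dialect(uri))
```

Third-party dialects usually register themselves when their package is installed, so running a check like this after installing a dialect package is a quick first test before adding the database in Superset.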
Apache Druid (Incubating)!
On top of having the ability to query your relational databases, Superset ships with deep integration with Druid (a real-time distributed column store). When querying Druid, Superset can query humongous amounts of data on top of real-time datasets. Note that Superset does not require Druid in any way to function; it is simply another database backend that it can query.
Here's a description of Druid from the http://druid.io website:
Druid is an open-source analytics data store designed for business intelligence (OLAP) queries on event data. Druid provides low latency (real-time) data ingestion, flexible data exploration, and fast data aggregation. Existing Druid deployments have scaled to trillions of events and petabytes of data. Druid is best used to power analytic dashboards and applications.
Installation & Configuration
Resources
- Mailing list
- Gitter (live chat) Channel
- Docker image (community contributed)
- Slides from Strata (March 2016)
- Stackoverflow tag
- Join our Slack
- DEPRECATED Google Group
Contributing
Interested in contributing? Casual hacking? Check out CONTRIBUTING.md
Who uses Apache Superset (incubating)?
Here's a list of organizations who have taken the time to send a PR to let the world know they are using Superset. Join our growing community!
- AiHello
- Airbnb
- Airboxlab
- Aktia Bank plc
- Amino
- Apollo GraphQL
- Ascendica Development
- Astronomer
- Brilliant.org
- Capital Service S.A.
- Clark.de
- CnOvit
- Dial Once
- Digit Game Studios
- Douban
- Endress+Hauser
- FBK - ICT center
- Faasos
- Fordeal
- GfK Data Lab
- Grassroot
- HuiShouBao
- jampp
- Konfío
- Kuaishou
- Lime
- Lyft
- Maieutical Labs
- Myra Labs
- Now
- PeopleDoc
- Ona
- Pronto Tools
- QPID Health
- Qunar
- ScopeAI
- Shopee
- Shopkick
- Steamroot
- Showmax
- Tails.com
- THEICONIC
- Tobii
- Tooploox
- TrustMedis
- Udemy
- VIPKID
- Windsor.ai
- Yahoo!
- Zaihang
- Zalando