superset/docs/docs/databases/druid.mdx

72 lines
2.6 KiB
Plaintext
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: Apache Druid
hide_title: true
sidebar_position: 7
version: 1
---
import useBaseUrl from "@docusaurus/useBaseUrl";
## Apache Druid
A native connector to Druid ships with Superset (behind the `DRUID_IS_ACTIVE` flag) but this is
slowly getting deprecated in favor of SQLAlchemy / DBAPI connector made available in the
[pydruid library](https://pythonhosted.org/pydruid/).
The connection string looks like:
```
druid://<User>:<password>@<Host>:<Port-default-9088>/druid/v2/sql
```
Here's a breakdown of the key components of this connection string:
User: username portion of the credentials needed to connect to your database
Password: password portion of the credentials needed to connect to your database
Host: IP address (or URL) of the host machine that's running your database
Port: specific port that's exposed on your host machine where your database is running
### Customizing Druid Connection
When adding a connection to Druid, you can customize the connection a few different ways in the
**Add Database** form.
**Custom Certificate**
You can add certificates in the **Root Certificate** field when configuring the new database
connection to Druid:
<img src={useBaseUrl("/img/root-cert-example.png")} />{" "}
When using a custom certificate, pydruid will automatically use https scheme.
**Disable SSL Verification**
To disable SSL verification, add the following to the **Extras** field:
```
engine_params:
{"connect_args":
{"scheme": "https", "ssl_verify_cert": false}}
```
### Aggregations
Common aggregations or Druid metrics can be defined and used in Superset. The first and simpler use
case is to use the checkbox matrix exposed in your datasources edit view (**Sources -> Druid
Datasources -> [your datasource] -> Edit -> [tab] List Druid Column**).
Clicking the GroupBy and Filterable checkboxes will make the column appear in the related dropdowns
while in the Explore view. Checking Count Distinct, Min, Max or Sum will result in creating new
metrics that will appear in the **List Druid Metric** tab upon saving the datasource.
By editing these metrics, youll notice that their JSON element corresponds to Druid aggregation
definition. You can create your own aggregations manually from the **List Druid Metric** tab
following Druid documentation.
### Post-Aggregations
Druid supports post aggregation and this works in Superset. All you have to do is create a metric,
much like you would create an aggregation manually, but specify `postagg` as a `Metric Type`. You
then have to provide a valid json post-aggregation definition (as specified in the Druid docs) in
the JSON field.