72 lines
2.6 KiB
Plaintext
72 lines
2.6 KiB
Plaintext
---
|
||
title: Apache Druid
|
||
hide_title: true
|
||
sidebar_position: 7
|
||
version: 1
|
||
---
|
||
|
||
import useBaseUrl from "@docusaurus/useBaseUrl";
|
||
|
||
## Apache Druid
|
||
|
||
A native connector to Druid ships with Superset (behind the `DRUID_IS_ACTIVE` flag) but this is
|
||
slowly getting deprecated in favor of SQLAlchemy / DBAPI connector made available in the
|
||
[pydruid library](https://pythonhosted.org/pydruid/).
|
||
|
||
The connection string looks like:
|
||
|
||
```
|
||
druid://<User>:<password>@<Host>:<Port-default-9088>/druid/v2/sql
|
||
```
|
||
Here's a breakdown of the key components of this connection string:
|
||
|
||
User: username portion of the credentials needed to connect to your database
|
||
Password: password portion of the credentials needed to connect to your database
|
||
Host: IP address (or URL) of the host machine that's running your database
|
||
Port: specific port that's exposed on your host machine where your database is running
|
||
|
||
### Customizing Druid Connection
|
||
|
||
When adding a connection to Druid, you can customize the connection a few different ways in the
|
||
**Add Database** form.
|
||
|
||
**Custom Certificate**
|
||
|
||
You can add certificates in the **Root Certificate** field when configuring the new database
|
||
connection to Druid:
|
||
|
||
<img src={useBaseUrl("/img/root-cert-example.png")} />{" "}
|
||
|
||
When using a custom certificate, pydruid will automatically use https scheme.
|
||
|
||
**Disable SSL Verification**
|
||
|
||
To disable SSL verification, add the following to the **Extras** field:
|
||
|
||
```
|
||
engine_params:
|
||
{"connect_args":
|
||
{"scheme": "https", "ssl_verify_cert": false}}
|
||
```
|
||
|
||
### Aggregations
|
||
|
||
Common aggregations or Druid metrics can be defined and used in Superset. The first and simpler use
|
||
case is to use the checkbox matrix exposed in your datasource’s edit view (**Sources -> Druid
|
||
Datasources -> [your datasource] -> Edit -> [tab] List Druid Column**).
|
||
|
||
Clicking the GroupBy and Filterable checkboxes will make the column appear in the related dropdowns
|
||
while in the Explore view. Checking Count Distinct, Min, Max or Sum will result in creating new
|
||
metrics that will appear in the **List Druid Metric** tab upon saving the datasource.
|
||
|
||
By editing these metrics, you’ll notice that their JSON element corresponds to Druid aggregation
|
||
definition. You can create your own aggregations manually from the **List Druid Metric** tab
|
||
following Druid documentation.
|
||
|
||
### Post-Aggregations
|
||
|
||
Druid supports post aggregation and this works in Superset. All you have to do is create a metric,
|
||
much like you would create an aggregation manually, but specify `postagg` as a `Metric Type`. You
|
||
then have to provide a valid json post-aggregation definition (as specified in the Druid docs) in
|
||
the JSON field.
|