# Blazer

Explore your data with SQL. Easily create charts and dashboards, and share them with your team.

[Try it out](https://blazer.dokkuapp.com)

[![Screenshot](https://blazer.dokkuapp.com/assets/screenshot-6ca3115a518b488026e48be83ba0d4c9.png)](https://blazer.dokkuapp.com)

:envelope: [Get notified of updates](http://eepurl.com/cbUwsD)

:tangerine: Battle-tested at [Instacart](https://www.instacart.com/opensource)

## Features

- **Multiple data sources** - PostgreSQL, MySQL, Redshift, and [many more](#full-list)
- **Variables** - run the same queries with different values
- **Checks & alerts** - get emailed when bad data appears
- **Audits** - all queries are tracked
- **Security** - works with your authentication system

## Docs

- [Installation](#installation)
- [Queries](#queries)
- [Charts](#charts)
- [Dashboards](#dashboards)
- [Checks](#checks)

## Installation

Add this line to your application’s Gemfile:

```ruby
gem 'blazer'
```

Run:

```sh
rails g blazer:install
rake db:migrate
```

And mount the dashboard in your `config/routes.rb`:

```ruby
mount Blazer::Engine, at: "blazer"
```

For production, specify your database:

```ruby
ENV["BLAZER_DATABASE_URL"] = "postgres://user:password@hostname:5432/database"
```

Blazer tries to protect against queries which modify data (by running each query in a transaction and rolling it back), but a safer approach is to use a read only user.  [See how to create one](#permissions).

#### Checks (optional)

Be sure to set a host in `config/environments/production.rb` for emails to work.

```ruby
config.action_mailer.default_url_options = {host: "blazer.dokkuapp.com"}
```

Schedule checks to run (with cron, [Heroku Scheduler](https://elements.heroku.com/addons/scheduler), etc). The default options are every 5 minutes, 1 hour, or 1 day, which you can customize. For each of these options, set up a task to run.

```sh
rake blazer:run_checks SCHEDULE="5 minutes"
rake blazer:run_checks SCHEDULE="1 hour"
rake blazer:run_checks SCHEDULE="1 day"
```

You can also set up failing checks to be sent once a day (or whatever you prefer).

```sh
rake blazer:send_failing_checks
```

Here’s what it looks like with cron.

```
*/5 * * * * rake blazer:run_checks SCHEDULE="5 minutes"
0   * * * * rake blazer:run_checks SCHEDULE="1 hour"
30  7 * * * rake blazer:run_checks SCHEDULE="1 day"
0   8 * * * rake blazer:send_failing_checks
```

## Permissions

### PostgreSQL

Create a user with read only permissions:

```sql
BEGIN;
CREATE ROLE blazer LOGIN PASSWORD 'secret123';
GRANT CONNECT ON DATABASE database_name TO blazer;
GRANT USAGE ON SCHEMA public TO blazer;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO blazer;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO blazer;
COMMIT;
```

### MySQL

Create a user with read only permissions:

```sql
GRANT SELECT, SHOW VIEW ON database_name.* TO blazer@’127.0.0.1′ IDENTIFIED BY ‘secret123‘;
FLUSH PRIVILEGES;
```

### MongoDB

Create a user with read only permissions:

```
db.createUser({user: "blazer", pwd: "password", roles: ["read"]})
```

Also, make sure authorization is enabled when you start the server.

### Sensitive Data

To protect sensitive info like password hashes and access tokens, use views. Documentation coming soon.

## Authentication

Don’t forget to protect the dashboard in production.

### Basic Authentication

Set the following variables in your environment or an initializer.

```ruby
ENV["BLAZER_USERNAME"] = "andrew"
ENV["BLAZER_PASSWORD"] = "secret"
```

### Devise

```ruby
authenticate :user, -> (user) { user.admin? } do
  mount Blazer::Engine, at: "blazer"
end
```

### Other

Specify a `before_action` method to run in `blazer.yml`.

```yml
before_action: require_admin
```

Then define the custom authentication method in your `application_controller.rb`.

```ruby
def require_admin
  # depending on your auth, maybe something like...
  current_user && current_user.admin?
end
```

## Queries

### Variables

Create queries with variables.

```sql
SELECT * FROM users WHERE gender = {gender}
```

Use `{start_time}` and `{end_time}` for time ranges. [Example](https://blazer.dokkuapp.com/queries/9-time-range-selector?start_time=1997-10-03T05%3A00%3A00%2B00%3A00&end_time=1997-10-04T04%3A59%3A59%2B00%3A00)

```sql
SELECT * FROM ratings WHERE rated_at >= {start_time} AND rated_at <= {end_time}
```

### Smart Variables

[Example](https://blazer.dokkuapp.com/queries/1-smart-variable)

Suppose you have the query:

```sql
SELECT * FROM users WHERE occupation_id = {occupation_id}
```

Instead of remembering each occupation’s id, users can select occupations by name.

Add a smart variable with:

```yml
smart_variables:
  occupation_id: "SELECT id, name FROM occupations ORDER BY name ASC"
```

The first column is the value of the variable, and the second column is the label.

You can also use an array or hash for static data and enums.

```yml
smart_variables:
  period: ["day", "week", "month"]
  status: {0: "Active", 1: "Archived"}
```

### Linked Columns

[Example](https://blazer.dokkuapp.com/queries/3-linked-column) - title column

Link results to other pages in your apps or around the web. Specify a column name and where it should link to. You can use the value of the result with `{value}`.

```yml
linked_columns:
  user_id: "/admin/users/{value}"
  ip_address: "http://www.infosniper.net/index.php?ip_address={value}"
```

### Smart Columns

[Example](https://blazer.dokkuapp.com/queries/2-smart-column) - occupation_id column

Suppose you have the query:

```sql
SELECT name, city_id FROM users
```

See which city the user belongs to without a join.

```yml
smart_columns:
  city_id: "SELECT id, name FROM cities WHERE id IN {value}"
```

You can also use a hash for static data and enums.

```yml
smart_columns:
  status: {0: "Active", 1: "Archived"}
```

### Caching

Blazer can automatically cache results to improve speed. It can cache slow queries:

```yml
cache:
  mode: slow
  expires_in: 60 # min
  slow_threshold: 15 # sec
```

Or it can cache all queries:

```yml
cache:
  mode: all
  expires_in: 60 # min
```

Of course, you can force a refresh at any time.

## Charts

Blazer will automatically generate charts based on the types of the columns returned in your query.

**Note:** The order of columns matters.

### Line Chart

There are two ways to generate line charts.

2+ columns - timestamp, numeric(s) - [Example](https://blazer.dokkuapp.com/queries/4-line-chart-format-1)

```sql
SELECT date_trunc('week', created_at), COUNT(*) FROM users GROUP BY 1
```

3 columns - timestamp, string, numeric - [Example](https://blazer.dokkuapp.com/queries/5-line-chart-format-2)


```sql
SELECT date_trunc('week', created_at), gender, COUNT(*) FROM users GROUP BY 1, 2
```

### Column Chart

There are also two ways to generate column charts.

2+ columns - string, numeric(s) - [Example](https://blazer.dokkuapp.com/queries/6-column-chart-format-1)

```sql
SELECT gender, COUNT(*) FROM users GROUP BY 1
```

3 columns - string, string, numeric - [Example](https://blazer.dokkuapp.com/queries/7-column-chart-format-2)

```sql
SELECT gender, zip_code, COUNT(*) FROM users GROUP BY 1, 2
```

### Scatter Chart

2 columns - both numeric

```sql
SELECT x, y FROM table
```

### Maps

Columns named `latitude` and `longitude` or `lat` and `lon` or `lat` and `lng` - [Example](https://blazer.dokkuapp.com/queries/15-map)

```sql
SELECT name, latitude, longitude FROM cities
```

To enable, get an access token from [Mapbox](https://www.mapbox.com/) and set `ENV["MAPBOX_ACCESS_TOKEN"]`.

### Targets

Use the column name `target` to draw a line for goals. [Example](https://blazer.dokkuapp.com/queries/8-target-line)

```sql
SELECT date_trunc('week', created_at), COUNT(*) AS new_users, 100000 AS target FROM users GROUP BY 1
```

## Dashboards

Create a dashboard with multiple queries. [Example](https://blazer.dokkuapp.com/dashboards/1-dashboard-demo)

If the query has a chart, the chart is shown. Otherwise, you’ll see a table.

If any queries have variables, they will show up on the dashboard.

## Checks

Checks give you a centralized place to see the health of your data. [Example](https://blazer.dokkuapp.com/checks)

Create a query to identify bad rows.

```sql
SELECT * FROM ratings WHERE user_id IS NULL /* all ratings should have a user */
```

Then create check with optional emails if you want to be notified. Emails are sent when a check starts failing, and when it starts passing again.

## Anomaly Detection

Anomaly detection is supported thanks to Twitter’s [AnomalyDetection](https://github.com/twitter/AnomalyDetection) library.

First, [install R](https://cloud.r-project.org/). Then, run:

```R
install.packages("devtools")
devtools::install_github("twitter/AnomalyDetection")
```

And add to `config/blazer.yml`:

```yml
anomaly_checks: true
```

If upgrading from version 1.4 or below, also follow the [upgrade instructions](#15).

If you’re on Heroku, follow [these additional instructions](#anomaly-detection-on-heroku).

## Data Sources

Blazer supports multiple data sources :tada:

Add additional data sources in `config/blazer.yml`:

```yml
data_sources:
  main:
    url: <%= ENV["BLAZER_DATABASE_URL"] %>
    # timeout, smart_variables, linked_columns, smart_columns
  catalog:
    url: <%= ENV["CATALOG_DATABASE_URL"] %>
    # ...
  redshift:
    url: <%= ENV["REDSHIFT_DATABASE_URL"] %>
    # ...
```

### Full List

- [PostgreSQL](#postgresql-1)
- [MySQL](#mysql-1)
- [SQL Server](#sql-server)
- [Oracle](#oracle)
- [IBM DB2 and Informix](#ibm-db2-and-informix)
- [SQLite](#sqlite)
- [Amazon Redshift](#amazon-redshift)
- [Amazon Athena](#amazon-athena)
- [Presto](#presto)
- [Apache Drill](#apache-drill)
- [Google BigQuery](#google-bigquery)
- [MongoDB](#mongodb-1)
- [Cassandra](#cassandra)
- [Druid](#druid)
- [Elasticsearch](#elasticsearch-beta) [beta]

You can also [create an adapter](#creating-an-adapter) for any other data store.

**Note:** In the examples below, we recommend using environment variables for urls.

```yml
data_sources:
  my_source:
    url: <%= ENV["BLAZER_MY_SOURCE_URL"] %>
```

### PostgreSQL

Add [pg](https://bitbucket.org/ged/ruby-pg/wiki/Home) to your Gemfile (if it’s not there) and set:

```yml
data_sources:
  my_source:
    url: postgres://user:password@hostname:5432/database
```

### MySQL

Add [mysql2](https://github.com/brianmario/mysql2) to your Gemfile (if it’s not there) and set:

```yml
data_sources:
  my_source:
    url: mysql2://user:password@hostname:3306/database
```

### SQL Server

Add [tiny_tds](https://github.com/rails-sqlserver/tiny_tds) and [activerecord-sqlserver-adapter](https://github.com/rails-sqlserver/activerecord-sqlserver-adapter) to your Gemfile and set:

```yml
data_sources:
  my_source:
    url: sqlserver://user:password@hostname:1433/database
```

### Oracle

Use [activerecord-oracle_enhanced-adapter](https://github.com/rsim/oracle-enhanced).

### IBM DB2 and Informix

Use [ibm_db](https://github.com/ibmdb/ruby-ibmdb).

### SQLite

Add [sqlite3](https://github.com/sparklemotion/sqlite3-ruby) to your Gemfile and set:

```yml
data_sources:
  my_source:
    url: sqlite3:path/to/database.sqlite3
```

### Amazon Redshift

Add [activerecord4-redshift-adapter](https://github.com/aamine/activerecord4-redshift-adapter) or [activerecord5-redshift-adapter](https://github.com/ConsultingMD/activerecord5-redshift-adapter) to your Gemfile and set:

```yml
data_sources:
  my_source:
    url: redshift://user:password@hostname:5439/database
```

### Amazon Athena

Add [aws-sdk](https://github.com/aws/aws-sdk-ruby) `~> 2` to your Gemfile and set:

```yml
data_sources:
  my_source:
    adapter: athena
    database: database
    output_location: s3://some-bucket/
```

### Presto

Add [presto-client](https://github.com/treasure-data/presto-client-ruby) to your Gemfile and set:

```yml
data_sources:
  my_source:
    url: presto://user@hostname:8080/catalog
```

### Apache Drill

Add [drill-sergeant](https://github.com/ankane/drill-sergeant) to your Gemfile and set:

```yml
data_sources:
  my_source:
    adapter: drill
    url: http://hostname:8047
```

### Google BigQuery

Add [google-cloud-bigquery](https://github.com/GoogleCloudPlatform/google-cloud-ruby/tree/master/google-cloud-bigquery) to your Gemfile and set:

```yml
data_sources:
  my_source:
    adapter: bigquery
    project: your-project
    keyfile: path/to/keyfile.json
```

### MongoDB

Add [mongo](https://github.com/mongodb/mongo-ruby-driver) to your Gemfile and set:

```yml
data_sources:
  my_source:
    url: mongodb://user:password@hostname:27017/database
```

### Cassandra

Add [cassandra-driver](https://github.com/datastax/ruby-driver) to your Gemfile and set:

```yml
data_sources:
  my_source:
    url: cassandra://user:password@hostname:9042/keyspace
```

### Druid

First, [enable SQL support](http://druid.io/docs/latest/querying/sql.html#configuration) on the broker.

Set:

```yml
data_sources:
  my_source:
    adapter: druid
    url: http://hostname:8082
```

### Elasticsearch [beta]

Add [elasticsearch](https://github.com/elastic/elasticsearch-ruby) to your Gemfile and set:

```yml
data_sources:
  my_source:
    adapter: elasticsearch
    url: http://user:password@hostname:9200
```

## Creating an Adapter

Create an adapter for any data store with:

```ruby
class FooAdapter < Blazer::Adapters::BaseAdapter
  # code goes here
end

Blazer.register_adapter "foo", FooAdapter
```

See the [Presto adapter](https://github.com/ankane/blazer/blob/master/lib/blazer/adapters/presto_adapter.rb) for a good example. Then use:

```yml
data_sources:
  my_source:
    adapter: foo
    url: http://user:password@hostname:9200/
```

## Query Permissions

Blazer supports a basic permissions model.

1. Queries without a name are unlisted
2. Queries whose name starts with `#` are only listed to the creator
3. Queries whose name starts with `*` can only be edited by the creator

## Learn SQL

Have team members who want to learn SQL? Here are a few great, free resources.

- [Khan Academy](https://www.khanacademy.org/computing/computer-programming/sql)
- [Codecademy](https://www.codecademy.com/learn/learn-sql)

## Useful Tools

For an easy way to group by day, week, month, and more with correct time zones, check out [Groupdate](https://github.com/ankane/groupdate.sql).

## Anomaly Detection on Heroku

Add the [R buildpack](https://github.com/virtualstaticvoid/heroku-buildpack-r) to your app.

```sh
heroku buildpacks:add --index 1 https://github.com/virtualstaticvoid/heroku-buildpack-r.git\#cedar-14
```

And create an `init.r` with:

```sh
if (!"AnomalyDetection" %in% installed.packages()) {
  install.packages("devtools")
  devtools::install_github("twitter/AnomalyDetection")
}
```

Commit and deploy away. The first deploy may take a few minutes.

## Upgrading

### 1.5

To take advantage of the anomaly detection, create a migration

```sh
rails g migration upgrade_blazer_to_1_5
```

with:

```ruby
add_column(:blazer_checks, :check_type, :string)
add_column(:blazer_checks, :message, :text)
commit_db_transaction

Blazer::Check.reset_column_information

Blazer::Check.where(invert: true).update_all(check_type: "missing_data")
Blazer::Check.where(check_type: nil).update_all(check_type: "bad_data")
```

### 1.3

To take advantage of the latest features, create a migration

```sh
rails g migration upgrade_blazer_to_1_3
```

with:

```ruby
add_column :blazer_dashboards, :creator_id, :integer
add_column :blazer_checks, :creator_id, :integer
add_column :blazer_checks, :invert, :boolean
add_column :blazer_checks, :schedule, :string
add_column :blazer_checks, :last_run_at, :timestamp
commit_db_transaction

Blazer::Check.update_all schedule: "1 hour"
```

### 1.0

Blazer 1.0 brings a number of new features:

- multiple data sources, including Redshift
- dashboards
- checks

To upgrade, run:

```sh
bundle update blazer
```

Create a migration

```sh
rails g migration upgrade_blazer_to_1_0
```

with:

```ruby
add_column :blazer_queries, :data_source, :string
add_column :blazer_audits, :data_source, :string

create_table :blazer_dashboards do |t|
  t.text :name
  t.timestamps
end

create_table :blazer_dashboard_queries do |t|
  t.references :dashboard
  t.references :query
  t.integer :position
  t.timestamps
end

create_table :blazer_checks do |t|
  t.references :query
  t.string :state
  t.text :emails
  t.timestamps
end
```

And run:

```sh
rake db:migrate
```

Update `config/blazer.yml` with:

```yml
# see https://github.com/ankane/blazer for more info

data_sources:
  main:
    url: <%= ENV["BLAZER_DATABASE_URL"] %>

    # statement timeout, in seconds
    # applies to PostgreSQL only
    # none by default
    # timeout: 15

    # time to cache results, in minutes
    # can greatly improve speed
    # none by default
    # cache: 60

    # wrap queries in a transaction for safety
    # not necessary if you use a read-only user
    # true by default
    # use_transaction: false

    smart_variables:
      # zone_id: "SELECT id, name FROM zones ORDER BY name ASC"

    linked_columns:
      # user_id: "/admin/users/{value}"

    smart_columns:
      # user_id: "SELECT id, name FROM users WHERE id IN {value}"

# create audits
audit: true

# change the time zone
# time_zone: "Pacific Time (US & Canada)"

# class name of the user model
# user_class: User

# method name for the current user
# user_method: current_user

# method name for the display name
# user_name: name

# email to send checks from
# from_email: blazer@example.org
```

## TODO

- advanced permissions
- standalone version
- better navigation

## History

View the [changelog](https://github.com/ankane/blazer/blob/master/CHANGELOG.md)

## Thanks

Blazer uses a number of awesome open source projects, including [Rails](https://github.com/rails/rails/), [Vue.js](https://github.com/vuejs/vue), [jQuery](https://github.com/jquery/jquery), [Bootstrap](https://github.com/twbs/bootstrap), [Selectize](https://github.com/brianreavis/selectize.js), [StickyTableHeaders](https://github.com/jmosbech/StickyTableHeaders), [Stupid jQuery Table Sort](https://github.com/joequery/Stupid-Table-Plugin), and [Date Range Picker](https://github.com/dangrossman/bootstrap-daterangepicker).

Demo data from [MovieLens](http://grouplens.org/datasets/movielens/).

## Want to Make Blazer Better?

That’s awesome! Here are a few ways you can help:

- [Report bugs](https://github.com/ankane/blazer/issues)
- Fix bugs and [submit pull requests](https://github.com/ankane/blazer/pulls)
- Write, clarify, or fix documentation
- Suggest or add new features

Check out the [dev app](https://github.com/ankane/blazer-dev) to get started.