Core concepts

Type Safety

Getting real-time feedback on your pipeline

One of our biggest frustrations with string-based SQL builders like dbt is that it's almost impossible to write tools that let your computer check your work. ERA makes data transformations easy to analyze, and it ships with a full type checker (including aggregation and windowing!).

import { From } from '@cotera/era'

From({
  name: 'events',
  schema: 'public',
  attributes: {
    id: 'string',
    state: 'string',
  },
}).select((t) => ({
  ...t.pick('id', 'state', 'oops_invalid_attribute!'),
}))

This will show the following error, including the line number and a pointer to the exact issue in the code, along with all the attributes that do exist and their types:

Error: NoSuchAttribute - "oops_invalid_attribute!" does not exist in
(
  "id" string,
  "state" string
)
TraceBack:
 -> attr - "from"

 ❯ src/apps/rfm-analytics.ts:11:8
      9|   },
     10| }).select((t) => ({
     11|   ...t.pick('id', 'state', 'oops_invalid_attribute!'),
       |        ^
     12| }));
     13|
 ❯ Relation.select ../era/src/lib/builder/relation.ts:174:22
 ❯ src/apps/rfm-analytics.ts:10:4

This all happens without a round trip to your warehouse and without building any real database tables. Your pipeline is validated instantly on your machine and in CI, which dramatically reduces the time it takes to refactor your base models.
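To make the idea concrete, here is a minimal sketch of how an attribute check like this can run entirely on your machine. It is not ERA's implementation: `Attributes`, `NoSuchAttributeError`, and `pickAttributes` are hypothetical names invented for this illustration, and the only input is the declared attribute list, so no warehouse connection is needed.

```typescript
// Hypothetical, simplified stand-in for a relation's declared attributes:
// attribute name -> type name.
type Attributes = Record<string, string>;

// Hypothetical error type that lists the attributes that do exist.
class NoSuchAttributeError extends Error {
  constructor(name: string, attributes: Attributes) {
    const listing = Object.keys(attributes)
      .map((attr) => `  "${attr}" ${attributes[attr]}`)
      .join(",\n");
    super(`NoSuchAttribute - "${name}" does not exist in\n(\n${listing}\n)`);
    this.name = "NoSuchAttributeError";
  }
}

// Return the requested subset of attributes, or throw if a name is unknown.
function pickAttributes(attributes: Attributes, ...names: string[]): Attributes {
  const picked: Attributes = {};
  for (const name of names) {
    const ty = attributes[name];
    if (ty === undefined) {
      throw new NoSuchAttributeError(name, attributes);
    }
    picked[name] = ty;
  }
  return picked;
}

const events: Attributes = { id: "string", state: "string" };

try {
  pickAttributes(events, "id", "state", "oops_invalid_attribute!");
} catch (err) {
  // The failure is reported before any query is sent anywhere.
  console.error((err as Error).message);
}
```

Because the check only needs the declared attributes, it is cheap enough to run on every save and in CI.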

What is Type Safety?

Typically, when you write code, you are working with data. Perhaps you're putting two strings together, or you're adding two numbers.

Data always has a "type". For example, a "number" is a type, and a "string" is a type. You'll be familiar with these from your data warehouse table schemas. Every column has a type when you write a CREATE TABLE statement.

One practical way to think about types is that the type of some data defines all the things that you can do with it. For example, if I have a number I can add it to another number... but it doesn't really make sense to add it to a string.

A type-safe language is one that can track all of the information about the types of your data before the code ever runs. This means that it can warn you when you do things that don't make sense, which turns out to be extremely useful and a big productivity boost.
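As a small illustration in plain TypeScript (not ERA-specific), the hypothetical `addNumbers` function below only accepts numbers. The commented-out call shows the kind of mistake the compiler reports before anything runs.

```typescript
// A function that is only defined for numbers.
function addNumbers(a: number, b: number): number {
  return a + b;
}

const total = addNumbers(2, 3);
console.log(total); // prints 5

// Uncommenting the next line makes the TypeScript compiler reject the program
// before it runs, because "3" is a string and the parameter expects a number:
// addNumbers(2, "3");
```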

Assumptions

So how does ERA know what the data looks like in your data warehouse? Well, we create assumptions files, which are basically a contract with your data warehouse. They tell ERA "there is a table in there that looks like this", and ERA can take over and type check everything from that point on.

Assumptions look like this:

// `schema` is assumed to be defined earlier in the assumptions file.
const PRODUCT = {
  schema,
  name: 'PRODUCT',
  attributes: {
    ID: { ty: 'int', nullable: false },
    TITLE: { ty: 'string', nullable: false },
    HANDLE: 'string',
    PRODUCT_TYPE: 'string',
    STATUS: 'string',
  },
}
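The example above uses two forms for an attribute: a bare type name, and an object with `ty` and `nullable`. The sketch below describes that shape with TypeScript types so the assumptions file itself can be checked by the compiler. `AttributeSpec`, `TableAssumption`, `attributeType`, and `listColumns` are hypothetical names inferred from the example, not ERA's own definitions, and the `'public'` schema value is a placeholder.

```typescript
// Hypothetical types inferred from the example above; not ERA's own definitions.
type AttributeSpec = string | { ty: string; nullable: boolean };

interface TableAssumption {
  schema: string;
  name: string;
  attributes: Record<string, AttributeSpec>;
}

// Read the type name from either form of attribute.
function attributeType(spec: AttributeSpec): string {
  return typeof spec === "string" ? spec : spec.ty;
}

// List each attribute as "NAME type", in declaration order.
function listColumns(table: TableAssumption): string[] {
  const columns: string[] = [];
  for (const name of Object.keys(table.attributes)) {
    const spec = table.attributes[name];
    if (spec !== undefined) {
      columns.push(`${name} ${attributeType(spec)}`);
    }
  }
  return columns;
}

const schema = "public"; // placeholder value for illustration

const PRODUCT: TableAssumption = {
  schema,
  name: "PRODUCT",
  attributes: {
    ID: { ty: "int", nullable: false },
    TITLE: { ty: "string", nullable: false },
    HANDLE: "string",
    PRODUCT_TYPE: "string",
    STATUS: "string",
  },
};

console.log(listColumns(PRODUCT).join(", "));
```

With a type annotation like this, a misspelled key such as `nulable` would be flagged by the TypeScript compiler in the assumptions file itself.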

If you've got a dbt project, then ERA can generate the assumptions from your dbt models, which will save you some typing! See the DBT section for more info.

Ready to learn more?

Then come and say hello to us on Discord! Or head over to GitHub to try out the ERA examples on your computer.