FAQ

What's the ERA elevator pitch?

ERA is a practical framework for building composable data applications that are easy to maintain.

If you're familiar with dbt, you can think of ERA as a supercharged replacement for dbt's string-based templating engine. You write your code with ERA, and ERA takes care of compiling it to SQL for the data warehouse that you want to target. You can use it to build and maintain models, and ERA can then do instant analysis to find errors in your pipeline. You can also use it to define the UI (for example, dashboards) that depends on those models.

Why does ERA exist?

ERA was built to maintain testable, composable data pipelines. Our team was ripping our hair out trying to maintain dbt/SQL scripts across different data warehouses (Redshift, BigQuery, Postgres, Snowflake) on top of ever-shifting data foundations maintained by our customers' internal data teams. ERA is the result of what we learned in the field.

We wanted to write abstractions so that we could reuse code. We wanted to bundle those abstractions into libraries. We wanted to statically analyse our models so that we caught more errors before production. We wanted a fast unit test suite that we could iterate on locally without connecting to a data warehouse.

In short, we wanted to use all the same practices we used for building our other software. Tools like dbt made a great start at importing these kinds of practices into analytics, but there are many great aspects of the software engineering workflow that still aren't easy to replicate in data.

Is ERA an ORM? A SQL builder?

Neither! ERA is kind of its own thing. ERA is more like a minimal relational algebra programming language shipped as a TypeScript library. It borrows a bunch of lessons from other programming languages and applies them to OLAP programming and data engineering.

Why not build on top of dbt's abstractions?

We think dbt has done amazing things for the analytics ecosystem! It has elevated analytics engineering with tons of lessons from traditional software engineering. At the same time, we think that dbt's core abstractions have hit their limits. Because string interpolation is dbt's primary abstraction, it is hard to build tooling on top of dbt code. For some dbt code, it is impossible to tell whether the SQL output is even syntactically valid without reaching out to the real warehouse, let alone statically analyze the semantics of a project. The ERA project hopes to rethink some of the core abstractions in the data analytics engineering world and learn from other software engineering disciplines.
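To make the tooling argument concrete, here is a minimal sketch in plain TypeScript. It does not use ERA's or dbt's actual API: `renderTemplate`, `Expr`, `columnsUsed`, and `unknownColumns` are hypothetical names invented for this illustration, and the `orders` schema is assumed. The point is only that a templated string stays opaque until it is rendered, whereas a query held as structured data can be inspected before any SQL reaches a warehouse.

```typescript
// Hypothetical illustration only: none of these names come from ERA or dbt.

// String templating: the result is opaque text, so its validity
// depends entirely on whatever strings are passed in.
function renderTemplate(column: string, filter: string): string {
  return `SELECT ${column} FROM orders WHERE ${filter}`;
}

// Structured alternative: the expression is data that tooling can walk.
type Expr =
  | { kind: "column"; name: string }
  | { kind: "literal"; value: number }
  | { kind: "gt"; left: Expr; right: Expr };

// Collect every column an expression references, without a warehouse.
function columnsUsed(expr: Expr): string[] {
  switch (expr.kind) {
    case "column":
      return [expr.name];
    case "literal":
      return [];
    case "gt":
      return [...columnsUsed(expr.left), ...columnsUsed(expr.right)];
  }
}

// Report referenced columns that the (assumed) table schema does not define.
function unknownColumns(expr: Expr, schema: string[]): string[] {
  return columnsUsed(expr).filter((name) => !schema.includes(name));
}

// The template happily produces incomplete SQL; nothing flags it locally.
const broken = renderTemplate("id", "amount >");

// The structured filter contains a deliberate typo ("amuont"),
// which a simple local check can report before anything is run.
const filter: Expr = {
  kind: "gt",
  left: { kind: "column", name: "amuont" },
  right: { kind: "literal", value: 100 },
};
const problems = unknownColumns(filter, ["id", "amount"]);
```

In this sketch, `broken` is an incomplete statement that only a SQL parser or the warehouse itself would reject, while `problems` lists the misspelled column as soon as the check runs.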

Rather than re-invent the wheel at every company, in a world with composable, testable analytics pipelines we could get every company to a decent starting point in no time at all! We believe getting a company from no data stack to a fully operational setup (complete with all the standard metrics and models for the relevant domain) should be as easy as making a new Ruby on Rails project.

Why TypeScript?

We get this question a lot, and we'll be the first to admit that TypeScript does not have a lot of traction in the data ecosystem.

TypeScript is a "general purpose" programming language.

To be fair, this is true of almost any language out there... it's just not true of SQL, which is the lingua franca of analytics at present.

What does a programming language buy you? An awful lot, it turns out. The ability to create abstractions through basic features such as variables and functions is completely absent from SQL. The ability to write your own abstractions enables code reuse, makes your code easier to understand, lets you write ergonomic unit tests, and so on. These are the building blocks for an ecosystem of high-quality reusable code.

dbt macros go some way toward bringing these benefits to SQL, but in our opinion they fall far short of having first-class support for them in your programming language.
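As a rough sketch of what variables and functions buy you, the example below defines a reusable `percentOf` helper as an ordinary TypeScript function and checks it against an in-memory row. Everything here is hypothetical and invented for illustration (`Expr`, `col`, `lit`, `percentOf`, `evaluate`, `toSql`); it is not ERA's API, only one way such an abstraction could look.

```typescript
// Hypothetical sketch: these types and helpers are not ERA's API.
type Row = Record<string, number>;

type Expr =
  | { kind: "column"; name: string }
  | { kind: "literal"; value: number }
  | { kind: "div"; left: Expr; right: Expr }
  | { kind: "mul"; left: Expr; right: Expr };

const col = (name: string): Expr => ({ kind: "column", name });
const lit = (value: number): Expr => ({ kind: "literal", value });

// A reusable abstraction: a plain function that returns an expression.
function percentOf(part: Expr, whole: Expr): Expr {
  return {
    kind: "mul",
    left: { kind: "div", left: part, right: whole },
    right: lit(100),
  };
}

// Evaluate an expression against an in-memory row, so a unit test
// can run locally without connecting to a data warehouse.
function evaluate(expr: Expr, row: Row): number {
  switch (expr.kind) {
    case "column": {
      const value = row[expr.name];
      if (value === undefined) throw new Error(`missing column: ${expr.name}`);
      return value;
    }
    case "literal":
      return expr.value;
    case "div":
      return evaluate(expr.left, row) / evaluate(expr.right, row);
    case "mul":
      return evaluate(expr.left, row) * evaluate(expr.right, row);
  }
}

// Render the same expression as a SQL fragment (identifiers unquoted here).
function toSql(expr: Expr): string {
  switch (expr.kind) {
    case "column":
      return expr.name;
    case "literal":
      return String(expr.value);
    case "div":
      return `(${toSql(expr.left)} / ${toSql(expr.right)})`;
    case "mul":
      return `(${toSql(expr.left)} * ${toSql(expr.right)})`;
  }
}

// The helper is defined once and reused wherever a percentage is needed.
const refundRate = percentOf(col("refunds"), col("orders"));
const localResult = evaluate(refundRate, { refunds: 5, orders: 20 });
const sqlFragment = toSql(refundRate);
```

Because `refundRate` is just a value, the same definition can back both a local test (`localResult`) and generated SQL (`sqlFragment`).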

TypeScript has a fantastic ecosystem.

The TypeScript ecosystem is the JavaScript ecosystem, which is one of the largest and most vibrant software ecosystems there is. This means that there is first-class editor support, library distribution infrastructure, and a plethora of libraries out there to help you organize and understand your code.

We love projects like Malloy and PRQL, but think that there is an enormous hurdle for them to overcome. Having first-class editor support and a way to distribute packages is table stakes for an ecosystem, but it's a monumental challenge to build from scratch.

ERA piggybacks on TypeScript and npm for the production-grade tools that they provide.

TypeScript runs in your browser.

TS is just JS... and that means that it can run anywhere, including your browser. This is hugely advantageous in modern analytics. Many products are starting to implement last-mile analytical features in the browser, standing on the shoulders of DuckDB.

Writing analytics code that can run on both the backend and the browser feels like a superpower in this paradigm. There is no need to work with multiple dialects of SQL: you can just write libraries in ERA and reuse the same code anywhere in your stack.
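The idea of writing a query once and targeting several engines can be sketched as follows. This is a simplified, hypothetical example rather than ERA's compiler: `Dialect`, `Select`, `quoteIdentifier`, and `compile` are names made up for this illustration, and it only covers one real dialect difference (DuckDB quotes identifiers with double quotes, BigQuery with backticks). Escaping of quote characters inside identifiers is deliberately left out.

```typescript
// Hypothetical sketch: not ERA's API, and only identifier quoting is handled.
type Dialect = "duckdb" | "bigquery";

interface Select {
  table: string;
  columns: string[];
}

// DuckDB uses double quotes for identifiers; BigQuery uses backticks.
function quoteIdentifier(name: string, dialect: Dialect): string {
  return dialect === "bigquery" ? `\`${name}\`` : `"${name}"`;
}

// Compile one query description to SQL text for the chosen dialect.
function compile(query: Select, dialect: Dialect): string {
  const columns = query.columns
    .map((column) => quoteIdentifier(column, dialect))
    .join(", ");
  return `SELECT ${columns} FROM ${quoteIdentifier(query.table, dialect)}`;
}

// One definition, reused for an in-browser engine and a cloud warehouse.
const dailyOrders: Select = { table: "orders", columns: ["day", "total"] };
const forBrowser = compile(dailyOrders, "duckdb");
const forWarehouse = compile(dailyOrders, "bigquery");
```

Here `dailyOrders` is defined a single time, and the dialect is chosen only at the point where SQL text is produced.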

Ready to learn more?

Then come and say hello to us on Discord! Or head over to GitHub to try out the ERA examples on your computer.