Core concepts
Write Once Run Anywhere
Expect your ERA library to work on any data warehouse
ERA isn't a SQL abstraction, it's more of a cousin to SQL. The computer science-y way to think about ERA is that it exposes a small relational algebra machine that has defined semantics. All of ERA's features are built on top of the semantics guaranteed by ERA's type checker, including battle tested ERA => SQL generators. Build and test your analytics code against one warehouse (like a local DuckDB instance) and expect it to work the same way when you run it on your production Snowflake, BigQuery, Redshift, or Postgres instance.
Why is ERA not a SQL abstraction?
SQL is not portable
When you write a SQL query, you are writing in a specific dialect of SQL that is not portable across different data warehouses. Typically learning one dialect of SQL will prepare you for other dialects, but the actual debugged code produced for one database overwhelmingly unlikely to be able to be used in another data base.
SQL lacks abstraction
Some reading we love
SQL does not give you the fundamental building blocks for building abstractions with code. Variables, functions, loops, conditionals... these are all missing (or highly impractical) in SQL. This fact, combined with the fact that there are many dialects of SQL, makes a real ecosystem of shared code more or less impossible.
SQL is hard for machines to reason about
One joke we make around the Cotera office is "Generating SQL takes an afternoon, parsing and analyzing SQL takes a PHD thesis". SQL is not only hard to parse, but almost impossible to infer the semantics of statically. In practice understanding SQL code can only really be done by the database itself, and with the rise of data warehouse as a service parsers like BigQuery or Snowflake aren't even open source.
ERA is designed to be machine readable. The AST in ERA stands for Abstract Syntax Tree, and is designed to be easy to build tooling on top of.
Just because ERA and it's type checker are designed to be machine readable doesn't mean it's hard for humans to write! Probably the opposite actually. Because it's easy to build tooling on top of ERA, and the fact ERA is "just" a TypeScript library, ERA works great with world class tooling like VS code and the TypeScript language server. ERA ships with a "Builder" library that makes it feel like a real programming language that stand on the shoulders of JavaScript
ERA Libraries
In SQL, library code must be written (and debugged!) once for each target platform, and this is not feasible in practice. dbt packages are a good example of this - they're a great idea but there are very few high quality packages that are consistent across data warehouses.
The ERA project came out of Cotera's need to write and maintain analytics libraries at scale. With ERA, we can start to share battle tested composable analytics code. We no longer need to reinvent things like funnel analyses, cohort queries, or standard metrics at every company
One huge benefit of ERA being "just" a TypeScript library is that it doesn't need to have any of the mountain of tooling required to maintain an ecosystem. ERA libraries work perfectly on mature registries like NPM, work with existing linters, formatters and test frameworks.
We think the ability to share code is something that the analytics ecosystem would benefit greatly from (it's certainly something the software world has benefited from).
An example - a date utility function
Here's a simple example. It's a date utility function that allocates a reporting period depending on how far through the month we are.
// reporting-period.ts
import { Expression, If } from '@cotera/era'
export const reportingPeriod = (date: Date | Expression): Expression => {
const expr = Expression.wrap(date)
return If(expr.datePart('day').gte(15), {
then: expr.dateTrunc('month').dateAdd('days', 14),
else: expr
.dateTrunc('month')
.dateSub('days', 1)
.dateTrunc('month')
.dateAdd('days', 14),
})
}
Unlike SQL, 'operators' in ERA are just functions. You import them from the standard library.
SQL just has keywords
SQL does not allow you to import a function. So instead there are over a thousand of keywords. This is actually why SQL dialect differences are so deep rooted. While the relational core is consistent, every extra bit of functionality had to go in as a keyword to fit the ANSI standard.
Our aim here is to create a utility function that turns a date into a reporting period. To do this we're defining a function that takes a date, operates on it, and returns an answer. We can now import this function anywhere and use it anywhere we like.
Including in unit test cases:
// reporting-period.test.ts
import { reportingPeriod } from './reporting-period'
test('for a date before the 15th of the month', () => {
const result = reportingPeriod(new Date('2023-07-10Z')).evaluate()
expect(result).toEqual(new Date('2023-06-15Z'))
})
We can run these kinds of tests locally and in CI without even connecting to our data warehouse. This makes them super quick to iterate on.
How can ERA avoid talking to the data warehouse?
ERA get's to leverage the fact that it is portable code! We can run your code against DuckDB or in ERA's compile time interpreter on your computer.