Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(migrations): implemented multitenancy (wildcard) migrations #3526

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nicomouss
Copy link

@nicomouss nicomouss commented Sep 22, 2022

This adds the ability for Mikro-orm to manage multitenant architectures.
Mikro-orm is now able to generate migration files to migrate dynamic db schemas.

Example of a multitenant architecture:

"public" (schema)
    "Tenant" (entity)
      - id
      - name
      - description
"mytenant1" (schema)
    "Photo" (entity)
      - id
      - author_id
      - url
    "Author" (entity)
      - id
      - name
"mytenant2" (schema)
    "Photo" (entity)
      - id
      - author_id
      - url
    "Author" (entity)
      - id
      - name
... (other tenant schema)
... (other tenant schema)

In this architecture, we have:

  • a Tenant entity, sitting in the public/default schema (or a specific schema of your choice, like "admin"). This entity is used to identify each tenant.
  • a list of schemas (mytenant1, mytenant2, ...), each tenant sitting in its own schemas.

Regarding the Tenant entity used to identify a tenant, usually one property is used to store the schema name of the tenant. For example, we can model the entity using the id property to identify the tenant schema:

[id]            [name]              [description]  
"mytenant1"     "My Tenant 1"       "bla bla bla..."  
"mytenant2"     "My Tenant 2"       "bla bla bla..."  
"mytenantxxx"   "My Tenant xxx"     "bla bla bla..."  
"mytenantyyy"   "My Tenant yyy"     "bla bla bla..."  

To activate multitenancy mode in Mikro-orm, set and configure the migrations.multitenancy config option (see details below).

In multitenancy mode, every entity set with Entity({schema: '*'}) is considered part of a tenant schema.

In our example, each tenant is composed of two entities: Photo and Author.
This means that we need to set those two entities with the {schema: '*'} option:

// Photo.ts
Entity({schema: '*'})
export class Photo {
  // ...
}

// Author.ts
Entity({schema: '*'})
export class Author {
  // ...
}

The Tenant entity is just set as normal, residing in the public/default schema or a specific schema of your choice:

// Tenants.ts
Entity() // default schema
export class Tenant {
  // ...
}

The migrations.multitenancy option is configurable according to the following signature:

multitenancy?: {
  // Tenants callback
  tenants: (driver: DatabaseDriver<Connection>, ctx: Transaction) => Promise<string[]>;
  // Name of the Tenant entity residing in the default/public schema
  // (default value: "Tenant")
  tenantEntity?: string;
  // Path of the folder in which the TenantHelper class will be generated
  // (default value: Mikro-orm baseDir config)
  // tenantHelperPath or tenantHelperPathTs is choosen depending on how you set migrations.emit
  tenantHelperPath?: string;
  tenantHelperPathTs?: string;
  // Custom class name you can set for the TenantHelper class
  // (default value: "TenantHelper")
  tenantHelperClassName?: string;
  // Custom TenantHelper generator
  tenantHelperGenerator?: Constructor<ITenantHelperGenerator>;
};

The tenants option is mandatory. It is a callback function that you need to implement in order to provide Mikro-orm the list of tenant schemas to migrate.

What happens is: when a migration is executed (using migration:up or migration:down commands), Mikro-orm is basically now asking 'please give me the list of tenant schemas that I need to migrate', providing you a callback from which you need to return that list of tenants (as an array of string).

Example:

const mikroOrmConfig: Options = {
  // ...
  migrations: {
    // ...
    multitenancy: {
      tenants: async (driver: AbstractSqlDriver<AbstractSqlConnection>, ctx: Transaction) =>
        // retrieving the list of tenant schemas.
        // would return ['mytenant1', 'mytenant2'] in our example
        (await driver.createEntityManager().find('Tenant', {}, { ctx: ctx })).map((t:Tenant) => t.Id)
    }
  }
}

In multitenancy mode, when creating a migration (using migration:create command), Mikro-orm now generates the proper migration file in order to dynamically migrate the tenants:

async up(): Promise<void> {
  // Migrating "static" schemas (public schemas and others) ...
  this.addSql(`alter table "Tenant" add column "active" boolean not null default true;`);
  this.addSql(`alter table "mySchema"."MyTable" add column "myColumn" varchar(255) not null;`);
  // ...

  // Migrating tenant schemas (dynamic schemas) ...
  const tenants = await this.options.multitenancy.tenants(this.driver, this.ctx);
  tenants.forEach((tenant) => {
      this.addSql(`alter table "${tenant}"."Photo" add column "description" varchar(255) not null;`);
      this.addSql(`alter table "${tenant}"."Author" add column "bio" varchar(255) not null;`);
      // ...
  });
}

Alongside with the migration file, Mikro-orm also now generates a TenantHelper class. You want to use this helper class in your application codebase whenever your need to create or delete a Tenant:

export class TenantHelper {
  public static GetCreateSql(tenant: string): string[] {
    return [
      `create schema if not exists "${tenant}";`,
      `create table "${tenant}"."Author" ("id" serial primary key, "name" varchar(255) not null);`,
      `create table "${tenant}"."Photo" ("id" serial primary key, "url" varchar(255) not null, "author_id" int not null);`,
      `alter table "${tenant}"."Photo" add constraint "Photo_author_id_foreign" foreign key ("author_id") references "${tenant}"."Author" ("id");`,
    ];
  }

  public static GetDropSql(tenant: string): string[] {
    return [
      `drop table if exists "${tenant}"."Photo" cascade;`,
      `drop table if exists "${tenant}"."Author" cascade;`,
    ];
  }
}

Prerequisite for the new multitenancy mode to work:

  • Mikro-orm type config option must be set to postgresql
  • Mikro-orm migrations.snapshot config options must be enabled

Notes about this PR:

  • This new feature is non-breaking. Mikro-orm keeps working as before, and it is only by activating multitenancy option that we introduce a new behaviour.

  • Feel free to use the above description as the documentation for this new feature.

  • Main idea under the hood is as follow: when creating a new migration under multitenancy mode, Mikro-orm clones the entities metadata and dynamically replace all the * schemas with a tag ${tenant}: by replacing the * schema with a proper value, Mikro-orm is now outputting those tagged-schemas-entities in the generated SQL instructions. Mikro-orm can then properly re-arrange those generated SQL instructions to create a migration file dealing with dynamic multitenant schemas.

  • At the moment, the chosen solution to clone metadatas is to re-discover them from scratch (see cloneMetadataForMultinenancy function in AbstractSchemaGenerator.ts). It might be more efficient/cleaner to copy/clone the metadata storage object graph in-memory instead of re-triggering the discovery process.

  • This implementation has been thoroughly tested locally with a postgresql database. I did not have time/resource to write proper test and submit them along this PR. I'd thus be glad if someone can take this work further by writing tests, if needed.

  • Related to Support for Dynamic Schema Migration #3319 PostgreSQL schemas for multitenancy #2074 feat(core): add support for multiple schemas (including UoW) #2296, and all other discussions about multitenancy support in Mikro-orm.

Copy link
Member

@B4nan B4nan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, thanks, but I would appreciate some discussion upfront before sending such a PR :] Note that (as you already found) there are existing issues, all of them are closed, for a reason - I dont see this as something that should be part of the ORM, not without a clear specification, and all the currently suggested solutions were too incomplete.

I will try to review this closer when I have some spare time, but in general:

Feel free to use the above description as the documentation for this new feature.

If you want to see this merged, you need to provide docs. At least the basic usage.

This implementation has been thoroughly tested locally with a postgresql database. I did not have time/resource to write proper test and submit them along this PR. I'd thus be glad if someone can take this work further by writing tests, if needed.

Again, no plans on merging anything without proper test coverage.

At the moment, the chosen solution to clone metadatas is to re-discover them from scratch (see cloneMetadataForMultinenancy function in AbstractSchemaGenerator.ts). It might be more efficient/cleaner to copy/clone the metadata storage object graph in-memory instead of re-triggering the discovery process.

Indeed, and more imporantly there is no reason for changing the sync APIs to async. I am sure we can find a better way, with a single discovery, and ideally even without any metadata copy.

The configuration seems very complex for such a use case, no comments, no docs, so not sure what it means (e.g. the tenant helper and tenant entity). Please comment on what those are for.

Do I get it right that this is migrations only PR, and during runtime it leverages the wildcard entities?

edit:

  • why does the schema snapshot needs to be enabled?
  • why do we even need to generate something like tenant helper? this should be all dynamic, we dont need codegen for such

Object.values(metadataStorage.getAll()).forEach(m => m.schema === '*' ? m.schema = '${tenant}': m.schema);
};

await cloneMetadataForMultinenancy();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not make much sense, discovery is costly, you dont need to do it again just to generate a copy of the metadata

Copy link
Author

@nicomouss nicomouss Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well again please see my note on the PR description: I am definitely aware that this is not the right way to do this and that this is not making sense. I tried to generate a copy of the MetadataStorage object using the clone tools available in the codebase, but this end up throwing some circular dependency exceptions. There is very few comments in the codebase, so it's difficult to understand the structure of the MetadataStorage tree (why/when stuff are self-referencing other, why/when is there a reference to a root entity, and so on...), hence it's difficult to implement a proper clone function for someone landing in the codebase for the first time.

Hence the solution to get a clone for the time being: re-discovering the metadatas.

@B4nan B4nan changed the title feat(core): implemented multitenancy migrations (dynamic schemas migrations) feat(migrations): implemented multitenancy (wildcard) migrations Sep 22, 2022
@B4nan B4nan marked this pull request as draft September 22, 2022 07:34
@nicomouss
Copy link
Author

nicomouss commented Sep 22, 2022

Uh, thanks, but I would appreciate some discussion upfront before sending such a PR :] Note that (as you already found) there are existing issues, all of them are closed, for a reason

Sure, I have definitely read the discussions about the different issues and understood that they were closed for a reason. That's indeed the very reason of why I started working on this PR.

What I intended here is to provide something implemented based on the existing issues and reflections. This in order to trigger some more discussions from the community, and potentially start to suggest some solutions to those things that are incomplete.

Multitenancy is a complex subject in itself. From personal experience working on such architectures, most of the time discussing about it at a theoretical level would not lead to something tangible (meaning something implemented). Working from a draft PR (or prototype, or whatever you call it) is at least something tangible from which discussions can eventually be started, and from which new ideas can eventually pop up.

These were my initial intentions with this PR.

I dont see this as something that should be part of the ORM

See, people submitting PR are genuinely doing so because they value your product and want it to improve. So again no offense here, but just rejecting an idea out of the blue with no proper argument is not making any justice to the awesomeness of your library.

Many here are coming from TypeOrm because its initiator has been just denying mostly all suggestions, if those are not coming from him. That's a shame because it is an interesting project which could gain interest if only community efforts where more valued.

Anyway, for the community here interested in multitenancy, I would be very surprised that Mikro-orm offering multitenancy support be seen as something not having to be part of the ORM.

Multitenancy is IMHO another strong selling point for Mikro-orm adoption.

I will try to review this closer when I have some spare time, but in general:

Feel free to use the above description as the documentation for this new feature.

If you want to see this merged, you need to provide docs. At least the basic usage.

I would be happy to do so, could you direct me towards where/how I need to write such documentation?

This implementation has been thoroughly tested locally with a postgresql database. I did not have time/resource to write proper test and submit them along this PR. I'd thus be glad if someone can take this work further by writing tests, if needed.

Again, no plans on merging anything without proper test coverage.

For what I have read in your docs (please correct me if I am wrong), I need to install docker on my machine to fully execute tests, which I can't unfortunately (don't have the required 100gb space, and time/resource to upgrade my machine at the moment).
What are my other options other than installing all the database servers one by one (Mysql, Postgres...)?

Could you also direct me towards a guide on how tests needs to be written within Mikro-orm codebase? Whenever I have some time I will do so.

At the moment, the chosen solution to clone metadatas is to re-discover them from scratch (see cloneMetadataForMultinenancy function in AbstractSchemaGenerator.ts). It might be more efficient/cleaner to copy/clone the metadata storage object graph in-memory instead of re-triggering the discovery process.

Indeed, and more imporantly there is no reason for changing the sync APIs to async. I am sure we can find a better way, with a single discovery, and ideally even without any metadata copy.

Yes, that was not my initial intention. I tried using the clone utility provided in the codebase, but it triggers an exception as the metadata graph has some recursivity in it. So to clone the metadatas, I reverted to the solution of re-discovering the metadatas from scratch. Re-discovering the metadatas being async, I had thus to make things async... Not a personal design choice at all.

The configuration seems very complex for such a use case, no comments, no docs, so not sure what it means (e.g. the tenant helper and tenant entity). Please comment on what those are for.

Does this 'seems' complex, or is it definitely complex based on some proper arguments?

Default config is just a callback function that returns a list of strings, being the list of database schemas to migrate. I know complexity is all but relative, but I guess that it is pretty basic stuff.

multitenancy: {
      tenants: async (driver: AbstractSqlDriver<AbstractSqlConnection>, ctx: Transaction) =>
        // retrieving the list of tenant schemas.
        // would return ['mytenant1', 'mytenant2'] in our example
        (await driver.createEntityManager().find('Tenant', {}, { ctx: ctx })).map((t:Tenant) => t.Id)
}

As for Tenant entity: at the Db level in a multitenant architecture, you typically have a Tenant table in the public schemas that is used to identify all your tenants.
So here in Mikro-orm context, if I have to deal with a Tenant table at the db level, I need to define a matching entity in my codebase.
Back to why this is needed in the config: In order to properly generate dynamic migration files, the current implementation needs to know what is the name of this entity.

Do I get it right that this is migrations only PR, and during runtime it leverages the wildcard entities?

Yes. This PR does not alter runtime behaviour of Mikro-orm. Only migrations when executing migration:create.

edit:

  • why does the schema snapshot needs to be enabled?

Because the snapshot file holds the entities with the tenant schema tag (replacing the wilcard).
That enables to get the proper schema diffs regarding entities set with wildcard schemas.
If snapshot is disabled, there is no way to get those schema diffs for wildcard entities.

  • why do we even need to generate something like tenant helper? this should be all dynamic, we dont need codegen for such

Yes this is indeed not theoretically needed, we can dynamically use Mikro-orm API `generator.getCreateSchemaSQL()' (passing in the schema as an option parameter) to retrieve that SQL code.

Thanks for pointing that out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants