Using multiple databases in jmix as multi-tenant approach

ghircoiascamil · March 28, 2024, 10:24am

Hi,

I am interested to any solution that allows me to use multidatabases in jmix app for a multitenancy approach.

I am a little familiar with jmix and build some relatively small apps with it and i am realy interested in migrating my erp solution that i have on postgresql database and legacy vfp to jmix. Due to more factors we decided that for us the only viable solution is with multidatabase and not schemas or tennant_id fields inside one database as we need scalling and have to deal with our existing data.

From a user perspective the user will enter it’s credentials and choose a database on main page and then app will connect to that database. When user wants to connect to another one it will chose another database and so on.

It is possible for jmix app to connect to a master database on that cluster and will store users and internal tables that it needs as normaly but then app will somehow have to deal with the database user choosed and all the entities will map to that database. Even it will not function some jmix stuff i am interested if that approach could work and some feedback if you have how it can be done with some examples so I have to start with.

Thanks.
Camil Ghircoias

krivopustov · March 28, 2024, 12:13pm

Hi Camil,

Yes I think it’s possible.

In this case the Main Data Store will be shared between tenants and contain the shared list of users. An additional data store with “application level” tables will be accessed through a routing DataSource which will return connections to tenant databases depending on a user session attribute.

The tenant databases will have to be of identical structure, mapped to the single set of entities in the application.

Regards,
Konstantin

ghircoiascamil · March 28, 2024, 12:39pm

Yes, the tenant databases will be identical as structure.

If it is possible can y provide an example of routing procedure ?
Suppose the main database is a postgresql one called test on localhost where user will reside and jmix will see as the main one and other one will be tenant1 with customers table.

I think it will be good for others too to have an working example when dealing with this approach.

mbucan · March 28, 2024, 12:44pm

Hello,
I would like to see that example too. I’d like to build a multitenant application myself, with good separation.
I found this Multi-Tenant Apps & Postgres That Scales Out | Citus Data
Have not tried it yet.

Kind regards,
Mladen

ghircoiascamil · March 28, 2024, 1:01pm

Citus is something else we need from jmix pov how to deal with separate databases. This thing it is the only that blocks me now from starting my project.

info-expert · March 28, 2024, 1:07pm

It would be great to have a working example.
Tudor

krivopustov · April 2, 2024, 8:53am

Regarding the example, let’s clarify some requirements.

How do you see the user management? Who will create users and assign roles?
Who and at what moment will create tenants and their databases, including the database schema?

Regards,
Konstantin

ghircoiascamil · April 2, 2024, 3:09pm

I see rhings like this:
There should be an app for logging users and user management like a standard jmix app. This app connects to a single database called let’s say main database where all the users will be saved.
This app will allow creating users and tenant databases if a user (an admin one let’s put it that way) has the right to do. This will work as normal jmix app where a user can or can’t have rights on specific forms. The create database form can be a standard Jmix form that prompts for database name and after will do some specific sql that creates the database for tenant no.1. This database let’s name it tenant1.
Now let’s deal with choosing a specific tenant database and switch to it. The user is still on main app and will have to enter on another form where will chose from a list of available databases from a combo. This list of databases will be records in a specific jmix entity stored in main database as a normal entity.
Here it comes the problem. After user will chose database tenant1 it will click on a button connect let’s say. And it will launch another page/app (here we need the solution) that will work with this second database. We still need to access the list of users from main database. I don’t know or deal with a replication scenario from main database users to tenant1 database.
In a winform classic client server scenario the users and rights are accesed from main database and other data is on tenant1,2 3 database.
Thanks for your disponibility.

krivopustov · April 3, 2024, 7:23am

Thanks for the explanation.

It seems to me that the option with the shared Main Data Store and tenant-specific Additional Data Store would satisfy your requirements. And I think it’s a good idea that the app should create tenant databases when they are registered, and run their Liquibase changelogs right after creation. What do you think about updating schemas for newer application version? Perhaps the app should do it after start - to go through the list of registered databases and run Liquibase for them.

Could you explain more about your business case? Why any user can select a database to work with? This doesn’t look like a typical multi-tenant application where users and data are isolated in their spaces.

mbucan · April 3, 2024, 8:38am

Hello,
regarding the requirements, I would like two examples, let’s call them “simple example” and “bpm example”.

Simple example:

lets say we have one Jmix 2.2.x application using PostgreSQL database, that is being used by several customers (organizations from CUBA) and they have their data stored into separate tenant-specific databases, to satisfy security requirements compliance - users are not asked to choose their database as we know which tenant they belong to
user data is stored in the Main, admin user can add new tenant organizations (TenantAdmin role), and also add tenant manager (TenantManager) who can additionally manage the users from his tenant only
tenant databases have the same structure
bonus: tenant databases can have a common structure and additionally different tables not present in other, and this differential to not to be interfered with by liquibase
liquibase updates can be done upon application restart, using the list of registered databases

BPM example
BPM addon has access to all of the underlying data as it contains elements that can execute SQL or scripts, so if the user has access to those features, he has the access too. Awesome Apache Superset integration that is coming may have the same security problem.
Therefore, would it be possible to have BPM and Flowable tables stored also in the tenant specific database? Or maybe we should solve this with external BPM engines as you describe in the Jmix roadmap?

Kind regards,
Mladen

ghircoiascamil · April 3, 2024, 8:57am

Hi and thanks for your answer,

For your first question about updating specific databases.
Now I have a flag into a specific table on each database and when app starts checks for that specific flag and do some specific sql updatates if necessary so the database for tenant1,2,3 is up to date.
After the update is done with success the flag is put on false and that’s it, next time it will not checking anything.

Now about my business logic.

I have an erp with accounting, salary, sales, etc.
This erp it is used in different scenarious:

One is by accounting companies that have in custody from one company to deal with or ten’s of them or even more then a hundred in their portofolio.

Another scenarion is with companies that are not accounting but use the erp for their business processes. Some of them have also one or 2,3,4 companies that are own by same person/entity and have all the accounting in their location for all their companies. Some of these databases can be very big.

Another scenario is with one company that have different locations where they deal with erp and for some of them we have made separate servers and databases and instalations with replica scenarious due to their specific demands.
For example they do not want to depend on internet if something happens on their main locations so we have separate servers for their locations. Also for some of them this was our decision to have all the data in small databases for every location due to scalability, speed or other reasons.

So for all these scenarious we have separate tenant databases with same structure.
Having a single database with tenant_id column can work for some scenarious but not for all situations. There are pro and cons and for us cons where bigger then pros.

One reason is for some databases the grow very big and having separate databases is more scalable and easy to manage. If I need to backup one specific database it is much faster to have one small backup then to have a big bulk backup with all messed together that take long time.
Sometimes you need to bring customer data to your server to have some specific tests or dealing with bugs.

Also with one very big database all the queries, indexes are big and will make app slower with the passing of time.

Regards,
Camil Ghircoias

krivopustov · April 8, 2024, 6:36am

Hi everyone,

Thank you for your input.

I’ve prepared a minimalistic example following mostly the “simple” requirements from Mladen’s post above:

Please take a look and share your thoughts on what should be improved and whether this approach can be used in real life applications.

Regards,
Konstantin

ghircoiascamil · April 8, 2024, 1:33pm

It looks promising, we need some time to test it and keep you up to date.

ghircoiascamil · April 8, 2024, 1:51pm

Can y provide what is wrong when running the example

Execute the Recreate action for Main Data Store - Done, no errors, database is created
But on tenant data store - there is no recreate option in context menu on tenant data store.
So we got this error:
Liquibase failed to start because no changelog could be found at ‘classpath:/db/changelog/db.changelog-master.yaml’.

krivopustov · April 9, 2024, 6:20am

Try to switch the DB Schema Management mode to Create and Update:

ghircoiascamil · April 9, 2024, 7:50am

As far as I see it works and it is a good point to start with.

About logical structure you provide, in my opinion one user should have rights for more than one tennant database so it will be better a mapping structure one to many not one user to one tennant database.

In this case the user connection form should have also database name prompted after user and password textbox. And the database to connect to should be from a list of databases tennants the user is mapped to.

Browser will keep session to the user and may be it will be possible to have multiple pages opened in the same browser, one for tenant 1 and one for tennant 2 as it will be same user and not different ones as with one user to one tennant approach ? What is your opinion ?

Thank you,
Camil Ghircoias

krivopustov · April 11, 2024, 10:27am

If you associate a user with multiple tenants, it immediately raises a problem of detecting what tenant is chosen for the user in the current session. So the DataSourceRepository should get the tenant not directly from the user, but from the session.

I will modify the example in such a way that the tenant will be stored in the session upon login, and DataSourceRepository will get it from there.

Regards,
Konstantin

ghircoiascamil · April 11, 2024, 11:55am

Thanks.

krivopustov · April 14, 2024, 6:17am

I’ve updated jmix-projects/multidb-multitenancy-sample to set current tenant in session.

Please take a look and share your thoughts.

ghircoiascamil · April 17, 2024, 9:17am

Hello,

I’ve looked at the example and I understand the logic, it works with this method setCurrentTenantInSession() so it it can be done a solution with one user login and multiple tenant databases.
We’ll give some more feedback when we’ll implement in a more complex solution, hope it will do it’s job.
I think this example should be in jmix documentation as it will help others also.
Thank you.