Index on Join Table of Many to Many Association

buchholz · March 2, 2023, 10:19am

We thought about improving the performance of our application by introducing another index on the join table of a many to many association.

Is there any possibility to add this index so that Jmix creates a Liquibase script for that?

As my understanding of indexes might have some gaps, correct me if I’m wrong.
The table has a index for the primary key which is composed from ids of both entities. With this the DB should find the associations in one direction fast. For the other direction we need another index, right?

As an alternative, do we have the possibility to switch the order of the columns in the join table? We are usually searching the associations from entity B, but the order in the join table is A, B.

krivopustov · March 3, 2023, 10:57am

I would suggest analyzing execution plans for real SQL queries run by your application before adding any indexes.

The next question is where to define indexes if they are actually needed. JPA allows for defining join table indexes in @JoinTable(..., indexes = {...}) annotation, but Studio now ignores them when generating changelogs. I’ve created an issue for an improvement. So currently you can define them manually in your changelog.

In theory, based on the fact that there is always an index for the primary key on both ID_1 and ID_2, it should be enough for all use cases to create a separate index on ID_2. I’m not a DB expert though, and I would be glad to hear other opinions.

Regards,
Konstantin

buchholz · March 8, 2023, 12:30pm

Hi @krivopustov ,

thanks for your support. I checked the execution plans and were able to add an index in a test environment. Now we have to check how the application performs with and without different indexes.

mheyer · June 27, 2023, 8:41am

Hi @krivopustov,

those indexes on @JoinTable are quite important. I see astronomical performance differences with medium sized tables (~100k entities) on Postgres. Currently I have to add them manually, and if something changes or I forget to add the index, the app will crash or degrade.

I would be glad to see this feature added to studio. I hope it is not a big thing.

Thank you very much.

Best regards,

Martin

PS: Is there a login to youtrack.jmix.io, so we can add context on the issue?

krivopustov · June 27, 2023, 11:41am

Can you confirm my assumption from your experience?

Unfortunately no, we have licenses only for internal users.
Let’s discuss the problem here and I will update the issue.

Regards,
Konstantin

mheyer · June 27, 2023, 11:54am

Hi Konstantin,

let’s see where the issue originates from. For a moment you had me worried about that primary key, but it seems to be interesting. The context would be postgres, as this is where I run tests.

The table has a multicolumn primary key: Index “indexFoo” PRIMARY KEY, btree(col1,col2).
EXPLAIN SELECT … WHERE col1 = ‘foo’ … yields an index scan, which is expected.
EXPLAIN SELECT … WHERE col2 = ‘bar’ … yields a sequential scan, which is bad.

The behaviour is documented here: PostgreSQL: Documentation: 15: 11.3. Multicolumn Indexes

As far as I understand, a btree(col1, col2) does not really help looking up col2 alone. I do not understand why Postgres would scan the table instead of an index scan. Perhaps this is some internal decision. So the index on the pkey would help queries on the primary-key but not parts of it. Unfortunately those join tables would always be queried partially…

Following my limited understanding I would create individual indexes for every column which is to be queried individually, except col1 which is covered by the pkey index.

Best regards,

Martin

PS: We would have no problem using the join table in forward mode, just filtering on col1. As soon as we use it in reverse, it gets very slow.

PS2: I should read your text more precisely. So the conclusion is that you are correct. We only need an index on col2.