I'm having a major memory issue when creating a large number of FileDescriptor objects

modemmisuser · November 4, 2022, 2:44pm

In my migrated CUBA application, I have data importers that import data (via CSV files) from our legacy system that the CUBA/Jmix system is replacing. They all work flawlessly just as they did in CUBA.

All except one. The one that imports patient notes and their attached documents seems to have a very serious memory issue when it gets to the portion of the data that has attached documents. (This is just a test load, not all of the notes have documents.)

The code that reads in the file to the FileDescriptor is very simple:

    try (FileInputStream inputStream = new FileInputStream(documentFile)) {
        byte[] fileBytes = new byte[inputStream.available()];
        inputStream.read(fileBytes);
        fileStorageAPI.saveFile(fileDescriptor, fileBytes);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException | FileStorageException e) {
        throw new RuntimeException(e);
    }

documentFile is just a java.io.File object representing a file that has been verified to exist on the server and to be readable.

When the importer gets to the portion of the data that does have attached docs, memory usage starts climbing very quickly.

Using fileLoader.saveStream(fileDescriptor, () -> inputStream); instead of the byte array method produces the same memory issues.

What am I missing here?

modemmisuser · November 7, 2022, 1:39pm

Plot twist!

This does not happen on my Windows development machine running the exact same data import!

It only happens when running in a container on my Ubuntu test server, which is how the app will be deployed once it’s completed. Same JDK version (Adoptium 17).

Ideas?

krivopustov · November 8, 2022, 6:59pm

CUBA’s FileStorageAPI interface has a saveStream(FileDescriptor, InputStream) method. Use it to avoid loading whole files into memory.
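
As a rough illustration of that suggestion, here is a minimal sketch of handing a stream to the storage layer instead of building a byte array first. It is not CUBA code: StreamSink is a hypothetical stand-in for a call shaped like saveStream(FileDescriptor, InputStream), and the class and method names are invented for the example. The only point is that the copy goes through a fixed 8 KiB buffer, so heap use should not grow with the size of the file.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamingSaveSketch {

    /** Hypothetical stand-in for a storage call that accepts a stream. */
    interface StreamSink {
        void saveStream(String name, InputStream in) throws IOException;
    }

    /** Copies the stream in 8 KiB chunks and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    /** Opens the file as a stream and passes it on without reading it into an array. */
    static void importDocument(Path documentFile, StreamSink sink) throws IOException {
        try (InputStream in = Files.newInputStream(documentFile)) {
            sink.saveStream(documentFile.getFileName().toString(), in);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("note-attachment", ".bin");
        Files.write(tmp, new byte[100_000]);

        // Demo target only; a real sink would write to disk or to the file storage.
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        importDocument(tmp, (name, in) -> copy(in, target));

        System.out.println("bytes copied: " + target.size());
        Files.delete(tmp);
    }
}
```

Whether this helps depends on what the real storage implementation does with the stream internally, which the sketch does not model.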

modemmisuser · November 8, 2022, 8:37pm

Unfortunately, as I’d mentioned in the original post,

… which I believe calls the same internal logic as FileStorageAPI::saveStream()

modemmisuser · November 9, 2022, 12:04pm

Yup, confirmed, in FileLoaderImpl.java,

    @Override
    public void saveStream(FileDescriptor fd, Supplier<InputStream> inputStreamSupplier) throws FileStorageException {
        fileStorageAPI.saveStream(fd, inputStreamSupplier.get());
    }

It just calls fileStorageAPI.

krivopustov · November 9, 2022, 6:34pm

Right, it’s the same.
I would start by investigating a heap dump to find out for sure what is eating the memory.
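
One possible way to capture such a dump from inside the running JVM is the HotSpotDiagnosticMXBean that ships with HotSpot-based JDKs. The sketch below is only an illustration: the class name, the dumpHeap helper, and the file naming are assumptions, not part of the application. The output file must end in .hprof and must not already exist.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumpSketch {

    /** Writes a heap dump of live objects into the given directory and returns its path. */
    static Path dumpHeap(Path dir) throws IOException {
        // Hypothetical naming scheme; the timestamp keeps the file name unique.
        Path target = dir.resolve("import-" + System.currentTimeMillis() + ".hprof");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // true = dump only objects that are still reachable.
        bean.dumpHeap(target.toString(), true);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdumps");
        Path dump = dumpHeap(dir);
        System.out.println("heap dump written to " + dump
                + " (" + Files.size(dump) + " bytes)");
    }
}
```

The resulting file can be opened in a heap analyzer such as Eclipse MAT or VisualVM. Alternatives that need no code change are starting the JVM with -XX:+HeapDumpOnOutOfMemoryError or running jcmd with GC.heap_dump against the process in the container. Note that a heap dump only shows Java heap contents, so if the heap looks small while the container's memory keeps growing, the usage is likely elsewhere.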