Nextcloud already has support for using S3 services as a storage backend out of the box. However, in my personal experience, it is slow, buggy, and not very efficient - for example, when moving a file, Nextcloud downloads and re-uploads the entire object instead of using the proper S3 API calls. Videos take forever to buffer and the absence of any caching creates a ton of unnecessary traffic if several people try to download the same file at once.
Enter S3QL - this Python app turns an S3 bucket into a mounted filesystem that can be used like any traditional drive. Files placed on an S3QL filesystem are split into "blocks" of up to 10 MB before being uploaded to S3.
It has:
- caching - recently accessed data is cached locally so file accesses don't need to hit whatever S3 service is being used
- encryption - data uploaded to S3 can be encrypted with a passphrase
- compression - data can be compressed before being uploaded to save space
- de-duplication - copies of the same data are only stored once, which also helps save space
- threaded downloads - data blocks are fetched from S3 in parallel, making downloads a lot faster
- scalability - the virtual filesystem never runs out of space and scales (as far as your wallet does)
S3QL also stores metadata locally in an SQLite database, so operations such as moving files can be done without needing to connect to S3 at all.
This combines the performance of a local, POSIX-compliant filesystem with the low cost of S3 storage.
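To make the block and de-duplication idea concrete, here is a small sketch of the general technique: split a file into fixed-size blocks, hash each one, and count how many distinct blocks would actually need storing. This is only an illustration and not S3QL's own code. The function name count_blocks is made up, the 10 MiB default is taken from the 10MB figure above, and GNU coreutils (split, sha256sum) are assumed.

```shell
# Illustration only: NOT S3QL's implementation, just fixed-size blocks
# plus content hashing. count_blocks is a hypothetical helper name.

# count_blocks FILE [BLOCK_BYTES]
# Prints "<total> <unique>": how many blocks FILE splits into, and how many
# of those have distinct contents (identical blocks only need storing once).
count_blocks() {
    local file="$1" block_bytes="${2:-10485760}"   # assumed default: 10 MiB
    local tmp total unique
    tmp=$(mktemp -d) || return 1

    # Cut the file into pieces of at most block_bytes bytes each.
    split -b "$block_bytes" "$file" "$tmp/blk_"

    # Count all pieces, then count distinct SHA-256 digests among them.
    total=$(find "$tmp" -type f -name 'blk_*' | wc -l)
    unique=$(find "$tmp" -type f -name 'blk_*' -exec sha256sum {} + \
        | awk '{print $1}' | sort -u | wc -l)

    rm -rf "$tmp"
    echo "$((total)) $((unique))"
}
```

Running it against a large file with repeated content (a disk image, say) gives a rough feel for how much de-duplication could save, before compression is even considered.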
Setup
I won't go into too much detail on how to get S3QL itself installed and set up - all of that is covered in depth at https://www.rath.org/s3ql-docs/.
However, I went through the steps below to get my Nextcloud installation migrated over to an S3QL filesystem:
This guide assumes you have mounted the S3QL filesystem in a folder called /nextcloud/s3ql. Substitute this if your filesystem is mounted elsewhere.
Step 1.
Put Nextcloud into maintenance mode:
occ maintenance:mode --on
Step 2.
Go into your Nextcloud data directory (located in the docroot by default) and identify the appdata and updater folders. These are named appdata_<random_id> and updater_<random_id>.
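If you would rather not hunt for the folder names by eye, something like the following should list them. It is a sketch that assumes the folders sit directly inside the data directory and follow the appdata_ and updater_ naming described above. The function name find_instance_dirs is my own.

```shell
# find_instance_dirs DATA_DIR   (hypothetical helper name)
# Prints the appdata_* and updater_* folders found directly inside DATA_DIR,
# one per line, without descending into user folders.
find_instance_dirs() {
    local data_dir="$1"
    find "$data_dir" -maxdepth 1 -type d \
        \( -name 'appdata_*' -o -name 'updater_*' \) | sort
}

# Usage, run from the document root:
#   find_instance_dirs data
```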
Step 3.
Copy the two folders above to the /nextcloud/ directory. These folders contain cached scripts and data served to the client by Nextcloud, so they should not be placed into the S3QL folder for performance reasons.
Step 4.
Delete the appdata and updater folders from the Nextcloud data directory.
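Steps 3 and 4 can be combined so that nothing is deleted unless the copy really matches the original. The sketch below is one cautious way to do that, not the only one. relocate_dir is a made-up name, and it assumes GNU cp and diff and that you pass absolute paths.

```shell
# relocate_dir SRC DEST_PARENT   (hypothetical helper name)
# Copies SRC into DEST_PARENT, keeping ownership, permissions and timestamps,
# and deletes SRC only if a recursive comparison finds no differences.
relocate_dir() {
    local src="${1%/}" dest_parent="${2%/}"
    local dest
    dest="$dest_parent/$(basename "$src")"

    cp -a "$src" "$dest_parent/" || return 1

    # Keep the original if the copy is not identical.
    if ! diff -r -q "$src" "$dest" >/dev/null; then
        echo "copy of $src differs from the original, keeping it" >&2
        return 1
    fi

    rm -rf "$src"
}

# Usage, with the folder names found in step 2:
#   relocate_dir /path/to/data/appdata_<random_id> /nextcloud
#   relocate_dir /path/to/data/updater_<random_id> /nextcloud
```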
Step 5.
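Go back to the document root and copy the data directory to S3QL. Using -a instead of -r preserves timestamps and permissions, so files are less likely to show up as modified afterwards:
cp -a data /nextcloud/s3ql/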
Note: This can take a long time, depending on your network connection and how many files you have stored.
Step 6.
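Bind-mount the appdata and updater folders back into the data directory, which now lives at /nextcloud/s3ql/data. The mount points have to exist first, so create them as empty folders:
mkdir -p /nextcloud/s3ql/data/appdata_<random_id> /nextcloud/s3ql/data/updater_<random_id>
mount --bind /nextcloud/appdata_<random_id> /nextcloud/s3ql/data/appdata_<random_id>
mount --bind /nextcloud/updater_<random_id> /nextcloud/s3ql/data/updater_<random_id>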
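Bind mounts made this way do not survive a reboot, and they only make sense once the S3QL filesystem itself is mounted. Before bringing Nextcloud back up, it is worth checking that everything is actually in place. Here is a small sketch using the mountpoint tool from util-linux. The function name unmounted_dirs is my own.

```shell
# unmounted_dirs DIR...   (hypothetical helper name)
# Prints every DIR that is NOT currently a mount point.
# Returns 0 if all of them are mounted, 1 otherwise.
unmounted_dirs() {
    local dir missing=0
    for dir in "$@"; do
        if ! mountpoint -q "$dir"; then
            echo "$dir"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: pass the S3QL mount point plus the two bind-mount targets from
# this step, and only continue if nothing is printed.
```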
Step 7.
Ensure that the web server user can access the data directory:
sudo chown -R www-data:www-data /nextcloud/s3ql/data
Step 8.
Update Nextcloud's config.php to point to the relocated directory:
'datadirectory' => '/nextcloud/s3ql/data',
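To confirm Nextcloud picked up the change, you can read the value back through occ rather than trusting the edit. This sketch assumes occ is callable the same way as in the other steps. check_datadir is a made-up name.

```shell
# check_datadir EXPECTED_PATH   (hypothetical helper name)
# Succeeds only if Nextcloud reports EXPECTED_PATH as its data directory.
# Trailing slashes are ignored on both sides.
check_datadir() {
    local expected="${1%/}" actual
    actual=$(occ config:system:get datadirectory) || return 1
    [ "${actual%/}" = "$expected" ]
}

# Usage:
#   check_datadir /nextcloud/s3ql/data && echo "datadirectory looks right"
```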
Step 9.
Finally, take Nextcloud out of maintenance mode and trigger a full file scan:
occ maintenance:mode --off
occ files:scan --all
Hopefully everything goes well and your installation picks up exactly where it left off!
Notes
- When setting up S3QL, allocate as much cache as you can. This stores as many files as possible locally, helping to reduce latency.
- You might want to choose an S3 provider that does not charge for API calls. While S3QL is quite efficient bandwidth-wise, it can make several calls per file download/upload because of its chunking system.
- If you have multiple people on your server, ensure that storage quotas are set up to avoid overage fees. S3 prices can get nasty once you exceed your plan's limits.
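A few helpers that go with the notes above, as a rough sketch. As far as I know, mount.s3ql takes its cache size through --cachesize in KiB, and Nextcloud quotas can be set with occ user:setting. The function names are my own, and the 10 MiB block size is an assumption based on the figure quoted earlier. The request estimate is only a lower bound (at least one call per block) and ignores de-duplication and metadata traffic.

```shell
# gib_to_kib GIB   (hypothetical helper name)
# Converts whole GiB to KiB, the unit mount.s3ql expects for --cachesize.
gib_to_kib() {
    echo $(( $1 * 1024 * 1024 ))
}

# blocks_for_size SIZE_BYTES [BLOCK_BYTES]   (hypothetical helper name)
# Rough lower bound on S3 requests for one file: one per block, rounded up.
blocks_for_size() {
    local size="$1" block="${2:-10485760}"   # assumed block size: 10 MiB
    echo $(( (size + block - 1) / block ))
}

# set_quota USER QUOTA   (hypothetical helper name)
# Sets a per-user storage quota, e.g. set_quota alice "50 GB".
set_quota() {
    occ user:setting "$1" files quota "$2"
}

# Usage:
#   mount.s3ql --cachesize "$(gib_to_kib 20)" <storage-url> /nextcloud/s3ql
#   blocks_for_size 1073741824     # requests for a 1 GiB video
#   set_quota alice "50 GB"
```

Multiplying the block count by your provider's per-request price gives a quick sanity check on whether API charges matter for your usage.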