Normalize filenames based on new storage filename format

Lars Kruse requested to merge lars/grouprise:normalize-filenames into main

Previously, uploading a file with the same name multiple times created a new File or Image instance each time, but the physical file on disk was simply overwritten.

The normalization does the following:

  • all File and Image objects whose filename does not follow the new filename schema (non-nested, 16 characters) are processed
  • the old file is renamed (moved) to its new location
  • the file is copied (instead of moved) if more than one File object points at the old filename
    • thus duplicate uploads of the same file are split into multiple copies (each bound to a different physical file)
  • a redirect (from /stadt/media/... to the new URL) is added for each file
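The steps above could be sketched roughly like this. This is only a standalone illustration, not the migration code from this merge request: the record dictionaries, the function names, the `/media/` target prefix and the random-hex name generator are all hypothetical, and I assume here that "16 characters" refers to the name without its extension.

```python
import os
import secrets
import shutil
from collections import Counter


def is_normalized(name):
    # Assumption: "non-nested, 16 characters" means no directory part
    # and a 16-character stem in front of the extension.
    stem, _ext = os.path.splitext(name)
    return "/" not in name and len(stem) == 16


def make_new_name(old_name, media_root):
    # Hypothetical generator: 16 random hex characters plus the old extension.
    ext = os.path.splitext(old_name)[1]
    while True:
        candidate = secrets.token_hex(8) + ext
        if not os.path.exists(os.path.join(media_root, candidate)):
            return candidate


def normalize_files(records, media_root,
                    old_prefix="/stadt/media/", new_prefix="/media/"):
    """Rename or copy the files behind `records` and return the redirects.

    `records` is a list of dicts with a "name" key (path relative to
    `media_root`); it stands in for the File and Image objects.
    """
    # Number of not yet processed records pointing at each old filename.
    pending = Counter(r["name"] for r in records if not is_normalized(r["name"]))
    redirects = []
    for record in records:
        old = record["name"]
        if is_normalized(old):
            continue
        new = make_new_name(old, media_root)
        src = os.path.join(media_root, old)
        dst = os.path.join(media_root, new)
        if pending[old] > 1:
            # Other records still need the old file: copy instead of move.
            shutil.copy2(src, dst)
        else:
            # Last (or only) reference: move the old file away.
            shutil.move(src, dst)
        pending[old] -= 1
        record["name"] = new
        redirects.append((old_prefix + old, new_prefix + new))
    return redirects
```

Note that duplicate uploads produce several redirect pairs sharing the same old URL; the sketch leaves open which of them should win.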

The current implementation is slightly dirty: the migration module is placed below grouprise/features/files/migrations, but in fact it migrates File as well as Image objects. If we want to clean this up, we would need to move the code that would otherwise be duplicated (almost everything) somewhere below grouprise/core/ and add only the File-specific and Image-specific pieces to the respective applications. I am not sure whether this cleanliness would be worth the effort.

btw: after applying this migration to the current dataset of stadtgestalten.org, 4000 of 5200 files are normalized. The remaining files are probably:

  • email attachments (for some reason they seem to have been written directly to the filesystem without creating File instances)
  • logo and avatar images for Gestalt instances (see #776)
  • some unknown files, which are not mentioned in the database (to be analyzed via #756)
Edited by Lars Kruse
