Removing Unused Photos on a Ghost Blog

In what I consider a major oversight, it's impossible to delete a photo once it has been uploaded to a Ghost blog. Take the following use case:

  1. User uploads photo.
  2. User decides the photo isn't right for the post.
  3. User removes link to that photo.

That photo is still around in the /content/images/ directory. Even worse, since Ghost doesn't do an md5 check when uploading new photos, multiple unused duplicates can be uploaded.
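To get a feel for how many duplicates have piled up, a content hash check along these lines could help. This is only a sketch of the idea, not anything Ghost ships: the function names file_md5 and find_duplicates are made up for this example, and it assumes you can read the images directory directly.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content hash; return groups of 2+ files."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[file_md5(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Calling find_duplicates on the images directory returns a list of groups, each holding the paths whose contents are byte-for-byte identical.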

This isn't a crisis if you have a reasonable server, but if you are running a photoblog on a bare-bones droplet, it becomes a real concern.

It's worth noting that the Ghost team is currently discussing adding this feature.

So here is a "solution" to this problem. The quotes are because I'm sorry. You could set this up as a cron job, but I would opt for frequent backups and manual verification.

Please make backups, especially of the site's content folder.

Getting the Active Photos

In your Ghost admin panel, go to "Labs" and export your content as a JSON file. Open the file in any decent editor and use the following regex to select all the images:

/content/images/[a-zA-Z0-9/._-]*

Copy the results to a text file called used_photos, then use multi-select to trim the "/content/images" prefix off each line.
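If you would rather not do this by hand, the same extraction can be sketched in Python. Treat this as an illustration under a few assumptions: the function name extract_used_photos is my own, the export is read as plain text rather than parsed as JSON, and the paths come out without a leading slash.

```python
import re

# Character class equivalent to the regex above; the hyphen sits last
# so it is read as a literal character rather than part of a range.
IMAGE_RE = re.compile(r"/content/images/([a-zA-Z0-9/._-]*)")


def extract_used_photos(export_text):
    """Return sorted, unique image paths relative to /content/images/."""
    return sorted({match for match in IMAGE_RE.findall(export_text) if match})
```

Read the export file into a string, pass it to extract_used_photos, and write the result to used_photos one path per line.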

Getting All Photos on the Server

We can use ls to get all the uploaded photos:

ls -Rt images/ > all_uploaded_photos

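Then, with some multi-select, you can quickly prepend the year and month path to each file name so the format matches the list of active photos. Remove the directory header lines and blank lines that ls -R prints as well, so the file holds nothing but photo paths.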
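As an alternative to cleaning up the ls output, a short Python walk can produce the relative paths directly. This is a sketch, with list_uploaded_photos being a name I made up; it assumes the list of active photos uses paths like 2015/03/photo.jpg with no leading slash.

```python
import os


def list_uploaded_photos(images_dir):
    """Return sorted file paths relative to images_dir, using '/' separators."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(images_dir):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), images_dir)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```

Writing the returned list to all_uploaded_photos, one path per line, gives a file with no directory headers or blank lines to clean up.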

Generating the Difference

Next we use a small Python script to generate the difference:

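# Read one path per line, dropping newlines and blank entries so that a
# missing trailing newline or a stray blank line cannot skew the result.
def read_paths(filename):
    with open(filename, "r") as fh:
        return set(line.strip() for line in fh if line.strip())

all_uploaded_photos = read_paths("all_uploaded_photos")
used_photos = read_paths("used_photos")

# Anything uploaded but never referenced is unused.
unused_photos = sorted(all_uploaded_photos - used_photos)

with open("unused", "w") as fh:
    fh.writelines(photo + "\n" for photo in unused_photos)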

The file unused will now list all the unused photos on the server.

Deleting All the Unused Photos

Finally we will use the following bash script to iterate through the files and delete them from the server:

#!/bin/bash
file="unused"
while IFS= read -r line
do
    # Skip blank lines: an empty pattern would match (and delete) every file.
    [ -z "$line" ] && continue
    find . -path "*$line" -type f -delete
done <"$file"
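Since deleting is the one step you can't take back, it may be worth doing a dry run first. Here is a rough Python equivalent of the loop above that only reports what it would remove unless told otherwise. The function name delete_unused and the dry_run flag are my own inventions for this sketch, and it assumes every line in unused is a path relative to the images directory.

```python
from pathlib import Path


def delete_unused(images_dir, unused_list, dry_run=True):
    """Delete (or, in a dry run, only report) listed files under images_dir."""
    root = Path(images_dir).resolve()
    removed = []
    for line in Path(unused_list).read_text().splitlines():
        rel = line.strip().lstrip("/")
        if not rel:
            continue  # a blank entry must never match anything
        target = (root / rel).resolve()
        # Skip anything that escapes images_dir or is not a regular file.
        if root not in target.parents or not target.is_file():
            continue
        if not dry_run:
            target.unlink()
        removed.append(rel)
    return removed
```

Run it once with the default dry_run=True, check the returned list against your expectations, and only then call it again with dry_run=False.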

Seeing the Results

By running df -h both before and after, you can see how much space you gained.
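Because df reports on the whole filesystem, other activity on the server can blur the numbers. A per-directory total isolates the effect of the cleanup; the sketch below sums file sizes under a directory. The name dir_size_bytes is hypothetical, and the figure is the sum of file lengths, which can differ slightly from the disk blocks actually freed.

```python
import os


def dir_size_bytes(root):
    """Return the combined size, in bytes, of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # do not count symlink targets
                total += os.path.getsize(path)
    return total
```

Call it on the images directory before and after the deletion and subtract the two values.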

Conclusion

Make backups before you try this.