Sunday 30 December 2012

ExifPeeler: A web-based exif removal tool

I bought a cheap-and-cheerful VPS a year or two ago from the gentlemen at BuyVM.net. I've used it on and off for the many useful things that you can do with a VPS, but hadn't really tested its limits. The server is single-core, with an astounding 128 MB of RAM and a 10 GB hard drive. What can you really do with that?

Enter ExifPeeler. It does one thing and does it competently: you upload a batch of pictures, it strips the EXIF data out of them, and it gives them back to you nice and clean. You can then do anything you like with the pics without worrying about accidentally giving away your location or any other personal data.


Here's what it looks like.

And here's what it looks like after you upload files. Simple, right?


Features:
  • Batch Upload: You can upload up to 100 files or 50 MB at once, and download the results individually or as one big zip file.
  • Unique URLs: Each pic has a unique and distinctive URL. Only you and anybody you give the link to will be able to see your images.
  • Secure Timed Destruction: After an hour, your files are deleted. There are no options to keep them. Because I have such limited hard drive space on the server, I can't afford to leave your holiday snaps lying around for weeks. Let's call it a security feature.
  • Cross-Browser Support: Works great on Firefox, Chrome, and Safari. IE9 and below support single-file upload only. I haven't tested with IE10, so let me know how it goes.
  • Mobile Device Support: It handles batch uploading from iOS, Android, and BlackBerry like a pro. I haven't tested on Windows Phone 7 or 8, so let me know if you try it.
  • Duplicate Renaming: If you upload 5 pics called 'image.jpg' at once, they'll all get appropriately renamed.
  • Relatively Graceful Failure: If any individual pictures you upload fail (say, you accidentally upload a Word document at the same time), it will still clean all the legitimate pictures and list any failures separately at the end.
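
The duplicate renaming in the list above could be sketched roughly like this. This is not the actual ExifPeeler code (which is PHP and isn't published here); the function name dedupe_names and the "_1", "_2" suffix scheme are assumptions made purely for illustration.

```python
from pathlib import PurePath


def dedupe_names(names):
    """Return the names in order, renaming repeats so every name is unique."""
    used = set()
    result = []
    for name in names:
        path = PurePath(name)
        candidate = name
        counter = 1
        # Compare case-insensitively so 'IMAGE.JPG' and 'image.jpg' cannot
        # collide on a case-insensitive filesystem.
        while candidate.lower() in used:
            candidate = f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        used.add(candidate.lower())
        result.append(candidate)
    return result


print(dedupe_names(["image.jpg", "image.jpg", "image.jpg", "beach.png"]))
# prints ['image.jpg', 'image_1.jpg', 'image_2.jpg', 'beach.png']
```

The first file keeps its original name, and only the later copies get a numeric suffix.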

Technology:
The server I'm running it on pretty much dictated the rest of the choices. One PHP script does all the admin (uploading, removing anything that isn't a photo, creating thumbnails, etc.) and farms out all the hard work to exiftool. It double-checks that the EXIF data is definitely gone, and will refuse to even display the image if it wasn't successfully removed. Every 5 minutes a cron job runs and removes any files uploaded more than an hour ago.
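
For anyone curious what the hourly clean-up might look like, here is a minimal sketch in Python. It is not the real cron job: the function name purge_expired is made up, and it assumes a file's modification time is a good enough stand-in for its upload time.

```python
import os
import time
from pathlib import Path

MAX_AGE_SECONDS = 60 * 60  # one hour, matching the limit described above


def purge_expired(upload_dir, max_age=MAX_AGE_SECONDS, now=None):
    """Delete regular files in upload_dir older than max_age seconds.

    Age is judged by modification time (an assumption; the real job may
    track upload time differently). Returns the deleted names, sorted.
    """
    now = time.time() if now is None else now
    deleted = []
    for entry in Path(upload_dir).iterdir():
        # Skip subdirectories; only plain files are considered.
        if entry.is_file() and now - entry.stat().st_mtime > max_age:
            entry.unlink()
            deleted.append(entry.name)
    return sorted(deleted)
```

Run from cron on a `*/5 * * * *` schedule, a sweep like this means a file can survive for up to about 65 minutes: the hour itself plus the wait for the next run.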

The key consideration is traffic. I've done some stress-testing, and it should be able to handle multiple file uploads per second. Under heavy load you might get a 5-second delay when uploading large batches (50+ files), but we'll have to see how it goes in real-world conditions.

Why an exif remover?
I really just wanted to see how easily it could be done. While my solution isn't particularly elegant, it works and is damn simple to implement. It consists of one PHP script, one third-party program, and one cron job.

I also couldn't find another site that does the same thing. There are sites that will let you remove exif data from single files, there are sites that have exif removal as a bonus feature hidden away in menus, and there are of course hundreds of local exif-removal tools. I couldn't spot a web-based exif removal tool that lets you upload more than one file at once. This fills that niche.

Don't a lot of websites remove EXIF data anyway?
Yes, sort of. Sites like Blogger, Facebook, and Flickr will prevent other users from seeing some or all of the EXIF data from your pics. The data is still present, though, and the companies can still use it however they like. They just don't provide it to viewers.

Most smaller sites don't seem to remove EXIF data. If you're uploading pics to a forum or emailing them to somebody, it's definitely good to remove the EXIF first.

If I shouldn't send files around without removing the EXIF, doesn't that mean I shouldn't upload them to ExifPeeler?
Yes. If you're worried about it, strip out the data before you send it anywhere. The aforementioned exiftool is good, and there are lots of other options. ExifPeeler is great for casual use, but if you have information you genuinely need to keep private, it's best that it never goes on the internet at all.
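
If you'd rather do it locally, a small wrapper around exiftool is enough. The sketch below assumes exiftool is installed and on your PATH; the function names are made up for illustration, and the only exiftool options used are -all= and -overwrite_original.

```python
import subprocess


def build_strip_command(paths):
    """Build the exiftool command line that removes all metadata from paths."""
    # -all= deletes every tag exiftool can write; -overwrite_original skips
    # the "<name>_original" backup copy exiftool otherwise leaves behind.
    return ["exiftool", "-all=", "-overwrite_original", *paths]


def strip_metadata(paths):
    """Strip the files in place; raises CalledProcessError if exiftool fails."""
    subprocess.run(build_strip_command(paths), check=True)
```

Calling strip_metadata(["holiday1.jpg", "holiday2.jpg"]) rewrites both files in place, so work on copies if you want to keep the originals with their metadata intact.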

Where's the script so I can run it in my own project?
I'll probably put it up here soon, but in the meantime you can email me if you want more info. The script is nothing special; I just want to clean it up a bit and see how it works in the real world before I post it.

How many times was EXIF mentioned on this page?
18.



If you have the time, try it out, and let me know if you manage to break it. If you do break it, please send a screenshot along with a brief description of what you were doing.