Gelbooru


Ticket Information - ID: #771


ID: 0000771
Category: Feature Request
Severity: normal
Reproducibility: N/A
Date Submitted: 09/24/14 12:31PM
Updated By: DickServ
Reporter: DickServ
Assigned to: geltas
Resolution: Open
View Status: Public
Version:
Target Version: N/A
Summary: API for mass downloading MD5s
Description: I assume that querying the posts API for "md5:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" thousands of times would be unduly hard on the site. I think it would be handy if there were a URL that simply returned every MD5 on the site.
Additional Info: If the hashes were stored in binary form (128-bit unsigned integers, 16 bytes each) with no delimiters between them, 2,234,362 images would make a file of about 34.1 MiB. The file could be hosted on the image server and only updated every X hours/days if necessary.

If the post ID were stored next to each hash as a 32-bit unsigned integer (which goes up to about 4.29 billion), each record would be 20 bytes and the file would be 42.6 MiB. Alternatively, if the file ordered the hashes by post ID and used a placeholder hash for deleted posts, the IDs would not need to be stored at all, and 2,418,571 entries would make a 36.9 MiB file.
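
The size estimates above can be checked with a few lines of Python. This is only a sketch: the record layout (a big-endian 32-bit post ID followed by the raw 16-byte MD5) and the names `RECORD`, `dump_size_mib`, `pack_records` and `unpack_records` are assumptions made for illustration, not an existing Gelbooru format or API.

```python
import struct

MIB = 1024 * 1024
HASH_BYTES = 16  # one MD5 digest is 128 bits
ID_BYTES = 4     # one 32-bit unsigned post ID

# Assumed record layout: big-endian post ID, then the raw digest (20 bytes, no padding).
RECORD = struct.Struct(">I16s")

def dump_size_mib(entries, record_bytes):
    """Size in MiB of a dump holding `entries` fixed-width records."""
    return entries * record_bytes / MIB

def pack_records(posts):
    """Serialise (post_id, md5_hex) pairs into the assumed binary layout."""
    return b"".join(RECORD.pack(pid, bytes.fromhex(md5)) for pid, md5 in posts)

def unpack_records(blob):
    """Parse the assumed layout back into a {md5_hex: post_id} dict."""
    return {digest.hex(): pid for pid, digest in RECORD.iter_unpack(blob)}

print(round(dump_size_mib(2_234_362, HASH_BYTES), 1))             # 34.1 (hashes only)
print(round(dump_size_mib(2_234_362, HASH_BYTES + ID_BYTES), 1))  # 42.6 (ID + hash)
print(round(dump_size_mib(2_418_571, HASH_BYTES), 1))             # 36.9 (ordered by ID)
```

A client that downloaded such a file could call `unpack_records` once and then look up any local MD5 in the resulting dict without touching the site again.
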
Jerl replied at 2014-09-24 12:36:47
Why is it that you need a list of post md5 hashes?

DickServ replied at 2014-09-24 14:19:27
Checking whether any images I've obtained elsewhere are on Gelbooru.

Jerl replied at 2014-09-24 14:52:07
For what purpose? If it's to avoid uploading duplicates, we already check for exact matches using the post's md5, so there's no need for you to do so externally.

DickServ replied at 2014-09-24 22:24:37
So I can download and keep metadata for the small number of images that match. I want to download the metadata for the same reasons that I want it included when I use the posts API.
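
For illustration, the client side of this workflow might look roughly like the sketch below. It assumes the requested dump exists as bare 16-byte MD5 digests with no delimiters (the first layout proposed in the ticket); the function names are hypothetical. Only the "md5:" search term comes from the ticket itself, so the number of posts API queries would drop to one per matching image.

```python
import hashlib

def file_md5(path, chunk_size=1 << 20):
    """Return the raw 16-byte MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()

def load_hash_set(dump):
    """Split a dump of undelimited 16-byte digests (assumed layout) into a set."""
    if len(dump) % 16 != 0:
        raise ValueError("dump length is not a multiple of 16 bytes")
    return {dump[i:i + 16] for i in range(0, len(dump), 16)}

def matching_queries(paths, known_hashes):
    """Map each local file found in the dump to its 'md5:<hex>' search term."""
    queries = {}
    for path in paths:
        digest = file_md5(path)
        if digest in known_hashes:
            queries[str(path)] = "md5:" + digest.hex()
    return queries
```

Files whose hashes are not in the set never generate a request at all, which is the point of the feature request.
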