Ticket Information - ID: #771
ID | Category | Severity | Reproducibility | Date Submitted | Updated By
---|---|---|---|---|---
0000771 | Feature Request | normal | N/A | 09/24/14 12:31PM | DickServ
Summary: API for mass downloading MD5s
Description: I assume querying the posts API for "md5:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" thousands of times would be unduly hard on the site. I think it would be handy if there were a URL that simply spat out every MD5 on the site.
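For reference, the per-hash lookup this would replace looks something like the sketch below. This is a hedged example: the URL shape follows Gelbooru's posts API as I understand it, and the exact parameters may differ.

```python
import urllib.request

# Hypothetical per-file check: one HTTP round-trip per hash, so checking
# thousands of local files means thousands of requests against the site.
md5 = "d41d8cd98f00b204e9800998ecf8427e"  # placeholder digest
url = ("https://gelbooru.com/index.php"
       "?page=dapi&s=post&q=index&tags=md5:" + md5)
with urllib.request.urlopen(url) as resp:
    xml = resp.read()  # an empty <posts> element means no match
```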
Additional Info: If the hashes were stored in binary form (128-bit unsigned integers) with no delimiters between them, 2,234,362 images would make a 34 MiB file. The file could be hosted on the image server and updated only every X hours/days if necessary. If the post ID were stored next to each hash as a 32-bit unsigned integer (enough for IDs up to about 4 billion), the file would be 42.6 MiB. If it ordered the hashes by post ID and used a placeholder hash for deleted posts, 2,418,571 entries would make a 36.9 MiB file.
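A minimal sketch of how a client could consume the third layout (raw 16-byte digests ordered by post ID, with a placeholder for deleted posts). The filename, the all-zero placeholder, and the starting post ID are assumptions for illustration:

```python
import hashlib

RECORD = 16                     # one raw 128-bit MD5 digest per slot
PLACEHOLDER = b"\x00" * RECORD  # assumed marker for deleted posts

# The slot's index in the file doubles as the post ID, so no
# separate 32-bit ID field is needed in this layout.
known = {}
with open("gelbooru_md5s.bin", "rb") as f:  # hypothetical dump name
    post_id = 1  # assuming the dump starts at post ID 1
    while chunk := f.read(RECORD):
        if chunk != PLACEHOLDER:
            known[chunk] = post_id
        post_id += 1

# Check a local image against the dump without touching the posts API.
with open("image.jpg", "rb") as f:
    digest = hashlib.md5(f.read()).digest()
match = known.get(digest)
print(f"post {match}" if match is not None else "not on the site")
```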
Jerl replied at 2014-09-24 12:36:47
Why is it that you need a list of post md5 hashes?
DickServ replied at 2014-09-24 14:19:27
Checking whether any images I've obtained elsewhere are on Gelbooru.
Jerl replied at 2014-09-24 14:52:07
For what purpose? If it's to avoid uploading duplicates, we already check for exact matches using the post's md5, so there's no need for you to do so externally.
DickServ replied at 2014-09-24 22:24:37
So I can download and keep metadata for the small number of images that match. I want to download the metadata for the same reasons that I want it included when I use the posts API.