Danbooru Image tagger/renamer << NEED HELP!!!
ph4zr:
--- Quote from: Jorin on April 26, 2011, 11:00:19 AM ---Sorry if I was off base for bumping the thread. It was only on the second page and seemed reasonably new, so I figured I'd give it a shot.
--- End quote ---
Heh. No... I'm mostly just covering my own rear. Different forums probably have different guidelines for what constitutes necro, and I don't know where this one stands. On a faster moving board, someone would probably yell at you for bumping a thread from as recently as yesterday. On others, month old threads are still fair game. I've not discovered a hard and fast rule. /off-topic, but I thought I'd clarify
Edit: Plus, it isn't like you didn't have anything relevant to say, so it wasn't a bump for bumping's sake. Better to revive an old(er) topic than start a new one that's exactly the same, and all.
Jorin:
Cool, good to know. I'm going to try out that Danbooru Downloader add-on because I've been interested in doing something like this with music files too.
If a stand-alone browser does end up being in the works, I agree about allowing it to pull data locally, not only for the performance increase but also simply so you can get your existing collection up to speed. It would be feasible for it to access any or all of the main databases (Sankaku, Danbooru, etc., just switch them on or off when scanning) to see which site each of your downloaded files came from.
Another feature that would suit this project really well would be including the tag types information in the browser. This is something normal image browsers like Picasa can't do, but it's really useful in this kind of image collection. You want to know what's an artist tag, and what's a copyright tag for example.
ph4zr:
--- Quote from: Jorin on April 26, 2011, 08:42:39 PM ---see above
--- End quote ---
Just from looking at the tagging guidelines on booru sites, I'm led to believe that those "tags" are actually stored with their type prefixes. The JPG tag metadata equivalent would simply be to add a pre-determined prefix to special tag types, such as "copy:", "char:". Pools (~Picasa Albums) I'm not sure about, but they would likely either have their own table-space or also use a specially formatted tag, such as "pool:###" or "pool:{name}". The latter seems more likely, since tags are already being queried for the image, and looking up the post ID in a different (likely highly populated) table that isn't even guaranteed to have a hit seems like a serious waste of time.
Prefixes are certainly the most straightforward way to approach it in Picasa, and they're easily searchable: you could even search for all copyright tags by using "copy:" or "copy:*" as a search string, if the software supported tag listing like that. The only immediately obvious downside is that it's unsightly. Actually querying for specific tag types from the booru might be slightly more difficult, as I'm not sure if the API will tell you what's what. If you go the route of HTML parsing, the tag type would be indicated by the class of the element wrapping the tag in question, simply because the site's styling needs some identifier to target the right elements.
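For what it's worth, the prefix idea can be sketched in a few lines of JavaScript. This is only an illustration under assumptions: "copy" and "char" are the prefixes suggested above, while "artist" and the function names parseTag and tagsOfType are made up here and don't come from Picasa, DD, or any booru API.

```javascript
// Hypothetical prefixes for special tag types; anything without a known
// prefix is treated as a general tag.
var TYPE_PREFIXES = ["copy", "char", "artist"];

// Split a stored tag such as "copy:touhou" into its type and name.
function parseTag(raw) {
    var colon = raw.indexOf(":");
    if (colon > 0) {
        var prefix = raw.slice(0, colon);
        if (TYPE_PREFIXES.indexOf(prefix) !== -1) {
            return { type: prefix, name: raw.slice(colon + 1) };
        }
    }
    // No colon, a leading colon (":3"), or an unknown prefix ("re:zero").
    return { type: "general", name: raw };
}

// Emulate a "copy:*" style search: list every tag name of one type.
function tagsOfType(rawTags, type) {
    return rawTags
        .map(parseTag)
        .filter(function (t) { return t.type === type; })
        .map(function (t) { return t.name; });
}
```

Only known prefixes are split off, so ordinary tags that happen to contain a colon stay intact.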
DD does not appear to save tag types in its DB, although it still applies them to templates properly. Looking at its source, it seems to be parsing HTML directly rather than using the API, which makes sense given its nature as a browser extension. Someone with sufficient time and motivation to deal with unfamiliar code could probably alter DD's functionality enough to hide images with known post IDs, though I'm not sure if this can be done early enough to prevent the browser from fetching the image previews. However, this would still require you to have downloaded the image using DD previously, and would not work with existing image collections. Still, a thought for anyone familiar w/ browser extensions. The code is modular enough that the rating problem w/ gelbooru could be fixed independently without breaking other functionality.
About checking different sites: if they use the exact same image (i.e., if tag data hasn't been modified and the images haven't been edited), you could use the md5 hash to identify them. Or if the site hashes the image data independently of file data, if that's even possible, the md5 sum used by one site might still be identical on other sites. I honestly don't know how JPEGs and other image formats are laid out, so maybe it's not possible or is just too much of a hassle. In any case, the post ID is pretty much guaranteed to be different on each site and you'd have to use an identification service if the images are different in any detectable way.
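The simple hash check could look something like the sketch below, assuming Node.js is available to run it. It hashes the whole file, so it only covers the "exact same file" case: any edit to the file, embedded tags included, gives a different sum. The function names are made up for illustration.

```javascript
var nodeCrypto = require("crypto");
var fs = require("fs");

// MD5 of raw bytes, as a lowercase hex string.
function md5OfBuffer(buf) {
    return nodeCrypto.createHash("md5").update(buf).digest("hex");
}

// MD5 of a whole file on disk (file data, not decoded image data).
function md5OfFile(path) {
    return md5OfBuffer(fs.readFileSync(path));
}

// Group file paths by hash; any group with more than one entry holds
// byte-identical duplicates.
function groupByHash(paths) {
    var groups = {};
    paths.forEach(function (p) {
        var h = md5OfFile(p);
        if (!groups[h]) {
            groups[h] = [];
        }
        groups[h].push(p);
    });
    return groups;
}
```

Whether a given site's md5 is computed over the same bytes is something you'd have to check per site.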
Anything more advanced than a simple hash check, name/ID check, or bit by bit comparison I wouldn't know anything about. Image identification is best left to the realm of the graphically inclined and AI, IMO. I couldn't even begin to guess how such services detect image similarity if they aren't using that or tag comparisons. If someone -had- such a module it could probably be incorporated, but you'd need to fetch the image for comparison at that point, anyway.
Personally I don't mind having duplicates from different sites, just because it's too much work to go through 10k+ images looking for them, and it's easy enough to just ignore them or delete them as I see them. But I admit it would still be -nice- to be able to find duplicates and prune them or set them off to the side.
Danbooru Downloader fix for Gelbooru ratings
||NOTE: If you downloaded any of these files before the 2011.04.27 or the last edit time, whichever is older, you need to get the updated versions, because I was an idiot and packed the wrong folder. If I did it again, let me know. And hit me over the head with ... something soft, please.
Edit: Hm... I tried re-installing the downloader, and apparently it no longer de-compresses itself to its own extension folder. If it does NOT de-compress, the only difference is that you first unpack the "danbooru_downloader@cuberocks.net.xpi" archive to its own folder, and then perform the same edits. If you don't feel like repacking it (zip format, compression optional) and renaming it to an ".xpi" file, you can probably just delete the .xpi and Firefox should detect the addon in its folder. BUT I have to verify that last bit. Unpacking/repacking the .xpi is the same procedure as for the ".jar" file. In the meantime, I wouldn't modify the .xpi files unless you've done it before, since I haven't tried that method yet.
Edit^2: Just remember to delete either the unpacked folder OR the XPI file before you try to launch Firefox again. It doesn't like it when two addons have the same name... as I can attest after spending about 10 minutes trying to figure out why I didn't seem to have any. -facepalms-
Disclaimer: Back up the DD extension before you do any of this. I might have forgotten a step somewhere, or it might not work -quite- the same in Firefox versions less than 4. I.e. I'm not sure what "find updates" will do to the addon in older versions. This edit applies to version 1.7.1 that I found at http://www.brothersoft.com/danbooru-downloader-397279.html, which is the only version I've come across. If you find a newer one, for heaven's sakes LET ME KNOW! XD
Oh, right, another disclaimer: I haven't gone through all of the code, and I don't know anything about XPI installers. If it destroys the world or taints your soul gem (or both), I take no responsibility!
Short Version
|| if you have at least as much clue what you're doing as I did at the time
Edit "danbooru_downloader_tab.js" in "danbooru_downloader.jar\chrome\content\" in the extensions folder. Be sure to repack properly. Compression doesn't seem to matter, but if you accidentally repack with an additional folder level the addon won't function at all. After doing all that, force Firefox to re-validate the addon by checking for an update and restart.
If your editor doesn't support line numbering, you can just search for the contents; there should only be one hit.
Original code: danbooru_downloader_tab.js || LINE 329
--- Code: ---
var details = this.tabDocument.getElementById('stats').childNodes[3].childNodes;
--- End code ---
Replacement code: danbooru_downloader_tab.js
--- Code: ---
//MODIFICATION|ph4zr BEG
// Find the index of the UL child of 'stats' instead of assuming it is always at index 3.
var stat_section = this.tabDocument.getElementById('stats').childNodes;
var child_x = -1;
for(var x=0; x<stat_section.length; x++)
{
    if(stat_section[x].tagName == "UL") {
        child_x = x;
        break;
    }
}
//MODIFICATION|ph4zr END
//MODIFICATION|ph4zr LINE
//REPLACEMENT
var details = this.tabDocument.getElementById('stats').childNodes[child_x].childNodes;
//ORIGINAL
// var details = this.tabDocument.getElementById('stats').childNodes[3].childNodes;
--- End code ---
There is probably a better way to do all of that, but I was trying to stick as close to the original coder's style as possible, with as few edits as possible. If I had my way I'd probably change some of it to regexps and go through commenting all the code. Most people don't look at the source, though, so comments would likely just bloat the addon—mine would literally almost double its unpacked size if I took the time to do it.
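As one possible "better way", here is a sketch of the same lookup pulled into a helper that returns null when no UL is found, instead of indexing with -1. This is not what the posted fix or the original addon does; firstChildByTag and getStatsDetails are hypothetical names, and it assumes an HTML document where tagName is reported in upper case.

```javascript
// Return the first direct child with the given tag name, or null.
// Text nodes have no tagName, so they never match.
function firstChildByTag(parent, tagName) {
    var children = parent.childNodes;
    for (var i = 0; i < children.length; i++) {
        if (children[i].tagName === tagName) {
            return children[i];
        }
    }
    return null;
}

// Hypothetical wrapper: the child nodes of the UL inside the "stats"
// element, or null if either the element or the list is missing.
function getStatsDetails(doc) {
    var stats = doc.getElementById("stats");
    var list = stats ? firstChildByTag(stats, "UL") : null;
    return list ? list.childNodes : null;
}
```

The caller would then use something like "var details = getStatsDetails(this.tabDocument);" and skip the rating lookup when it gets null back.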
Long Version
|| if you aren't really sure where to find stuff, or aren't very comfortable with archives
If you want gelbooru ratings to be identified properly, you can modify the extension according to these instructions. You will need 7-zip or an equally functional archive/ZIP program, a text editor that can save in -plain- text (I prefer Notepad++ on Windows), and the ability to find DD's extension folder. You -should- be able to find it on Windows at:
%appdata%/Mozilla/Firefox/Profiles/__USERSPECIFIC_DIRECTORY__/extensions/danbooru_downloader@cuberocks.net
The %appdata% string is an environment variable that should de-reference properly if you just paste it into an explorer window. Obviously "__USERSPECIFIC_DIRECTORY__" is a directory that changes per person, so you'll have to browse to it yourself. If that doesn't work, the -default- user %appdata% value is "C:\Users\__USERNAME__\AppData\Roaming\". Keep in mind that "AppData" is a hidden folder by default.
Now that you're here, browse to the "chrome"* folder. Take a second to back up "danbooru_downloader.jar"**, just as a precaution in case something goes wrong. Depending on how comfortable you are with zip files (or extensions in general), you can either unpack the entire zip file or just extract "danbooru_downloader_tab.js" from "danbooru_downloader.jar\chrome\content\" for editing.
*I don't know why it's called that either.
**Assuming you have file extension display enabled. If you don't, you can enable it in Windows via "Explorer Menu/Tools/Folder Options | Tab:View", by unchecking "Hide extensions for known file types". The menus might be slightly different depending on your OS version.
/branch
If you did NOT unpack the entire thing, make your edits to the file, save it somewhere, and add it back to the right folder in the archive (I just drag it to the window). 7-zip will probably compress this, but it didn't seem to have any effect.
/alternate branch
If you unpacked the entire thing, browse to the file and perform the edits at the top. After you do this, add the unpacked "chrome" folder to a 7-zip archive in ZIP format with the "store" option for compression settings. You can probably compress it, but the original isn't compressed, so I figure "why tempt fate?"
If you did this properly, by default 7-zip will have created a "chrome.zip" file. Take a second to verify that "chrome" is the top level folder in the archive. If it isn't, you need to make sure to repack the actual "chrome" folder, not its contents or the folder it's in. If you are repacking, make sure you're repacking the one you created when you decompressed the archive. Now rename "chrome.zip" to "danbooru_downloader.jar" and place it in the location of the original "danbooru_downloader.jar" file. At this point you can delete the unpacked files.
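If you'd rather not eyeball the archive, the "is chrome the top level folder" check can be sketched as below. It assumes you already have the list of entry paths from whatever your archive tool prints when listing contents; the function name is made up, and this doesn't read the archive itself.

```javascript
// Given the entry paths of an archive, report whether every entry sits
// under the one expected top-level folder.
function hasSingleTopLevel(entryPaths, expected) {
    if (entryPaths.length === 0) {
        return false;
    }
    return entryPaths.every(function (p) {
        // Normalize Windows separators, then take the first path component.
        var first = p.replace(/\\/g, "/").split("/")[0];
        return first === expected;
    });
}
```

A list starting with something other than "chrome", such as the parent folder's name or "content", means the wrong level got packed.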
/branch end
Having done that, -restart- Firefox and open up the addons panel. In Firefox 4, scroll all the way down to the bottom and you'll notice that "Danbooru Downloader" is disabled as "incompatible". Just right-click and click "find updates". This won't actually do anything, but Firefox will do whatever it does and ask you to restart. Once you do that, the addon should work properly.
Lazy Version
|| for those who are just lazy bastards (like me) and/or have a high level of trust (not like me)
Assuming you're fine with not knowing what you're putting into your addons folder, you can just download the XPI file from the first link below and install that to Firefox the same as any other extension. ... after you back up the other one in case things go horribly, horribly wrong. Alternatively, you can download "danbooru_downloader.jar" and place it in the appropriate folder (detailed above), or download the modified source file and replace it in the appropriate archives.
XPI file: Install this directly
http://www.megaupload.com/?d=YPH7FW9N
JAR file: Replace the original JAR file in the DD extension directory
http://www.megaupload.com/?d=296A4WOV
ZIP archive of modified source file: ...if you're downloading this, surely you know what to do with it.
http://www.megaupload.com/?d=61PTB8KR
Jorin:
I didn't actually know there was a standard way to store tag types/prefixes. This is very cool. There would be advantages to using both internal (prefix:tag) and external (a large table such as http://danbooru.donmai.us/tag) methods for this. Internal metadata is obviously more portable and easier to migrate because it can move with the image and be scanned into a database. That's helpful with well-established fields or prefixes that will be useful in a lot of different settings. An external metadatabase would be more helpful for specific uses and niche information that you may not want to store directly in the file.
Exiftool's readout of the subject field in images I downloaded from Danbooru and tagged with Sheska didn't show any prefixes among the tags. They were just simply the tags by themselves. I wonder if this actually means Danbooru reads from a table to provide the information for tag types/prefixes rather than reading prefixes from the image files themselves.
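If the types really do live in a separate table, going from plain tags to prefixed ones is just a lookup. A rough sketch, assuming a small local copy of such a table; the entries, the "copy"/"char" prefixes, and the function name are placeholders, not real Danbooru data or output.

```javascript
// Hypothetical external table: tag name -> tag type.
var TAG_TYPES = {
    "touhou": "copy",
    "hakurei_reimu": "char"
};

// Prefix every tag whose type is known; leave the rest untouched.
function addPrefixes(plainTags, table) {
    return plainTags.map(function (tag) {
        // hasOwnProperty keeps tags like "constructor" from matching
        // properties inherited by every object.
        var known = Object.prototype.hasOwnProperty.call(table, tag);
        return known ? table[tag] + ":" + tag : tag;
    });
}
```

The prefixed result could then be written into the subject field, which keeps the portability of internal metadata while the table stays the source of truth.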
For duplicate detection, unless there's a simple and effective (and free!) multi-format algorithm for edge detection and RGB values I'd be more inclined to leave it up to the experts. I'm with you in that I don't really need to worry about duplicates that much. I just delete them when I see them.
I noticed that DD didn't pull the tags when I tried to use it. The subject field is empty when I read it with Exiftool. Should it be tagging the images, or just only downloading them?
It would be incredibly useful to link the functionality of DD and the Sheska script (or just Exiftool if that's possible) if the latter situation is the case, that way you'd get images downloaded into organized folders, and also searchable with the internal tags in an image viewer.
I noticed DD has an option to control which sites trigger it, and that information ultimately directs images to different local folders. If it's possible to run Sheska or a Sheska-ish thing on an image automatically after it downloads with DD, it should also be possible to slipstream the correct json url for whichever site the file came from, that way the tags could be included automatically from their source as well.
If a standalone *booru viewer is in the cards, colour-coding the prefixes and making sure they're searchable would be great. Blacklisting and whitelisting is already accounted for within DD, so various tags could be filtered. Also, image viewers like Picasa should allow you to detect unwanted tags and mass-remove them from your collection as well.
It looks like almost all the ingredients are here to make a useful app. What's missing still is a process to mass-tag a bunch (like, hundreds or thousands) of files already downloaded, provided they're not renamed and/or they have their original md5 hash.
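As a starting point for the mass-tagging step, here is a sketch of picking out which files can be looked up by name alone. It assumes an un-renamed file is named with its bare 32-character md5 hash plus an extension, which you'd want to confirm for each site; the function names are made up.

```javascript
// Return the md5 hex string from a file name, or null if the name
// (minus folder and extension) is not exactly 32 hex characters.
function md5FromFilename(filePath) {
    var base = filePath.replace(/\\/g, "/").split("/").pop();
    var stem = base.replace(/\.[^.]*$/, "");
    var match = /^[0-9a-f]{32}$/i.exec(stem);
    return match ? match[0].toLowerCase() : null;
}

// Split a collection into files whose name already gives the hash and
// files that would need to be hashed from their contents instead.
function splitByLookupKey(paths) {
    var result = { byName: [], needsHashing: [] };
    paths.forEach(function (p) {
        if (md5FromFilename(p)) {
            result.byName.push(p);
        } else {
            result.needsHashing.push(p);
        }
    });
    return result;
}
```

Renamed files fall into the second group, and hashing them only helps if the file itself is still byte-for-byte the original.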
Tiffanys:
I use Tagbooru to download whole gobs of junk: http://code.google.com/p/tagbooru/