Author Topic: [Manga Database] Here's what's missing...if you want to contribute!  (Read 7187 times)

Offline type-a1pha

  • Member
  • Posts: 52
As a way to learn pattern matching, jsoup and at the same time find something to upload, I decided to go through the process of parsing and cross-checking manga-updates and bakabt in order to discover which manga were missing from bakabt (as of the 10th of August, 2014).

The result, which I hope will be useful to whoever wants to contribute to the manga database, is the following 4 tables.
I split the result into 4 to keep the file sizes manageable, using two criteria: whether a manga is a oneshot and whether it has the hentai genre.
However, the pages are still quite large and they may take some time to load!

1.  https://googledrive.com/host/0B9c8Cq9hoZDRV0RQd1diUTh2c2s/diff_not_oneshots_not_hentai.html (manga not oneshot and not hentai) - 2.3 MB
2.  https://googledrive.com/host/0B9c8Cq9hoZDRV0RQd1diUTh2c2s/diff_not_oneshots_hentai.html (manga not oneshot and hentai) - 170 KB
3.  https://googledrive.com/host/0B9c8Cq9hoZDRV0RQd1diUTh2c2s/diff_oneshots_not_hentai.html (manga oneshot and not hentai) - 925 KB
4.  https://googledrive.com/host/0B9c8Cq9hoZDRV0RQd1diUTh2c2s/diff_oneshots_hentai.html (manga oneshot and hentai) - 380 KB

Some remarks:
- Since the main problem is retrieving files, I added a field ACTIVE% which indicates the percentage of chapters (taken individually) that have an active group associated. So, if a manga has 100% for this field, it means that for each chapter there's at least one active group that worked on it. (It won't always be correct since I didn't take into account all the idiosyncrasies of the manga-updates database. Nonetheless it should be a good approximation most of the time...hopefully!)
- All the manga in the list are completely scanlated as they were returned from the manga-updates advanced search with that radio button checked.
- I excluded doujinshi, lolicon, shotacon, non-published and blacklisted manga. (Still check before offering though!)
- Columns are sortable.
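
For anyone curious how a figure like ACTIVE% could be derived, here is a minimal Python sketch. It is not the code used to build the tables; the function name, the dictionary layout and the group names are all made up for illustration, and it assumes you already know which groups count as active.

```python
def active_percent(chapter_groups, active_groups):
    """Return the percentage of chapters that have at least one active group.

    chapter_groups: dict mapping a chapter label to the list of groups
                    that released it (hypothetical layout).
    active_groups:  set of group names considered active (assumed known).
    """
    if not chapter_groups:
        return 0.0
    covered = sum(
        1
        for groups in chapter_groups.values()
        if any(group in active_groups for group in groups)
    )
    return 100.0 * covered / len(chapter_groups)


# Made-up example data: chapter c3 was only released by an inactive group.
chapters = {
    "c1": ["GroupA"],
    "c2": ["GroupB", "GroupC"],
    "c3": ["GroupB"],
    "c4": ["GroupA", "GroupB"],
}
active = {"GroupA", "GroupC"}

print(active_percent(chapters, active))
```

Each chapter is counted once, no matter how many active groups worked on it, which matches the "for each chapter there's at least one active group" reading of 100%.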

So, yeah, I hope this can be of some use! If you have any feedback, let me know.

Edit: here's the list of the blacklisted titles as extracted from the two blacklisted sites:
Sorry but you are not allowed to view spoiler contents.
« Last Edit: August 12, 2014, 06:41:24 pm by type-a1pha »

Offline Sherlock

  • Former Staff
  • Member
  • Posts: 305
  • To Nyaa or Not to Nyaa..
« Reply #1 on: August 11, 2014, 01:57:22 pm »
This is pretty brilliant! Thanks ^_^ Brings back memories of Dille's 'Let's Flood BBT with Hentai manga' project..

Btw, didn't check everything, but Gacha-Gacha Capsule and Secret are present on BBT http://bakabt.me/browse.php?q=gacha+gacha

But since they're probably present in the same torrent, maybe the parser didn't detect them? Also, I think I saw a Yuru Yuri dj present too. Just throwing that out there, since you mention you'd excluded them.
« Last Edit: August 11, 2014, 02:01:10 pm by Sherlock »

Offline type-a1pha

  • Member
  • Posts: 52
« Reply #2 on: August 11, 2014, 02:48:24 pm »
Yeah, it's far from being a perfect cross-check. To exclude the manga I used two things: the BakaBT title, split on the character | and stripped of non-alphanumeric symbols, and the manga-updates id, which can be found in the related links of the wish list or in the description. So, if the title is misspelled or the links are missing, I didn't attempt to consider other information. Also, I had no standard way to gather information about batch torrents since there's no standard template for descriptions.
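
To make the matching rule above concrete, here is a rough Python sketch. It is an illustration under assumptions, not the original parser: the function names, the torrent dictionary layout, the titles and the ids are hypothetical, and lowercasing plus ASCII-only filtering are my own simplifications on top of "split on | and drop non-alphanumeric symbols".

```python
import re


def title_keys(bakabt_title):
    """Split a title on '|' and reduce each part to lowercase ASCII alphanumerics."""
    keys = set()
    for part in bakabt_title.split("|"):
        key = re.sub(r"[^a-z0-9]", "", part.lower())
        if key:
            keys.add(key)
    return keys


def is_on_bakabt(mu_title, mu_id, torrents):
    """Treat a manga as present if its id or a normalized title matches a torrent."""
    wanted = title_keys(mu_title)
    for torrent in torrents:
        if mu_id in torrent["mu_ids"]:
            return True
        if wanted & title_keys(torrent["title"]):
            return True
    return False


# Made-up torrent list: one torrent with an alternative title and one linked id.
torrents = [{"title": "Some Title! | Another-Title", "mu_ids": {1}}]

print(title_keys("Some Title! | Another-Title"))
print(is_on_bakabt("Another Title", 99, torrents))     # title match
print(is_on_bakabt("Some Title Side Story", 2, torrents))  # neither matches
```

The last call shows the weakness discussed in this thread: a related series bundled in the same torrent, but with a different title and an unlinked id, is reported as missing.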

Probably in the Gacha Gacha case the parser matched only one of the two ids, that is, the one in the description.
Edit: there are 3 Gacha Gacha entries in manga-updates:
- https://www.mangaupdates.com/series.html?id=44 <- present in BakaBT and not in the list (id 44)
- https://www.mangaupdates.com/series.html?id=27751
- https://www.mangaupdates.com/series.html?id=27752
The last two are included in the list and should be alternative stories, different from the first one (not sure...checking with a tablet is pretty painful).

In the case of Yuru Yuri, if you refer to the fact that it is a Doujinshi, the one included is tagged as manga but you are right that it shouldn't be there since it's not published and thus not eligible for offering. In any case, if a Doujinshi-tagged manga is present, it's manga-updates' fault since I excluded them by parsing the query response with the checkbox "exclude" checked.

I may improve the result by excluding the non-published ones...a thing that I didn't think about earlier! :/ ...thanks for pointing that out!

Moreover, I don't guarantee the absolute correctness of the code. :P
« Last Edit: August 11, 2014, 03:29:45 pm by type-a1pha »

Offline brunoais

  • Developer
  • Member
  • Posts: 1661
  • It's juice and jam time!
« Reply #3 on: August 11, 2014, 07:48:37 pm »
WOW, nice. Well done m8!
If you need help from a technical standpoint, I can help you.
BTW, as a 1st try, the js code looks quite good. Good job.

Offline SeventyX7

  • Member
  • Posts: 3212
« Reply #4 on: August 11, 2014, 07:53:04 pm »

In the case of Yuru Yuri, if you refer to the fact that it is a Doujinshi, the one included is tagged as manga but you are right that it shouldn't be there since it's not published and thus not eligible for offering. In any case, if a Doujinshi-tagged manga is present, it's manga-updates' fault since I excluded them by parsing the query response with the checkbox "exclude" checked.

Mangaupdates is terrible about tagging doujinshi.  Most of it is just tagged as manga.  Typically, it won't be tagged as doujinshi unless the person who submitted the original entry included a " - Naruto dj" at the end of it, in the entry name itself.

Also, pretty sure all Korean webtoons are tagged as Manhwa even though bbt treats it all as doujinshi.

Is there any way to add gender bender, yuri, or yaoi tags to your filters?  I have a 2 week break before school coming up and I will probably upload a shit ton during that time.

Offline futz

  • Member
  • Posts: 243
« Reply #5 on: August 11, 2014, 11:07:48 pm »
Pretty cool stuff here, but I have to point out that you missed an important thing while listing all the hentai. Loli/shota are not accepted on BakaBT, unless there's only a few chapters in a collection of short stories for example.

Here's what could be removed:
Sorry but you are not allowed to view spoiler contents.
It's mostly loli/shota stuff, when I added "needs to be checked" it simply means that I'm not sure about the amount/content of the manga. Others without any indication = I'm like 97% sure they won't be accepted.
I only went through the first 100 entries (on the manga list, no one-shot). And I unchecked Shounen Ai/Yaoi. So there's actually a lot more that could be removed from the list, as they won't be accepted under BBT rules.



Is there any way to add gender bender, yuri, or yaoi tags to your filters?
I see both yuri and yaoi tags. It seems that only the gender bender tag is missing.

Offline carks

  • Staff
  • Member
  • Posts: 418
  • Everyday is always the best day~
    • muh blog
« Reply #6 on: August 12, 2014, 01:21:11 am »
Pretty nice. Anyway, you need to take out DMP's licensed manga since they cannot be uploaded here :C

Offline type-a1pha

  • Member
  • Posts: 52
« Reply #7 on: August 12, 2014, 02:01:57 pm »
Ok. I made version 2.0 of the tables.

- Excluded manga with genre or tag Doujinshi, Lolicon, Shotacon or "dj" in the title (hentai-not-oneshot went from 300 KB to 170 KB while oneshot-hentai went from 1.0 MB to 380 KB...!!!)
- Excluded non-published manga (based on the Original Publisher field on MangaUpdates)
- Excluded blacklisted manga (based on the id of the English Publisher on MangaUpdates and a parsing of the titles from the DMP and DMI book listing)
- Added the missing tags in the tables search options
- Added a few search functions (prevail policies)
- Tried to improve the cross-checking here and there
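
The exclusion rules listed above can be sketched roughly like this in Python. This is only an illustration: the entry layout, field names, publisher id and example titles are invented, and matching "dj" as a whole word is my assumption about how the title check might work.

```python
import re

EXCLUDED_LABELS = {"doujinshi", "lolicon", "shotacon"}
# Assumption: "dj" only counts when it stands alone as a word in the title.
DJ_PATTERN = re.compile(r"\bdj\b", re.IGNORECASE)


def is_excluded(entry, blacklisted_publisher_ids, blacklisted_titles):
    """Return True if a (hypothetical) MangaUpdates entry should be dropped."""
    labels = {label.lower() for label in entry["genres"] | entry["tags"]}
    if labels & EXCLUDED_LABELS:
        return True  # genre or tag is Doujinshi, Lolicon or Shotacon
    if DJ_PATTERN.search(entry["title"]):
        return True  # "dj" in the title
    if not entry["original_publisher"]:
        return True  # non-published: no original publisher listed
    if entry["english_publisher_id"] in blacklisted_publisher_ids:
        return True  # English publisher is blacklisted
    return entry["title"].lower() in blacklisted_titles


def make_entry(title, genres=(), tags=(), publisher="Some Publisher", eng_id=None):
    """Build a made-up entry for the examples below."""
    return {
        "title": title,
        "genres": set(genres),
        "tags": set(tags),
        "original_publisher": publisher,
        "english_publisher_id": eng_id,
    }


blacklisted_ids = {7}            # made-up publisher id
blacklisted = {"licensed book"}  # made-up title, stored in lowercase

print(is_excluded(make_entry("Some Story - Naruto dj"), blacklisted_ids, blacklisted))
print(is_excluded(make_entry("Adjacent Worlds"), blacklisted_ids, blacklisted))
```

A title such as "Adjacent Worlds" is kept because the letters "dj" sit inside a word, while a trailing " dj" triggers the exclusion.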

Let me know if you have more feedback, suggestions on improvements, or find something that seems erroneous.

As for manhwa, manhua and webtoons, there shouldn't be any, since I parsed the result of this query:
https://www.mangaupdates.com/series.html?page=%%n%%&perpage=100&filter=scanlated&type=manga&exclude_genre=Doujinshi_Lolicon_Shotacon
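
The %%n%% in that query stands for the page number. A small Python sketch of expanding it into concrete page URLs is below; it assumes pages are numbered from 1 and that the last page number is known in advance, and it does not fetch anything. The function name is hypothetical.

```python
QUERY_TEMPLATE = (
    "https://www.mangaupdates.com/series.html?page=%%n%%&perpage=100"
    "&filter=scanlated&type=manga&exclude_genre=Doujinshi_Lolicon_Shotacon"
)


def page_urls(template, last_page):
    """Replace the %%n%% placeholder with each page number from 1 to last_page."""
    return [template.replace("%%n%%", str(n)) for n in range(1, last_page + 1)]


# Example with an arbitrary page count of 3.
for url in page_urls(QUERY_TEMPLATE, 3):
    print(url)
```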

I also added to the first post the list of the blacklisted titles as parsed from the sites.
« Last Edit: August 12, 2014, 06:42:36 pm by type-a1pha »

Offline 12laus

  • Staff
  • Member
  • Posts: 989
« Reply #8 on: September 29, 2014, 07:12:17 pm »
I've moved this to Torrent Requests and stickied it. Seems like it makes more sense here.

Online Al_Sleeper

  • Member
  • Posts: 8517
« Reply #9 on: September 29, 2014, 07:14:49 pm »
Please don't exclude lolicon and shotacon for non-hentai titles.

Offline Nemu_khao

  • Member
  • Posts: 102
« Reply #10 on: October 08, 2014, 05:25:43 pm »
Nice initiative. Btw your upload Uwagaki still shows up on the list :P

Offline Mistgun_Zero

  • Member
  • Posts: 6454
  • Idol~chan
« Reply #11 on: October 08, 2014, 07:11:00 pm »
This is pretty awesome. This is gonna be pretty helpful.

Offline type-a1pha

  • Member
  • Posts: 52
« Reply #12 on: November 27, 2014, 01:22:35 pm »
Nice initiative. Btw your upload Uwagaki still shows up on the list :P
That's because the lists are static and need to be "manually" updated. Since the beginning of August around 250 manga were added. So I'll probably update it in the next few days adding the requested changes.
« Last Edit: November 27, 2014, 01:25:45 pm by type-a1pha »

Offline moiman

  • Member
  • Posts: 23
« Reply #13 on: December 20, 2015, 09:15:48 am »
Year has passed. Could you update this list?

Offline Mistgun_Zero

  • Member
  • Posts: 6454
  • Idol~chan
« Reply #14 on: December 20, 2015, 10:41:43 am »
Year has passed. Could you update this list?
He is busy. Someone take over.

Offline carks

  • Staff
  • Member
  • Posts: 418
  • Everyday is always the best day~
    • muh blog
« Reply #15 on: December 20, 2015, 07:06:26 pm »
Year has passed. Could you update this list?
He is busy. Someone take over.
Well, someone i know might want to take this over, like, he might do his own thing like this. Who knooOoOoooOows.

Offline Hatsune4505

  • Member
  • Posts: 154
  • So many games, so little time!
« Reply #16 on: January 02, 2016, 10:12:24 pm »
I've used this list for most of my uploads and it's still nowhere near done.  ;D  Especially the tons of shoujo manga that I'm probably not gonna upload. This list is quite useful.
« Last Edit: January 02, 2016, 10:26:29 pm by Hatsune4505 »

Offline moiman

  • Member
  • Posts: 23
« Reply #17 on: February 18, 2017, 03:12:42 pm »
Links seem to be broken now.

Offline Hatsune4505

  • Member
  • Posts: 154
  • So many games, so little time!
« Reply #18 on: February 19, 2017, 02:34:56 am »
Links seem to be broken now.
I kept a copy of the "Manga not oneshot not hentai" page in HTML format for offline use. I'll leave a link here in case anyone wants to see it while type-a1pha still hasn't updated the links.
https://drive.google.com/file/d/0B4U7yTO0SbGZLWcweVRhc3l1X3c/view?usp=sharing
Edit: Just download it and you can now view it offline, too. Also, it is a bit outdated because more manga have been completed and a few from the list are now uploaded.
« Last Edit: February 19, 2017, 02:58:23 am by Hatsune4505 »