
/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.


New user? Start here ---> http://hydrusnetwork.github.io/hydrus/

Experienced user with a bit of cash who wants to help out? ---> Patreon

Current to-do list has: 1,626 items

The program is now on Python 3! Check v335 release post if you need to update from before then!

Current big job: None--check the pinned poll thread


File: 3711d0607c5673e⋯.mp4 (3.38 MB, 1024x576, 16:9, 3711d0607c5673ea2d3f398a9a….mp4)

ff909c  No.12152

Ok lads, as I am now finishing up OR search, I am soon going to be free to work on a new 'big job'. I am pleased that I was able to make simple Client API and OR search in much faster iterations than previously. I hope to continue like this, keeping the next big job 8-12 weeks at the most before running a new poll.

The current list is:

Just catch up on small work for a couple of months

Reduce crashes and ui jitter and hanging by improving ui-db async code

Clean up code and add unit tests

Improve tag siblings/parents and tag 'censorship'

Add ways to display files in ways other than thumbnails (like 'details' view in file explorers)

Add text and html support

Add Ugoira support (including optional mp4/webm conversion)

Add CBZ/CBR support (including framework for multi-page format)

Add import any file support (giving it 'unknown' mime but preserving file extension)

Improve 'known urls' searching and management

Explore a prototype for neural net auto-tagging

Add support for playing audio for audio and video files

Add ui for waifu2x and other file converters/processors

Write some ui to allow selecting thumbnails with a dragged bounding box

Add popular/favourite tag cloud controls for better 'browsing' search

Improve the client's local booru (this likely now means a backend migration to the Client API)

Improve duplicate db storage and filter workflow (need this first before alternate files support)

Improve shortcut customisation, including mouse shortcuts

Add ratings import/export, and add 'rating import options' to auto-rate imports

Add more commands to the undo system

Improve display of very large/zoomed files in the media viewer

Set thumbnail border colours on user-editable rating and namespace conditions

Improve hydrus network encryption with client cert management and associated ui

Add tag metadata (private sort order, presentation options, tag description/wiki support)

Improve file lookup scripts and add mass auto-lookup

Add multiple local file services (which will enable true nsfw/sfw partition)

Add an incremental number tagging dialog for thumbnails (for adding page:n etc… to a sequence of files)

Permit custom ordering of thumbnails, through mouse-dragging or otherwise

Allow user to have multiple open split tab columns or separate windows with one or more pages

Improve rating workflow by providing score representatives to compare with

Add file modified/creation timestamp searching and sorting

Write an URL Repository so clients can share known url mappings

Add animated thumbnails for videos (animating on mouseover)

Allow multiple custom 'open externally'-style file launch commands for files

Add version tracking to downloader system objects and explore remote fetching of updates

Expand file notes system

I will put up this poll with the 349 release post and then select whatever seems to be on top by 351, with the proviso that I will try to discount any non-organic (e.g. botted) votes. You will be allowed to vote for multiple items. I am happy to work on any of it.

Please feel free to suggest new items or ask for longer explanations of any of the above. I will edit the list as new items are agreed on.


85001f  No.12154

I like the idea of CBZ support because I've got quite a few of those, but it's more important to have hashes of the archive contents than of the archive itself, and I can't think of how to resolve that with the 'hash is the file identity' paradigm.

I recognize that this will make the storage requirement of manga marginally larger, but I would be willing to have that for some order.

As such I think it would be best to just have a way to unpack these files and improve the way hydrus manages and displays collections.
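
To make the archive-versus-contents point concrete, here is a rough Python sketch that hashes both the archive and each page inside it. It assumes SHA-256 as the identity hash and a plain zip container; `hash_archive_and_pages` and `make_cbz` are hypothetical helper names for illustration, not anything in hydrus.

```python
import hashlib
import io
import zipfile


def sha256_hex(data):
    """Hex SHA-256 of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def hash_archive_and_pages(archive_bytes):
    """Return (archive_hash, [(member_name, member_hash), ...]) for a zip/cbz."""
    archive_hash = sha256_hex(archive_bytes)
    pages = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        # Sort names so the result does not depend on how the zip was written.
        for name in sorted(zf.namelist()):
            if name.endswith("/"):
                continue  # skip directory entries
            pages.append((name, sha256_hex(zf.read(name))))
    return archive_hash, pages


def make_cbz(pages, compression):
    """Hypothetical demo helper: build an in-memory cbz from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        for name, data in pages.items():
            zf.writestr(name, data)
    return buf.getvalue()
```

Two archives built from the same pages with different compression settings get different archive hashes but identical page hashes, which is why hashing only the archive misses duplicates that hashing the contents would catch.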


b5de07  No.12155

-public url repo (aka PURR) - imagine searching the hydrus tag db then downloading all the urls associated with the resulting hashes

-another round of api because api is great


85001f  No.12156

-PCIDR: Public ipfs content id repo

ipfs content ids separate from url repository.

-Also I would make the PURR optional to pull from, because I can imagine that not scaling well.

-API

-duplicate pairs

-manage subscriptions

-add url without publishing a page

-get GUG/urlclass results

-Subscription url class

Generate subscription entry from url, good for use with hydrus companion


cb1608  No.12157

watch uma musume


4ed47a  No.12158

From top to bottom, this is my main wishlist:

- Add multiple local file services (which will enable true nsfw/sfw partition)

- Improve the client's local booru (this likely now means a backend migration to the Client API)

- Add an incremental number tagging dialog for thumbnails (for adding page:n etc… to a sequence of files)

- Add more commands to the undo system

On the other hand, I'm still quite a few versions behind. I tend to keep a large number of pages open because my tagging barely keeps up with my downloading, but having a large number of pages open on upgrade means that I often lose my session on upgrade and all my organized-but-not-tagged-yet files vanish into the database. So I'm not in any real rush for any of those.

Sure, a mass auto-tagger would be helpful with my tagging problems, but I don't think it necessarily needs to be built in. It looks like the Client API should provide most or all of what's needed for external tools to do that, and I've already got my own little personal-use mass-tag-search tool I wrote myself that should integrate nicely with it when I upgrade to a version that has the API.


9e79ff  No.12160

I've seen it suggested a few times on Discord now:

How about namespace aliases and/or renaming?


f75732  No.12161

>>12152

Going to have to ask for an expansion of labeling why things are deleted

The current 'where it got deleted from' record helps a lot. In fact it could facilitate a jury-riggable system if we had different places to send files. Currently we have

Inbox

Trash

Archive

If the new where it got deleted from knows that it was in inbox, or it was in archive when it was deleted, then having other places to deposit files, such as

Waste of Hdd

Generic

Bad Meme

could be user made, and while it's more steps to deleting the files, it would be tagged in the detailed status as 'deleted from 'waste of space'' and effectively work as an explanation of why things are deleted.

If this is possible, I'll vote for it over just deleted file tagging, as more users have a use for more places to put files than they do for tagging reasons for removal. I do understand my problem is a minority one, but if it's possible to solve it through an encompassing issue, then I'll go that way instead.


621edc  No.12162

I think something like "file sibling" will be great.

for example:

you have file A and B.

you delete file B in the duplicate filter in favor of file A because A is better.

when you import file B from a booru, it will be marked as deleted. I think it should point to A instead and give it the tags. and same with the ptr.
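
A minimal Python sketch of that "file sibling" idea, assuming a simple in-memory mapping from a deleted file's hash to the file that was kept in its place. `better_of`, `resolve_file` and `add_tags_following_siblings` are made-up names for illustration and say nothing about how hydrus stores duplicates.

```python
def resolve_file(file_hash, better_of):
    """Follow 'deleted in favour of' links until reaching a file that was kept."""
    seen = set()
    while file_hash in better_of and file_hash not in seen:
        seen.add(file_hash)  # guard against accidental cycles in the mapping
        file_hash = better_of[file_hash]
    return file_hash


def add_tags_following_siblings(tag_store, file_hash, tags, better_of):
    """Apply tags meant for file_hash to whichever file replaced it."""
    target = resolve_file(file_hash, better_of)
    tag_store.setdefault(target, set()).update(tags)
    return target
```

With `better_of = {"B": "A"}`, tags arriving for B from a booru or the PTR would land on A instead of being dropped with the deleted file.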


5ab69a  No.12166

>>12152

1. Text, Comic Books, Documents, html/xml/json and code support.

2. Scraping documents or webpages support (see list in bottom)

3. Older video, lossy audio, lossless audio and other arbitrary file support


https://github.com/mikf/gallery-dl (Python, Art Gallery and Booru)
https://github.com/Bionus/imgbrd-grabber (C++, Booru)
https://github.com/Xonshiz/comic-dl (Python, Comic/Manga)
https://github.com/yuru-yuri/manga-dl (Python, Manga)
https://github.com/manga-download/hakuneko (JS, Manga)
https://github.com/Hamuko/cum (Python, Manga)
https://github.com/NguyenDanPhuong/MangaRipper (C#, Manga)
https://github.com/JimmXinu/FanFicFare (Python, Novel + FanFic)
https://github.com/kanasimi/work_crawler (JS, Comic/Manga + Novel)
https://github.com/riderkick/FMD (JS, Comic/Manga)
https://github.com/rg3/youtube-dl (Python, Videos) (yes I know)
https://github.com/adolfosilva/libgen.py (Python, LibGen)
https://github.com/NadalVRoMa/PyLibGen (Python, LibGen)
https://github.com/bibcure/bibcure (Python, LibGen/SciHub)


f75732  No.12168

So, I just got done with the first 1000 images on the duplicate processing

Shit was slow to start due to how the program handles zooming in and out, and seeing that progress was VERY slow, I got something called magnifier 2.4

http://www.iconico.com/magnifier/

I have a second 13 inch 1080p screen now,

So with the magnifier being the entire second screen, I put the cursor on the eye, zoom in 10x and flip

Shit significantly increased productivity when looking for dups

im wondering if there is something that could be done in program for much the same effect, or even utilizing the fact the program has the source files to look over to not only zoom in, but possibly even increase resolution till zooming is needed


262c36  No.12169

The "yiff.party attachments" simple downloader is broken, the links it finds are 404s.


558767  No.12171

>>12168

pretty much doing the same thing but with dsr and [win] + [+]. there is a built-in solution in win that does it


43e3de  No.12172

>>12152

From what I can tell the client UI doesn't hang enough to prioritize that right now

Neural net auto tagging is something I'd like in the future but not before batch/sustained mass lookup

Custom thumbnail ordering might be useful for some but I see little reason to prioritize it now

My priority list would likely be

1) improve local booru and go ahead with migrating to client api

2) batch/mass tag lookup and application, maybe with an option to negate application of tags before committing to either local or remote tag repo

3) db storage and filter workflow to open up alternate file support and maybe squeeze out extra performance

4) split windows/tabs for a/b trash application on manual file import


262c36  No.12173

In the manage tags window, can we get the option to NOT remove tags we type in manually if they already exist? Typically when I type it I want it added, if I want to remove tags I double click them with the mouse. The way it is now is just annoying.


ca1447  No.12174

I haven't followed hydrus closely since the Python 2 to 3 move.

Are pages and filenames metadata yet? As in: on pixiv, this filename and page if part of a series; on nijie, this filename and page; in this tank, etc. Is it even doable?

Do we have a separate editable metadata box?

Can tags be weighted like what we see in vndb (very useful for comic/image set tags vs illustration tags)?

Do we have a separation between spelling error hard siblings (always replace) and alias soft sibling (never replace but appear as).

can we alias a namespace easily?

And most importantly, do we have a way to organize the subscription per circle/author and not per site.

If metadata is added, it should be possible to sort by specific elements of the metadata, the like of pixiv work:#

Thanks for bearing with me.


41b719  No.12175

>>12173

this is workflow I use often, I don't take my hands off the keyboard when tagging so I need to be able to delete tags by typing them. If dev changes this please do it with a setting.


269a51  No.12176

>>12152

My wishlist from top importance to least:

-Improve duplicate db storage and filter workflow (need this first before alternate files support)

I think the duplicate filter UI could use some work. One thing is that the top grey panel covers a lot of the images. Transparency and the ability to move the panel would both help.

Another thing that could help is using symbols and color coding to make it easier and faster to recognize the details of the image.

Maybe an icon with arrows could indicate resolution/dimension of the image. If the icon is green, it means the current image of the comparison is higher res. If it's red, it's lower res. A green calendar could mean more recent, and a green T could mean more tags. Grey could mean it's a tie for any of the aspects, and black could mean zero (as in no tags). There could be PNG or JPG icons to show file type too.

-Reduce crashes and ui jitter and hanging by improving ui-db async code / Just catch up on small work for a couple of months

Some good old fixup work would be great too though.

-Improve the client's local booru (this likely now means a backend migration to the Client API)

I had some thoughts about basing the booru more around the idea of shared pages than the current state of sharing batches of files.

-Add popular/favourite tag cloud controls for better 'browsing' search

The more browsing features, the better!

-Add file modified/creation timestamp searching

This could be really handy for going back into previous years of my collection where the tagging isn't so thorough compared to today.
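
The colour-coding idea for the duplicate filter above could be sketched like this in Python. The function names and the dictionary keys (`pixels`, `tags`, `imported`) are assumptions for illustration, and letting black win when a value is zero is one possible reading of the rule.

```python
def indicator_colour(current, other):
    """Colour for one aspect of the current file versus the other file."""
    if current == 0:
        return "black"  # nothing there at all, e.g. no tags
    if current > other:
        return "green"  # current file is better on this aspect
    if current < other:
        return "red"    # current file is worse on this aspect
    return "grey"       # tie


def compare_files(current, other):
    """Map each shared aspect name to its indicator colour."""
    return {key: indicator_colour(current[key], other[key]) for key in current}
```

Here `pixels` would be width times height, `tags` a tag count and `imported` a timestamp, so a larger number always means "more" and the same comparison works for every icon.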


88eeab  No.12177

One of the biggest areas that need improvement is the duplicates section: it will be so convenient to be able to check for all duplicates inside an opened page


8e95b7  No.12178

File: df93140821fdeb5⋯.png (1.31 MB, 2000x2200, 10:11, df93140821fdeb59cd55faa4e8….png)

Honestly all I wish right now is polishing the already existing features, instead of adding more half finished ones. Duplicate processing still has a lot of work to do, like highlighting certain important tags like "revision", allowing to check new imports for duplicates instead of only being able to check the entire collection, better comparison between images (retain zoom level with different aspect ratios, make zooming close not freeze the entire program), giving images ratings (which you mentioned). I think the client api could also be improved, and possibly make some sort of direct integration with Hydrus possible, for example being able to add custom actions to right click menus (my long term wish would be being able to have an action that processes the selected files externally with waifu2x, add them to the file service, set them as alternates to the original files, copy the original tags and adding a "waifu2x" tag, while being able to do all of this using the client/plugin api). I think a highly extensible api for Hydrus would be very healthy for the program, because for a long time the trajectory has been that of baking literally everything to the program, even if they were possible to do with simple plugins, making Hydrus even more of an unmaintainable mess as the time progresses. Also, all the usual ones like audio support and less crashing on linux.


6d6dd6  No.12180

>>12175

There should absolutely keep being a keyboard friendly way to delete tags,

but IMO a better solution than toggle would be prefixing a tag with minus to delete it.

either that or a modifier like alt+enter
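
For what it's worth, the minus prefix and the "add only" option can live side by side. A rough Python sketch, assuming tags are plain strings in a set and that a leading "-" is never part of a real tag; `apply_typed_tag` is a hypothetical name, not the manage tags dialog's actual code.

```python
def apply_typed_tag(current_tags, typed, add_only=False):
    """Return a new tag set after the user types `typed` and hits enter."""
    tag = typed.strip()
    result = set(current_tags)
    if not tag or tag == "-":
        return result  # nothing meaningful was typed
    if tag.startswith("-"):
        # Explicit removal: "-blue sky" removes "blue sky" if it is there.
        result.discard(tag[1:].strip())
    elif tag in result and not add_only:
        # Toggle behaviour described in the thread: retyping an existing tag removes it.
        result.discard(tag)
    else:
        # New tag, or an existing tag with add-only on (a no-op for a set).
        result.add(tag)
    return result
```

With `add_only=True`, retyping an existing tag leaves it alone, while "-tag" still gives keyboard-only users a way to delete.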


41b719  No.12181

>>12180

since dev has made or with a modifier key too I feel like a modifier like that is a good idea.


262c36  No.12187

>>12169

oops this was supposed to go in the main release thread…

>>12175

Sure, that's why I suggested an option for it. There's that cog button right there just waiting for it. A modifier key is a good idea too though.


d4832c  No.12188

Since people are talking about shortcuts, I think it'd be great to go another step further and update the shortcut system to accommodate an entirely keyboard-driven setup. Currently there are some things you have to use the mouse for, not to mention window focus glitches that require you to refocus with the mouse.


4ee079  No.12191

>>12188

Hydrus in Emacs when?


228e79  No.12192

A U D I O

U

D

I

O


3fd3de  No.12196

>>12174

The Pixiv and nijie downloaders work but it's still too messy to use in Hydrus. Your biggest concern will be trying to translate all those Japanese tags if you don't know Japanese. From 1 artist alone, it took me 1-2 weeks of changing all the Japanese tags into English using siblings, as I don't know any better way to do this.


5ab69a  No.12198

>>12191

Vim or bust.


5ab69a  No.12199

Anything from >>11761


6c3532  No.12200

>>12154

You might wanna check >>12166 which has a broader scope.


000000  No.12201

hi! could you please consider adding pagination (booru-style) to files dialog?


ff909c  No.12210

>>12154

Yeah, a potential route is for me to provide both actual cbz and virtual, either in a clever or dumb way, so you can access both the archive and the pages as separate files. For a first step though, I would only start with adding actual .zip/.rar inspection and presentation, so it would always be a single file in hydrus. What this job really is is adding single-file-but-multi-page presentation support to the media viewer, some kind of a secondary previous/next when looking at a cbz object so you can page through it as well as move to the previous/next actual media file.

There would also be ancillary stuff like cleverer cbz/cbr thumbnail generation from first page and so on.


ff909c  No.12211

>>12155

Thank you. I will add public url repo to the list. I want to avoid heavy work on the API for a little bit and see how fitting it into regular weekly work goes, so I will skip it this time.


ff909c  No.12212

I am changing "Add file modified/creation timestamp searching" to "Add file modified/creation timestamp searching and sorting".


ff909c  No.12213

>>12156

IPFS multihashes sort of fit into the idea of a URL Repository, so perhaps a second version of a URL Repo would support other identifiers. It could also fit into a completely different Hash Repository, for public sharing of md5 and sha1 hashes as well, which might be neat for some future booru lookup operations where looking up the md5 of something you don't have may be useful.

Would you like me to add an IPFS Repo to the list, or maybe just a full iteration on current IPFS support first? I still want to add that no-blocks upload system, so maybe that is better done first.

I would run a PUR I think, but it would be as optional as the PTR.

As I said just above >>12211, I will keep API off this cycle. I am happy with the first version I have just done and want it to get a bit of use and extended feedback before I go heavy back to it. It'd be nice if I can just do little work here and there without having to wait a whole big job iteration to push it forward.


ff909c  No.12214

>>12160

That's covered by "Improve tag siblings/parents and tag 'censorship'". I'd love to have user-custom namespace siblings, including from something to nothing, like if you don't like 'clothing:bikini', you can have that appear as 'bikini'.


ff909c  No.12215

>>12161

Yeah, multiple local file services is probably your best bet for this. I'd add the particular local file domain's name to the deletion reason.

But tbh, now I have that deleted reason table, I may also sneak in a quick option for you that allows you to set up some custom deletion reasons and then add a new button to the delete confirmation box. It'll let you select that 'bad meme' reason and pass that instead of the current generic reason. Please play with the existing new system for another week and then let me know what sort of workflow you would like.


ff909c  No.12216

>>12162

Yeah, I would like this. It will require some db prep work before I can do it, so you want "Improve duplicate db storage and filter workflow (need this first before alternate files support)" for now. File alternates will have some sort of file family relationships support with it.


860cb9  No.12217

I vote for collections (so we can read/archive mangas etc. properly) and audio in the media viewer.

PS: Just something to consider: blacklist files based on pHash (similarity).


ff909c  No.12218

>>12168

I have this as "Improve display of very large/zoomed files in the media viewer" for now. I'll move to a tile-based rendering system, so I am only storing what's on screen in memory. Atm, I make a giganto bmp which is pretty shit for several reasons.
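
As a rough illustration of why tiling keeps memory down, this Python sketch works out which tiles intersect the viewport. It assumes fixed square tiles and a viewport given in pixel coordinates of the zoomed image; `visible_tiles` and the 256 pixel tile size are arbitrary choices for the example, not the planned implementation.

```python
def visible_tiles(view_x, view_y, view_w, view_h, image_w, image_h, tile_size=256):
    """Return (col, row) for every tile that intersects the viewport."""
    # Clamp the viewport rectangle to the image bounds.
    left = max(0, view_x)
    top = max(0, view_y)
    right = min(image_w, view_x + view_w)
    bottom = min(image_h, view_y + view_h)
    if left >= right or top >= bottom:
        return []  # viewport is entirely outside the image
    first_col = left // tile_size
    last_col = (right - 1) // tile_size
    first_row = top // tile_size
    last_row = (bottom - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Only the returned tiles need to be rendered and cached, so memory follows the window size instead of the zoomed image size.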


ff909c  No.12219

>>12169

>>12187

No worries, should be fixed in 348.


ff909c  No.12220

>>12174

No, I am afraid most of that stuff is not in yet. Some of those issues would be worked on in the items in the big job list.


ff909c  No.12221

>>12173

Sure, that is small and I can add it in the next few weeks. I'll make a new 'add only' check item in the cog button of manage tags dialog.


ff909c  No.12222

I am adding "Add animated thumbnails for videos (animating on mouseover)" to the list.


ff909c  No.12223

>>12181

>>12180

>>12188

>>12191

>>12198

Yeah, you probably want "Improve shortcut customisation, including mouse shortcuts". There's still a lot to do here, but the ideal is to allow customisation for all actions. I'd love to speed up tag management for keyboard-only, and it would start there. So much shortcut processing is still hardcoded.


85001f  No.12224

Run with command line (all argv or xargs like), context menu for thumbnail grid.


85001f  No.12225

I vote to call the Public URL repository

PUR for consistency, or

PURe because it's memorable


ff909c  No.12226

>>12201

Can you explain this more? What's the 'files dialog', and how would you like it to page? Would you like the thumbnails from a regular search to be split into pages?


ff909c  No.12227

>>12217

For sub-paged collections in the media viewer, I think you want cbz/cbr support, which tackles this basic problem of adding two tiers of pages in the media viewer. I can tack collection paging onto the media viewer as part of that.

Blacklisting by similarity is something I would love to do when the duplicate filter is more mature. If I can be 99.98% sure in code that one file is the same as another, I can auto-dupe clear on several rules for easy situations (like exact pixel dupes) and only leave complicated/fuzzy dupe processing to human eyes. And as soon as there is auto dupe processing, I can do it on import and fail the import with an appropriate message.


ff909c  No.12228

>>12224

Thank you, I am adding "Allow multiple custom 'open externally'-style file launch commands for files" to the list


85001f  No.12229

>>12161

I would go with a total 1 byte ranking number, which you can optionally assign labels to. Lower numbers have higher trash priority.


000000  No.12230

>>12226

thank you for reply

by "files dialog" i meant "my files" page

yes, I guess you're right. the only option i see when i have many images selected by query but i don't want to load all of them is to add system:limit to search, but there's no system:offset. it would be awesome to have the opportunity, for example, to set number of thumbnails on page in hydrus options and scroll through this pages even without system:limit. for sure not all thumbnails/tags should be loaded, but only ones that are on page

from gui view, i guess arrows with [curr. page]/[tot.num of pages] will be enough
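
Roughly what that paging would compute, as a Python sketch. It assumes the search has already produced an ordered list of file ids and that only the slice for the current page gets its thumbnails and tags loaded; `get_page` and `page_label` are made-up names, and the offset here only stands in for the missing system:offset.

```python
import math


def get_page(file_ids, page, per_page):
    """Return (ids_on_page, clamped_page, total_pages) for 1-based `page`."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    total_pages = max(1, math.ceil(len(file_ids) / per_page))
    page = min(max(page, 1), total_pages)  # clamp out-of-range requests
    offset = (page - 1) * per_page         # what a system:offset would carry
    return file_ids[offset:offset + per_page], page, total_pages


def page_label(page, total_pages):
    """Text for the [curr. page]/[tot. num of pages] indicator."""
    return f"{page}/{total_pages}"
```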


5ab69a  No.12232

>>12225

If that is the case PTR should be called PeTR


af51b2  No.12233

File: d326c653fca23f4⋯.png (328.28 KB, 590x590, 1:1, stopit.png)

>manga/CBZ

>hydrus

Why? There is already a good software for those things. HappyPanda/X for example.

I fear that hydrus is slowly going to become an all-in-one bloated piece of software, straying from the original path of being an organizer of random images.


4c2469  No.12234

>>12233

>Why? There is already a good software for those things. HappyPanda/X for example.

Indeed, the most zip/cbr support hydrus needs is being able to make thumbnails for those.


3fd3de  No.12235

>>12233

I kind of agree with this, some things just shouldn't all mix together. Doujins just add a whole other layer of tagging that just doesn't mix well with single images, videos, txt files, etc. All this stuff is really starting to clash with hydrus. Hydrus needs to be split into different programs if you really want all that extra stuff, or you run multiple instances of Hydrus (which I'm currently doing for videos, but mostly just testing it out).

I suggested this once, but a separate version of Hydrus that's meant solely for doujins would be a pretty nice idea, as some people might not want to use a webclient like happypandX or LANraragi. Managing doujins is a whole lot easier than single images so I doubt you'd need to do much work to manage a program like that.


bb3719  No.12236

Siblings handle some cases where you want to replace namespaces, but this doesn't work well for single images or long tags (which you have to retype).

I'd like an option to change the namespace for selected tags in one or more on the fly, e.g. change an unnamespaced tag to an actual name space, or replace one for another.

For example, I'm manually changing many creator: tags to blog:, tumblr:, instagram: or whatever because the account owner is not the actual creator of the work. I can do this with some SQL-fu, but it would be much easier to, say, right-click a tag in the Manage Tags screen, select "change namespace", then type in the new namespace or select it from a dropdown.

It would be nice to have some kind of search-and-replace interface for tags globally when doing this kind of maintenance.
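
The core of a "change namespace" action is small. A Python sketch, assuming tags are plain `namespace:subtag` strings split on the first colon; `split_tag` and `change_namespace` are hypothetical names and this does not touch the real database.

```python
def split_tag(tag):
    """Split 'namespace:subtag' into (namespace, subtag); unnamespaced tags get ''."""
    namespace, sep, subtag = tag.partition(":")
    if not sep:
        return "", tag
    return namespace, subtag


def change_namespace(tags, selected, new_namespace):
    """Return a new tag set where every selected tag is moved to new_namespace."""
    result = set()
    for tag in tags:
        if tag in selected:
            _, subtag = split_tag(tag)
            # An empty new_namespace strips the namespace entirely.
            tag = f"{new_namespace}:{subtag}" if new_namespace else subtag
        result.add(tag)
    return result
```

The same function covers the three cases mentioned: swapping one namespace for another, adding a namespace to an unnamespaced tag, and removing a namespace.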


29857b  No.12237

>>12235

>Managing doujins is a whole lot easier

I might be slightly biased towards this(I'm the dev of one of the webclients and it's not HPX), but doujin providers are much more annoying to scope for metadata than boorus. Alongside this, you also can run into encoding issues with the filenames inside of the archive.

Hydrus bypasses encoding problems by renaming files currently (and it's probably much better off thanks to that), but that wouldn't apply to files inside of a cbz.

I'd personally rather see support for more esoteric formats like xml/svg than manga, but that's just me wishing for stuff that'd fit my workflow more. With the original HP dead there's not much left in terms of desktop clients for doujins.


f75732  No.12240

File: bec24e26cb659fb⋯.png (10.81 KB, 603x286, 603:286, chrome_2019-04-15_07-08-28.png)

>>12215

Honestly loving the new system as, while dup filtering is slow and generally i'm getting rid of the 120kb version instead of the 10mb one, it's slower than what I would find optimal.

lets give you an example,

https://boards.4chan.org/s/thread/18749087

generally any image I get from /s/ won't have duplicates, and generally I subscribe to the 3dpd philosophy. However, on this board I pick only the threads that have an interest for me, or are really attractive. Going through this thread, I would probably keep 1/3 of the images, with most being far too low quality, and even using them as art references would be a difficult task. If I could delete them and tag them with 'unattractive' or 'low quality - real' so that's what shows up, I would, and I could likely go through all my threads from real boards and get back quite a bit of space in short order doing this.

as for how I would want it… you may remember a while ago, I posted a 'mockup' of what it could look like: an area for inputting a reason, and several quick input buttons for saved canned common reasons, all of this on the delete image dialogue, and all of it completely ignorable in case you don't want to add a reason.

If this was an option either at the send to trash stage and if nothing was added a second chance to add something at the permanent delete stage would be nice.

while I love the idea of inputtable text for a delete (and in cases like this, if I delete 1 or 1000 images at once, the one input applies to all) i'm not tied to it, this just facilitates custom reasons to better explain why something is there. lets say I have 3 quick access buttons

Low quality

Meme

Waste of hdd space

and someone decides to just dump gore which I don't want, so I would have the ability to make a custom 'jackass dumped gore' for the reason and go on, not needing to take up quick action slots, and it gives further context. Let's say there is a good r34 thread on /b/. If I saw low quality or waste of hdd space, I may still be inclined to see what it was because the rest of the thread gave me some good images, but the added context of 'jackass posted gore' or 'scat' (the common shit posted to threads to bumplimit them without falling into meme) would tell me all I need to know.

If you go for a drop down and select approach, I highly suggest having some quick slots in the top 5 or so because low quality will get used far more often than 'asshole posted scat' or 'stix log shit' or other generic bullshit people will post to hit bump limits.

TLDR

 ____________________________________
| send this file to trash?           |
| do you have a reason?              |
| {---- Generic text box here ----}  |
| [1] [2] [3] [4] [5]                |
|                                    |
|                         [yes] [no] |
 ------------------------------------

With 1-5 being quick adds that paste text to the generic text box

Because its so tedious to use image editing software to mock something up, here, its also the image in case posting formats it and fucks it to hell and back. this is what I consider ideal, 5 quick reasons, with a text box for a custom more 'fuck these images, seriously' reason

Honestly I will take drop down with some quick reasons and no 'per image group' special reason, so long as there are a few custom slots for reasons that can cover nearly everything I need, the images I would use a custom reason for getting rid of would just have to fall into something generic.

And BIG THANK YOU for this, If I know that something like this is coming soon, im capable of not downloading images for a few releases if it gets to the point of requiring a new hdd. The first steps to getting my db under control are here.


f75732  No.12241

>>12218

lol well aware of those reasons, the bmp has eaten all my ram before stopping its attempts to render a few times.


f75732  No.12242

Oh, one thing hdev

I have gone into this in length in the past with manga viewing/viewer version.

If you go down this road to actually doing a manga version before you go further would you be able to have a pause like this for input?

personally i'm of the mindset that you would have to 100% make all storage user parseable. with images, I have clumps of who gives an actual fuck what its named, so if hydrus dies tomorrow, I lose nothing, but as far as a manga reader goes, if hydrus dies, there goes my entire db/collection.

Personally, I think that it would be for the best to branch hydrus into 2 programs at that point, one for manga, one for images. While having the same base codebase so they can both be worked on at the same time, they would archive and present things in 2 radically different ways.


f75732  No.12243

>>12233

>>12234

>>12235

Hentai panda x is fucking garbage

Hydrus with its current tagging system and a public repository could tag every manga in existence and it would all come down automatically with little to no hassle.

because hydrus would take control of the hentai/manga itself, it would never run into the 'whoops, you moved a file, all the info is now gone' that every current option with tagging has,

it has nearly all the ui elements that I would need to replace acdsee as my manga reader

The only issue is it's not user parseable, which would need to change, and is why I propose a split into a manga version and an image version, using the same codebase, just different things get ticked for different versions. Images would all be stored in hashes, manga would all be stored in group-author/manga/author folders: group-author for hentai/doujin, manga for known manga, and author for oneshots.

This way if hydrus fucks itself, it has saved everything in a non destructive method and likely saved it in a way that would put anything we currently do to shame

every few months I get so pissed at acdsee that I go out looking to see if ANYTHING can replace it, and long story short, nothing can, but hydrus is SO fucking close to being able to do it that I would love to see it try. Most cbr/comic readers fail miserably because they rely on outside programs to do something, usually acting as a browser, and this sucks because I can't read one-handed. What I mean is, with acdsee I get to a manga I want to read:

enter, down, up, enter

i'm now reading the manga

page down read till done

enter, backpage im out

down and i'm on the next chapter

repeat above and I'm now reading the next chapter. Modify for your manga of choice's distribution method.

The closest replacement I have found is comicrack. The whole system is near perfect, yet a few steps away from perfect at the same time. I have to use the in-program library to save any data, and I can't modify files on the hdd through the library, so quick removal of shit is not possible. I also can't do easy dup searches and have to rely on things being named correctly, so shit is very suboptimal from an archive management perspective. The way I have it set up, it won't straight up lose info I add to it if a file moves, but I'm not able to even attempt to file manage anything anymore or else shit will get fucked.

I'm honestly using this just for porn, as it's a far better way to do it than acdsee at this point, and hentai panda x can't handle a 45000 file archive without shitting itself horribly. Comic rack, while it's dead, does a good enough job as an intermediary between acdsee not being good for porn and hydrus adding manga to its functions. The worst that will happen is I need to manually redo everything I do in this program, which honestly may be fairly simple. I don't plan on going overboard, because the last time I did that with a tagging program, I lost 10k tags because I thought they were file based and moved with files, not program library based.


394bad  No.12245

File: ace2bdcfe1255c4⋯.png (471.28 KB, 1200x2093, 1200:2093, Money_and_job_Osaka.png)

>>12243

>asking essentially for a second new program with a whole new metadata storage paradigm in order to fit your workflow

wew

And I thought I was being greedy asking for svg


f75732  No.12247

Have an idea for a duplicate auto mode.

going through the dup finder, I am encountering quite a few jpeg images that were converted to png at some point, the png being 3-5 times the file size for the exact same image. Would there be a way to check for duplicates that are jpeg and png, then auto pair them against each other to see if they are literally the exact same quality, or within a tolerance where the increased size does not make up for the extremely minor gain in quality, if there is even a gain at all?


262c36  No.12248

>>12243

Just make a separate database if you don't want to mix manga with images


f75732  No.12249

File: a493e183588d604⋯.png (610.06 KB, 1919x2074, 1919:2074, ACDSeeUltimate2018_2019-04….png)

File: d79f1f1e63da1ca⋯.png (161.31 KB, 1926x1332, 107:74, ACDSeeUltimate2018_2019-04….png)

>>12248

did you read what I wrote? that is not the issue at all. the issue is that manga needs to be handled FAR differently than the way single images are dealt with. you can hash images with next to no issue, but let's say hdev dies and something stops hydrus from working again: there is 0 chance of pulling a reasonably large manga collection out of the hashes, even less so if instead of cbz it held them as separate images.

the way that it handles sorting files NEEDS to be user parseable.

2 programs, one for images and one for manga, are needed. let's give an example:

for hydrus to even potentially be useful as a manga reader it would need a folder directory that you can go in and out of, because if you don't have one, shit WILL get lost. this isn't like images where that's ok, because I get manga from a few sources, and I get some with different names. I know there was one that had

watashi wa

and

watashiwa

now to find the manga you would need watashi and wa, otherwise you would get a lot of other manga cluttering everything, but 2 sources never agreed on what to name it. this would be a pain in the ass at best, but let's say 3-5 years down the road you forget this, and you see you are missing files, so you re-download things you already have because you don't know.

having it sorted by

group-artist - hentai and doujin

manga - manga

artist - oneshots

would be the best base way to sort manga into as few unnecessary subfolders as possible, letting tagging deal with the rest.

256 folders all hashed and non parseable by humans will not work with manga/doujins/oneshots.

splitting the program off into 2, while maintaining the same code base for both so that all that changes is a few checkboxes at the end or at install to go from image mode to manga mode, would facilitate better management of manga, as it won't be treated like images. If hydrus went and became a great reader without implementing any of this, all I think I would ever put in it is porn, as that's hitting a point where it's hard to manage, much like my image archives did, and losing the names, while painful, I could see as an acceptable loss. But I would need one hell of a push to even consider that. With user parseable imports, even if everything goes to hell, I can still go back to how I currently handle it.


9d65e9  No.12268

>>12243

>>12249

Be sure to check https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/issues/70

Personal opinion: It is better for Hydrus to have a rewrite in Qt before we should have complete comic support.


f75732  No.12272

File: 6713b4ef449a4c7⋯.png (1.78 KB, 255x23, 255:23, ComicRack_2019-04-16_03-07….png)

>>12268

honestly, if hydrus added support for zip/rar/archives in general, it's more than good enough to do it locally.

hydrus would need to extract and open zips/rars in ram and hold them, that's really the only potential issue, switching from winrar to 7zip recently for everything (it was used before but in an all else fails capacity) gave me a brand new appreciation for paid software and what it can do.

hydrus would also have to handle folders of images rather than images themselves, otherwise people will have to zip/rar everything before importing,

a user searchable in program directory would facilitate manga the best, and user accessible directories would facilitate general browsing rather then needing to know that a file exists. god knows, I only remember half at best of what I have.

hydrus would need to implement a way to sort images

1

2

3

4

5

6

7

8

9

10

instead of

1

10

11

12

13

14

15

16

17

18

19

2

20

or other various methods people have used to send comics.
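
To make the two orderings above concrete, here is a minimal sketch of a "human" sort key in plain Python. The `natural_key` name is made up for illustration and this is not hydrus code; it just splits names into text and number chunks so that 2 lands before 10.

```python
import re

def natural_key(name):
    """Split a name into alternating text and number chunks.

    re.split with a capturing group always returns text at even
    indices and digit runs at odd indices, so the chunk types line
    up when two keys are compared.
    """
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

pages = ['1', '10', '11', '2', '20', '3']
print(sorted(pages))                   # plain string sort: 1, 10, 11, 2, ...
print(sorted(pages, key=natural_key))  # human sort: 1, 2, 3, 10, ...
```

It does not try to handle other numbering schemes (roman numerals, 'cover', 'extra' pages and so on), which would need their own rules.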

possibly hydrus could parse a txt file inside the archives or put one in there with all the tags and shit, this would facilitate sharing archives

and for thumbnails, hydrus should honestly only save the cover image for the folder/archive and reparse on the fly the rest. I personally have a near 4 million file archive, that shit eats nearly 80gb of space for the thumbs, if manga did the same, I could likely add 40-50gb just through the porn, and then another 200gb through manga, not to mention that due to how black and white files work, and manga in general just giving thumbs of everything laying around… it would be a little pointless. personally if hydrus had a 'detailed view' for a manga mode, that's what I would use for nearly everything but cover pages.

it would be nice if the cover pages could come up for duplicate detection, or even potentially without thumbs having a duplicate image catalogue of shit that's in the archives, may be able to get rid of some things that way.

I said it before, hydrus is nearly there for replacing acdsee as a comic viewer, and comic rack already replaced it for porn for me, but that's more because I don't care about things being even relatively sorted. hydrus could very easily replace both.


9d65e9  No.12273

>>12272

human-oriented sorting is a problem.

Also > duplicate thumbs


9d65e9  No.12274

>>12272

https://github.com/SethMMorton/natsort there we have it, HyDev take note.


9d65e9  No.12275

>>12272

Also if you want a fast implementation https://github.com/sourcefrog/natsort


434912  No.12277

>>12247

Yes, that's basically something like these: https://github.com/andrewekhalel/sewar or even https://github.com/lidq92/CNNIQA or https://github.com/lidq92/CNNIQAplusplus … and so on.

Having these available in Hydrus' duplicate filter (or its revised version) should help a lot.

That said, you should only expect imperfect reliability in fully automatic mode. You won't ALWAYS get very accurate scoring or even just be able to identify the "better" image automatically. These also generally can misidentify variant images as "the same but worse", and other mistakes like that.

>>12152

Like the other anon implicitly did, I also propose the ability to run some more of these image quality / image similarity metrics as a possible feature. Would be nice to have more options to populate the filter and assisted automatic scoring from within the duplicate filter.

At the same time, it would probably be a good idea to make the duplicate filter more modal. E.g. "this is above the certainty threshold you set up, so here's how the weighted scoring of the algorithms you picked would solve this and the corresponding scores - confirm?". Probably also needs some thought put into a colored multi-selection GUI thing that makes it quick to see the scoring & resolution to manually deviate and fix mistakes.
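
A rough sketch of what that threshold-plus-weighted-scoring step could look like. Everything here is hypothetical: the function names, the metric names, and the assumption that every algorithm reports a score between 0 and 1. It is not tied to sewar or to anything in hydrus.

```python
def weighted_score(scores, weights):
    """Weighted mean of per-algorithm scores (each assumed to be in 0..1)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight

def propose(scores, weights, threshold):
    """Return the combined score and whether to offer a one-click confirm."""
    combined = weighted_score(scores, weights)
    action = 'ask to confirm' if combined >= threshold else 'manual review'
    return combined, action

# Hypothetical metrics, with the first one trusted three times as much.
scores = {'metric_a': 0.9, 'metric_b': 0.7}
weights = {'metric_a': 3, 'metric_b': 1}
print(propose(scores, weights, threshold=0.8))
```

Anything below the threshold would just fall back to the normal manual filter, so a bad metric costs time rather than files.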


4eabc4  No.12278

I don't know if this is a feature already or not but a sibling thing that could have multiple dependencies so that it could go like

>if [artist] + [ocname] then swap to artist:[artist] + character:[ocname](artist)

would be very convenient, especially for scraping tags from places like furaffinity where people don't use underscoring properly
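
Something like this is what I mean, as a bare sketch. The `apply_rule` name and the example tags are invented, and a real version would have to deal with rule ordering and rules that trigger each other, which this ignores completely.

```python
def apply_rule(tags, required, replacements):
    """If every tag in `required` is present, swap them for `replacements`."""
    tags = set(tags)
    if required <= tags:
        return (tags - required) | replacements
    return tags

scraped = {'somebody', 'foxgirl', 'solo'}
fixed = apply_rule(
    scraped,
    required={'somebody', 'foxgirl'},
    replacements={'artist:somebody', 'character:foxgirl (somebody)'},
)
print(sorted(fixed))
```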


ca1447  No.12281

On the subject of character name.

Would there be a way, far in the future, to link 2 tags together?

Instead of:

character:character_(series)

use:

[[character]]x[[series]]

Where x is a dynamic taglink action.

And while we're at it, instead of namespaces, multiple lists of tag types a tag can be presented as for that specific media.

Tear me up, it's brainstorming more than a definitive solution. I don't even know if it is possible.


cd98f7  No.12282

>>12277

Never trust Neural Network systems verbatim. Always find an expert system that can work well first.


f75732  No.12283

>>12282

>>12277

true, it wouldn't be perfect, but right now I have to parse between two images and when the quality is close, what could take 1-5 seconds takes 10-20 seconds.

if we had dup tiers, where the current one is the blunt and stupid 'shit looks close' check, that would be perfect. Sadly I don't have any examples on hand as I deleted them, but there was a scalie thread on /trash/ where some jackass came in and made the shittiest 'corrupt' images that were 7-8 times the good images' file size. In a thumbnail they looked passable most of the time, but full size, they were unrecognizable. If a dup filter was more accurate, these duplicates would get overlooked. What I would like is this:

current dup detector as a baseline

from this, a more stringent dup detector to check the baseline's work. this should filter out alternative images from normal dups

and from there a far more stringent one that would do something like jpeg to png comparisons, and if they are close enough to trigger here, in nearly all cases the jpeg was at some point converted to png, so in this case, scrapping the png would be done as it would save space without needing to go through with a fine-tooth comb by hand,

with that final level, imagine that the two images are just shown, right click is keep png, left click is keep jpeg, and due to how close they are you almost never click png. at least this is how i imagine it, dups going through 2 dup filters and then a third png to jpeg filter would turn the 10-20+ second checks into 1-5 second confirmations.


434912  No.12285

File: 558f64aaf2dc916⋯.png (762.49 KB, 1440x864, 5:3, NIMA.png)

>>12282

Of course I'd *also* want the usual expert systems from sewar and so on because they usually are faster, mostly very easy to implement, and even more suitable for some use cases.

But actually we probably need both.

The problem is that we don't really have an expert system that really can do the technical model analysis of something like this:

https://github.com/idealo/image-quality-assessment


5ab69a  No.12286

>>12285

I will add more information to the Optimization thread on the list of expert system vs NN repos


434912  No.12287

>>12286

Personally I preferred to KISS and just suggested a few python frameworks that might be easy to hack into Hydrus - but sure.


0d1edf  No.12288

- parser revisioning

- remote fetching

- version tracking


2ff564  No.12293

>>12152

I'm not sure how big this is but here is a suggestion: implement an icon and search command for files which have any notes attached to them.


610d19  No.12296

A probably small thing that would help me a lot would be a "delete both" button in the duplicate filter - I know that I can press del but I usually don't have a hand at the keyboard while filtering and then I also still need to press delete both anyway…


41b719  No.12298

>>12232

PeTR

Public

Emacs???

Tag

Repository


76182a  No.12299

- Cookie management from API

- Tag statistics from API so if you search for a_totally_sfw_tag, and it produces a lot of creator:cname , you know you should sub to cname


5ab69a  No.12300

>>12298

Public Integrative Tag Repo

Public Incorporative Tag Repo

Public Interdependent Tag Repo

Public Ingrained Tag Repo

(PITR)


5ab69a  No.12313

>>12166

You might wanna get https://github.com/deanmalmgren/textract (this is some good stuff for text-like documents)


ff909c  No.12314

>>12230

Thanks for clarifying. Unfortunately, this is not trivial to do for hydrus. Any paged system needs sorting, and hydrus supports many clever kinds of sort, so in order to implement this, I would need to load 'media' metadata for every file in a search result before I could fetch the first page of results. This would not save much time from the current system, where most of a search delay is in fetching that same media metadata. The proper solution, and I imagine how the boorus probably do it, is by having a sort cache (and they have page caches as well, and generally simpler searches to cache), to cross-reference search results against to figure out page slices. This is more complicated than I want to make hydrus search code at the moment. I am happy with being able to display and manage thousands of results at once in the main gui, and I also don't want to further complicate the viewer with paged management and load code. As you say, I encourage users to add 'system:limit=x' if they want less laggy searches.
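
To illustrate the sort cache idea in the abstract, here is a toy sketch. It is an assumption about how such a cache could work in general, not how hydrus or any booru actually does it, and the `page_slice` name and the integer file ids are invented. The point is that the expensive ordering is computed once per sort type, and each page is then just a filtered slice of it.

```python
from itertools import islice

def page_slice(sorted_ids, matching_ids, page, page_size):
    """Return one page of search results in cached sort order.

    sorted_ids   - every file id, pre-sorted once for a given sort type
    matching_ids - set of file ids that matched the current search
    """
    in_order = (file_id for file_id in sorted_ids if file_id in matching_ids)
    start = page * page_size
    return list(islice(in_order, start, start + page_size))

cache = [5, 3, 9, 1, 7]   # hypothetical cached ordering, e.g. by import time
hits = {1, 3, 7}          # ids returned by the tag search
print(page_slice(cache, hits, page=0, page_size=2))
print(page_slice(cache, hits, page=1, page_size=2))
```

The catch is that every supported sort needs its own cache kept up to date as files change, which is where the extra complexity comes from.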

I would be interested in your further thoughts if you have certain scenarios where search is very slow. If there are particular instances where the client runs very slow for you, I'd love to help it run faster.


5ab69a  No.12315

>>12213

IPFS repos would be important, also an advanced API that can trade IPFS hashes and images would be sweet


ff909c  No.12316

>>12234

>>12233

>>12235

>>12243

>>12245

>>12248

Yeah, I am mixed on cbz. I like the idea in the sense of waving a magic wand and having great support, but I can't do that and I know I fall to feature creep too easily. If this is voted on, I would try to make very simple support and see how that goes, and then iterate on it in future if it proves popular. I can't out-compete the programs already out there, but I can do some simple stuff, and ancillary code like navigating multi-page single-file media will have uses for things like file alternates.

I really want all future big jobs to be small improvements and experiments, ideally 6-8 weeks and preferably no more than 12, so I don't get bogged down like the downloader engine overhaul. I am open to experiments that fail and don't want to get emotionally attached or fall into the sunk cost fallacy.


ff909c  No.12317

>>12236

Thanks. Yeah, I would like easier sibling workflow, including from the right-click menus, as part of a tag sibling/parent improvement. I would push in this direction with "Improve tag siblings/parents and tag 'censorship'".


ff909c  No.12318

>>12240

Yeah, your dialog mock-up is exactly the sort of thing I was thinking of. I'll have a new options panel somewhere that turns on the advanced mode of this dialog and let you set up some favourite reasons and custom entry. Now that I have the 'set a reason' infrastructure in place, this will not be super difficult to add. I expect to have it in in the next few weeks.


ff909c  No.12319

>>12247

>>12277

Yeah, my first push here will be to set up a system that permits auto-decisions in a sensible and generic way, along with user ability to control what is permitted, and then in future hang new auto-decision systems on it. The 'this is a png copy of a jpg' seems like a nice simple way to start, and I know I can do very quick detection of that by just hashing image pixels. Then maybe explore some 'this jpg is definitely lower quality than this one of same resolution' stuff. The way to slice through dupe mountain will be through automatic systems to reduce the human drudgework, but I am similarly leery >>12282 >>12283 of anything too clever/vapourware to start with. Most of all I want to get the infrastructure and maintenance processing code in, and then we can test all kinds of different comparison systems for our exact purposes.
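
As a sketch of the 'hash the image pixels' check: if two files decode to exactly the same pixels, a hash over the decoded data will match even though the file bytes differ. This example only uses hashlib and takes already-decoded pixel data as input; the `pixel_hash` name and its parameters are made up for illustration.

```python
import hashlib

def pixel_hash(width, height, mode, pixel_bytes):
    """Hash decoded pixel data rather than the file bytes.

    Dimensions and colour mode are mixed in so that two different
    images with the same raw byte run do not collide.
    """
    h = hashlib.sha256()
    h.update(f'{width}x{height}:{mode}:'.encode('ascii'))
    h.update(pixel_bytes)
    return h.hexdigest()

# Two hypothetical decodes of a 2x1 RGB image: same pixels, different container.
from_jpg = pixel_hash(2, 1, 'RGB', bytes([255, 0, 0, 0, 0, 255]))
from_png = pixel_hash(2, 1, 'RGB', bytes([255, 0, 0, 0, 0, 255]))
print(from_jpg == from_png)
```

With an imaging library such as Pillow, the inputs could plausibly come from converting both images to the same mode and reading `size` and `tobytes()`. An exact match only catches lossless copies; a png that was resaved through another lossy step would need a tolerance-based comparison instead.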


ff909c  No.12320

>>12272

>>12273

>>12274

>>12275

Thanks. I have 'human' number sort capability in hydrus already, although I am sure there are still places to apply it. All numbered tags should sort like this atm. I am pretty confident I can get directory listings and file access of rars and zips with the python libraries I already have. Any first version of a cbz viewer would be simple and just read through the internal pages one by one, no bookmarks or per-page metadata or anything. Just something that lets you penetrate the 'list of numbered jpg' zips already in your db in the media viewer (and rename to .cbz or whatever, so you can 'open externally' to your preferred comic reader).
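
For the zip side of this, a minimal sketch using only Python's standard `zipfile` module (rar would need a third-party library, so it is left out). The `list_pages` and `read_page` names and the extension list are assumptions for illustration, not the planned hydrus code.

```python
import io
import re
import zipfile

IMAGE_EXTS = ('.jpg', '.jpeg', '.png', '.gif')

def natural_key(name):
    # Text chunks at even indices, digit runs at odd indices.
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

def list_pages(archive):
    """Return the image members of a zip, in human page order."""
    with zipfile.ZipFile(archive) as zf:
        names = [n for n in zf.namelist() if n.lower().endswith(IMAGE_EXTS)]
    return sorted(names, key=natural_key)

def read_page(archive, name):
    """Read one page's bytes without extracting the archive to disk."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name)

# Build a tiny in-memory archive to show the ordering.
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    for name in ('10.jpg', '2.jpg', '1.jpg', 'notes.txt'):
        zf.writestr(name, b'fake data for ' + name.encode('ascii'))

print(list_pages(buf))
```

`archive` can be a path or a file-like object, so the same calls would work on a file already sitting in the client's storage.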


ff909c  No.12321

>>12277

For new image recognition techniques, yeah, I designed the search system to make this possible. Much like >>12319 , the main push of duplicate detection 1.0 was to build a search system that could handle many search systems and hang one simple 'looks like' system on it. I can fairly easily add new techniques to support rotation or colour similarity or whatever on it now. This would not be my urge at the moment, as the biggest problem with the simple system we already have is that there are way too many pairs to go through, so the processing workflow is now the weakest link, but once we have that more under control I can work on this.


ff909c  No.12322

>>12278

Yeah, that's a tricky one. I know exactly what you are talking about, and I would love to have a nice system for it, but the actual guts of how 'if … then' tag relations would work are way more complicated than I am confident I can currently support. For siblings and parents, my first priority is to improve the data store behind the whole system first. Once that isn't on fire behind the scenes, I'll consider carefully adding this sort of power. I am sure it could go very wrong if not thought about, so it'll be baby steps until we have some real world experience.


ff909c  No.12323

>>12281

Both of these thoughts are on my mind. Adding tag siblings revealed to me a big set of pain in the ass problems related to tag definitions I had not considered before.

For both of these, I think the ultimate far future way to solve them is to have a tag definition structure "Add tag metadata (private sort order, presentation options, tag description/wiki support)", where clever metadata can be applied to tags.

So you could say:

character:shimakaze (kantai collection)

And the tag definition, which would essentially be a cleverer iteration of the current siblings and parent system, would say "this has 'series:kantai collection' parent" and also perhaps "this can be displayed to the user as 'character:shimakaze'" without destroying the unique tag identifier through merging with some 'character:shimakaze (my oc series, donut steel)', basically being aware of the (kantai collection) after the main tag. I am very much on the side of letting users display tags how they want, as there are many different spergy desires here, and having a system that recognises info about tags lets us do mass management rather than the current per-tag mess and endless firefight.

Same for namespaces. I am experimenting with 'clothing:' namespace on the PTR, but I know some users hate that. It would be ideally better if the tag 'bikini' had the 'property' of "clothing" rather than an explicit namespace to argue over, and then a user could say 'when a tag has "clothing" property, display it as namespace'.
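
A very rough sketch of the shape this could take, purely to illustrate the idea. The `TagDefinition` class, its fields, and the `render` function are all hypothetical; nothing like this exists in the client yet.

```python
from dataclasses import dataclass, field

@dataclass
class TagDefinition:
    tag: str                                        # unique identifier, never rewritten
    display: str = ''                               # optional friendlier display form
    parents: list = field(default_factory=list)     # e.g. the series a character belongs to
    properties: set = field(default_factory=set)    # e.g. {'clothing'}

def render(definition, show_property_as_namespace=()):
    """Choose how to show a tag to the user without touching its identifier."""
    text = definition.display or definition.tag
    for prop in sorted(definition.properties):
        if prop in show_property_as_namespace and ':' not in text:
            return f'{prop}:{text}'
    return text

shimakaze = TagDefinition(
    tag='character:shimakaze (kantai collection)',
    display='character:shimakaze',
    parents=['series:kantai collection'],
)
bikini = TagDefinition(tag='bikini', properties={'clothing'})

print(render(shimakaze))
print(render(bikini))
print(render(bikini, show_property_as_namespace={'clothing'}))
```

The stored tag stays the same in all three cases; only the per-user display rule changes.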

As it is, I expect my next step here will be more in line with little patches. Namespace sibling control (like saying 'display all creator: tags as artist: please' or 'display all clothing: as unnamespaced') seems an easy-ish next step.

"Tags were a mistake." - t. hydrus_dev


ff909c  No.12324

>>12288

Thank you, I am adding 'Add version tracking to downloader system objects and explore remote fetching of updates' to the list.


ff909c  No.12325

>>12296

Thanks, this is actually a small thing. I assume you want no duplicate action applied, just a basic 'get rid of these two shits' and move on to the next decision? I'll see if I can add a button to quickly do this for 349 or 350.


ff909c  No.12326

>>12299

Thank you, I have api cookie management in my current to-do. I won't work on client api as a big job in this cycle just so it has some time to breathe.

Can you explain the 'tag statistics' idea a bit more? Could this be something to apply to the program more generally, rather than just the API, like something to click that says "show me what artists I like and do not sub to?"

Subs need a db-level data overhaul before I can do clever inspection about them btw. But this is something else to integrate into the client api as well–managing subs.


ff909c  No.12327

>>12293

Thank you. There are multiple jobs for improving notes that I have not been able to get to. I am adding "Expand file notes system" to the list to cover a general push in this direction. This would include multiple notes and likely connecting the notes system to the downloader as well.



