
/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.


New user? Start here ---> http://hydrusnetwork.github.io/hydrus/

Currently prioritising: simple IPFS plugin



 No.2230

windows

zip: https://github.com/hydrusnetwork/hydrus/releases/download/v196/Hydrus.Network.196.-.Windows.-.Extract.only.zip

exe: https://github.com/hydrusnetwork/hydrus/releases/download/v196/Hydrus.Network.196.-.Windows.-.Installer.exe

os x

app: https://github.com/hydrusnetwork/hydrus/releases/download/v196/Hydrus.Network.196.-.OS.X.-.App.dmg

tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v196/Hydrus.Network.196.-.OS.X.-.Extract.only.tar.gz

linux

tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v196/Hydrus.Network.196.-.Linux.-.Executable.tar.gz

source

tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v196.tar.gz

I had a good week. I fixed and improved some things, and I made some important changes to the autocomplete tag code.

The past couple of weeks have brought a lot of new mappings to my public tag repository, and I think we hit 50 million yesterday! That represents about 700k different tags applied to 3 million files! I'm really pleased, and I appreciate the contributions everyone has made. While this expansion is revealing some terrible lag in places, I still think it is neat that we have collectively gathered so much stuff.

autocomplete rewrite

I made good progress this week on the new autocomplete cache layer I have been planning. I have a good plan of where I am headed, a semi-fleshed skeleton of the cache database written, and a fair amount of optimism about bringing it all together. I am generally confident the new cache will deliver sub-second autocomplete results for pretty much all queries when it is done. I still need to think about some things, and there is a lot of work still to do, but I've made some decisions on what I can and cannot easily attempt to calculate in quick time. Disabling 'all known files'/'all known tags' last week was part of that.

I also gave the existing code a polish and converted it to this new calculation philosophy. Some sibling stuff is fixed, and some bits of inefficient code are cleaned up, but the important thing is that 'all known tags' autocomplete results (e.g. the sort you get when you open a new local search page and start typing) are calculated in a faster but sometimes less accurate way. If you only have 'local tags' and my PTR, then you will likely see no change, but if you have multiple services that contain the same tags for the same files, you will see counts that are the sum of those services' tag counts rather than the count of their union, so a file tagged on two services is counted twice and the number will sometimes be a bit higher than is accurate.

I am not sure if I will be able to bring back 100% accurate counts for absolutely all cases, but I will revisit it after the first version of the cache layer is done. I might end up including it in a second layer that collapses sibling counts and applies local censorship rules as well, but we will see. For now, I plan to add '<=' signs to uncertain counts and see how big a deal it turns out to be.
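To illustrate the difference between a summed count and an exact one, here is a minimal sketch. The service names, file ids, and function names are made up for the example and are not the client's actual code or schema.

```python
# Hypothetical data: which files carry one particular tag on each tag service.
files_by_service = {
    "local tags": {"file_a", "file_b"},
    "public tag repository": {"file_b", "file_c", "file_d"},
}


def summed_count(files_by_service):
    """Fast estimate: add up each service's own count."""
    return sum(len(files) for files in files_by_service.values())


def exact_count(files_by_service):
    """Accurate count: number of distinct files across all services."""
    return len(set().union(*files_by_service.values()))


def pretty_count(files_by_service):
    """Prefix '<=' when more than one service contributes, since the sum may overcount."""
    contributing = [files for files in files_by_service.values() if files]
    prefix = "<=" if len(contributing) > 1 else ""
    return prefix + str(summed_count(files_by_service))
```

Here file_b is tagged on both services, so the summed count is 5 while the exact count is 4, and the displayed string would be "<=5". With a single contributing service the two counts agree and no '<=' is needed.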

Anyway, the upshot of changing this calculation is that all autocomplete results seem to be returning a bit quicker than usual already. That still often means ten seconds, so the new cache is still needed, but at least we are headed in the right direction. Regular tag processing, which interacts with the existing cache, also seems to be running up to 40% faster, and some other big jobs like resetting services no longer have to do a very laggy autocomplete purge. Let me know how you get on!

A lot of orphaned cache entries need to be deleted, so the update may take a few minutes.

other stuff

The 8chan thread watcher should now work for all boards. Let me know if it doesn't work for something unusual.

I also fixed the tumblr downloader, which recently broke due to a subtle API change.

The IPFS downloader now detects directory multihashes and gives a more pleasant error. I will next add directory downloading and then directory pinning.
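As a rough sketch of that kind of check: the function below works on an already-parsed links listing and fetches nothing itself. The dict shape (a 'Links' list whose entries have a 'Name') is an assumption modelled on the JSON an IPFS object links query returns, and the function names, exception, and messages are hypothetical rather than the client's real code.

```python
class NotASimpleFileError(Exception):
    """Raised when a multihash does not point at a single, plain file."""


def describe_ipfs_object(links_response):
    """Classify an object from its parsed links listing (assumed shape, see above)."""
    links = links_response.get("Links") or []
    if not links:
        return "single file"
    # Assumption: directory entries carry names, while other linked objects do not.
    if any(link.get("Name") for link in links):
        return "directory"
    return "other multi-part object"


def check_downloadable(multihash, links_response):
    """Raise a readable error instead of failing later on a confusing download."""
    kind = describe_ipfs_object(links_response)
    if kind != "single file":
        raise NotASimpleFileError(
            multihash + " looks like a " + kind + ", which cannot be downloaded yet."
        )
```

Checking before downloading means the user sees a clear message about directories rather than a parsing failure on whatever bytes come back.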

The recent 'load tags from neighbouring .txt files' import option is now available for import folders. If you are interested in this, check it out in the edit import folders dialog and let me know if the new option makes sense. The .txt files should be deleted or moved alongside the media files they refer to, depending on how you have those actions set up.
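For anyone preparing files for this, a minimal sketch of the idea follows. It assumes the tag file is named by appending '.txt' to the full media filename and holds one tag per line in UTF-8; both are assumptions for the example, and the function name is hypothetical.

```python
import os
import tempfile


def load_neighbouring_tags(media_path):
    """Return the tags listed in the .txt file next to media_path, one per line.

    Assumption: the tag file is the media filename with '.txt' appended.
    """
    txt_path = media_path + ".txt"
    if not os.path.exists(txt_path):
        return []
    with open(txt_path, encoding="utf-8") as f:
        # Strip surrounding whitespace and skip blank lines.
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        media_path = os.path.join(folder, "picture.jpg")
        open(media_path, "wb").close()
        with open(media_path + ".txt", "w", encoding="utf-8") as f:
            f.write("creator:someone\nblue sky\n")
        print(load_neighbouring_tags(media_path))
```

Reading the file as UTF-8 up front means tags with accents or Japanese text arrive as proper text rather than raw bytes.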

full list

- fixed the 8chan thread watcher for boards that host content on media.8ch.net

- improved the thread watcher url check logic so it won't lag with the new fix

- cleaned up the ac generation code a little

- 'all known tags' ac counts are now summed from all the known tag services rather than calculated directly (a <= indicator for when these cases overlap will be forthcoming). this speeds up file add/delete, service reset, a/c fetch time, and general tag processing, and reduces the size of the db

- ac generation code now deals with 'is the entry text an exact match or not?' better

- ac generation code will now no longer produce non-exact-match siblings on an exact match search

- ac generation code will no longer save half complete search text into the db as new tags

- on update, the a/c cache and its helper table 'existing tags' will be cleaned of a lot of orphans, which may take a few minutes

- fixed some bad unicode path parsing when importing files in some OSes, I think!

- fixed some bad read autocomplete sibling substitution

- fixed a bug where autocomplete predicate lists would not update if the new list was merely a reorder (which can happen in some unusual sibling cases)

- fixed the tumblr parser for the subtly new API

- import folders now support loading tags from neighbouring .txt files. check the dialog to set up which tag services you would like to import to

- the ipfs file downloader now queries DAG object links, determines if the given multihash is a directory or other complicated object, and if so politely dumps out (handling of directory downloads is forthcoming)

- some db code is cleaned up

- prepared db code for some future subclasses

- wrote most of the new ac cache db

- misc cleanup

- added some browser addon links to the ipfs help

next week

I will continue this cache layer stuff, and I'll also see about IPFS directory parsing and a bit of gui to pick the files you want to download.

My overall near-future 'big stuff' plan is to go something like IPFS directories->cache layer->extract service data from client.db->suggested tags control->faster dupe searching->make a new poll on what to work on next.

 No.2236

File: 1457567494568.png (27.32 KB, 522x310, 261:155, 16-03-10_10-49-50-hydrus_c….png)

>Exit client for update

>yoooo want to do the usual service updates

>suuuuure

Glad this is on my SSD


 No.2239

I'm new to Hydrus, just testing it out for now. I downloaded the PTR sync database and about 10 different tag databases. Then imported about 1000 images. After importing the images Hydrus starts processing them and this is taking a very long time, up to 30 seconds per file. Is this normal?


 No.2242

I got many copies of these errors when I tried to load my inbox.

UnicodeDecodeError

'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

File "site-packages\wx-3.0-msw\wx\_core.py", line 16766, in <lambda>

File "include\HydrusController.py", line 227, in ProcessPubSub

try: self._pubsub.Process()

File "include\HydrusPubSub.py", line 127, in Process

callable( *args, **kwargs )

File "include\ClientGUICommon.py", line 4096, in SetTagsByMediaPubsub

self.SetTagsByMedia( media, force_reload = force_reload )

File "include\ClientGUICommon.py", line 4007, in SetTagsByMedia

self._RecalcStrings( tags_changed )

File "include\ClientGUICommon.py", line 3821, in _RecalcStrings

tag_string = self._GetTagString( tag )

File "include\ClientGUICommon.py", line 3763, in _GetTagString

if self._show_pending and tag in self._pending_tags_to_count: tag_string += ' (+' + HydrusData.ConvertIntToPrettyString( self._pending_tags_to_count[ tag ] ) + ')'

UnicodeDecodeError

'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

File "include\ClientGUIMedia.py", line 1948, in EventPaint

self._DrawCanvasPage( page_index, bmp )

File "include\ClientGUIMedia.py", line 1254, in _DrawCanvasPage

dc.DrawBitmap( thumbnail.GetBmp(), x, y )

File "include\ClientGUIMedia.py", line 2981, in GetBmp

upper_info_string += ', '.join( series )

I tried opening a new tab to see if maybe there were some particular images/tags causing the issue, but this new tab is loading everything fine. It's only when I switch back to the original inbox tab that the errors appear.

The only difference I can think of between the two is that I was trying to import a folder with a bunch of .txt files while loading the first tab, but I don't think that's related.


 No.2243

>>2242

Wait, I've been messing around some more. The errors are actually popping up whenever I archive an image or send it to the inbox. It has nothing to do with the image itself or its tags (it happens with an image with no tags at all, too).


 No.2251

File: 1457808074078.jpg (110.96 KB, 1000x671, 1000:671, 62912fe9e49f6d6379956041f8….jpg)

>>2239

30 seconds sounds quite high. Even a slow computer with a lot of tags might take perhaps 2 seconds to import a typical file. I expect the many recent tags you added have fragged up your database file and/or your hard drive, so I suggest you go database->maintenance->vacuum, which will clean up your client.db (it may take ten minutes to complete), and then shut the client down and run a hard drive defrag. Once that is done, restart the client, and see if things are running faster.
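For the curious, here is a small sketch of why a vacuum helps. It assumes the menu item corresponds to SQLite's VACUUM command, and it runs on a throwaway database in a temporary folder, not on client.db; the table and function names are made up. Please keep using the menu entry for the real thing.

```python
import os
import sqlite3
import tempfile


def vacuum_demo():
    """Fill a throwaway SQLite file, delete the rows, then VACUUM it.

    Returns (size_before, size_after) in bytes.
    """
    with tempfile.TemporaryDirectory() as folder:
        db_path = os.path.join(folder, "throwaway.db")
        # isolation_level=None is autocommit, so VACUUM does not run inside a transaction.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            # Make sure freed pages stay in the file until an explicit VACUUM.
            conn.execute("PRAGMA auto_vacuum = NONE")
            conn.execute("CREATE TABLE mappings (tag TEXT)")
            conn.execute("BEGIN")
            conn.executemany(
                "INSERT INTO mappings VALUES (?)",
                (("x" * 1000,) for _ in range(2000)),
            )
            conn.execute("COMMIT")
            # Deleting rows frees pages inside the file but does not shrink it.
            conn.execute("DELETE FROM mappings")
            size_before = os.path.getsize(db_path)
            # VACUUM rebuilds the file compactly and drops the free pages.
            conn.execute("VACUUM")
            size_after = os.path.getsize(db_path)
        finally:
            conn.close()
    return size_before, size_after
```

After a lot of churn the file holds free and scattered pages, and rebuilding it packs the live data back together, which is why the database is smaller and usually quicker to read afterwards.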

If things are still running slowly, please check out:

http://hydrusnetwork.github.io/hydrus/help/reducing_lag.html

>>2242

>>2243

Thank you for this report. I believe a tag got imported and was not properly decoded to unicode at the correct point in the import process. The bad tag hung around in gui memory and hence failed to render to screen. The inbox/archive events were triggering a refresh of the tag gui controls, repeating the error.
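To show the kind of failure involved, here is a minimal sketch. It is written for Python 3, where the decode has to be explicit, and the tag 'café' is a made-up example chosen because its UTF-8 bytes put 0xc3 at position 3, matching the error text above. The function names are hypothetical.

```python
# Hypothetical tag as raw bytes, as it might arrive from a .txt file: b'caf\xc3\xa9'
raw_tag = "café".encode("utf-8")


def decode_tag(raw):
    """Decode tag bytes to text once, at the import boundary."""
    return raw.decode("utf-8")


def ascii_decode_error(raw):
    """Return the error raised by treating raw as ASCII, or None if it decodes fine."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as e:
        return e
    return None
```

Calling ascii_decode_error(raw_tag) returns a UnicodeDecodeError for byte 0xc3 at position 3, while decode_tag(raw_tag) gives back the text 'café'. Decoding once, as soon as the bytes are read, keeps undecoded tags from reaching the gui code that builds display strings.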

I think this error is temporary. If you have since restarted your client, it should have gone already. I had a look and think I might have fixed it, but let's check:

Were the .txt files you parsed through a manual import that you created, or through the new import folder code?

Did any of the tags in your .txt files have unusual characters, like rare accents, punctuation, or japanese text?

If you load up a fresh page with something like system:age<7 days to show all those files you imported with tags, do you get any display errors at all? Are there any tags with unusual characters, and if so, do they render ok?


 No.2253

File: 1457811812775.jpg (316.53 KB, 843x675, 281:225, 1435928990501.jpg)

Thank you hydrus.

After all the IPFS stuff is integrated I'll have no choice but to migrate all my images to hydrus. The only thing holding me back is laziness.



