Aaaand GitHub turned the dashboard view I already don't use into an algorithmic social feed because... why?

Like, I don't need it to recommend me projects my friends are working on, or at the very least it should be rare. I'm pretty sure there's a lot of shit I work on that nobody else cares about (e.g. just because you follow my repo doesn't mean you want to know about )

#github #Ruffle #PDDiffusion

Last updated 1 year ago

Part of the reason why I keep banging my head against the wall on PDDiffusion is that PD (by age) training data neatly solves all the ethical problems AI art generators have. It'd be fairly difficult to ACCIDENTALLY rip off a living artist with such a thing.

With a PD-trained music generator this isn't so certain.

#PDDiffusion

Last updated 1 year ago

If I jump into the actual set of images it read from, a LOT of them are maps. Which explains why the from-scratch-CLIP-trained model likes to draw maps, but not why the OpenAI-CLIP-trained one generates pixel nonsense.

Actually, no, it doesn't explain it, because these are all clearly labeled as maps, so CLIP should be able to distinguish between the maps, portraits, and landscapes in the set.
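The kind of sanity check I have in mind is plain zero-shot classification; something like this sketch (the checkpoint path and image filename are made up):

```python
# Quick zero-shot check: does the trained CLIP separate maps, portraits, and
# landscapes at all? Checkpoint path and image filename are placeholders.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("./pddiffusion-clip")
processor = CLIPProcessor.from_pretrained("./pddiffusion-clip")

labels = ["a map", "a portrait", "a landscape painting"]
image = Image.open("some_training_image.png")

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]

for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.3f}")
```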

#maps #PDDiffusion #clip

Last updated 2 years ago

Hey, remember how PDDiffusion was spitting out nothing but maps?

Well, I retrained on OpenAI CLIP, and now it's spitting out nothing but nonsense. The attached image is supposed to be a "landscape painting of a forest".

#PDDiffusion

Last updated 2 years ago

The 90k run finished today, and the results are...

Uh... it literally forgot how to draw anything that isn't a map. The prompt for this was "forest landscape painting". All the training data from the 29k version is still there.

I'm retraining with OpenAI's CLIP instead of my own from-scratch model to try and narrow down the cause of this model forgetting literally everything.
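For anyone wondering what "swap in OpenAI's CLIP" means in practice: roughly this, assuming the standard Hugging Face checkpoint (an illustrative sketch, not the actual training code):

```python
import torch
from transformers import CLIPTextModel, CLIPTokenizer

# The pretrained checkpoint Stable Diffusion v1 conditions on.
clip_name = "openai/clip-vit-large-patch14"
tokenizer = CLIPTokenizer.from_pretrained(clip_name)
text_encoder = CLIPTextModel.from_pretrained(clip_name)
text_encoder.requires_grad_(False)  # frozen; only the U-Net gets trained

def encode_prompt(prompt: str) -> torch.Tensor:
    """Turn a label into the conditioning tensor the U-Net sees."""
    tokens = tokenizer(prompt, padding="max_length", truncation=True,
                       max_length=tokenizer.model_max_length, return_tensors="pt")
    with torch.no_grad():
        return text_encoder(tokens.input_ids)[0]  # (1, 77, 768) hidden states
```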

#PDDiffusion #training #draw #openai #clip #ai #aiart

Last updated 2 years ago

Oh my god, I just heard about the whole Automatic1111 thing. tl;dr the biggest Stable Diffusion frontend is written by one guy who writes racist Rimworld mods. He recently got banned from GitHub... not because of the game mods, but because he linked to how-tos for using his frontend to generate underage AI anime porn. 🤢🤮

This is basically fitting into every stereotype of the AI art community that I have. I am getting less enthused about PDDiffusion every time I read about this shit.

#PDDiffusion

Last updated 2 years ago

I'm back from dinner and training is now 25% complete. 4h30m estimated left.

That's... really weird that it sped up that much, but OK I guess.

#PDDiffusion #clip

Last updated 2 years ago

So, the scraper in PDDiffusion choked on another weird date format (negative years) around the 90k image mark. I decided, screw it, let's just do another training run. It's not quite "10x the model" but hey, it should at least provide a measurable improvement.

CLIP is training now; it should take about 18 hours. I expect the U-Net to take a week.

I also found out that my wikitext label extractor was busted and not actually extracting label data. So fixing that should also help.
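The negative-year fix is basically just tolerating a leading minus sign when pulling the year out; a hypothetical helper, not the scraper's actual code:

```python
import re

def parse_year(raw: str) -> int | None:
    """Best-effort year extraction that tolerates negative (BCE) years."""
    match = re.search(r"-?\d{1,4}", raw.strip())
    if match is None:
        return None  # unparseable date: skip the image instead of crashing
    return int(match.group())

assert parse_year("-500") == -500
assert parse_year("circa 1800") == 1800
```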

#wikimediacommons #PDDiffusion #clip

Last updated 2 years ago

Well, this is great. CLIP training is hella broken in PDDiffusion.

#PDDiffusion

Last updated 2 years ago

Y'know, if PDDiffusion doesn't work out, I came up with an alternate option. It might be a little too late, but I figured it'd at least be funny:

We chuck the AI art grifters against the NFT grifters.

Right-click, save-as is inefficient. How about we instead scrape the entirety of OpenSea and chuck it into a U-Net? Such a network would not only generate every NFT ever (thus devaluing them), but would also generate every NFT that could ever be (because they're really fucking samey).

#PDDiffusion #aiart

Last updated 2 years ago

I decided to screw the VAE training for now and just start scraping images again

I have to babysit the scraper because the wikitext parsing still hits corner cases and crashes because, say, this CHEEKY FUCKER decided he was going to be painted in 176X:

commons.wikimedia.org/wiki/Fil

I thought the X years were only invented in 200X
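If I end up special-casing it, it'd be something like this (hypothetical helper; the real parser's date handling lives elsewhere):

```python
import re

def expand_decade_placeholder(raw: str) -> tuple[int, int] | None:
    """Map a Commons-style decade placeholder like '176X' to (1760, 1769)."""
    match = re.fullmatch(r"(\d{3})X", raw.strip(), flags=re.IGNORECASE)
    if match is None:
        return None
    start = int(match.group(1)) * 10
    return start, start + 9

assert expand_decade_placeholder("176X") == (1760, 1769)
```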

#PDDiffusion #aiart

Last updated 2 years ago

So, with slicing enabled, I CAN train the same VAE architecture that Stable Diffusion uses (128, 256, 512, 512)... with a batch size of one and 20 minutes per iteration.

Note, that's not per epoch. That's 20 minutes PER IMAGE.

...aaaand it just threw the CUDA out-of-memory error anyway. Blargh.
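For context, this is roughly the setup in question, written with diffusers (just the configuration and the memory knobs; the actual training loop is elsewhere):

```python
from diffusers import AutoencoderKL

# Same channel layout as Stable Diffusion's VAE.
vae = AutoencoderKL(
    block_out_channels=(128, 256, 512, 512),
    down_block_types=("DownEncoderBlock2D",) * 4,
    up_block_types=("UpDecoderBlock2D",) * 4,
    latent_channels=4,
)
vae.enable_slicing()                 # process images one at a time within a batch
vae.enable_gradient_checkpointing()  # trade compute for memory during training
```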

#PDDiffusion

Last updated 2 years ago

Oh look, another poorly documented "click here to go fast" option.

#PDDiffusion

Last updated 2 years ago

PDDiffusion update:

- Data augmentation went as well as I could have hoped. Trained models are now a lot more likely to spit out something vaguely related to your prompt.

- I got rid of my flat-file database hackery and actually set up SQL to store image metadata and labels. Right now I'm using SQLite, but I can migrate this over to MySQL or Postgres just by changing a connection string (rough sketch after this list).

- I started work on VAE training. It's not going well.
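The SQL bit, roughly, in SQLAlchemy terms (the table here is simplified and the names are illustrative):

```python
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class ImageLabel(Base):
    __tablename__ = "image_labels"
    id = Column(Integer, primary_key=True)
    filename = Column(String(512), nullable=False)
    label = Column(Text, nullable=False)

# SQLite today; moving to MySQL or Postgres is just a different URL, e.g.
# "postgresql+psycopg2://user:pass@host/pddiffusion".
engine = create_engine("sqlite:///pddiffusion.db")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
```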

#PDDiffusion

Last updated 2 years ago

So, I've been working on data augmentation strategies for PDDiffusion. As part of that, I learned the main reason why it seemed so damned unresponsive to text prompts:

I was only training on the image's CLIP vector, not the CLIP vector for its associated label text.

If CLIP training just so happened to be well fit on that particular image/label pair, great. If it wasn't, then text prompts matching those images were effectively never being trained on.
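To make the distinction concrete, here's the difference sketched with the Hugging Face CLIP API (not the actual repo code):

```python
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

def image_embedding(image):
    """What the U-Net was actually being conditioned on during training."""
    inputs = processor(images=image, return_tensors="pt")
    return model.get_image_features(**inputs)

def text_embedding(label: str):
    """What a text prompt produces at inference time."""
    inputs = processor(text=[label], return_tensors="pt", padding=True)
    return model.get_text_features(**inputs)

# Unless CLIP already pulls these two vectors close together for a given
# image/label pair, prompts matching that label never hit the conditioning
# the U-Net saw during training.
```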

#PDDiffusion

Last updated 2 years ago

BRUUUUH, the training set absolutely *does* have a guinea pig in it:

commons.wikimedia.org/wiki/Fil

It should at least be regurgitating this image when I run it with the prompt "a guinea pig"

#PDDiffusion #guineapigs

Last updated 2 years ago

Ok, so I tried training against OpenAI's CLIP... again. The first time, I ran up against a weird bug; the second time, I actually fixed it.

My first thought was that the small data set meant a dumber CLIP. But bringing in a smarter CLIP does not actually make the model smarter. (So is the infringement stored in the U-Net?)

#PDDiffusion

Last updated 2 years ago

"painting circa 1800 portrait of a paintress oil on canvas"

So, CLIP isn't broken after all. PDDiffusion's label set is so narrow and so full of specific phrases that prompt engineering is hilariously critical to getting anything useful out of it - even with the improved wikitext parser. Descriptions aren't good enough.

Definitely going to have to build a manual labeling tool at some point, because there are entire styles of things in the dataset that you just can't recall right now.

#PDDiffusion

Last updated 2 years ago

Ok, so there's also apparently a bunch of art tagged with commons.wikimedia.org/wiki/Tem which makes the labels CC-BY-NC-SA.

Going to have to completely filter those out of the training set: even though label copyright does not leak through U-Nets, it does cover the model weights themselves, and that would make it illegal to use the pipeline in a commercial manner (which I totally want to be able to do).
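The filter itself should be simple; something like this, with a stand-in marker string since the actual template name isn't spelled out here:

```python
# Drop anything whose wikitext carries an NC-licensed description template.
# The marker string is a stand-in for however that template actually renders.
NC_MARKERS = ("cc-by-nc-sa",)

def is_commercially_safe(wikitext: str) -> bool:
    """Return False if the page's label text looks like it's under an NC license."""
    lowered = wikitext.lower()
    return not any(marker in lowered for marker in NC_MARKERS)

# e.g. rows = [r for r in rows if is_commercially_safe(r.wikitext)]
```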

#PDDiffusion

Last updated 2 years ago

So, I've made a bunch of improvements to the wikitext parser in PDDiffusion, but in the process I found that basically all the dates on a particular artist's paintings have stray parentheses in them. :/ Time to warm up my decades-old Wikipedia account...

#PDDiffusion

Last updated 2 years ago