Furkan Gözükara

MonsterMMORPG

AI & ML interests

Check out my YouTube channel SECourses for Stable Diffusion tutorials. They will help you tremendously on every topic

MonsterMMORPG's activity

posted an update about 14 hours ago
How to Extract LoRA from FLUX Fine Tuning / DreamBooth Training Full Tutorial and Comparison Between Fine Tuning vs Extraction vs LoRA Training

The full article is here as a public post : https://www.patreon.com/posts/112335162

This post is short, so check out the full article in the public post above

Conclusions
With the same training dataset (15 images), the same number of steps (all compared trainings are 150 epochs, thus 2250 steps), and almost the same training duration, Fine Tuning / DreamBooth training of FLUX yields the very best results

So yes, Fine Tuning is much better than LoRA training itself

Amazing resemblance and quality with the least amount of overfitting

Moreover, extracting a LoRA from the Fine Tuned full checkpoint yields way better results than LoRA training itself

Extracting a LoRA from fully trained checkpoints yielded way better results in SD 1.5 and SDXL as well
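
Conceptually, extraction approximates the difference between the fine-tuned and base weights with a low-rank product via truncated SVD. Below is a minimal PyTorch sketch of that idea only; Kohya's actual extraction script additionally handles FLUX's layer layout, dtypes and safetensors saving, and the function name here is illustrative:

```python
import torch

def extract_lora_from_delta(w_base: torch.Tensor, w_tuned: torch.Tensor, rank: int):
    """Approximate (w_tuned - w_base) as lora_up @ lora_down with a truncated SVD."""
    delta = (w_tuned - w_base).float()               # [out_features, in_features]
    u, s, vh = torch.linalg.svd(delta, full_matrices=False)
    u, s, vh = u[:, :rank], s[:rank], vh[:rank, :]   # keep top-`rank` components
    lora_up = u * s.sqrt()                           # [out_features, rank]
    lora_down = s.sqrt().unsqueeze(1) * vh           # [rank, in_features]
    return lora_up.to(w_base.dtype), lora_down.to(w_base.dtype)
```

A higher rank keeps more of the fine-tuning delta at the cost of a bigger file, which is why the 640-rank extraction below weighs 6.1 GB.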

A comparison of these 3 is made in Image 5 (check the very top of the images for the details)

A 640 Network Dimension (Rank) FP16 LoRA takes 6.1 GB of disk space

You can also try 128 Network Dimension (Rank) FP16 and different LoRA strengths during inference to bring it closer to the Fine Tuned model
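
For inference-time strength, in a diffusers-based pipeline the extracted LoRA can be scaled roughly like this (a sketch; the LoRA file name is hypothetical, and SwarmUI exposes the same idea as a LoRA weight slider):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("extracted_lora.safetensors", adapter_name="subject")
pipe.set_adapters(["subject"], adapter_weights=[0.8])  # try strengths like 0.6-1.1
image = pipe("photo of ohwx man", num_inference_steps=28).images[0]
```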

Moreover, you can try the Resize LoRA feature of Kohya GUI, but hopefully that will be another research project and article of mine later

Image Raw Links
Image 1 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 2 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 3 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 4 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests

Image 5 : MonsterMMORPG/FLUX-Fine-Tuning-Grid-Tests
replied to their post 3 days ago

To a much lesser degree, but I can't say it's fully fixed yet :/

posted an update 3 days ago
Full Fine Tuning of FLUX yields way better results than LoRA training, as expected; overfitting and bleeding are reduced a lot

Configs and Full Experiments
Full configs and grid files shared here : https://www.patreon.com/posts/kohya-flux-fine-112099700

Details
I am still rigorously testing different hyperparameters and comparing the impact of each one to find the best workflow
So far I have done 16 different full trainings and am completing 8 more at the moment
I am using my poor, overfit 15-image dataset for experimentation (4th image)
I have already proven that when I use a better dataset it becomes many times better and generates expressions perfectly
Here example case : https://www.reddit.com/r/FluxAI/comments/1ffz9uc/tried_expressions_with_flux_lora_training_with_my/
Conclusions
When the results are analyzed, Fine Tuning is far less overfit, more generalized, and better quality
In the first 2 images, it is able to change hair color and add a beard much better, which means less overfitting
In the third image, you will notice that the armor is much better, thus less overfitting
I noticed that the environment and clothing are much less overfit and better quality
Disadvantages
Kohya still doesn't have FP8 training, thus 24 GB GPUs get a huge speed drop
Moreover, 48 GB GPUs have to use the Fused Backward Pass optimization, thus have some speed drop
16 GB GPUs get a way more aggressive speed drop due to the lack of FP8
Clip-L and T5 training is still not supported
Speeds
Rank 1 Fast Config — uses 27.5 GB VRAM, 6.28 second / it (LoRA is 4.85 second / it)
Rank 1 Slower Config — uses 23.1 GB VRAM, 14.12 second / it (LoRA is 4.85 second / it)
Rank 1 Slowest Config — uses 15.5 GB VRAM, 39 second / it (LoRA is 6.05 second / it)
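
To turn these seconds-per-iteration numbers into rough wall-clock estimates, here is a quick helper (a sketch; the step count depends on your image count, epochs and batch size, e.g. the 15-image / 150-epoch = 2250-step runs I compare elsewhere):

```python
def train_hours(num_images: int, epochs: int, sec_per_it: float, batch_size: int = 1) -> float:
    steps = num_images * epochs // batch_size
    return steps * sec_per_it / 3600

print(f"{train_hours(15, 150, 6.28):.1f} h")   # Rank 1 Fast    -> ~3.9 h
print(f"{train_hours(15, 150, 14.12):.1f} h")  # Rank 1 Slower  -> ~8.8 h
print(f"{train_hours(15, 150, 39.0):.1f} h")   # Rank 1 Slowest -> ~24.4 h
```
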
Final Info
Saved checkpoints are FP16 and thus 23.8 GB (no Clip-L or T5 trained)
According to Kohya, the applied optimizations don't change quality, so all configs are ranked as Rank 1 at the moment
I am still testing whether these optimizations make any impact on quality or not
posted an update 5 days ago
Trained Myself With 256 Images on FLUX — Results Mind Blowing

Detailed Full Workflow

Medium article : https://medium.com/@furkangozukara/ultimate-flux-lora-training-tutorial-windows-and-cloud-deployment-abb72f21cbf8

Windows main tutorial : https://youtu.be/nySGu12Y05k

Cloud tutorial for GPU poor or scaling : https://youtu.be/-uhL2nW7Ddw

Full detailed results and conclusions : https://www.patreon.com/posts/111891669

Full config files and details to train : https://www.patreon.com/posts/110879657

SUPIR Upscaling (default settings are now perfect) : https://youtu.be/OYxVEvDf284

I used my Poco X6 camera phone and solo-taken images

My dataset is far from ready; thus I used many repeating and almost identical images, but this was rather experimental

Hopefully I will continue taking more shots, improve the dataset, and reduce its size in the future

I trained Clip-L and T5-XXL Text Encoders as well

Since there was so much pushback from the community claiming my workflow wouldn't work with expressions, I had to take a break from research and use whatever I had

I used my own researched workflow for training with Kohya GUI, and also my own self-developed SUPIR app for batch upscaling with face upscaling and automatic LLaVA caption improvement

Download the images to see them in full size; the last provided grid is 50% downscaled

Workflow

Gather a dataset that has the expressions and perspectives you want after training; this is crucial. Whatever you add, it can generate perfectly

Follow one of the LoRA training tutorials / guides

After training your LoRA, use your favorite UI to generate images

I prefer SwarmUI; here are the prompts I used (you can add specific expressions to the prompts), including face inpainting :

https://gist.github.com/FurkanGozukara/ce72861e52806c5ea4e8b9c7f4409672

After generating images, use SUPIR to upscale 2x with maximum resemblance

Short Conclusions

Using 256 images certainly caused more overfitting than necessary

...
posted an update 10 days ago
Ultimate FLUX LoRA Training Tutorial: Windows and Cloud Deployment

I have done a total of 104 different LoRA trainings and compared each one of them to find the very best hyperparameters and workflow for FLUX LoRA training using the Kohya GUI training script.

You can see all the completed experiments' checkpoint names and their repo links in the following public post: https://www.patreon.com/posts/110838414

After completing all these FLUX LoRA trainings using the most VRAM-optimal and performant optimizer, Adafactor, I came up with the following ranked, ready-to-use configurations.

You can download all the configurations, all research data, installers and instructions at the following link : https://www.patreon.com/posts/110879657


Tutorials
I have also prepared 2 full tutorials. The first tutorial covers how to train and use the best FLUX LoRA locally on your Windows computer : https://youtu.be/nySGu12Y05k

This is the main tutorial that you have to watch without skipping to learn everything. It has a total of 74 chapters and manually written English captions. It is a perfect resource for going from zero to hero in FLUX LoRA training.

The second tutorial I have prepared covers how to train FLUX LoRA on the cloud. This tutorial is extremely important for several reasons. If you don't have a powerful GPU, you can rent a very powerful and very cheap GPU on Massed Compute and RunPod. I prefer Massed Compute since it is faster and cheaper with our special coupon SECourses. Another reason is that in this tutorial video, I have shown in full detail how to train on a multi-GPU setup to scale your training speed. Moreover, I have shown how to upload your checkpoints and files ultra fast to Hugging Face for saving and transferring for free. Still, watch the Windows tutorial above first to be able to follow the cloud tutorial below : https://youtu.be/-uhL2nW7Ddw

For upscaling, SUPIR is used : https://youtu.be/OYxVEvDf284
posted an update 17 days ago
I started training a public style LoRA (2 separate trainings, each on 4x A6000).

I am experimenting with captions vs. no captions, so we will see which yields the best results for style training on FLUX.

I generated the captions with my multi-GPU batch JoyCaption app.

I am showing 5 examples of what JoyCaption generates on FLUX dev. The left images are the original style images from the dataset.

I used my multi-GPU JoyCaption APP (used 8x A6000 for ultra-fast captioning) : https://www.patreon.com/posts/110613301

I used my Gradio batch caption editor to edit some words and add the activation token ohwx 3d render : https://www.patreon.com/posts/108992085

The no-caption dataset uses only ohwx 3d render as the caption

I am using my newest 4x_GPU_Rank_1_SLOW_Better_Quality.json on 4x A6000 GPUs and training 500 epochs — 114 images : https://www.patreon.com/posts/110879657

The total step count is 500 * 114 / 4 (4x GPU — batch size 1) = 14250
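
As a quick sanity check of that arithmetic (at batch size 1, each optimizer step consumes one image per GPU, so steps divide by the GPU count):

```python
epochs, images, gpus, batch_size = 500, 114, 4, 1
total_steps = epochs * images // (gpus * batch_size)
print(total_steps)  # 14250
```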

It is currently on track to take 37 hours, if I don't terminate it early

It will save a checkpoint once every 25 epochs

Full Windows Kohya LoRA training tutorial : https://youtu.be/nySGu12Y05k

I am still editing the full cloud tutorial

Hopefully I will share the trained LoRA on Hugging Face and CivitAI along with the full dataset, including captions.

I got permission to share the dataset, but it can't be used commercially.

Also, I will hopefully share the full workflow on the CivitAI and Hugging Face LoRA pages.

posted an update 25 days ago
Published the first fully multi-GPU-supporting and very advanced batch image captioner APP with a Gradio interface (as far as I know, the first)

Multi-GPU batch captioning with JoyCaption. JoyCaption uses Meta-Llama-3.1-8B, google/siglip-so400m-patch14-384, and a fine-tuned image captioning neural network.

Link : https://www.patreon.com/posts/110613301

Link for batch caption editor : https://www.patreon.com/posts/108992085

Coding multi-GPU in Python with Torch and bitsandbytes was truly a challenge.

Our APP uses the fine-tuned JoyCaption image captioning model.

Our APP supports bitsandbytes 4-bit model loading as well, even in multi-GPU mode (9.5 GB VRAM)

Tested on 8x RTX A6000 (cloud) and RTX 3090 TI + RTX 3060 (my PC)

1-click to install on Windows, RunPod and Massed Compute

Excellent caption quality, automatic distribution of images across the GPUs, and lots of features. You can resume captioning with the skip-captioned-images option (see the sketch below for the distribution idea).
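
A minimal sketch of how such multi-GPU batch captioning can be wired up, with hypothetical load_captioner / write_caption helpers standing in for the JoyCaption-specific loading and generation code (the BitsAndBytesConfig part is the standard transformers 4-bit path):

```python
import torch
import torch.multiprocessing as mp
from pathlib import Path
from transformers import BitsAndBytesConfig

def caption_worker(rank: int, shards: list):
    # Each worker pins its model to one GPU; 4-bit NF4 keeps it around 9.5 GB VRAM.
    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = load_captioner(quantization_config=bnb,          # hypothetical loader
                           device_map={"": f"cuda:{rank}"})  # one GPU per worker
    for path in shards[rank]:
        if not Path(path).with_suffix(".txt").exists():      # resume: skip captioned
            write_caption(path, model)                       # hypothetical helper

if __name__ == "__main__":
    files = sorted(str(p) for p in Path("input_folder").glob("*.jpg"))
    n = torch.cuda.device_count()
    shards = [files[i::n] for i in range(n)]                 # round-robin split
    mp.spawn(caption_worker, args=(shards,), nprocs=n)
```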

For full details, check out the screenshots
posted an update 28 days ago
Huge updates and improvements for FLUX LoRA training : https://www.patreon.com/posts/kohya-flux-lora-110293257

10 GB, 16 GB, 24 GB and 48 GB GPU configs added - the 10 GB config is sadly around 3x to 5x slower

Massed Compute, RunPod and Windows Kohya SS GUI LoRA installers added to the zip file

Right now I am also testing a new 16 GB FLUX LoRA training config and a new way of using regularization images. Moreover, I am testing Apply T5 Attention Mask too. Let's see whether the Kohya FLUX LoRA workflow becomes even better.

Massive grid comparisons are also shared here : https://www.reddit.com/r/StableDiffusion/comments/1eyj4b8/kohya_ss_gui_very_easy_flux_lora_trainings_full/
replied to their post 28 days ago
replied to their post about 1 month ago
posted an update about 1 month ago
ResShift 1-Click Windows, RunPod, Massed Compute, Kaggle Installers with Amazing Gradio APP and Batch Image Processing. ResShift is an efficient diffusion model for image super-resolution by residual shifting (NeurIPS 2023, Spotlight).


Official Repo : https://github.com/zsyOAOA/ResShift

I have developed a very advanced Gradio APP.

Developed APP Scripts and Installers : https://www.patreon.com/posts/110331752

Features

It supports the following tasks:

Real-world image super-resolution

Bicubic (resize by Matlab) image super-resolution

Blind Face Restoration

Automatically saves all generated images with the same name + numbering if necessary (see the sketch after this list)

Randomize seed feature for each generation

Batch image processing - give input and output folder paths and it batch processes all images and saves them

1-Click to install on Windows, RunPod, Massed Compute and Kaggle (free account)
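
The same-name + numbering save logic works roughly like this (a minimal sketch of the idea, not the app's exact code):

```python
from pathlib import Path

def unique_save_path(out_dir: str, name: str, ext: str = ".png") -> Path:
    """Return out_dir/name.png, or name_0001.png, name_0002.png... if already taken."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    candidate = out / f"{name}{ext}"
    counter = 1
    while candidate.exists():
        candidate = out / f"{name}_{counter:04d}{ext}"
        counter += 1
    return candidate
```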

Windows Requirements

Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git

If it doesn't work, make sure to follow the tutorial below and install everything exactly as shown

https://youtu.be/-NjNy7afOQ0

How to Install on Windows

Make sure that you have the above requirements

Extract files into a folder like c:/reshift_v1

Double click Windows_Install.bat and it will automatically install everything for you with an isolated virtual environment folder (VENV)

After that double click Windows_Start_app.bat and start the app

The first time you use a task, it will download the necessary models (all under 500 MB) into the correct folders

If a download fails mid-way, the file gets corrupted; sadly the app doesn't verify this, so delete the files inside the weights folder and restart
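
A generic way to catch such truncated downloads is to compare the local file size against the server's Content-Length before trusting a weight file (a sketch of the idea; the app itself does not perform this check):

```python
import os
import requests

def download_complete(url: str, local_path: str) -> bool:
    """True if the local file matches the remote Content-Length header."""
    if not os.path.exists(local_path):
        return False
    head = requests.head(url, allow_redirects=True, timeout=30)
    expected = int(head.headers.get("Content-Length", -1))
    return expected == os.path.getsize(local_path)
```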

How to Install on RunPod, Massed Compute, Kaggle

Follow the Massed_Compute_Instructions_READ.txt and Runpod_Instructions_READ.txt

For Kaggle, follow the steps written in the notebook

An example video of how to use my RunPod and Massed Compute scripts and the Kaggle notebook can be seen here:

https://youtu.be/wG7oPp01COg
posted an update about 1 month ago
AuraSR Giga Upscaler V1 by SECourses - Upscales to 4x

AuraSR is a 600M parameter upsampler model derived from the GigaGAN paper. It works super fast and uses very limited VRAM, below 5 GB. It is a deterministic upscaler. It works perfectly on some images but fails on others, so it is worth giving it a shot.

GitHub official repo : https://github.com/fal-ai/aura-sr

I have developed 1-click installers and a batch upscaler App.

You can download the installers and the advanced batch App from the link below:
https://www.patreon.com/posts/110060645

Check the screenshots and examples below

Windows Requirements

Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git

If it doesn't work, make sure to follow the tutorial below and install everything exactly as shown

https://youtu.be/-NjNy7afOQ0

How to Install and Use on Windows

Extract the attached GigaGAN_Upscaler_v1.zip into a folder like c:/giga_upscale

Then double click and install with Windows_Install.bat file

It will generate an isolated virtual environment venv folder and install requirements

Then double click and start the Gradio App with Windows_Start_App.bat file

When running for the first time, it will download models into your Hugging Face cache folder

Hugging Face cache folder setup is explained below:

https://www.patreon.com/posts/108419878

All upscaled images will be saved into the outputs folder automatically with the same name, plus numbering if necessary

You can also batch upscale a folder

How to Install and Use on Cloud

Follow Massed Compute and RunPod instructions

Usage is the same as on Windows

For Kaggle start a Kaggle notebook, import our Kaggle notebook and follow the instructions

App Screenshots and Examples below
posted an update about 1 month ago
BiRefNet - Newest State-of-the-Art and Very Best Batch Background Remover APP

Official repo : https://github.com/ZhengPeng7/BiRefNet

Download APP and installers from : https://www.patreon.com/posts/109913645

Hugging Face Demo : ZhengPeng7/BiRefNet_demo

I have developed a very advanced Gradio APP for this with full, proper file saving and batch processing. Also, my version removes the background and saves the result with transparency.

The APP uses huge VRAM for high-resolution images. However, it still works uber fast even when using shared VRAM. So make sure that you have plenty of RAM or set up virtual RAM.

Click below to see how to set virtual RAM on Windows.
https://www.windowscentral.com/how-change-virtual-memory-size-windows-10

On a Massed Compute A6000 GPU (31 cents per hour) you can remove backgrounds very fast, even for very high-res images.

Currently we have 1-click installers for RunPod, Massed Compute, Kaggle and Windows.

Windows Requirements
Python 3.10, FFmpeg, Cuda 11.8, C++ tools and Git

If it doesn't work, make sure to follow the tutorial below and install everything exactly as shown

https://youtu.be/-NjNy7afOQ0

How To Use On Windows
Just extract the files into a folder like c:/BiRefNet_v1

Double click the Windows_Install.bat file and it will generate an isolated virtual environment and install the requirements

It will automatically download models into your Hugging Face cache (best model under 1 GB)

Then start and use the Gradio APP with Windows_Start_App.bat

How To Use On Cloud
Massed Compute and RunPod have instruction txt files. Follow them

Kaggle has all the instructions 1 by 1

On Kaggle, set the resolution to 1024x1024 or you will get an out-of-memory error
replied to gokaygokay's post about 1 month ago

Amazing. I was planning to make a Gradio app for this

replied to merve's post about 1 month ago
replied to not-lain's post about 1 month ago
posted an update about 1 month ago
Live Portrait Updated to V5

Animal live animation added

All of the main repo changes and improvements have been added to our modified and improved app

Link : https://patreon.com/posts/107609670

Works perfectly on Massed Compute, RunPod, free Kaggle accounts and Windows

1-Click to install with instructions

All tested and verified

Windows tutorial : https://youtu.be/FPtpNrmuwXk

Cloud (RunPod, Massed Compute & free Kaggle account) tutorial : https://youtu.be/wG7oPp01COg

Getting the XPose / UniPose ops library to compile was a challenge on Massed Compute and Kaggle.
replied to sayakpaul's post about 1 month ago

Why is this not a single file? I was going to test it in SwarmUI :/

replied to sayakpaul's post about 1 month ago

Nice, but still reduced quality :/ I should compare with dev at 20 steps

posted an update about 2 months ago
FLUX Local & Cloud Tutorial With SwarmUI - FLUX: The Groundbreaking Open Source txt2img Model Outperforms Midjourney & Others - FLUX: The Anticipated Successor to SD3

🔗 Comprehensive Tutorial Video Link ▶️ https://youtu.be/bupRePUOA18

FLUX represents a milestone in open source txt2img technology, delivering superior quality and more accurate prompt adherence than #Midjourney, Adobe Firefly, Leonardo AI, Playground AI, Stable Diffusion, SDXL, SD3, and DALL-E 3. #FLUX, a creation of Black Forest Labs, boasts a team largely composed of #StableDiffusion's original developers, and its output quality is truly remarkable. This statement is not hyperbole; you'll witness its capabilities in the tutorial. This guide will demonstrate how to effortlessly install and utilize FLUX models on your personal computer and cloud platforms like Massed Compute, RunPod, and a complimentary Kaggle account.

🔗 FLUX Setup Guide (publicly accessible) ⤵️
▶️ https://www.patreon.com/posts/106135985

🔗 FLUX Models One-Click Robust Automatic Downloader Scripts ⤵️
▶️ https://www.patreon.com/posts/109289967

🔗 Primary Windows SwarmUI Tutorial (Essential for Usage Instructions) ⤵️
▶️ https://youtu.be/HKX8_F1Er_w

🔗 Cloud-based SwarmUI Tutorial (Massed Compute - RunPod - Kaggle) ⤵️
▶️ https://youtu.be/XFUZof6Skkw

🔗 SECourses Discord Server for Comprehensive Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 SECourses Reddit Community ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗 SECourses GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 Official FLUX 1 Launch Announcement Blog Post ⤵️
▶️ https://blackforestlabs.ai/announcing-black-forest-labs/

Video Segments

0:00 Introduction to the state-of-the-art open source txt2img model FLUX
5:01 Process for integrating FLUX model into SwarmUI
....
posted an update about 2 months ago
FLUX FP16 produces better quality than FP8 but requires 28 GB VRAM - Full comparisons - Also compared Dev vs Turbo model and 1024 vs 1536

Check the file names in the imgsli comparison below to see all the details

SwarmUI on an L40S is used for the comparison - 1.82 it/s step speed for 1024x1024
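
Rough napkin math behind the 28 GB figure, assuming FLUX dev's roughly 12B-parameter transformer at 2 bytes per weight in FP16 versus 1 byte in FP8, before the T5-XXL text encoder, VAE and activations are added on top:

```python
params = 12e9
print(f"FP16 weights: {params * 2 / 1024**3:.1f} GiB")  # ~22.4 GiB
print(f"FP8 weights:  {params * 1 / 1024**3:.1f} GiB")  # ~11.2 GiB
```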

imgsli link that compares all : https://imgsli.com/MjgzNzM1

SwarmUI full tutorial public post : https://www.patreon.com/posts/106135985

1-Click FLUX model downloader scripts for Windows, RunPod and Massed Compute are in the post below

https://www.patreon.com/posts/109289967

A free Kaggle account notebook that already supports FLUX - download from here : https://www.patreon.com/posts/106650931

prompt :

(medium full shot) of (awe-inspiring snake) with muscular body, amber eyes, bronze brown armored scales, venomous fangs, coiling tail, gemstone-studded scales frills, set in a barren desert wasteland, with cracked earth and the remains of ancient structures, a place of mystery and danger, at dawn, ,Masterpiece,best quality, raw photo, realistic, very aesthetic, dark

CFG 1 - seed 1 - FLUX CFG is default : 3.5

Full public SwarmUI tutorial

Zero to Hero Stable Diffusion 3 Tutorial with Amazing SwarmUI SD Web UI that Utilizes ComfyUI

https://youtu.be/HKX8_F1Er_w

Full public Cloud SwarmUI tutorial

How to Use SwarmUI & Stable Diffusion 3 on Cloud Services Kaggle (free), Massed Compute & RunPod

https://youtu.be/XFUZof6Skkw
replied to their post about 2 months ago

It's true: it happened also to a video I created from a photo of my wife. But, besides becoming a little more Asian, she also grew vampire teeth

Totally related to the dataset. So their dataset is very likely unbalanced

replied to their post about 2 months ago
posted an update about 2 months ago
Kling AI Video is FINALLY Public (All Countries), Free to Use and MIND BLOWING - Full Tutorial > https://youtu.be/zcpqAxYV1_w

You have probably seen those mind-blowing AI-made videos. And the day has arrived. The famous Kling AI is now available worldwide for free. In this tutorial video I will show you how to register for free with just an email to Kling AI and use its mind-blowing text-to-video animation, image-to-video animation, text-to-image, and image-to-image capabilities. This video will show you non-cherry-picked results so you will know the actual quality and capability of the model, unlike those extremely cherry-picked example demos. Still, #KlingAI is the only #AI model that competes with OpenAI's #SORA, and it is actually available to use.

🔗 Kling AI Official Website ⤵️
▶️ https://www.klingai.com/

🔗 SECourses Discord Channel to Get Full Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Our GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 Our Reddit ⤵️
▶️ https://www.reddit.com/r/SECourses/
posted an update about 2 months ago
Just published the Image Captioning Editor Gradio APP - edit your captions super easily, including batch editing - for Windows, RunPod and Massed Compute. Developed by me. It has lots of amazing features - everything you need. Let me know if any features are missing.

Gradio is amazing for developing such apps in a short time. I used Claude 3.5 to develop it :)

Scripts are available here : https://www.patreon.com/posts/108992085
replied to their post about 2 months ago

The Massed Compute coupon we have is available indefinitely - that is what they told me, and it is still working

replied to their post about 2 months ago

For this task, LivePortrait is currently the best

posted an update about 2 months ago
LivePortrait AI: Transform Static Photos into Talking Videos. Now supporting Video-to-Video conversion and Superior Expression Transfer at Remarkable Speed

A new tutorial is anticipated to showcase the latest changes and features in V3, including Video-to-Video capabilities and additional enhancements.

This post provides information for both Windows (local) and Cloud installations (Massed Compute, RunPod, and free Kaggle Account).

🔗 Windows Local Installation Tutorial ️⤵️
▶️ https://youtu.be/FPtpNrmuwXk

🔗 Cloud (no-GPU) Installations Tutorial for Massed Compute, RunPod and free Kaggle Account ️⤵️
▶️ https://youtu.be/wG7oPp01COg

The V3 update introduces video-to-video functionality. If you're seeking a one-click installation method for LivePortrait, an open-source zero-shot image-to-animation application on Windows, for local use, this tutorial is essential. It introduces the cutting-edge image-to-animation open-source generator Live Portrait. Simply provide a static image and a driving video to create an impressive animation in seconds. LivePortrait is incredibly fast and adept at preserving facial expressions from the input video. The results are truly astonishing.

With the V3 update adding video-to-video functionality, those interested in using LivePortrait but lacking a powerful GPU, using a Mac, or preferring cloud-based solutions will find this tutorial invaluable. It guides you through the one-click installation and usage of LivePortrait on #MassedCompute, #RunPod, and even a free #Kaggle account. After following this tutorial, you'll find running LivePortrait on cloud services as straightforward as running it locally. LivePortrait is the latest state-of-the-art static image to talking animation generator, surpassing even paid services in both speed and quality.

posted an update 2 months ago
posted an update 2 months ago
# LivePortrait 1-Click Installers and full tutorials for Windows and Cloud (useful for Mac users) (Massed Compute, RunPod and a free Kaggle Account)

I know there are a lot of experts around here who can install and use it easily. But I have prepared solid tutorials for newbies and shown how to use this amazing, top-quality app LivePortrait. I have to say congrats to the developers.

I am also a researcher - you can see my LinkedIn profile here - but recently I have shifted into AI lectures : https://www.linkedin.com/in/furkangozukara/

Both the Windows and Cloud tutorials have manually written (100% accurate) captions / subtitles. Both also have very detailed video chapters, manually written by me.

Windows LivePortrait Tutorial : https://youtu.be/FPtpNrmuwXk

Cloud LivePortrait Tutorial : Massed Compute, RunPod & Kaggle : https://youtu.be/wG7oPp01COg

## Windows LivePortrait Tutorial Video Chapters

- 0:00 Introduction to LivePortrait: A cutting-edge open-source application for image-to-animation conversion
- 2:20 Step-by-step guide for downloading and installing the LivePortrait Gradio application on your device
- 3:27 System requirements and installation process for LivePortrait
- 4:07 Verifying the successful installation of required components
- 5:02 Confirming installation completion and preserving installation logs
- 5:37 Initiating the LivePortrait application post-installation
....
posted an update 3 months ago
Really got amazing, next-level results with our InstantID workflow. The tutorial will hopefully be published soon. This is the best 0-shot likeness and quality I have ever gotten.
posted an update 3 months ago
How to Use SwarmUI & Stable Diffusion 3 on Cloud Services Kaggle (free), Massed Compute & RunPod : https://youtu.be/XFUZof6Skkw

It has manually written captions / subtitles and also video chapters.

If you are GPU poor, this is the video you need

In this video, I demonstrate how to install and use #SwarmUI on cloud services. If you lack a powerful GPU or wish to harness more GPU power, this video is essential. You'll learn how to install and utilize SwarmUI, one of the most powerful Generative AI interfaces, on Massed Compute, RunPod, and Kaggle (which offers free dual T4 GPU access for 30 hours weekly). This tutorial will enable you to use SwarmUI on cloud GPU providers as easily and efficiently as on your local PC. Moreover, I will show how to use Stable Diffusion 3 (#SD3) on cloud. SwarmUI uses #ComfyUI backend.

🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985

🔗 Windows Tutorial for Learn How to Use SwarmUI ➡️ https://youtu.be/HKX8_F1Er_w

🔗 How to download models very fast to Massed Compute, RunPod and Kaggle and how to upload models or files to Hugging Face very fast tutorial ➡️ https://youtu.be/X5WVZ0NMaTg

🔗 SECourses Discord ➡️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion GitHub Repo (Please Star, Fork and Watch) ➡️ https://github.com/FurkanGozukara/Stable-Diffusion

Coupon Code for Massed Compute : SECourses
Coupon works on Alt Config RTX A6000 and also RTX A6000 GPUs

posted an update 3 months ago
LLaVA 1.6 - 34 billion parameters - loaded in 8-bit precision for captioning. Uses around 37 GB VRAM on Massed Compute with our installers (31 cents per hour with the SECourses coupon). A new tutorial for this will hopefully be made soon.
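
The VRAM figure roughly matches per-parameter math: 8-bit quantization stores about 1 byte per weight, with the remainder going to the vision tower, KV cache and activations (an estimate, not a measurement):

```python
params = 34e9
print(f"{params / 1024**3:.1f} GiB for weights alone")  # ~31.7 GiB; ~37 GB observed with overhead
```
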
posted an update 3 months ago
Zero to Hero Stable Diffusion 3 Tutorial with Amazing SwarmUI SD Web UI that Utilizes ComfyUI

https://youtu.be/HKX8_F1Er_w

Do not skip any part of this tutorial to master how to use Stable Diffusion 3 (SD3) with the most advanced generative AI open source APP, SwarmUI. Automatic1111 SD Web UI and Fooocus do not support #SD3 yet. Therefore, I am starting to make tutorials for SwarmUI as well. #StableSwarmUI is officially developed by StabilityAI, and your mind will be blown after you watch this tutorial and learn its amazing features. StableSwarmUI uses #ComfyUI as the back end, thus it has all the good features of ComfyUI, and it brings you the easy-to-use features of the Automatic1111 #StableDiffusion Web UI with them. I really liked SwarmUI and am planning to do more tutorials for it.

🔗 The Public Post (no login or account required) Shown In The Video With The Links ➡️ https://www.patreon.com/posts/stableswarmui-3-106135985

0:00 Introduction to Stable Diffusion 3 (SD3) and SwarmUI, and what is in the tutorial
4:12 Architecture and features of SD3
5:05 What each of the different Stable Diffusion 3 model files means
6:26 How to download and install SwarmUI on Windows for SD3 and all other Stable Diffusion models
8:42 What kind of folder path you should use when installing SwarmUI
10:28 How to notice and fix an installation error if you get one
11:49 Installation has been completed and now how to start using SwarmUI
12:29 Which settings I change before starting to use SwarmUI, and how to change your theme (dark, white, gray)
12:56 How to make SwarmUI save generated images as PNG
13:08 How to find the description of each setting and configuration
13:28 How to download the SD3 model and start using it on Windows
13:38 How to use the model downloader utility of SwarmUI
14:17 How to set model folder paths and link your existing model folders in SwarmUI
14:35 Explanation of the Root folder path in SwarmUI
14:52 Do we need to download the VAE of SD3?
posted an update 4 months ago
V-Express: 1-Click AI Avatar Talking Heads Video Animation Generator - D-ID Alike - Free Open Source

Full Windows YouTube Tutorial : https://youtu.be/xLqDTVWUSec

Ever wished your static images could talk like magic? Meet V-Express, the groundbreaking open-source and free tool that breathes life into your photos! Whether you have an audio clip or a video, V-Express animates your images to create stunning talking avatars. Just like the acclaimed D-ID Avatar, Wav2Lip, and Avatarify, V-Express turns your still photos into dynamic, speaking personas, but with a twist—it's completely open-source and free to use! With seamless audio integration and the ability to mimic video expressions, V-Express offers an unparalleled experience without any cost or restrictions. Experience the future of digital avatars today—let's dive into how you can get started with V-Express and watch your images come alive!

1-Click V-Express Installers Scripts ⤵️
https://www.patreon.com/posts/105251204

Requirements Step by Step Tutorial ⤵️
https://youtu.be/-NjNy7afOQ0

Official V-Express GitHub Repository Free To Install and Use ⤵️
https://github.com/tencent-ailab/V-Express

SECourses Discord Channel to Get Full Support ⤵️
https://discord.com/servers/software-engineering-courses-secourses-772774097734074388


replied to their post 4 months ago

It is open source; you can easily install it by following the GitHub instructions

posted an update 4 months ago
Rope is the newest 1-Click, easiest to use, most advanced open source Deep Fake application. It was just published a few days ago. In the tutorials below I show how to use the Rope Pearl DeepFake application both on Windows and on a cloud machine (Massed Compute). Rope is way better than Roop, Roop Unleashed and FaceFusion. It supports multi-face Face Swapping and makes amazing DeepFake videos so easily with 1-Click. Select a video, select faces and generate your DeepFake 4K ultra-HD video.

1-Click Rope Installer Scripts (contains both Windows, in an isolated Python VENV, and Massed Compute — cloud, no GPU needed) ⤵️

https://www.patreon.com/posts/most-advanced-1-105123768

Tutorials are made only for educational purposes. On a cloud Massed Compute machine, you can run with a staggering 20 threads and can FaceSwap entire movies. Fully supports face tracking and multiple face changes.

Mind-Blowing Deepfake Tutorial: Turn Anyone into Your Fav Movie Star! Better than Roop & Face Fusion ⤵️
https://youtu.be/RdWKOUlenaY

Best Deepfake Open Source App ROPE — So Easy To Use Full HD Faceswap DeepFake, No GPU Required Cloud ⤵️
https://youtu.be/HLWLSszHwEc

posted an update 4 months ago
The zip file contains installers for Windows, RunPod, Massed Compute and a free Kaggle account notebook

It generates a VENV and installs everything inside it. Works with Python 3.10.x - I suggest 3.10.11

Also you need C++ tools and Git. You can follow this tutorial to install all : https://youtu.be/-NjNy7afOQ0

Updated 27 May 2024 : https://www.patreon.com/posts/95759342

21 January 2024 Update
SDXL model upgraded to ip-adapter-faceid-plusv2_sd15
Kaggle Notebook upgraded to V3 and supports SDXL now

First of all I want to thank you so much for this amazing model.

I spent over 1 week coding the Gradio app and preparing the video. I hope you let this thread remain and even add it to the Readme file.

After the video was published I even added a face embedding caching mechanism. Now it calculates the face embedding vector only once for each image, thus greatly speeding up image generation.
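
The caching mechanism is simple: key the embedding by a hash of the image bytes and compute it at most once per unique image (a sketch of the idea, with a hypothetical compute_face_embedding standing in for the actual embedding call):

```python
import hashlib

_embedding_cache = {}

def get_face_embedding(image_path: str):
    with open(image_path, "rb") as f:
        key = hashlib.sha256(f.read()).hexdigest()
    if key not in _embedding_cache:
        # The expensive model call happens only once per unique image.
        _embedding_cache[key] = compute_face_embedding(image_path)  # hypothetical
    return _embedding_cache[key]
```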

Instantly Transfer Face By Using IP-Adapter-FaceID: Full Tutorial & GUI For Windows, RunPod & Kaggle : https://youtu.be/rjXsJ24kQQg

Chapters are as below:

0:00 Introduction to IP-Adapter-FaceID full tutorial
2:19 Requirements to use IP-Adapter-FaceID gradio Web APP
2:45 Where the Hugging Face models are downloaded by default on Windows
3:12 How to change folder path where the Hugging Face models are downloaded and cached
3:39 How to install IP-Adapter-FaceID Gradio Web APP and use on Windows
5:35 How to start the IP-Adapter-FaceID Web UI after the installation
5:46 How to use Stable Diffusion XL (SDXL) models with IP-Adapter-FaceID
5:56 How to select your input face and start generating 0-shot face transferred new amazing images
6:06 Explanations of what each option on the Web UI does

replied to their post 4 months ago
replied to their post 4 months ago

OK, so I have created a template Space. Of course it doesn't work by itself because it runs on a CPU, but people can duplicate it onto a GPU. It should work, but I can only test the interface. I say that they need 60 GB VRAM. Correct me if that's wrong. I will wait for feedback.

Our app works with 29 GB RAM on Kaggle

I can't say for other setups

replied to their post 4 months ago
posted an update 4 months ago
Stable Cascade Full Tutorial for Windows, Massed Compute, RunPod & Kaggle — Predecessor of SD3 — 1-Click Install Amazing Gradio APP

Stable Cascade is another amazing model from Stability AI

Weights are published

Stable Cascade Full Tutorial for Windows — Predecessor of SD3 — 1-Click Install Amazing Gradio APP : https://youtu.be/q0cYhalUUsc

Stable Cascade Full Tutorial for Cloud — Predecessor of SD3 — Massed Compute, RunPod & Kaggle : https://youtu.be/PKDeMdEObNo

replied to their post 4 months ago

Sadly I can't for this one. I also don't know it well, and it requires a good GPU

posted an update 5 months ago
IDM-VTON (Improving Diffusion Models for Authentic Virtual Try-on in the Wild) is so powerful that it can even transfer beard or hair.

I have prepared installer scripts and full tutorials for Windows (requires a GPU with at least 8 GB VRAM), Massed Compute (I suggest this if you don't have a strong GPU), RunPod and a free Kaggle account (works perfectly as well, but slowly).

Windows Tutorial : https://youtu.be/m4pcIeAVQD0

Cloud (Massed Compute, RunPod & Kaggle) Tutorial : https://youtu.be/LeHfgq_lAXU

posted an update 5 months ago
Complete Guide to SUPIR Enhancing and Upscaling Images Like in Sci-Fi Movies on Your PC : https://youtu.be/OYxVEvDf284

In this video, I explain how to 1-click install and use the most advanced image upscaler / enhancer in the world, which is available both commercially and open source. The upscaler that I am going to introduce to you is the open source #SUPIR, and the model is free to use. The SUPIR upscaler is many times better than both the paid Topaz AI and Magnific AI, and you can use this upscaler on your computer for free forever. The difference between SUPIR and #Topaz / #Magnific is like ages. So in this tutorial you are going to learn everything about how to install, update and use the SUPIR upscaler on your personal computer. The video shows Windows, but it works perfectly fine on Linux as well.

Scripts Download Link ⤵️
https://www.patreon.com/posts/99176057

Samplers and Text CFG (Text Guidance Scale) Comparison Link ⤵️
https://imgsli.com/MjU2ODQz/2/1

How to install accurate Python, Git and FFmpeg on Windows Tutorial ⤵️
https://youtu.be/-NjNy7afOQ0

Full DreamBooth / Fine-tuning Tutorial ⤵️
https://youtu.be/0t5l6CP9eBg

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild : https://arxiv.org/abs/2401.13627

The authors introduce SUPIR (Scaling-UP Image Restoration), a groundbreaking image restoration method that harnesses generative prior and the power of model scaling up. Leveraging multi-modal techniques and advanced generative prior, SUPIR marks a significant advance in intelligent and realistic image restoration. As a pivotal catalyst within SUPIR, model scaling dramatically enhances its capabilities and demonstrates new potential for image restoration. The authors collect a dataset comprising 20 million high-resolution, high-quality images for model training, each enriched with descriptive text annotations. SUPIR provides the capability to restore images guided by textual prompts, broadening its application scope and potential.

posted an update 5 months ago
Watch the full tutorial here : https://youtu.be/0t5l6CP9eBg

The tutorial is literally over 2 hours, with manually fixed captions and perfect video chapters.

Most Awaited Full Fine Tuning (with DreamBooth effect) Tutorial Generated Images - Full Workflow Shared In The Comments - NO Paywall This Time - Explained OneTrainer - Cumulative Experience of 16 Months Stable Diffusion

In this tutorial, I am going to show you how to install OneTrainer from scratch on your computer and do Stable Diffusion SDXL (Full Fine-Tuning, 10.3 GB VRAM) and SD 1.5 (Full Fine-Tuning, 7 GB VRAM) based model training on your computer, and also do the same training on a very cheap cloud machine from MassedCompute if you don't have such a computer.

Tutorial Readme File ⤵️
https://github.com/FurkanGozukara/Stable-Diffusion/blob/main/Tutorials/OneTrainer-Master-SD-1_5-SDXL-Windows-Cloud-Tutorial.md

Register Massed Compute From Below Link (could be necessary to use our Special Coupon for A6000 GPU for 31 cents per hour) ⤵️
https://bit.ly/Furkan-Gözükara

Coupon Code for A6000 GPU is : SECourses


0:00 Introduction to Zero-to-Hero Stable Diffusion (SD) Fine-Tuning with OneTrainer (OT) tutorial
3:54 Intro to instructions GitHub readme
4:32 How to register Massed Compute (MC) and start virtual machine (VM)
5:48 Which template to choose on MC
6:36 How to apply MC coupon
8:41 How to install OT on your computer to train
9:15 How to verify your Python, Git and FFmpeg installation
12:00 How to install ThinLinc and start using your MC VM
12:26 How to setup folder synchronization and file sharing between your computer and MC VM
13:56 Ending an existing session in the ThinLinc client
14:06 How to turn off the MC VM
14:24 How to connect and start using the VM
14:41 When to use End Existing Session
16:38 How to download the very best OT preset training configuration for SD 1.5 & SDXL models
18:00 How to load a configuration preset
18:38 Full explanation of the OT configuration and the best hyperparameters for SDXL
....
replied to their post 6 months ago
posted an update 6 months ago
replied to their post 6 months ago

I did over 100 trainings to empirically find the best hyperparameters. Training U-NET + Text Encoder 1 yields better results than U-NET only @researcher171473

posted an update 6 months ago
Now You Can Full Fine Tune / DreamBooth Stable Diffusion XL (SDXL) with only 10.3 GB VRAM via OneTrainer — Both U-NET and Text Encoder 1 are trained — Compared 14 GB config vs slower 10.3 GB config

Full config and instructions are shared here : https://www.patreon.com/posts/96028218

Used SG161222/RealVisXL_V4.0 as a base model and OneTrainer to train on Windows 10 : https://github.com/Nerogar/OneTrainer

The posted example x/y/z checkpoint comparison images are not cherry-picked. Still, I can get perfect images within a few tries.

Trained 150 epochs with 15 images, and used my ground-truth 5200 regularization images : https://www.patreon.com/posts/massive-4k-woman-87700469

In each epoch, only 15 of the regularization images are used to create the DreamBooth training effect

As the caption, only "ohwx man" is used; for regularization images, just "man"
You can download the configs and full instructions here : https://www.patreon.com/posts/96028218

Hopefully a full public tutorial is coming within 2 weeks. I will show all the configuration as well

The tutorial will be on our channel : https://www.youtube.com/SECourses
Training speeds, and thus durations, are as below:

RTX 3060 — slow preset : 3.72 second / it; 15 train images * 150 epochs * 2 (reg images concept) = 4500 steps, thus 4500 * 3.72 / 3600 = 4.65 hours

RTX 3090 TI — slow preset : 1.58 second / it thus : 4500 * 1.58 / 3600 = 2 hours

RTX 3090 TI — fast preset : 1.45 second / it thus : 4500 * 1.45 / 3600 = 1.8 hours
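
Those durations check out with simple arithmetic (steps = train images * epochs * 2 for the regularization-image concept):

```python
steps = 15 * 150 * 2  # = 4500
for preset, sec_per_it in [("RTX 3060 slow", 3.72),
                           ("RTX 3090 TI slow", 1.58),
                           ("RTX 3090 TI fast", 1.45)]:
    print(f"{preset}: {steps * sec_per_it / 3600:.2f} hours")
# prints roughly 4.65, 1.98 and 1.81 hours — matching the figures above
```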

A quick tutorial for how to use concepts in OneTrainer : https://youtu.be/yPOadldf6bI


replied to their post 7 months ago
replied to their post 7 months ago

@ameerazam08 100%. I am talking with the original developers about CPU offloading too; hopefully they add it.

posted an update 7 months ago
I have dedicated several days, working over 12 hours each day, on SUPIR (Scaling-UP Image Restoration), a cutting-edge image enhancement and upscaling model introduced in the paper Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild.

This model is simply mind-blowing. At the bottom of this post, you will see side-by-side comparisons of SUPIR versus the extremely expensive online service, Magnific AI. Magnific is known to be the best among the community. However, SUPIR is by far superior. SUPIR also significantly outperforms Topaz AI upscale. SUPIR manages to remain faithful to the original image almost 100% while adding details and achieving super upscaling with the best realism.

You can read the full blog post here : https://huggingface.co/blog/MonsterMMORPG/supir-sota-image-upscale-better-than-magnific-ai

posted an update 7 months ago
Compared Stable Diffusion 3 with Dall-E3 and results are mind blowing.

SD 3 can follow prompts many times better than SD 1.5 or SDXL. It is even better than Dall-E3 in following text / spelling prompts.

The realism of SD3 can't even be compared with Dall-E3, since every Dall-E3 output looks like a digital render.

Can't wait to get approved for the Stability AI early preview program to do more intensive testing.

Some people say to be skeptical about cherry-picking. I agree, but I hope these images released by Stability AI are not that heavily cherry-picked.

You can see SD3 vs Dall-E3 comparison here : https://youtu.be/DJxodszsERo