Anime Illust Diffusion XL
模型介绍(Chinese Part)
I 引言
在本介绍中,您将了解:
模型介绍(见 II 部分);
使用指南(见 III 部分);
训练参数(见 IV 部分);
触发词列表(见附录 A 部分)
II 模型介绍
动漫插画设计XL,或称 AIDXL 是一款专用于生成二次元插图的模型。它内置了 200 种以上(随着更新越来越多)的插画风格,依靠特定触发词(见附录 A 部分)触发。
优点:构图大胆,没有摆拍感,主体突出,没有过多繁杂的细节,认识很多动漫人物(依靠角色日文名拼音触发,例如,“ayanami rei”对应角色“绫波丽”,“kamado nezuko”对应角色“祢豆子”)。
模型难度较大,不推荐入门者使用。
III 使用指南(将与时俱进)
推荐使用 ComfyUI 生成图像……
现在,WebUI和ComfyUI在生成时无明显差别。
1 生成参数
如果您无法生成与预览图相似的图像,请参照以下指南。
建议图像总分辨率(总分辨率=高度x宽度)大于 1024x1024 且 小于 1024x1024x1.5,否则生成的图像可能质量不高。此为经验法则,即生成图像的总分辨率应高于训练集图像的总分辨率,且同时低于训练集图像总分辨率的 1.5 倍,以防止模糊和畸变。例如,本模型在 1024x1024 总分辨率上训练,因此您最大可以生成 1024x1536(以 2:3 为例)分辨率的图像。
推荐使用 tag + 自然语言 的形式书写正面提示词。提高自然语言中的名词密度,避免使用抽象形容词,或用多个形容词叠加地修饰名词。另外,无需使用过多负面提示词。建议负面提示词数量不超过10个。
不进行“Clip Skip”操作,即 Clip Skip = 1。
采用 “dpmpp_2m” 采样器(sampler),搭配 “karras” 调度器(scheduler),该组合在 webui 里称为 DPM++ 2M Karras。在 7 CFG Scale 上采样 35 步以上。
仅需要使用模型本身,而不使用精炼器(Refiner)。
使用基底模型 vae 或 sdxl-vae。
使用附录部分提供的触发词以活用风格化。注意,从v0.5版本开始将支持部分质量提示词,如 best quality, masterpiece 等。使用它们将提高图像平均的美学质量(并不总是)。
2 注意事项
使用 SDXL 支持的 VAE 模型、文本嵌入(embeddings)模型和 Lora 模型。注意:sd-vae-ft-mse-original 不是支持 SDXL 的 vae;EasyNegative、badhandv4 等负面文本嵌入也不是支持 SDXL 的 embeddings。
生成图像时,强烈推荐使用模型专用的负面文本嵌入(下载参见 Suggested Resources 栏),因其为模型特制,故对模型几乎仅有正面效果。
由于初步训练,版本新增触发词将在当前版本效果相对较弱或不稳定。
3 实验
触发词所指向的风格能够相互融合而产生新的风格。
自 v0.5 版本开始,新增了质量提示词。
IV 训练参数
以 SDXL1.0 为底模,使用大约 2w 张自己标注的图像在 5e-6 学习率,总长为 1 的余弦调度器上训练了约 100 期得到模型 A。之后在 2e-7 学习率,其余参数相同的条件下,训练得到模型 B。将模型 A 与 B 混合后得到 AIDXLv0.1 模型。
V 对比基于 SD1.5 的 AID
2023/08/08:AIDXL 使用与 AIDv2.10 完全相同的训练集进行训练,但表现优于 AIDv2.10。AIDXL 更聪明,能做到很多以 SD1.5 为底模型无法做到的事。它还能很好地区分不同概念,学习图像细节,处理对 SD1.5 来说难于登天的构图,几近完美地学习旧版 AID 无法完全掌握的风格。总的来说,它绝对拥有比 SD1.5 更高的上限,我会继续更新 AIDXL。
Model Introduction (English Part)
I Introduction
In this introduction, you'll learn about:
Model overview (see Section II);
Usage guide (see Section III);
Training parameters (see Section IV);
Trigger word list (see Appendix A).
II AIDXL
Anime Illust Diffusion XL, or AIDXL, is a model dedicated to generating anime illustrations. It has more than 200 built-in illustration styles (the number grows with each update), which are triggered by specific trigger words (see Appendix A).
Advantages: flexible composition without a posed look, a prominent subject, no excess of fussy detail, and familiarity with many anime characters. Characters are triggered by name: for example, "ayanami rei" corresponds to the character "Ayanami Rei", "kamado nezuko" to "Nezuko", and "lucy \(cyberpunk edgerunners\)" to "Lucy".
The model is fairly difficult to use and is not recommended for beginners.
III User Guide
(This guide will be updated over time.)
There is currently no obvious difference between images generated with WebUI and with ComfyUI.
1 Suggested Generation Parameters
If you are unable to generate an image similar to the preview, please follow the guidelines below.
Keep the total image resolution (total resolution = height x width) between 1024x1024 and 1024x1024x1.5; outside this range, the generated image may not be of high quality. As a rule of thumb, the total resolution of a generated image should be at least that of the training images and at most 1.5 times it, to prevent blurring and distortion. This model was trained at a total resolution of 1024x1024, so you can generate images up to 1024x1536 (a 2:3 aspect ratio, for example).
Write positive prompts as tags plus natural language. Increase the density of nouns in the natural-language part, and avoid abstract adjectives or stacking several adjectives on one noun. There is also no need for many negative prompt words; no more than 10 is recommended.
Do not use Clip Skip; that is, set Clip Skip = 1.
Use the "dpmpp_2m" sampler with the "karras" scheduler; WebUI calls this combination DPM++ 2M Karras. Sample for 35 or more steps at a CFG scale of 4 to 9.
Do not use a refiner model.
Use the model's own VAE or sdxl-vae.
Use SDXL supported VAE models, text embeddings models and Lora models. Note: sd-vae-ft-mse-original is not an SDXL-capable vae; negative embeddings like EasyNegative and badhandv4 are not SDXL-capable embeddings.
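The numeric recommendations above can be checked mechanically before a generation run. The following is a minimal sketch, not part of any real tool: `GenerationSettings`, `area_in_recommended_range`, and all field names are hypothetical, both resolution bounds are assumed to be inclusive (so that the guide's own 1024x1536 example passes), and 35 steps is assumed to be acceptable.

```python
from dataclasses import dataclass
from typing import List

TRAIN_AREA = 1024 * 1024          # training resolution stated in the guide
MAX_AREA = TRAIN_AREA * 3 // 2    # 1.5x upper bound from the rule of thumb


def area_in_recommended_range(width: int, height: int) -> bool:
    """Return True if width * height lies in [TRAIN_AREA, MAX_AREA].

    Both bounds are treated as inclusive (an assumption).
    """
    return TRAIN_AREA <= width * height <= MAX_AREA


@dataclass
class GenerationSettings:
    """Hypothetical container for the recommended settings; not a real API."""
    width: int = 1024
    height: int = 1536
    sampler: str = "dpmpp_2m"
    scheduler: str = "karras"
    steps: int = 35
    cfg_scale: float = 7.0        # any value in the 4 to 9 range
    clip_skip: int = 1
    use_refiner: bool = False
    negative_prompt: str = ""

    def warnings(self) -> List[str]:
        """List every recommendation that these settings deviate from."""
        found = []
        if not area_in_recommended_range(self.width, self.height):
            found.append("total resolution outside 1024x1024 .. 1024x1536")
        if (self.sampler, self.scheduler) != ("dpmpp_2m", "karras"):
            found.append("sampler/scheduler is not dpmpp_2m + karras")
        if self.steps < 35:
            found.append("fewer than 35 steps")
        if not 4 <= self.cfg_scale <= 9:
            found.append("CFG scale outside 4 to 9")
        if self.clip_skip != 1:
            found.append("Clip Skip is not 1")
        if self.use_refiner:
            found.append("refiner enabled")
        # Count comma-separated negative prompt words, ignoring empty entries.
        negatives = [w for w in self.negative_prompt.split(",") if w.strip()]
        if len(negatives) > 10:
            found.append("more than 10 negative prompt words")
        return found


if __name__ == "__main__":
    print(GenerationSettings().warnings())
    print(GenerationSettings(width=512, height=512, steps=20).warnings())
```

An empty list from `warnings()` means the settings match the guide; each string otherwise names one deviation, which is a hint rather than a hard error.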
2 Notes
Use SDXL-compatible VAE, text embedding, and LoRA models. Note: sd-vae-ft-mse-original is not an SDXL-compatible VAE, and negative text embeddings such as EasyNegative and badhandv4 are not SDXL-compatible embeddings.
When generating images, I strongly recommend using the model-specific negative embedding (see Suggested Resources). Because it was trained specifically for this model, its effect is almost entirely positive.
Because newly added trigger words are still partly underfitted, they are relatively weak or unstable in the version that introduces them. This improves over the next few versions.
3 Experiments
The styles invoked by different trigger words can be blended with each other to produce new styles.
Starting from version v0.5, some quality trigger words have been added.
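As an illustration of blending styles, a prompt can simply list more than one trigger word. The sketch below assumes a comma-separated prompt in the order quality words, tags, description, style triggers; that ordering and the helper `build_prompt` are assumptions made for this example, not something the guide prescribes. The quality words "best quality" and "masterpiece" are the ones named in the Chinese part of this guide, and the style triggers come from Appendix A.

```python
from typing import Iterable


def build_prompt(tags: Iterable[str], description: str,
                 styles: Iterable[str] = (),
                 quality: Iterable[str] = ()) -> str:
    """Join quality words, tags, a description, and style triggers with commas."""
    parts = [*quality, *tags, description, *styles]
    # Drop empty entries and surrounding whitespace before joining.
    return ", ".join(p.strip() for p in parts if p.strip())


prompt = build_prompt(
    tags=["ayanami rei"],                              # character trigger
    description="a girl reading a book beside a window",
    styles=["by wlop", "by nixeu"],                    # two styles blended
    quality=["best quality", "masterpiece"],           # supported from v0.5
)
print(prompt)
```

How strongly each style shows through when several are combined is something to find by experiment; the helper only assembles the text.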
IV Training Parameters
Model A was trained from the SDXL 1.0 base model on about 22k labeled images for about 100 epochs, using a learning rate of 5e-6 with a cosine scheduler of total length 1. Model B was then trained with a learning rate of 2e-7 and otherwise identical parameters. The AIDXLv0.1 model is obtained by merging models A and B.
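The text does not say how models A and B were merged or in what proportion. One common way to merge two checkpoints of the same architecture is a per-weight linear interpolation, sketched below on plain Python lists standing in for tensors. The function `merge_weights`, the parameter `alpha`, and its default of 0.5 are all assumptions made for illustration, not the author's actual procedure.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def merge_weights(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Linearly interpolate two weight dictionaries: (1 - alpha) * a + alpha * b.

    alpha = 0.5 is a hypothetical default; the source gives no merge ratio.
    """
    if a.keys() != b.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError("shape mismatch for parameter " + name)
        merged[name] = [(1 - alpha) * x + alpha * y
                        for x, y in zip(a[name], b[name])]
    return merged


# Toy example with one two-element "layer" per model.
model_a = {"layer.weight": [0.0, 2.0]}
model_b = {"layer.weight": [2.0, 4.0]}
merged = merge_weights(model_a, model_b)
print(merged)
```

With real checkpoints the same arithmetic would be applied tensor by tensor over the two state dictionaries.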
V AIDXL vs SD1.5-based AID
2023/08/08: AIDXL is trained on the same training set as AIDv2.10, but outperforms it. AIDXL is smarter and can do many things that SD1.5-based models cannot. It also does a very good job of distinguishing between concepts, learning image detail, and handling compositions that are difficult or even impossible for SD1.5 and the older AID. Overall, it has a clearly higher ceiling than SD1.5, and I'll keep updating AIDXL.
Appendix / 附录
A. Trigger Words List / 触发词列表
v0.1 & v0.2: by 35s00, by 3meiji, by 5eyo, by 7nu, by 7thknights, by adenim, by agm, by ajimita, by akizero, by ame929, by anmi, by anteiru, by arutera, by ask, by atelier irrlicht, by bunbun, by caaaaarrot, by camu, by canking, by ccroquette, by chi4, by chicken utk, by chon, by cola, by cutesexyrobutts, by darumakarei, by dino, by dora, by dsmile9, by ei maestrl, by ekita kuro, by ekita xuan, by eku uekura, by fadingz, by fajyobore, by foomidori, by freng, by fuzichoco, by gesoking, by gomzi, by hachisan, by hakuhiru oeoe, by hamukukka, by haru, by hata, by hidulme, by hikinito0902, by hinaki, by hitoimim, by hitomio16, by hizumi, by homutan, by hotatenshi, by houk1se1, by hyatsu, by icecenya, by ichigo ame, by inoriac, by iromishiro, by iwzry, by jnthed, by joezunzun, by junsui0906, by karohroka, by kaya7hara, by kazari tayu, by killow, by kin, by kinta, by kishiyo, by kitada mo, by kkuni, by konya karasue, by kooork55, by kot rou020, by krenz, by kurige horse, by kuroume, by lalalalack, by lemoneco, by lm7, by lovelymelm, by lpmya, by mar takagi, by matcha, by matsukenmanga, by melowh, by menou, by midori xu, by mika pikazo, by misumigumi, by miv4t, by mochizukikei, by mogumo, by momoco, by momoku, by morikuraen, by mqkyrie, by muina, by munashichi, by muryou tada, by myaru, by myc0t0xin, by myung yi, by nack, by naji yanagida, by nanmo, by nardack, by narue, by nekojira, by netural, by nezukonezu32, by nico tine, by nikuzume, by nine, by nineo, by ninev, by niwa uxx, by nixeu, by noco, by noodle4cool, by nounoknown, by noyu, by oda non, by omutatsu, by onineko, by palow, by panp, by pikuson, by poharo, by poire, by potg, by pro-p, by qooo003, by rai hito, by rattan, by reiko, by rella, by rhtkd, by rin7914, by roitz, by ryuseilan, by saberiii, by sais, by sakiika, by samip, by sanosomeha, by say hana, by scottie0073, by senryoko, by serie niai, by seuhyo99, by shal-e, by shimanun, by shirabii, by shiraishi kanoya, by shiren, by shirentutu, by sho, by sia, by siki, by silver, by solipsist, by some1else45, by sonomura00, by sooon, by star furu, by starshadowmagic, by starzin07, by sui 0z0, by sul, by sushi0831, by suzukasuraimu, by taiki, by takumi bis, by teffish, by tidsean, by tira27, by tsukiho tsukioka, by tsvbvra, by ttosom, by tukumi bis, by uiiv, by ukiatsuya, by umaiyo puyoman, by void, by wait ar, by walzrj, by wanke, by whoisshe, by wlop, by xilmo, by yejji, by yogisya, by yohan, by yomu, by yoneyama mai, by yosk6000, by yumenouchi, by yun216, by yunikon147, by yunsang, by ziyun, by zumoti4
v0.3 adds: by akita hika, by asaikeu, by atdan, by bannou, by bison, by bodhi, by bonnie23, by cell, by chela77, by coco1758, by ebkim, by eichi, by electrophorus, by fungi, by gekidan, by glutton, by hews, by hirohorn, by hle, by hlymoriia, by icomochi, by iumu, by jeone0, by kana dfy, by kikinoki, by kumori ufo, by kurige, by lam, by liaowen, by limuli ceey, by lirseven, by maenoo, by magotsuki, by marusin, by mechari, by minncn, by modare, by r1zen, by rag ragko, by rannou, by rolua, by rurudo, by saclia, by sai izumi, by takunomi, by tedineon, by torino, by tororoshanyao, by tsubonari, by uuuzan, by yamanokami eaka, by zumizumi
v0.5 adds: by 3333382, by agoto, by am1m, by apoco, by aroa, by asea, by asets96, by asicah, by attas, by ayanon, by baihuahua, by ballpa cohi, by buzhijinfeng, by chai, by choyeon, by ciwu, by criin, by cupoi, by demizu posuka, by diyokama, by eyyy, by futamotu, by ganet, by greembang, by han, by hapu, by hcc33, by hoodxart, by hxxg, by ikky, by imlllsn, by japste, by jeanorgan, by jhcoon, by jiaming, by jirujiaru, by jjjsss, by jlt4n, by jojomaki, by jue, by jumbo, by kaedelic, by katann, by kieed, by kinokohime, by kirin, by kji rozo, by kmgrn, by kookie, by ksorede, by kuroduki, by kyusoukyu, by laza, by letco, by linfi muu, by lingli, by lizhiyan360, by luenar, by mamenomoto, by mashilemo, by mayf42, by mgdown, by miaopulu, by michihasu, by misaka12003, by mitsuki sanagi, by mizokooohmygod, by moguta, by moyu marginal, by mygom, by myless23, by nbrush19, by necomi, by nekoshoko, by nemn, by ninebell, by nininisama, by njer, by nmk, by nocopyrightgirl, by noir, by novelance, by ohanatoomoti, by omao, by qysthree, by rumoon, by ruoganzhua, by saltsaltzome, by sanmuyyb, by sannso, by saturn, by senju yosiyuki, by shant, by sheya, by shirataking, by siokazunoko, by siu, by skyjack, by sogawa, by ssr susu, by swd3e22, by swkld, by tansuan, by terite3lio, by timo, by uenomigi, by unfairr, by xianyuliangryo, by xiaoluoxl, by yaegasinan, by yarn, by yktmr10, by ymqqq, by yolanda, by yuichohui, by yumo012, by yutomaru, by yuzuyomogi
v0.5 adds (quality / traditional style): impasto, pseudo impasto, photorealistic, cel shading, flat color, realistic, oil painting, sketch, 3d, vivid color, perspective
B. Trigger Words Introduction / 触发词介绍
(Updating...)
realistic / 写实风格: by wlop, by nixeu, by shal-e
cel shading / 卡通渲染: by void, by 7thknights, by novelance, by ciwu, by homutan, by melowh
impasto / 厚涂: by dino, by xilmo, by solipsist, by reiko, by some1else45, by noodle4cool, by unfairr
flat color / 平涂: by uenomigi, by magotsuki, by 3333382, by eku uekura, by hakuhiru oeoe, by hamukukka, by haru, by hirohorn, by hizumi, by ichigo ame, by kooork55, by mitsuki sanagi, by qooo003, by nezukonezu32, by nico tine, by nocopyrightgirl, by sanosomeha, by sonomura00, by tsubonari, by tsvbvra, by tukumi bis, by ukiatsuya, by yktmr10, by yosk6000, by yutomaru
---------------------------------------------------------------------------------------------------------------------
Type: Checkpoint
Stats: 4136, 0
Published: 2023-12-27 14:27:55
Base Model: SDXL 1.0
Usage Tips: Clip Skip: 2
Trigger Words: SEE APPENDIX A / 见附录A
