Hey! I am a software engineer with extensive experience building full-stack applications, and lately I’ve been mesmerized by the tools popping up that use AI to solve common day-to-day problems: generating blog outlines from a couple of lines of text, creating realistic avatars of yourself in different settings, generating art from text prompts. I’ve never had any touch points with these technologies, so it’s quite overwhelming to figure out where to start!<p>Do I need to know the basics? Should I just use existing solutions like GPT-3, OpenAI, and Stable Diffusion and build applications with them? Can I tailor those tools to my use cases (model training), or should I build something similar from scratch?<p>Looking for advice!
For the image generation (or even indexing with the CLIP interrogator) side of things, I recommend just installing the AUTOMATIC1111 GitHub repo (<a href="https://github.com/AUTOMATIC1111/stable-diffusion-webui" rel="nofollow">https://github.com/AUTOMATIC1111/stable-diffusion-webui</a>). It's a web UI with pretty much every variant of Stable Diffusion you could want to try out: txt2img, img2img, inpainting, outpainting, style customization (both textual inversion and DreamBooth), CLIP interrogation, etc. Most importantly, there are about 1000 YouTube tutorials on how to do each of these things with it, so you can pick your interest areas and just try it out without having to understand all the details first.<p>From there, if you're interested in how it works, I highly recommend the 4 most recent videos on Jeremy Howard's YouTube channel: <a href="https://www.youtube.com/user/howardjeremyp/videos" rel="nofollow">https://www.youtube.com/user/howardjeremyp/videos</a><p>He's currently teaching a class on Stable Diffusion from the ground up, and these lectures give a really good introduction to how it all works.
1. It depends. If you just want to <i>use</i> a model and call APIs, you do not have to learn any ML theory. You just have to learn to use the libraries by following their GitHub README instructions. Get a Colab Pro+ subscription or run Kaggle Notebooks for free. You can also simply use GUIs built on top of open-source models.<p>2. Learn to use the Hugging Face library, and use their models in your notebooks.<p>3. Learn some ML theory so you understand hyperparameters better and can tune them more effectively.<p>____<p>If you want to train models from scratch yourself, you have to go deeper, and you cannot skip learning ML theory.<p>____<p>The most obvious ways would be:<p>1. Looking into the stuff that John Whitaker does [0] and his elaborate free course on AI art [1].<p>2. Learning ML from scratch, starting with Andrew Ng's ML course, then moving on to DL, then learning about GANs.<p>3. Learning from fast.ai through their two-part course on deep learning, where Stable Diffusion is now being taught. Then learn PyTorch from another source, like Sebastian Raschka's book.<p>4. Watching the old Stanford CS231n videos from when Karpathy was a TA and taught in the class. Back then, Deep Dream was the standard.<p>_____<p>If you are a responsible, mature person, you are in it for the long term, and you have deep pockets, buy some GPUs. 2x 3090 is reasonable and should be enough.<p>____<p>Let me know if you have any further questions.<p>[0]: <a href="https://datasciencecastnet.home.blog/" rel="nofollow">https://datasciencecastnet.home.blog/</a><p>[1]: <a href="https://youtube.com/playlist?list=PL23FjyM69j910zCdDFVWcjSIKHbSB7NE8" rel="nofollow">https://youtube.com/playlist?list=PL23FjyM69j910zCdDFVWcjSIK...</a>
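To make the point about hyperparameters a bit more concrete, here is a minimal sketch of how one hyperparameter, the learning rate, changes the outcome of training. It is not taken from any of the courses above: the function `gradient_descent` and the toy objective f(x) = x**2 are assumptions chosen purely for illustration, and real models have far messier loss surfaces.

```python
def gradient_descent(learning_rate, steps=20, start=5.0):
    """Minimise the toy objective f(x) = x**2 and return the final x.

    Hypothetical helper for illustration only; the minimum is at x = 0.
    """
    x = start
    for _ in range(steps):
        grad = 2 * x                     # derivative of x**2
        x = x - learning_rate * grad     # plain gradient descent update
    return x


if __name__ == "__main__":
    # Same problem, same number of steps, three different learning rates.
    for lr in (0.01, 0.1, 1.1):
        final_x = gradient_descent(lr)
        print(f"learning_rate={lr}: final x = {final_x:.4f}")
```

With a small learning rate the value creeps slowly toward 0, with a moderate one it gets close within the same 20 steps, and with one that is too large every update overshoots and the value grows instead of shrinking. That intuition is roughly what a bit of theory buys you when you tweak settings on a real model.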
For search, e.g. e-commerce search where a person searches for "long-sleeve" or "ripped jeans", tensor search helps by matching the query against text vectors, image vectors, etc. I would recommend the Marqo repo @ <a href="https://search-the-way-you-th.ink/3FNq2lG" rel="nofollow">https://search-the-way-you-th.ink/3FNq2lG</a> . Super handy if you are focused on search and want to implement it in current projects. I've only just started using it, so I can't comment much further, but so far it's awesome!
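For anyone wondering what vector-based search boils down to, here is a rough sketch of the core idea: items and queries are represented as vectors, and results are ranked by how similar those vectors are. This is not Marqo's API. The names `cosine_similarity`, `search`, and `CATALOG` are hypothetical, and the 3-dimensional vectors are hand-written stand-ins for embeddings that a real system would get from a text or image model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical hand-made "embeddings"; a real system would compute these
# with a model and would use hundreds of dimensions, not three.
CATALOG = {
    "long-sleeve shirt": [0.9, 0.1, 0.0],
    "ripped jeans": [0.1, 0.9, 0.2],
    "leather boots": [0.0, 0.2, 0.9],
}


def search(query_vector, catalog, top_k=2):
    """Return the names of the top_k catalog items most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in catalog.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical embedding for a query such as "distressed denim".
    query = [0.2, 0.8, 0.1]
    print(search(query, CATALOG))
```

The query never contains the words "ripped" or "jeans", yet that item ranks first because its vector points in nearly the same direction. A dedicated search engine adds the embedding models, indexing, and scaling on top of this idea.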
If you want to train the model, you can try Dreambooth-Stable-Diffusion. <a href="https://github.com/XavierXiao/Dreambooth-Stable-Diffusion" rel="nofollow">https://github.com/XavierXiao/Dreambooth-Stable-Diffusion</a>
If you're going to use Google Colab, I'd recommend [fast-stable-diffusion](<a href="https://github.com/TheLastBen/fast-stable-diffusion" rel="nofollow">https://github.com/TheLastBen/fast-stable-diffusion</a>). If not, I'm working on a [fork](<a href="https://github.com/askiiart/universal-fast-stable-diffusion" rel="nofollow">https://github.com/askiiart/universal-fast-stable-diffusion</a>). However, it's GPU-specific (and not functional yet), so I'd recommend checking out what other commenters say instead.
I've noticed a lot of the online services wrap OpenAI and sell a specific feature set. I'm also interested to see what I could build with GPT-3 without having to pay for it. But if I have to pay, I will, as I don't have a lot of free time to learn.