2023 Transformers in Speech Processing- A Survey
Recent survey on applying Transformer models to speech processing tasks, including automatic speech recognition, speech synthesis, speech translation, speech paralinguistics, speech enhancement, spoken dialogue systems, and numerous multimodal applications.
2019 Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
Depth-estimation model (MiDaS) used in StableDiffusionDepth2ImgPipeline from HuggingFace Diffusers.
2020 Denoising Diffusion Probabilistic Models
🌲 Foundational diffusion model (DDPM), inspired by nonequilibrium thermodynamics.
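A minimal sketch of the forward (noising) half of DDPM, to make the note above concrete. It uses the closed form q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 - ᾱ_t) I) and the linear schedule (T = 1000, β from 1e-4 to 0.02) reported in the paper. The function names and the four-pixel toy "image" are assumptions for illustration, and the learned reverse (denoising) network is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) in one shot, without simulating every step."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)
x0 = [1.0, -1.0, 0.5, 0.0]  # toy "image" with four pixels (hypothetical data)
x_early = q_sample(x0, 10, alpha_bars, rng)   # still close to x0
x_late = q_sample(x0, 999, alpha_bars, rng)   # almost pure Gaussian noise
```

Because ᾱ_t shrinks towards zero as t grows, late samples carry almost no information about x_0, which is what lets sampling start from pure noise.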
2021 High-Resolution Image Synthesis with Latent Diffusion Models
🌲 Latent diffusion model behind Stable Diffusion (available through HuggingFace Diffusers), dealing with the text-to-image task.
2021 LoRA- Low-Rank Adaptation of Large Language Models
🌲 Useful component for parameter-efficient fine-tuning❓ Maybe learn it from HuggingFace Diffusers.
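A minimal sketch of the LoRA idea in plain Python, to show why it makes fine-tuning cheap: the pre-trained weight W0 stays frozen and only a low-rank update B·A is trained, scaled by alpha / r. Following the paper, A starts as small Gaussian noise and B as zeros, so the adapted layer initially behaves exactly like the original. The class name `LoRALinear`, the helper `matvec`, and the toy sizes are assumptions for illustration, not the Diffusers or PEFT API.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (hypothetical helper)."""

    def __init__(self, weight, rank, alpha, rng):
        self.weight = weight  # frozen W0, shape (d_out, d_in)
        d_out, d_in = len(weight), len(weight[0])
        self.scale = alpha / rank
        # A: (rank, d_in), small Gaussian init; B: (d_out, rank), zero init.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        """h = W0 x + (alpha / r) * B (A x)."""
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        """Only A and B would receive gradients; W0 is not counted."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


rng = random.Random(0)
W0 = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(8)]
layer = LoRALinear(W0, rank=2, alpha=4.0, rng=rng)
x = [1.0] * 8
h = layer.forward(x)
```

In this toy 8×8 layer the trainable parameters are 2·8 + 8·2 = 32 against 64 frozen ones; the saving grows with the layer size because the count scales with r·(d_in + d_out) rather than d_in·d_out.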
2021 Score-Based Generative Modeling through Stochastic Differential Equations
🌲 Mathematical idea of SDEs ❓
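As a reminder of the core idea, the two equations at the heart of the paper are written out below. Data is diffused into noise by a forward SDE, and samples are generated by integrating the corresponding reverse-time SDE, in which the only unknown is the score ∇ₓ log p_t(x); a network s_θ(x, t) is trained to approximate it. Notation follows the paper (drift f, diffusion coefficient g, Wiener process w, reverse-time Wiener process w̄).

```latex
% Forward SDE: gradually perturbs data x(0) ~ p_0 into noise x(T) ~ p_T
\mathrm{d}\mathbf{x} = \mathbf{f}(\mathbf{x}, t)\,\mathrm{d}t + g(t)\,\mathrm{d}\mathbf{w}

% Reverse-time SDE: runs from t = T back to t = 0 and turns noise into data
\mathrm{d}\mathbf{x} = \left[\mathbf{f}(\mathbf{x}, t) - g(t)^{2}\,\nabla_{\mathbf{x}} \log p_t(\mathbf{x})\right]\mathrm{d}t + g(t)\,\mathrm{d}\bar{\mathbf{w}}

% Learned approximation of the score
\mathbf{s}_{\theta}(\mathbf{x}, t) \approx \nabla_{\mathbf{x}} \log p_t(\mathbf{x})
```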
2021 SDEdit- Guided Image Synthesis and Editing with Stochastic Differential Equations
Method behind the Stable Diffusion image-to-image pipeline in HuggingFace Diffusers.
Breakdown: Given an input image containing user guidance (unnatural artifacts, e.g. a stroke painting), generate a realistic and faithful image by adding noise to the guide and then denoising it with a stochastic differential equation (SDE) based generative model that is pre-trained on unlabeled data.
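The breakdown above can be sketched as a toy, per-pixel version of the SDEdit loop: perturb the guide to an intermediate noise level σ(t0), then integrate the reverse-time variance-exploding SDE from t0 back to 0 with Euler-Maruyama steps. Everything specific here is an assumption for illustration: the linear noise schedule, the constants, and above all `toy_score`, which is the exact score of a one-dimensional Gaussian "image distribution" and merely stands in for the pre-trained score network.

```python
import math
import random

DATA_MEAN = 0.5   # toy "realistic image" statistics (hypothetical)
DATA_STD = 0.1
SIGMA_MAX = 1.0   # noise level at t = 1 (hypothetical schedule)


def sigma(t):
    """Noise scale at time t in [0, 1]; a linear schedule chosen for simplicity."""
    return SIGMA_MAX * t


def toy_score(x, sig):
    """Score of N(DATA_MEAN, DATA_STD^2 + sig^2); placeholder for a trained network."""
    return (DATA_MEAN - x) / (DATA_STD ** 2 + sig ** 2)


def sdedit(guide, t0, num_steps=200, rng=None):
    """Noise the guide up to time t0, then denoise it back to time 0."""
    if rng is None:
        rng = random.Random(0)
    # Step 1: perturb the guide so that its artifacts are drowned in noise.
    x = [g + sigma(t0) * rng.gauss(0.0, 1.0) for g in guide]
    # Step 2: Euler-Maruyama integration of the reverse-time SDE from t0 to 0.
    for i in range(num_steps, 0, -1):
        t, t_next = t0 * i / num_steps, t0 * (i - 1) / num_steps
        dvar = sigma(t) ** 2 - sigma(t_next) ** 2  # variance removed in this step
        x = [
            xi + dvar * toy_score(xi, sigma(t)) + math.sqrt(dvar) * rng.gauss(0.0, 1.0)
            for xi in x
        ]
    return x


guide = [3.0, -2.0]  # "unrealistic" pixel values far from the data distribution
realistic = sdedit(guide, t0=1.0, rng=random.Random(1))
faithful = sdedit(guide, t0=0.0)  # no noise added, so the guide comes back unchanged
```

The starting time t0 is the knob that trades realism against faithfulness: a large t0 discards most of the guide and lands near the data distribution, while a small t0 stays close to the guide and keeps more of its artifacts.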
Cao, P., Zhou, F., Song, Q., & Yang, L. (2024). Controllable Generation with Text-to-Image Diffusion Models: A Survey.
Recent survey on controllable generation with text-to-image diffusion models.