CREATING A NEW GENERATIVE MODEL FOR SUPER-RESOLUTION BY COMBINING AUDIO AND IMAGES

Authors

  • Normo‘minov Akbar Kamol o‘g‘li, Tashkent University of Information Technologies named after Muhammad al-Khwarizmi

Keywords:

Super-resolution, multimodal approach, generative model, audio and image combination, neural network, fusion module, decoder, ResNet, CNN

Abstract

This paper proposes a new generative model for super-resolution (SR) built by combining audio and image data. By exploiting multimodal data, the approach makes it possible to obtain high-quality results from low-resolution images and audio. The model extracts audio and image features with deep neural networks and reconstructs a high-resolution image with the help of a fusion module. The theoretical foundations, technical details, and experimental results of the approach are discussed.
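
To make the described pipeline concrete, below is a minimal sketch of such an architecture in PyTorch: a CNN image encoder with a residual (ResNet-style) block, a CNN audio encoder over a spectrogram, a concatenation-based fusion module, and a PixelShuffle decoder. The layer sizes, spectrogram input format, and fusion strategy are illustrative assumptions, not the implementation evaluated in the paper.

```python
# Minimal PyTorch sketch of the audio + image fusion super-resolution model
# described in the abstract. Layer sizes, the spectrogram input format and the
# concatenation-based fusion are illustrative assumptions, not the published model.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """ResNet-style block used inside the image encoder."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection


class ImageEncoder(nn.Module):
    """CNN that extracts a feature map from the low-resolution image."""
    def __init__(self, channels=64):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.block = ResidualBlock(channels)

    def forward(self, x):                      # x: (B, 3, H, W)
        return self.block(torch.relu(self.stem(x)))


class AudioEncoder(nn.Module):
    """CNN over a mel-spectrogram, pooled to a single feature vector."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),           # -> (B, C, 1, 1)
        )

    def forward(self, a):                      # a: (B, 1, n_mels, T)
        return self.net(a).flatten(1)          # -> (B, C)


class FusionSRModel(nn.Module):
    """Fusion module + decoder: the audio vector is broadcast over the image
    feature grid, concatenated with the image features, and upsampled."""
    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.img_enc = ImageEncoder(channels)
        self.aud_enc = AudioEncoder(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)       # fusion module
        self.decoder = nn.Sequential(                          # decoder / upsampler
            nn.Conv2d(channels, channels * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, img, audio):
        f_img = self.img_enc(img)                              # (B, C, H, W)
        f_aud = self.aud_enc(audio)                            # (B, C)
        f_aud = f_aud[:, :, None, None].expand_as(f_img)       # broadcast over grid
        fused = torch.relu(self.fuse(torch.cat([f_img, f_aud], dim=1)))
        return self.decoder(fused)                             # (B, 3, H*scale, W*scale)


if __name__ == "__main__":
    model = FusionSRModel()
    lr_img = torch.randn(2, 3, 32, 32)         # toy low-resolution images
    spec = torch.randn(2, 1, 64, 128)          # toy mel-spectrograms
    print(model(lr_img, spec).shape)           # torch.Size([2, 3, 128, 128])
```

In a GAN-style setup such a network would play the generator role, trained with an adversarial loss alongside a pixel-wise or perceptual reconstruction loss; the exact training objective used in the paper is not specified here.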

Published

2025-02-18

How to Cite

Normo’minov, A. (2025). CREATING A NEW GENERATIVE MODEL FOR SUPER-RESOLUTION BY COMBINING AUDIO AND IMAGES. DIGITAL TRANSFORMATION AND ARTIFICIAL INTELLIGENCE, 3(1), 94–99. Retrieved from https://dtai.tsue.uz/index.php/dtai/article/view/v3i115