Research Article | Open Access
Volume 4 | Issue 2 | Year 2025 | Article Id: DST-V4I2P102 | DOI: https://doi.org/10.59232/DST-V4I2P102
Constructing a Multi-Modal Dataset for Digital Learning Feature Extraction
Bao Chau, Anh Le, Quang Cao, Thuy Nguyen, Sang Ho, Giang Ma, Hai Tran
| Received | Revised | Accepted | Published |
|---|---|---|---|
| 09 Feb 2025 | 08 Mar 2025 | 06 Apr 2025 | 30 Apr 2025 |
Citation
Bao Chau, Anh Le, Quang Cao, Thuy Nguyen, Sang Ho, Giang Ma, Hai Tran. “Constructing a Multi-Modal Dataset for Digital Learning Feature Extraction.” DS Journal of Digital Science and Technology, vol. 4, no. 2, pp. 23-42, 2025.
Abstract
The rapid advancement of technology has catalyzed the widespread adoption of online platforms, transforming communication, learning, and professional practices across diverse sectors. In education, this technological shift has spurred the integration of digital tools to enhance pedagogical methodologies. A critical component of this integration is the development of high-quality, realistic datasets to train educational models and tools. This study addresses this need by constructing VNEC2018, a standardized image dataset derived from Vietnam’s 2018 General Education Program. The dataset is designed to support and elevate student learning outcomes through accurate and representative digital resources. By adhering to the rigorous framework of the 2018 curriculum, VNEC2018 is positioned to become a benchmark resource for educational technology, facilitating the creation of robust training models and addressing the evolving demands of digital-age pedagogy.
Keywords
Digital learning materials, Graph Neural Network, General education programs, Multilabel image classification.