Recognizing Semantic Patterns and Extracting Top Topics for the Thesaurus of Information Science and Epistemology by Relying on Advanced Text Processing and Content Analysis Techniques

HajiZeinolabedini, Mohsen; Keshavarz, Hamid; Zamani Kalajahi, Mahnam

doi:10.30473/mrs.2024.11582


Digital and Smart Library Research (پژوهش های کتابخانه های دیجیتالی و هوشمند)
Volume 11, Issue 3 (Serial No. 42) - Serial No. 41, Aban 1403 (October-November 2024), pages 47-60
Digital Object Identifier (DOI): 10.30473/mrs.2024.11582
Authors
Mohsen HajiZeinolabedini¹; Hamid Keshavarz²; Mahnam Zamani Kalajahi³
¹Assistant Professor, Department of Knowledge and Information Science, Shahid Beheshti University, Tehran, Iran.
²Assistant Professor, Department of Knowledge and Information Science, Shahid Beheshti University, Tehran, Iran.
³MSc Student, Department of Knowledge and Information Science, Shahid Beheshti University, Tehran, Iran.
Abstract
In the age of information explosion, the field of knowledge and information science seeks to simplify and improve the process of thesaurus construction. This goal is pursued with text mining techniques and machine learning algorithms. The proposed approach automatically extracts topics from unstructured text data and identifies key concepts in knowledge and information science. The main aim of this research is to improve and extend the thesaurus with a focus on text mining techniques, an approach that facilitates information retrieval and simplifies thesaurus construction. The methodology comprises several main steps. First, abstracts of articles related to knowledge and information science were collected from the Web of Science citation database for the period 1968-2022. The data were preprocessed in Python to remove unnecessary characters and symbols. Then the TextRank algorithm was applied, using the Pandas and NLTK libraries, to discover latent topics in the texts. This iterative process led to the identification of top topics in the subject area. Finally, the effectiveness of the proposed approach was evaluated, and the top terms were selected, by comparing the results with the existing manually built thesaurus and by examining the criteria of topic coherence and topical coverage. In this way the method drew on a large body of data to extract key topics in knowledge and information science. The findings show that text mining techniques and the TextRank algorithm made it possible to extract key topics and select top topics. The results identify 17 main topics in knowledge and information science: archives and information centers, artificial intelligence, bibliography, classification, collection development, controlled vocabulary, digital libraries, information organization, information retrieval and data extraction, information science and librarianship, information systems and resources, knowledge management, libraries and community services, metadata, reference services, subject headings, and scientometrics. This list of top topics represents the key concepts of the field and can serve as a basis for developing the thesaurus and improving information retrieval.
Using text mining methods and advanced algorithms, this research extracted and proposed key topics for top terms through detailed analysis of textual sources.
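The preprocessing and TextRank steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the study reports using Pandas and NLTK, whereas this sketch uses only the Python standard library so that it runs without any downloads. The stopword list, the window size, the function names (`preprocess`, `build_graph`, `textrank`, `top_terms`) and the three sample abstracts are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: a tiny illustrative stopword list. The study used NLTK, whose
# stopword corpus would normally take the place of this set.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is", "are",
             "on", "with", "by", "this", "that", "as", "from"}


def preprocess(text):
    """Lowercase, replace non-letter characters with spaces, drop stopwords."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in cleaned.split()
            if tok not in STOPWORDS and len(tok) > 2]


def build_graph(token_lists, window=3):
    """Undirected co-occurrence graph over words.

    Two words are linked when they fall inside the same sliding window of
    `window` tokens within one abstract (never across abstracts).
    """
    graph = defaultdict(set)
    for tokens in token_lists:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1:i + window]:
                if other != word:
                    graph[word].add(other)
                    graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Run a fixed number of PageRank-style updates on the word graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        updated = {}
        for node in graph:
            # Each neighbour shares its score equally among its own links.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            updated[node] = (1 - damping) + damping * rank
        scores = updated
    return scores


def top_terms(abstracts, k=5):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    graph = build_graph([preprocess(a) for a in abstracts])
    scores = textrank(graph)
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


# Hypothetical sample data, invented for illustration only.
sample_abstracts = [
    "Information retrieval in digital libraries.",
    "Metadata supports information retrieval!",
    "Knowledge management and information organization (2022).",
]
print(top_terms(sample_abstracts, k=3))
```

In a fuller pipeline the abstracts would presumably be loaded from the Web of Science export (for example into a Pandas DataFrame) and the ranked words merged into multi-word candidate terms before comparison with the thesaurus; neither step is shown here because the abstract does not describe them in detail.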
Keywords
Thesaurus; Knowledge and Information Science; Text Mining; Top Topics
Article Title [English]
Recognizing Semantic Patterns and Extracting Top Topics for the Thesaurus of Information Science and Epistemology by Relying on Advanced Text Processing and Content Analysis Techniques
Authors [English]
Mohsen HajiZeinolabedini¹; Hamid Keshavarz²; Mahnam Zamani Kalajahi³
¹Assistant Professor, Department of Knowledge and Information Science, Shahid Beheshti University, Tehran, Iran.
²Assistant Professor, Department of Knowledge and Information Science, Shahid Beheshti University, Tehran, Iran.
³MSc Student, Department of Knowledge and Information Science, Shahid Beheshti University, Tehran, Iran.
Abstract [English]
In the age of information explosion, the field of knowledge and information science seeks to simplify and improve the thesaurus construction process. This goal is pursued using text mining techniques and machine learning algorithms. The proposed approach automatically extracts topics from unstructured text data and identifies key concepts in the field of knowledge and information science. The main goal of this research is to improve and develop the thesaurus by focusing on text mining techniques, an approach that facilitates information retrieval and simplifies thesaurus construction. The study comprises several main steps. First, abstracts of articles related to knowledge and information science were collected from the Web of Science citation database for the period 1968-2022. The data were preprocessed in Python to remove unnecessary characters and symbols. Then the TextRank algorithm was applied, using the Pandas and NLTK libraries, to discover latent topics in the texts. This iterative process led to the identification of top topics in the subject area. Finally, the effectiveness of the proposed approach was evaluated and the top terms were selected by comparing the results with the existing manually built thesaurus and examining the criteria of topic coherence and topical coverage. In this way the method drew on a large body of data to extract key topics in the field. The results identify 17 main topics in knowledge and information science.
These topics are archives and information centers, artificial intelligence, bibliography, classification, collection development, controlled vocabulary, digital libraries, information organization, information retrieval and data extraction, information science and librarianship, information systems and resources, knowledge management, libraries and community services, metadata, reference services, subject headings, and scientometrics. This list of top topics represents key concepts in the field of knowledge and information science and can be used as a basis for developing a thesaurus and improving the information retrieval process. Using text mining methods and advanced algorithms, this research extracted and proposed key topics for top terms through detailed analysis of textual sources.
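The abstract mentions comparing the extracted terms with the existing manual thesaurus and checking topical coverage, but it does not define the measure. One simple way such a check could look is sketched below, under the assumption that coverage means the share of thesaurus top terms that also appear among the extracted terms after case and whitespace normalization. The function names and both term lists are hypothetical and chosen only to illustrate the idea.

```python
def normalize(term):
    """Lowercase a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def topic_coverage(extracted_terms, thesaurus_terms):
    """Return (coverage, matched) for extracted terms against a thesaurus.

    Assumption: coverage is the fraction of thesaurus terms that are also
    present among the extracted terms; the paper may define it differently.
    """
    extracted = {normalize(t) for t in extracted_terms}
    reference = {normalize(t) for t in thesaurus_terms}
    if not reference:
        return 0.0, set()
    matched = extracted & reference
    return len(matched) / len(reference), matched


# Hypothetical inputs, invented for illustration only.
extracted = ["Metadata", "knowledge  management", "Scientometrics"]
manual_thesaurus = ["metadata", "Knowledge Management",
                    "classification", "reference services"]

coverage, matched = topic_coverage(extracted, manual_thesaurus)
print(f"coverage = {coverage:.2f}, matched = {sorted(matched)}")
```

Exact string matching is deliberately strict; a real comparison would likely also need to handle synonyms and spelling variants, which is part of what a thesaurus encodes.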
Keywords [English]
Thesaurus, Information Science and Knowledge, Text Mining, Top Topics
