METHODS AND SOFTWARE TOOLS FOR UPLOADING PUBLICATION DATA OF A SCIENTIFIC ORGANIZATION POSTED ON THE ELIBRARY.RU
Publication reports are an important part of the overall reporting process in a scientific organization. They are needed not only to monitor the organization's status but also to plan future activities. Publication reports contain counts and lists of publications in various categories, such as articles indexed in the Russian Science Citation Index, articles in journals from the Higher Attestation Commission (HAC) list, articles in journals from the "White List," and many others. Currently, the main source of input data for publication reports is the scientific electronic library eLIBRARY.RU. Lists of publications from the organization's profile on eLIBRARY.RU can be exported manually, through the API, with tools from the SCIENCE INDEX analytical system, or by web scraping. This article describes the development of algorithms and software tools for extracting and scraping page content, which retrieve titles, author lists, and journal information in spreadsheet form. Automatic saving and scraping of web pages relies on emulating a user's actions in the browser and on locating specific HTML tags. The resulting Python scripts can be compiled into standalone executable files. Compared with manual processing, they reduce the time needed to export the content of eLIBRARY.RU web pages and convert it into spreadsheets by 5 minutes for every 100 articles in the search results.
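The scraping step described above, turning a saved results page into spreadsheet rows by locating specific HTML tags, can be illustrated with a minimal sketch. It is not the article's implementation: the markup in `SAMPLE_HTML` (one `<tr class="article">` per publication, title in `<b>`, authors in `<i>`, journal details in `<span class="journal">`) is a hypothetical stand-in for the real eLIBRARY.RU page structure, the names `ArticleParser`, `parse_articles`, and `to_csv` are invented for illustration, and Python's standard-library `html.parser` and `csv` modules are used here in place of whatever parsing and spreadsheet libraries the actual scripts rely on.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup: the real eLIBRARY.RU pages may be structured differently.
SAMPLE_HTML = """
<table>
  <tr class="article"><td><b>Title one</b><i>Ivanov I.I., Petrov P.P.</i>
      <span class="journal">Journal A. 2024. No. 1. P. 1-10</span></td></tr>
  <tr class="article"><td><b>Title two</b><i>Sidorov S.S.</i>
      <span class="journal">Journal B. 2023. No. 4. P. 20-31</span></td></tr>
</table>
"""


class ArticleParser(HTMLParser):
    """Collect title, authors, and journal text for each article row."""

    # Assumed mapping from tag name to the spreadsheet column it feeds.
    FIELD_BY_TAG = {"b": "title", "i": "authors", "span": "journal"}

    def __init__(self):
        super().__init__()
        self.rows = []      # finished records
        self._row = None    # record being filled, or None outside a row
        self._field = None  # column currently receiving text

    def handle_starttag(self, tag, attrs):
        if tag == "tr" and ("class", "article") in attrs:
            self._row = {"title": "", "authors": "", "journal": ""}
        elif self._row is not None and tag in self.FIELD_BY_TAG:
            self._field = self.FIELD_BY_TAG[tag]

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag in self.FIELD_BY_TAG:
            self._field = None

    def handle_data(self, data):
        # Text outside a recognized field (e.g. whitespace) is ignored.
        if self._row is not None and self._field is not None:
            self._row[self._field] += data.strip()


def parse_articles(html):
    """Return a list of dicts, one per article row found in the HTML."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def to_csv(rows):
    """Serialize the records as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "authors", "journal"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_articles(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

In practice the HTML would come from pages saved by the browser-emulation step rather than from an inline string, and the tag-to-column mapping would have to be adjusted to the tags actually present on those pages.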
Reznichenko O.S. Methods and software tools for uploading publication data of a scientific organization posted on the eLIBRARY.RU // Research result. Information technologies. Vol. 10, No. 3, 2025. P. 3-19. DOI: 10.18413/2518-1092-2025-10-3-0-1