DOI: 10.18413/2518-1092-2025-10-3-0-1

METHODS AND SOFTWARE TOOLS FOR UPLOADING PUBLICATION DATA OF A SCIENTIFIC ORGANIZATION POSTED ON THE ELIBRARY.RU

Publication reports are an important part of overall reporting in a scientific organization. They are needed not only to monitor the organization's status but also to plan future activities. Publication reports include the number and list of publications in various categories, such as articles indexed in the Russian Science Citation Index, articles in journals from the HAC list, articles in journals from the "White List," and many others. Currently, the main source of input data for publication reports is the scientific electronic library eLIBRARY.RU. Lists of publications from the organization's profile on eLIBRARY.RU can be exported manually, through the API, with tools from the SCIENCE INDEX analytical system, or via web scraping. This article describes the development of algorithms and software tools for extracting and scraping page content, which retrieve titles, author lists, and journal information in spreadsheet form. Automatic saving and scraping of web pages is based on emulating a user's actions in the browser and on the use of specific HTML tags. The resulting Python scripts can be compiled into standalone executable files; compared with manual processing, they reduce the time required to export the content of eLIBRARY.RU web pages and transform it into spreadsheets by 5 minutes for every 100 articles in the search results.
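
The tag-based extraction step described in the abstract could look roughly like the sketch below. It is not the authors' code: the sample markup, the class names `result`, `title`, `authors` and `journal`, and the helper names `parse_results` and `rows_to_csv` are all assumptions made for illustration, and the real eLIBRARY.RU pages may be structured differently. The browser-emulation step that saves the pages is omitted; the sketch starts from an already saved HTML string and uses only the Python standard library.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup of a saved search-results page (an assumption;
# the real eLIBRARY.RU pages may use different tags and attributes).
SAMPLE_PAGE = """
<table>
  <tr class="result"><td>
    <b class="title">First paper title</b>
    <i class="authors">Ivanov I.I., Petrov P.P.</i>
    <span class="journal">Journal A. 2024. No. 1</span>
  </td></tr>
  <tr class="result"><td>
    <b class="title">Second paper title</b>
    <i class="authors">Sidorov S.S.</i>
    <span class="journal">Journal B. 2025. No. 3</span>
  </td></tr>
</table>
"""

FIELDS = ("title", "authors", "journal")


class ResultParser(HTMLParser):
    """Collects one dict per <tr class="result"> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # row currently being filled, if any
        self._field = None  # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "tr" and cls == "result":
            self._row = dict.fromkeys(FIELDS, "")
        elif self._row is not None and cls in FIELDS:
            self._field = cls

    def handle_data(self, data):
        # Text is kept only while inside one of the field tags.
        if self._row is not None and self._field:
            self._row[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif self._field and tag in ("b", "i", "span"):
            self._field = None


def parse_results(html):
    """Return a list of {title, authors, journal} dicts."""
    parser = ResultParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_results(SAMPLE_PAGE)
csv_text = rows_to_csv(rows)
```

Writing `csv_text` to a `.csv` file gives a table with one row per publication. Keying the parser on tag and class names keeps the extraction independent of page layout, but it also means the field mapping has to be adjusted whenever the site's markup changes.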
