The Scholar Relation Analysis Map Using Linked Open Data on the Research Platform of the Research Information Sharing Service (RISS)
Abstract
The outbreak of the COVID-19 pandemic created opportunities for contact-free (or “contactless”) academic activities. In particular, academic activity has moved onto online platforms, and researchers increasingly expect to search for information accurately and conveniently on platforms that carry large volumes of academic information. In 1998, the Korea Education and Research Information Service (KERIS), a research institute funded by the Ministry of Education, developed a platform that collects and shares academic research information, namely, the Research Information Sharing Service (RISS). The RISS platform encompasses all domestically produced research information as well as overseas research information. The RISS consists of a union catalogue service, an inter-library loan service, the dCollection system, a research information analytics service, and an overseas theses information service. Over the past two decades, the RISS has accumulated not only domestic theses and dissertations but also various other academic data. There has also been an increasing demand for easy access to the platform data added in real time. Accordingly, using linked open data (LOD) through the RISS, the KERIS has linked its data with other data and formed a useful environment for using data to help researchers conduct their academic activities. By analyzing and processing the data produced on the RISS platform and reflecting the users’ patterns of use, the KERIS has implemented a new scholar relation analysis map (“SAM”) service. In this context, this article introduces the thesis-researcher relation analysis service, the thesis usage analysis service, the thesis-researcher impact analysis service, and the latest research trend analysis service.
Keywords:
RISS, Big Data, Research Platform, Research Information Analytics
1. Introduction
The development of information and communications technology (ICT) in the second half of the 20th century led to innovations that removed temporal and spatial barriers and raised efficiency across all areas, including the economic, social, and educational sectors. It also brought great change to research. With digitalization, paper-printed theses have been collected into databases, and various search technologies have helped researchers save time and cost. Moreover, the rapid development of artificial intelligence, cloud computing, Big Data, and mobile-related technologies has recently accelerated data collection to the extent that the data generated in a year surpass those produced over the past century. Furthermore, accurate and speedy information search has emerged as a very important issue for researchers.
Big Data has characteristics that differ from those of existing data. Data created in real time is characterized by its volume and velocity. For example, Facebook (currently, “Meta”) has produced more than 30 petabytes (PB) of data, adding over 10 terabytes (TB) to its data every day. Furthermore, data has accumulated across various forms of information, including multimedia and social network services. For research information, it has likewise become important to store the various types of information needed in the research process and to show the relations among them, rather than simply managing the research outcomes of researchers. Accordingly, the KERIS has built the Research Information Sharing Service (RISS), a platform that collects the research data related to domestic studies, and has provided correlations among the collected data. To enhance search efficiency, the KERIS has also used linked open data (LOD) so that users obtain the information related to the applicable key words rather than only the results that match specific key words. The research environment has changed considerably following the development of new technologies. Korean private research support services and large portal services, including Naver, now also provide research information systematically. Moreover, overseas information search services such as Google Scholar provide a variety of information. However, the RISS service, which includes linguistic features, is still the service that domestic researchers seek first. Therefore, it is necessary to develop research information services that meet the needs of domestic users and secure global competitiveness by utilizing new technologies such as artificial intelligence.
2. Research Information Sharing and Collection Platform
2.1 Research Information Sharing Service (RISS)
Launched in May 1998, the Research Information Sharing Service (RISS) is an open service for the people, including undergraduate and graduate students, professors, researchers, and the general public. Its purpose is to rapidly and accurately provide domestic and overseas theses and books as well as the higher education related research information, including the colleges’ open lectures, to create Korea’s national research competitive edge.
The RISS is developed and operated by the Korea Education and Research Information Service (KERIS), a research institute funded by the Ministry of Education. The KERIS is also a quasi-government organization that develops and implements policies and services by applying the information and communications technologies (ICT) across all levels of education, spanning from preschool to life-long education. The RISS consists of union catalogue service, inter-library loan service, dCollection system, research information analytics service, and information service for the overseas academic papers. First of all, the union catalogue service is a cooperative system that jointly prepares and shares the catalogues and collection information of resources and materials held by the university libraries nationwide. With all four-year universities across the nation participating in the system, its member universities provide the catalogues and collection information of the domestic and overseas resources they retain. Second, supported by the RISS, the inter-library loan service lends resources and books out to the participating universities. This service aims to improve the efficiency of loan service by reducing the relevant cost for establishing the catalogues of overlapping data held by each university library. The RISS provides integrated search for established data, thereby helping the researchers save their time and cost for collecting the relevant information. Third, the dCollection system was built to secure the latest original documents produced by individual universities and research institutes. The dCollection system automatically collects in real time diverse academic resources, including academic papers, treatises for academic journals, research reports, and digital original documents produced by universities and research institutes. The original documents this system collected are provided for various users, free of charge, through the RISS. 
Where the digital original documents are not established, the users can copy the documents or take advantage of the loan service by using the collection information of the member universities that is established through the union catalogues. Lastly, given the recently growing need for overseas academic resources, the RISS has enabled the users to search and use such overseas data. The RISS has formed partnerships with the relevant overseas organizations, and provided the integrated search and original document services for them by linking the open access (OA) data or purchasing distinguished theses from abroad. Because the overseas electronic data account for a large portion of the university library budget, budget support has been provided at the national level to reduce the cost of subscribing to the university library resources and enhance the efficiency of the service (Korea Education and Research Information Service, 2021).
As of January 2022, the RISS users can use the bibliographic information on 12.41 million cases of domestic and overseas resources. They can also check which university or organization holds the resources they want by searching collection information on 63.62 million cases. If the individual universities do not provide the digital original documents among the resources they hold, the users can use the documents through the collection information by applying for the document copy and loan services. Furthermore, to support acquisition for libraries, the RISS provides 2.77 million cases of domestic and overseas reference lists. When it comes to theses and dissertations, the users can use 1.64 million domestic theses and dissertations and 240,000 overseas theses and dissertations, free of charge, with the consent of their authors. They can also use the digital original document service for 5.77 million treatises of domestic journals, including treatises established through agreements with research institutes and treatises established through domestic research institutes in the private sector (Table 1).
The overseas electronic data are obtained and built at the national level by means of the foreign research information centers (FRICs) and the university license depending on each project. The FRIC is a project that selects 10 universities from across the nation and partially supports their expenses for purchasing academic journals by theme. By purchasing journals in the designated area of themes, then by sharing the bibliographic information, these universities provide the information for other university users through application for the original document copy service. As of January 2022, the users can use the free document copy service for approximately 37,800 types of foreign journals and 23 million cases of theses and dissertations held by FRICs through the RISS (Table 2).
The university license project is a joint use system for overseas electronic data in which the government and universities share the subscription fees of overseas electronic journals and academic databases. By supporting the fees in part or in whole and by providing the databases through the RISS, the government has narrowed the information gaps between universities and expanded the scope of available resources. The RISS members can use the overseas electronic data, free of charge, around the clock from the journals their institutions subscribe to. Even if their institutions do not subscribe to the journals, the members can freely use expensive overseas electronic data through the RISS from 4 p.m. to 9 a.m. the following day.
Since its launch in 1998, the RISS has continuously improved its service and operated in a stable manner, thereby increasing its use every year (Table 3). To enhance the service accessibility and convenience, the RISS has enabled the users to freely use the data search and original document viewing services even without logging in. In 2018, the RISS was linked to the free inter-library loan and delivery services at the National Library for the Disabled in order to meet the need of the time for strengthening the public nature of education and to achieve open innovation of public data. Given the participation of 836 public libraries and 42 university libraries in the linked RISS service, university books for the researchers with disabilities were delivered to their homes, free of charge. Furthermore, the ‘RISS voice service for theses and dissertations’ was provided for visually impaired people through the selection of the papers that reflected their needs.
Furthermore, the RISS was restructured in 2019 to upgrade its domestic and overseas data search functions and to reinforce its customized information service. The restructuring unified the domestic and overseas data searches, improved readability through a user interface redesign, and provided customized content for the users, including a recommendation service for the latest popular papers by topic.
In addition, a mobile inter-library loan function was developed to enhance mobile accessibility, and, in response to the users’ opinions, newly arrived domestic academic papers were posted on the RISS main screen. In this way, the RISS has continually upgraded its system to provide customized services in response to changing trends and user needs. The KERIS has also operated a customer center that answers inquiries within three hours to help strengthen customer response. Through the center, the KERIS has at all times listened to the users’ requests for and opinions about the services it renders, and has actively reflected them in its services. Furthermore, the KERIS has conducted public relations, including various social network service activities and events.
2.2 Union catalogue and inter-library loan services
The union catalogue service helps the domestic university libraries to increase the efficiency of preparing catalogues by jointly preparing and using catalogues of the resources they retain. Through the integrated search of the RISS, the union catalogue service supports the users to reduce their time and cost for collecting academic resources by informing them of the university libraries that hold the information they need. The number of union catalogue service members rose from 148 libraries in 1998 to 797 libraries in June 2021. The catalogue service has grown to include a majority of four-year university libraries and major special libraries across the nation as participants and users.
To improve the quality of union catalogue service, the KERIS has conducted regular training on the joint catalogue system and the integrated KORMARC bibliographic entry guidelines every year. By operating a standardization sub-committee and distributing the KERIS standard entry guidelines, the Korea Education and Research Information Service has provided practical assistance for the member libraries in preparing their catalogues.
Since it began offering the inter-library loan service based on the agreements on the joint use of library resources in 1999, the KERIS has operated the service centered on the inter-library loan member channel ‘Web Inter Library Loan (WILL)’ and the RISS as well. The number of inter-library loan service participants increased from 29 in 1999 to 617 in June 2021. This testifies to the active sharing of academic information between universities largely centered on the inter-library loan service of the RISS. Where resources are unavailable from domestic university libraries, the KERIS has continually expanded the scope of its assistance for researchers to ensure that they can be provided with the resources from the overseas institutions.
2.3 Operation of the digital research information collection system ‘dCollection’
The digital research information collection system ‘dCollection’ refers to a series of processes for collecting and distributing the research information produced by individual institutions, mostly by universities, and for providing integrated services. In some cases, individual universities establish their own systems to collect and distribute the information. In other cases, all universities build a standardized system for their common use. Since 2003, the Korea Education and Research Information Service has established and distributed the digital research information collection system for universities to use across the nation. As of 2022, 246 universities use the digital research information collection system, which collects the online theses and dissertations produced by universities nationwide. Based on open access (OA), the system collects digital research products from academic societies and related institutions, then distributes them to users for free use.
The dCollection system was first developed by the Korea Education and Research Information Service in 2003 as part of the ‘National Research Database-Building Project’ supported by the ICT Promotion Fund of then-Ministry of Information and Communications. The system then expanded its operation to an increased number of universities and improved its functions in 2004 and 2005, respectively. Beginning with four pilot universities in 2002, the system was distributed to 16 universities in 2004, then 20 universities in 2005, and 22 universities in 2006. Then, the dCollection hosting system was developed in 2007 by establishing a central server at the Korea Education and Research Information Service. Universities can use the hosting system by sharing the KERIS resources. The development of the hosting system enabled even more universities to participate in the distribution of research information resources produced by universities based on the dCollection system. The 246 universities currently using the dCollection system represent almost all universities that grant master’s or higher degrees to their graduates in Korea.
The dCollection system was developed in two ways, direct distribution and hosting, depending on where the applicable software was installed and operated. Until 2006, the dCollection system had been implemented in the way of directly installing and distributing the dCollection software at university libraries. In this case, each university prepared its own server and directly managed the system.
However, the small- and medium-sized university libraries faced difficulties managing the system directly due to the costs for labor and server installation. For this reason, the dCollection system was developed in the way of hosting to enable the university libraries to participate in the system. After the introduction of the hosting method, most four-year universities were able to use the dCollection system regardless of their conditions. Still another method of establishing the dCollection system is to build a conversion system, which is to collect the meta data through the connection and conversion with the existing systems held by the applicable institutions. This method of conversion system is used to collect data largely from the agreed-upon institutions or academic papers from the private sector (Fig. 1).
Because numerous universities use the dCollection system, the Korea Education and Research Information Service has established and operated an integrated operation center so that research outcomes could be collected and managed in a smooth manner at university libraries. Through the center, the KERIS supports a stable operation of the system at universities. By reference to cases of universities where the dCollection system was distributed in the initial phase, the KERIS prepared standard operation guidelines and has distributed the guidelines to all participating universities. Furthermore, the KERIS conducts manager education six times every year for the staff in charge of dCollection. The existing staff members receive support largely for upgraded and changed functions, while the new staff members are prepared to promptly undertake their job duties.
As of January 2022, the dCollection system collected and established approximately 1.66 million cases of theses and dissertations plus 5.74 million cases of academic papers and provided original document services through the RISS (Table 4). Through dCollection, more than 60,000 theses and papers are collected every year, and the collected theses and papers are linked to the RISS and freely available for everyone.
According to the current status of establishing a database of theses and dissertations by university, Seoul National University had the largest number of 135,040 cases, followed by Korea University (112,049 cases) and Yonsei University (83,367 cases). Regarding academic papers, Ewha Woman’s University had the largest number of 62,693 cases, followed by Korea University (40,171 cases) and Chungnam National University (26,487 cases).
The KERIS has made efforts to improve the dCollection service every year. In 2020, it restructured the classification system for theses and academic papers based on a standard classification system that integrated different classification systems, which allowed it to conduct diverse and significant analyses and to provide services under a unified classification system. The KERIS also expanded freely available academic resources in connection with periodicals and research results of related research information and distribution institutions in Korea. In this way, the KERIS has made constant efforts to secure content from various aspects in order to increase convenience for researchers and strengthen research competitiveness.
In 2021, the KERIS introduced the digital object identifier (DOI) to promote the international distribution of university-produced theses and dissertations and to increase the use and citation of academic information by enhancing access convenience. The KERIS also introduced and applied research trend analysis service (U-REKA) on a pilot basis to support research and learning activity of universities and to efficiently manage and use information on research achievements. By doing so, the KERIS prepared a basis for service support to help enable the university libraries to manage their education and research related achievements in a systematic manner, which is their major task.
3. Scholar Relation Analysis Map Using the Linked Open Data
3.1 Linked Open Data (LOD)
Linked data refers to a structured technical method that assigns Internet identifiers to data included in Web documents and provides links to associated information related to the data. As a method of publishing structured data, it aims to build a useful environment for using data by linking published data to each other. Open data refers to data that anyone can freely use, redesign, and reproduce. As a concept that encompasses the key concepts of linked data and open data alike, linked open data (hereinafter called LOD) refers to releasing data to the public according to the principles for publishing linked data. Based on the Web, which is already equipped with a huge information ecosystem as its platform, the LOD aims to establish open data that complies with a common mode of data understanding and exchange. The LOD has a data format that can be automatically processed by machine (National Information Society Agency, 2014).
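The idea of identifying data with Web identifiers and linking them can be illustrated with a minimal sketch. It stores subject-predicate-object statements in plain Python and follows a link from a thesis to its author. The `example.org` identifiers, the sample records, and the helper names are hypothetical and are not taken from the RISS; only the Dublin Core and FOAF predicate URIs are real vocabulary terms.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
# The example.org URIs are placeholders, not real RISS identifiers.
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_CREATOR = "http://purl.org/dc/terms/creator"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

TRIPLES = [
    ("http://example.org/thesis/1", DCT_TITLE, "A study of linked data"),
    ("http://example.org/thesis/1", DCT_CREATOR, "http://example.org/person/kim"),
    ("http://example.org/thesis/2", DCT_CREATOR, "http://example.org/person/kim"),
    ("http://example.org/person/kim", FOAF_NAME, "Kim"),
]


def build_index(triples):
    """Group (predicate, object) pairs by subject for quick lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def follow(index, subject, predicate):
    """Return every object linked from `subject` through `predicate`."""
    return [obj for pred, obj in index.get(subject, []) if pred == predicate]


def subjects_linking_to(triples, predicate, obj):
    """Return every subject that points at `obj` through `predicate`."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


index = build_index(TRIPLES)
author_uri = follow(index, "http://example.org/thesis/1", DCT_CREATOR)[0]
author_name = follow(index, author_uri, FOAF_NAME)[0]
same_author = subjects_linking_to(TRIPLES, DCT_CREATOR, author_uri)
```

Because the author is an identifier rather than a text string, both theses resolve to the same person, which is what lets linked records be traversed by machine.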
The Korea Education and Research Information Service began to provide research information LOD service on a pilot basis in 2013. The goal of this service is to promote the information use of academic resources held by the KERIS and to prepare a framework for providing convergence services in connection with related information. The KERIS has since released the data held by the Scholar Relation Analysis Map (“SAM”) to the public through the Web, and additionally established and published data on theses and academic papers every year.
Currently, the number of established data exceeds 3.80 million cases, including 43,796 cases of theses and dissertations, 368,790 cases of domestic academic papers, and approximately 3.42 million cases of references. With the Bibliographic Ontology (BIBO) as its prototype, the modelling of the SAM’s LOD included the Simple Knowledge Organization System (SKOS) vocabulary to express theme names, and the Friend of a Friend (FOAF) vocabulary, a machine-readable ontology, to describe author names and collection information, as well as the schema vocabulary and the KERIS vocabulary.
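As a rough illustration of how these vocabularies can be combined, the following sketch describes one thesis with BIBO, SKOS, and FOAF terms plus Dublin Core terms, and serializes it as N-Triples lines. It is not the actual SAM model: the `BASE` namespace, the record values, and the choice of properties are assumptions made for the example, and the KERIS vocabulary is not shown because its terms are not given here.

```python
# Public vocabulary namespaces.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BIBO = "http://purl.org/ontology/bibo/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"
# Hypothetical base namespace, not the real SAM namespace.
BASE = "http://example.org/sam/"


def iri(value):
    """Wrap an identifier in N-Triples IRI brackets."""
    return f"<{value}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def thesis_to_ntriples(thesis_id, title, author_id, author_name,
                       subject_id, subject_label):
    """Describe one thesis, its author, and its theme as N-Triples lines."""
    thesis = iri(BASE + "thesis/" + thesis_id)
    author = iri(BASE + "person/" + author_id)
    subject = iri(BASE + "subject/" + subject_id)
    statements = [
        (thesis, iri(RDF_TYPE), iri(BIBO + "Thesis")),
        (thesis, iri(DCTERMS + "title"), literal(title)),
        (thesis, iri(DCTERMS + "creator"), author),
        (thesis, iri(DCTERMS + "subject"), subject),
        (author, iri(RDF_TYPE), iri(FOAF + "Person")),
        (author, iri(FOAF + "name"), literal(author_name)),
        (subject, iri(RDF_TYPE), iri(SKOS + "Concept")),
        (subject, iri(SKOS + "prefLabel"), literal(subject_label)),
    ]
    return [f"{s} {p} {o} ." for s, p, o in statements]


lines = thesis_to_ntriples(
    "T1", 'A "sample" thesis', "A1", "Hong Gildong", "S1", "Education"
)
```

Printing `"\n".join(lines)` gives a small document that any N-Triples parser should be able to load.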
3.2 Scholar Relation Analysis Map (SAM)
As the research information increased following the development of information and communications technology, a few issues requiring resolution arose. First, due to an increase in the service volume resulting from the quantitative growth of research information, it became difficult for the users to search the information they need. Second, due to an increase in the research information service of similar kinds on the Web and the expanded areas of specialized information on the portal, the use of existing services decreased. Third, there emerged limitations of fragmentary research impact analysis by means of the existing Citation Index.
To resolve these issues, it is necessary to 1) endeavor to reduce the gap between accuracy and the information service the users need through the data analysis and the users’ patterns of use analysis, 2) provide a new concept for the research information service (research trend analysis) which is different from the existing research information yet in line with the character of institutions, and 3) add and introduce new assessment elements on top of the Citation Index in order to analyze complex research impacts.
To meet these needs, the KERIS implemented a new scholar relation analysis map (“SAM”) service by analyzing and processing the produced data and reflecting the users’ patterns of use (Fig. 2). Specifically, the SAM is a service that analyzes the RISS-held theses and domestic academic papers, provides the usage and impacts of theses and researchers, and analyzes and provides the research trends of the relevant year. The SAM largely consists of four services: 1) the thesis-researcher relation analysis service (Fig. 3); 2) the thesis usage analysis service; 3) the thesis-researcher impact analysis service; and 4) the latest research trend analysis service. First, the thesis-researcher relation analysis service provides the thesis relation diagram by theme and the researcher relation diagram by theme. The thesis relation diagram by theme provides the thesis network by theme by analyzing the citing/cited relations of theses. This service also provides the citing/cited relation diagram of individual theses and the researcher relation diagram. The researcher relation diagram by theme provides the co-researcher/quasi-researcher network by theme by analyzing the research results of the researchers. Second, the thesis usage analysis service assesses thesis usage by calculating statistics on the RISS users’ patterns of use, including original document downloading, storing in the user’s library, sending, and social network service (SNS) sharing. Third, the thesis-researcher impact analysis service assesses the impacts of theses by theme by calculating the number of times the theses are cited. This service also assesses each researcher’s share of theses relative to the total theses, based on the number of their theses by theme. Lastly, the latest research trend analysis service provides a total of four reports (Table 5).
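The relation and impact analyses described above can be sketched in a few lines. This is an illustrative reading, not the SAM implementation: the input shapes (citation pairs, author lists per paper, one theme per paper) and all function names are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def citation_network(citations):
    """Build citing and cited relations from (citing_id, cited_id) pairs."""
    cites = defaultdict(set)
    cited_by = defaultdict(set)
    for citing, cited in citations:
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by


def cited_counts(cited_by):
    """Number of distinct papers citing each paper (a simple impact value)."""
    return {paper: len(citers) for paper, citers in cited_by.items()}


def coauthor_network(authors_by_paper):
    """Count how many papers each pair of researchers wrote together."""
    weights = Counter()
    for authors in authors_by_paper.values():
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return weights


def researcher_share(authors_by_paper, theme_by_paper, theme):
    """Share of a theme's papers that each researcher authored."""
    papers = [p for p, t in theme_by_paper.items() if t == theme]
    if not papers:
        return {}
    counts = Counter(
        author for p in papers for author in set(authors_by_paper.get(p, []))
    )
    return {author: n / len(papers) for author, n in counts.items()}


citations = [("p2", "p1"), ("p3", "p1"), ("p3", "p2")]
authors_by_paper = {"p1": ["Kim", "Lee"], "p2": ["Lee", "Kim", "Park"], "p3": ["Park"]}
theme_by_paper = {"p1": "education", "p2": "education", "p3": "medicine"}

cites, cited_by = citation_network(citations)
impact = cited_counts(cited_by)
pairs = coauthor_network(authors_by_paper)
share = researcher_share(authors_by_paper, theme_by_paper, "education")
```

The two dictionaries returned by `citation_network` correspond to the citing and cited directions of the relation diagram, and the pair counts give edge weights for a co-researcher network.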
The scholar relation analysis map (“SAM”) has the following scope of data for its analysis. To analyze the research trends, the SAM compares and analyzes domestic dissertations published in the latest year and the previous year and treatises of domestic journals posted in the Korea Citation Index (KCI), then identifies the trend analytical results of the latest year regarding statistics on the RISS users’ data use, the Korean Decimal Classification (KDC)/Dewey Decimal Classification (DDC), and the author key words. To analyze the scholar relation diagram and impacts, the SAM takes domestic dissertations published since 2015 as the scope of its service. As for treatises of domestic journals, the scope is limited to papers posted in the KCI among the domestic journal treatises published since 2014.
General researchers and universities/research institutes can make use of the SAM data in selecting the papers best suited to their research activities or in selecting research areas based on the evaluation data of researchers and the latest research trends. Developers or other related business providers can implement new services using the data. The usage value currently used as the standard for analysis in the SAM is a relative value that adds up statistics on the RISS users’ data use, including original document download cases and copy/loan application cases. The value is calculated as follows. First, add up the RISS statistics on the use of each thesis. Then, to obtain a relative value, calculate the standard deviation of the total RISS usage statistics, employing the median value. Lastly, calculate the normalized value of the applicable data and its usage value on the distribution chart.
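The text does not give the exact formula, so the following is only one possible reading of the three steps: sum the usage counts per thesis, measure the spread of the totals around the median instead of the mean, and express each total as a distance from the median in units of that spread. The record layout and the function names are hypothetical.

```python
import math
from statistics import median


def usage_totals(stats):
    """Step 1: add up every usage count (downloads, copy/loan, ...) per thesis."""
    return {thesis_id: sum(counts.values()) for thesis_id, counts in stats.items()}


def deviation_around_median(values):
    """Step 2 (assumed reading): root mean squared distance from the median."""
    center = median(values)
    return math.sqrt(sum((v - center) ** 2 for v in values) / len(values))


def normalized_usage(stats):
    """Step 3: express each total relative to the median and the spread."""
    totals = usage_totals(stats)
    values = list(totals.values())
    if not values:
        return {}
    center = median(values)
    spread = deviation_around_median(values)
    if spread == 0:
        # Every thesis was used equally often, so no thesis stands out.
        return {thesis_id: 0.0 for thesis_id in totals}
    return {thesis_id: (total - center) / spread for thesis_id, total in totals.items()}


stats = {
    "a": {"download": 1, "copy_loan": 0},
    "b": {"download": 2, "copy_loan": 1},
    "c": {"download": 4, "copy_loan": 1},
}
scores = normalized_usage(stats)
```

Centering on the median keeps a few very heavily used theses from shifting the reference point, which may be why the median is mentioned; that motivation is an inference, not something the text states.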
According to the results of the SAM analysis, the most-studied key words in 2021 were COVID-19, coronavirus 19, depression, artificial intelligence, and depressed, in descending order (Fig. 3). The most-used key words were COVID-19, coronavirus 19, depressed, self-efficacy, and social support, in that order (Fig. 4).
Such key words as COVID-19, stress, resilience, and meta-analysis newly emerged as research key words that attracted significant interest in 2021. Mental health, self-efficacy, stress, text mining, and case study newly surfaced as highly used key words. Meanwhile, such key words as North Korea, performance, stroke, turnover intention, and the Internet of Things (IoT) dropped out of the Top 50 Research Chart because studies on these key words decreased compared to the previous year. Likewise, communication, emotional intelligence, empathy, trust, and child care teachers dropped out of the Top 50 Usage Chart because the data use of these key words fell compared to the previous year. While the data on nurses, nursing students, organizational commitment, job stress, and stress were frequently used, the relevant studies were relatively insufficient. The analysis therefore suggests that further studies on these topics need to be conducted more actively.
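The year-over-year comparison behind such charts can be sketched as follows. The counts are made-up sample values, the function names are hypothetical, and treating "frequently used but insufficiently studied" as "in the usage chart but not in the research chart" is an assumption made for the example.

```python
def top_keywords(counts, n):
    """Return the n most frequent key words, ties broken alphabetically."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [keyword for keyword, _ in ranked[:n]]


def chart_changes(previous, latest, n=50):
    """Key words that entered or left the top-n chart between two years."""
    previous_top = top_keywords(previous, n)
    latest_top = top_keywords(latest, n)
    entered = [k for k in latest_top if k not in previous_top]
    dropped = [k for k in previous_top if k not in latest_top]
    return entered, dropped


def used_but_understudied(research_counts, usage_counts, n=50):
    """Key words in the usage chart that are missing from the research chart."""
    research_top = set(top_keywords(research_counts, n))
    return [k for k in top_keywords(usage_counts, n) if k not in research_top]


# Made-up sample counts; a chart size of 2 keeps the example readable.
research_2020 = {"north korea": 9, "stress": 5, "iot": 4}
research_2021 = {"covid-19": 12, "stress": 8, "north korea": 2}
usage_2021 = {"covid-19": 300, "nurses": 250, "stress": 100}

entered, dropped = chart_changes(research_2020, research_2021, n=2)
gap = used_but_understudied(research_2021, usage_2021, n=2)
```

With the real charts, `n` would be 50 and the counts would come from the author key words of the latest and previous years.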
References
- Korea Education and Research Information Service. (2021). 2021 White paper on ICT in education (pp. 287-295). Korea Education and Research Information Service.
- Mavaluru, D., Shriram, R., & Sugumaran, V. (2014). Big data analytics in information retrieval: promise and potential. In Proceedings of 08th IRF International Conference. Bengaluru, India. https://www.digitalxplore.org/up_proc/pdf/87-140479834241-46.pdf
- National Information Society Agency. (2014). 2014 Linked Open Data-Building Casebook in Korea (pp. 6-9). National Information Society Agency.
- RISS. (2022). RISS homepage. Retrieved from http://www.riss.kr
- RISS. (2022). RISS SAM homepage. Retrieved from http://sam.riss.kr
- Wenige, L., & Ruhland, J. (2018). Retrieval by recommendation: using LOD technologies to improve digital library search. International Journal on Digital Libraries, 19(2), 253-269. https://doi.org/10.1007/s00799-017-0224-8
Sanghyun Jang is a director at KERIS, where he is in charge of the department of research and higher education using ICT. His major is computer engineering. He currently works as an adjunct professor at Kyungpook National University. His current research interests focus on adapting AI and big data technologies for research and education.