CORE, the world’s largest collection of open access research papers, turns ten
It all started in 2010, when Dr. Petr Knoth, then a PhD student at the Knowledge Media Institute at the Open University, wanted to collect a large corpus of academic papers to explore related research content. It was a frustrating job: he realised not only that there was no readily available corpus of all research papers, but also that collecting this information for machine processing was particularly difficult. While reading about Open Access, he came up with the idea of creating a tool that harvests both metadata and full text from all research repositories on a global scale, enabling unrestricted access to all content.
He called the project COnnecting REpositories (CORE), signifying the aim of connecting the content that was semantically related across the distributed network of repositories.
It was the first collection to bring together both metadata records and full text content of research papers from repositories across the world, making it much easier to text and data-mine large quantities of research documents.
Understanding that no one can solve every problem in the area of mining scientific literature, Dr. Knoth concentrated on making it easier for others to kick-start their projects. While machine access to CORE data is critically important to those developing new tools for research, especially those utilising Natural Language Processing and Artificial Intelligence, the benefits CORE delivers now go much further than that. CORE serves researchers, the general public, funders, HEIs, repositories and the open access movement.
Building CORE involved its fair share of out-of-hours work before any grant funding was secured. When Dr. Knoth was asked to identify the most challenging moments of CORE, he said:
“The most important lesson is to never give up. We have been through difficult times and had to risk a lot. Endurance and resilience are most important.”
A pivotal moment in CORE’s history was winning the first CORE grant from Jisc in 2011. Following that, in 2015 CORE won a tender for the UK National Aggregator, and shortly afterwards it transitioned from a research project and was launched as a scholarly infrastructure service.
Over the last ten years, CORE has become an intrinsic part of the open research infrastructure, providing the following services:
- CORE API
- CORE Dataset
- CORE FastSync
- CORE Recommender
- CORE Discovery
- CORE Repository Dashboard
- CORE Repository Edition
These services address the needs of a wide range of stakeholders: researchers, developers, text and data miners, enterprises, librarians, research support administrators, higher education institutions, lifelong learners, and funders.
There have been plenty of achievements so far. In February 2020, CORE reached 10 million monthly active users, and just four months later, in June, that number hit 20 million. Incredibly, in its 10th year CORE is proud to celebrate unprecedented usage levels, with 50 million monthly active users in April 2021. CORE is now among the 1,068 most used websites globally by Alexa Global Rank and ranks 7th globally in the Science and Education category by SimilarWeb. (See more on CORE’s growth on the CORE blog.)
Asked about CORE’s 10-year anniversary, Dr. Petr Knoth says:
“We have been on an incredible journey which has felt like a sprint to me, so it is hard to believe that it has been 10 years already. We have a very clear vision about where we want to take CORE and I feel immensely honoured to have been given the opportunity to work with all the super talented people around me on delivering it.”
Liz Ball, Director of Open Access Services at Jisc, says:
“For 10 years, CORE has played an important role in the open research infrastructure landscape. We are delighted that CORE has reached this milestone and look forward to continued collaboration with the CORE team in realising the full benefits of CORE as a global aggregation service.”
The Jisc and Open University partnership has resulted in the delivery of the largest global aggregator of full text and metadata for OA research papers. The services CORE offers on top of this data now enable and power a rich set of use cases as diverse as fact checking, detection of misinformation, metadata enhancement, analysis of research trends, plagiarism detection, compliance monitoring and research impact evaluation. Most recently, CORE was able to provide an aggregated view of REF compliance at national level to support REF 2021 compliance monitoring for Research England.
Balviar Notay, Product Manager at Jisc, states:
“As the world’s largest aggregator of open access content, we should further explore more opportunities to exploit how this aggregation can provide economies of scale and reduce administrative burden and support workflows.”
CORE is a fast-evolving service, and in some ways we are unrecognisable from our beginnings. For starters, "CORE: COnnecting REpositories" has been shortened to just CORE to reflect our wider remit. The one-man band has turned into a highly skilled multinational team, our algorithms have evolved rapidly, and our infrastructure is more solid. All of this is reflected in the vast growth in user numbers.
At its heart CORE remains focussed on our mission to aggregate all open access research worldwide and deliver unrestricted access for all. In doing so, we:
- enrich scholarly data using state-of-the-art text and data mining technologies to aid discoverability,
- enable others to develop new tools and use cases on top of the CORE platform,
- support the network of open access repositories and journals with innovative technical solutions, and
- facilitate a scalable, cost-effective route for the delivery of open scholarship.
We are proud of all our accomplishments so far, and we are working hard to achieve many more. We are a mission-driven organisation, and we look forward to delivering on our goals in the years ahead!
The CORE team