Collecting data on open access publications

Author: Amy Devenney (Research and Business Intelligence Strategic Lead, Jisc)

In 2019, an article by the Knowledge Exchange (KE) demonstrated how difficult it is to efficiently collate consistent article-level metadata for the monitoring and evaluation of Transitional Agreements (TAs), and the burden this placed on our members. Over the last eighteen months, we have therefore been working with publishers to collect the metadata elements recommended by KE and the Efficiency and Standards for Article Charges (ESAC). We are now working with 31 publishers who provide us with regular metadata reports on outputs under Transitional and Native open access agreements.

How do we get the data? 

Using a template created by the KE Monitoring OA Group (which repurposes the article-level metadata checklist from the article mentioned above), we have worked with these 31 publishers to implement the template as a standard reporting tool. The template identifies the key metadata that institutions and consortia need in order to monitor the volume and compliance of output, and to undertake more detailed evaluation, such as the value and equity of use of the agreements across the consortium and mission groups. As the license terms for an agreement are being arranged, we discuss with the publisher the metadata that they can provide and where it comes from, so that we can understand the publisher’s internal metadata workflow. We also agree a reporting schedule with the publisher that fits both Jisc’s reporting needs and the publisher’s ability to provide data.

What do we do with the data? 

Once we have received the data, we clean, verify, enhance and standardise it so that the dataset is complete and supports holistic, comparable analysis across all publishers. The following work is carried out:

  • DOIs: these are cleaned and de-duplicated to avoid the double counting of articles and to enable the use of external sources, matched by DOI, to enhance the data,
  • Institution name: these are standardised to the legal name and the PIDs (Ringgold and Jisc ID) are added to verify that the institution is subscribed to the agreement and to facilitate detailed analysis,
  • Currency: where the APC list price is included this is standardised to GBP to enable comparable analysis in one standard currency, 
  • Article type: these are mapped to the COAR 3.0 standard following discussion with the publisher to ensure we have a coherent article type across all publishers, 
  • License type: these are mapped to the CC BY standard to enable us to monitor compliance across publishers. 
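
As a rough illustration of the DOI and currency steps above, the following Python sketch cleans and de-duplicates DOIs and converts APC list prices to GBP. It is not the workflow Jisc actually runs: the field names (`doi`, `apc`, `currency`), the helper names and the exchange rates are all assumptions made for the example.

```python
import re

# Hypothetical exchange rates to GBP, for illustration only.
# A real workflow would take rates from an agreed source and date.
RATES_TO_GBP = {"GBP": 1.0, "EUR": 0.85, "USD": 0.80}

# Matches common resolver prefixes such as "https://doi.org/" or "doi:".
_DOI_PREFIX = re.compile(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def clean_doi(raw):
    """Strip whitespace and resolver prefixes, then lower-case the DOI.

    DOIs are case-insensitive, so lower-casing makes duplicates comparable.
    """
    return _DOI_PREFIX.sub("", raw.strip()).lower()


def to_gbp(amount, currency):
    """Convert an APC list price to GBP; return None if no price was supplied."""
    if amount is None:
        return None
    return round(amount * RATES_TO_GBP[currency.upper()], 2)


def standardise(records):
    """Return records with clean DOIs and GBP prices, keeping the first of any duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        doi = clean_doi(rec["doi"])
        if doi in seen:
            continue  # avoid double counting the same article
        seen.add(doi)
        cleaned.append(
            {**rec, "doi": doi, "apc_gbp": to_gbp(rec.get("apc"), rec.get("currency", "GBP"))}
        )
    return cleaned
```

Keeping the first occurrence of a duplicate is a simplification; in practice a duplicate may need manual review to decide which record is correct.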

We also verify that all the institutions listed in the publisher’s report are subscribed to the agreement so that we can exclude any records that have been sent in error. We then enhance the data by using Crossref to add the funders of an article and Unpaywall to show the open access (OA) status of the article. This provides us with a cleaned, verified and standardised dataset of articles accepted and published under a Jisc agreement. In 2021 we received data from 97% of publishers with a TA and 100% of publishers with a Native OA agreement.
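
The verification and enhancement steps could be sketched along the following lines. This is an illustrative outline under stated assumptions rather than a description of Jisc's actual tooling: the record field `institution_id` and all function names are hypothetical. The functions work on JSON payloads that have already been retrieved, for example from the public Crossref REST API (`https://api.crossref.org/works/{doi}`) and the Unpaywall API (`https://api.unpaywall.org/v2/{doi}?email=...`), so the example itself makes no network calls.

```python
def filter_subscribed(records, subscribed_ids):
    """Split records into those from subscribed institutions and those sent in error."""
    kept, excluded = [], []
    for rec in records:
        # "institution_id" is a hypothetical field holding the institution's PID.
        if rec["institution_id"] in subscribed_ids:
            kept.append(rec)
        else:
            excluded.append(rec)
    return kept, excluded


def funders_from_crossref(payload):
    """Extract funder names from a Crossref works response, if any are recorded."""
    funders = payload.get("message", {}).get("funder", [])
    return [f["name"] for f in funders if "name" in f]


def oa_status_from_unpaywall(payload):
    """Read the OA status (e.g. gold, hybrid, green, closed) from an Unpaywall response."""
    return payload.get("oa_status", "unknown")


def enrich(record, crossref_payload, unpaywall_payload):
    """Return a copy of the record with funders and OA status added."""
    return {
        **record,
        "funders": funders_from_crossref(crossref_payload),
        "oa_status": oa_status_from_unpaywall(unpaywall_payload),
    }
```

Returning the excluded records alongside the kept ones makes it possible to report back to the publisher which entries were out of scope.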

How the data supports the transition to open access 

Jisc is supporting higher education with the transition to OA through the negotiation of a range of agreements that open up publishing opportunities – wherever the author is based, whatever their funding situation and whichever venue they choose to publish in.

The collection, verification and analysis of the article-level metadata alongside publisher and sector data enables us to monitor the effectiveness and administrative implications of these OA agreements, and we use this evidence to inform our negotiation objectives.

Next steps 

We are exploring how we can work with the cleaned, verified and standardised dataset of articles accepted and published under a Jisc agreement to benefit the community, and currently have several strands of activity:

  • Analysing the data internally to look at year-on-year patterns and trends at the sector, band and mission group level,
  • Comparing the data with pre-TA data to understand and evaluate the impact, value and costs of these agreements, 
  • Working with the Transitional Agreements Oversight Group (TAOG) to develop a series of dashboards that will enable them to collectively evaluate the impact of TAs and inform prospective post-TA business models,
  • Working with a number of institutions on a data verification project to identify any inaccuracies and missing entries, and to evaluate the metadata fields available. 

Finally, we are also exploring ways in which this dataset could be made available to you, our members, to help your internal processes and procedures, so please get in touch if you have any thoughts or ideas.

You can also follow us on Twitter to keep up to date with Jisc open research.
