SURF Data Repository
Makes large datasets findable and accessible
Advantages
Make your data FAIR
Easy to use
Easy to combine with our computing services
My dataset of over 50 TB must remain publicly accessible for 10 years
How does it work?
Based on the metadata you supply and, where applicable, the structure of the files and their relationships, each dataset gets its own landing page containing all information known about that dataset. Each dataset and its associated files receive permanent, unique identifiers (persistent identifiers): Digital Object Identifiers (DOIs) for datasets and EPIC PIDs for files. You can cite these identifiers in new publications.
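To illustrate how such identifiers end up in a publication, the sketch below turns a DOI or a Handle-style PID into a resolvable link. The identifiers in the usage note are made up. Treating EPIC PIDs as handles that resolve through hdl.handle.net is an assumption, not something stated on this page. The function names are hypothetical.

```python
from urllib.parse import quote

# Public resolvers. Using the Handle resolver for EPIC PIDs is an assumption.
DOI_RESOLVER = "https://doi.org/"
HANDLE_RESOLVER = "https://hdl.handle.net/"


def doi_to_url(doi: str) -> str:
    """Return a resolvable link for a DOI such as '10.1234/example'."""
    doi = doi.strip()
    # Accept the common 'doi:' prefix and drop it.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # A DOI starts with '10.' and has a suffix after the first '/'.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


def handle_to_url(handle: str) -> str:
    """Return a resolvable link for a Handle-style PID such as '21.12345/abc-123'."""
    handle = handle.strip()
    # A handle has the shape '<prefix>/<suffix>'.
    if "/" not in handle:
        raise ValueError(f"not a handle: {handle!r}")
    return HANDLE_RESOLVER + quote(handle, safe="/")
```

For example, `doi_to_url("doi:10.1234/example")` gives a link that can be placed directly in a reference list, and `handle_to_url("21.12345/abc-123")` does the same for a single file.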
You transfer your data to the Data Repository by logging in to the web interface and uploading your files. Is your dataset too large to upload via a browser? Then you can transfer it efficiently using the REST API, or indirectly via SURF's Data Archive using one of its supported protocols. If necessary, SURF's advisers will help you get the data into the repository once you have uploaded and structured it and supplied the metadata.
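A scripted upload through a REST API could look roughly like the sketch below. It is not the repository's actual API: the URL, the use of PUT, and the `Bearer` token scheme are all assumptions made for illustration, and the real values come from the service documentation. The sketch only shows the general idea of streaming a large file from disk with an access token instead of loading it into memory.

```python
import urllib.request
from pathlib import Path

# Hypothetical base URL; replace with the address from the service documentation.
API_BASE = "https://repository.example.org/api"


def build_upload_request(url: str, body, size: int, token: str) -> urllib.request.Request:
    """Build an authenticated upload request (method and auth scheme are assumptions)."""
    return urllib.request.Request(
        url,
        data=body,  # bytes or an open binary file object
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/octet-stream",
            # An explicit length lets a file object be sent without chunked encoding.
            "Content-Length": str(size),
        },
    )


def upload_file(dataset_id: str, path: Path, token: str) -> int:
    """Stream one file to a hypothetical dataset endpoint and return the HTTP status."""
    url = f"{API_BASE}/datasets/{dataset_id}/files/{path.name}"
    with path.open("rb") as fh:
        request = build_upload_request(url, fh, path.stat().st_size, token)
        # The file object is read in blocks while sending, so memory use stays small.
        with urllib.request.urlopen(request) as response:
            return response.status
```

Passing the open file object rather than its contents is the design choice that matters for multi-terabyte datasets. In practice very large files are usually also split into parts or sent with a dedicated transfer tool.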
I want to be able to use a persistent identifier to refer to my large dataset in an article
Is my dataset eligible for this service?
SURF Data Repository is designed for publishing large datasets: research data ranging from several terabytes to petabytes in total size. For smaller datasets, please refer to DANS and/or 4TU. In a European context, you can also use the data services of EUDAT or CERN.
If your group or institute would like to publish several datasets of different sizes, please inquire about the possibilities. To check whether your dataset is eligible for this service, please contact our advisers via the service desk.
SURF Data Repository is a collaboration between DANS, 4TU and SURF.
Rates
View the tariffs of this service.
SURF Services and rates 2025
Security and data protection
- This service is ISO 27001 certified
- HTTPS web interface and access tokens for REST API use
- Defined access roles
- Two copies of the data stored for redundancy