[toc]

## Introduction

Machine learning (ML) models produced by researchers are considered **research output**, just like more traditional research outputs such as journal articles, conference papers, and book chapters. This means that when creating and sharing machine learning models, researchers need to fulfil funder and institutional requirements.

For example, researchers need to ensure that all their work is covered by ethical approval in projects where such approval applies (see guidelines from, e.g., the [Swedish Research Council](https://www.vr.se/english/applying-for-funding/requirements-terms-and-conditions/conducting-ethical-research.html) and the [ERC](https://erc.europa.eu/manage-your-project/ethics-guidance)). In addition, their research output needs to meet open access publication requirements (e.g., [Swedish Research Council](https://www.vr.se/english/applying-for-funding/requirements-terms-and-conditions/publishing-open-access.html), [ERC](https://erc.europa.eu/manage-your-project/open-science)), open data requirements (e.g., [Swedish Research Council](https://www.vr.se/english/mandates/open-science/open-access-to-research-data/vision-and-guiding-principles.html), [ERC](https://erc.europa.eu/manage-your-project/open-science)), and requirements on open analysis workflows and code (e.g., [National guidelines for promoting open science in Sweden](https://www.kb.se/samverkan-och-utveckling/nytt-fran-kb/nyheter-samverkan-och-utveckling/2024-01-15-national-guidelines-for-promoting-open-science-in-sweden.html)).

To meet these requirements, European and Swedish funders and universities currently recommend adhering to FAIR principles (e.g., [Swedish Research Council: Making research data accessible and FAIR criteria](https://www.vr.se/english/mandates/open-science/open-access-to-research-data/support-and-tools-/making-research-data-accessible-and-fair.html)).

## FAIR ML models

FAIR (Findable, Accessible, Interoperable, Reusable) is a set of principles originally written for research data (see [Wilkinson et al 2016](https://doi.org/10.1038/sdata.2016.18)) and since extended to other kinds of research output (see [Baker et al 2022](https://doi.org/10.1038/s41597-022-01710-x), [Patel et al 2023](https://doi.org/10.1038/s41597-023-02463-x)). There is no single way to 'make something FAIR'; instead, research output can adhere to the FAIR principles to different extents and in different ways.

The long-term goal of SciLifeLab Serve is to allow Swedish researchers to meet funder requirements on FAIR principles when sharing models without any extra work; in other words, everything should be done for you automatically when you share your model on SciLifeLab Serve. In the meantime, there are some things that researchers can do themselves. On this page, we recommend basic steps researchers can take to adhere to FAIR principles to a reasonable extent when sharing machine learning models.

### Meeting FAIR requirements in applications with ML models

Currently, researchers can share their machine learning models through SciLifeLab Serve by turning them into independent applications. We have [guidelines for how to do it here](https://serve.scilifelab.se/docs/model-serving/options/). We also have a separate page describing how applications (including machine learning applications) [can meet FAIR requirements](https://serve.scilifelab.se/docs/application-hosting/fair/). All models shared on SciLifeLab Serve should aim to fulfil the requirements described there as a starting point. Below, we provide additional recommendations that are specific to ML models.

### Additional suggestions specific to ML models

For machine learning models specifically, researchers should also put extra effort into describing their models, so that the descriptions contain all relevant and necessary information. Good descriptions (metadata) are one of the pillars of FAIR.

## Open ML models

As mentioned in the Introduction, funders and institutions require open sharing of research output. Research projects using machine learning models make use of and create many artifacts, and all of these components need to be taken into account when considering the funder and institutional requirements. We at SciLifeLab Serve endorse the *Model Openness Framework* (MOF, [White et al 2024](https://doi.org/10.48550/arXiv.2403.13784)), developed by researchers at the [Linux Foundation](https://www.linuxfoundation.org/) and elsewhere.

The Model Openness Framework identifies 17 components that researchers developing machine learning models can share, as well as the appropriate licenses under which each component should be shared. Models that share all components under the expected licenses meet the criteria to be classified as *Open Science* models in MOF. There is also the [Model Openness Tool](https://isitopen.ai/), where researchers can add their own models or get an overview of how other models are classified in MOF.

While the Model Openness Framework is designed for deep learning artifacts and does not transfer directly to every form of learning in AI, we think it is a great starting point for any ML researcher wishing to share their models. We recommend that researchers strive to share as many components from the list as possible. The long-term goal of SciLifeLab Serve is to make sharing all these components as easy as possible.

### Model Openness Framework Components and License

Source: *Table 2* of [White et al 2024](https://doi.org/10.48550/arXiv.2403.13784)

| Component | Domain | Content Type | Accepted Open License |
| --- | --- | --- | --- |
| Datasets | Data | Data | Preferred: [CDLA-Permissive-2.0](https://cdla.dev/permissive-2-0/), [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Acceptable: Any including unlicensed |
| Data Preprocessing Code | Data | Code | Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit) |
| Model Architecture | Model | Code | Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit) |
| Model Parameters | Model | Data | Preferred: [CDLA-Permissive-2.0](https://cdla.dev/permissive-2-0/). Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit), Permissive Open Data Licenses |
| Model Metadata | Model | Data | Preferred: [CDLA-Permissive-2.0](https://cdla.dev/permissive-2-0/). Acceptable: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en), Permissive Open Data Licenses |
| Training Code | Model | Code | Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit) |
| Inference Code | Model | Code | Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit) |
| Evaluation Code | Model | Code | Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit) |
| Evaluation Data | Model | Data | Preferred: [CDLA-Permissive-2.0](https://cdla.dev/permissive-2-0/). Acceptable: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en), Permissive Open Data Licenses |
| Evaluation Results | Model | Documentation | Preferred: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Acceptable: Permissive Open Content Licenses |
| Supporting Libraries & Tools | Model | Code | Acceptable: [OSI-approved](https://opensource.org/licenses), e.g., [The MIT License](https://opensource.org/license/mit) |
| Model Card | Model | Documentation | Preferred: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Acceptable: Permissive Open Content Licenses |
| Data Card | Data | Documentation | Preferred: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Acceptable: Permissive Open Content Licenses |
| Technical Report | Model & Data | Documentation | Preferred: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Acceptable: Permissive Open Content Licenses |
| Research Paper | Model & Data | Documentation | Preferred: [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/deed.en). Acceptable: Permissive Open Content Licenses |
| Sample Model Outputs | Model | Data or Code | Unlicensed |
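
As a rough illustration of how the table can be used as a checklist, the sketch below lists the component names from the table and reports which ones a project has shared and which are still missing. This is not an official MOF tool and it does not assign an MOF class; the function name `openness_checklist` and the input format (a mapping from component name to license) are assumptions made for this example only. For an actual classification, use the Model Openness Tool.

```python
# Component names copied from the table above.
MOF_COMPONENTS = [
    "Datasets",
    "Data Preprocessing Code",
    "Model Architecture",
    "Model Parameters",
    "Model Metadata",
    "Training Code",
    "Inference Code",
    "Evaluation Code",
    "Evaluation Data",
    "Evaluation Results",
    "Supporting Libraries & Tools",
    "Model Card",
    "Data Card",
    "Technical Report",
    "Research Paper",
    "Sample Model Outputs",
]


def openness_checklist(shared: dict[str, str]) -> dict[str, list[str]]:
    """Split the table's components into shared and missing ones.

    `shared` maps a component name to the license it is released under.
    This input format is a hypothetical convention, not part of MOF.
    """
    unknown = sorted(set(shared) - set(MOF_COMPONENTS))
    if unknown:
        raise ValueError(f"Not a component in the table: {unknown}")
    return {
        "shared": [c for c in MOF_COMPONENTS if c in shared],
        "missing": [c for c in MOF_COMPONENTS if c not in shared],
    }


# Made-up example project that has shared three components so far.
example = {
    "Model Architecture": "MIT",
    "Model Parameters": "CDLA-Permissive-2.0",
    "Model Card": "CC-BY-4.0",
}

report = openness_checklist(example)
print(f"{len(report['shared'])} of {len(MOF_COMPONENTS)} listed components shared")
for name in report["missing"]:
    print("missing:", name)
```

The sketch only checks whether a component is present; whether the chosen license is acceptable for that component still has to be checked against the last column of the table.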

### DOME recommendations for reporting ML-based analyses

*DOME (Data, Optimization, Model and Evaluation)* is a set of community-wide recommendations for reporting supervised machine learning–based analyses applied to biological studies ([Walsh et al 2021](https://doi.org/10.1038/s41592-021-01205-4)). The goal behind DOME recommendations is to improve machine learning assessment and reproducibility. These recommendations were developed primarily for the case of supervised learning in biological applications in the absence of direct experimental validation, as this is the most common type of ML approach used in biology. Since their publication, DOME recommendations have been increasingly adopted by the community, including some journals requiring descriptions according to DOME. There is also a [DOME registry](https://registry.dome-ml.org/intro) website where researchers can add their models.

We recommend that researchers familiarize themselves with the [DOME recommendations](https://dome-ml.org/) and follow them when describing their models in their papers as well as on SciLifeLab Serve.
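
One possible way to keep a model description organized along the four DOME sections is sketched below. The `DomeDescription` class and its methods are hypothetical helpers written for this example; they are not part of the DOME recommendations, the DOME registry, or SciLifeLab Serve. The sketch only uses the four top-level section names; the detailed questions to answer within each section are given in the DOME recommendations themselves.

```python
from dataclasses import dataclass, fields


@dataclass
class DomeDescription:
    """Free-text notes for the four DOME sections (hypothetical helper)."""

    data: str = ""
    optimization: str = ""
    model: str = ""
    evaluation: str = ""

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def to_markdown(self) -> str:
        """Render the four sections as Markdown, marking empty ones as TODO."""
        parts = []
        for f in fields(self):
            text = getattr(self, f.name).strip() or "TODO"
            parts.append(f"### {f.name.capitalize()}\n\n{text}")
        return "\n\n".join(parts)


# Placeholder texts; replace them with the actual description of your model.
description = DomeDescription(
    data="Placeholder: where the data come from and how they were split.",
    model="Placeholder: what the model outputs and where the code is available.",
)

print("Sections still to write:", description.missing_sections())
print(description.to_markdown())
```

The rendered text can then be pasted into the description field of an app or adapted for the methods section of a paper.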

## Other sources of information

These guidelines are written from the perspective of typical SciLifeLab Serve use cases, but there are many other good sources of information about FAIR if you want to dive deeper. Here are some recommendations:

- [RDA FAIR for Machine Learning (FAIR4ML) Interest Group](https://www.rd-alliance.org/groups/fair-machine-learning-fair4ml-ig/members/all-members/)
- [FAIR4ML schema, release 0.1.0](https://rda-fair4ml.github.io/FAIR4ML-schema/release/0.1.0/index.html) and its [GitHub repository](https://github.com/RDA-FAIR4ML/FAIR4ML-schema)
- [FARR](https://www.farr-rcn.org/)
- Ten simple rules: [https://zenodo.org/records/12943228](https://zenodo.org/records/12943228)
- [https://zenodo.org/records/13835105](https://zenodo.org/records/13835105)
- [FAIR principles in life science research practice](https://figshare.scilifelab.se/articles/presentation/FAIR_principles_in_life_science_research_practice/25091471) (presentation)
- Model cards (Google, Hugging Face)
- Open Source AI Definition
