Is Your Pathology R&D Data Management Strategy FAIR?

By Proscia | February 22, 2022

Life science organizations are under constant pressure to push the boundaries of science to develop lifesaving drugs and treatments that positively impact life at scale. To bring these ideas from discovery to market, organizations often manage and connect complex pathology data with multiple stakeholders. The digitization of pathology workflows in R&D has paved the way for accelerated global development by eliminating geographic barriers, reducing subjectivity, and ultimately transforming the practice of pathology.

Image analysis, computational pathology applications, and other analytic tools have been key to accelerating both the adoption of digital pathology and the pace with which life science organizations are able to bring groundbreaking discoveries to market. In the wake of this digitization wave in pathology, many life science organizations are struggling to improve data management infrastructure, as incomplete, unusable, and disorganized data is preventing organizations from extracting the maximum benefit from their investments.

What is FAIR and how is that relevant for digital pathology?

The FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles established in 2016 articulate specific attributes that data require in order to be optimally reusable by humans and machines. Effective data management and stewardship are the key to accelerating knowledge discovery and innovation in the present and future. Existing legacy digital ecosystems have crippled R&D workflows and have acted as a barrier to extracting the maximum benefit from research investments and data gathered. The FAIR guiding principles provide four foundational characteristics for data management and stewardship for the life sciences community to maximize discovery.

The principles, as published by Wilkinson et al., are summarized below (Table 1)³:

Findable
    F1. (Meta)data are assigned a globally unique and persistent identifier
    F2. Data are described with rich metadata (defined by R1 below)
    F3. Metadata clearly and explicitly include the identifier of the data they describe
    F4. (Meta)data are registered or indexed in a searchable resource
Accessible
    A1. (Meta)data are retrievable by their identifier using a standardized communications protocol
    A1.1. The protocol is open, free, and universally implementable
    A1.2. The protocol allows for an authentication and authorisation procedure, where necessary
    A2. Metadata are accessible, even when the data are no longer available
Interoperable
    I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
    I2. (Meta)data use vocabularies that follow FAIR principles
    I3. (Meta)data include qualified references to other (meta)data
Reusable
    R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
    R1.1. (Meta)data are released with a clear and accessible data usage license
    R1.2. (Meta)data are associated with detailed provenance
    R1.3. (Meta)data meet domain-relevant community standards

Table 1: The FAIR principles at a glance
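
To make the table concrete, here is a minimal Python sketch of how a team might check a slide's metadata record against a few of the principles. The `SlideRecord` class, its field names, the `data_id` key, and the `fair_gaps` function are assumptions made for illustration; they are not part of the FAIR specification or of any product API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SlideRecord:
    """Hypothetical metadata record for one whole slide image."""
    identifier: str                               # F1: globally unique identifier
    metadata: dict = field(default_factory=dict)  # F2 / R1: descriptive attributes
    license: str = ""                             # R1.1: data usage license
    provenance: str = ""                          # R1.2: origin and processing history


def new_identifier() -> str:
    """Mint a unique identifier. A UUID-based URN is one option; keeping it
    persistent is an organizational commitment that code alone cannot make."""
    return f"urn:uuid:{uuid.uuid4()}"


def fair_gaps(record: SlideRecord) -> list:
    """Return the principles this record visibly fails (empty list: none found)."""
    gaps = []
    if not record.identifier:
        gaps.append("F1")
    if not record.metadata:
        gaps.append("F2")
    # F3: the metadata must explicitly carry the identifier of the data it describes.
    if not record.identifier or record.metadata.get("data_id") != record.identifier:
        gaps.append("F3")
    if not record.license:
        gaps.append("R1.1")
    if not record.provenance:
        gaps.append("R1.2")
    return gaps


slide_id = new_identifier()
complete = SlideRecord(
    identifier=slide_id,
    metadata={"data_id": slide_id, "stain": "H&E", "species": "mouse"},
    license="internal-research-use",
    provenance="scanned from a glass slide; study and scanner recorded at import",
)
incomplete = SlideRecord(identifier=slide_id, metadata={"stain": "H&E"})

print(fair_gaps(complete))    # []
print(fair_gaps(incomplete))  # ['F3', 'R1.1', 'R1.2']
```

A check like this only covers what can be verified mechanically; principles such as F4 (indexing in a searchable resource) or R1.3 (community standards) depend on the surrounding system and conventions.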

A study by the European Commission found that the lack of FAIR research data costs the European economy at least 10.2 billion euros every year, in addition to consequences that cannot be estimated, including the impact on research quality, economic turnover, and machine readability of research data¹. Clearly, there is tremendous value in adopting FAIR principles to optimize research data: reduced R&D costs, increased productivity, improved collaboration, enhanced data integrity, and greater scientific and operational insights. Furthermore, adoption of FAIR principles will give powerful analytical technologies, such as image analysis and artificial intelligence, improved access to data at scale in order to solve more complex problems.

Using FAIR principles to drive effective pathology data management with Concentriq® for Research

Proscia’s Concentriq digital pathology platform leverages the core principles of FAIR data to enable life science companies to realize greater value from their R&D data than legacy solutions allow. In this section, we highlight how Proscia supports organizations in their adoption of FAIR data principles and demonstrate the value of achieving this transformation in terms of improved productivity, collaboration, data integrity, and decision-making.

Findable: The first aspect of FAIR is how easily both humans and computers can find data and metadata. Linking digital pathology images to physical samples and their metadata is essential to discovery for life science organizations. Concentriq for Research supports importing whole slide images (WSI) and their associated metadata in bulk or manually, and allows users to organize their collection of virtual slides, folders, cases, annotations, and other metadata into a repository. A repository can be further organized into groups, allowing researchers from multiple teams to collaborate on studies. Researchers can use the global search feature to quickly browse or share the data nested within repositories, folders, cases, and images. Additionally, configurable metadata within repositories helps organizations standardize their research operations. The metadata is visible alongside the image in Concentriq for Research, giving the pathologist all the information needed to conduct their assessment on a single page.
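
As a rough illustration of why standardized metadata matters for findability, the Python sketch below filters a small in-memory collection of slide records by metadata fields. The record layout, field names, and `search` function are hypothetical and do not represent how Concentriq's global search is implemented.

```python
def search(records, **criteria):
    """Return records whose metadata matches every given field (case-insensitive)."""
    def matches(rec):
        return all(
            str(rec["metadata"].get(key, "")).lower() == str(value).lower()
            for key, value in criteria.items()
        )
    return [rec for rec in records if matches(rec)]


# Hypothetical records spread across two repositories.
records = [
    {"id": "slide-001", "repository": "tox-study-A",
     "metadata": {"stain": "H&E", "organ": "liver"}},
    {"id": "slide-002", "repository": "tox-study-A",
     "metadata": {"stain": "IHC", "organ": "liver"}},
    {"id": "slide-003", "repository": "oncology-B",
     "metadata": {"stain": "H&E", "organ": "lung"}},
]

hits = search(records, stain="h&e", organ="liver")
print([rec["id"] for rec in hits])  # ['slide-001']
```

The search only works because every record uses the same field names (`stain`, `organ`); if one team wrote `Stain` and another `staining`, the same query would silently miss slides.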

Accessible: Once a user finds the required data, they need to know how it can be accessed securely. Concentriq for Research allows teams to retrieve the data for rapid digital review. The data can also be viewed in table view with configurable metadata.

Viewing whole slide images (WSI): Setting and enforcing guidelines around image access is a fundamental piece of accessibility and one that many labs struggle to address without a proper image management system in place. This is an area where Concentriq for Research excels. Access to WSI, whether uploaded manually or automatically, can be secured with configurable roles and permissions, along with specific permission levels. Internal or external stakeholders (such as consulting pathologists) who view WSI files in Concentriq can be restricted from modifying them and therefore cannot invalidate or alter these files. WSI files can be deleted, but that action is captured in the audit trail, which is human-readable and exportable from the GUI.
Viewing WSI metadata: Presenting metadata alongside WSI files is another important consideration for accessibility, and one that becomes more difficult as lab environments introduce technologies that may not connect seamlessly with existing systems. The ability to modify uploaded WSI metadata can be controlled through role permissions and repository share permission levels. Additionally, WSI metadata can be viewed alongside the images.
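
The two access patterns above, role-based restrictions plus an audit trail, can be sketched in a few lines of Python. The role names, action names, and the `attempt` function are hypothetical and chosen only for illustration; they do not describe Concentriq's actual permission model.

```python
from datetime import datetime, timezone

# Hypothetical roles and the actions each may perform.
ROLE_PERMISSIONS = {
    "viewer": {"view"},
    "editor": {"view", "edit_metadata"},
    "admin": {"view", "edit_metadata", "delete"},
}

audit_trail = []  # every attempt is recorded, whether or not it was allowed


def attempt(user: str, role: str, action: str, slide_id: str) -> bool:
    """Check the role's permissions and log the attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_trail.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "slide": slide_id,
        "allowed": allowed,
    })
    return allowed


print(attempt("consultant", "viewer", "view", "slide-001"))           # True
print(attempt("consultant", "viewer", "edit_metadata", "slide-001"))  # False
print(attempt("lab-admin", "admin", "delete", "slide-001"))           # True
print(len(audit_trail))                                               # 3
```

Logging denied attempts as well as permitted ones is a design choice: it keeps the trail useful for reviewing who tried to change or delete data, not only who succeeded.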

Interoperable: The third key principle of FAIR is the integration of data with applications or workflows for analysis, storage, and processing. Interoperability with your existing digital technology ecosystem is at the core of the Concentriq platform (Figure 1). Concentriq for Research supports images from all leading scanners. Furthermore, our mature and field-tested open API provides the ability to connect third-party software such as lab information management systems (LIMS), image analysis (IA) tools, electronic lab notebooks, and homegrown AI applications. This enables automation of manual steps and ultimately enhances workflow efficiency, helping organizations reach their full potential.

Concentriq’s rich interoperability also allows users to access both the features and data within these systems through the Concentriq interface. For instance, Concentriq for Research offers bidirectional image analysis integration with two of the leading image analysis vendors. With this integration, analysis results created by these solutions are accessible within Concentriq. And because Concentriq houses these images centrally, image analysis applications can read the data directly from Concentriq, eliminating data silos and disorganized data in general.
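
The idea of a central store that analysis tools read from and write back to can be illustrated with a small in-memory Python sketch. Everything here is hypothetical: `SlideStore`, its methods, the file URI, and the stub analyzer are stand-ins for illustration and are not the Concentriq API or any vendor's integration.

```python
class SlideStore:
    """Hypothetical central store: images and analysis results share one key."""

    def __init__(self):
        self._images = {}
        self._results = {}

    def add_image(self, slide_id, image_uri):
        self._images[slide_id] = image_uri

    def image_uri(self, slide_id):
        return self._images[slide_id]

    def write_result(self, slide_id, tool, result):
        # Results are stored per slide and per tool, next to the image reference.
        self._results.setdefault(slide_id, {})[tool] = result

    def results(self, slide_id):
        return dict(self._results.get(slide_id, {}))


def analyze_stub(image_uri):
    """Stand-in for an external analysis tool; performs no real analysis."""
    return {"source": image_uri, "status": "analyzed"}


store = SlideStore()
store.add_image("slide-001", "file:///slides/slide-001.svs")

# The tool reads the image reference from the store and writes its result back.
result = analyze_stub(store.image_uri("slide-001"))
store.write_result("slide-001", "stub-analyzer", result)

print(store.results("slide-001"))
```

Because the result is keyed by the same slide identifier as the image, nothing has to be copied into a separate system to see which analyses exist for a slide.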

[Figure: Concentriq, the center of the digital pathology ecosystem]

Figure 1: Concentriq connects your entire pathology ecosystem through a future-ready platform that integrates with all your technology, giving you the flexibility and investment protection every lab needs.

Reusable: The fourth and final principle of FAIR is optimizing the reusability of data. At the core of the Concentriq digital pathology platform is a powerful image management capability that ensures data can be organized within digital image libraries. This is vital to ensure the data is available and easily accessible for future research efforts, and can be readily leveraged as new innovations become available. Additionally, as new predictive algorithms become available, there is potential to harness insights from the data that were not previously possible. Proscia recognizes the value of archival and image life cycle management of its customers’ data and works with them to design a solution based on the organization’s use cases and needs. Furthermore, Concentriq for Research can be used with storage platforms that provide storage tiering, allowing organizations to optimize their data storage for accessibility and cost.
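
Storage tiering, mentioned above, usually comes down to a policy that trades access speed against cost. The Python sketch below shows one simple age-based policy; the tier names and the 90-day and 365-day thresholds are illustrative assumptions, not recommendations or product defaults.

```python
from datetime import date


def choose_tier(last_accessed: date, today: date,
                hot_days: int = 90, warm_days: int = 365) -> str:
    """Pick a storage tier from the number of days since the slide was last accessed."""
    age_days = (today - last_accessed).days
    if age_days <= hot_days:
        return "hot"      # fast, more expensive storage for active studies
    if age_days <= warm_days:
        return "warm"     # slower, cheaper storage for occasional access
    return "archive"      # lowest cost, slowest retrieval


today = date(2022, 2, 22)
print(choose_tier(date(2022, 1, 15), today))  # hot
print(choose_tier(date(2021, 6, 1), today))   # warm
print(choose_tier(date(2019, 3, 10), today))  # archive
```

In practice a policy might also consider study status or regulatory retention requirements, so that archived slides remain retrievable when a new algorithm makes them worth revisiting.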

FAIR in practice: Developing your own AI algorithms

In evaluating trends across the life sciences market, we noticed a shift for many biopharma organizations towards building their own data science teams to impact the entire R&D value chain. Adopting a FAIR data management strategy is a critical first step for your organization to maintain your market position and leverage technologically innovative computational solutions for pathology that amplify the possibilities of data-driven discovery to drive insights faster.

Here at Proscia, we encourage our community to leverage their data in new ways and develop next-generation tools for pathology. Concentriq makes available the images, annotations, and metadata it manages to not only facilitate internal app development, but also enable the integration of homegrown analytic tools into Concentriq, further enriching the image data managed within Concentriq. These capabilities are powered by a comprehensive API and development kit for data scientists and AI researchers to develop and deploy image analysis and AI applications across discovery, pre-clinical, and clinical research.

For data scientists and developers, these tools help our customers and partners generate insights from their data and develop next-generation computational pathology applications. The Concentriq developer platform allows technical users to read and write directly to the pathology system of record, enabling new insights and applications to be deployed both in research ecosystems and diagnostic workflows. And our expertise working with dozens of companies to integrate scanners, information systems, computational applications, and devices of all types directly with the Concentriq platform helps ensure our customers’ data is standardized and centralized to drive pathology workflow automation.

Conclusion

FAIR data is critical to solve the inefficiencies that have arisen from legacy data management practices, such as data silos, lack of metadata standardization, and barriers to effective internal and external collaboration. Proscia’s Concentriq for Research facilitates the four principles of FAIR data management by providing:

The ability to organize virtual slide libraries, along with their associated metadata and annotations, into repositories (‘Findable’)
Configurable roles and permissions to enable your organization to simplify user access management based on each user’s work function (‘Accessible’)
Leading interoperability with your existing digital ecosystem (‘Interoperable’), and
Data archival and image lifecycle management, along with the ability for your research team to develop and deploy image analysis and AI applications with Concentriq’s comprehensive API and development kit (‘Reusable’)

This is why 14 of the top 20 pharma companies trust Proscia’s Concentriq for Research platform to help accelerate their research efforts, improve internal and external communications, and make better use of their valuable pathology data.

References:

For more information on the FAIR data principles, the following reading is recommended.

1. European Commission, Directorate-General for Research and Innovation. (2019). Cost-benefit analysis for FAIR research data: Cost of not having FAIR research data. Publications Office. https://data.europa.eu/doi/10.2777/02999
2. GO FAIR. (2022, January 21). FAIR principles. Retrieved February 1, 2022, from https://www.go-fair.org/fair-principles/
3. Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., Bouwman, J., Brookes, A. J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C. T., Finkers, R., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3(1). https://doi.org/10.1038/sdata.2016.18
