Model Card for SlideCheck

Self-supervised learning (SSL) has shown strong transferability for pathology foundation models, yet most pipelines still sample patches from whole-slide images (WSIs) uniformly at random despite severe redundancy and imbalanced tissue distributions. We propose SlideCheck, which uses supervised distribution priors to guide SSL patch selection. We unify multiple large-scale public ROI datasets and map their heterogeneous labels onto two binary factors: normal vs. abnormal and non-cancer vs. cancer. Using ~1M labeled patches, we train and open-source SlideCheck, a lightweight patch classifier that outputs prior scores for candidate patches. These scores can be used to filter and prioritize diagnostically relevant patches before or during SSL pretraining, reducing uninformative tissue redundancy and improving data efficiency without changing the SSL objective. We hope SlideCheck can serve as a practical, reusable tool for dataset curation and patch sampling in future pathology SSL research.

The two factors and their corresponding output logits are:

  • normal vs. abnormal (logit_abn)
  • non-cancer vs. cancer (logit_can)

arXiv preprint: https://arxiv.org/abs/2505.21928

GitHub codebase: https://github.com/lingxitong/SlideCheck


What can SlideCheck be used for?

Most pathology SSL pipelines sample WSI patches uniformly at random, which often includes large amounts of redundant and uninformative tissue.
SlideCheck provides prior scores that can be used to:

  • filter out low-value patches
  • prioritize diagnostically relevant patches
  • build higher-quality pretraining sets
  • guide patch selection without changing the SSL objective
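
As a rough illustration of filtering and prioritizing, the sketch below drops patches whose abnormal probability falls below a cutoff and ranks the rest by cancer probability. The function name `select_patches`, the filter-then-rank rule, and the 0.5 cutoff are all assumptions made for this example; they are not the selection strategy from the paper or the repository.

```python
from typing import List, Optional, Sequence


def select_patches(
    prob_abn: Sequence[float],
    prob_can: Sequence[float],
    min_abn: float = 0.5,          # hypothetical cutoff, tune per dataset
    top_k: Optional[int] = None,   # optionally cap the number of kept patches
) -> List[int]:
    """Return indices of kept patches, highest cancer probability first."""
    if len(prob_abn) != len(prob_can):
        raise ValueError("prob_abn and prob_can must have the same length")
    # Filter: keep only patches that look abnormal enough.
    kept = [i for i, p in enumerate(prob_abn) if p >= min_abn]
    # Prioritize: order the survivors by cancer probability, descending.
    kept.sort(key=lambda i: prob_can[i], reverse=True)
    return kept if top_k is None else kept[:top_k]


if __name__ == "__main__":
    abn = [0.1, 0.9, 0.7, 0.6]
    can = [0.9, 0.2, 0.8, 0.5]
    print(select_patches(abn, can))           # indices of kept patches
    print(select_patches(abn, can, top_k=2))  # same ranking, truncated
```

The returned indices can then be used to subset the patch list before SSL pretraining, which leaves the SSL objective itself untouched.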

Inputs / Outputs

Input

SlideCheck takes patch embeddings as input (not raw images).
Typical embeddings come from pathology foundation models (e.g., UNI / GigaPath / Virchow2), and are saved as an .h5 file with:

  • dataset key: features
  • shape: [N, D]
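
A file in this layout can be read with h5py roughly as follows. This is a minimal sketch, not the official loader: `load_features` and `check_feature_shape` are hypothetical helper names, and the only facts taken from this card are the dataset key `features` and the 2-D shape [N, D].

```python
from typing import Tuple


def check_feature_shape(shape: Tuple[int, ...]) -> Tuple[int, int]:
    """Validate that an embedding array is 2-D and return (N, D)."""
    if len(shape) != 2:
        raise ValueError(f"expected features of shape [N, D], got {tuple(shape)}")
    n_patches, feat_dim = shape
    return n_patches, feat_dim


def load_features(h5_path: str):
    """Read the 'features' dataset from an .h5 file into a NumPy array."""
    import h5py  # third-party dependency, imported lazily

    with h5py.File(h5_path, "r") as f:
        feats = f["features"][:]  # slicing copies the dataset into memory
    check_feature_shape(feats.shape)
    return feats
```

The feature dimension D depends on the encoder used to extract the embeddings, so it is worth checking it against the SlideCheck checkpoint you plan to load.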

Output

SlideCheck outputs two logits, each of which is converted to a probability with a sigmoid:

  • logit_abn → probability that the patch is abnormal
  • logit_can → probability that the patch is cancerous

The inference script can also export binary predictions with a threshold.
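
The mapping from logits to probabilities and binary predictions can be sketched in plain Python. The function name `logits_to_predictions`, the output keys, and the default threshold of 0.5 are assumptions for illustration; the official script may use different names and defaults.

```python
import math
from typing import Dict, Union


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logits_to_predictions(
    logit_abn: float,
    logit_can: float,
    threshold: float = 0.5,  # hypothetical default
) -> Dict[str, Union[float, int]]:
    """Convert the two SlideCheck logits to probabilities and 0/1 labels."""
    prob_abn = sigmoid(logit_abn)
    prob_can = sigmoid(logit_can)
    return {
        "prob_abn": prob_abn,
        "prob_can": prob_can,
        "pred_abn": int(prob_abn >= threshold),
        "pred_can": int(prob_can >= threshold),
    }


if __name__ == "__main__":
    print(logits_to_predictions(2.0, -2.0))
```

Because the sigmoid is monotonic, thresholding the probability at 0.5 is equivalent to thresholding the logit at 0.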

How to Run Inference (Recommended)

The official inference script is in the GitHub repo: SlideCheck/Infer_SlideCheck/SlideCheck_Infer.py

Citation

If SlideCheck is helpful to you, please cite our work.

@article{zhu2025subspecialty,
  title={Subspecialty-specific foundation model for intelligent gastrointestinal pathology},
  author={Zhu, Lianghui and Ling, Xitong and Ouyang, Minxi and Liu, Xiaoping and Guan, Tian and Fu, Mingxi and Cheng, Zhiqiang and Fu, Fanglei and Zeng, Maomao and Liu, Liming and others},
  journal={arXiv preprint arXiv:2505.21928},
  year={2025}
}