---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:40482
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: List the deadliest viruses in the world.
sentences:
- "Mediator is a large multiprotein complex conserved in all eukaryotes, which has\
\ \na crucial coregulator function in transcription by RNA polymerase II (Pol\
\ II). \nHowever, the molecular mechanisms of its action in vivo remain to be\
\ understood. \nMed17 is an essential and central component of the Mediator head\
\ module. In this \nwork, we utilised our large collection of conditional temperature-sensitive\
\ \nmed17 mutants to investigate Mediator's role in coordinating preinitiation\
\ \ncomplex (PIC) formation in vivo at the genome level after a transfer to a\
\ \nnon-permissive temperature for 45 minutes. The effect of a yeast mutation\
\ \nproposed to be equivalent to the human Med17-L371P responsible for infantile\
\ \ncerebral atrophy was also analyzed. The ChIP-seq results demonstrate that\
\ med17 \nmutations differentially affected the global presence of several PIC\
\ components \nincluding Mediator, TBP, TFIIH modules and Pol II. Our data show\
\ that Mediator \nstabilizes TFIIK kinase and TFIIH core modules independently,\
\ suggesting that \nthe recruitment or the stability of TFIIH modules is regulated\
\ independently on \nyeast genome. We demonstrate that Mediator selectively contributes\
\ to TBP \nrecruitment or stabilization to chromatin. This study provides an extensive\
\ \ngenome-wide view of Mediator's role in PIC formation, suggesting that Mediator\
\ \ncoordinates multiple steps of a PIC assembly pathway."
- "mTOR complex 2 (mTORC2) signaling is upregulated in multiple types of human \n\
cancer, but the molecular mechanisms underlying its activation and regulation\
\ \nremain elusive. Here, we show that microRNA-mediated upregulation of Rictor,\
\ an \nmTORC2-specific component, contributes to tumor progression. Rictor is\
\ \nupregulated via the repression of the miR-424/503 cluster in human prostate\
\ and \ncolon cancer cell lines that harbor c-Src upregulation and in Src-transformed\
\ \ncells. The tumorigenicity and invasive activity of these cells were suppressed\
\ \nby re-expression of miR-424/503. Rictor upregulation promotes formation of\
\ \nmTORC2 and induces activation of mTORC2, resulting in promotion of tumor growth\
\ \nand invasion. Furthermore, downregulation of miR-424/503 is associated with\
\ \nRictor upregulation in colon cancer tissues. These findings suggest that the\
\ \nmiR-424/503-Rictor pathway plays a crucial role in tumor progression."
- "This year marks the 100th anniversary of the deadliest event in human history.\
\ \nIn 1918-1919, pandemic influenza appeared nearly simultaneously around the\
\ globe \nand caused extraordinary mortality (an estimated 50-100 million deaths)\
\ \nassociated with unexpected clinical and epidemiological features. The \ndescendants\
\ of the 1918 virus remain today; as endemic influenza viruses, they \ncause significant\
\ mortality each year. Although the ability to predict influenza \npandemics remains\
\ no better than it was a century ago, numerous scientific \nadvances provide\
\ an important head start in limiting severe disease and death \nfrom both current\
\ and future influenza viruses: identification and substantial \ncharacterization\
\ of the natural history and pathogenesis of the 1918 causative \nvirus itself,\
\ as well as hundreds of its viral descendants; development of \nmoderately effective\
\ vaccines; improved diagnosis and treatment of \ninfluenza-associated pneumonia;\
\ and effective prevention and control measures. \nRemaining challenges include\
\ development of vaccines eliciting significantly \nbroader protection (against\
\ antigenically different influenza viruses) that can \nprevent or significantly\
\ downregulate viral replication; more complete \ncharacterization of natural\
\ history and pathogenesis emphasizing the protective \nrole of mucosal immunity;\
\ and biomarkers of impending influenza-associated \npneumonia."
- source_sentence: Where is X-ray free electron laser used?
sentences:
- "BACKGROUND: After tooth loss, the posterior maxilla is usually characterized\
\ by \nlimited bone height secondary to pneumatization of the maxillary sinus\
\ and/or \ncollapse of the alveolar ridge that preclude in many instances the\
\ installation \nof dental implants. In order to compensate for the lack of bone\
\ height, several \ntreatment options have been proposed. These treatment alternatives\
\ aimed at the \ninstallation of dental implants with or without the utilization\
\ of bone grafting \nmaterials avoiding the perforation of the Schneiderian membrane.\
\ Nevertheless, \nmembrane perforations represent the most common complication\
\ among these \nprocedures. Consequently, the present review aimed at the elucidation\
\ of the \nrelevance of this phenomenon on implant survival and complications.\n\
MATERIAL AND METHODS: Electronic and manual literature searches were performed\
\ \nby two independent reviewers in several databases, including MEDLINE, EMBASE,\
\ \nand Cochrane Oral Health Group Trials Register, for articles up to January\
\ 2018 \nreporting outcome of implant placement perforating the sinus floor without\
\ \nregenerative procedure (lateral sinus lift or transalveolar technique) and\
\ graft \nmaterial. The intrusion of the implants can occur during drilling or\
\ implant \nplacement, with and without punch out Schneiderian. Only studies with\
\ at least \n6 months of follow-up were included in the qualitative assessment.\n\
RESULTS: Eight studies provided information on the survival rate, with a global\
\ \nsample of 493 implants, being the weighted mean survival rate 95.6% (IC 95%),\
\ \nafter 52.7 months of follow-up. The level of implant penetration (≤ 4 mm or\
\ \n> 4 mm) did not report statistically significant differences in survival rate\
\ \n(p = 0.403). Seven studies provided information on the rate of clinical \n\
complications, being the mean complication rate 3.4% (IC 95%). The most frequent\
\ \nclinical complication was epistaxis, without finding significant differences\
\ \naccording to the level of penetration. Five studies provide information on\
\ the \nradiographic complication; the most common complication was thickening\
\ of the \nSchneiderian membrane. The weighted complication rate was 14.8% (IC\
\ 95%), and \npenetration level affects the rate of radiological complications,\
\ being these of \n5.29% in implant penetrating ≤4 mm and 29.3% in implant penetrating\
\ > 4 mm, \nwithout reaching statistical significant difference (p = 0.301).\n\
CONCLUSION: The overall survival rate of the implants into the sinus cavity was\
\ \n95.6%, without statistical differences according to the level of penetration.\
\ \nThe clinical and radiological complications were 3.4% and 14.8% respectively.\
\ \nThe most frequent clinical complication was the epistaxis, and the radiological\
\ \ncomplication was thickening of the Schneiderian membrane, without reaching\
\ \nstatistical significant difference according to the level of implant penetration\
\ \ninside the sinus."
- "Ultrashort X-ray pulses from free-electron laser X-ray sources make it feasible\
\ \nto conduct small- and wide-angle scattering experiments on biomolecular samples\
\ \nin solution at sub-picosecond timescales. During these so-called fluctuation\
\ \nscattering experiments, the absence of rotational averaging, typically induced\
\ \nby Brownian motion in classic solution-scattering experiments, increases the\
\ \ninformation content of the data. In order to perform shape reconstruction\
\ or \nstructure refinement from such data, it is essential to compute the theoretical\
\ \nprofiles from three-dimensional models. Based on the three-dimensional Zernike\
\ \npolynomial expansion models, a fast method to compute the theoretical \nfluctuation\
\ scattering profiles has been derived. The theoretical profiles have \nbeen validated\
\ against simulated results obtained from 300 000 scattering \npatterns for several\
\ representative biomolecular species."
- Hemophilic Pseudotumor is a rare complication of hemophilia. It is an encapsulated
haematoma in patients with haemophilia which has a tendency to progress and produce
clinical symptoms related to its anatomical location. The lesion most frequently
occurs in the long bones, pelvis, small bones of the hands and feet, or rarely
in the maxillofacial region.
- source_sentence: For the constructions of which organs has 3D printing been tested?
sentences:
- "The ability to three-dimensionally interweave biological tissue with functional\
\ \nelectronics could enable the creation of bionic organs possessing enhanced\
\ \nfunctionalities over their human counterparts. Conventional electronic devices\
\ \nare inherently two-dimensional, preventing seamless multidimensional integration\
\ \nwith synthetic biology, as the processes and materials are very different.\
\ Here, \nwe present a novel strategy for overcoming these difficulties via additive\
\ \nmanufacturing of biological cells with structural and nanoparticle derived\
\ \nelectronic elements. As a proof of concept, we generated a bionic ear via\
\ 3D \nprinting of a cell-seeded hydrogel matrix in the anatomic geometry of a\
\ human \near, along with an intertwined conducting polymer consisting of infused\
\ silver \nnanoparticles. This allowed for in vitro culturing of cartilage tissue\
\ around an \ninductive coil antenna in the ear, which subsequently enables readout\
\ of \ninductively-coupled signals from cochlea-shaped electrodes. The printed\
\ ear \nexhibits enhanced auditory sensing for radio frequency reception, and\
\ \ncomplementary left and right ears can listen to stereo audio music. Overall,\
\ our \napproach suggests a means to intricately merge biologic and nanoelectronic\
\ \nfunctionalities via 3D printing."
- "A case of heterochromia iridis and Horner's syndrome is reported in a 7-year\
\ old \ngirl with paravertebral neurilemmoma. These clinical findings can be useful\
\ in \nthe early diagnosis of mediastinal tumors in the paravertebral axis. While\
\ \ntypically associated with neuroblastoma, these findings can be due to tumors\
\ \nwhich are inately benign--in this case neurilemmoma. The mechanism for \n\
heterochromia is briefly discussed."
- "The creation of complex neuronal networks relies on ligand-receptor interactions\
\ \nthat mediate attraction or repulsion towards specific targets. Roundabouts\
\ \ncomprise a family of single-pass transmembrane receptors facilitating this\
\ \nprocess upon interaction with the soluble extracellular ligand Slit protein\
\ \nfamily emanating from the midline. Due to the complexity and flexible nature\
\ of \nRobo receptors , their overall structure has remained elusive until now.\
\ Recent \nstructural studies of the Robo 1 and Robo 2 ectodomains have provided\
\ the basis \nfor a better understanding of their signalling mechanism. These\
\ structures \nreveal how Robo receptors adopt an auto-inhibited conformation\
\ on the cell \nsurface that can be further stabilised by cis and/or trans oligmerisation\
\ \narrays. Upon Slit -N binding Robo receptors must undergo a conformational\
\ change \nfor Ig4 mediated dimerisation and signaling, probably via endocytosis.\
\ \nFurthermore, it's become clear that Robo receptors do not only act alone,\
\ but as \nlarge and more complex cell surface receptor assemblies to manifest\
\ directional \nand growth effects in a concerted fashion. These context dependent\
\ assemblies \nprovide a mechanism to fine tune attractive and repulsive signals\
\ in a \ncombinatorial manner required during neuronal development. While a mechanistic\
\ \nunderstanding of Slit mediated Robo signaling has advanced significantly further\
\ \nstructural studies on larger assemblies are required for the design of new\
\ \nexperiments to elucidate their role in cell surface receptor complexes. These\
\ \nwill be necessary to understand the role of Slit -Robo signaling in \nneurogenesis,\
\ angiogenesis, organ development and cancer progression. In this \nchapter, we\
\ provide a review of the current knowledge in the field with a \nparticular focus\
\ on the Roundabout receptor family."
- source_sentence: For the constructions of which organs has 3D printing been tested?
sentences:
- "Objective:To evaluate the value of improved Mallampati grading combined with\
\ \nNoSAS questionnaire in screening for obstructive sleep apnea (OSA). Method:A\
\ \ntotal of 344 patients admitted to our hospital for sleep disorders were studied.\
\ \nAll patients were measured for their height, weight, neck circumference and\
\ \nother parameters. NoSAS scores, improved Mallampati grading and polysomnography\
\ \n(PSG) were performed in these patients. According to AHI in PSG monitoring\
\ \nresults, patients were divided into non-osa group (AHI<5) 93 cases and OSA\
\ group \n251 cases. The OSA group were divided into mild (AHI 5-15), moderate(AHI\
\ 16-30) \nand severe OSA group(AHI>30) according to the PSG result. The ROC curve\
\ was \nplotted to evaluate the screening value of NoSAS and improved Mallampati\
\ grading \ncombined with NoSAS for OSA. Result:With the NoSAS score of 8 or 9\
\ as cutoffs \nfor analysis, the sensitivity for OSA was 0.733 and 0.701; the\
\ specificity for \nOSA was 0.538 and 0.624, respectively. The sensitivity and\
\ specificity of NoSAS \ncombined with improved Mallampati grading for screening\
\ OSA were 0.813 and \n0.710, respectively. Conclusion:As a new screening tool,\
\ NoSAS questionnaire is \nsimple and convenient, and has certain screening value\
\ to OSA. The improved \nMallampati grading combined with NoSAS questionnaire\
\ can obviously improve the \nscreening sensitivity and specificity of Osa, and\
\ has higher application value."
- "The morphology and the functionality of the murid glandular complex, composed\
\ of \nthe submandibular and sublingual salivary glands (SSC), were the object\
\ of \nseveral studies conducted mainly using magnetic resonance imaging (MRI).\
\ Using a \n4.7 T scanner and a manganese-based contrast agent, we improved the\
\ \nsignal-to-noise ratio of the SSC relating to the surrounding anatomical \n\
structures allowing to obtain high-contrast 3D images of the SSC. In the last\
\ \nfew years, the large development in resin melting techniques opened the way\
\ for \nprinting 3D objects starting from a 3D stack of images. Here, we demonstrate\
\ the \nfeasibility of the 3D printing technique of soft tissues such as the SSC\
\ in the \nrat with the aim to improve the visualization of the organs. This approach\
\ is \nuseful to preserve the real in vivo morphology of the SCC in living animals\
\ \navoiding the anatomical shape changes due to the lack of relationships with\
\ the \nsurrounding organs in case of extraction. It is also harmless, repeatable\
\ and \ncan be applied to explore volumetric changes occurring during body growth,\
\ \nexcretory duct obstruction, tumorigenesis and regeneration processes. 3D \n\
printing allows to obtain a solid object with the same shape of the organ of \n\
interest, which can be observed, freely rotated and manipulated. To increase the\
\ \nvisibility of the details, it is possible to print the organs with a selected\
\ \nzoom factor, useful as in case of tiny organs in small mammalia. An immediate\
\ \napplication of this technique is represented by educational classes."
- "Mobile phone use and risk of acoustic neuroma: results of the interphone \ncase-control\
\ study in five north European countries [corrected]."
- source_sentence: What is known about the Digit Ratio (2D:4D) cancer?
sentences:
- "Proteins undergo conformational changes during their biological function. As\
\ \nsuch, a high-resolution structure of a protein's resting conformation provides\
\ a \nstarting point for elucidating its reaction mechanism, but provides no direct\
\ \ninformation concerning the protein's conformational dynamics. Several X-ray\
\ \nmethods have been developed to elucidate those conformational changes that\
\ occur \nduring a protein's reaction, including time-resolved Laue diffraction\
\ and \nintermediate trapping studies on three-dimensional protein crystals, and\
\ \ntime-resolved wide-angle X-ray scattering and X-ray absorption studies on\
\ \nproteins in the solution phase. This review emphasizes the scope and limitations\
\ \nof these complementary experimental approaches when seeking to understand\
\ \nprotein conformational dynamics. These methods are illustrated using a limited\
\ \nset of examples including myoglobin and haemoglobin in complex with carbon\
\ \nmonoxide, the simple light-driven proton pump bacteriorhodopsin, and the \n\
superoxide scavenger superoxide reductase. In conclusion, likely future \ndevelopments\
\ of these methods at synchrotron X-ray sources and the potential \nimpact of\
\ emerging X-ray free-electron laser facilities are speculated upon."
- 'Extensive messenger RNA editing generates transcript and protein diversity in
genes involved in neural excitability, as previously described, as well as in
genes participating in a broad range of other cellular functions. '
- "BACKGROUND: The ratio of the lengths of index and ring fingers (2D:4D) is a \n\
marker of prenatal exposure to sex hormones, with low 2D:4D being indicative of\
\ \nhigh prenatal androgen action. Recent studies have reported a strong association\
\ \nbetween 2D:4D and risk of prostate cancer.\nMETHODS: A total of 6258 men participating\
\ in the Melbourne Collaborative Cohort \nStudy had 2D:4D assessed. Of these men,\
\ we identified 686 incident prostate \ncancer cases. Hazard ratios (HRs) and\
\ confidence intervals (CIs) were estimated \nfor a standard deviation increase\
\ in 2D:4D.\nRESULTS: No association was observed between 2D:4D and prostate cancer\
\ risk \noverall (HRs 1.00; 95% CIs, 0.92-1.08 for right, 0.93-1.08 for left).\
\ We \nobserved a weak inverse association between 2D:4D and risk of prostate\
\ cancer \nfor age <60, however 95% CIs included unity for all observed ages.\n\
CONCLUSION: Our results are not consistent with an association between 2D:4D and\
\ \noverall prostate cancer risk, but we cannot exclude a weak inverse association\
\ \nbetween 2D:4D and early onset prostate cancer risk."
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: Biomedical MRL
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.7397454031117398
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8472418670438473
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8925035360678925
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9292786421499293
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7397454031117398
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.6058462989156059
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.5295615275813296
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.41103253182461097
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.22757153438103173
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.39389351666156774
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4953500769443452
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.626185395476178
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.7036538830306982
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8041815406030398
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6499688056459438
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.7326732673267327
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.842998585572843
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8882602545968883
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9151343705799151
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7326732673267327
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5964167845355963
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.5278642149929279
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.40990099009900993
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.21918993091456265
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.38673218299790596
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4915208575777972
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6229670136489501
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6971415938662006
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7968989245863362
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6403253251933015
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.7227722772277227
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8373408769448374
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8769448373408769
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.9108910891089109
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7227722772277227
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5893446487505893
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.5131541725601132
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.4048090523338048
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.2165092120706659
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.3843563311047163
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.4706508437641641
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6082103871285517
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.6857315358161504
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7889281785321389
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.6255397978739031
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.7072135785007072
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.8076379066478077
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8458274398868458
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8967468175388967
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.7072135785007072
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5605846298915607
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.4876944837340877
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.38189533239038187
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.2131717638221153
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.3571863197583239
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.44275724893253604
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5763830904405497
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.651957768079385
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7681035450483825
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5861399094808066
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.6435643564356436
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.7666195190947667
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.8048090523338048
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.8415841584158416
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.6435643564356436
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.5115511551155115
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.45007072135785003
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.3510608203677511
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.18506567524592368
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.3180821001225782
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.3926270123067019
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5118404409971898
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.5894018468562044
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.7115219685233828
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5197323616049745
name: Cosine Map@100
---
# Biomedical MRL
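This is a [sentence-transformers](https://www.SBERT.net) model trained on 40,482 anchor-positive pairs with MatryoshkaLoss and MultipleNegativesRankingLoss. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. Embeddings can also be truncated to 512, 256, 128, or 64 dimensions; the Evaluation section reports retrieval quality at each size.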
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** en
- **License:** apache-2.0
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
```
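The pooling configuration above selects the `[CLS]` token (`pooling_mode_cls_token: True`) and the final module L2-normalizes the result. The following is a minimal NumPy sketch of those two steps, not the library's implementation; the function name `cls_pool_and_normalize` and the random stand-in for the transformer output are assumptions made for illustration.

```python
import numpy as np


def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token embeddings to (batch, hidden) unit vectors."""
    # CLS pooling: keep only the embedding of the first token of each sequence.
    cls = token_embeddings[:, 0, :]
    # Normalize(): scale every sentence vector to unit L2 norm.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)


# Hypothetical stand-in for the transformer output: 2 sequences, 5 tokens, 768 dims.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(2, 5, 768))

sentence_embeddings = cls_pool_and_normalize(fake_token_embeddings)
print(sentence_embeddings.shape)
print(np.linalg.norm(sentence_embeddings, axis=1))
```

Because the outputs have unit norm, the dot product of two sentence embeddings equals their cosine similarity.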
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("potsu-potsu/bge-base-mrl-train40k")
# Run inference
sentences = [
'What is known about the Digit Ratio (2D:4D) cancer?',
'BACKGROUND: The ratio of the lengths of index and ring fingers (2D:4D) is a \nmarker of prenatal exposure to sex hormones, with low 2D:4D being indicative of \nhigh prenatal androgen action. Recent studies have reported a strong association \nbetween 2D:4D and risk of prostate cancer.\nMETHODS: A total of 6258 men participating in the Melbourne Collaborative Cohort \nStudy had 2D:4D assessed. Of these men, we identified 686 incident prostate \ncancer cases. Hazard ratios (HRs) and confidence intervals (CIs) were estimated \nfor a standard deviation increase in 2D:4D.\nRESULTS: No association was observed between 2D:4D and prostate cancer risk \noverall (HRs 1.00; 95% CIs, 0.92-1.08 for right, 0.93-1.08 for left). We \nobserved a weak inverse association between 2D:4D and risk of prostate cancer \nfor age <60, however 95% CIs included unity for all observed ages.\nCONCLUSION: Our results are not consistent with an association between 2D:4D and \noverall prostate cancer risk, but we cannot exclude a weak inverse association \nbetween 2D:4D and early onset prostate cancer risk.',
"Proteins undergo conformational changes during their biological function. As \nsuch, a high-resolution structure of a protein's resting conformation provides a \nstarting point for elucidating its reaction mechanism, but provides no direct \ninformation concerning the protein's conformational dynamics. Several X-ray \nmethods have been developed to elucidate those conformational changes that occur \nduring a protein's reaction, including time-resolved Laue diffraction and \nintermediate trapping studies on three-dimensional protein crystals, and \ntime-resolved wide-angle X-ray scattering and X-ray absorption studies on \nproteins in the solution phase. This review emphasizes the scope and limitations \nof these complementary experimental approaches when seeking to understand \nprotein conformational dynamics. These methods are illustrated using a limited \nset of examples including myoglobin and haemoglobin in complex with carbon \nmonoxide, the simple light-driven proton pump bacteriorhodopsin, and the \nsuperoxide scavenger superoxide reductase. In conclusion, likely future \ndevelopments of these methods at synchrotron X-ray sources and the potential \nimpact of emerging X-ray free-electron laser facilities are speculated upon.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
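Since the model was trained with MatryoshkaLoss, a shorter embedding can be obtained by keeping only the leading dimensions of the full vector. The sketch below shows that truncation in plain NumPy, using random unit vectors as a stand-in for `model.encode(...)` output so that it runs without downloading the model; the helper name `truncate_and_renormalize` is hypothetical. Recent versions of Sentence Transformers also accept a `truncate_dim` argument when loading a model, which this sketch deliberately does not depend on.

```python
import numpy as np


def truncate_and_renormalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each embedding and rescale to unit norm."""
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)


# Stand-in for the (3, 768) array returned by model.encode(sentences).
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 768))
full /= np.linalg.norm(full, axis=1, keepdims=True)

# Truncate to one of the evaluated sizes (512, 256, 128, or 64).
small = truncate_and_renormalize(full, 256)

# With unit-norm rows, the matrix product gives pairwise cosine similarities.
similarities = small @ small.T
print(small.shape)
print(similarities.shape)
```

Renormalizing after truncation matters only when similarities are computed as raw dot products; a cosine similarity function normalizes internally.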
## Evaluation
### Metrics
#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
```json
{
"truncate_dim": 768
}
```
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7397 |
| cosine_accuracy@3 | 0.8472 |
| cosine_accuracy@5 | 0.8925 |
| cosine_accuracy@10 | 0.9293 |
| cosine_precision@1 | 0.7397 |
| cosine_precision@3 | 0.6058 |
| cosine_precision@5 | 0.5296 |
| cosine_precision@10 | 0.4110 |
| cosine_recall@1 | 0.2276 |
| cosine_recall@3 | 0.3939 |
| cosine_recall@5 | 0.4954 |
| cosine_recall@10 | 0.6262 |
| **cosine_ndcg@10** | **0.7037** |
| cosine_mrr@10 | 0.8042 |
| cosine_map@100 | 0.6500 |
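
For readers unfamiliar with these metrics, the sketch below computes accuracy@k, precision@k, recall@k, and the reciprocal rank for a single query, following the usual information-retrieval definitions. The function name and the document IDs are invented for illustration; the reported figures are averages of such per-query values over the evaluation set.

```python
def retrieval_metrics_at_k(ranked_doc_ids, relevant_ids, k):
    """Per-query retrieval metrics over the top-k ranked documents."""
    top_k = ranked_doc_ids[:k]
    hits = [doc_id in relevant_ids for doc_id in top_k]
    num_hits = sum(hits)

    # Reciprocal rank of the first relevant document within the top k (0 if none).
    reciprocal_rank = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / rank
            break

    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,  # at least one relevant doc in top k
        "precision": num_hits / k,                 # share of the top k that is relevant
        "recall": num_hits / len(relevant_ids),    # share of relevant docs retrieved
        "reciprocal_rank": reciprocal_rank,
    }


# Hypothetical ranking for one query with three relevant documents.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d8"}

metrics = retrieval_metrics_at_k(ranked, relevant, k=3)
print(metrics)
```

Under these definitions accuracy@1 and precision@1 always coincide, as they do in the tables here, and a recall@1 well below precision@1 is what one would expect when queries typically have several relevant passages.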
#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
```json
{
"truncate_dim": 512
}
```
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7327 |
| cosine_accuracy@3 | 0.8430 |
| cosine_accuracy@5 | 0.8883 |
| cosine_accuracy@10 | 0.9151 |
| cosine_precision@1 | 0.7327 |
| cosine_precision@3 | 0.5964 |
| cosine_precision@5 | 0.5279 |
| cosine_precision@10 | 0.4099 |
| cosine_recall@1 | 0.2192 |
| cosine_recall@3 | 0.3867 |
| cosine_recall@5 | 0.4915 |
| cosine_recall@10 | 0.6230 |
| **cosine_ndcg@10** | **0.6971** |
| cosine_mrr@10 | 0.7969 |
| cosine_map@100 | 0.6403 |
#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
```json
{
"truncate_dim": 256
}
```
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.7228 |
| cosine_accuracy@3 | 0.8373 |
| cosine_accuracy@5 | 0.8769 |
| cosine_accuracy@10 | 0.9109 |
| cosine_precision@1 | 0.7228 |
| cosine_precision@3 | 0.5893 |
| cosine_precision@5 | 0.5132 |
| cosine_precision@10 | 0.4048 |
| cosine_recall@1 | 0.2165 |
| cosine_recall@3 | 0.3844 |
| cosine_recall@5 | 0.4707 |
| cosine_recall@10 | 0.6082 |
| **cosine_ndcg@10** | **0.6857** |
| cosine_mrr@10 | 0.7889 |
| cosine_map@100 | 0.6255 |
#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
```json
{
"truncate_dim": 128
}
```
| Metric | Value |
|:--------------------|:----------|
| cosine_accuracy@1 | 0.7072 |
| cosine_accuracy@3 | 0.8076 |
| cosine_accuracy@5 | 0.8458 |
| cosine_accuracy@10 | 0.8967 |
| cosine_precision@1 | 0.7072 |
| cosine_precision@3 | 0.5606 |
| cosine_precision@5 | 0.4877 |
| cosine_precision@10 | 0.3819 |
| cosine_recall@1 | 0.2132 |
| cosine_recall@3 | 0.3572 |
| cosine_recall@5 | 0.4428 |
| cosine_recall@10 | 0.5764 |
| **cosine_ndcg@10** | **0.652** |
| cosine_mrr@10 | 0.7681 |
| cosine_map@100 | 0.5861 |
#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
```json
{
"truncate_dim": 64
}
```
| Metric | Value |
|:--------------------|:-----------|
| cosine_accuracy@1 | 0.6436 |
| cosine_accuracy@3 | 0.7666 |
| cosine_accuracy@5 | 0.8048 |
| cosine_accuracy@10 | 0.8416 |
| cosine_precision@1 | 0.6436 |
| cosine_precision@3 | 0.5116 |
| cosine_precision@5 | 0.4501 |
| cosine_precision@10 | 0.3511 |
| cosine_recall@1 | 0.1851 |
| cosine_recall@3 | 0.3181 |
| cosine_recall@5 | 0.3926 |
| cosine_recall@10 | 0.5118 |
| **cosine_ndcg@10** | **0.5894** |
| cosine_mrr@10 | 0.7115 |
| cosine_map@100 | 0.5197 |
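To help read the tables above, here is a minimal sketch of how the `@k` metrics can be computed for one query. It follows the standard textbook definitions with binary relevance; the function name `ir_metrics_at_k` and the document ids are hypothetical, and the evaluator's own code may differ in detail. The reported values are averages of such per-query scores.

```python
import math


def ir_metrics_at_k(ranked_ids, relevant_ids, k):
    """Compute accuracy, precision, recall, MRR and NDCG at cutoff k for one query."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant_ids else 0 for doc_id in top_k]

    accuracy = 1.0 if any(hits) else 0.0        # at least one relevant doc in the top k
    precision = sum(hits) / k                   # share of the top k that is relevant
    recall = sum(hits) / len(relevant_ids)      # share of relevant docs that were found

    # Reciprocal rank of the first relevant document (0 if none in the top k).
    mrr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            mrr = 1.0 / rank
            break

    # NDCG with binary gains: DCG divided by the DCG of an ideal ordering.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mrr": mrr, "ndcg": ndcg}


# Hypothetical query with four relevant documents, two of them ranked in the top 3.
scores = ir_metrics_at_k(
    ranked_ids=["d3", "d7", "d1", "d9", "d2"],
    relevant_ids={"d3", "d1", "d5", "d8"},
    k=3,
)
```

Under these definitions accuracy@1 and precision@1 are always equal, which matches every table above. Recall@1 sitting far below precision@1 is what one would expect when most queries have several relevant passages.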
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 40,482 training samples
* Columns: `anchor` and `positive`
* Approximate statistics based on the first 1000 samples:
| | anchor | positive |
|:--------|:-------|:---------|
| type | string | string |
| details | | |
* Samples:
| anchor | positive |
|:-------|:---------|
| What is the implication of histone lysine methylation in medulloblastoma? | Aberrant patterns of H3K4, H3K9, and H3K27 histone lysine methylation were shown to result in histone code alterations, which induce changes in gene expression, and affect the proliferation rate of cells in medulloblastoma. |
| What is the implication of histone lysine methylation in medulloblastoma? | Recent studies showed frequent mutations in histone H3 lysine 27 (H3K27) demethylases in medulloblastomas of Group 3 and Group 4, suggesting a role for H3K27 methylation in these tumors. Indeed, trimethylated H3K27 (H3K27me3) levels were shown to be higher in Group 3 and 4 tumors compared to WNT and SHH medulloblastomas, also in tumors without detectable mutations in demethylases. Here, we report that polycomb genes, required for H3K27 methylation, are consistently upregulated in Group 3 and 4 tumors. These tumors show high expression of the homeobox transcription factor OTX2. Silencing of OTX2 in D425 medulloblastoma cells resulted in downregulation of polycomb genes such as EZH2, EED, SUZ12 and RBBP4 and upregulation of H3K27 demethylases KDM6A, KDM6B, JARID2 and KDM7A. This was accompanied by decreased H3K27me3 and increased H3K27me1 levels in promoter regions. Strikingly, the decrease of H3K27me3 was most prominent in promoters that bind OTX2. OTX2-bound promoters showe... |
| What is the implication of histone lysine methylation in medulloblastoma? | We used high-resolution SNP genotyping to identify regions of genomic gain and loss in the genomes of 212 medulloblastomas, malignant pediatric brain tumors. We found focal amplifications of 15 known oncogenes and focal deletions of 20 known tumor suppressor genes (TSG), most not previously implicated in medulloblastoma. Notably, we identified previously unknown amplifications and homozygous deletions, including recurrent, mutually exclusive, highly focal genetic events in genes targeting histone lysine methylation, particularly that of histone 3, lysine 9 (H3K9). Post-translational modification of histone proteins is critical for regulation of gene expression, can participate in determination of stem cell fates and has been implicated in carcinogenesis. Consistent with our genetic data, restoration of expression of genes controlling H3K9 methylation greatly diminishes proliferation of medulloblastoma in vitro. Copy number aberrations of genes with critical roles in writing... |
* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
```json
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
```
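The configuration above wraps an in-batch ranking loss so that it is applied at each listed embedding width and the results are combined with the given weights. The NumPy sketch below illustrates that computation on random stand-in data. It is not the library code: the helper names are hypothetical, and the similarity scale of 20 is an assumption based on the usual default for `MultipleNegativesRankingLoss`.

```python
import numpy as np


def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy over scaled cosine similarities; row i's target is positive i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss on the first d dimensions, for each d."""
    return sum(
        w * in_batch_ranking_loss(anchors[:, :d], positives[:, :d])
        for d, w in zip(dims, weights)
    )


dims = [768, 512, 256, 128, 64]
weights = [1, 1, 1, 1, 1]

# Stand-in batch: each positive is a slightly perturbed copy of its anchor.
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))

matched = matryoshka_loss(anchors, positives, dims, weights)
# Shifting the positives by one row pairs every anchor with the wrong passage.
mismatched = matryoshka_loss(anchors, np.roll(positives, 1, axis=0), dims, weights)
```

Because every width contributes to the total, the leading dimensions are pushed to carry most of the ranking signal, which is what allows the truncated evaluations above to stay close to the full-width scores.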
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates
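
As a rough guide to what these settings imply, the arithmetic below estimates the effective batch size and the number of optimizer steps. It is a sketch under stated assumptions: a single device (`num_devices = 1` is not given in this card) and a final partial batch that is kept. The trainer's exact step count may differ slightly, especially with the `no_duplicates` sampler.

```python
import math

train_samples = 40_482
per_device_train_batch_size = 32
gradient_accumulation_steps = 16
num_train_epochs = 4
warmup_ratio = 0.1
num_devices = 1  # assumption: the device count is not stated in this card

# Samples that contribute to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Approximate number of optimizer updates per epoch and overall.
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

# Updates spent ramping the learning rate up before the cosine decay begins.
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

Note that gradient accumulation enlarges the update batch but not the pool of in-batch negatives: with a ranking loss computed per forward pass, each anchor is still contrasted only against the other passages in its own batch of 32.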
#### All Hyperparameters