Transmission electron microscopy (TEM) image datasets of peptide / protein nanowire morphologies

Bibliographic details
Main authors: Shizhao Lu, Brian Montz, Todd Emrick, Arthi Jayaraman
Format: Dataset
Language: English
Description
Summary: TEM image dataset containing four nanowire morphologies of bio-derived protein nanowires and synthetic peptide nanowires. The peptide / protein nanowires used in this study were synthesized and imaged by Brian Montz in Prof. Todd Emrick's research group at the Department of Polymer Science and Engineering, University of Massachusetts Amherst. We acknowledge financial support from the U.S. National Science Foundation, Grants NSF DMREF #1921839 and DMREF #1921871.

Nanowires were classified into one of four morphologies: bundle, singular, dispersed, or network. Each morphology contains 100 images (jpg files). Because the dispersed and network morphologies are harder to distinguish visually, we created manual segmentation labels of the nanowires for these two morphologies (included in their folders as png files). Percolation analysis was performed on the manually segmented nanowires to provide a quantitative metric of whether the nanowires form a network in the image.

seg_mask_5_resolutions.zip contains ground-truth 2D binary encodings of the segmented nanowires at 5 resolutions.

encoders_trained_with_optimized_hyperparameter.zip contains 4 sets of encoders trained with either the SimCLR or Barlow-Twins self-supervised method on either generic TEM images or generic everyday photographic images (each with 5 replicates using different random seeds), with optimized hyperparameters.

Open-access datasets used during self-supervised training:
2021-CEM500K.zip contains 10,000 images that were used as "generic TEM images" to train the encoders with self-supervised methods; these are a random selection from the CEM500k open-access dataset. DOI: 10.7554/eLife.65894
2022-1000-ImageNet.zip contains 1,000 images from the ImageNet1k dataset, each from a different category. DOI: 10.1007/s11263-015-0816-y

Open-access datasets that our machine learning workflow has been applied to:
2022-AutoDetect-mNP-morphology.zip contains selected TEM images of nanoparticles categorized into 3 morphologies from the AutoDetect-mNP datasets. DOI: 10.6078/D1WT44 and DOI: 10.6078/D1S12H
2021-TEM virus.zip contains TEM images of 9 types of viruses from the TEM virus dataset. Matuszewski, Damian; Sintorn, Ida-Maria (2021), “TEM virus dataset”, Mendeley Data, V3, DOI: 10.17632/x4dwwfwtw3.3

The official GitHub page of the implementation of the machine learning models is semi-supervised_learning_microscopy_images. If you use the dataset or the codes in the
DOI: 10.5281/zenodo.6377140
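
The summary mentions a percolation analysis on the manually segmented nanowire masks but does not state the exact criterion. As a rough illustration only, the sketch below assumes one common definition: a binary mask "percolates" if foreground pixels form an 8-connected path from one image edge to the opposite edge. The function name `percolates`, the choice of 8-connectivity, and the toy masks are assumptions for illustration and are not taken from the dataset or its accompanying code.

```python
from collections import deque


def percolates(mask, axis="horizontal"):
    """Return True if truthy pixels form an 8-connected path spanning the mask.

    mask: list of rows (lists) of 0/1 values, e.g. a segmentation mask.
    axis: "horizontal" checks left edge -> right edge,
          "vertical" checks top edge -> bottom edge.
    ASSUMPTION: edge-to-edge spanning with 8-connectivity is only one
    possible percolation criterion; the dataset authors may use another.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    if rows == 0 or cols == 0:
        return False

    if axis == "horizontal":
        starts = [(r, 0) for r in range(rows) if mask[r][0]]
        at_far_edge = lambda r, c: c == cols - 1
    elif axis == "vertical":
        starts = [(0, c) for c in range(cols) if mask[0][c]]
        at_far_edge = lambda r, c: r == rows - 1
    else:
        raise ValueError("axis must be 'horizontal' or 'vertical'")

    # Breadth-first search from every foreground pixel on the starting edge.
    seen = set(starts)
    queue = deque(starts)
    while queue:
        r, c = queue.popleft()
        if at_far_edge(r, c):
            return True
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols:
                    if mask[nr][nc] and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
    return False


# Toy masks (hypothetical, not from the dataset).
network_like = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
dispersed_like = [
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]
```

Under this assumed criterion, a mask from the network folder would be expected to span the image in at least one direction, while a mask from the dispersed folder would not. Real png masks would first need to be loaded and thresholded to 0/1 values.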