SSLectures: Abstractive Summaries and Topic Segments of Lecture Videos

Bibliographic Details
Main authors: Alesh, Yaser Haitham; Abdulghani, Osama; Al Ali, Omar Ibrahim; Aoudia, Meriem; Abu Talib, Dr. Manar
Format: Dataset
Language: eng
Online access: Order full text
Description
Summary: SSLectures is a dataset containing abstractive summaries of lecture videos from the AK Lectures website and the MIT OCW repository. It also contains topic segments (chapters) for the MIT lectures. The dataset was scraped from free, publicly available material and is published under a Creative Commons license that allows redistribution and reuse.

The dataset is split into 3 files, explained below:

mit_chapters_summarized.csv: Contains the transcript and other details of 14.8k chapters (segments) from the MIT lectures, along with abstractive summaries generated with GPT-3.5. Each row is one chapter from one lecture video. Suitable for training summarization models to summarize parts of lecture videos (not full lectures).

ak_lectures_summarized.csv: Contains the transcript and other details of 1.8k lecture videos from aklectures.com. Each lecture video comes with the abstractive summary that was published on the website. Most videos in this dataset are short, typically between 5 and 15 minutes. Suitable for training summarization models to summarize full short lecture videos (about 15 minutes long for most).

mit_videos_all_courses_segmentations.csv: Contains details of the chaptering (segmentation) of each lecture video from MIT. Each row is one lecture video and comes with the timing (end times) and title of each chapter in the video. Suitable for training and/or evaluating segmentation algorithms and models for both short and long lecture videos.

Please cite this page if you use this dataset in your research or in other projects.

Copyright Notice: All rights to the lecture videos, the transcripts that have been scraped, the chapters and titles, the human-written summaries, and all other related details belong to the respective owners of the MIT OCW or the AK Lectures websites. Our work here is for research and educational purposes.
DOI:10.5281/zenodo.10498679
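
The segmentation file is described as giving, for each MIT lecture video, the end times and titles of its chapters. A minimal sketch of how such a record could be turned into explicit (start, end, title) segments is shown below. The function name, the use of seconds as the time unit, the assumption that the first chapter starts at 0, and the sample values are all illustrative assumptions; the record does not state the CSV column names or how the end times are encoded, so parsing the file itself is left out.

```python
from typing import List, Tuple


def end_times_to_segments(
    end_times: List[float], titles: List[str]
) -> List[Tuple[float, float, str]]:
    """Turn per-chapter end times into (start, end, title) segments.

    Hypothetical helper: assumes times are in seconds and that the
    first chapter starts at the beginning of the video (t = 0).
    """
    if len(end_times) != len(titles):
        raise ValueError("end_times and titles must have the same length")

    segments = []
    start = 0.0  # assumption: the first chapter starts at time 0
    for end, title in zip(end_times, titles):
        if end <= start:
            raise ValueError("end times must be strictly increasing")
        segments.append((start, float(end), title))
        start = float(end)  # the next chapter starts where this one ends
    return segments


# Made-up values for illustration only, not taken from the dataset.
example = end_times_to_segments(
    [120.0, 300.5, 610.0],
    ["Introduction", "Definitions", "Worked example"],
)
for segment in example:
    print(segment)
```

Representing each chapter as an explicit interval makes it straightforward to slice a transcript by time or to compare predicted chapter boundaries against the published ones when evaluating a segmentation model.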