SSLectures: Abstractive Summaries and Topic Segments of Lecture Videos
Main authors: | , , , , |
---|---|
Format: | Dataset |
Language: | eng |
Summary: | SSLectures: Abstractive Summaries and Topic Segments of Lecture Videos
SSLectures is a dataset containing abstractive summaries of lecture videos from the AK Lectures website and the MIT OCW repository. It also contains topic segments (chapters) for the MIT lectures. The dataset was scraped from free, publicly available material and is published under a Creative Commons license that allows redistribution and reuse.
The dataset is split into three files, described below:
mit_chapters_summarized.csv: Contains the transcript and other details of 14.8K chapters (segments) from the MIT lectures, along with abstractive summaries generated with GPT-3.5. Each row is one chapter from one lecture video. Suitable for training summarization models on parts of lecture videos (not full lectures).
ak_lectures_summarized.csv: Contains the transcript and other details of 1.8K lecture videos from aklectures.com. Each lecture video comes with the abstractive summary that was published on the website. Most videos in this dataset are short, typically between 5 and 15 minutes long. Suitable for training summarization models on full short lecture videos.
mit_videos_all_courses_segmentations.csv: Contains details of the chaptering (segmentation) of each MIT lecture video. Each row corresponds to one lecture video and gives the timing (end time) and title of each chapter in that video. Suitable for training and/or evaluating segmentation algorithms and models on both short and long lecture videos.
Please cite this page if you use this dataset in your research or in other projects.
Copyright Notice: All rights to the lecture videos, the transcripts that have been scraped, the chapters and titles, the human-written summaries, and all other related details belong to the respective owners of the MIT OCW or the AK Lectures websites. Our work here is for research and educational purposes. |
---|---|
DOI: | 10.5281/zenodo.10498679 |
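
The summary above says the segmentation file stores only the end time and title of each chapter. A minimal sketch of turning that into explicit (start, end, title) segments might look like the following. The function name `end_times_to_segments` is hypothetical, and the sketch assumes end times are given in seconds in ascending order, with the first chapter starting at 0. This record does not state the actual column names or time encoding of the CSV, so check them against the file before use.

```python
from typing import List, Tuple


def end_times_to_segments(
    end_times: List[float], titles: List[str]
) -> List[Tuple[float, float, str]]:
    """Turn per-chapter end times into (start, end, title) segments.

    Assumption: end times are seconds from the start of the video,
    sorted in ascending order, and the first chapter starts at 0.
    """
    if len(end_times) != len(titles):
        raise ValueError("end_times and titles must have the same length")
    segments = []
    start = 0.0
    for end, title in zip(end_times, titles):
        if end < start:
            raise ValueError("end times must be non-decreasing")
        # Each chapter starts where the previous one ended.
        segments.append((start, float(end), title))
        start = float(end)
    return segments


# Made-up values for illustration only, not taken from the dataset.
example = end_times_to_segments([90, 300, 615], ["Intro", "Main idea", "Recap"])
for seg_start, seg_end, seg_title in example:
    print(f"{seg_start:7.1f} - {seg_end:7.1f}  {seg_title}")
```

Keeping the conversion as a pure function, separate from any file reading, makes it easy to adapt once the real column layout of mit_videos_all_courses_segmentations.csv is known.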