Public Dataset Archive 17
Natural Language
A Free Public Dataset
High-quality, verifiable public dataset #17, tailored for machine learning pre-training.
License: CC-BY-4.0
Format: Parquet
Dataset Size: 13 GB
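
Because the archive is distributed as Parquet, a quick structural check after download can catch truncated or mislabeled files before they reach a training pipeline. The sketch below is a minimal illustration using only the Python standard library. It relies on the Parquet layout, in which an unencrypted file begins and ends with the 4-byte marker `PAR1` and stores the footer length as a little-endian 32-bit integer just before the trailing marker. The function name `parquet_footer_length` and the file name in the usage note are hypothetical; this page does not specify how the dataset's files are named or split.

```python
import struct
from pathlib import Path

# 4-byte marker at both ends of an unencrypted Parquet file.
# (Files with an encrypted footer use a different marker and are
# reported as "not Parquet" by this sketch.)
PARQUET_MAGIC = b"PAR1"


def parquet_footer_length(path):
    """Return the footer metadata length in bytes, or None if the file
    does not have the outer structure of a Parquet file.

    This only checks the framing bytes; it does not parse the footer
    or validate the column data.
    """
    size = Path(path).stat().st_size
    # Smallest possible layout: leading magic + footer length + trailing magic.
    if size < 12:
        return None
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, 2)  # last 8 bytes: uint32 footer length, then magic
        tail = f.read(8)
    if head != PARQUET_MAGIC or tail[4:] != PARQUET_MAGIC:
        return None
    (footer_len,) = struct.unpack("<I", tail[:4])
    # The footer must fit between the leading magic and the 8-byte trailer.
    if footer_len > size - 12:
        return None
    return footer_len
```

For example, calling `parquet_footer_length("dataset-17.parquet")` on a downloaded file (hypothetical name) returns an integer for a structurally plausible Parquet file and `None` otherwise. A full read with a Parquet library is still needed to confirm that the contents are intact.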