    apd-core - NaturalLanguage section

    A curated, Awesome-style sub-collection within the APD (Awesome Public Datasets) core repository that indexes multiple high-quality natural language datasets and lexical resources (e.g., SQuAD, Universal Dependencies, WordNet) via individual YAML meta files. It serves as a meta directory of links to external NLP datasets, following the directory-of-resources pattern common across the Awesome ecosystem.

    Information

    Website: github.com
    Published: Dec 30, 2025

    Categories

    Meta Directories

    Tags

    #datasets #nlp #directory-of-directories

    Similar Products

    All Awesome Lists

    GitHub’s topic index page listing all public repositories tagged with the `awesome` topic, effectively serving as a central directory of Awesome lists across domains.

    Featured

    awesome-awesome-awesome-awesome

    GitHub repository that serves as a curated meta-list collecting multiple "awesome" lists of other awesome lists, effectively a directory of meta awesome directories.

    awesome-cn

    "awesome-cn" is a Chinese-language meta collection of curated "awesome" lists spanning programming languages, frameworks, and learning resources. Maintained in Python, it aggregates links to numerous topic-specific awesome lists (e.g., awesome-go, awesome-python, awesome-vue, awesome-javascript), providing a centralized entry point for Chinese developers looking for high-quality curated resources.

    AI Directories

    AI Directories compiles all AI-related directories in one place, serving as a meta-directory specifically for the AI sector.

    Awesome 3D Semantic City Models

    A curated awesome-style list of open 3D semantic city and region models (e.g., CityGML datasets), providing a centralized directory of high-quality 3D urban data sources.

    Featured

    Awesome Data - Biology Datasets (Meta)

    A curated Awesome-style collection of biological and genomics datasets, including ENCODE, EMPIAR, Ensembl Genomes, GEO, Gene Ontology, GloBI, LINCS, HGDP, HMP, ICOS PSP Benchmark, HapMap, JCB DataViewer (via BioStudies), and KEGG. Each entry links out to the primary dataset resource along with a corresponding YAML metadata file in the awesomedata/apd-core GitHub repository, making this part of a larger meta collection of Awesome data directories.

    Featured

    apd-core – NaturalLanguage Section

    Category: Meta Directories
    Tags: datasets, nlp, directory-of-directories
    Source: GitHub – awesomedata/apd-core (NaturalLanguage)


    Overview

    The NaturalLanguage section of the APD (Awesome Public Datasets) core repository is a curated, Awesome-style sub-collection focused on natural language processing (NLP) datasets and lexical resources. Instead of hosting datasets directly, it acts as a meta directory that indexes multiple high‑quality external NLP resources through individual YAML metadata files.

    Examples of referenced resources include:

    • Question answering datasets (e.g., SQuAD)
    • Syntactic and morphosyntactic corpora (e.g., Universal Dependencies)
    • Lexical databases (e.g., WordNet)

    This section follows the broader Awesome ecosystem pattern of providing a structured directory-of-resources to help users discover and navigate NLP datasets.
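As a rough illustration of the "one YAML meta file per resource" layout, the sketch below lists the meta files in a local clone. It assumes the repository has already been cloned, that entries sit under `core/NaturalLanguage`, and that files use a `.yml` or `.yaml` extension; the function name `list_meta_files` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_meta_files(repo_root, section="NaturalLanguage"):
    """Return sorted YAML meta file paths under core/<section> of a local clone.

    Assumption: each resource is one .yml/.yaml file directly inside the
    section directory; the real repository layout may differ.
    """
    section_dir = Path(repo_root) / "core" / section
    if not section_dir.is_dir():
        return []  # section missing or repo_root is not a clone
    return sorted(
        p for p in section_dir.iterdir()
        if p.is_file() and p.suffix in {".yml", ".yaml"}
    )
```

Calling `list_meta_files("path/to/apd-core")` would return one path per indexed resource, which can then be fed to whatever YAML tooling is preferred.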


    Features

    • Curated NLP Dataset Index
      A focused list of public natural language datasets and lexical resources, curated to highlight widely used, high-quality sources.

    • YAML-based Metadata Files
      Each dataset or resource has its own YAML meta file with structured fields (e.g., name, description, links, and possibly license and modality), which makes the index machine-readable and easier to integrate with tooling.

    • Meta Directory (Directory-of-Directories Pattern)
      Functions as a directory of external resources, not as a data host:

      • Links out to canonical dataset homepages or repositories.
      • Aligns with Awesome-style lists and the broader Awesome Public Datasets (APD) ecosystem.
    • Coverage of Multiple NLP Resource Types
      Includes various categories such as:

      • Question answering datasets (e.g., SQuAD)
      • Parsed corpora / treebanks (e.g., Universal Dependencies)
      • Lexical/semantic resources (e.g., WordNet)
      • Other written/spoken language datasets relevant to NLP research and applications.
    • Integration with APD Core Structure
      Lives under core/NaturalLanguage in the apd-core repo, benefiting from:

      • Shared conventions with other APD sub-collections.
      • Consistent metadata format across domains.
    • Open, Git-based Contribution Model
      As a GitHub-hosted collection, it can be extended via pull requests:

      • New YAML entries can be added for additional datasets.
      • Existing metadata can be updated or corrected collaboratively.
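To make the metadata and contribution points above more concrete, here is a minimal sketch of reading one meta entry and checking it before opening a pull request. It is illustrative only: the field names (`name`, `description`, `homepage`) and the placeholder URL are assumptions rather than the repository's actual schema, and the tiny parser handles only flat `key: value` lines as a stand-in for a real YAML library.

```python
# Hypothetical required fields; the actual apd-core schema may use other names.
REQUIRED_FIELDS = ("name", "description", "homepage")


def parse_flat_meta(text):
    """Parse top-level 'key: value' lines into a dict.

    A stand-in for a full YAML parser: nested mappings, lists, and
    multi-line values are skipped rather than interpreted.
    """
    meta = {}
    for raw in text.splitlines():
        # Skip blanks, comments, document markers, and indented/nested lines.
        if not raw.strip() or raw[0].isspace() or raw.startswith(("#", "-")):
            continue
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # not a key/value line
        meta[key.strip()] = value.strip().strip("'\"")
    return meta


def missing_fields(meta, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty."""
    return [field for field in required if not meta.get(field)]


# Example entry with placeholder values (not copied from the repository).
sample = """---
name: SQuAD
description: 'Question answering dataset'
homepage: https://example.org/squad
"""
entry = parse_flat_meta(sample)
problems = missing_fields(entry)
```

A check like `missing_fields` could run locally before submitting a new entry; for real files, a full YAML parser would be the safer choice.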

    Pricing

    • Free
      • Public GitHub repository.
      • Free to browse, clone, and use the metadata and links to external datasets (subject to each dataset’s own license and access terms).