@bigscience-workshop

BigScience Workshop

Research workshop on large language models - The Summer of Language Models 21

Popular repositories

  1. petals Public

    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

    Python · 9.7k stars · 566 forks

  2. promptsource Public

    Toolkit for creating, sharing and using natural language prompts.

    Python · 2.9k stars · 367 forks

  3. Megatron-DeepSpeed Public

    Ongoing research training transformer language models at scale, including: BERT & GPT-2

    Python · 1.4k stars · 226 forks

  4. bigscience Public

    Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

    Shell · 1k stars · 100 forks

  5. xmtf Public

    Crosslingual Generalization through Multitask Finetuning

    Jupyter Notebook · 537 stars · 41 forks

  6. biomedical Public

    Tools for curating biomedical training data for large-scale language modeling

    Python · 481 stars · 117 forks

Repositories

Showing 10 of 36 repositories
  • data_tooling Public

    Tools for managing datasets for governance and training.

    HTML · 85 stars · Apache-2.0 · 46 forks · 138 open issues (2 need help) · 3 pull requests · Updated May 27, 2025
  • ShadesofBias Public

    Evaluation for Shades of Bias in Text

    HTML · 6 stars · 0 forks · 1 open issue · 0 pull requests · Updated Apr 24, 2025
  • biomedical Public

    Tools for curating biomedical training data for large-scale language modeling

    Python · 481 stars · 117 forks · 163 open issues (6 need help) · 16 pull requests · Updated Dec 10, 2024
  • xmtf Public

    Crosslingual Generalization through Multitask Finetuning

    Jupyter Notebook · 537 stars · Apache-2.0 · 41 forks · 11 open issues · 0 pull requests · Updated Sep 22, 2024
  • petals Public

    🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

    Python · 9,728 stars · MIT · 566 forks · 92 open issues (9 need help) · 20 pull requests · Updated Sep 7, 2024
  • bigscience Public

    Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.

    Shell · 1,005 stars · 100 forks · 13 open issues · 7 pull requests · Updated Jul 29, 2024
  • Megatron-DeepSpeed Public

    Ongoing research training transformer language models at scale, including: BERT & GPT-2

    Python · 1,403 stars · 226 forks · 76 open issues (10 need help) · 47 pull requests · Updated Mar 20, 2024
  • multilingual-modeling Public

    BLOOM+1: Adapting BLOOM model to support a new unseen language

    Python · 73 stars · Apache-2.0 · 15 forks · 13 open issues · 6 pull requests · Updated Mar 2, 2024
  • promptsource Public

    Toolkit for creating, sharing and using natural language prompts.

    Python · 2,906 stars · Apache-2.0 · 367 forks · 11 open issues · 32 pull requests · Updated Oct 24, 2023
  • massive-probing-framework Public (forked from AIRI-Institute/Probing_framework)

    Framework for BLOOM probing

    Python · 8 stars · 8 forks · 0 open issues · 0 pull requests · Updated Oct 18, 2023