distilbert-base-uncased
by distilbert
- Language: en
- Tags: exbert
- License: apache-2.0
- Datasets: bookcorpus, wikipedia
About
This model is a distilled version of the BERT base model. It was introduced in the paper "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter" (Sanh et al., 2019), and the code for the distillation process is available in the Hugging Face transformers repository. This model is uncased: it does not make a difference between english and English. DistilBERT is a transformers model, smaller and faster than BERT, which was pretrained on the same corpus in a self-supervised fashion, using the BERT base model as a teacher. ...
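As a quick illustration of how a model like this is typically used, here is a minimal sketch, assuming the Hugging Face transformers library and PyTorch are installed; the example sentences are illustrative only.

```python
from transformers import pipeline, AutoTokenizer, AutoModel

# Masked-language-model inference: predict the token hidden behind [MASK].
unmasker = pipeline("fill-mask", model="distilbert-base-uncased")
print(unmasker("Hello, I'm a [MASK] model."))

# Feature extraction: obtain contextual embeddings for downstream tasks.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")
inputs = tokenizer("DistilBERT is smaller and faster than BERT.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```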
Limitations & Considerations
- Benchmark scores may vary based on evaluation methodology and hardware configuration.
- VRAM requirements are estimates; actual usage depends on quantization and batch size (see the sketch after this list).
- FNI scores are relative rankings and may change as new models are added.
- Data source: Hugging Face (https://huggingface.co/distilbert/distilbert-base-uncased), fetched 2025-12-18.
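To make the VRAM caveat above concrete, here is a back-of-the-envelope sketch. The roughly 66M parameter count is an assumption based on the commonly cited size of distilbert-base-uncased, and the estimate covers weights only; activations, optimizer state, and framework overhead come on top.

```python
# Rough weight-memory estimate for distilbert-base-uncased (assumed ~66M parameters).
# Actual VRAM is higher: activations grow with batch size and sequence length,
# and fine-tuning adds gradients and optimizer state.
NUM_PARAMS = 66_000_000
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def weight_memory_mib(num_params: int, dtype: str) -> float:
    """Memory needed just to store the weights at the given precision, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_mib(NUM_PARAMS, dtype):.0f} MiB")
```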
Related Resources
Related Papers
No related papers linked yet. Check the model's official documentation for research papers.
Training Datasets
The model card metadata lists bookcorpus and wikipedia as training datasets; refer to the original model card for details.
Related Models
No related models are listed on this page.