I was thrilled to stumble upon Gemma today - Google's fresh release in the world of AI! Gemma comes in two sizes - Gemma 2B and Gemma 7B - each available as pre-trained and instruction-tuned models on Kaggle and Hugging Face. It's all set for you to explore, with support for different frameworks and hardware platforms. Let's dive in together!
Here is a quick overview of Gemma specs:
- Gemma: Open language models by Google DeepMind.
- Includes 2B (trained on 2T tokens) and 7B (trained on 6T tokens) models.
- Outperforms Llama 2 7B and Mistral 7B on a majority of reported benchmarks.
- Architecture: Transformer decoder with enhancements.
- Trained with a context length of 8192 tokens.
- Trained on web docs, math, code; not explicitly multilingual or multimodal.
- Vocabulary: 256K tokens; uses a subset of the SentencePiece tokenizer, with byte-level encodings as a fallback for unknown tokens.
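To make the last bullet concrete, here is a minimal sketch of how byte-level fallback works in a SentencePiece-style tokenizer: known pieces are matched greedily, and any character not covered by the vocabulary is split into its UTF-8 bytes, each emitted as a `<0xNN>` token. The toy vocabulary and function name below are my own illustrations, not Gemma's actual tokenizer internals.

```python
def tokenize_with_byte_fallback(text, vocab):
    """Greedy longest-match tokenization with byte-level fallback.

    Any character that cannot be matched against the vocabulary is
    encoded as one <0xNN> token per UTF-8 byte, so no input is ever
    mapped to a lossy "unknown" token.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary match starting at position i.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No match: fall back to the UTF-8 bytes of this character.
            for b in text[i].encode("utf-8"):
                tokens.append(f"<0x{b:02X}>")
            i += 1
    return tokens


toy_vocab = {"hel", "lo", " "}
# The snowman character is not in the toy vocab, so it becomes byte tokens.
print(tokenize_with_byte_fallback("hello \u2603", toy_vocab))
# → ['hel', 'lo', ' ', '<0xE2>', '<0x98>', '<0x83>']
```

Because every byte has a dedicated token, this scheme can round-trip arbitrary text - a big part of why a 256K vocabulary can cover web documents, math, and code without special-casing rare symbols.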