I was thrilled to stumble upon Gemma today - Google's fresh release in the world of AI! Gemma comes in two sizes - Gemma 2B and Gemma 7B - each available as pre-trained and instruction-tuned models on Kaggle and Hugging Face. It's all set for you to explore, with support for different frameworks and hardware platforms. Let's dive in together!
Here is a quick overview of Gemma specs:
- Gemma: Open language models by Google DeepMind.
- Includes 2B (trained on 2T tokens) and 7B (trained on 6T tokens) models.
- Outperforms Llama 2 7B and Mistral 7B on a majority of reported benchmarks.
- Architecture: Transformer decoder with enhancements.
- Trained with a context length of 8192 tokens.
- Trained on web docs, math, code; not explicitly multilingual or multimodal.
- Vocabulary: 256K tokens; uses a subset of the SentencePiece tokenizer, with byte-level encodings as a fallback for unknown tokens.
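To make the last bullet concrete, here is a minimal sketch of how byte-level fallback works in a SentencePiece-style tokenizer: known pieces are matched greedily, and any character not covered by the vocabulary is split into its UTF-8 bytes, each emitted as a `<0xNN>` token. The toy vocabulary and function name below are my own illustrations, not Gemma's actual tokenizer internals.

```python
def tokenize_with_byte_fallback(text, vocab):
    """Greedy longest-match tokenization with byte-level fallback.

    Any character that cannot be matched against the vocabulary is
    encoded as one <0xNN> token per UTF-8 byte, so no input is ever
    mapped to a lossy "unknown" token.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary match starting at position i.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No match: fall back to the UTF-8 bytes of this character.
            for b in text[i].encode("utf-8"):
                tokens.append(f"<0x{b:02X}>")
            i += 1
    return tokens


toy_vocab = {"hel", "lo", " "}
# The snowman character is not in the toy vocab, so it becomes byte tokens.
print(tokenize_with_byte_fallback("hello \u2603", toy_vocab))
# → ['hel', 'lo', ' ', '<0xE2>', '<0x98>', '<0x83>']
```

Because every byte has a dedicated token, this scheme can round-trip arbitrary text - a big part of why a 256K vocabulary can cover web documents, math, and code without special-casing rare symbols.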