ABSILVER's World

January 08, 2025

Company	Model/Technology	Advancements/Improvements

Google	Gemini Ultra	Multimodal integration across Google products. Enhanced user experience with more advanced AI capabilities.
	Gemma	Smaller, more efficient model with 7 billion parameters trained on 6 trillion tokens. Improved efficiency for various tasks.
IBM	Granite Code Models	Introduced code generation and debugging capabilities. Open-source, Apache 2.0 license for greater accessibility.
Meta	Llama 3.1	Released a 405-billion-parameter model that Meta described as the largest openly available foundation model at the time. Claimed to outperform GPT-4o and Anthropic’s Claude 3.5 Sonnet on several benchmarks.
	Llama 3.2	Introduced multimodal capabilities, allowing processing of text and images. Suitable for applications in robotics, virtual reality, and AI agents.
	Llama 3.3	Enhanced efficiency, delivering performance comparable to Llama 3.1 405B at a lower cost. Text-only model with 70 billion parameters.
	Movie Gen	AI tool that can generate videos up to 16 seconds long from text prompts. Meta reports that it outperforms similar models from competitors in human evaluations.
	Motivo AI Model	Designed to control the movement of human-like digital agents in the Metaverse. Enhances immersive experiences by improving interaction within virtual environments.
NVIDIA	NVLM 1.0	Released a family of open-source multimodal large language models, with a flagship version featuring 72 billion parameters. Designed to improve text-only performance after multimodal training.
	Nemotron-4 340B	Introduced the Nemotron-4 340B model family, including Base, Instruct, and Reward models. Open access under the NVIDIA Open Model License Agreement, allowing distribution, modification, and use of the models and their outputs.
	Project GR00T	Unveiled Project GR00T, a general-purpose multimodal generative AI model designed specifically for training humanoid robots.
	Blackwell Architecture	Announced the Blackwell architecture with B100 and B200 data center accelerators. Positioned as a processor for the generative AI era; the GB200 superchip pairs Blackwell GPUs with NVIDIA's Arm-based Grace CPU.
	Cosmos AI Model	Launched Cosmos, a family of AI models that can generate images and 3D models to train humanoid robots, industrial robots, and self-driving cars. Trained on extensive footage of human activities to help robots understand the physical world better.
	Fugatto AI Audio Generator	Introduced Fugatto, an AI audio generator capable of creating unprecedented sounds, such as a trumpet that meows. It can generate music, unique sound effects, and speech alterations based on text and audio inputs it wasn't specifically trained on.
OpenAI	GPT-4o	Multimodal capabilities (text, image, audio). Faster performance with reduced costs compared to GPT-4 Turbo.
	GPT-4o Mini	Compact version of GPT-4o. Cost-effective, aimed at startups and businesses needing efficient integration.
	o1 Model	Reasoning model designed to spend more time thinking before it responds. Focused on moving beyond traditional prediction-based AI models.
	o3 Model	Successor to the o1 model, with enhanced reasoning capabilities. Designed to deliberate over questions, improving performance on complex tasks such as coding, mathematics, and science. Uses a "private chain of thought" for step-by-step logical reasoning. Announced in two versions: o3 and o3-mini, with o3-mini offering adjustable reasoning time settings.
