| Company | Model/Technology | Advancements/Improvements |
| --- | --- | --- |
| Google | Gemini Ultra | Multimodal integration across Google products. Enhanced user experience with more advanced AI capabilities. |
| Google | Gemma | Smaller, more efficient model with 7 billion parameters, trained on 6 trillion tokens. Improved efficiency for various tasks. |
| IBM | Granite Code Models | Introduced code generation and debugging capabilities. Open source under the Apache 2.0 license for greater accessibility. |
| Meta | Llama 3.1 | Released what was then the largest open-source AI model, with 405 billion parameters. Meta claimed it outperforms GPT-4o and Anthropic’s Claude 3.5 Sonnet on several benchmarks. |
| Meta | Llama 3.2 | Introduced multimodal capabilities, allowing processing of text and images. Suitable for applications in robotics, virtual reality, and AI agents. |
| Meta | Llama 3.3 | Improved efficiency, delivering performance comparable to Llama 3.1 405B at a lower cost. Text-only model with 70 billion parameters. |
| Meta | Movie Gen | AI tool that generates videos up to 16 seconds long from text prompts. Reported to outperform similar models from competitors in human evaluations. |
| Meta | Motivo AI Model | Designed to control the movement of human-like digital agents in the Metaverse. Aims to make virtual environments more immersive by improving how agents interact within them. |
| NVIDIA | NVLM 1.0 | Released a family of open-source multimodal large language models, with a flagship version featuring 72 billion parameters. Designed to improve text-only performance after multimodal training. |
| NVIDIA | Nemotron-4 340B | Introduced a model family comprising Base, Instruct, and Reward models. Open access under the NVIDIA Open Model License Agreement, which allows distribution, modification, and use of the models and their outputs. |
| NVIDIA | Project GR00T | Unveiled a general-purpose multimodal generative AI model designed for training humanoid robots. |
| NVIDIA | Blackwell Architecture | Announced with the B100 and B200 data center accelerators. Promoted as a processor for the generative AI era, with the accelerators paired with NVIDIA's Arm-based Grace CPU. |
| NVIDIA | Cosmos AI Model | Launched a family of AI models that can generate images and 3D models to train humanoid robots, industrial robots, and self-driving cars. Trained on extensive footage of human activities to help robots better understand the physical world. |
| NVIDIA | Fugatto AI Audio Generator | Introduced an AI audio generator capable of creating sounds never heard before, such as a trumpet that meows. It can generate music, sound effects, and altered speech from text and audio inputs, including ones it wasn't specifically trained on. |
| OpenAI | GPT-4o | Multimodal capabilities (text, image, audio). Faster and cheaper than GPT-4 Turbo. |
| OpenAI | GPT-4o mini | Compact version of GPT-4o. Cost-effective, aimed at startups and businesses needing efficient integration. |
| OpenAI | o1 Model | Reasoning model that works through a problem step by step before answering. Presented as a move beyond traditional prediction-based AI models. |
| OpenAI | o3 Model | Successor to the o1 model, with enhanced reasoning capabilities. Designed to deliberate over questions, improving performance on complex tasks such as coding, mathematics, and science. Introduces a "private chain of thought" for step-by-step logical reasoning. Available in two versions, o3 and o3-mini, with o3-mini offering adjustable reasoning time settings. |
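To put the parameter counts in the table above into perspective, here is a rough back-of-the-envelope sketch of how much memory the weights alone would need. The parameter counts come from the table; the precisions (fp16, int8, int4) and their bytes-per-parameter values are my own illustrative assumptions, since the table doesn't say how any of these models are stored or served. The estimate also ignores activations, caches, and any other runtime overhead, so treat it as a lower bound rather than a hardware requirement.

```python
# Parameter counts as stated in the table above (in billions).
PARAMS_BILLIONS = {
    "Gemma": 7,
    "Llama 3.3": 70,
    "NVLM 1.0": 72,
    "Nemotron-4 340B": 340,
    "Llama 3.1": 405,
}

# Assumed storage cost per parameter. These precisions are illustrative
# assumptions and are not taken from the table.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9


def footprint_table() -> dict[str, dict[str, float]]:
    """Weight-only memory estimate for every model at every assumed precision."""
    return {
        name: {
            precision: weight_memory_gb(params, nbytes)
            for precision, nbytes in BYTES_PER_PARAM.items()
        }
        for name, params in PARAMS_BILLIONS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{prec}: {gb:.1f} GB" for prec, gb in row.items())
        print(f"{name:<16} {cells}")
```

Under these assumptions, a 70-billion-parameter model needs roughly 140 GB for its weights at fp16, against roughly 810 GB for a 405-billion-parameter one, which gives a feel for why a smaller model with comparable performance is cheaper to run.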
