The rumor mill is busy with speculation about the GPT-5 launch, expected in early August, so let’s see what the next week or two will bring. Meanwhile, the money train in AI is not stopping, both in the war for talent and with the new OpenAI round valuing the company at $300 billion.
News
In addition, we saw the introduction of new features:
Google released Gemini 2.5 Deep Think, currently only for Google AI Ultra subscribers, and also released Gemini Embedding.
OpenAI added Study Mode to ChatGPT, a mode focused on Socratic learning that tries to guide you along your learning journey.
And we again have a giant model release out of China, this time GLM-4.5 by z.AI. Through RL-based post-training, the gap with the leading models has been closed even further.
Articles
Context lengths have been improving for LLMs, with the Llama 4 model family boasting 10 million tokens. But how good are LLMs at actually making use of these giant contexts? This study by Chroma shows they aren’t that great - and while Chroma might be biased, as they operate in the RAG space, it confirms what a lot of people have suspected. They termed this phenomenon context rot.
This OWASP guide provides a detailed technical framework for securing AI agents. It breaks down the security risks associated with core agent components like LLMs, memory, and tool use, and offers practical, actionable advice for developers and security professionals to build resilient applications.
Magentic-UI introduces an open-source web interface that advocates for a human-in-the-loop approach to AI agents. It details six key features—co-planning, co-tasking, multi-tasking, action approval, answer verification, and memory—that empower users with oversight and control, allowing for safe and efficient collaboration with AI. The core takeaway is that a hybrid human-AI system is a promising path forward for unlocking the productivity of AI while mitigating its risks.
Deep Dive
Last week we prepared ourselves to take full advantage of a new custom function to start building with. But I believe the embedding model release by Google warrants that we take a look at it first.
While you won’t find the embedding model in AI Studio, it too has a free tier that we can make use of, so let us take a look at it.
Luckily, we can reuse most of the code we have previously written: we only need to switch out the model and slightly adjust the request body, as things are structured a little differently - see the sketch below.
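As a minimal sketch, assuming we are calling the REST API directly: the endpoint and field names below follow Google’s public embedContent reference, while the GEMINI_API_KEY environment variable and the sample text are stand-ins for whatever your existing setup uses.

```python
import os

import requests

# Stand-in for however you store your key in your own setup.
API_KEY = os.environ["GEMINI_API_KEY"]
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-embedding-001:embedContent"
)

# Unlike the generateContent endpoint, the body takes a single
# "content" object rather than a "contents" list.
body = {
    "model": "models/gemini-embedding-001",
    "content": {"parts": [{"text": "What is an embedding?"}]},
}

response = requests.post(URL, params={"key": API_KEY}, json=body, timeout=30)
response.raise_for_status()

# The response nests the vector under embedding.values.
values = response.json()["embedding"]["values"]
print(len(values))  # 3,072 dimensions by default
```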
One thing to flag for a potential future exploration, when we dive deeper into embeddings - and then also take a look at what you can do with them - is that there is an additional configuration option: embedding_config. This option enables us to reduce the dimensionality of the returned embedding from 3,072 down to 1,536 or even 768 dimensions - these are the recommended values, but you can specify others as well.
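A hedged sketch of that addition, reusing the request body from above - note that in the raw REST API the field is spelled outputDimensionality:

```python
# Same request as before, with the dimensionality option added.
# Note: per Google's documentation, only the full 3,072-dimension output
# comes pre-normalized, so re-normalize reduced embeddings before
# computing cosine similarities.
body = {
    "model": "models/gemini-embedding-001",
    "content": {"parts": [{"text": "What is an embedding?"}]},
    "outputDimensionality": 768,  # recommended: 3072 (default), 1536, or 768
}
```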
Take a look at the Python and the SAS code to see how easy it is to add these to our setup.