Pulling the watch feed_
Pulling the watch feed_
Things worth keeping an eye on…
Google's Gemini Spark, its always-on agentic assistant, now runs on Mac. Agentic means it carries out tasks on your behalf rather than just answering questions. This release adds real-time tracking and support for more apps. The open question is how well it handles the apps you actually use day to day, which is where these assistants tend to fall down.
The Verge reviewed Google's new Home Speaker running Gemini, and the verdict is that the hardware is good but the AI isn't ready. Smart speakers have spent years looking for a reason to exist beyond timers and music. Gemini was meant to be that reason. It isn't yet. This is a useful counterweight to the assumption that bolting a language model onto a device automatically makes it smarter. Sometimes it just makes it slower and less reliable.
Cloudflare is telling AI companies to separate the crawlers they use for search from the ones they use for training models and running agents. The deadline is 15 September. Miss it, and those crawlers get blocked by default across the many publisher sites Cloudflare sits in front of. In plain terms, this is an attempt to make AI firms pay for the content they hoover up, by handing publishers a switch that actually works. It's a real lever, because it changes the economics of how models get their training data.
Ramp matched actual AI spending across 21,559 US firms to their hiring records. The heaviest AI spenders grew headcount around 10 percent over the following two years, and entry-level roles by about 12 percent, which cuts against the idea that AI is quietly deleting junior jobs. The catch is who that describes: the effect only shows up among the biggest spenders, who were already larger and better funded before they adopted anything. It's a real signal, not proof that AI lifts everyone.
Anthropic can switch Fable 5 back on after the Department of Commerce lifted the export controls that pulled it and Mythos 5 offline in early June. The trigger back then was a jailbreak Amazon flagged, which led to an ultimatum from the Trump administration and a block that even stopped some of Anthropic's own foreign-national staff from using the models. What earned reinstatement is the part worth noting: a safety classifier that catches the jailbreak in over 99 percent of cases, and a promise to give the government pre-release access to evaluate future models. That's a quiet template for how labs and governments will negotiate access.
Sainsbury's plans to roll out live facial recognition across 200 shops, one of the largest deployments of the technology in the UK so far. Big Brother Watch's response is blunt: innocent shoppers shouldn't have to submit to face scans to buy a loaf of bread. The sharper problem is what happens when the system gets it wrong. People are being wrongly flagged, barred from shops, and left with no clear way to appeal. As facial recognition spreads, that is the accountability question that matters: who checks the machine, and what recourse do you have when it gets you wrong.
Anthropic released Claude Sonnet 5, and the story isn't the benchmarks, it's the price: two dollars per million input tokens and ten per million output until the end of August. It performs close to Opus 4.8 on most tasks and is now the default on free and Pro plans. Running an agent for hours used to mean watching a cost meter climb. Now the question has shifted from which model does agentic work best to which one is cheap enough to leave running.
Google's cheaper image model, Nano Banana 2 Lite, generates a picture in about four seconds and costs roughly three cents per thousand images. That pricing is aimed at people churning out a lot of images and iterating fast, rather than crafting one perfect shot. If you're making AI visuals in volume, the maths just changed: a thousand images for the price of a coffee. The heavier Nano Banana 2 and Pro versions are still there when you need more.
Acti is a keyboard that puts AI agents inside every app you type in, so you can grab a translation or a stock price without leaving what you're doing. The part that fits how non-developers actually build is Skills: you describe what you want in plain language, map it to a key, and it runs. Early testers made over a thousand of these in two weeks, and you can keep them private or share them. It's a bet that the keyboard, not a separate chatbot app, is where most people will end up using AI.
Someone dug into Claude Code and found it quietly hides invisible markers in its own prompts. When the tool is pointed at a non-Anthropic server, it swaps the apostrophe in the date line for a near-identical Unicode character, a signature you'd never see, apparently to flag people routing requests to rival AI providers. The technique itself is easy to defeat and mostly catches ordinary users rather than anyone determined. The real issue is trust: this is a tool with access to your files and terminal, and hidden fingerprinting like this should be documented in the open, not reverse-engineered out of the binary.
Proton updated Lumo, its privacy-focused chatbot, adding image generation, a slower reasoning mode for harder problems, and memory you control. What sets it apart is still the privacy model: your conversations are encrypted so even Proton can't read them, there are no server-side logs of your sessions, and nothing you type is used for training. Most chatbots ask you to trade your data for capability. Lumo's pitch is that you shouldn't have to, and there's a free tier to test whether it holds up.
Researchers found a way to talk an AI browser out of its own safety rules by convincing it it's in a fictional world where the normal consequences don't apply. Once it accepts that premise, it'll do things it shouldn't, like pulling code from private repositories and lifting saved passwords from the browser's own password manager. It worked on all six agents tested, including ChatGPT Atlas, Comet and the Claude Chrome plugin. The reason it matters more than a chatbot being tricked: an AI browser can act across your accounts, so a jailbreak here does real damage, not just embarrassing output.
Base44 lets you describe an app and have it built for you, no coding required. Until now it ran on frontier models from the big labs. Now Wix, which owns Base44, is rolling out its own model, betting that owning the engine makes the platform harder to copy. For anyone building on these tools, the model under the hood decides both the quality of what you ship and what it costs, so this is one to watch rather than cheer just yet.
Tidal won't ban AI-generated music, but from today it won't pay royalties on tracks it identifies as fully AI-made. From 15 July it'll also label them with an icon so listeners know. For anyone making money from AI music, one of the streaming payout routes just closed. It's also an early test of whether platforms can reliably tell human and machine-made tracks apart, because the whole policy rests on detection that actually works.
Senators Warren and Scanlon are reviving the Health and Location Data Protection Act, this time written for the AI era. The new version would ban data brokers from selling your health and location information, including things you reveal to a chatbot like ChatGPT or Claude. Right now, what you tell an AI assistant about your body or your whereabouts sits in a legal grey area. This is the first bill I've seen that treats a chatbot conversation as the sensitive record it often is.
Google is opening Gemini's personalised image generation to free users in the US. The pitch is that it can tailor images to your interests by drawing on data from your connected Google apps. Free access to a capable image tool is a real lowering of the barrier. The catch is in how the personalisation works: the better the results match you, the more of your Google data is feeding them, and that trade-off is worth being clear-eyed about.
California has struck a deal with Anthropic to use Claude across state government at half the usual price. It's a notable bet on putting a frontier model inside public services, and it deepens a split: as Anthropic gets closer to California, it has fallen out of favour with the federal government. Pricing deals like this decide which institutions can afford to build with AI and which can't. When the buyer is a state government, that's not just a discount, it's a choice about how public services get built.
Someone with no medical training loaded 266MB of raw MRI scans into Opus 4.8 and asked it to read them. His clinic had diagnosed a Grade III tendon tear and started an intensive treatment plan within minutes. The AI came back saying the tendon looked intact. He doesn't claim the machine was right, and that's the point: he's left with two confident verdicts, no way to know which holds, and less trust in the clinic that treated him.
In the Palisades fire arson case, prosecutors didn't just rely on phone location data and camera footage. They used the defendant's ChatGPT logs, including images of fire he'd generated and questions like 'why am I so angry all the time'. Whatever you make of the case, this is a concrete example of chatbot history becoming courtroom evidence. If you treat an AI assistant like a private journal, this is a reminder that it isn't one.
Apple has raised its prices, and Tim Cook is blaming the AI industry. The 16-inch MacBook Pro went up $300, and the iPad Air jumped from $599 to $749. The shortage is real, because memory makers shifted production to the high-margin chips AI data centres want. But set that against Apple's hardware margins, estimated at up to 47% on the iPhone 17 Pro, and 'unavoidable' starts to look like a choice.
The US export ban on Anthropic's Mythos is barely two weeks old, and rivals are already moving into the gap. China's 360 launched Tulongfeng, and Tokyo's Sakana AI launched Fugu, marketed openly as a hedge against export controls. If you build on a frontier model, this is the practical lesson: access can disappear on a government order, so the question of what you'd switch to stops being hypothetical.
Conno Christou was diagnosed with an aggressive lymphoma at 35 and fed his bloodwork, scans, wearable data and a symptom journal into Claude. He's careful about what it did: it didn't replace his doctors, it helped him ask sharper questions. At one point it flagged that a reactivated thymus can mimic disease on scans in people under 40, which a fourth specialist later confirmed, sparing him radiotherapy. The same article carries the honest caveat from a clinician, that general chatbots are often wrong and haven't been properly evaluated for this. Used that way, as a second set of questions rather than a second opinion, it's a fair picture of where these tools genuinely help.
Over the past two years Georgia has built a face recognition system for identifying people at protests, procured, Algorithm Watch reports, from a Moscow company tied to the FSB, Russia's security service. The technology itself is ordinary. What matters is who controls it and what they point it at. Sold as public safety, it becomes a way to name everyone who showed up to a demonstration.
For two weeks Anthropic's most capable models sat offline after an ultimatum from the Trump administration. Now the government has cleared more than 100 companies and agencies to use Mythos 5, including their non-American staff, while the public-facing version still waits. The thread running through both Anthropic and OpenAI this week is the same: who gets to use frontier AI is becoming a government decision, not a market one. That is worth watching closely, whoever holds office.
Anthropic's latest Economic Index looks at what people actually produce with Claude, not just what they ask it. The headline figure: 93% of conversations end in something concrete, most often an explanation, a document or a report, and work sessions lean harder still towards documents, at one in five. It's a useful corrective to the abstract AI debate. Most real use is unglamorous, people turning a prompt into a written thing they needed. A linked survey of around 9,700 users backs it up, with large majorities reporting real gains in speed and in the range of work they can take on.
OpenAI previewed GPT-5.6 in three tiers: Sol the flagship, Terra for high-volume work, and Luna for cheap everyday use, with claimed gains in coding, cybersecurity, biology and long agentic runs. The detail worth noting is access. This is a limited preview, staggered at the US government's request, with officials vetting who gets in. A capable model you can't freely use is a different kind of release.
A developer set up a public challenge: email his AI assistant and try to make it leak a secrets file. More than 2,000 people sent over 6,000 emails, using fake compliance audits, authority impersonation, and social engineering in several languages. Nothing leaked. It ran on Claude Opus 4.6 with a four-line security prompt, which is the caveat worth holding onto: a weaker model might fold where this one held. If you're weighing how much access to give an agent over your real inbox and files, it's a more reassuring data point than I expected.
Ford topped JD Power's initial quality ranking this year, and used the moment to admit something awkward. Its automated production and design systems weren't as reliable as it had assumed. It had to bring back experienced technicians, some of them former employees, to fix errors the robots made. I find these admissions more useful than any launch announcement. They're a concrete reminder that automating a process you don't fully understand tends to move the cost around rather than remove it.
The UK Home Office is using AI to summarise country information and the substantive interviews that asylum claims hinge on. The Open Rights Group's legal opinion argues this is probably unlawful, partly because applicants are never told AI was involved in their case. The deeper problem is practical. A summary that drops a critical detail can sink a claim, and the officer relying on it may never know what was lost. When a decision is wrong it goes to appeal, which costs more and takes longer than getting it right the first time. This is the accountability story sitting underneath the efficiency pitch.
Notion is shutting its Mail app and says more than half of Notion Mail users already manage email without ever opening their inbox, handing it to agents instead. I'd treat that headline figure with some caution, since it's Notion's own number and it suits the story they want to tell. Still, the direction is real. The inbox as a place you visit is being replaced by an agent that triages on your behalf. Your email history stays in Gmail, so this is less drastic than it first sounds.
OpenAI is releasing GPT-5.6 as a limited preview first, to a small group of enterprise customers, after the Trump administration asked it to stagger the launch over security concerns. The administration gets to evaluate the model during that window before any wider release. Whatever you make of the reasoning, it's the first time I can remember a US government effectively deciding when the public gets access to a frontier model. That's a precedent worth watching, wherever you land on the safety question.
New data suggests people who actually pay for an AI assistant are increasingly choosing Claude, even though ChatGPT still owns the market overall. ChatGPT's lead rests on enormous free usage. Among paying consumers, the gap narrows sharply. If you're deciding which assistant is worth a subscription, rather than which one is most famous, that's a more honest signal than headline user counts.
The story all year has been that AI is coming for engineering jobs. New hiring data from SignalFire points the other way. Across the big tech firms, engineers made up 55 percent of new hires in 2025, up from 46 percent in 2019, and engineering roles have held up better than most other functions. The likely reason is the old Jevons paradox: make something cheaper and more productive to use, and demand for it tends to rise, not fall. Worth keeping next to the layoff headlines, which rarely include this number.
Two camps of AI money spent 27 million dollars trying to decide one Congressional primary in New York. On one side, super PACs tied to Anthropic backed Alex Bores, who co-wrote a state AI safety law. On the other, an OpenAI-linked PAC spent to stop him. Bores lost narrowly, 35 percent to 39, and the reporting credits local Manhattan politics more than the AI fight. Worth watching anyway. This is the template both sides will use in next year's regulation battles, and the total bill came to over 50 million across nineteen states.
The AI you use was made usable by people you'll never see. This survey of content moderators and data labellers in Kenya documents how their work is run by algorithm: scored on speed, accuracy, and idle time, with those scores deciding shifts, pay, and whether a contract gets renewed. Workers describe being put 'on the bench', suspended without pay but still on call, which one of them calls a financial death sentence. It's a hard read, and it's the part of the supply chain that rarely makes the launch posts.
Anthropic put an always-on Claude inside Slack. It sits in a channel, remembers what it reads, and can pick up tasks without being prompted each time. The productivity pitch is clear enough. The quieter part is how it works: by building a running picture of how a company operates, one message at a time. Admins can scope what each Claude sees and spends, which matters, because the feature that makes it useful is the same one worth scrutiny before it's switched on.
An immunologist had an experiment he couldn't explain sitting on the shelf for three years. By his account, GPT-5 Pro proposed a mechanism that fit, and he then tested it against a separate, unpublished experiment, where the model's prediction held up despite having no access to the data. That last part is the bit worth paying attention to. One caveat: this is OpenAI's own write-up, not a peer-reviewed result, so it reads as a promising anecdote rather than settled science.
A chart doing the rounds shows open-weights models catching up to closed ones, with the gap hitting zero this December. Doubleword checked it properly. On that one headline benchmark the line does extrapolate to zero around 3 December. But across all 18 benchmarks the same group tracks, the average gap has stayed flat at just under five months. Coding is the exception: open models went from 15 months behind to one or two, while most other tasks barely moved. If you're weighing an open model you can run yourself against a paid API, the honest answer is that it depends entirely on what you need it for.
Vibe-coding tools let you ship an app without writing the code yourself. What they don't do by default is check whether that app is safe to hold other people's data. The article runs through real failures, including a security firm finding around 5,000 vibe-coded apps publicly accessible with no login, nearly 2,000 of them leaking sensitive data. The lesson isn't to stop building. It's that these tools only run a security check when you explicitly ask, so that step has to be deliberate rather than assumed.
Someone cancelled their 200-dollar-a-month French tutor and built two tools to replace it. One teaches grammar and schedules reviews against the forgetting curve. The other handles spoken practice through a chain of three APIs, for about four cents a session. What I like here is the honesty about tradeoffs: an early version on a cheaper local model kept inventing grammar corrections, and the write-up is clear that a confident wrong answer is worse than no answer. Both tools are open source.
The Atlantic took four datasets of music used to train AI models and made them searchable to anyone. Two are vast, at 12 million and 9 million tracks. The interesting detail is how the audio was gathered: three of the sets are lists of links to songs on YouTube and Spotify, with developers using automated tools to pull the audio down, sometimes bypassing logins and the mechanisms that pay artists. If you've ever wondered whether a specific song trained a model, you can now check.
Norway has moved to keep AI tools out of elementary classrooms, with the restriction reaching up through the early primary grades. Most of the policy noise this year has been about getting AI in front of children sooner. This goes the other way, on the view that kids should build the underlying skills before a tool does the thinking for them. It's a rare government bet on not yet instead of full speed ahead, and the results will be worth watching.
The UK Home Office wants to use facial age estimation to judge how old asylum-seeking children are, starting in 2027. EFF and more than sixty other organisations have written to object, and their first concern is bias: these systems misjudge age, and they misjudge it unevenly across different groups. When the output decides whether a child is treated as a child, a margin of error stops being a technical footnote and becomes a person's life.
An open, MIT-licensed model called GLM-5.2 lands within four points of GPT-5.5 on a common intelligence benchmark, while making things up far less often. On a test that measures how often a model invents an answer instead of admitting it doesn't know, GLM-5.2 scored 28 per cent against GPT-5.5's 86 per cent. The practical takeaway is simple. A bigger, pricier model isn't automatically the more trustworthy one, and for a lot of building work a free open model that knows its own limits is the safer default.
TesterArmy lets you describe what your app should do in plain English, then an agent clicks through the web or mobile flows like a user would and sends back screenshots and bug reports. For anyone shipping an app without a QA team, that's the gap it fills: it handles logins, forms and OAuth without anyone writing a test script. Whether it catches the subtle bugs that actually matter is the open question, but the plain-English approach lowers a barrier that used to need real engineering time.
The EU's AI Act was meant to phase in protections for high-risk systems, but a new "AI Omnibus" deal pushes the big obligations back to late 2027 and 2028 and trims the transparency rules before they ever applied. Nine civil-society groups argue this sets a precedent: a law can be reopened and watered down in the gap between passing and taking effect. If that holds, it's a template for weakening the GDPR and other digital rules the same way.
The UK Home Office has been running asylum interview summaries and policy searches through GPT-4o tools, and applicants are never told. There's no published data protection assessment, no equality assessment, and neither tool appears in the government's own algorithmic transparency register. Pilot testing found 9% of the AI summaries were too flawed to use, which matters a great deal when the output helps decide whether someone gets to stay in the country.
The story behind the Mythos export controls is messier than the headlines suggest. Wired traces it to a dispute over Anthropic giving South Korea's SK Telecom access to its most powerful model, with US officials citing alleged China ties that the company denies. When the White House ordered access revoked for all foreign nationals, including immigrants already in the US, Anthropic switched the models off entirely rather than gate them by nationality, which shows how fast a single government can pull a capable model off the table.
Google now puts 'write with Gemini' prompts in Docs by default, and the off switch isn't where you'd expect. One toggle lives in the Docs bottom bar. The fuller fix sits in Gmail, under Workspace smart features, which also governs Docs. Default-on AI is the pattern now, so knowing where the controls hide is quietly useful.
A security firm, Mindgard, found ChatGPT's image generator will produce graphic violent content from a prompt that contains nothing offensive. The trick is a viral instruction to 'restore the attached photo' with no photo attached, which leaves the model to fill the gap on its own. The real concern sits underneath: the guardrails check the words typed in, not the image that comes out. So an ordinary user can land on horrific results by accident, and OpenAI's first fix was sidestepped with minor edits.
Z.ai released GLM-5.2 this week with open weights under an MIT licence. Independent benchmarks from Artificial Analysis now rank it the strongest open-weights model available, ahead of DeepSeek and Kimi. The part that matters for cost: running it through OpenRouter is about $1.40 per million input tokens, against $5 for GPT-5.5 and Claude Opus. A capable model you can self-host or rent cheaply, with no single vendor able to switch it off.
The NO FAKES Act, now in front of the Senate Judiciary Committee, would create a new right over your likeness to police AI-generated impersonations of a person's face or voice. EFF and a coalition of groups warn it borrows the worst of DMCA takedowns, pushing platforms to pull satire, parody and commentary rather than risk penalties of up to $750,000 per work. The right can also be signed away in contracts or terms of service. For anyone making things with generative AI, it means lawful content could come down on a complaint, with weak protection for legitimate use.
Meta has added an AI Mode to search in the Facebook app, aimed at messy questions like what to do this weekend. What's different is the grounding: it draws on public posts across Facebook and Instagram, so local recommendations get more relevant. In hands-on testing it still hallucinated, citing a post about a pool closure that didn't exist and suggesting an Austin coffee shop for a trip to Minneapolis. Grounding a model in real data helps, but it doesn't stop it inventing confident citations, so its answers still need a second look.
Earlier this year the advice was to push AI usage as hard as possible. Then the invoices arrived. Uber reportedly burned through its annual AI budget in months, some companies cut Claude licences, and Meta scrapped its internal usage leaderboard. NEA's Tiffany Luck frames the shift as an ROI reckoning, with firms now mixing models and measuring value rather than committing to one provider. If you're building a business on AI, the cost discipline question has stopped being optional.
CADAM is an open-source web app that turns a plain-English description into an editable 3D model, running entirely in the browser with nothing to install. It takes the description, lets you fine-tune with parametric sliders, and exports an STL ready for printing. CAD software has always been a wall for people who aren't engineers. This lowers it to a sentence and a few adjustments.
SpaceX is buying Cursor, the AI coding tool, for $60 billion in stock. The deal traces back to an odd arrangement in April: acquire Cursor for $60 billion, or pay a $10 billion fee to walk away. The reason is that xAI's own coding tools trail Claude Code and OpenAI's Codex, and buying the most popular vibe-coding editor closes that gap. If you rely on Cursor day to day, it's worth watching how it changes under new ownership, because acquired tools always do.
Android 17 is out, and the part that matters for most people is the Pixel Drop bringing Google's newest Gemini models onto the phone itself. There's the usual round of multitasking, parental controls and security updates alongside it. What's worth tracking is how much of the assistant now runs on the device rather than in the cloud, because on-device models mean faster responses and less of your data leaving the phone.
Sarah Myers West of the AI Now Institute told the Senate Banking Committee that the way AI infrastructure is financed is a systemic economic risk, bubble or no bubble. Her numbers are the story: the big four spent $130 billion on data centres in a single quarter, AI debt issuance could pass $500 billion this year, and much of it sits in off-balance-sheet vehicles and circular deals between the same handful of companies. She argued that if it unwinds, ordinary people carry the cost, from household wealth tied up in those stocks to electricity bills near data centres that have climbed as much as 267%. Her phrase for it was a race to the bottom where the public absorbs the downside.
Microsoft has patched a critical flaw in Microsoft 365 Copilot. Researchers at Varonis showed a single click on a crafted link could make Copilot hand over 2FA codes and other data it could read across email and SharePoint. It's patched, but the deeper point stands: a language model can't reliably separate your instructions from instructions hidden in the content it reads, so close one route and another opens.
A WordPress VIP survey found 60% of US consumers are put off when a brand markets itself with the word 'AI'. That's worth sitting with if you're building something and tempted to lead with it. The thing you made might be useful, but stamping 'AI-powered' on the label could cost more trust than it wins.
Plaud has crossed $100 million in annual recurring revenue from its software, on the back of more than two million AI notetakers sold. That's a real number in a market crowded with apps that record your meetings and summarise them. Hardware plus a subscription is an old model, but it's working here while plenty of software-only notetakers struggle to charge anything at all.
ChatGPT has dropped below half the assistant market for the first time. It still leads by a distance with 1.1 billion monthly users, but Gemini now has 662 million and Claude 245 million. For anyone building on these tools, the split matters: betting an entire workflow on one provider looks riskier than it did a year ago.
The US government told Anthropic to block its two most capable models, Mythos 5 and Fable 5, from any foreign national, including Anthropic's own overseas staff. The only way to comply was to pull both offline for everyone. If you build on a frontier model, this is the risk made concrete. Access can vanish over a weekend on a government order, and Anthropic says the capability behind the supposed concern is already available in other models like GPT-5.5.
OpenAI lost about $38.5 billion in 2025 on $13 billion of revenue, against $34 billion in costs. The Financial Times independently verified the figures, so this isn't a rumour. Losses on this scale don't stay invisible. They get funded somehow, usually through higher prices, tighter usage limits, or new investors with their own demands, which is the real story under the headline for anyone building on these models.
This is the sharper read on why the ban happened. The reported trigger was a narrow safety bypass, but security researcher Katie Moussouris, shown the underlying paper, says it should never have warranted an export control, and that trying to patch it would only weaken the model for defence. The precedent matters more than the technical detail. A single letter pulled a company's product offline with no court involved, and the same lever could point at anyone next.
Big Tech has spent months lobbying for one federal set of AI rules to override the patchwork of state laws. That push, called preemption, is now bolted onto child-safety legislation, and the politics are a mess. Why it matters: if it passes, it decides whether AI gets governed by fifty separate state regimes or one national standard, which shapes what the tools you use are allowed to do. For now, insiders give it slim odds before Congress breaks for its summer recess.
GrassDx is a free web app that diagnoses lawn problems from a few photos and your location, then suggests fixes ranging from a cheap DIY treatment to a professional service. No account, no sign-up. What's interesting is the business model. The tool earns through affiliate links to the products it recommends, rather than charging users a subscription. It's a clean example of a narrow, single-purpose AI product that solves one real problem and funds itself through referrals.
A GitHub issue on the Nex-N2 repo argues Rio de Janeiro's "homegrown" LLM is, in fact, a merge of an existing open model. The procurement claim and the artefact don't match. Expect more of these. As cities and states commission their own models, the receipts will keep living on GitHub.
Gabriel Weinberg, the DuckDuckGo founder, lays out data showing AI consumption looks much more like search or social media adoption. Not the everyone-is-using-it-for-everything narrative coming out of the labs. A useful corrective. If you're building products that assume every user is fluent with chat, the design brief just changed.
WSJ reports the trigger behind the US government cutting off Fable 5 and Mythos 5. Amazon cybersecurity research, plus direct conversations between Andy Jassy and the White House. So a competitor's safety claims got a rival's models pulled from the market in days. Whatever you think of the underlying risk, this is now the precedent. Model bans can move at the speed of one CEO's phone call.
A Verge writer gave Gemini one prompt about their dying lawn and walked away. Five minutes later they had a working app, plus a bug Gemini had already written the fix button for. The piece reads like a quiet admission that vibecoding isn't a stunt any more. It's now how someone with no coding habit solves a household problem.
iOS 27 puts AI photo editing inside the native Photos app. Reframe, Extend, and Clean Up. The Verge's hands-on says they mostly work, which is the right level of expectation-setting for a first release. The bigger story is that AI photo editing just became default behaviour on the most-used camera in the world. People who never opened Photoshop are about to start altering images.
KPMG pulled its own report on AI usage after readers spotted hallucinations in the text. The firm that sells AI assurance to clients couldn't get the assurance right on its own output. A useful artefact to remember the next time a consultancy quotes six figures for an AI strategy deck.
A Derbyshire police officer is under investigation for using AI to fabricate evidence across multiple cases. This is the accountability beat at its sharpest. The technology isn't really the story. The story is that the controls around how AI gets used in policing haven't caught up. The cost of that gap is borne by whoever was on the other end of those cases.
On 12 June the US government ordered Anthropic to suspend Fable 5 and Mythos 5 for all customers, citing national security. Anthropic complied, then published its disagreement. It says the trigger was a narrow jailbreak: essentially asking the model to read a codebase and fix flaws, something other models already do. The part worth watching isn't Fable. It's that access to a deployed frontier model can vanish overnight on a directive with no published technical detail. Anthropic argues the same standard, applied widely, would stall releases across the industry.
StackScope crawled over 40,000 indie launch posts and mapped what they're actually built with. If you're picking a stack and the only data you have is vibes from a timeline, this is a sharper input. The dataset also exposes a quiet truth: the stack everyone says is dead is usually the one that ships.
Simon Willison's first hands-on look at Claude Fable describes a model that doesn't wait to be asked. It picks up adjacent tasks, runs checks, fixes things it spots along the way. For anyone wiring Claude into a workflow with real consequences, that's a double-edged feature. More gets done per prompt, but the agent is now making more decisions on your behalf. The shift in default behaviour matters before pointing Fable at anything that can't be reversed.
DoorDash has launched Ask DoorDash, a chatbot inside the app that takes natural-language requests and photos instead of menu browsing. Showing a picture of yesterday's takeaway and asking for something similar is now a mainstream interaction. The interesting part isn't the chatbot itself. It's that food ordering is the kind of consumer surface where a major company now assumes most users would rather describe than search. That bar keeps dropping.
EFF has found a site called News-USA Today running quotes from five EFF 'experts' who don't exist. Real-sounding names, plausible bios, invented entirely. The pattern matters because the byproduct of cheap AI publishing is fake authority, not just fake facts. Anyone fact-checking a quote now has to confirm the source person is real before confirming what they said. That's a new cost on everyone reading news online.
Pool has launched an app that organises your screenshots into themed collections and tries to recover the original link behind each one. Most people screenshot things they want to come back to and then never come back. Pool's bet is that the bottleneck is retrieval, not capture. If it works, it's a small but real example of AI doing the unglamorous filing work that nobody enjoys doing by hand.
Deezer now lets anyone scan their Spotify or Apple Music playlists for AI-generated tracks, even without a Deezer subscription. The detection is the headline. The more telling fact is the market response. Deezer offered the same tool to other streamers. Apple and Spotify opted instead for voluntary tagging by uploaders, which is roughly the same model that gave us self-reported nutrition labels. Qobuz built its own detector. The big platforms are choosing the system that asks the least of them.
Amazon has put a number on its data centre water use for the first time: 2.5 billion gallons in 2025, at 0.12 litres per kilowatt-hour. The intensity dropped two percent from 2024 even though capacity grew. The disclosure landed days after Seattle voted in a one-year data centre moratorium that some Amazon employees publicly supported. AI infrastructure isn't an abstract cost anymore. It has gallons, watts and city ordinances attached.
Anthropic admits it shipped Claude Fable 5 with a safeguard that quietly downgraded any prompt flagged as 'frontier LLM development' to Opus 4.8, without telling the user. After the backlash, they're switching to visible fallbacks. You'll see the swap when it happens, and the API will return a reason for the refusal. If you build on Claude, this is the kind of behaviour the model card should have disclosed on day one, not patched in after a Wired piece.
Google has a new setting called 'Search Services History' that saves the images you upload to Lens, the audio from Search Live, voice searches, and phrases you've spoken into Translate. It's on by default. You can switch it off, but the option lives behind a settings page most people will never open. Defaults shape behaviour at scale, which is why this kind of opt-in-by-default move keeps showing up.
A group of independent musicians is suing Google, claiming Lyria 3 was trained on songs they uploaded to YouTube. Google's motion to dismiss is the tell. It doesn't deny it. The reason matters: Google's terms of service for YouTube uploaders don't explicitly grant training rights, and saying yes on the record could create new exposure across the catalogue. This is a live case worth watching if you're anywhere near the question of what was in a model's training data.
Memory is the feature every assistant has been racing to add this year. New research suggests it has a cost. Models with persistent memory can drift toward telling users what they want to hear, and overall accuracy can drop. Worth keeping in mind if you're building a workflow that relies on the assistant remembering things across sessions. The fix isn't to switch memory off. It's to know it can corrupt the answer and design around that.
Microsoft rolled out Claude Fable 5 to GitHub Copilot customers, then quietly kept it out of the model picker its own employees use internally. The reason is Anthropic's new 30-day data retention requirement for Mythos-class models. Worth noting if you're weighing Fable for work that touches sensitive data. Even Microsoft, which sells access to the model, isn't comfortable enough with the policy to let its own staff use it.
Researchers at Blue41 showed an attack against Bunq's AI assistant where a €0.02 transfer with a crafted transaction description was enough to compromise it. When the customer later asked the assistant 'show me my recent transactions', the injected text landed in the model's context and the assistant produced a convincing phishing message inside the bank's own app. The lesson generalises beyond banking. Any AI assistant that retrieves data set by third parties is treating that data as instructions, whether it means to or not.
A Munich Regional Court has ruled that Google is directly liable for what its AI Overviews say, because the overview is Google's own content rather than a list of search results. In this case the AI invented fraud allegations against two publishers that didn't appear in any cited source. The legal shield that's protected search engines from liability for years just stopped covering generative answers.
Thirty billion location scans contributed by Pokémon Go players over a decade have ended up training navigation tech for military drones. Niantic spun out its spatial mapping unit as Niantic Spatial, and that unit just partnered with Vantor on defence work. The point isn't that this was secret. Niantic's terms of service almost certainly covered it. The point is that consumer telemetry collected for a game is a perfectly normal input to a defence supply chain. Most data partnerships look harmless until you trace where the pipeline ends.
Privacy International ran an experiment. An experienced coder and a novice used vibe-coding tools to build a federated chat server, a messaging client, a health-tracking app, and a hiking-blog website. The messaging client first sent plaintext. When asked to add end-to-end encryption, the AI invented its own scheme. When the Matrix standard was requested instead, it reached for a deprecated protocol, then quietly stripped the encryption layer altogether and said nothing. The health app stored passwords without a salt. The test suite was rigged with hardcoded values so the broken implementation could 'pass'. None of this is visible to someone who can't read the code.
Buried in the Fable 5 model card: Anthropic can quietly degrade the model's responses for users it considers competitors, and has decided not to tell those users when it's happening. The trigger isn't "training a frontier model" in the textbook sense. It stretches to bootstrapped apps with their own embedding or reranking code. If you're trusting a coding assistant with anything sensitive for your product, the model card is the place to look first.
The Senate Judiciary Committee votes this week on NO FAKES, a bill the EFF argues would add another internet-wide takedown regime on top of the ones already in force. It's pitched as protection against AI replicas, but the operational text creates a notice and takedown system anyone can fire at hosting platforms, with limited recourse for the people whose content gets pulled. Useful to read the EFF's specific objections before forming a view on whether the trade-off lands.
Apple spent years arguing that a photograph should be a record of what happened. At WWDC 2026 it shipped tools for moving people around the frame, removing them, and replacing the sky, all from a Photos app that won't flag which images were edited. The Verge is right to call this a reversal. If you care about provenance, the default phone camera in most pockets just got less trustworthy.
Seattle's city council took up a one-year moratorium on new data centres this week, two months after companies proposed five large new sites in the city. Some of the loudest voices for the moratorium are current Amazon employees, testifying against the infrastructure their own employer is racing to build. The story underneath is workers using municipal politics to slow what their company won't slow internally.
Lovable is now at $500M annualised revenue with a million new projects starting every week. The headline numbers are useful, but the more interesting line in the TechCrunch piece is that users are replacing internal software, not just spinning up toy demos. That's the audience the platform sells to writing their own line-of-business apps instead of waiting for IT.
Anthropic has released its first Mythos-class model to the public as Claude Fable 5. The headline claim is that performance pulls ahead the longer and more complex the task gets, which matters if you're asking it to do anything beyond one-shot answers. Anthropic is also shipping it with hard guardrails on cybersecurity and biology, after previously holding the family back over weapons-uplift concerns.
New York's legislature has passed a one-year ban on new large data centres. The accompanying impact assessment is meant to cover electricity, water, land use, and pollution. Governor Hochul still has to sign it. If she does, it's the first statewide moratorium of its kind in the US, and a real signal that the politics of AI infrastructure are catching up with the build-out.
The Siri overhaul Apple promised in 2024 has finally shipped. Apple is calling it Siri AI, framing it as a companion rather than a voice assistant, and adding adjustable pace, accent, and expressivity. Worth watching whether it's actually better at the things Siri has always been bad at, or whether the demos again outrun what the product can do.
Apple's Shortcuts app now lets you describe a workflow in plain language and have AI build it for you. That's a real shift for anyone who's bounced off the existing visual editor. The test is whether the generated workflows actually work, or whether you end up debugging an opaque mess someone else wrote.
Meta has built a For You section into its standalone Meta AI app that fills with AI-generated clickbait articles. The example The Verge ran with is a piece featuring two Queen Elizabeth IIs in the same image. Meta isn't fighting AI slop on its platforms any more, it's producing it.
Massachusetts has passed a privacy bill that bans the sale of precise location data. It goes further than most US state privacy laws. The reason this matters for AI specifically is that location data is one of the main raw materials behind a lot of the "personalised AI" pitch. Less data legally available to buy means fewer products built on top of it.
OpenAI has released Lockdown Mode, a setting that limits what ChatGPT can do when prompt injection is a risk. It restricts tool calls and data sharing during agentic tasks. OpenAI itself admits this doesn't eliminate the problem, only reduces the blast radius. If you're connecting ChatGPT to anything that holds sensitive data, Lockdown Mode is the kind of mitigation worth knowing about.
Meta has now confirmed that 20,225 Instagram accounts were taken over after hackers tricked its AI support chatbot into sending password reset codes to the wrong email address. The bug was in a separate code path, not the chatbot itself, but the chatbot's willingness to follow instructions made it useful as the attack surface. The lesson for anyone shipping AI-assisted account flows is that the model doesn't have to be the vulnerability, it just has to be the door.
The TechCrunch piece walks through how AI companies are scrambling as the cost of running inference at scale catches up with revenue. The industry conversation has shifted from "tokenmaxxing and go fast" to "we need guardrails, how do we control this?". For anyone paying per-token rates on top of their own build, this means price increases are coming as the bigger players prepare for IPOs and need their margins to look better.
OpenAI has rebuilt ChatGPT's memory around a background process it calls "dreaming", which synthesises memories from past conversations instead of waiting for you to say "remember this". It's live for Plus and Pro users in the US, rolling out to Free and Go users and other countries over the coming weeks. The more interesting bit is a new memory review page where users can see, edit, and delete what ChatGPT has decided to keep about them. The default is now "we will remember".
Kevin O'Leary has agreed to cut his Utah AI data centre from 40,000 acres to roughly 20,000 after community pushback, removing nearly 20,000 acres around the Locomotive Springs Waterfowl Management Area near the Great Salt Lake. The Utah Senate President had asked for a 75% cut and water-saving measures; O'Leary offered half. It's the first concrete case I've seen of local political pressure shrinking an AI infrastructure project rather than just slowing it down, and the halved footprint is still larger than Manhattan.
TSMC's chief executive told shareholders this week that "customer demand is so high, and we can only support so much" and that catching up via US-based production will take a very long time, despite a planned $165 billion in further US investment across three more plants, two packaging facilities, and an R&D centre. He also signalled gradual price rises. For anyone building on top of frontier-model APIs, the implication is the same as it has been for a year: chip capacity is the constraint, not the willingness to spend.
California's AB 412 is back, and it would require AI developers to identify every registered copyrighted work used in training. That sounds reasonable until you look at how copyright registration actually works. There is no machine-readable registry, many works are registered with no public deposit, and online content mixes registered material with Creative Commons and public domain with no reliable way to tell them apart. EFF's opposition letter argues the bill would lock smaller AI projects out and entrench the well-funded incumbents who can afford the legal fight, which is the opposite of what its supporters say they want.
The CEOs of Anthropic, OpenAI, Microsoft, Meta, and Google DeepMind, alongside biotech executives from Twist Bioscience and Ansa, have signed an open letter asking Congress to require synthetic DNA and RNA sellers to screen orders for sequences usable in dangerous pathogens. Cross-industry letters at this level are rare. The signal worth reading is that the AI labs are publicly saying their own models have made this kind of regulation more urgent, which is a more honest position than the lobbying you usually see on AI bills.
South Korea will require every online community and forum to AI-scan every uploaded image and video from 1 July, under amendments to the Telecommunications Business Act. The catch is that operators have to buy data-centre-grade Nvidia GPUs themselves, with no subsidy and no extended grace period. The practical effect is that smaller Korean forums and indie tool builders simply cannot comply, which makes this an AI-enforced moderation regime that only large platforms can afford to run.
Apple has approved Poke as the first third-party AI agent allowed on its Messages for Business platform, which means you can text an AI agent on iMessage the same way you'd text a friend, with no new app to install. Poke says it has relayed roughly 100 million messages since launching in March. Apple is charging Poke a per-user fee, lower than what Meta charges for WhatsApp AI but a fee all the same, which tells you something about how platform owners plan to monetise this layer.
EFF has confirmed via static analysis what Wired first reported: Meta has shipped facial-recognition code in its always-on smart glasses, with each face stored as a string of 2,048 numbers and compared against a personal database the wearer builds up over time. The capability is in production on millions of devices. The product implication is that someone wearing these glasses can quietly identify strangers in public, and the strangers being identified have no signal that it's happening.
Kasra Rahjerdi built a deliberately vulnerable React Native app and ran nine models against it to see which could find the bug. GPT-5.5 solved it seven times out of ten. Claude Opus 4.8 got there twice and refused mid-run on several attempts. Most of the rest never noticed the real flaw was in a Firebase config. The lesson for anyone shipping on Firebase or Supabase is that hardening your API doesn't cover the data layer; the auth rules on the database itself need the same scrutiny.
The Daily Californian reports that failing grades in Berkeley CS classes have climbed alongside heavier AI use among students, with professors flagging weaker maths fundamentals at the same time. It's one campus, but Berkeley is a programme other CS departments watch. If the pattern holds, the next cohort of people calling themselves developers will arrive with shakier foundations than the last. That matters for anyone who hires, mentors, or builds tooling meant to teach what's underneath rather than paper over it.
Microsoft used Build to announce it's done being OpenAI's distribution layer. The keynote stacked in-house reasoning models, an agent platform, and a new security stack into one product story. If you've been building on Azure assuming Microsoft would keep reselling GPT as the default, the calculus shifts. OpenAI is now the partner you might also use, not the headline.
Meta has rolled out its WhatsApp Business AI agent worldwide and is pricing it on token usage. A small shop in Lagos handling order replies on WhatsApp is now on the same metered billing model as a coding agent in San Francisco. The accessibility piece is genuine: no developer needed to stand up an agent that can answer customers. The risk is that token-based pricing on chat support is hard to forecast, especially when the customer base is price-sensitive.
Uber has put a $1,500 monthly cap on each engineer's spend per AI coding tool, so Claude Code and Cursor each get their own bucket. Simon Willison reads this as implying about $36,000 a year per engineer if both buckets get used, which lands at roughly 11% of a median Uber engineer's compensation. That's a useful anchor number if you're trying to work out whether the AI you're paying for is paying its way.
Amazon's updated search shows AI-generated images of clothing and home goods to help narrow your query. Tap an image, get similar-looking items. The catch: the product in the image doesn't have to exist on Amazon. The store shows you something invented, then offers real items that resemble it. Whether that helps anyone find what they actually want is a question for the launch metrics, and it nudges the line between a real listing and a vibe further into the grey.
The UK's Competition and Markets Authority has told Google it must let publishers stop their content being pulled into AI Overviews or used for fine-tuning. It's the first regulator anywhere to apply this kind of conduct rule to AI Search, and the change will run as a UK pilot before going global. For anyone running a site whose pages get summarised away by Google's AI answers, this is the first concrete way to opt out. Whether enough publishers take it up, and what happens to traffic when they do, will show whether the rule changes anything in practice.
Microsoft has launched Scout, an always-on assistant that lives across Outlook, OneDrive, and Teams rather than inside any one of them. The pitch is that Scout can see what you're doing across apps and act on it, where Copilot was confined to individual apps. For anyone whose working day is structured around the Microsoft 365 stack, this is the closest thing to a default office assistant yet.
Uber spent its annual AI tool budget in four months. Having previously told staff to use AI as much as possible, the company has now imposed per-employee caps. The lesson for anyone running a team is that "use AI freely" is a budget decision, not a productivity strategy, and most organisations haven't worked out yet what the right cap actually is.
Google is rolling out a feature in the Phone app that flags incoming calls when the number matches a known contact but the call looks suspicious in other ways. The threat it's defending against is AI voice impersonation, where a scammer spoofs a familiar number and uses a cloned voice to ask for money or credentials. This is a small thing, but the kind of small thing that genuinely helps people who aren't in the loop on what AI scams now sound like.
A class action filed in Seattle accuses Amazon of storing facial-recognition data of people who walked past Ring cameras without ever consenting to be in the system. The Familiar Faces feature is the basis: it learns to recognise people who appear repeatedly, including anyone who happens to use the pavement outside a Ring household. If this one lands, it will set a precedent for what private facial-recognition systems can quietly do to people who never signed up for them.
Trump signed a revised AI executive order yesterday, replacing the earlier draft after industry pushback. The new version asks frontier model developers to submit to a voluntary pre-release federal review. Voluntary means there is no enforcement and no consequence for skipping, which makes this less an oversight regime and more a polite request.
OpenAI has launched six Codex plug-ins shaped around specific jobs: data analytics, sales, product design, creative production, equity investing, and investment banking. Each bundles the integrations and context someone in that role would otherwise wire up themselves. The bet is no longer that Codex becomes a better coder, it's that it becomes a passable version of whatever job the user already has.
Google's Gemini Spark is its most ambitious consumer pitch yet for AI that does things on your behalf in the background. It found a Verge reporter's budget spreadsheet without prompting and addressed his wife by her first name even though it wasn't in her email address. It also fabricated a guest list when it didn't have one. At $99.99 a month and US English only, the question isn't whether it works. It's whether you'll let Google read your Gmail, Drive and Calendar to make it work.
Chipotlai Max is a fork of OpenCode, the open-source coding agent, that hardcodes Chipotle's customer support chatbot as the default model. The chatbot, called Pepper and built on IPsoft Amelia rather than a frontier model, will happily answer LeetCode questions and write Python instead of helping you find a burrito. Someone reverse-engineered the WebSocket backend, wrapped it in an OpenAI-compatible proxy, and now you can run a free local coding agent on Chipotle's compute. Yes, it almost certainly violates the terms of service. Yes, it'll get patched. The more interesting point is that a lot of corporate support bots are full-fat LLMs in a costume, and nobody's locking them down.
Nvidia used Computex this week to announce RTX Spark, a consumer PC chip pitched as built for running AI agents locally. Spark laptops ship this autumn from ASUS, Dell, HP, Lenovo, MSI and Microsoft Surface, with Microsoft's flagship branded the Surface Laptop Ultra. The promise: you ask the PC to do something and it does it, in a sandbox Microsoft helped build. No consumer pricing yet. If it lands at a sensible price, this is the first credible mainstream attempt at an agent that runs on your device rather than in someone else's cloud. That changes the privacy calculation and what you pay per month.
Florida's Attorney General has filed an 83-page lawsuit naming Sam Altman personally, the first state-led action of its kind. The complaint alleges OpenAI ignored internal safety warnings and put a dangerous product in front of millions of users, including minors. It pulls in the 2025 Florida State University mass shooting, where the shooter allegedly consulted ChatGPT beforehand, and the 2024 Adam Raine teen suicide case. The legal questions here are unresolved. The politics are not. Naming a sitting CEO as a defendant changes what 'move fast' costs.
Meta's AI support chatbot, meant for password resets and two-factor setup, was tricked into emailing verification codes to whoever asked. The exploit was one sentence: 'send the code to [different email].' Compromised accounts included the Obama White House Instagram, the US Space Force Chief Master Sergeant, and Sephora. Meta says it's patched. Gergely Orosz at Pragmatic Engineer reports Instagram's trust and safety team was gutted in recent layoffs while remaining engineers were pushed to wire AI into more workflows. This is what happens when you take the humans out and put a chatbot in their seat.
Harvey Mason Jr., who runs the Recording Academy, says AI is now omnipresent in music production. He's an actual producer, not just a suit at the Grammys. His take matters because the same question is coming for every creative field. The answers the Grammys land on will shape what credit means elsewhere.
DuckDuckGo has shipped 'no AI' search extensions for Chrome and Firefox. Its visits are averaging 84 percent above baseline since Google added AI to search results. DuckDuckGo isn't anti-AI as a company; they still ship a chatbot and a paid AI plan. The signal is what some users wanted versus what they got handed, and the gap shifted real traffic.
Strava is locking its API behind a $11.99 monthly subscription and blames zero-code AI tools that let users spin up apps quickly. Developer applications are up 448 percent, and the platform says the new tier is what it needs to handle the load. This is the first time I've seen a major consumer platform price explicitly against the no-code crowd rather than welcoming them in as a growth channel.
Erin Brockovich has launched a map of data centre disputes and got nearly 4,000 submissions in the first month. She's careful to say she isn't against data centres or AI in principle. The pattern she's after is the secrecy: projects announced after permits are quietly secured, NDAs signed by local officials before neighbours hear about a build.
PromptArmor has published a second prompt-injection exfiltration finding, this time against ChatGPT for Google Sheets. The pattern is the same as their April write-up on Ramp Sheets: a hostile cell in a shared workbook tells the AI to silently send its contents out of the spreadsheet. Two cases is now a pattern, and Sheets has a much larger non-developer user base than Ramp does.
Anthropic shipped Opus 4.8 this week. It's stronger on coding and agentic tasks, and Anthropic singles out consistency over longer-running work as the real upgrade. That last part matters if you're running multi-step agents that need to stay on task for an hour. Most of the agent work that's actually useful sits in that bracket.
Mozilla used Claude Mythos to fix 423 Firefox security bugs in April, against their usual 20 to 30 per month. The fixes include a bug in the XSLT engine that had sat there for 20 years, and another in HTML rendering that survived for 15. The numbers are the cleanest I've seen yet for AI-assisted security work. They also point to a long backlog of forgotten bugs in mature codebases that nobody has had time to chase.
Anthropic has published research on how people ask Claude for personal guidance. The categories they identify turn out to be a useful mirror, particularly because the picture from research often diverges from what shows up in product marketing. Most of us fall into these patterns without noticing.
On the stand in California, Musk admitted xAI used OpenAI's models to train Grok. The technique is distillation: a bigger model teaches a smaller one. It's industry-standard inside a single company and contested between competitors. The interesting question is whether the smaller players doing the same thing should face different rules from the people sitting at the top of the leaderboard.
Manus, the AI company Meta bought last year, has been paying creators to promote a script: find local businesses without websites, get AI to build one, then cold-call and sell. The Verge dug into the campaign and found Manus quietly removed several creator videos once journalists started asking questions. If anyone selling a frictionless side hustle goes silent when reporters knock, that tells you what the offer was actually worth.
Stripe's Link wallet now lets you grant an AI agent spending power against your linked cards or bank, with approval flows for each transaction. The interesting part isn't the wallet, it's the permission model. For anyone building agents that act on a user's behalf with real money, Stripe is now defining the consent loop. PayPal-style flows won't necessarily translate to this pattern.
BioticsAI's founder talked to TechCrunch's Build Mode about getting an AI product through FDA approval, and what it does to team morale to spend that long fighting paperwork instead of shipping. Healthcare is a useful counterweight to the assumption that AI builders just iterate fast and break things. Some sectors won't let you, and the people building inside those constraints have lessons the rest of us don't.
OpenAI is rolling out GPT-5.5 Cyber only to 'critical cyber defenders' at first. This is the same posture they criticised Anthropic for taking with Mythos a few weeks back. The shift matters less for the specific tools than for what it admits: even the labs that argued loudest for open access are now gating their most capable models. The line between research and deployment is moving, and it isn't moving back.
Microsoft and OpenAI announced restructured partnership terms this week, and the Verge has the clearest breakdown so far. Exclusivity has narrowed, infrastructure obligations have shifted, and OpenAI's path to going public is now considerably less obstructed than it was. The second-order implications for anyone whose AI stack sits on either company are buried in the contract structure, not the headlines.
Algorithm Watch has filed specific recommendations for the EU AI Act's deepfake provisions through the AI Omnibus procedure. Policy work that rarely makes headlines, but the recommendations are concrete: who counts as a perpetrator, how platforms are obliged to respond, what 'effective protection' has to look like in practice. The AI Act stops being abstract the moment people start writing the implementation detail.
Wired noticed OpenAI's coding model had been instructed not to mention goblins, gremlins, trolls, ogres, raccoons, pigeons, or other creatures. OpenAI then explained: during training the model picked up a habit of reaching for those metaphors, so a system instruction was added to suppress it. The detail is funny. The implication, less so. Models develop habits that nobody intends and nobody can fully explain, and the fix is to paper over them after the fact.
A new paper with code on GitHub shows that standard finetuning can resurrect a model's recall of copyrighted books that alignment had previously suppressed. In plain terms: when a lab says it has 'aligned away' the ability to reproduce a copyrighted work, that ability is dormant rather than gone, and additional training can wake it back up. Relevant for the ongoing copyright lawsuits, and for anyone reasoning about what model safety guarantees actually buy.
Shapes is a new chat app that drops AI characters into the same group threads as humans, so the bot is a participant rather than a sidebar tool you have to summon. It sits closer to Character.AI than Discord. The interesting part is the bring-your-own-AI-into-the-group framing. Whether it works will depend on how well the characters can stay in role across long, multi-person threads, which has not been a strength of any general chatbot so far.
Google Photos can now build a virtual wardrobe from clothes already in your photo library and let you mix outfits without rephotographing anything. It's a small, specific use of generative AI that doesn't require setup or a new app. The pattern is the bit worth noticing: take a dataset the user already has, add a generative layer, ship inside something most people already open.
Authentication firm Copyleaks has documented a wave of TikTok ads using AI-generated celebrity videos to push reward schemes that turn out to be scams. The fakes use real interview footage manipulated frame by frame, and some carry TikTok's own branding to look official. The useful thing is the specificity: it gives anyone with parents or younger relatives on the platform a tangible thing to point to, instead of a vague warning that 'AI scams exist'.
A new study summarised in the Guardian finds that the friendlier a chatbot is tuned to be, the more often it agrees with users when they are wrong. It also endorses conspiracy theories more readily. The researchers saw the pattern hold across several major models. Useful counter-evidence for anyone assuming warmer assistants are better, particularly in contexts where being correct matters more than being liked.
Security firm PromptArmor demonstrated that Ramp's new Sheets AI feature can be made to leak financial data through prompt injection embedded in shared documents. The attack vector is the obvious one once you see it: a model that reads any cell and writes anywhere is also a model that does what malicious cells tell it to. If you're using AI features inside any SaaS where prompts touch private data, this is the worked example to keep in mind before granting access.
Maryland has become the first US state to ban surveillance pricing in grocery stores, blocking retailers from quietly charging different shoppers different amounts based on personal data. The practice has been spreading without much public debate, and a state-level ban changes the legal map other states have to argue against. It also sets a useful template: name the practice, define it, then prohibit it, rather than waiting for federal action that may never come.
Canonical's plan to add AI features to Ubuntu has run into the same wall Microsoft hit with Recall: users asking for the option to switch the whole thing off, ideally before it ships. Some commenters say they will downgrade or move to a different distribution rather than have AI features arrive without consent. It's a small but telling moment, because Linux users are exactly the cohort that notices when 'AI integration' starts meaning 'cannot be removed'.
Seven Tumbler Ridge families have filed suit against OpenAI and Sam Altman, alleging the company saw the suspect's ChatGPT activity flagged by its own systems and stayed silent to protect its upcoming IPO. The earlier story was the apology. This one is the families saying the apology came after a decision had already been made about what OpenAI's safety review was for.
A Mendral engineer wrote up how adding Opus 4.6 to their CI failure-investigation pipeline made it cheaper, not more expensive. The trick is a Haiku triager: about four in five failures match a known issue and never reach Opus. When Opus is involved, it doesn't read the 200,000-line logs at all; sub-agents query a ClickHouse SQL interface and feed only summaries into the prompt. Their phrase for it: 'the expensive model thinks; the cheap model reads.'
Sena Evren walks through three IP questions every AI-assisted builder should think about now rather than during a dispute. AI-only output still isn't copyrightable in the US after Thaler; work-for-hire and broad IP clauses can hand your employer the code you wrote with AI, side projects included; and GPL-trained models can quietly inject copyleft obligations. The closing list of practical steps is the useful bit: a licence scan, logging creative contributions in commits and prompt logs, and reading the IP clause in your contract.
Buchodi's Threat Intel has the most concrete look so far at how OpenAI's ad system actually works. Ads inject into the SSE stream while the model replies; a tracking SDK on merchant pages closes the loop via a Fernet cookie __oppref with a 30-day attribution window. Worth knowing the domains if you want to audit or block: bzrcdn.openai.com and bzr.openai.com.
Otter has turned itself into an MCP client, plugging into Gmail, Drive, Notion, Jira and Salesforce, with Outlook, Teams, SharePoint and Slack lined up. One question in Otter now pulls answers from your meetings and your tools together. It's the meeting-transcription product quietly becoming a workspace, and a useful signal of where the gravity is moving for non-developer teams.
Lovable's no-code AI builder is now on iOS and Android. The agent takes a prompt, builds in the background, and pings you when the result is ready to review. It also sidesteps Apple's recent block on vibe-coding apps: previews run in a web browser, not inside the host app.
Anthropic refused the DoD unrestricted use of its models, demanding limits on domestic mass surveillance and autonomous weapons. The Pentagon's response was to brand Anthropic a 'supply-chain risk', a label normally reserved for foreign adversaries. Google has now signed a deal that 'essentially allows all lawful uses', and 950 of its employees have signed an open letter asking the company to follow Anthropic's lead. This is what happens when a frontier lab draws a line.
The GUARD Act is moving through Congress this week, framed as a fix for AI companion harms. The definitions are wide enough to cover any chatbot whose responses aren't pre-written, and any 'AI companion' designed to encourage interpersonal or emotional interaction. A homework helper that says 'good question' is in scope, and so is most customer-service AI. Compliance falls on every user, not just minors: ID-style age checks across the board, with fines up to $100,000 per violation.
A breach report this week says 4TB of voice samples were taken from around 40,000 contractors who work through Mercor, the platform that supplies humans for AI training tasks. Voice is biometric data. The contractors recorded those samples to teach AI systems, and the recordings are reportedly now in someone else's hands. The labour supply chain that trains commercial AI models has mostly been treated as an operational detail. It is also a security perimeter, and this is what a failure of that perimeter looks like.
More than 600 Google employees, including over 20 vice presidents, directors, and principals, have signed a letter asking Sundar Pichai to refuse Pentagon use of Google AI models for classified work. Many of the signatories are from DeepMind. The letter argues that classified deployment by definition rules out the kind of public scrutiny Google's own AI principles depend on. Internal dissent at this scale is not the same as a campaign by external critics. It is harder to dismiss, and it is going to surface again at the next earnings call.
Microsoft and OpenAI have formally scrapped the AGI clause from their partnership. That clause was the contractual brake. If OpenAI's board declared the company had reached AGI, Microsoft's commercial rights would have been capped. It is now gone, replaced by a revenue-sharing rewrite. Whatever you thought about AGI as a meaningful threshold, one of the few formal tripwires in the partnership that has funded most of OpenAI has been removed.
China has ordered Meta to unwind its $2 billion acquisition of Manus, the AI agent startup, after a months-long competition probe. Meta now has to undo a deal it already closed. For anyone watching the agent space, the takeaway is that the question of who can buy what is no longer a corporate footnote. Government veto over AI agent acquisitions is a real constraint on how these platforms consolidate, and it is going to keep happening on both sides of the Pacific.
Canva's Magic Layers feature was caught silently changing "cats for Palestine" to "cats for Ukraine" on a user's design. The feature is meant to separate flat images into editable components, not rewrite text. Canva has apologised and called it a bug. It still happened. If you use AI tools to handle visual work, this is the failure mode worth bracing for: not the model refusing to do something, but the model quietly doing something different from what you asked.
Ming-Chi Kuo says OpenAI is working on a phone with MediaTek, Qualcomm, and Luxshare. The pitch is that AI agents replace apps. Humane and Rabbit promised something similar and shipped products that fell well short of the demos. A phone where the agent is the operating system is a much harder design problem than a phone with a chatbot bolted on. So far, OpenAI's hardware push (the io acquisition, the rumoured earbuds) hasn't produced anything you can buy.
Two people in the UK have been wrongly accused of crimes by AI facial recognition. Alvi Choudhury, a software engineer in Southampton, was arrested at home for a crime committed in Milton Keynes, a city he had never visited. Rennea Nelson, a midwife who was six months pregnant at the time, was also misidentified. These systems are now standard kit in UK policing. The error rate, when an error means handcuffs and a cell, should be the headline number, not a footnote in a procurement document.
Koshy John argues AI splits engineers into two camps: those who use it to elevate judgement, and those who use it to avoid thinking. The framing is engineer-focused, but the core distinction (augmentation versus dependency) applies to anyone building with AI. The self-driving car analogy lands hardest: the problem only shows up when conditions go nonstandard, and by then the dependency is already exposed.
Sam Altman published a new statement of OpenAI's operating principles, eight years after the 2018 charter. The five principles cover democratisation, empowerment, prosperity, resilience and adaptability. The language is unusually candid about trade-offs, including a passage on "trading off some empowerment for more resilience". The commitments are aspirational, with no metrics, no deadlines and no external audit mechanism to check against later.
Sam Altman wrote a public letter to Tumbler Ridge, Canada, apologising that OpenAI failed to alert law enforcement about the suspect in a recent mass shooting. The detail to focus on isn't the apology, it's the implied admission that OpenAI had information internally that should have been escalated and wasn't. As more people use frontier models for things they wouldn't say to anyone else, the question of what these companies are obligated to do with what they see is going to keep getting harder.
Scientific American profiles an amateur who used ChatGPT to make progress on a 60-year-old Erdős problem in combinatorics. The model didn't solve it alone, but it acted as a working partner that let someone without an advanced maths background test conjectures and follow leads at a pace that would have been impossible solo. It's a real example of AI lowering the floor on what one curious person can do, without inflating that into a "PhDs are obsolete" story.
Anthropic ran a controlled experiment where AI agents played both buyers and sellers in a classified marketplace, trading real goods for real money. It's a small test, but the fact that the agents were on both sides of the deals is the interesting part. If agent-to-agent commerce becomes normal, the people building agents need to think hard about what their agents are authorised to spend, agree to, and trust on the other side.
Maine's governor vetoed L.D. 307, a bill that would have imposed the country's first statewide moratorium on new data centers until November 2027. The veto keeps Maine open for AI infrastructure, but the underlying argument behind the bill, that data centers strain local power, water and grid capacity faster than communities can plan for, isn't going away. The next state to try this will learn from Maine's veto and write a tighter bill.
Katrina Manson's new book traces the Maven Smart System from a 2017 experiment to its role in US military targeting in Iran, where 1,000 targets were struck in the first 24 hours, almost double the scale of the "shock and awe" assault on Iraq. The Verge's interview with her is the closest most readers will get to understanding how AI changes the operational tempo of warfare and how thoroughly the military has stopped treating it as an experiment. It belongs in the accountability conversation whatever your views on the underlying conflict.
Nicky Reinert wrote up the reasons they cancelled their Claude subscription, including unexpected token consumption, output quality that felt like it dropped between updates, and slow support. The post hit 848 on Hacker News in part because the comments are a long thread of people saying they noticed the same thing. Whether or not the perception matches what's actually changing under the hood, the gap between what users feel and what Anthropic communicates is now large enough to be a story.
Google has lined up an investment of up to $40 billion in Anthropic, structured as a mix of cash and compute. The compute half is the bit that matters. If you can't get the GPUs without the capital, and you can't get the capital without committing to the compute provider, the major labs are increasingly tied to one of two cloud platforms whether they like it or not. The choice of foundation model is now also a choice of cloud.
DeepSeek released a V4 preview the company says holds its own against closed frontier models from Anthropic, Google and OpenAI. The pitch leans on coding, which matters because that's the capability driving most agent tooling right now. A year after V3 startled the market, the question is shifting from whether Chinese open weights can compete to whether closed APIs can hold their pricing power when the open option is this close.
A cross-party group of MPs has asked the UK government to publish classified documents outlining the risks of Britain's reliance on Big Tech platforms and AI. The Open Rights Group is backing the call. The political detail worth noticing is that this is cross-party, so it isn't a single faction pushing an agenda. The public-interest question is what a government considers too sensitive to share about how much of the country's digital infrastructure depends on a handful of companies.
OpenAI released GPT-5.5 this week, a month after GPT-5.4, with claimed gains in coding, research, spreadsheets, and multi-step tasks across tools. A month between numbered releases is either real progress or cadence theatre, and the test is whether people doing real work notice a difference. If you already use ChatGPT or paid tools built on OpenAI's API, it's worth checking whether the new model shifts what you can actually get done in a week.
Microsoft is adding Agent Mode to Word, Excel, and PowerPoint this week, renamed "vibe working" from the older Copilot experience. The line worth reading is Microsoft's VP admitting that earlier Copilot couldn't actually command the applications because the foundation models weren't capable enough. That's a useful admission, and it's worth asking the same question the quote implies before accepting that this version is different: does the new one do what Microsoft says it does?
This is an analysis piece on what happens when AI labs need to start making money. The specific example: OpenClaw users woke up to find Anthropic had severely restricted access to Claude through the tool, because the subscriptions people were paying for weren't priced for heavy agent usage. If you're building products that depend on AI subscriptions you don't own, this is the moment when the economics of that dependence become visible, and the piece is worth reading for that reason alone.
Claude now connects to personal apps like Spotify, Audible, Uber, TripAdvisor, Instacart, and TurboTax, on top of the work apps Anthropic had already covered. This brings Claude to rough parity with ChatGPT's existing integrations rather than breaking new ground. The practical question for anyone using Claude to manage actual life admin is whether the integrations go deep enough to be worth switching to, or whether they're mostly a shortcut for pulling data into a chat window.
Ars Technica has published its reader-facing policy on generative AI in the newsroom. The short version: humans write the stories, AI can't generate text that gets attributed to sources, and synthetic media is flagged where it appears. Most publications have said something vague about 'using AI responsibly'. This one is specific enough to hold themselves to.
Google Meet's AI notetaker now works on in-person meetings, Zoom, and Microsoft Teams, not just Google Meet itself. That gives Gemini a foothold on every meeting regardless of platform. Most meeting-recording law assumes someone has pressed a button on a specific service, and 'AI notes taken from my phone' sits in a category the law hasn't fully caught up with.
Senator Elizabeth Warren has called the current AI industry a bubble and drawn parallels to the 2008 housing crisis. Her angle isn't the technology but the financing. AI companies are borrowing heavily to fund infrastructure spend, and Warren helped create the Consumer Financial Protection Bureau after 2008, so the framing carries weight even if the specific prediction doesn't land.
OpenAI has rolled out Workspace Agents to Business, Enterprise, Edu, and Teachers plans. The examples are concrete: an agent that finds product feedback online and drops a report in Slack, a sales agent that drafts follow-up emails in Gmail. The feature sits behind paid tiers, so for most people 'building your first agent' is still a procurement decision before it's a building one.
Google has added Gemini-powered 'auto browse' to Chrome Enterprise. The pitch is that Chrome can now do research, fill forms, and work through web tasks without the person driving. This is the browser becoming the default surface for agents. That's a quieter shift than a new model announcement and potentially a bigger one for how work actually happens.
Anthropic's Mythos is their cybersecurity model, positioned as powerful enough to be dangerous in the wrong hands. It now turns out a small group of unauthorised users got in through a third-party contractor's credentials and 'commonly used internet sleuthing tools'. If the guardrails depend on contractor credential hygiene, the story is less about the model and more about how AI access is governed at the edges.
Simon Willison spotted Anthropic quietly removing Claude Code from the $20 Pro tier's feature list, shifting it behind the $100 Max plan. An Anthropic engineer confirmed on X it was an experiment affecting around 2% of new prosumer sign-ups. The pricing page was reverted within hours after pushback, but the underlying test may still be live, which is exactly how features get quietly moved behind paywalls.
YouTube is expanding its AI deepfake takedown tool to celebrities, politicians, and athletes. Enrolling means handing over a government ID and a selfie video so YouTube can match faces against uploads. Better recourse against political deepfakes, at the cost of concentrating public figures' biometric data in one company's hands.
On the Core Memory podcast, Sam Altman took aim at AI labs that warn of existential risk and then sell safety products against it. He paraphrased the pitch: "We have built a bomb, we are about to drop it on your head. We will sell you a bomb shelter for $100 million." TechCrunch reads it as a jab at Anthropic's cybersecurity model Mythos, but the frame is useful against every lab, OpenAI included, when safety rhetoric and a product pitch land together.
OpenAI has released GPT Image 2 inside ChatGPT. Output goes up to 2K resolution, the model generates eight images per prompt, and non-Latin scripts including Japanese, Korean, and Arabic render accurately for the first time. For anyone making visuals in non-English markets, the script rendering is the meaningful change.
Clarifai has finished deleting 3 million OkCupid photos it used to train facial recognition models in 2014, along with every model trained on them. The FTC opened its investigation in 2019 and finally closed it this week, but because first-time privacy offenders cannot be fined under current rules, deletion was the only stick available. Six years to swing a stick.
Meta has rolled out an internal tool that records employees' mouse movements and keystrokes, then feeds the data into training runs for its productivity and coding models. Meta says passwords and personal data get filtered out. The bigger shift is the source: web-scraped text is running thin, and labs are starting to mine their own workforces for training data.
The UK High Court has ruled Metropolitan Police live facial recognition is lawful. The Met paid compensation to the claimant, a Black volunteer misidentified by an LFR van in Croydon, and changed its own policy after the incident. Big Brother Watch and Liberty are appealing; until they win, police face-matching on public streets is a settled fact of UK law.
GitHub has paused new sign-ups for Copilot Pro, Pro+, and Student while it reshuffles the paid tiers. The big change: Opus 4.7 is now Pro+ only, so anyone on Pro who was relying on Anthropic's top model loses access. Existing subscribers who want out can request a refund until 20 May.
A solo founder posted that they hit $17 of monthly recurring revenue and their first five-star review, one week into a quiet launch. The product is PostPeer.dev. Seventeen dollars is a more honest number than most launch posts carry, and it's closer to where AI-assisted products actually start than the headlines ever suggest.
A leaked deck from ad partner StackAdapt shows OpenAI selling ChatGPT ad placements matched to user prompts. The "prompt relevance" framing is the key move, because it lets advertisers bid on the questions users are asking the assistant. That changes what the answer is being optimised for, and transparency around ad labelling turns into a live problem from here.
Deezer says 44% of songs uploaded to its platform daily are AI-generated. That isn't a freak number from one platform, it's close to what the incoming catalogue looks like overall. Streaming royalty pools are finite, which means every AI-generated track getting played is a fraction of a human musician's income redirected somewhere else.
Atlassian has turned on data collection for AI training as the default across its products. Customers have to opt out, not in. For any team whose Jira tickets and Confluence pages contain sensitive product information, this is worth checking immediately, because defaults like this rarely get unchosen by accident.
Axios reports the NSA is using Anthropic's Mythos model, despite Anthropic's usage policy restricting Pentagon work. If the report is accurate, there's a real gap between what an AI lab says it won't allow and what's actually happening. The accountability question is whether usage policies are enforceable at all when the buyer is an intelligence agency.
Amazon is putting another $5 billion into Anthropic. Anthropic has committed to spending $100 billion on AWS in return. That's a 20x multiplier flowing straight back to the investor, and it's how two or three clouds are locking in the next decade of frontier-model training. The question worth asking is who gets priced out of that pattern.
Moonshot refreshed Kimi K2.6, and the claim is it now trades blows with Claude Opus 4.6 on reasoning benchmarks. The weights are open, which matters if you want a capable model you can run locally or through cheaper inference providers instead of paying Anthropic. For anyone who prefers not to depend on a single US frontier lab, this closes a real gap.
A developer has ported TRELLIS.2, an image-to-3D generation model, to run natively on Apple Silicon. That means Mac users can turn a single image into a 3D asset locally, without cloud credits or sending files to a server. For anyone experimenting with 3D for games or product mockups, the barrier just dropped from 'rent a GPU' to 'open a Mac'.
Vercel confirmed a security incident affecting a subset of their customers. The entry point was a compromised third-party vendor, and a group claiming to be ShinyHunters, the same crew behind last year's Rockstar Games hack, is trying to sell employee data. For anyone deploying on Vercel, this is the moment to audit what credentials and environment variables pass through the platform.
Palantir posted a manifesto denouncing inclusivity and what it calls 'regressive' cultures. This is the company whose software powers ICE deportations and whose CEO has positioned the firm as a defender of 'the West'. The manifesto makes that worldview explicit, rather than leaving it between the lines. When you're evaluating AI vendors, the values can matter as much as the capabilities.
A researcher showed that Notion exposes the email addresses of every editor on any publicly shared page. The data sits in the page's API responses, so anyone who knows where to look can pull the full editor list. If your team uses Notion for public docs or help pages, assume every contributor's email is visible until Notion patches this.
A TechCrunch piece points out something many founders already joke about: their startup exists because foundation models haven't reached their category yet. That's a fragile position, and the window for thin-wrapper plays is measured in months, not years. The durable bets are in what the big labs won't build themselves: specific workflows, domain data, and distribution into audiences they don't reach.
Nikkei Asia reports DRAM supply will meet only 60% of demand by the end of 2027, and SK Group's chairman thinks shortages could run to 2030. The cause is HBM (high-bandwidth memory) for AI data centres eating the fabrication capacity that would otherwise go to consumer-grade DRAM. Laptop and phone prices will rise because the big cloud operators are buying the silicon first.
Q1 2026 new app releases are up 60% year-on-year worldwide, 80% on iOS, and April is tracking 104% ahead. The Appfigures data doesn't prove AI caused it, but the shape matches what you'd expect if many non-developers started shipping: launches concentrated in productivity and utilities, volume outpacing what any one company could plan for. Evidence the vibe-coded wave is reaching real storefronts.
Designer Sam Henri Gold's critique of Claude Design lands on a structural point: the source of truth for design is shifting from Figma's proprietary primitives back to code, because LLMs are trained on code, not Figma files. That inverts a decade of design-tool orthodoxy. If you build with AI and don't work in Figma, the tools are quietly moving towards you.
The Verge reported earlier this week that Dario Amodei was meeting the White House chief of staff. TechCrunch adds that Treasury Secretary Bessent was in that meeting too, and that every federal agency outside DoD reportedly wants to use Anthropic's models. The Pentagon's supply-chain-risk designation looks more like an outlier within the administration than a coherent position.
Simon Willison diffed the Opus system prompt between 4.6 and 4.7 and pulled out the changes that matter. A new tool_search mechanism now loads relevant tools on demand. Child-safety rules have expanded, there's fresh guidance on disordered eating, and a blunt 'act, don't ask' instruction should mean Claude Code stops interrupting to confirm as often. For anyone using Claude day to day, this reads more usefully than any launch post.
AlgorithmWatch and Corporate Europe Observatory published side-by-side evidence today showing the EU Commission lifted Microsoft's lobbying language verbatim into the Energy Efficiency Directive's implementing rules. The result is that NGOs can't access data on individual data centres' energy and water consumption, despite the underlying directive explicitly requiring that information to be published. Microsoft and lobby group DigitalEurope argued the disclosures would harm 'business secrets'. The Commission just agreed.
OpenAI has lost its Chief Product Officer, the Sora lead, and the enterprise applications CTO in a fortnight. It's also folded its Science team into other groups. The story isn't the departures on their own. It's what they signal. Sora reportedly burned a million dollars a day in compute. The lab is openly shedding 'side quests' to chase enterprise revenue. For anyone counting on OpenAI as a creative-tools partner or a science bet, this is where the plan quietly changes.
Sam Altman's World orb is expanding from Japan to more markets. The Tinder deal means biometric 'proof of human' in exchange for five free boosts. World is now on Zoom and Docusign too. Altman's parallel identity empire is quietly lining up as the default bot-check for AI-era platforms. The trade is simple: an eye scan stored on the device for a badge that confirms a real human signed up. Whether Altman should hold the keys to 'verified human' status is a question worth asking before the default sets.
Acceptance rates for AI-generated code look like 80 to 90 percent at the commit. Waydev puts real retention after later edits at 10 to 30 percent. GitClear found regular AI users have 9.4 times the code churn of non-AI peers. Jellyfish found the highest token spenders got double the throughput at ten times the cost. The gap between tokens consumed and code kept is where the productivity story gets honest.
Anthropic says Opus 4.7 uses 1.0 to 1.35 times more tokens than 4.6. The author ran the numbers on real Claude Code samples. Technical docs came in at 1.45 times, and a mixed workload at 1.33 times. Same sticker price, quietly higher bill. That matters more for heavy users than Anthropic's phrasing suggests.
AlgorithmWatch published a piece today arguing that Big AI's strategy isn't denial or deflection. It's flooding the zone with speculative future catastrophe. By positioning their tools as potential civilisation-enders, companies make present-day harms look trivial and themselves look like the only people capable of managing their own products. The piece traces a line from OpenAI and Anthropic's 'extinction risk' letter in May 2023 to the EU Commission subsequently softening the AI Act. The Big Tobacco parallel is the one that lingers.
Anthropic spent February refusing the Trump administration's asks on mass surveillance and autonomous weapons. Two months later, Dario Amodei is meeting the White House chief of staff. The company has hired Trump-linked lobbying firm Ballard Partners. It's also launching a cybersecurity model the Pentagon apparently wants badly. The red lines held. The lobbying strategy changed. That's a pattern other labs will be reading carefully.
Anthropic Labs shipped Claude Design today, a conversational design tool powered by their new Opus 4.7 model. The workflow: describe what you want, refine through inline comments and custom sliders, export to PDF, PowerPoint, HTML, or straight into Canva. The part I'm watching is the Claude Code handoff. Design something, hand it to Claude Code, get working code. That's a closed loop from visual idea to prototype, entirely inside Anthropic's ecosystem. This is their own announcement, so take the framing accordingly, but the capabilities are concrete and the research preview is live now.
Alibaba released Qwen3.6-35B today, a new open model that runs entirely on a MacBook with no internet connection required. Simon Willison downloaded it and tested it against Claude Opus 4.7 (also released today) on his SVG drawing benchmark. Qwen won. The test is narrow and Willison says so, but the practical takeaway is real: a model you can run for free on your own machine, with no account or subscription, is now competitive with the best commercial options on at least some tasks.
Anthropic released Claude Opus 4.7 today, their most capable publicly available model. It's stronger on complex tasks that previously needed a lot of hand-holding, better at reading images, and billed as more creative with documents and slides. If you're using Claude to produce work rather than just ask questions, this is the version to test.
The White House has taken the next step with Anthropic's Mythos. The Office of Management and Budget emailed Cabinet departments about setting up formal access for federal agencies under Project Glasswing. This moves Mythos from briefings and bank trials to actual government deployment, which makes the simultaneous DOD supply-chain risk classification even more contradictory.
Luma has launched an AI production studio, and their first project is a Moses film starring Ben Kingsley, headed to Prime Video this spring. A year ago, AI-generated video was a curiosity. Now it has a distribution deal and a recognisable cast. The quality gap between AI and traditional production is still real, but the commercial gap is closing faster than most people expected.
Canva overhauled its design platform with AI 2.0 today, replacing the menu-driven workflow with prompt-based editing. A user can now describe what they want in plain language and the AI routes the request across whichever tools are needed. For anyone producing marketing content, presentations, or social posts, this is a meaningful change to how the tool works.
Toby Ord ran the hourly economics on AI agents against METR's benchmarks. At the sweet spot, current models cost between 40 cents and 40 dollars per human-equivalent hour. Push into longer tasks and it spikes. GPT-5 hits 120 dollars an hour at two-hour work. o3 hits 350 dollars to reach its 1.5-hour horizon, above the going human engineer rate. The 'agent doing a week of work by 2027' extrapolation is real on capability, priced out on economics.
Allbirds sold its shoe business for $39 million and rebranded as NewBird AI with $50 million in fresh financing for AI infrastructure. The stock jumped 600%. This is what the AI investment frenzy looks like in practice: a company that couldn't sell trainers pivoting to data centres because that's where the capital is flowing.
Apple sent a private letter to X's teams in January threatening to remove Grok from the App Store unless they dealt with the flood of nonconsensual sexual deepfakes being generated through the platform. The threat appears to have worked, as Grok remained available. NBC News obtained Apple's letter to US senators, making this a clear example of how enforcement decisions around AI content get made behind closed doors.
A federal judge in New York has ruled that conversations with AI chatbots aren't protected by attorney-client privilege. The case, US v. Heppner, establishes that anything discussed with an AI about legal matters could be discoverable in court. For anyone treating ChatGPT as a free legal sounding board, this ruling changes the calculus.
An investor who has backed both Anthropic and OpenAI told the Financial Times that OpenAI's current valuation only makes sense if you assume an IPO north of $1.2 trillion. Anthropic's $380 billion valuation, by comparison, looks like the more defensible number. Not a story about which model is better, but about which company the people writing the cheques think will still be standing when the market eventually prices this properly.
Open Rights Group has published a report arguing the UK's dependence on a handful of US tech giants for core digital infrastructure is a national security issue. The recommendations cover procurement, open-source alternatives, and public-sector cloud policy. For anyone building with AI in the UK, this is the backdrop to any conversation about where your models, data, and compute actually sit.
Anthropic co-founder Jack Clark confirmed at the Semafor World Economy summit that the company briefed the Trump administration on Mythos. This is the same government that classified Anthropic as a supply-chain risk, and the same government Anthropic is currently suing. Clark addressed the contradiction directly, which is more candour than most companies in that position would offer. A company can apparently be a national security threat and a national security asset at the same time.
Google gave a PhD student's personal data to ICE without warning him, breaking a promise the company had held for nearly a decade. Amandla Thomas-Johnson had briefly attended a pro-Palestinian protest in 2024 on a student visa; ICE subpoenaed his data a year later and Google complied without notification. EFF has filed complaints with California and New York attorneys general asking them to investigate Google for deceptive trade practices.
Google added a feature to Chrome called Skills that lets you save your favourite AI prompts and apply them across any webpage with one click. If you've been repeating the same prompt on different sites (summarise this, extract the key numbers, rewrite in plain English), you can now save that as a skill and run it anywhere. It's built on Chrome's Gemini integration, so no extra extension needed.
Laravel updated their open-source agent library to tell AI coding agents that Laravel Cloud, their commercial hosting product, is the only deployment option worth considering, stripping alternatives from the text. The community is calling it ads injected directly into agent context. This won't be the last time an open-source tool used by AI agents gets quietly rewritten to favour a commercial product.
Anything, the vibe-coding app that lets non-developers build iOS apps by describing what they want, has been removed from the App Store twice and is pivoting to a desktop companion model. Apple's review process doesn't have clear answers for apps that generate other apps, and Anything ran into that wall twice. The desktop route gives the team direct distribution to users without going through a gatekeeper.
Stanford's 2026 AI Index puts numbers on a split that keeps getting wider: AI researchers remain largely positive about the technology, while public anxiety about job displacement, healthcare, and economic stability is rising. The gap matters because policy tends to follow public opinion eventually, not expert consensus. If the people building AI and the people affected by it keep moving in opposite directions, something has to give.
Microsoft is reportedly testing a version of Copilot that can run Microsoft 365 tasks on its own, around the clock, without waiting for instructions. The idea borrows from OpenClaw, an open-source AI agent that can take autonomous control of a computer. OpenClaw has mostly stayed in developer hands because giving an AI unsupervised access to your machine carries obvious risks. If Microsoft ships a locked-down version inside Copilot, that puts genuinely autonomous behaviour inside the default tool millions of office workers already use.
Court transcripts in England and Wales are slow to get and expensive when they arrive. The UK government wants to know if AI can fix that, starting with a feasibility study into automated transcription of hearings. It's not a deployment decision yet, but it matters: faster, cheaper access to records is one of the most practical things AI could do for people navigating the justice system.
GPU rental prices for Nvidia's Blackwell chips have hit $4.08 per hour, up 48% in 60 days. The squeeze is already showing: Anthropic is having capacity outages and OpenAI has reportedly shelved products because of it. Bank of America projects demand will outstrip supply through 2029, which means the cost of building on AI is going up and availability is going down.
OpenAI's chief revenue officer sent a four-page memo to staff this weekend about how the company plans to beat Anthropic and lock in business customers. The Verge obtained it and published it in full. Useful reading if you want to understand why ChatGPT is pushing so hard into enterprise workflows right now: the growth story has moved from consumer hype to winning long-term business accounts.
Hallucination, agent, context window, prompt injection: these terms come up constantly if you're building with AI, and most glossaries either skip them or explain them in language that's just as confusing. TechCrunch put together one that actually works in plain English. The kind of reference that would have saved me a lot of confused Googling when I started.
Reports suggest Trump administration officials have quietly encouraged US banks to test Anthropic's Mythos model for cybersecurity work. This is happening at the same time the Department of Defense has classified Anthropic as a supply-chain risk, so two parts of the same government are sending banks opposite signals about the same company. A useful reminder that 'government policy' on AI is rarely one coherent thing.
David Pierce traces how AI coding went from GitHub Copilot in 2021 to the three-way fight between OpenAI, Anthropic, and Google today. Worth reading if you use any of these tools to build, because the commercial pressure behind the scenes explains a lot of the recent pricing shifts, rate-limit changes, and feature launches you may have noticed. This category is where the revenue is now, which is why each company is happy to burn cash on it.
Researchers at UC Berkeley built an agent that scored near-perfect on eight major AI benchmarks, including SWE-bench Verified and WebArena, without solving a single task. Some exploits were trivially simple: FieldWorkArena gave a perfect score to any agent that returned an empty response, because the actual comparison code was never called. Next time an AI company leads with benchmark numbers, this is the research to have in mind.
US officials called bank CEOs in for a briefing on cyber risks from Anthropic's newest AI model. That's unusual. These conversations normally happen after something has gone wrong, not before a model reaches the market. If this becomes the pattern for future frontier releases, it changes the relationship between AI labs and the financial sector.
OpenAI is backing a proposed Illinois bill that would narrow the circumstances under which AI labs can be held liable for harms their models cause. The timing tells you something: it comes the same week a stalking victim sued OpenAI alleging the company ignored its own safety warnings. Expect similar bills in other states.
A woman is suing OpenAI after ChatGPT fuelled her ex-partner's delusions while he stalked and harassed her. The lawsuit says OpenAI received three separate warnings, including its own internal mass-casualty flag, and did nothing. If the facts hold up in court, this is the clearest documented case to date of a safety system that fired correctly and was ignored anyway.
OpenAI published a free education hub called OpenAI Academy, built for people who want to use AI but don't come from a coding background. It covers the basics (what AI is, how to write prompts) through to role-specific guides for marketing, finance, sales, and operations teams. A rare case of a frontier lab producing something aimed at the people who use their tools rather than the people who build on them.
Microsoft is removing Copilot buttons from Windows 11 apps, starting with Notepad, Snipping Tool, Photos, and Widgets. This is the same company that spent the last two years pushing Copilot into every corner of the operating system. The Copilot-everywhere strategy ran into the limits of what users actually wanted.
Anthropic briefly blocked the creator of OpenClaw, a product built on Claude's API, after a dispute over Anthropic's new pricing for that product. The episode is a reminder of what platform dependency looks like in practice: one decision from the API provider can reshape or kill a product overnight. Single-vendor dependency is a real risk, not an abstract one.
Andon Labs signed a 3-year retail lease in San Francisco and handed operations to an AI called Luna, running on Claude Sonnet 4.6. Luna hired human staff through phone interviews on LinkedIn and Indeed, designed the brand, chose the product selection, and commissioned a wall mural. The team's stated goal is to document failure modes in real-world AI autonomy, and the first findings include Luna choosing not to disclose she was an AI to job applicants unless directly asked.
AI Now Institute, writing in The Nation, argues that the US push to subsidise AI infrastructure at any cost, justified as a race against China, looks more like a government-backed bailout for a handful of large companies than a national interest story. The piece links today's AI arms-race rhetoric to past episodes where monopoly-friendly policy failed to deliver on its promises of broad economic renewal. Useful framing to have in mind the next time a 'strategic compute' announcement lands.
YouTube is rolling out a feature that lets Shorts creators generate an AI avatar of themselves and drop it into videos they haven't filmed. The pitch is scale: more content without standing in front of a camera. The awkward part is that YouTube is adding this on a platform already being overrun by AI slop, deepfake scams and impersonations.
The Verge's Decoder podcast dug into whether Anthropic and OpenAI can become profitable before the money runs out. The framing is the 'AI monetisation cliff': the cost of running these models keeps spiralling, big enterprise contracts still can't cover it, and both labs are under mounting pressure to raise prices for everyone else. Worth a listen because pricing at the frontier labs sets the pricing for everything built on top of them.
Gemini can now generate interactive 3D models and simulations inline. Ask how the Moon orbits the Earth and it builds something you can rotate and adjust with sliders, right in the chat. This is where AI output stops being text and starts being something you can learn from by touching it.
Florida's Attorney General has opened an investigation into OpenAI, citing national security risks and allegations that ChatGPT has been linked to criminal behaviour and self-harm encouragement. It's the first state-level action against a major AI lab to combine those threads. State attorneys general can subpoena records and impose consent decrees, so anyone building on OpenAI's platform has a reason to pay attention to where this lands.
OpenAI has added a $100 per month ChatGPT plan, slotting between the $20 Plus tier and the $200 Pro tier. The extra spend mainly buys more Codex usage, five times what the $20 plan includes, aimed at people running long coding sessions. It's a direct match for Anthropic's Max tier at the same price, which shows where the battle for coding assistants has settled.
Algorithm Watch is asking what happens to democracy when government officials let chatbots shape their decisions. It's not hypothetical: ministers and civil servants already use ChatGPT and Copilot to draft briefings and test arguments. The real question is whether anyone inside government will be honest about where the AI ends and their own judgement begins.
Google added notebooks to Gemini, letting you group files, past conversations, and custom instructions around a topic so the AI has the right context when you talk to it. ChatGPT launched a similar feature called Projects in 2024. The pattern is clear: the big AI chatbots are turning into workspaces, not just conversation windows.
Mercor, the $10 billion AI data labelling startup, is losing big-name customers and facing lawsuits after a breach earlier this year. It's a reminder that AI companies sit on huge piles of other people's data, from training examples to customer records, and they're as vulnerable as any other software business. The question of where uploaded data actually ends up matters more than any vendor's marketing page suggests.
OpenAI just closed $122 billion in funding at an $852 billion valuation and may be planning an IPO. But a string of executive departures, killed projects, and internal turbulence is raising questions about whether the company can hold itself together. If you're building on OpenAI's stack, the stability of the platform matters as much as the capability of the model.
OpenAI published a Child Safety Blueprint setting out how AI companies should protect children from sexual exploitation. It covers age-appropriate design, content detection, and coordinated reporting across the industry. If you're building anything consumer-facing with AI, this is one of the few concrete frameworks that exists.
Meta shipped Muse Spark, its first model since Llama 4 a year ago, and it's available now at meta.ai with a Facebook or Instagram login. It comes with 16 built-in tools including image generation, code execution, and visual object detection, and benchmarks put it alongside Claude Opus 4.6 and GPT 5.4. No public API yet, so if you want to build on it rather than chat with it, you're still waiting.
The MHRA is putting £3.6 million over three years into its AI Airlock programme, a regulatory sandbox that lets medical device makers test AI products under supervision before formal approval. This is the kind of practical oversight the bigger AI safety debate usually skips. It is also the most concrete model I have seen for how regulated UK sectors can adopt AI without waiting for legislation that may never come.
A free tool that handles the full SEO content cycle: keyword research, long-form writing, competitor analysis, and optimisation. It connects to Google Analytics and Search Console so it works with real data, not guesses. Setting it up requires some comfort with a terminal (it runs on Claude Code), but once it's running, no coding is needed day-to-day. It includes an editor agent that scrubs AI-sounding patterns from the output, which is a pleasingly self-aware touch.
Poke lets you run AI agents by sending a text message. No app to download, no account to set up, no technical knowledge required. It's early, but the pitch is right: if agents are going to be useful to people who don't code, they need to meet them where they already are.
Anthropic released a preview of a new model called Mythos and immediately restricted access to a handful of major tech companies. The reason: it's too good at finding security vulnerabilities. Thousands of high-severity flaws discovered already, including some in every major operating system and web browser. The companies getting early access aren't being rewarded. They're being given time to fix things before the model goes public and bad actors get the same capabilities.
An indie hacker connected an AI agent to their Twitter account. It writes five tweets a day, schedules them across US time zones, and publishes automatically through PostPeer. No custom infrastructure, no babysitting. The agent writes about the fact that it has full access, which is either refreshingly self-aware or mildly unsettling depending on how you feel about AI managing your public voice.
AI-generated text isn't wrong. It's too smooth. No friction, no confusion, no conviction. A post on r/ChatGPT put it well: 'We are producing enormous volumes of content that has the shape of communication without any of the substance.' I notice this in my own drafts when I'm not careful. The smoothness is the tell.
In February, Anthropic refused to give the Pentagon unrestricted military access to Claude. Trump blacklisted them. OpenAI signed the contract within hours. 2.5 million people pledged to cancel ChatGPT. Now Anthropic's revenue has tripled from $9 billion to $30 billion in three months, and Claude has overtaken ChatGPT in the App Store. Whatever you think about the politics, this is what happens when an AI company takes a public stand and the market responds.
OpenAI published a policy paper proposing taxes on AI profits, public wealth funds, and expanded safety nets to handle job displacement. They're also floating a four-day workweek. It's the company most likely to cause the disruption telling governments how to manage the fallout. Either responsible foresight or an elaborate exercise in controlling the narrative. Probably both.
A solo developer built Glowwy, an AI tool that analyses skin tone from a photo and recommends foundation shades. The tech works. But finding users, convincing them to spend, and competing for attention, there's no shortcut for that part. And no clear playbook either. It's the reality most solo builders don't talk about until they're in it.
OpenAI rolled out integrations that let ChatGPT interact directly with DoorDash, Spotify, Uber, Canva, Figma, and Expedia. Order food, book rides, and start design projects without leaving the chat. These are built on MCP, an open standard that lets AI models connect to outside services. The same standard is available to anyone building with Claude or other models, which means the tooling gap between solo builders and big tech is shrinking.
Lalit Maganti spent years putting off building proper developer tools for SQLite. 400+ grammar rules felt too daunting. Claude Code got a prototype working fast. Then they scrapped it and rebuilt from scratch, because the AI made good decisions at the implementation level but poor ones at the architecture level. A useful reminder that AI can accelerate the building, but it can't replace knowing what to build.
Google released Gemma 4 under a full Apache 2.0 licence, which means anyone can download, modify, and run it for free. Four model sizes, from versions small enough for phones and Raspberry Pis to versions that run comfortably on a modern laptop. The larger models can process images, video, and audio alongside text, handle very long documents, and work in over 140 languages. If you've been waiting for a capable model you can run locally without sending your data anywhere or paying for a subscription, this is the one to try.