Josiah Khor

https://josiahkhor.com/
1 part experiment, 1 part a more permanent and public replacement for keeping approximately 100 tabs open in my browser at all times.

Semantic ablation: Why AI writing is boring and dangerous
https://josiahkhor.com/2026-06-14-semantic-ablation-why-ai-writing-is-boring-and-dangerous
Sun, 14 Jun 2026 12:35:55 GMT

I don't think AI will make your processes go faster
https://josiahkhor.com/2026-06-14-i-dont-think-ai-will-make-your-processes-go-faster
Sun, 14 Jun 2026 12:28:51 GMT

This ... is what software developers have been begging for since the beginning of the profession: Receiving a detailed outline of the problem and what the end result should look like.

The Normalization of Deviance in AI
https://josiahkhor.com/2026-06-14-the-normalization-of-deviance-in-ai
Sun, 14 Jun 2026 12:15:32 GMT

> Such a drift does not happen through a single reckless decision. It happens through a series of “temporary” shortcuts that quietly become the new baseline. Because systems continue to work, teams stop questioning the shortcuts, and the deviation becomes invisible and the new norm.
>
> Especially under competitive pressure for automation, cost savings, a drive to be first, and the overall hype, this dangerous drift is evident. The incentives for speed and winning outweigh the incentives for foundational security. Over time, organizations forget why the guardrails existed in the first place.

This was written a while ago (December 2025 - 6 months ago), so the examples are a little out of date. One from the past month: Instagram accounts were "hacked" by asking Meta's chatbot ... for the account?
[More or less.](https://www.404media.co/hackers-simply-asked-meta-ai-to-give-them-access-to-high-profile-instagram-accounts-it-worked/)

Why the AI Renaissance Keeps Not Arriving
https://josiahkhor.com/2026-06-14-why-the-ai-renaissance-keeps-not-arriving
Sun, 14 Jun 2026 11:03:11 GMT

> Every brainstorm is the same brainstorm-shaped object and every essay has the same skeleton. Put simply, the outputs are locally excellent and globally identical.
>
> I call this manifold collapse. The model never explores the whole landscape of possible ideas. It circles a small, well-worn region of it, and inside that region it travels what I think of as latent grooves, a few deep and reliable paths it falls into no matter how you phrase the request.
>
> A system trained to satisfy consensus hands you the median of everything humanity has tried, instantly. That feels like genius exactly once.

If AI is lifting your quality and not just your efficiency, then you might have been below average before.

How building an HTML-first site doubled our users overnight
https://josiahkhor.com/2026-06-14-how-building-an-html-first-site-doubled-our-users-overnight
Sun, 14 Jun 2026 10:58:52 GMT

I have nothing against heavily client-side applications, in their place. But this is just a big form - it’s not showing real-time data. Our user could be standing in the middle of a field on a new-build housing estate, holding a decade-old commodity android phone they bought in Tesco.
Shipping them 20MB of javascript before we even render a form would be a ridiculous thing to do.

CEOs Who Think AI Replaces Their Employees Are Just Bad CEOs
https://josiahkhor.com/2026-06-14-ceos-who-think-ai-replaces-their-employees-are-just-bad-ceos
Sun, 14 Jun 2026 10:55:02 GMT

> Making things work is different than making things work well. Or well at scale. Or well at scale in a specific environment. Obviously, it depends on the kind of project and what it’s being designed to do, but oftentimes the reason a company has a bunch of employees is to fill in the seemingly small, but incredibly important details that CEOs might not ever get much visibility into: things like security or legal compliance or accessibility or who knows what else.
>
> Using an agentic tool to build something that works is all well and good, but building a product for the mass market to use — and use well, and use safely — involves much, much more. Agentic coding tools can sometimes help with that too, but the leap from “I built a thing” to “therefore anyone can build a thing” misses the entire point of why you hire knowledgeable, experienced people in the first place. It’s also why I think the best case of these tools is building totally personalized tools to assist you in accomplishing a specific task, and not for building mass market tools.

Pressure Without A Plan
https://josiahkhor.com/2026-06-14-pressure-without-a-plan
Sun, 14 Jun 2026 10:42:44 GMT

There was a lot of pressure to get things working, but no one knew what to do about it. It took almost a month to get it wholly functioning. It was not a pleasant month, with many false starts while we tried to dig out of launching an unfinished, desperate product...
many leaders inject that sort of pressure [without a plan] into their teams as a routine management technique.

Tarek Loubani on 3D Printing in Gaza
https://josiahkhor.com/2026-05-05-seize-the-means-of-production
Tue, 05 May 2026 00:55:44 GMT

We didn’t really care about a name, but we needed one to apply for a medical license and The High-Quality, Low-Cost Open Access Medical Device Project just didn’t seem like the thing to write on that application.

AI got the blame for the Iran school bombing. The truth is far more worrying
https://josiahkhor.com/2026-04-27-ai-got-the-blame-iran-school-bombing
Mon, 27 Apr 2026 00:55:44 GMT

Personnel became disinclined to ask whether some targets were potential allies, or not actually bad guys at all, because producing targets meant participating in the hunt. The targeting guide had warned about this too. “If targeteers don’t provide full targeting service,” it read, “then other well meaning but undertrained and ill-experienced groups will step in.”

Unless your users are locked into your system by higher powers with no alternatives allowed, if you don't give them enough of what they want, no matter how objectively bad or risky it is, they will just ignore your system and/or exploit loopholes.

The L in LLM Stands for Lying
https://josiahkhor.com/2026-03-30-the-l-in-llm
Mon, 30 Mar 2026 00:55:44 GMT

The salient difference here is whether an engineer has mostly spent their career solving problems created by other software, or solving problems people already had before there was any software at all.
Only the latter will teach you to think about the constraints a problem actually has, and the needs of the users who solve it, which are always far messier than a novice would think.

Why ATMs didn’t kill bank teller jobs, but the iPhone did
https://josiahkhor.com/2026-03-29-why-atms-didnt-kill-bank-teller-jobs
Sun, 29 Mar 2026 00:55:44 GMT

When a technology automates some of what a human does within an existing paradigm, even the vast majority of what a human does within it, it’s quite rare for it to actually get rid of the human, because the definition of the paradigm around human-shaped roles creates all sorts of bottlenecks and frictions that demand human involvement. It’s only when we see the construction of entirely new paradigms that the full power of a technology can be realized. The ATM substituted tasks; but the iPhone made them irrelevant.

Why Language Models Hallucinate
https://josiahkhor.com/2026-03-26-why-language-models-hallucinate
Thu, 26 Mar 2026 00:55:44 GMT

1.5 TB of RAM Across 4 Mac Studios
https://josiahkhor.com/2026-01-08-1pt5-tb-of-ram-across-4-mac-studios
Thu, 08 Jan 2026 00:55:44 GMT

Thunderbolt 5 switches don't exist, so you can't plug in multiple Macs to one central switch—you have to plug every Mac into every other Mac, which adds to the cabling mess.
Slight stretch to call the Unified Memory VRAM imo, but the point is that it's available for use in GPU compute tasks such as LLM inference so I get it.

Tahoe Icons are a Problem
https://josiahkhor.com/2026-01-07-tahoe-icons-are-a-problem
Wed, 07 Jan 2026 00:55:44 GMT

21 Lessons From 14 Years at Google
https://josiahkhor.com/2026-01-05-21-lessons-from-14-years-at-google
Mon, 05 Jan 2026 00:55:44 GMT

Karpathy's Year in Review 2025
https://josiahkhor.com/2026-01-01-karpathy-year-in-review
Thu, 01 Jan 2026 00:55:44 GMT

> Related to all this is my general apathy and loss of trust in benchmarks in 2025. The core issue is that benchmarks are almost by construction verifiable environments and are therefore immediately susceptible to RLVR and weaker forms of it via synthetic data generation. In the typical benchmaxxing process, teams in LLM labs inevitably construct environments adjacent to little pockets of the embedding space occupied by benchmarks and grow jaggies to cover them. Training on the test set is a new art form.
>
> What does it look like to crush all the benchmarks but still not get AGI?

Well yeah, old mate Alan would probably be surprised to see how the Turing test became "irrelevant" almost overnight.

Karpathy on Dwarkesh Podcast
https://josiahkhor.com/2025-10-21-karpathy-on-dwarkesh
Tue, 21 Oct 2025 00:55:44 GMT

> I’ve been in AI for almost two decades. It’s going to be 15 years or so, not that long. You had Richard Sutton here, who was around for much longer. I do have about 15 years of experience of people making predictions, of seeing how they turned out. Also I was in the industry for a while, I was in research, and I’ve worked in the industry for a while. I have a general intuition that I have left from that.
>
> I feel like the problems are tractable, they’re surmountable, but they’re still difficult. If I just average it out, it just feels like a decade to me.

Neural networks were definitely a thing around 2012 when I was midway through my undergraduate program, amongst a larger ensemble of "machine learning" (terminology which seems to be much less used today).

> Recently I went back all the way to 1989 which was a fun exercise for me, a few years ago, because I was reproducing Yann LeCun’s 1989 convolutional network, which was the first neural network I’m aware of trained via gradient descent, like modern neural network trained gradient descent on digit recognition. I was just interested in how I could modernize this. How much of this is algorithms? How much of this is data? How much of this progress is compute and systems? I was able to very quickly halve the learning just by time traveling by 33 years.
>
> So if I time travel by algorithms 33 years, I could adjust what Yann LeCun did in 1989, and I could halve the error. But to get further gains, I had to add a lot more data, I had to 10x the training set, and then I had to add more computational optimizations. I had to train for much longer with dropout and other regularization techniques.

Progress (towards AGI) doesn't feel non-linear to me. The improvements in algorithms, data, and compute are fantastic and the usefulness is greatly advanced but the spaces in which AI plays still largely feel the same to me.
Temperature doesn't equal creativity, and knowledge doesn't equal understanding.

A small number of samples can poison LLMs of any size
https://josiahkhor.com/2025-10-11-llm-poisoning
Sat, 11 Oct 2025 00:55:44 GMT

Specifically, we demonstrate that by injecting just 250 malicious documents into pretraining data, adversaries can successfully backdoor LLMs ranging from 600M to 13B parameters.

Thinking Machines: Defeating Nondeterminism in LLM Inference
https://josiahkhor.com/2025-09-11-nondeterminism-in-llm-inference
Thu, 11 Sep 2025 00:55:44 GMT

In other words, the primary reason nearly all LLM inference endpoints are nondeterministic is that the load (and thus batch-size) nondeterministically varies!

Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data
https://josiahkhor.com/2025-07-23-subliminal-learning
Wed, 23 Jul 2025 00:55:44 GMT

Tacos vs Burritos: Container Resourcing Optimisation Theory
https://josiahkhor.com/2024-07-19-tacos-vs-burritos
Fri, 19 Jul 2024 00:55:44 GMT