
Context Engineering
Context engineering is the system-level discipline of architecting the dynamic information environment for AI models. Unlike prompt engineering, which focuses on phrasing specific instructions, context engineering progra

Hosted by AI-Talk · 🇺🇸 US · EN · 68 episodes
Established thought leaders with verified media credentials.
AI Explained breaks down the world of AI in just 10 minutes. Get quick, clear insights into AI concepts and innovations, without any complicated math or jargon. Perfect for your commute or spare time, this podcast makes understanding AI easy, engaging, and fun—whether you're a beginner or tech enthusiast.
AI-Talk hosts Large Language Model (LLM) Talk, a technology show with 68 episodes published.

Manus AI is a general-purpose autonomous agent designed to function as a digital worker rather than a passive chatbot. Developed by Monica and acquired by Meta, it utilizes a Planner-Executor architecture to orchestrate

Kimi K2, developed by Moonshot AI, is an open agentic intelligence model built on a Mixture-of-Experts (MoE) architecture. It features 1 trillion total parameters, with 32 billion active during inference. Trained on 15.5

Mixture-of-Recursions (MoR) is a unified framework built on a Recursive Transformer architecture, designed to enhance the efficiency of large language models. It achieves this by combining three core paradigms: parameter

MeanFlow models introduce the concept of average velocity to fundamentally reformulate one-step generative modeling. Unlike Flow Matching, which focuses on instantaneous velocity, MeanFlow directly models the displacemen

Mamba is a novel deep learning architecture that achieves linear scaling in computation and memory with sequence length, addressing Transformers' quadratic limitations. Its selective State Space Model (SSM) layer dynamic

LLM alignment is the process of steering Large Language Models to operate in a manner consistent with intended human goals, preferences, and ethical principles. Its primary objective is to make LLMs helpful, honest, and

Lilian Weng's "Why We Think" examines how to improve language models by allocating more computation at test time, drawing an analogy to human "slow thinking" or System 2. By treating computation as a resource, the aim

Deep Research is an autonomous research agent built into ChatGPT. It performs multi-step online research over several minutes, behaving like a human researcher by searching, reading, analyzing, and synthesizing informati

vLLM is a high-throughput serving system for large language models. It addresses inefficient KV cache memory management in existing systems caused by fragmentation and lack of sharing, which limits batch size. vLLM uses

Qwen3 models introduce both Mixture-of-Experts (MoE) and dense architectures. They utilize hybrid thinking modes, allowing users to balance response speed and reasoning depth for tasks, controllable via parameters or tag

RAGEN is a modular system for training and evaluating LLM agents using multi-turn reinforcement learning. Built on the StarPO framework, it implements the full training loop including rollout generation, reward assignmen

DeepSeek-Prover-V2 is an open-source large language model designed for formal theorem proving in Lean 4. Its training relies heavily on synthetic data, generated by using DeepSeek-V3 to decompose problems into subgoals,

The DeepSeek-Prover project aims to advance large language model capabilities in formal theorem proving by addressing the scarcity of training data. It uses autoformalization to convert informal high school and undergrad

The Model Context Protocol (MCP), introduced by Anthropic in November 2024, is an open protocol standardizing how applications provide context to LLMs. Acting like a "USB-C port for AI applications," it provides a standa

LLM post-training is crucial for refining the reasoning abilities developed during pretraining. It employs fine-tuning on specific reasoning tasks, reinforcement learning to reward logical steps and coherent thought proc

Agent AI refers to interactive systems that perceive visual, language, and environmental data to produce meaningful embodied actions in physical and virtual worlds. It aims to create sophisticated and context-aware AI, p

FlashAttention-3 accelerates attention on NVIDIA Hopper GPUs through three key innovations. It achieves producer-consumer asynchrony by dividing warps into producer (data loading with TMA) and consumer (computation with

FlashAttention-2 builds upon FlashAttention to achieve faster attention computation with better GPU resource utilization. It enhances parallelism by also parallelizing along the sequence length dimension, optimizing work

FlashAttention is an IO-aware attention mechanism designed to be fast and memory-efficient, especially for long sequences. Its core innovation is tiling, where input sequences are divided into blocks processed within the

To pitch Large Language Model (LLM) Talk, visit https://podcasters.spotify.com/pod/show/jack1505 for contact information, then craft a tight one-paragraph hook that ties your expertise to a gap in their recent technology coverage.
Large Language Model (LLM) Talk is hosted by AI-Talk. The show sits in the technology category and has published 68 episodes.
Large Language Model (LLM) Talk is accessible for guests with genuine technology expertise. A personalised, episode-aware pitch will still outperform a generic one every time.
Large Language Model (LLM) Talk hasn't explicitly signalled guest openness in recent episodes. That doesn't rule out pitching; your hook just needs to be especially compelling and relevant to their recent content.
Episodes of Large Language Model (LLM) Talk average 14 minutes, a focused format where a clear narrative arc and tight preparation matter most.
Our data rates Large Language Model (LLM) Talk's guest bar at 80/100 (Premium tier). Established thought leaders with verified media credentials.
Methodology. Booking Probability™ blends Listen Score, 30-day Virality, open-to-guests detection, and Apple ratings. Data refreshed every 60 minutes. Listen Score and Booking Probability are calculated by PitchCentric. Last enriched 10 days ago.