Arxiv Papers
Igor Melnyk
Running out of time to catch up with new arXiv papers? We take the most impactful papers and present them as convenient podcasts. If you're a visual learner, we offer these papers in an engaging video format. Our service fills the gap between overly brief paper summaries and time-consuming full paper reads. You gain academic insights in a time-efficient, digestible format. Code behind this work: https://github.com/imelnyk/ArxivPapers
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers

Episodes
The paper presents TRELAWNEY, a method for rearranging training data to improve causal language models' performance in planning and reasoning without altering architecture, enhancing goal generation capabilities. https://arxiv.org/abs/2504.11336
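
As an illustration of the data-rearrangement idea: splice a future "goal" segment forward in the training sequence behind delimiter tokens, so an unmodified next-token objective learns to state where it is going before producing the steps. The <goal> delimiters and splice position below are assumptions; the summary does not give TRELAWNEY's exact format.

# Sketch of goal-conditioned data rearrangement (format assumed,
# not taken from the paper).
def rearrange_with_goal(steps, goal_index, insert_after=0):
    """Copy a future segment forward as an explicit goal annotation."""
    goal = f"<goal>{steps[goal_index]}</goal>"
    out = steps[:insert_after + 1] + [goal] + steps[insert_after + 1:]
    return " ".join(out)

doc = ["Problem: travel from A to D.",
       "Step 1: A->B.",
       "Step 2: B->C.",
       "Step 3: C->D."]
# Surface the final step early so the model learns to commit to a goal first.
print(rearrange_with_goal(doc, goal_index=3, insert_after=0))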
 
The paper introduces DATADECIDE, a suite for evaluating data selection methods, revealing that small-scale model rankings effectively predict larger model performance, enhancing cost-efficient pretraining decisions. https://arxiv.org/abs/2504.11393
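
As an illustration of the kind of question the suite answers: how often does the ranking of data recipes under a small model pick the same winner as under a large one? The recipe names and scores below are invented.

from itertools import combinations

# Invented scores: benchmark accuracy of a small and a large model
# pretrained on each candidate data recipe.
small = {"recipe_a": 0.41, "recipe_b": 0.38, "recipe_c": 0.45}
large = {"recipe_a": 0.58, "recipe_b": 0.52, "recipe_c": 0.61}

def decision_accuracy(small, large):
    """Fraction of recipe pairs where the small-scale ranking
    predicts the large-scale winner."""
    pairs = list(combinations(small, 2))
    hits = sum((small[a] > small[b]) == (large[a] > large[b])
               for a, b in pairs)
    return hits / len(pairs)

print(f"pairwise decision accuracy: {decision_accuracy(small, large):.2f}")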
 
This study evaluates OpenAI's GPT-4o, revealing limitations in semantic synthesis, instruction adherence, and reasoning, challenging assumptions about its multimodal capabilities and calling for improved benchmarks and training strategies. https://arxiv.org/abs/2504.08003
 
This paper introduces a distribution-level curriculum learning framework for RL-based post-training of LLMs, enhancing reasoning capabilities by adaptively scheduling training across diverse data distributions. https://arxiv.org/abs/2504.09710
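
The summary does not give the paper's scheduling rule; a generic sketch of adaptive distribution-level scheduling is to sample the distribution with the largest recent learning progress (change in reward), with some exploration:

import random
from collections import deque

class CurriculumSampler:
    """Toy adaptive scheduler over data distributions (generic heuristic,
    not the paper's exact rule)."""
    def __init__(self, distributions, window=50, eps=0.1):
        self.distributions = distributions
        self.rewards = {d: deque(maxlen=window) for d in distributions}
        self.eps = eps

    def progress(self, d):
        r = list(self.rewards[d])
        if len(r) < 2:
            return float("inf")      # try unseen distributions first
        half = len(r) // 2
        return abs(sum(r[half:]) / (len(r) - half) - sum(r[:half]) / half)

    def sample(self):
        if random.random() < self.eps:
            return random.choice(self.distributions)
        return max(self.distributions, key=self.progress)

    def update(self, d, reward):
        self.rewards[d].append(reward)

sampler = CurriculumSampler(["easy_math", "hard_math", "code"])
for _ in range(200):
    d = sampler.sample()
    sampler.update(d, random.random())   # stand-in for an RL episode reward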
 
This study explores sparse autoencoders in vision models, revealing unique processing patterns and enhancing steerability, leading to improved performance in vision disentanglement tasks and defense strategies. https://arxiv.org/abs/2504.08729
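
For orientation, a minimal sparse autoencoder of the kind commonly trained on network activations; the architecture below is generic, not the paper's configuration:

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder with an L1 sparsity penalty."""
    def __init__(self, d_model, d_hidden, l1_coef=1e-3):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)
        self.l1_coef = l1_coef

    def forward(self, x):
        z = torch.relu(self.encoder(x))        # sparse feature activations
        x_hat = self.decoder(z)
        loss = ((x_hat - x) ** 2).mean() + self.l1_coef * z.abs().mean()
        return x_hat, z, loss

sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
acts = torch.randn(32, 768)                    # stand-in vision activations
_, features, loss = sae(acts)
loss.backward()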
 
Genius is an unsupervised self-training framework that enhances LLM reasoning without external supervision, using stepwise foresight re-sampling and advantage-calibrated optimization to improve performance. https://arxiv.org/abs/2504.08672
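
A toy sketch of the stepwise foresight re-sampling loop: propose candidate next steps, score each by short lookahead rollouts, keep the best. llm_sample and value are hypothetical stubs, and the advantage-calibrated optimization stage is omitted.

import random

def llm_sample(context, n):
    return [f"{context} step<{random.randint(0, 99)}>" for _ in range(n)]

def value(trajectory):
    return random.random()          # stand-in for a self-estimated score

def foresight_step(context, n_candidates=4, n_rollouts=2):
    best, best_score = None, float("-inf")
    for cand in llm_sample(context, n_candidates):
        rollouts = llm_sample(cand, n_rollouts)          # look ahead
        score = sum(value(r) for r in rollouts) / n_rollouts
        if score > best_score:
            best, best_score = cand, score
    return best

ctx = "Q: 17 * 24 = ?"
for _ in range(3):
    ctx = foresight_step(ctx)
print(ctx)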
 
The study reveals that language models develop self-correcting abilities during pre-training, enhancing their problem-solving skills, as demonstrated by the OLMo-2-7B model's performance on self-reflection tasks. https://arxiv.org/abs/2504.04022
 
DISCIPL enables language models to generate task-specific inference programs, improving reasoning efficiency and verifiability, and outperforming larger models on constrained generation tasks without requiring finetuning. https://arxiv.org/abs/2504.07081
 
The study reveals that reasoning LLMs struggle with ill-posed questions, leading to excessive, ineffective responses, while non-reasoning LLMs perform better, highlighting flaws in current training methods. https://arxiv.org/abs/2504.06514
 
The proposed Decoupled Diffusion Transformer (DDT) improves generation quality and inference speed by decoupling semantic encoding from high-frequency decoding, achieving state-of-the-art performance on ImageNet with faster training convergence. https://arxiv.org/abs/2504.05741
 
Dynamic Cheatsheet (DC) enhances language models with persistent memory, improving performance on various tasks by enabling test-time learning and efficient reuse of problem-solving insights without altering model parameters. https://arxiv.org/abs/2504.07952
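
A toy sketch of the test-time memory loop: answer each task with the accumulated cheatsheet in the prompt, then append whatever insight the model reports. llm and the ANSWER/INSIGHT format are hypothetical stubs; DC's actual prompting and memory curation are richer.

def llm(prompt):
    # Hypothetical stub standing in for a model call.
    return "ANSWER: 42\nINSIGHT: cast the problem as modular arithmetic"

def solve_with_cheatsheet(tasks):
    memory = ""                                   # persists across tasks
    for task in tasks:
        out = llm(f"Cheatsheet:\n{memory}\nTask: {task}")
        answer, _, insight = out.partition("INSIGHT:")
        if insight.strip():                       # reuse what worked
            memory += f"- {insight.strip()}\n"
        yield task, answer.strip()

for task, ans in solve_with_cheatsheet(["puzzle 1", "puzzle 2"]):
    print(task, "->", ans)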
 
This study compares late-fusion and early-fusion multimodal models, finding early-fusion more efficient and effective, especially when enhanced with Mixture of Experts for modality-specific learning. https://arxiv.org/abs/2504.07951
 
OLMOTRACE is a real-time system that traces language model outputs to their training data, enabling users to explore fact-checking, hallucination, and creativity in language models. https://arxiv.org/abs/2504.07096
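
A naive sketch of the core operation, verbatim span matching: find the longest word spans of a model output that occur exactly in training text. The real system searches multi-trillion-token corpora with an efficient index rather than substring scans.

def trace_spans(output, corpus, min_len=3):
    """Greedily find maximal word spans of `output` occurring in `corpus`."""
    words = output.split()
    hits, i = [], 0
    while i < len(words):
        matched = 0
        for j in range(len(words), i + min_len - 1, -1):   # longest first
            if " ".join(words[i:j]) in corpus:
                hits.append(" ".join(words[i:j]))
                matched = j - i
                break
        i += matched or 1
    return hits

corpus = "the quick brown fox jumps over the lazy dog"
print(trace_spans("a quick brown fox jumps today", corpus))
# -> ['quick brown fox jumps']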
 
This study critiques current mathematical reasoning benchmarks for language models, highlighting sensitivity to implementation choices and proposing a standardized evaluation framework to improve transparency and reproducibility. https://arxiv.org/abs/2504.07086
 
This paper presents an efficient training method for ultra-long context LLMs, extending context lengths to 4M tokens while maintaining performance on both long and short context tasks. https://arxiv.org/abs/2504.06214
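
The summary does not detail the training recipe; one standard ingredient when extending context length is rescaling rotary position embeddings, e.g. simple position interpolation, sketched below under that assumption:

import numpy as np

def rope_angles(positions, dim, base=10000.0, orig_ctx=8192, new_ctx=4_000_000):
    """RoPE angles with linear position interpolation: positions are
    compressed by orig_ctx/new_ctx so long sequences stay inside the
    rotary range seen during pretraining."""
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
    scaled = positions * (orig_ctx / new_ctx)
    return np.outer(scaled, inv_freq)          # (seq, dim/2)

angles = rope_angles(np.arange(0, 4_000_000, 500_000), dim=128)
print(angles.shape)                            # (8, 64)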
 
This paper presents Hogwild! Inference, a parallel LLM inference engine enabling LLMs to collaborate effectively using a shared attention cache, enhancing reasoning and efficiency without fine-tuning. https://arxiv.org/abs/2504.06261
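
A toy sequential sketch of the collaboration pattern: workers take turns extending one shared transcript, so every step conditions on everything written so far. The actual engine runs instances concurrently over a shared attention (KV) cache; llm_step is a hypothetical stub.

def llm_step(worker_id, shared):
    # Hypothetical stub: a real worker would generate tokens conditioned
    # on the full shared context.
    return f"[worker {worker_id}: thought {len(shared)}]"

shared = ["Problem: plan a three-course menu."]
for _ in range(3):
    for worker_id in (1, 2):
        shared.append(llm_step(worker_id, shared))
print("\n".join(shared))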
 
The study explores how generative AI models learn personal information from first-person camera data, revealing both accurate insights and hallucinations about the wearer's life. https://arxiv.org/abs/2504.03857
 
The paper introduces "dormant attention heads" in multi-head attention, analyzing their impact on model performance and revealing their early emergence and dependency on input text characteristics. https://arxiv.org/abs//2504.03889 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://pod…
  continue reading
 
The paper introduces "dormant attention heads" in multi-head attention, analyzing their impact on model performance and revealing their early emergence and dependency on input text characteristics. https://arxiv.org/abs//2504.03889 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://pod…
  continue reading
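
One measurable signature of a dormant head is attention mass parked on the first token (the "attention sink"); the sketch below flags such heads, though the paper's precise criterion and threshold are assumptions here.

import numpy as np

def dormant_heads(attn, threshold=0.9):
    """attn: (heads, queries, keys) attention weights; rows sum to 1.
    Returns indices of heads whose mean first-token mass exceeds threshold."""
    first_token_mass = attn[:, :, 0].mean(axis=1)
    return np.where(first_token_mass > threshold)[0]

rng = np.random.default_rng(0)
attn = rng.dirichlet(np.ones(16), size=(8, 32))   # 8 heads, 32 queries
attn[3] = 0.0
attn[3, :, 0] = 1.0                               # make head 3 a pure sink
print(dormant_heads(attn))                        # -> [3]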
 
Nemotron-H models enhance inference efficiency by replacing self-attention layers with Mamba layers, achieving comparable accuracy to state-of-the-art models while being significantly faster and requiring less memory. https://arxiv.org/abs/2504.03624
 
The paper introduces KnowSelf, a novel approach for LLM-based agents that enhances decision-making through knowledgeable self-awareness, improving planning efficiency while minimizing external knowledge reliance. https://arxiv.org/abs/2504.03553
 
This paper explores improving reward modeling and inference-time scalability in large language models using pointwise generative reward modeling and Self-Principled Critique Tuning, achieving enhanced performance and quality. https://arxiv.org/abs/2504.02495
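
A toy sketch of the inference-time scaling loop for a pointwise generative reward model: sample several principle-plus-critique generations, parse a numeric score from each, and aggregate. generate_critique and the score format are hypothetical stubs.

import random
import re

def generate_critique(query, response):
    # Hypothetical stub for one sampled principle + critique generation.
    return f"Principle: be factual. Critique: ... Score: {random.randint(1, 10)}"

def scaled_reward(query, response, n_samples=8):
    scores = []
    for _ in range(n_samples):
        m = re.search(r"Score:\s*(\d+)", generate_critique(query, response))
        if m:
            scores.append(int(m.group(1)))
    return sum(scores) / len(scores)       # voting over scores also works

print(scaled_reward("query", "response"))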
 
The paper introduces Multi-Token Attention (MTA), enhancing LLMs' attention mechanisms by using multiple query and key vectors, improving performance on language modeling and long-context tasks. https://arxiv.org/abs/2504.00927
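
A simplified sketch of the idea: mix pre-softmax attention logits across neighboring queries and keys with a depthwise 2D convolution, masking before and after so no future key leaks in. Kernel size and masking details are simplified assumptions, and head mixing is omitted.

import torch
import torch.nn.functional as F

def multi_token_attention(scores, kernel):
    """scores: (batch, heads, Tq, Tk) pre-softmax logits.
    kernel: (heads, 1, kq, kk) depthwise convolution weights."""
    b, h, tq, tk = scores.shape
    causal = torch.tril(torch.ones(tq, tk, dtype=torch.bool))
    scores = scores.masked_fill(~causal, 0.0)          # hide future keys
    mixed = F.conv2d(scores, kernel, padding="same", groups=h)
    mixed = mixed.masked_fill(~causal, float("-inf"))  # re-mask after conv
    return mixed.softmax(dim=-1)

scores = torch.randn(1, 4, 10, 10)
kernel = torch.randn(4, 1, 3, 3) / 9.0
attn = multi_token_attention(scores, kernel)
print(attn.shape)                                      # (1, 4, 10, 10)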
 
The paper introduces Visual Jenga, a scene understanding task that explores object removal while maintaining scene coherence, using a data-driven approach to analyze structural dependencies in images. https://arxiv.org/abs/2503.21770
 
Wan is an open suite of video foundation models that enhances video generation through innovations, offering leading performance, efficiency, and versatility across multiple applications, while promoting community growth. https://arxiv.org/abs/2503.20314
 