David Johnstone
arXiv.org on Hacker News
Recent submissions with ten or more points
Each entry lists points, title, comment count, and submission date.
228
LoRA vs. Full Fine-Tuning: An Illusion of Equivalence
52
8 Nov
19
Physics-informed Shadowgraph Network: End-to-end Density Field Reconstruction
1
7 Nov
156
Evaluating the world model implicit in a generative model
44
7 Nov
23
WebRL: Training LLM Web Agents via Self-Evolving Online Reinforcement Learning
1
5 Nov
111
Designing a Home Radio Telescope for 21 Cm Emission
25
4 Nov
257
An embarrassingly simple approach to recover unlearned knowledge for LLMs
121
4 Nov
11
Creating Interactive and Embedded Physics Simulations from Static Textbooks
0
3 Nov
124
Spann: Highly-Efficient Billion-Scale Approximate Nearest Neighbor Search (2021)
33
2 Nov
62
Ring-Based Mid-Air Gesture Typing System Using Deep Learning Word Prediction
34
2 Nov
54
Consistently faster and smaller compressed bitmaps with Roaring (2016)
7
1 Nov
174
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
32
1 Nov
10
DAWN: Designing Distributed Agents in a Worldwide Network
3
1 Nov
371
Chain-of-thought can hurt performance on tasks where thinking makes humans worse
250
30 Oct
85
Representing web applications as knowledge graphs
9
30 Oct
137
LLMs know more than they show: On the intrinsic representation of hallucinations
140
30 Oct
100
Crux, a Precise Verifier for Rust and Other Languages
13
28 Oct
85
Identifying factors contributing to "bad days" for software developers
62
26 Oct
86
State-space models can learn in-context by gradient descent
58
26 Oct
203
Universal optimality of Dijkstra via beyond-worst-case heaps
47
25 Oct
43
Dynamic Models of Gentrification
32
24 Oct
44
Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
15
23 Oct
17
Building a simple oscillator based Ising machine for research and education
7
22 Oct
157
Guide to Fine-Tuning LLMs
16
22 Oct
39
Transformers Utilization in Chart Understanding: A Review of Advances and Future
2
22 Oct
20
Machine Learning to Computational Plasma Physics Reduced-Order Plasma Modeling
1
21 Oct
27
Route Planning in Transportation Networks (2015)
3
20 Oct
313
QUIC is not quick enough over fast internet
280
19 Oct
66
Breaking Bad: How Compilers Break Constant-Time Implementations
60
19 Oct
13
Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
0
18 Oct
66
Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
49
18 Oct
48
LLMD: A Large Language Model for Interpreting Longitudinal Medical Records
19
18 Oct
295
Why do random forests work? They are self-regularizing adaptive smoothers
41
17 Oct
19
Sample what you can't compress; image auto-encoders without GANs
4
17 Oct
14
Running LLMs with 3.3M Context Tokens on a Single GPU
1
15 Oct
65
Meissonic, High-Resolution Text-to-Image Synthesis on consumer graphics cards
4
14 Oct
186
DeepSeek: Advancing theorem proving in LLMs through large-scale synthetic data
54
14 Oct
93
A novel channel contention mechanism for improving wi-fi's reliability
43
13 Oct
81
Gödel Agent: A self-referential agent framework for recursive self-improvement
29
13 Oct
109
Machine learning and information theory concepts towards an AI Mathematician
19
12 Oct
67
Skip Hash: A fast ordered map via software transactional memory
4
11 Oct
89
Grokking at the edge of linear separability
26
11 Oct
18
The Role of Anchor Tokens in Self-Attention Networks
5
11 Oct
282
Understanding the Limitations of Mathematical Reasoning in LLMs
266
11 Oct
97
ARIA: An Open Multimodal Native Mixture-of-Experts Model
21
11 Oct
334
Addition is all you need for energy-efficient language models
126
9 Oct
562
Differential Transformer
177
8 Oct
22
Generated Checklists Improve LLM Evaluation and Generation
1
7 Oct
57
Sorbet: A neuromorphic hardware-compatible transformer-based spiking model
52
7 Oct
45
Decoding the Language of Othering by Russia-Ukraine War Bloggers
112
5 Oct
520
Were RNNs all we needed?
260
3 Oct