David Johnstone
arXiv.org on Hacker News
Recent submissions with ten or more points
82
Identifying factors contributing to "bad days" for software developers
61
26 Oct
75
State-space models can learn in-context by gradient descent
16
26 Oct
201
Universal optimality of Dijkstra via beyond-worst-case heaps
47
25 Oct
43
Dynamic Models of Gentrification
32
24 Oct
43
Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces
15
23 Oct
17
Building a simple oscillator based Ising machine for research and education
7
22 Oct
157
Guide to Fine-Tuning LLMs
16
22 Oct
39
Transformers Utilization in Chart Understanding: A Review of Advances and Future
2
22 Oct
20
Machine Learning to Computational Plasma Physics Reduced-Order Plasma Modeling
1
21 Oct
27
Route Planning in Transportation Networks (2015)
3
20 Oct
313
QUIC is not quick enough over fast internet
280
19 Oct
66
Breaking Bad: How Compilers Break Constant-Time Implementations
60
19 Oct
13
Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
0
18 Oct
66
Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
49
18 Oct
48
LLMD: A Large Language Model for Interpreting Longitudinal Medical Records
19
18 Oct
295
Why do random forests work? They are self-regularizing adaptive smoothers
41
17 Oct
19
Sample what you can't compress; image auto-encoders without GANs
4
17 Oct
14
Running LLMs with 3.3M Context Tokens on a Single GPU
1
15 Oct
65
Meissonic, High-Resolution Text-to-Image Synthesis on consumer graphics cards
4
14 Oct
186
DeepSeek: Advancing theorem proving in LLMs through large-scale synthetic data
54
14 Oct
93
A novel channel contention mechanism for improving wi-fi's reliability
43
13 Oct
81
Gödel Agent: A self-referential agent framework for recursive self-improvement
29
13 Oct
109
Machine learning and information theory concepts towards an AI Mathematician
19
12 Oct
67
Skip Hash: A fast ordered map via software transactional memory
4
11 Oct
89
Grokking at the edge of linear separability
26
11 Oct
18
The Role of Anchor Tokens in Self-Attention Networks
5
11 Oct
282
Understanding the Limitations of Mathematical Reasoning in LLMs
266
11 Oct
97
ARIA: An Open Multimodal Native Mixture-of-Experts Model
21
11 Oct
334
Addition is all you need for energy-efficient language models
126
9 Oct
562
Differential Transformer
177
8 Oct
22
Generated Checklists Improve LLM Evaluation and Generation
1
7 Oct
57
Sorbet: A neuromorphic hardware-compatible transformer-based spiking model
52
7 Oct
45
Decoding the Language of Othering by Russia-Ukraine War Bloggers
112
5 Oct
520
Were RNNs all we needed?
260
3 Oct
248
Serving 70B-scale LLMs efficiently on low-resource edge devices [pdf]
58
3 Oct
31
A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
9
1 Oct
179
On the design of text editors (2020)
86
28 Sep
251
Collaborative text editing with Eg-Walker: Better, faster, smaller
31
27 Sep
124
LlamaF: An Efficient Llama2 Architecture Accelerator on Embedded FPGAs
29
27 Sep
159
Automatic Content Recognition Tracking in Smart TVs
155
26 Sep
43
The Impact of Element Ordering on LM Agent Performance
2
24 Sep
51
AI Companions Reduce Loneliness
81
21 Sep
42
Dissociating language and thought in large language models
4
21 Sep
10
SwiGLU activation function causes instability in FP8 LLM training
2
21 Sep
230
Training Language Models to Self-Correct via Reinforcement Learning
92
20 Sep
73
A rigid but foldable indoor airship aerial system for cave exploration
45
20 Sep
20
Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models
3
20 Sep
32
Some remarks on the mathematical structure of the multiverse (2016)
21
17 Sep
33
What Is Entropy?
5
17 Sep
Powered by hn.algolia.com