David Johnstone
arXiv.org on Hacker News
Recent submissions with ten or more points
11
Fine-Tuning LLMs: A Review of Technologies, Research, Best Practices, Challenges
1
Today
16
Transformers Utilization in Chart Understanding: A Review of Advances and Future
1
Today
16
Machine Learning to Computational Plasma Physics Reduced-Order Plasma Modeling
1
21 Oct
25
Route Planning in Transportation Networks (2015)
3
20 Oct
306
QUIC is not quick enough over fast internet
270
19 Oct
13
Agents Thinking Fast and Slow: A Talker-Reasoner Architecture
0
18 Oct
47
LLMD: A Large Language Model for Interpreting Longitudinal Medical Records
19
18 Oct
289
Why do random forests work? They are self-regularizing adaptive smoothers
40
17 Oct
19
Sample what you can't compress; image auto-encoders without GANs
4
17 Oct
14
Running LLMs with 3.3M Context Tokens on a Single GPU
1
15 Oct
65
Meissonic, High-Resolution Text-to-Image Synthesis on consumer graphics cards
4
14 Oct
184
DeepSeek: Advancing theorem proving in LLMs through large-scale synthetic data
53
14 Oct
93
A novel channel contention mechanism for improving wi-fi's reliability
43
13 Oct
81
Gödel Agent: A self-referential agent framework for recursive self-improvement
29
13 Oct
109
Machine learning and information theory concepts towards an AI Mathematician
19
12 Oct
67
Skip Hash: A fast ordered map via software transactional memory
4
11 Oct
89
Grokking at the edge of linear separability
26
11 Oct
18
The Role of Anchor Tokens in Self-Attention Networks
5
11 Oct
282
Understanding the Limitations of Mathematical Reasoning in LLMs
266
11 Oct
97
ARIA: An Open Multimodal Native Mixture-of-Experts Model
21
11 Oct
333
Addition is all you need for energy-efficient language models
126
9 Oct
562
Differential Transformer
177
8 Oct
22
Generated Checklists Improve LLM Evaluation and Generation
1
7 Oct
57
Sorbet: A neuromorphic hardware-compatible transformer-based spiking model
52
7 Oct
45
Decoding the Language of Othering by Russia-Ukraine War Bloggers
112
5 Oct
520
Were RNNs all we needed?
260
3 Oct
248
Serving 70B-scale LLMs efficiently on low-resource edge devices [pdf]
58
3 Oct
31
A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs
9
1 Oct
179
On the design of text editors (2020)
86
28 Sep
251
Collaborative text editing with Eg-Walker: Better, faster, smaller
31
27 Sep
124
LlamaF: An Efficient Llama2 Architecture Accelerator on Embedded FPGAs
29
27 Sep
159
Automatic Content Recognition Tracking in Smart TVs
155
26 Sep
43
The Impact of Element Ordering on LM Agent Performance
2
24 Sep
51
AI Companions Reduce Loneliness
81
21 Sep
42
Dissociating language and thought in large language models
4
21 Sep
10
SwiGLU activation function causes instability in FP8 LLM training
2
21 Sep
230
Training Language Models to Self-Correct via Reinforcement Learning
92
20 Sep
73
A rigid but foldable indoor airship aerial system for cave exploration
45
20 Sep
20
Late Chunking: Contextual Chunk Embeddings Using Long-Context Embedding Models
3
20 Sep
32
Some remarks on the mathematical structure of the multiverse (2016)
21
17 Sep
33
What Is Entropy?
5
17 Sep
261
Chain of Thought empowers transformers to solve inherently serial problems
184
17 Sep
291
LLMs Will Always Hallucinate, and We Need to Live with This
261
14 Sep
139
The Legend of Holy Sword: An Immersive Experience for Concentration Enhancement
66
13 Sep
221
Tutorial on diffusion models for imaging and vision
18
10 Sep
80
Pixhell Attack: Leaking Info from Air-Gap Computers via 'Singing Pixels'
30
10 Sep
38
ChartEye: A Deep Learning Framework for Chart Information Extraction
0
10 Sep
80
Deductive Verification for Chain-of-Thought Reasoning in LLMs
20
10 Sep