David Johnstone
arXiv.org on Hacker News
Recent submissions with ten or more points
Each entry lists, in order: points, title, comment count, and submission date.
86
Porting HPC Applications to AMD Instinct MI300A Using Unified Memory and OpenMP
27
4 May
130
The Matrix: A Bayesian learning model for LLMs
10
4 May
37
CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions
1
4 May
76
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
0
3 May
298
Better and Faster Large Language Models via Multi-Token Prediction
126
1 May
19
Iterative reasoning preference optimization
4
1 May
76
Building a Large Japanese Web Corpus for Large Language Models
29
30 Apr
46
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Models
4
30 Apr
33
RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation
3
30 Apr
181
LoRA+: Efficient Low Rank Adaptation of Large Models
46
28 Apr
15
Step Differences in Instructional Video
0
28 Apr
159
Let's Think Dot by Dot: Hidden Computation in Transformer Language Models
32
27 Apr
73
Relational Graph Convolutional Networks for Sentiment Analysis
3
26 Apr
78
One bad apple can spoil your IPv6 privacy (2022)
59
26 Apr
48
CatLIP: Clip Vision Accuracy with 2.7x Faster Pre-Training on Web-Scale Data
4
25 Apr
100
Quaternion Knowledge Graph Embeddings (2019)
41
25 Apr
135
Removing Reflections from RAW Photos
30
24 Apr
126
Claude 3 beats Google Translate
118
23 Apr
411
Phi-3 Technical Report
129
23 Apr
128
FPGA Architecture for Deep Learning: Survey and Future Directions
52
22 Apr
77
Survey Study on AI Agent Architectures (2024)
16
22 Apr
62
Many-Shot In-Context Learning
1
22 Apr
47
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
3
22 Apr
136
Lossless Acceleration of LLM via Adaptive N-Gram Parallel Decoding
23
21 Apr
45
Eight Transaction Papers by Jim Gray
9
19 Apr
124
Chinchilla Scaling: A replication attempt
68
18 Apr
92
Collapse of self-trained language models
30
17 Apr
88
The Ballmer Peak: An Empirical Search
24
17 Apr
168
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
28
16 Apr
124
ResearchAgent: Iterative Research Idea Generation Using LLMs
63
16 Apr
38
We have no idea how models will behave in production until production
3
15 Apr
29
ChatGPT Can Predict the Future Telling Stories Set in the Future About the Past
8
14 Apr
13
Mechanics of Next Token Prediction with Self-Attention
0
13 Apr
119
Your LLM Is a Capable Regressor When Given In-Context Examples
36
13 Apr
59
Fine-Tuning Increases LLM Vulnerabilities and Risk
33
11 Apr
14
Autonomous LLM agents with human-out-of-loop
8
11 Apr
39
Leave No Context Behind: Efficient Infinite Context Transformers
4
11 Apr
24
Toward Inference-Optimal Mixture-of-Expert Large Language Models
0
10 Apr
16
A Survey on Red Teaming for Generative Models
0
10 Apr
71
Evaluating faithfulness and content selection of LLMs in book-length summaries
7
9 Apr
21
AI consciousness is inevitable: A theoretical computer science perspective
54
9 Apr
104
Social Skill Training with Large Language Models
100
9 Apr
53
Apple Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
7
9 Apr
52
Direct Nash Optimization: Teaching language models to self-improve
11
8 Apr
71
Nightfall: Can Kalgash Exist (2014)
10
8 Apr
281
Mixture-of-Depths: Dynamically allocating compute in transformers
83
7 Apr
29
Rendering string diagrams recursively [pdf]
4
7 Apr
54
Sophia: Scalable Stochastic 2nd-Order Optimizer for Language Model Pre-Training
2
7 Apr
288
More Agents Is All You Need: LLMs performance scales with the number of agents
206
6 Apr
26
Long-form factuality in large language models
16
6 Apr
Powered by hn.algolia.com