Grab a ☕ & get comfy! Nathan Benaich released a whopper: the 163-page State of AI Report 2023. Great review of a crazy year in AI. What resonated with me:
• GPT-4 is the uncontested, most generally capable model. It solved tasks GPT-3.5 could not, like the Uniform Bar Exam (around the 90th percentile vs the 10th). p.12
• Economic stakes high! OpenAI & Google technical reports on latest models don't disclose information useful for researchers. p.16
• Llama 2 70B by Meta competitive with ChatGPT on most tasks. Licensed for commercial use, downloaded 32M times! p.19
• Although context windows (the text input length) of models are growing, there's a "Lost in the Middle" performance problem: models make poorer use of information placed in the middle of a long input. p.24
• Innovations including FlashAttention & 4-bit quantization tackle challenges like memory footprint & inference speed. p.25
• Microsoft researchers find small language models (SLMs) trained on very specialized, curated datasets rival models 50x larger on specific tasks, e.g., phi-1.5. p.26
• Research under way on embedding digital signatures in model output to help distinguish real content from fake. p.30
• LLMs are great prompt engineers, outperforming human-designed prompts. p.35
• Continuously monitor performance of GPT models: they are updated frequently, & performance varies between versions. p.36
• Google's Med-PaLM 2 sets new SOTA results on medical benchmarks. p.63
• More than 70% of the most-cited AI papers in the last 3 years have authors from US organizations. p.68
• NVIDIA joins $1T market cap club. p.70
• Not just top hyperscalers buying GPUs. Startup infra provider Lambda spending 9-figure $ sums, with over 45,000 GPUs installed (p.71). Other companies buying vast quantities include Cohere, Inflection, & Imbue (p.72).
• Compute is the new oil, even in Saudi Arabia! One university buys 3,000 cards for LLM research. p.73
• Microsoft leads cloud service provider AI spending as % of total capex. p.79
• For mid-level professional writing, workers using ChatGPT took 40% less time, with quality 18% better. p.91
• Compared to YouTube, Instagram, & TikTok, GenAI apps like ChatGPT suffer from lower retention rates & fewer daily active users. p.94
• Legal complications around text and image copyright infringement are surfacing. p.98
• Governments building GenAI compute capacity, but lagging behind private sector. p.131
• Ukraine a lab for AI-powered warfare. p.133
• Policymakers adopting a wait-and-see approach to potential job losses. Support for Universal Basic Income. p.137
• Open vs closed source debate continues: open source levels the playing field but carries higher risk of misuse; closed source offers greater security but less transparency. p.146
• Top predictions include the craze for huge models seeing >$1B spent on training a single model, an investigation of the Microsoft/OpenAI deal on competition grounds, & a large AI company acquiring an AI chip company. p.157