We Need Efficient and Transparent Language Models

Holistic Evaluation of Language Models. Stanford researchers recently introduced tools to help users and developers understand large language models (LLMs) in their totality. Given the central role of LLMs in NLP and in generative AI, this suite of tools is an important step toward greater transparency for language models. I hope other researchers build on these techniques and ideas.

Benchmarking Open Table Formats. When migrating to a modern data warehouse or data lakehouse, selecting the right table format is crucial. Brooklyn Data Company just released an important new benchmark comparing the open-source table formats Delta Lake and Apache Iceberg. They found that Delta Lake was typically 1.5x to 3x faster on write workloads, and that read workloads ran consistently 7x to 8x faster on Delta Lake.
