Compute-Enabled Memory to Accelerate Large-Context LLMs via Sparse Attention” was published by researchers at Cornell ...