AI-360
"Quantised Llama 3.2 achieves 56% size reduction using QLoRA and SpinQuant, with 4-bit weights and 8-bit activations for mobile deployment."
"SAM 2 enables real-time object segmentation, processing 60M+ polygons across 350M images for manufacturing QA and scientific research."
"AI search engine defends right to cite public facts, highlights source attribution system and revenue-sharing with TIME, Fortune, Der Spiegel."
sCM system reduces image generation to 0.11 seconds using two processing steps vs hundreds, and scales to 1.5B parameters while matching quality of slower methods.
Gefion supercomputer, with 1,528 H100 GPUs, enables quantum simulation jump from 36 to 40 qubits while accelerating weather forecasts from hours to minutes
Meta's Llama model uses split inference: 1st layer processes on user devices, remaining 31 in cloud, enabling privacy while avoiding quantisation