All Stories

  1. System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  2. TEA+: A Novel Temporal Graph Random Walk Engine with Hybrid Storage Architecture
  3. Mitigating Coupling Map Constrained Correlated Measurement Errors on Quantum Devices
  4. NAS-SE: Designing A Highly-Efficient In-Situ Neural Architecture Search Engine for Large-Scale Deployment
  5. HEAT: A Highly Efficient and Affordable Training System for Collaborative Filtering Based Recommendation on CPUs
  6. TEA: A General-Purpose Temporal Graph Random Walk Engine
  7. DynamAP: Architectural Support for Dynamic Graph Traversal on the Automata Processor
  8. T-GCN
  9. Vapro
  10. AStitch: enabling a new multi-dimensional optimization space for memory-intensive ML training and inference on modern SIMT architectures
  11. MAPA
  12. Shift-BNN: Highly-Efficient Probabilistic Bayesian Neural Network Training via Memory-Friendly Pattern Retrieving
  13. Toward efficient interactions between Python and native libraries
  14. Fast and Scalable Sparse Triangular Solver for Multi-GPU Based HPC Architectures
  15. ClickTrain
  16. η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities
  17. Designing Future Planet-Scale Virtual Reality System via Software-Hardware Co-design
  18. An efficient uncertain graph processing framework for heterogeneous architectures
  19. A novel memory-efficient deep learning training framework via error-bounded lossy compression