All Stories

  1. MUPPET
  2. Enhancing Performance Through Control-Flow Unmerging and Loop Unrolling on GPUs
  3. Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay
  4. HPAC-Offload: Accelerating HPC Applications with Portable Approximate Computing on the GPU
  5. ARETE: Accurate Error Assessment via Machine Learning-Guided Dynamic-Timing Analysis
  6. Approximate High-Performance Computing: A Fast and Energy-Efficient Computing Paradigm in the Post-Moore Era
  7. One Step Closer to Converged Computing: Achieving Scalability with Cloud-Native HPC
  8. Piper: Pipelining OpenMP Offloading Execution Through Compiler Optimization For Performance
  9. Breaking the Vendor Lock
  10. Co-Designing an OpenMP GPU Runtime and Optimizations for Near-Zero Overhead Execution
  11. Efficient Execution of OpenMP on GPUs
  12. Extending OpenMP for Machine Learning-Driven Adaptation
  13. Extending OpenMP to Support Automated Function Specialization Across Translation Units
  14. HPAC
  15. PyOMP: Multithreaded Parallel Programming in Python
  16. Towards Compile-Time-Reducing Compiler Optimization Selection via Machine Learning
  17. Examining Failures and Repairs on Supercomputers with Multi-GPU Compute Nodes
  18. Co-Designing Multi-Level Checkpoint Restart for MPI Applications
  19. A Case Study of LLVM-Based Analysis for Optimizing SIMD Code Generation
  20. Artemis: Automatic Runtime Tuning of Parallel Execution Parameters Using Machine Learning
  21. OMPRacer: A Scalable and Precise Static Race Detector for OpenMP Programs
  22. HPC-MixPBench: An HPC Benchmark Suite for Mixed-Precision Analysis
  23. MATCH: An MPI Fault Tolerance Benchmark Suite
  24. DEFCON: Generating and Detecting Failure-prone Instruction Sequences via Stochastic Search
  25. FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis
  26. Reinit++: Evaluating the Performance of Global-Restart Recovery Methods for MPI Fault Tolerance
  27. Evaluating the Impact of Energy Efficient Networks on HPC Workloads
  28. SAFIRE: Scalable and Accurate Fault Injection for Parallel Multithreaded Applications
  29. NanoStreams: A Microserver Architecture for Real-Time Analytics on Fast Data Streams
  30. Online estimation of scalability of multi-threaded programs, with minimal runtime overhead
  31. DARE
  32. REFINE
  33. NanoStreams: Codesigned microservers for edge analytics in real time
  34. Low-Cost Hardware Infrastructure for Runtime Thread Level Energy Accounting
  35. Methods and metrics for fair server assessment under real-time financial workloads
  36. Iso-Quality of Service: Fairly Ranking Servers for Real-Time Data Analytics
  37. Application-Level Energy Awareness for OpenMP
  38. On the Viability of Microservers for Financial Analytics
  39. Fast Dynamic Binary Rewriting for flexible thread migration on shared-ISA heterogeneous MPSoCs
  40. Fast dynamic binary rewriting to support thread migration in shared-ISA asymmetric multicores
  41. Middleware Mechanisms for Agent Mobility in Wireless Sensor and Actuator Networks
  42. Dynamic binary rewriting and migration for shared-ISA asymmetric, multicore processors
  43. Development Tools for Opportunistic Pervasive Computing