All Stories

  1. Training Signal Optimization for Behavioral Modeling and Digital Predistortion of RF Power Amplifiers
  2. A Survey of FPGA-based 3D CNN Accelerators and Hardware-aware Algorithmic Optimizations
  3. Miniature: Fast AI Supercomputer Networks Simulation on FPGAs
  4. Transfer Learning on the Edge for a Wireless Application Using an SoC Platform
  5. LUTMUL: Exceed Conventional FPGA Roofline Limit by LUT-based Efficient Multiplication for Neural Network Inference
  6. Realizing Network-Attached FPGAs in the Cloud
  7. Optimizing FPGA Memory Allocation for Matrix-Matrix Multiplication Using Bayesian Optimization
  8. Efficient Neural Networks on the Edge with FPGAs by Optimizing an Adaptive Activation Function
  9. Efficient Neural Networks on the Edge with FPGAs by Optimizing an Adaptive Activation Function
  10. Efficient Neural Networks on the Edge with FPGAs by Optimizing an Adaptive Activation Function
  11. HotaQ: Hardware Oriented Token Adaptive Quantization for Large Language Models
  12. Minimizing Training Signal Length for Power Amplifier Characterization and Linearization
  13. Pets vs Cattle: Heterogeneous Systems in the 21st Century
  14. Artifact Evaluation for ACM TRETS Papers Submitted from the FPT Journal Track
  15. The Future of FPGA Acceleration in Datacenters and the Cloud
  16. Optimizing Designs Using Several Types of Memories on Modern FPGAs
  17. FPGA-aware automatic acceleration framework for vision transformer with mixed-scheme quantization
  18. Evaluating Theoretical Baselines for ML Benchmarking Across Different Accelerators
  19. Computationally Efficient Look-up-Tables for Behavioral Modelling and Digital Pre-distortion of Multi-standard Wireless Systems
  20. FPGAs in The Cloud
  21. FPGAs in the Cloud
  22. High‐performance transformation of protein structure representation from internal to Cartesian coordinates
  23. Strategies and Demonstration to Support Multiple Wireless Protocols with a Single RF Front-End
  24. A Novel Physical Layer Authentication With PAPR Reduction Based on Channel and Hardware Frequency Responses
  25. Real Time Receiver Baseband Processing Platform for Sub 6 GHz PHY Layer Experiments
  26. Evaluation of Optimized CNNs on Heterogeneous Accelerators using a Novel Benchmarking Approach
  27. SIFO: Secure Computational Infrastructure Using FPGA Overlays
  28. QuTiBench
  29. Garbled Circuits in the Cloud using FPGA Enabled Nodes
  30. Detection of Different Wireless Protocols on an FPGA with the Same Analog/RF Front End
  31. High-Level and Compact Design of Cross-Channel LTE DownLink Channel Encoder
  32. FINN-R
  33. Local and Global Shared Memory for Task Based HPC Applications on Heterogeneous Platforms
  34. Scaling Neural Network Performance through Customized Hardware Architectures on Reconfigurable Logic
  35. Accelerating big data applications using lightweight virtualization framework on enterprise cloud
  36. FPGA modeling techniques for detecting and demodulating multiple wireless protocols
  37. FIM: Performance Prediction for Parallel Computation in Iterative Data Processing Applications
  38. Secure Function Evaluation Using an FPGA Overlay Architecture
  39. A Framework for Developing Parallel Applications with high level Tasks on Heterogeneous Platforms
  40. Using High Level GPU Tasks to Explore Memory and Communications Options on Heterogeneous Platforms
  41. Performance prediction techniques for scalable large data processing in distributed MPI systems
  42. Open-Source Variable-Precision Floating-Point Library for Major Commercial FPGAs
  43. Design space exploration of GPU Accelerated cluster systems for optimal data transfer using PCIe bus
  44. Unified and lightweight tasks and conduits: A high level parallel programming framework
  45. Modeling considerations for the hardware-software co-design of flexible modern wireless transceivers
  46. State-Action Based Link Layer Design for IEEE 802.11b Compliant MATLAB-Based SDR
  47. High-level hardware-software co-design of an 802.11a transceiver system using Zynq SoC
  48. Cardiac MRI compressed sensing image reconstruction with a graphics processing unit
  49. High-Level System Design of IEEE 802.11b Standard-Compliant Link Layer for MATLAB-Based SDR
  50. Validity and reliability of Kinect skeleton for measuring shoulder joint angles: a feasibility study
  51. Accelerating K-Means clustering with parallel implementations and GPU computing
  52. GPU implementation of reverse coordinate conversion for proteins
  53. Leakage evaluation on power balance countermeasure against side-channel attack on FPGAs
  54. Balance power leakage to fight against side-channel analysis at gate level in FPGAs
  55. Side-channel analysis of MAC-Keccak hardware implementations
  56. Accuracy of kinect for measuring shoulder joint angles in multiple planes of motion
  57. Kernel Specialization Provides Adaptable GPU Code for Particle Image Velocimetry
  58. Behavioral Non-portability in Scientific Numeric Computing
  59. Implementing a MATLAB-Based Self-configurable Software Defined Radio Transceiver
  60. Power analysis attack on hardware implementation of MAC-Keccak on FPGAs
  61. Accelerating protein coordinate conversion using GPUs
  62. Fast reconstruction of 3D volumes from 2D CT projection data with GPUs
  63. Reducing Processing Latency with a Heterogeneous FPGA-Processor Framework
  64. Validity and reliability of kinect for measuring shoulder joint angles
  65. Make it real: Effective floating-point reasoning via exact arithmetic
  66. FPGA-based hyperspectral covariance coprocessor for size, weight, and power constrained platforms
  67. Vendor agnostic, high performance, double precision Floating Point division for FPGAs
  68. Development of a low-cost, adaptive, clinician-friendly virtual rehabilitation system
  69. Kernel Specialization for Improved Adaptability and Performance on Graphics Processing Units (GPUs)
  70. A Message from the General Chair and Program Chair
  71. Minimum energy operation for clustered island-style FPGAs
  72. The effect of temporal impulse response on experimental reduction of photon scatter in time-resolved diffuse optical tomography
  73. Minimum Energy Analysis and Experimental Verification of a Latch-Based Subthreshold FPGA
  74. Characterization of a single-supply subthreshold FPGA
  75. Cognitive radio universal software hardware
  76. CUDA and OpenCL implementations of 3D CT reconstruction for biomedical imaging
  77. VForce: An environment for portable applications on high performance systems with accelerators
  78. CRUSH: Cognitive Radio Universal Software Hardware
  79. Heterogeneous tasks and conduits framework for rapid application portability and deployment
  80. Cognitive Radio Universal Software Hardware
  81. Incremental clustering applied to radar deinterleaving
  82. Adaptable Two-Dimension Sliding Windows on NVIDIA GPUs with Runtime Compilation
  83. An Autonomous Vector/Scalar Floating Point Coprocessor for FPGAs
  84. VFloat
  85. Efficient template matching with variable size templates in CUDA
  86. A truly two-dimensional systolic array FPGA implementation of QR decomposition
  87. Implementing a Highly Parameterized Digital PIV System on Reconfigurable Hardware
  88. Message from the ASAP '09 General and Technical Program Chairs
  89. Accelerating phase unwrapping and affine transformations for optical quadrature microscopy using CUDA
  90. FPGA Supercomputing Platforms, Architectures, and Techniques for Accelerating Computationally Complex Algorithms
  91. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing
  92. Implementing phase unwrapping using Field Programmable Gate Arrays or Graphics Processing Units: A comparison
  93. Special issue: General-purpose processing using graphics processing units
  94. Efficient Communication Between the Embedded Processor and the Reconfigurable Logic on an FPGA
  95. An efficient implementation of a phase unwrapping kernel on reconfigurable hardware
  96. An FPGA Implementation of Explicit-State Model Checking
  97. An Efficient Implementation of a Phase Unwrapping Kernel on Reconfigurable Hardware
  98. Dynamo: a runtime partitioning system for FPGA-based HW/SW image processing systems
  99. K-means Clustering for Multispectral Images Using Floating-Point Divide
  100. Writing Portable Applications that Dynamically Bind at Run Time to Reconfigurable Hardware
  101. Vforce: An Extensible Framework for Reconfigurable Supercomputing
  102. Advanced Components in the Variable Precision Floating-Point Library
  103. Automatic Sliding Window Operation Optimization for FPGA-Based
  104. Enabling MPEG-2 video playback in embedded systems through improved data cache efficiency
  105. Real-Time Particle Image Velocimetry for Feedback Loops Using FPGA Implementation
  106. Field-Programmable Gate Arrays in Embedded Systems
  107. Poster reception---Improving the performance of parallel backprojection on a reconfigurable supercomputer
  108. Parallel-Beam Backprojection: An FPGA Implementation Optimized for Medical Imaging
  109. Optimizing data intensive window-based image processing on reconfigurable hardware boards
  110. Applying reconfigurable hardware to the analysis of multispectral and hyperspectral imagery
  111. Accurate Power Estimation for Sequential CMOS Circuits Using Graph-based Methods
  112. Design issues for hardware implementation of an algorithm for segmenting hyperspectral imagery
  113. Effect of data truncation in an implementation of pixel clustering on a custom computing machine
  114. HML, a novel hardware description language and its translation to VHDL
  115. A data-centric approach to high-level synthesis
  116. Spatial and color clustering on an FPGA-based computer system
  117. Rothko: a three-dimensional FPGA
  118. Division and square root: choosing the right implementation
  119. Optimizing the data cache performance of a software MPEG-2 video decoder
  120. Rothko: A three dimensional FPGA architecture, its fabrication, and design tools
  121. Area and performance tradeoffs in floating-point divide and square-root implementations
  122. An automaton model for scheduling constraints in synchronous machines
  123. Non-restoring integer square root: A case study in design by principled optimization
  124. Reasoning about pipelines with structural hazards
  125. Verifying a logic-synthesis algorithm and implementation: a case study in software verification
  126. A methodology for efficient hardware verification
  127. PBS: proven Boolean simplification
  128. Erratum to: High level synthesis and generation FPGAs with the BEDROC system
  129. High level synthesis and generating FPGAs with the BEDROC system
  130. Formally verified synthesis of combinational CMOS circuits
  131. From programs to transistors: Verifying hardware synthesis tools
  132. Reasoning about the function and timing of integrated circuits with interval temporal logic
  133. Automatic determination of signal flow through MOS transistor networks
  134. Runtime assignment of reconfigurable hardware components for image processing pipelines
  135. Run-time execution of reconfigurable hardware in a Java environment
  136. Design tradeoffs in a hardware implementation of the k-means clustering algorithm
  137. High level synthesis for designing custom computing hardware
  138. Truly rapid prototyping requires high level synthesis