Unified Parallel Programming of Tilera using UPC
Previous CHREC research has shown that the TILE64 multicore processor can complement FPGAs in high-performance computing applications. With a large number of software-programmable on-chip cores, the TILE64 platform promises a significant productivity advantage over FPGAs. This project intends to leverage that advantage by providing a programming model that significantly improves many-core programmability. The Partitioned Global Address Space (PGAS) programming model is a strong candidate for achieving these objectives on Tilera's platform. PGAS provides programmers with a logically partitioned global address space, allowing threads to be aware of the locality of the data they access. Beyond the ease of use and data-locality awareness inherent in PGAS languages, the TILE64 architecture exhibits features that are interesting for the PGAS model; for example, shared memory is accessible from every core, but locality matters because the caches are small. In addition to memory-access optimizations, there are other research challenges, such as fully exploiting all of the available inter-core communication mechanisms. During the course of this project, the Tilera platform will first be evaluated with respect to the PGAS programming model through a library integration study. A case study using the UPC language will then be conducted, leading to a prototype UPC compiler for the TILE64. Ultimately, the research carried out during this project is expected to result in UPC language improvements and new PGAS programming-model concepts for many-core chips.
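The locality awareness described above is visible in the canonical UPC idiom: a shared array is distributed across threads, and an affinity expression restricts each thread's loop iterations to the elements it owns. The following is a generic sketch in standard UPC, not code from this project:

```upc
#include <upc.h>
#define N 1024

/* Cyclically distributed across threads: element i has affinity
 * to thread i % THREADS. */
shared int v1[N], v2[N], sum[N];

int main(void) {
    int i;
    /* The fourth clause is the affinity expression: each thread
     * executes only the iterations whose target element is local,
     * so the loop body performs no remote memory accesses. */
    upc_forall (i = 0; i < N; i++; &sum[i])
        sum[i] = v1[i] + v2[i];
    return 0;
}
```

On a chip such as the TILE64, keeping accesses local in this way also keeps each core's working set within its small cache, which is precisely why the PGAS model is attractive for this platform.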
Virtualizing FPGA Resources for HPRC
The execution of parallel applications on HPRCs mainly follows the Single-Program Multiple-Data (SPMD) model, as is largely the case in traditional high-performance computers (HPCs). In addition, FPGAs in such systems have prevailingly been used as co-processors. The overall system resources, however, are often underutilized because of the asymmetric distribution of reconfigurable processors relative to conventional processors; this asymmetry makes the SPMD programming model difficult to apply on these systems. In this project, we propose a resource-virtualization solution based on Partial Run-Time Reconfiguration (PRTR), which allows the reconfigurable processors to be shared among the otherwise underutilized conventional processors. The goals of this project are therefore to maximize resource utilization in HPRC systems and to take advantage of advances and trends in multicore technology by extending hardware virtualization to multitasking on multicore processors and FPGAs.
Multi-Core Architectures
Multi-core architectures, also known as chip multiprocessors (CMPs), have emerged as the dominant architecture for both desktop and high-performance systems. These emerging architectures increase the performance capability of a single chip without requiring a more complex system or a larger power budget. However, they have also introduced many challenges in maximizing application performance. From a broad perspective, our objective is to improve the utilization of such systems.
Here is a general categorization of our ongoing research efforts in CMPs:
- Performance evaluation of emerging CMPs
- Exploring the impact of CMPs in the context of large clusters
- Exploring heterogeneous multi-core systems, including the Cell processor and GPUs
- Architectural support for future CMPs
Unified Parallel C (UPC)
UPC, or Unified Parallel C, is a parallel extension of ANSI C. UPC follows a distributed shared-memory programming model aimed at combining the ease of programming of the shared-memory paradigm with the ability to exploit data locality. The HPC Lab (HPCL) coordinates the UPC consortium activities and actively participates in the development of the language specifications. The HPCL also leads the performance evaluation and validation research efforts, as well as the development of the UPC I/O specifications, in collaboration with Argonne National Lab and UC Berkeley.
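A minimal program illustrates the model: every thread runs the same code (SPMD), and the `shared` qualifier places data in the partitioned global address space, where each element has affinity to one thread. This is a generic sketch in standard UPC:

```upc
#include <upc.h>
#include <stdio.h>

shared int hits[THREADS];  /* one element with affinity to each thread */

int main(void) {
    hits[MYTHREAD] = MYTHREAD;  /* each thread writes its own partition */
    upc_barrier;                /* synchronize before reading remote data */
    if (MYTHREAD == 0) {        /* thread 0 reads across the address space */
        int i, total = 0;
        for (i = 0; i < THREADS; i++)
            total += hits[i];
        printf("sum of thread ids = %d\n", total);
    }
    return 0;
}
```

Reads and writes to remote elements of `hits` look like ordinary C array accesses; the compiler and runtime generate the required communication, which is what gives UPC the ease of use of shared-memory programming.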
View Project Website [link]