Preface
Page: ii-iv (3)
Author: Pao-Ann Hsiung, Yean-Ru Chen and Chao-Sheng Lin
DOI: 10.2174/9781608052257111010100ii
List of Contributors
Page: v-vi (2)
Author: Pao-Ann Hsiung, Yean-Ru Chen and Chao-Sheng Lin
DOI: 10.2174/97816080522571110101000v
Acknowledgements
Page: vii-vii (1)
Author: Pao-Ann Hsiung, Yean-Ru Chen and Chao-Sheng Lin
DOI: 10.2174/978160805225711101010vii
Affinity and Distance-Aware Thread Scheduling and Migration in Reconfigurable Many-Core Architectures
Page: 3-18 (16)
Author: Fadi N. Sibai
DOI: 10.2174/978160805225711101010003
PDF Price: $15
Abstract
Modern many-core CMPs and MPSoC embedded systems integrate different cores. A 2D mesh interconnects the routers at each core, providing system reconfigurability that, for instance, allows specific routes to be bypassed because of faults or congestion. In this work, a class-based many-core architecture with reconfigurable classes and a 2D mesh-based on-chip interconnection network is considered. We present an affinity- and distance-aware thread scheduling scheme and migration policies for a reconfigurable, heterogeneous, class-based, 2D mesh-interconnected many-core CMP. We also present a simulator for evaluating various scheduling and migration algorithms. The simulation results reveal that scheduling algorithms that both consider core affinity and support distance-based migration outperform the other algorithms considered in reconfigurable many-core architectures.
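To make the idea of combining affinity with distance concrete, the following is a minimal sketch, not the chapter's actual scheduler. It assumes that hop distance on the 2D mesh can be approximated by Manhattan distance and that a thread prefers one core class. The names `Core`, `mesh_distance`, and `pick_core` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Core:
    """Hypothetical description of one core in a 2D mesh."""
    core_id: int
    x: int
    y: int
    core_class: str
    busy: bool = False


def mesh_distance(a: Tuple[int, int], b: Tuple[int, int]) -> int:
    """Manhattan distance, used here as an assumed stand-in for hop count."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def pick_core(cores: Sequence[Core], preferred_class: str,
              origin: Tuple[int, int]) -> Optional[Core]:
    """Pick an idle core: affinity match first, then shortest distance."""
    idle = [c for c in cores if not c.busy]
    if not idle:
        return None
    # Key order: class mismatch (False sorts before True), distance to
    # the origin, then core id as a deterministic tie-breaker.
    return min(
        idle,
        key=lambda c: (c.core_class != preferred_class,
                       mesh_distance((c.x, c.y), origin),
                       c.core_id),
    )
```

In this sketch a thread migrating away from position `origin` lands on the nearest idle core of its preferred class, and falls back to the nearest idle core of any class when no matching core is free.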
On the Design of Multicore Architectures Guided by a Miss Table at Level-1 and Level-2 Caches to Improve Predictability and Performance/Power Ratio
Page: 19-32 (14)
Author: Abu Asaduzzaman and Fadi N. Sibai
DOI: 10.2174/978160805225711101010019
PDF Price: $15
Abstract
Most contemporary architectures for high-performance low-power computing systems consist of multicore processors, where tasks are distributed among multiple cores to improve processing speed and the system runs at a lower frequency to reduce the total power consumption. However, multilevel caches in multicore architectures compound the timing unpredictability and require a significant amount of power to operate. Cache locking techniques are used in single-core systems to improve predictability by locking useful blocks in the cache. The success of cache locking depends primarily on selecting the right blocks to lock. In prior work, we introduced an efficient block selection methodology and a Miss Table based cache locking scheme, in which information about the blocks and cache misses is stored in the Miss Table to facilitate cache locking. Cache locking in multicore systems is more challenging because of the complexity introduced by the architecture. In this chapter, we investigate the impact of the placement of the Miss Table, i.e., whether at the level-1 cache (CL1) or at the level-2 cache (CL2), on the system’s predictability and performance/power ratio. Using the VisualSim and Heptane simulation tools, we simulate an 8-core architecture, where each core’s private CL1 is split into instruction (I1) and data (D1) caches and the CL2 is unified and shared by the cores. Experimental results using MPEG4 decoding and FFT algorithms show that Miss Table based cache locking at level-1 is more beneficial than Miss Table based cache locking at level-2 for MPEG4; a maximum reduction of 38% in mean delay per task and a maximum reduction of 32% in total power consumption are achieved by locking one-fourth of the I1 cache size. For FFT, the impact of locking at level-1 and level-2 is almost the same.
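As a rough illustration of how a miss table could drive block selection, the sketch below counts misses per block address and locks the most frequently missed blocks, up to a fraction of the cache. This is an assumed simplification, not the chapter's methodology. The function names, the trace format (a sequence of missed block addresses), and the one-fourth default are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, List


def build_miss_table(miss_trace: Iterable[int]) -> Counter:
    """Count how often each block address appears in a trace of misses."""
    return Counter(miss_trace)


def select_blocks_to_lock(miss_table: Counter, cache_blocks: int,
                          lock_fraction: float = 0.25) -> List[int]:
    """Return the block addresses to lock, most-missed first.

    The lock budget is a fraction of the total number of cache blocks
    (one-fourth by default, an assumption chosen for illustration).
    """
    budget = int(cache_blocks * lock_fraction)
    # Sort by descending miss count; break ties by ascending address
    # so the result is deterministic.
    ranked = sorted(miss_table.items(), key=lambda kv: (-kv[1], kv[0]))
    return [addr for addr, _ in ranked[:budget]]
```

With an 8-block cache and the default fraction, the sketch locks the two blocks that missed most often in the trace.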
TRoCMP: An Approach to Energy Saving for Multi-Core Systems
Page: 33-60 (28)
Author: Long Zheng, Mianxiong Dong, Minyi Guo, Song Guo, Kaoru Ota and Jun Ma
DOI: 10.2174/978160805225711101010033
PDF Price: $15
Abstract
Multi-core processors, also called chip multiprocessors (CMPs), have become the mainstream means of achieving higher computation capability. However, energy consumption remains a crucial problem in the design and manufacture of multi-core processors. Tag reduction is a technique that can save energy in single-core systems. This chapter introduces Tag Reduction on CMP (TRoCMP), a novel approach to energy saving for multi-core systems. We first extend tag reduction from single-core to multi-core processors and propose three heuristic algorithms to implement TRoCMP. We then consider the performance overhead, introducing the Core Degree mechanism and a refined heuristic algorithm to find the trade-off between the energy saving and the performance overhead of TRoCMP. In particular, we formulate the energy consumption and performance overhead of TRoCMP in order to analyze and estimate them. In our experiments, we modify the Linux kernel and implement new modules to collect experimental data from SPEC CPU2006 benchmarks running on a real operating system. This ensures the precision of our experiments, since tag reduction is very sensitive to the usage of physical memory. The experimental results show that, compared with a system that does not use tag reduction, TRoCMP saves up to 83.93% and 76.16% of total energy on average on 8-core and 4-core processors, respectively. TRoCMP also significantly outperforms tag reduction on a single-core processor. When performance overhead is taken into account, setting Core Degree to 6 achieves the best balance between energy saving and performance overhead.
Model-Driven Multi-core Embedded Software Design
Page: 61-77 (17)
Author: Chao-Sheng Lin, Pao-Ann Hsiung, Chih-Hung Chang, Nien-Lin Hsueh, Chorng-Shiuh Koong, Chih-Hsiong Shih, Chao-Tung Yang and William C.-C. Chu
DOI: 10.2174/978160805225711101010061
PDF Price: $15
Abstract
Multi-core processors have emerged rapidly in personal computing and embedded systems. However, the programming environment for multi-core processor based systems is still quite immature and lacks efficient tools. In this work, we present a new VERTAF/Multi-Core framework and show how software code can be automatically generated from SysML models of multi-core embedded systems. We illustrate how model-driven design based on SysML can be seamlessly integrated with Intel’s Threading Building Blocks (TBB) and the Quantum Platform (QP) middleware libraries. We use a digital video recording system to illustrate the benefits of the framework. Our experiments show how the combination of SysML, QP, and TBB helps make multi-core embedded system programming model-driven, easy, efficient, and effortless.
Automatic High-Level Code Generation for Multi-Core Processors in Embedded Systems
Page: 78-92 (15)
Author: Yu-Shin Lin, Shang-Wei Lin, Chao-Sheng Lin, Chun-Hsien Lu, Chia-Chiao Ho, Yi-Luen Chang, Bo-Hsuan Wang and Pao-Ann Hsiung
DOI: 10.2174/978160805225711101010078
PDF Price: $15
Abstract
This chapter demonstrates how high-level code is automatically generated for multi-core processors. The code generation capability of the Verifiable Embedded Real-Time Application Framework (VERTAF) was extended to support multi-core processors in the new VERTAF/Multi-Core (VMC) framework for embedded systems. After users specify embedded software requirements via SysML models along with parallel task, parallel data, and parallel dataflow specifications, the code generator automatically generates parallel code. Using a digital video recording (DVR) system as a case study, we show the correctness and advantages of the VMC code generator. The main inputs of the VMC code generator include the block definition diagrams, state machine diagrams, and requirement diagrams of the system to be designed. The proposed code generation in VMC not only significantly decreases the amount of manually written code, but also provides a formal procedure for model-conforming code generation of multi-core embedded software.
Index
Page: 93-95 (3)
Author: Pao-Ann Hsiung, Yean-Ru Chen and Chao-Sheng Lin
DOI: 10.2174/978160805225711101010093
Introduction
The surge of multicore processors coming onto the market and onto users’ desktops has made parallel computing the focus of attention once again. This time, however, it is led by industry, which ensures that multicore computing is here to stay. Nevertheless, much research remains to be done in multicore hardware-software design before consumer applications can leverage the benefits of this new paradigm. This ebook is put forward as a timely collection of state-of-the-art technologies in both hardware and software design for multicore computing. With the growing prevalence of multicore processors in embedded systems, real-time systems, multimedia systems, bioinformatics systems, and network systems, to list a few, this ebook attempts to cover the design and verification issues of these different application domains, serving as a single source of reference for state-of-the-art techniques in multicore processor design and software programming. This ebook will be of immense help to system and software engineers, including both experts and non-experts in parallel computing.