Current technology trends such as shrinking feature sizes and lower voltage levels aggravate the reliability problem, which is expected to become a serious issue in the near future. This is why integrating Fault Tolerance (FT) is becoming necessary not only in life-critical systems, where it has been used since transistor-based systems first appeared, but even in PCs.
Most traditional FT techniques are applicable to modern multiprocessor systems. In addition, multiprocessors introduce a relatively new direction in FT research, which focuses on the reliability of cache coherence. Many modern multiprocessors implement cache coherence to hide the details of the complex underlying memory structure from the application developer. Improper cache coherence operation is likely to violate data integrity, which in turn can result in wrong application output, an application crash, and so on. Several concurrent cache coherence verification schemes have been proposed. All of them introduce a certain network traffic overhead, which is likely to degrade performance. Significant hardware overhead is also typical of the proposed systems. We have proposed a low-cost method which (in the case of the considered MESI coherence protocol) does not have any performance overhead. In return, it has slightly lower fault coverage than some of the other proposed methods, which do incur a performance overhead. The technique is scalable, in that it introduces a constant hardware overhead per cache, and is relatively easy to integrate into an existing system: the cache controller has to be modified to add a few bits to the network messages it sends, and independent checkers have to be attached to the network.
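To make the idea of an independent checker attached to the network more concrete, here is a minimal sketch in Python. It is not the proposed scheme: the text above does not say which bits are added to the messages, so this sketch assumes, purely for illustration, that every message announces the sender's new MESI state for a cache line. The class `CoherenceChecker` and its `observe` method are hypothetical names. The checker only verifies the standard MESI invariant that a line held in the Modified or Exclusive state by one cache is not valid in any other cache.

```python
EXCLUSIVE_STATES = {"M", "E"}  # states that must not coexist with other valid copies


class CoherenceChecker:
    """Hypothetical passive checker that watches state announcements on the network."""

    def __init__(self):
        # (cache_id, line_address) -> last announced MESI state
        self.states = {}

    def observe(self, cache_id, line, new_state):
        """Record an announced state change; return True if the invariant still holds."""
        self.states[(cache_id, line)] = new_state
        # Collect the states of all valid (non-Invalid) copies of this line.
        holders = [
            state
            for (_, addr), state in self.states.items()
            if addr == line and state != "I"
        ]
        if any(state in EXCLUSIVE_STATES for state in holders):
            # An M or E copy must be the only valid copy in the system.
            return len(holders) == 1
        return True


checker = CoherenceChecker()
ok_first_share = checker.observe(0, 0x40, "S")    # cache 0 reads the line
ok_second_share = checker.observe(1, 0x40, "S")   # cache 1 reads the line
ok_bad_upgrade = checker.observe(1, 0x40, "M")    # cache 1 writes, cache 0 not invalidated
ok_after_inval = checker.observe(0, 0x40, "I")    # cache 0 finally invalidates
```

In this sketch the checker only listens, so it adds no messages of its own; whether that matches the overhead properties of the actual method depends on details not given here.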
All FT methods are based on some form of redundancy and introduce a certain overhead (time, performance, energy consumption, etc.). We have observed that many applications are by nature tolerant to faults in some of their parts, while depending on the correctness of other (critical) parts. For example, consider a loop adding two matrices (images). A fault in the addition operation for one of the matrix elements can often be neglected. On the other hand, a fault in the code controlling the loop would most probably crash the application, corrupt the rest of the output matrix, etc. Thus, for these applications it is sufficient to protect only the most critical parts, or desirable to protect them better than the other parts. The proposed Instruction-Level Fault Tolerance Configurability (ILCOFT) technique makes use of this observation and allows different protection levels to be assigned to different parts of an application. This reduces the performance and energy consumption overhead compared with applying the original FT technique with full protection.
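The matrix addition example can be sketched in Python as follows. This is only an illustration of selective protection under stated assumptions, not the ILCOFT implementation: protection is modelled as executing an operation twice and comparing the results, and the names `Executor`, `run`, `add_images`, and the `critical` flag are hypothetical. The loop counter update is always protected, while protection of the element additions is optional.

```python
class FaultDetected(Exception):
    """Raised when two redundant executions of an operation disagree."""


class Executor:
    """Counts executed operations and duplicates only those marked critical."""

    def __init__(self):
        self.executed = 0  # total executions, including redundant ones

    def run(self, op, *args, critical=False):
        result = op(*args)
        self.executed += 1
        if critical:
            # Redundant execution and comparison for protected operations.
            check = op(*args)
            self.executed += 1
            if check != result:
                raise FaultDetected(op.__name__)
        return result


def add(a, b):
    return a + b


def add_images(x, y, executor, protect_data):
    """Element-wise sum of two flat images with explicit loop-control operations."""
    out = []
    i = 0
    while i < len(x):
        # Data operation: protected only if the caller asks for full protection.
        out.append(executor.run(add, x[i], y[i], critical=protect_data))
        # Loop-control operation: always treated as critical.
        i = executor.run(add, i, 1, critical=True)
    return out


x = [1, 2, 3, 4]
y = [10, 20, 30, 40]

full = Executor()
selective = Executor()
result_full = add_images(x, y, full, protect_data=True)
result_selective = add_images(x, y, selective, protect_data=False)
```

With four elements, full protection executes 16 operations and selective protection 12, while both produce the same sum in the fault-free case. The saving in a real system depends on the mix of critical and non-critical instructions.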
Some of the existing fault detection methods duplicate all the instructions executed by a processor and compare their results. We improve the fault coverage and reduce the overhead of this method by using instruction precomputation. Applications are profiled off-line, and the most frequently executed instructions, with their precomputed results, are loaded into a special hardware buffer when the application starts executing. If an instruction with the same input operands is found in the precomputation buffer, it is executed only once, and the result is compared with the precomputed value. Otherwise, the instruction is duplicated as in the original duplication scheme. This method not only increases the fault coverage (addressing long-lasting transient and permanent faults in addition to short transient ones), but also reduces the performance and energy consumption overhead.
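The lookup-or-duplicate decision described above can be sketched in Python. This is a software model under stated assumptions, not the hardware design: an instruction is reduced to an `(operation, operand, operand)` tuple, the buffer is a dictionary filled from a profiling trace, and the names `build_buffer`, `execute`, and `faulty_alu` are hypothetical. The faulty ALU models a permanent stuck-at-1 fault on one result bit of the adder, to illustrate why comparing with a precomputed value can catch faults that plain duplication misses.

```python
import operator
from collections import Counter


class FaultDetected(Exception):
    """Raised when a result disagrees with its reference value."""


GOLDEN_ALU = {"add": operator.add, "mul": operator.mul}


def build_buffer(profile_trace, size):
    """Off-line step: keep the `size` most frequent instructions with their results."""
    counts = Counter(profile_trace)
    return {
        key: GOLDEN_ALU[key[0]](key[1], key[2])
        for key, _ in counts.most_common(size)
    }


def execute(op, a, b, buffer, alu=GOLDEN_ALU):
    """Return (result, number_of_executions) for one instruction."""
    first = alu[op](a, b)
    key = (op, a, b)
    if key in buffer:
        # Buffer hit: execute once and compare with the precomputed value.
        if first != buffer[key]:
            raise FaultDetected(key)
        return first, 1
    # Buffer miss: fall back to plain duplication and comparison.
    second = alu[op](a, b)
    if first != second:
        raise FaultDetected(key)
    return first, 2


trace = [("add", 2, 3)] * 3 + [("mul", 4, 5)] * 2 + [("add", 7, 1)]
buffer = build_buffer(trace, size=2)

hit = execute("add", 2, 3, buffer)    # found in the buffer, executed once
miss = execute("add", 7, 1, buffer)   # not in the buffer, executed twice

# Permanent fault: bit 1 of every adder result is stuck at 1.
faulty_alu = {"add": lambda a, b: (a + b) | 2, "mul": operator.mul}

try:
    execute("add", 2, 3, buffer, alu=faulty_alu)
    permanent_fault_caught = False
except FaultDetected:
    permanent_fault_caught = True

# On a buffer miss both executions are wrong in the same way, so the fault escapes.
undetected = execute("add", 7, 1, buffer, alu=faulty_alu)
```

In this model a buffer hit costs one execution instead of two, and the permanent fault is caught only for the instruction present in the buffer, which is why coverage and overhead both depend on the hit rate.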
Libavcodec is an open-source library that contains many different audio/video codecs, including a very fast MPEG4 codec. It has been optimised for several multimedia extensions such as Intel's MMX, AMD's 3DNow, etc. Wasabi is a chip multiprocessor targeted at media applications that is being developed at Philips. It consists of several TriMedia DSPs and one or more general-purpose CPUs. The TriMedia is a VLIW processor that supports many media operations.
This project focuses on (1) porting libavcodec to the TriMedia, (2) improving its performance by applying architecture-specific optimisations, (3) parallelising libavcodec, and (4) providing an interface between the TriMedia(s) executing libavcodec and the general-purpose processor(s) running application(s) that use libavcodec. First, we describe how libavcodec was ported to the TriMedia. Libavcodec supports only the most recent gcc compilers, whereas the TriMedia compiler accepts standard ANSI C. Hence, certain C constructs had to be replaced by others and certain library functions had to be implemented. Thereafter, we describe how libavcodec was optimised for the TriMedia. The optimisations improve the performance of the MPEG2 and MPEG4 decoders by approximately 41% and 30%, respectively. Parallelisation of the encoder involved changing the interface from Windows threads to the TriMedia multithreading API TM OSAL. A linear speedup has been achieved for up to 6 CPUs; beyond that, it levels off slightly. Additional work is required to make libavcodec more scalable so that it can exploit more processors efficiently. Finally, an interface between the TriMedia(s) executing libavcodec and host applications that use libavcodec is proposed. This interface enables applications to use libavcodec running on the TriMedias efficiently, without having to port the applications themselves to the TriMedia.
My wife Olga and I founded a start-up software development company, VeprIT. Our first product is Photo Sense, a very easy-to-use photo editing application for Mac OS and iOS.
Since the beginning of 2007, photography has been a very serious hobby of mine and Olga's. It has even evolved into a small business. A selection of our work can be found at our Photography Site.