Literature
173 references, last updated Mon Oct 22 17:02:15 Europe/Amsterdam 2007
- [von Neumann, 1945]
- John von Neumann.
First draft of a report on the EDVAC.
Technical Report W-670-ORD-4926, Moore School of Electrical Engineering,
University of Pennsylvania, June 1945.
Contracted with United States Army Ordnance Department.
(PDF)
(DOI)
The first draft of a report on the EDVAC written by John von
Neumann is presented. This first draft contains a wealth of information, and
it had a pervasive influence when it was first written. Most prominently,
Alan Turing cites it in his proposal for the Pilot automatic computing engine
(ACE) as the definitive source for understanding the nature and design of a
general-purpose digital computer
- [McCabe, 1976]
- Thomas J. McCabe.
A complexity measure.
In ICSE '76: Proceedings of the 2nd international conference on Software
engineering, page 407, Los Alamitos, CA, USA, 1976. IEEE Computer
Society Press.
(PDF)
This paper describes a graph-theoretic complexity measure and
illustrates how it can be used to manage and control program complexity.
The paper first explains how the graph-theory concepts apply and gives an
intuitive explanation of the graph concepts in programming terms. The control
graphs of several actual Fortran programs are then presented to illustrate
the correlation between intuitive complexity and the graph-theoretic
complexity.
- [Halstead, 1977]
- Maurice H.
Halstead.
Elements of Software Science (Operating and programming systems
series).
Elsevier Science Inc., New York, NY, USA, 1977.
- [Backus, 1978]
- John Backus.
Can programming be liberated from the von Neumann style?: a functional style
and its algebra of programs.
Commun. ACM, 21(8):613–641, 1978.
(DOI)
- [Oviedo, 1980]
- E. I. Oviedo.
Control flow, data flow, and program complexity.
In COMPSAC'80: Proceedings of the Fourth International Computer Software
and Applications Conference, pages 146–152, November 1980.
- [Boehm, 1981]
- Barry W. Boehm.
Software Engineering Economics.
Prentice Hall PTR, Upper Saddle River, NJ, USA, 1981.
- [Harrison and Magel,
1981]
- Warren Harrison and Kenneth Magel.
A topological analysis of the complexity of computer programs with less than
three binary branches.
SIGPLAN Not., 16(4):51–63, 1981.
(DOI)
- [Piwowarski, 1982]
- Paul
Piwowarski.
A nesting level complexity measure.
SIGPLAN Not., 17(9):44–50, 1982.
(DOI)
- [Basili and Hutchens,
1983]
- Victor R. Basili and David H. Hutchens.
An empirical study of a syntactic complexity family.
IEEE Transactions on Software Engineering, 9(6):664–672,
1983.
- [Kirkpatrick et al.,
1983]
- S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi.
Optimization by simulated annealing.
Science, 220(4598):671–680, 13 May 1983.
- [Elshoff, 1984]
- James L.
Elshoff.
Characteristic program complexity measures.
In ICSE '84: Proceedings of the 7th international conference on Software
engineering, pages 288–293, Piscataway, NJ, USA, 1984. IEEE
Press.
- [Prather,
1984]
- R. E. Prather.
An Axiomatic Theory of Software Complexity Measure.
The Computer Journal, 27(4):340–347, 1984.
(DOI)
In software engineering, various 'metrics' have been introduced in
an attempt to measure the complexity of programs. We show how the whole idea
of a 'software complexity measure' can be axiomatized in such a way as to
include the more familiar concrete examples and to allow for new measures
that might offer advantages not captured by those previously introduced. In
particular, a new testing measure is introduced, based on the
'multiple-condition' test strategy. Comparisons are made between this new
measure and the more traditional metrics. In addition, a more general
theoretical study is initiated, showing the effect of the axiomatic
development in relation to the treatment of program
structuredness.
- [Tai, 1984]
- Kuo-Chung Tai.
A program complexity metric based on data flow information in control graphs.
In ICSE '84: Proceedings of the 7th international conference on Software
engineering, pages 239–248, Piscataway, NJ, USA, 1984. IEEE
Press.
- [Ejiogu, 1985]
- Lem O. Ejiogu.
A simple measure of software complexity.
SIGMETRICS Perform. Eval. Rev., 13(1):33–47, 1985.
(DOI)
- [Gong and Schmidt,
1985]
- Huisheng Gong and Monika Schmidt.
A complexity measure based on selection and nesting.
SIGMETRICS Perform. Eval. Rev., 13(1):14–19, 1985.
(DOI)
- [Kafura and Canning,
1985]
- Dennis Kafura and James Canning.
A validation of software metrics using many metrics and two resources.
In ICSE '85: Proceedings of the 8th international conference on Software
engineering, pages 378–385, Los Alamitos, CA, USA, 1985. IEEE
Computer Society Press.
- [Chen and Kwan, 1986]
- T Y Chen and
S C Kwan.
An analysis of length equation using a dynamic approach.
SIGPLAN Not., 21(4):42–47, 1986.
(DOI)
- [Höskuldsson, 1988]
- Agnar
Höskuldsson.
PLS regression methods.
Journal of Chemometrics, 2(3):211–228, 1988.
(DOI)
- [Nejmeh, 1988]
- Brian A. Nejmeh.
NPATH: a measure of execution path complexity and its applications.
Commun. ACM, 31(2):188–200, 1988.
(DOI)
- [Weyuker, 1988]
- E. J. Weyuker.
Evaluating software complexity measures.
IEEE Trans. Softw. Eng., 14(9):1357–1365, 1988.
(DOI)
A set of properties of syntactic software complexity measures is
proposed to serve as a basis for the evaluation of such measures. Four known
complexity measures are evaluated and compared using these criteria. This
formalized evaluation clarifies the strengths and weaknesses of the examined
complexity measures, which include the statement count, cyclomatic number,
effort measure, and data flow complexity measures. None of these measures
possesses all nine properties, and several are found to fail to possess
particularly fundamental properties; this failure calls into question their
usefulness in measuring syntactic complexity.
- [Zalateu and Felician,
1989]
- L. Felician and G. Zalateu.
Validating Halstead's theory for Pascal programs.
IEEE Transactions on Software Engineering, 15(12):1630–1632,
1989.
(DOI)
M.H. Halstead's theory (1977) has been validated for different
languages, but Pascal programs seem to fit only partially with the theory.
D.B. Johnston and A.M. Lister (1981) first recognized the lack of operators
due to the structure of this language and proposed a modification of
Halstead's formula. The article confirms their results but suggests a
correction to their formula, which is particularly necessary for large
programs. Experimental results, obtained by examining about 550 Pascal
programs, represent the widest test to date of Halstead theory with regard to
Pascal programs
- [Huang et al., 1990]
- Chu-Yi Huang,
Yen-Shen Chen, Youn-Long Lin, and Yu-Chin Hsu.
Data path allocation based on bipartite weighted matching.
In DAC '90: Proceedings of the 27th ACM/IEEE conference on Design
automation, pages 499–504, New York, NY, USA, 1990. ACM Press.
(DOI)
- [Athanas and Silverman,
1991]
- Peter M. Athanas and Harvey F. Silverman.
An adaptive hardware machine architecture and compiler for dynamic processor
reconfiguration.
In ICCD '91: Proceedings of the 1991 IEEE International Conference on
Computer Design on VLSI in Computer & Processors, pages 397–400,
Washington, DC, USA, 1991. IEEE Computer Society.
- [Fenton, 1991]
- Norman E. Fenton.
Software Metrics: A Rigorous Approach.
Chapman & Hall, Ltd., London, UK, 1991.
- [Jayaprakash et al.,
1991]
- K. B. Lakshmanan, S. Jayaprakash, and P. K. Sinha.
Properties of control-flow complexity measures.
IEEE Transactions on Software Engineering, 17(12):1289–1295,
1991.
(DOI)
The authors attempt to formalize some properties which any
reasonable control-flow complexity measure must satisfy. Since large programs
are often built by sequencing and nesting of simpler constructs, the authors
explore how control-flow complexity measures behave under such compositions.
They analyze five existing control flow complexity measures: cyclomatic
number, total adjusted complexity, scope ratio, MEBOW, and NPATH. The
analysis reveals the strengths and weaknesses of these control flow
complexity measures
- [Smith and Cherniavsky,
1991]
- J. C. Cherniavsky and C. H. Smith.
On Weyuker's axioms for software complexity measures.
IEEE Transactions on Software Engineering, 17(6):636–638, 1991.
(DOI)
Properties for software complexity measures are discussed. It is
shown that a collection of nine properties suggested by E.J. Weyuker (see
ibid., vol. 14, pp. 1357-65, 1988) is inadequate for determining the quality
of a software complexity measure. A complexity measure which satisfies all
nine of the properties, but which has absolutely no practical utility in
measuring the complexity of a program is presented. It is concluded that
satisfying all of the nine properties is a necessary, but not sufficient,
condition for a good complexity measure
- [Harrison, 1992]
- Warren
Harrison.
An entropy-based measure of software complexity.
IEEE Trans. Softw. Eng., 18(11):1025–1029, 1992.
(DOI)
It is proposed that the complexity of a program is inversely
proportional to the average information content of its operators. An
empirical probability distribution of the operators occurring in a program is
constructed, and the classical entropy calculation is applied. The
performance of the resulting metric is assessed in the analysis of two
commercial applications totaling well over 130000 lines of code. The results
indicate that the new metric does a good job of associating modules with
their error spans (average number of tokens between error
occurrences)
- [Micheli and Gupta,
1992]
- R. K. Gupta and G. De Micheli.
System-level synthesis using re-programmable components.
In Design Automation, 1992. Proceedings. [3rd], pages 2–7,
Brussels, 1992.
(DOI)
The authors formulate the synthesis problem of complex behavioral
descriptions with performance constraints as a hardware-software co-design
problem. The target system architecture consists of a software component as a
program running on a re-programmable processor assisted by
application-specific hardware components. System synthesis is performed by
first partitioning the input system description into hardware and software
portions and then by implementing each of them separately. The synthesis of
dedicated hardware is then achieved by means of hardware synthesis tools
(D.D. Mitchell, D.C. Ku, F. Mailhot, and T. Truong, 'The Olympus Synthesis
System for digital design', IEEE Design and Test Magazine, pp. 37-53, Oct.
1990), while the software component is generated using software compiling
techniques. The authors consider the problem of identifying potential
hardware and software components of a system described in a high-level
modeling language and they present a partitioning procedure. They then
describe the results of partitioning a network coprocessor
- [Zuse, 1992]
- Horst Zuse.
Properties of software measures.
Software Quality Journal, 1(4):225–260, December 1992.
(DOI)
- [Adams et al., 1993]
- D. E. Thomas, J. K. Adams, and H. Schmit.
A model and methodology for hardware-software codesign.
IEEE Design & Test of Computers, 10(3):6–15, 1993.
(DOI)
A behavioral model of a class of mixed hardware-software systems is
presented. A codesign methodology for such systems is defined. The
methodology includes hardware-software partitioning, behavioral synthesis,
software compilation, and demonstration on a testbed consisting of a
commercial central processing unit (CPU), field-programmable gate arrays, and
programmable interconnections. Design examples that illustrate how certain
characteristics of system behavior and constraints suggest hardware or
software implementation are presented
- [Casselman, 1993]
- S. Casselman.
Virtual computing and the virtual computer.
In FPGA '93: Proceedings of the IEEE Workshop on FPGAs for Custom
Computing Machines, pages 43–48, Reseda, CA, USA, April 1993. IEEE
Computer Society.
(DOI)
Virtual computing is an entirely new form of supercomputing that
allows an algorithm to be implemented in hardware. Based on the Xilinx FPGA
and ICube's FPID the Virtual Computer is completely reconfigurable in every
respect. Computing machines based on reconfigurable logic are hyper-scalable
meaning they scale up better than 1-1
- [Henkel et al.,
1993]
- R. Ernst, J. Henkel, and T. Benner.
Hardware-software cosynthesis for microcontrollers.
IEEE Design & Test of Computers, 10(4):64–75, 1993.
(DOI)
The authors present a software-oriented approach to
hardware-software partitioning which avoids restrictions on the software
semantics as well as an iterative partitioning process based on hardware
extraction controlled by a cost function. This process is used in Cosyma, an
experimental cosynthesis system for embedded controllers. As an example, the
extraction of coprocessors for loops is demonstrated. Results are presented
for several benchmark designs
- [Lee and Kalavade,
1993]
- A. Kalavade and E. A. Lee.
A hardware-software codesign methodology for DSP applications.
IEEE Design & Test of Computers, 10(3):16–28, 1993.
(DOI)
The authors describe a systematic, heterogeneous design methodology
using the Ptolemy framework for simulation, prototyping, and software
synthesis of systems containing a mixture of hardware and software
components. They focus on signal-processing systems in which the hardware
typically consists of custom data paths, finite-state machines (FSMs), glue
logic and programmable processors. The software is one or more embedded
programs running on the programmable components
- [Lok et al., 1993]
- W. Luk, V. Lok, and I. Page.
Hardware acceleration of divide-and-conquer paradigms: a case study.
In FPGAs for Custom Computing Machines, 1993., pages 192–201,
Napa, CA, 1993.
(DOI)
The authors describe a method for speeding up divide-and-conquer
algorithms with a hardware coprocessor, using sorting as an example. The
method employs a conventional processor for the 'divide' and 'merge' phases,
while the 'conquer' phase is handled by a purpose-built coprocessor. It is
shown how transformation techniques from the Ruby language can be adopted in
developing a family of systolic sorters, and how one of the resulting designs
is prototyped in eight FPGAs on a PC coprocessor board known as CHS2×4 from
Algotronix. The execution of the hardware unit is embedded in a sorting
program, with the PC host merging the sorted sequences from the hardware
sorter. The performance of this implementation is compared against various
sorting algorithms on a number of PC systems
- [Micheli and Gupta,
1993]
- R. K. Gupta and G. De Micheli.
Hardware-software cosynthesis for digital systems.
IEEE Design & Test of Computers, 10(3):29–41, 1993.
(DOI)
As system design grows increasingly complex, the use of predesigned
components, such as general-purpose microprocessors can simplify synthesized
hardware. While the problems in designing systems that contain processors and
application-specific integrated circuit chips are not new, computer-aided
synthesis of such heterogeneous or mixed systems poses unique problems. The
authors demonstrate the feasibility of synthesizing heterogeneous systems by
using timing constraints to delegate tasks between hardware and software so
that performance requirements can be met. System functionality is captured
using the HardwareC hardware description language. The synthesis of an
Ethernet-based network coprocessor is discussed as an
example
- [O'Neal, 1993]
- Michael B. O'Neal.
An empirical study of three common software complexity measures.
In SAC '93: Proceedings of the 1993 ACM/SIGAPP symposium on Applied
computing, pages 203–207, New York, NY, USA, 1993. ACM Press.
(DOI)
- [Sharma and Jain, 1993]
- Alok
Sharma and Rajiv Jain.
Estimating architectural resources and performance for high-level synthesis
applications.
In DAC '93: Proceedings of the 30th international conference on Design
automation, pages 355–360, New York, NY, USA, 1993. ACM Press.
(DOI)
- [Silverman and Athanas,
1993]
- P. M. Athanas and H. F. Silverman.
Processor reconfiguration through instruction-set metamorphosis.
Computer, 26(3):11–18, 1993.
(DOI)
The processor reconfiguration through instruction-set metamorphosis
(PRISM) general-purpose architecture, which speeds up computationally
intensive tasks by augmenting the core processor's functionality with new
operations, is described. The PRISM approach adapts the configuration and
fundamental operations of a core processing system to the computationally
intensive portions of a targeted application. PRISM-1, an initial prototype
system, is described, and experimental results that demonstrate the benefits
of the PRISM concept are presented
- [van Ierssel et al.,
1993]
- D. M. Lewis, M. H. van Ierssel, and D. H. Wong.
A field programmable accelerator for compiled-code applications.
In FPGAs for Custom Computing Machines, 1993., pages 60–67, Napa,
CA, 1993.
(DOI)
The paper describes a special purpose application accelerator using
field programmable gate arrays to accelerate a range of applications. The
accelerator is designed to support applications by allowing the user to
implement a processor with an instruction set designed for the specific
application being accelerated, using specialized instructions to implement
critical fragments of the application. A compiled-code software organization
is used to reduce overhead operations. A prototype has been built, and the
first application to be ported to it, logic simulation, is
underway
- [DeHon, 1994]
- André DeHon.
DPGA-coupled
microprocessors: Commodity ICs for the early 21st century.
In Duncan A. Buell and Kenneth L. Pocek, editors, IEEE Workshop on FPGAs
for Custom Computing Machines, pages 31–39, Los Alamitos, CA, 1994.
IEEE Computer Society Press.
- [Eldredge and Hutchings,
1994]
- J. G. Eldredge and B. L. Hutchings.
RRANN: The
run-time reconfiguration artificial neural network.
In Custom Integrated Circuits Conference, pages 77–80, San Diego,
CA, 1994.
- [Ellervee et al.,
1994]
- P. Ellervee, A. Jantsch, J. Öberg, A. Hemani, and
H. Tenhunen.
Exploring ASIC design space at system level with a neural network estimator.
In ASIC '94: Proceedings of the Seventh Annual IEEE International ASIC
Conference and Exhibit, pages 67–70, Campus IT University, Kista,
Sweden, September 1994.
(DOI)
Estimators are critical tools in carrying out architectural level
exploration of the design space. We present a novel approach to estimation
based on the multilayer perceptron which builds the estimation function
during the learning process and thus allows the description of arbitrary
complex functions. We also describe how the control data flow graph is
encoded for the neural network input and present results of the first
experiments made with realistic design examples
- [Jantsch et al., 1994]
- Axel
Jantsch, Peeter Ellervee, Johny Öberg, and Ahmed Hemani.
A case study on hardware/software partitioning.
In FCCM'94, Proceedings of the Workshop on FPGAs for Custom Computing
Machines, pages 111–118, Napa Valley, CA, 1994. IEEE Computer Society
Press.
(PDF)
(DOI)
We present an analysis of a fully automatic method to accelerate
standard software in C or C++ by use of field programmable gate arrays.
Traditional compiler techniques are applied to the hardware/software
partitioning problem and a compiler is linked to state of the art hardware
synthesis tools. Time critical regions are identified by means of profiling
and are automatically implemented in user programmable logic with high level
and logic synthesis design tools. The underlying architecture is an add-on
board with user programmable logic connected to a SPARC-based workstation via
the system bus. We present an analysis and case study of this method. Eight
programs are used as test cases and the data collected by applying this
method to programs is used to discuss potentials and limitations of this and
similar methods. We discuss architectural parameters, programming language
properties, and analysis techniques
- [Kalavade and Lee, 1994]
- Asawaree
Kalavade and Edward A. Lee.
A global criticality/local phase driven algorithm for the constrained
hardware/software partitioning problem.
In CODES '94: Proceedings of the 3rd international workshop on
Hardware/software co-design, pages 42–48, Los Alamitos, CA, USA,
1994. IEEE Computer Society Press.
(DOI)
- [Peng and Kuchcinski,
1994]
- Zebo Peng and Krzysztof Kuchcinski.
An algorithm for partitioning of application specific systems.
Technical Report R-94-01, Department of Computer and Information Science,
Linköping University, Linköping, Sweden, 1994.
Published in Proceedings of the European Conference on Design Automation
EDAC'93, Paris, France, February 22-25, 1993.
(PS)
- [Vahid et al.,
1994]
- Frank Vahid, Daniel D. Gajski, and Jie Gong.
A binary-constraint search algorithm for minimizing hardware during
hardware/software partitioning.
In EURO-DAC '94: Proceedings of the conference on European design
automation, pages 214–219, Los Alamitos, CA, USA, 1994. IEEE Computer
Society Press.
- [Kalavade and Lee,
1995]
- A. Kalavade and E. A. Lee.
The extended partitioning problem: hardware/software mapping and
implementation-bin selection.
In RSP '95: Proceedings of the Sixth IEEE International Workshop on Rapid
System Prototyping (RSP'95), page 12, Washington, DC, USA, 1995. IEEE
Computer Society.
- [Kifli et al.,
1995]
- A. Kifli, G. Goossens, and H. De Man.
A unified scheduling model for high-level synthesis and code generation.
In EDTC '95: Proceedings of the 1995 European conference on Design and
Test, page 234, Washington, DC, USA, 1995. IEEE Computer Society.
- [Lemoine and Merceron,
1995]
- E. Lemoine and D. Merceron.
Run time reconfiguration of FPGA for scanning genomic databases.
In FCCM '95: Proceedings of the IEEE Symposium on FPGA's for Custom
Computing Machines, page 90, Washington, DC, USA, 1995. IEEE Computer
Society.
- [Prather, 1995]
- Ronald E. Prather.
Design and analysis of hierarchical software metrics.
ACM Comput. Surv., 27(4):497–518, 1995.
(DOI)
- [Triantafyllos et al.,
1995]
- George Triantafyllos, Stamatis Vassiliadis, and Walid
Kobrosly.
On the prediction of computer implementation faults via static error prediction
models.
J. Syst. Softw., 28(2):129–142, 1995.
(DOI)
- [Vahid, 1995]
- Frank Vahid.
Procedure exlining: a transformation for improved system and behavioral
synthesis.
In ISSS '95: Proceedings of the 8th international symposium on System
synthesis, pages 84–89, New York, NY, USA, 1995. ACM Press.
(DOI)
- [Balboni et al.,
1996]
- A. Balboni, W. Fornaciari, and D. Sciuto.
Partitioning and exploration strategies in the TOSCA co-design flow.
In CODES '96: Proceedings of the 4th International Workshop on
Hardware/Software Co-Design, page 62, Washington, DC, USA, 1996. IEEE
Computer Society.
- [Ball and Larus, 1996]
- Thomas
Ball and James R. Larus.
Efficient path profiling.
In MICRO 29: Proceedings of the 29th annual ACM/IEEE international
symposium on Microarchitecture, pages 46–57, Washington, DC, USA,
1996. IEEE Computer Society.
- [dm Le Vahid, 1996]
- F. Vahid and Thuy dm Le.
Towards a model for hardware and software functional partitioning.
In Proceedings of the Fourth International Workshop on Hardware/Software
Co-Design (Codes/CASHE '96), pages 116–123, Pittsburgh, PA,
1996.
(DOI)
We describe a model that supports the functional partitioning of a
system-level functional specification among hardware and software components.
The model includes only the information needed by partitioning, and thus can
be communicated freely and generated automatically. Based on characteristics
of several real examples, we describe a technique for automatically
generating generic model instances, on which partitioning heuristics can be
applied and fairly compared. Such comparisons will become increasingly
important as research begins to focus on fast yet effective functional
partitioning techniques. We describe a set of tools for converting a
specification to the model, for generating generic model instances, and for
applying and comparing partitioning heuristics, available via ftp. Use of
these tools may greatly reduce duplicated efforts among researchers wishing
to investigate hardware/software partitioning heuristics
- [Landman, 1996]
- Paul Landman.
High-level power estimation.
In ISLPED '96: Proceedings of the 1996 international symposium on Low
power electronics and design, pages 29–35, Piscataway, NJ, USA, 1996.
IEEE Press.
- [Suzuki and
Sangiovanni-Vincentelli, 1996]
- Kei Suzuki and Alberto
Sangiovanni-Vincentelli.
Efficient software performance estimation methods for hardware/software
codesign.
In DAC '96: Proceedings of the 33rd annual conference on Design
automation, pages 605–610, New York, NY, USA, 1996. ACM Press.
(DOI)
- [Triantafyllos et al.,
1996]
- George Triantafyllos, Stamatis Vassiliadis, and José G.
Delgado-Frias.
Software metrics and microcode: a case study.
Journal of Software Maintenance, 8(3):199–224, 1996.
(DOI)
- [Ernst and Ye, 1997]
- R. Ernst and
W. Ye.
Embedded program timing analysis based on path clustering and architecture
classification.
In ICCAD '97: Proceedings of the 1997 IEEE/ACM international conference
on Computer-aided design, pages 598–604, Washington, DC, USA, 1997.
IEEE Computer Society.
(DOI)
- [Gubian et al.,
1997]
- W. Fornaciari, P. Gubian, D. Sciuto, and C. Silvano.
System-level power evaluation metrics.
In Innovative Systems in Silicon, 1997., pages 323–330, Austin,
TX, 1997.
(DOI)
High-level power estimation is a key issue for IC designers and
system engineers. The goal is to widely explore the architectural design
space and to compare alternative solutions, while maintaining an acceptable
accuracy and a competitive design time. In this paper, an approach is
proposed for evaluating the system-level power consumption of embedded
systems implemented by using VLSI circuits. Accurate and efficient early
power evaluation metrics have been defined to guide the system-level
partitioning phase of a more general HW/SW co-design approach for control
dominated embedded systems. The hardware and software contributions to the
power consumption at the system-level have been considered as well as the
contribution of the HW/SW communication
- [Hammel et al.,
1997]
- T. Bäck, U. Hammel, and H.-P. Schwefel.
Evolutionary computation: comments on the history and current state.
IEEE Transactions on Evolutionary Computation, 1(1):3–17, 1997.
(DOI)
Evolutionary computation has started to receive significant
attention during the last decade, although the origins can be traced back to
the late 1950's. This article surveys the history as well as the current
state of this rapidly growing field. We describe the purpose, the general
structure, and the working principles of different approaches, including
genetic algorithms (GA) (with links to genetic programming (GP) and
classifier systems (CS)), evolution strategies (ES), and evolutionary
programming (EP) by analysis and comparison of their most important
constituents (i.e. representations, variation operators, reproduction, and
selection mechanism). Finally, we give a brief overview on the manifold of
application domains, although this necessarily must remain
incomplete
- [Hartenstein et al.,
1997a]
- C. Hartenstein, R. Mencer, O. Morris, J. Palem, and
Ebeling.
Seeking solutions in configurable computing.
Computer, 30(12):38–43, 1997.
(DOI)
Configurable computing offers the potential of producing powerful
new computing systems. Will current research overcome the dearth of
commercial applicability to make such systems a reality? Unfortunately, no
system to date has yet proven attractive or competitive enough to establish a
commercial presence. We believe that ample opportunity exists for work in a
broad range of areas. In particular, the configurable computing community
should focus on refining the emerging architectures, producing more effective
software/hardware APIs, better tools for application development that
incorporate the models of hardware reconfiguration, and effective
benchmarking strategies
- [Hartenstein et al.,
1997b]
- R. Hartenstein, J. Becker, M. Herz, and U. Nageldinger.
Data scheduling in hardware/software co-design for field-programmable
accelerators.
In FPL '97: Proceedings of 7th International Workshop on Field
Programmable Logic, pages 294–303, University of Kaiserslautern,
Kaiserslautern, Germany, September 1997. Springer.
- [Long, 1997]
- J. Scott Long.
Regression Models for Categorical and Limited Dependent Variables
(Advanced Quantitative Techniques in the Social Sciences).
Sage Publications, March 1997.
In Regression Models for Categorical and Limited Dependent
Variables, J. Scott Long provides a well-written, comprehensive introduction
to statistical models for binary, ordinal, nominal, and limited dependent
variables. The book would serve equally well as an addition to a social
scientist's quantitative library or as a text for a graduate level statistics
course. The major strength of the book is its emphasis on
interpretation.
- [Nemani and Najm,
1997]
- Mahadevamurty Nemani and Farid N. Najm.
High-level area and power estimation for VLSI circuits.
In ICCAD '97: Proceedings of the 1997 IEEE/ACM international conference
on Computer-aided design, pages 114–119, Washington, DC, USA, 1997.
IEEE Computer Society.
- [Saha et al., 1997]
- D. Saha,
A. Basu, and R. S. Mitra.
Hardware software partitioning using genetic algorithm.
In VLSID '97: Proceedings of the Tenth International Conference on VLSI
Design: VLSI in Multimedia Applications, page 155, Washington, DC,
USA, 1997. IEEE Computer Society.
- [Salchak and Chawla,
1997]
- P. W. Salchak and P. Chawla.
Supporting hardware trade analysis and cost estimation using design complexity.
In VIUF '97: Proceedings of the VHDL International Users' Forum,
pages 126–133, Beavercreek, OH, USA, October 1997.
(DOI)
Defines and illustrates a hardware design complexity measure (HDCM)
and describes its potential applications to trade-off analysis and cost
estimation. Specifically, we define a VHDL complexity measure. We have
derived the HDCM from an avionics software design complexity measure (ASDCM)
that we have shown to be effective in estimation and optimization of overall
software costs. Similar to the ASDCM, we believe that the proposed HDCM could
enable more optimal hardware design, implementation and
maintenance
- [Wawrzynek and Hauser,
1997]
- J. R. Hauser and J. Wawrzynek.
Garp: a MIPS processor with a reconfigurable coprocessor.
In FPGAs for Custom Computing Machines, 1997., pages 12–21, Napa
Valley, CA, 1997.
(DOI)
Typical reconfigurable machines exhibit shortcomings that make them
less than ideal for general-purpose computing. The Garp Architecture combines
reconfigurable hardware with a standard MIPS processor on the same die to
retain the better features of both. Novel aspects of the architecture are
presented, as well as a prototype software environment and preliminary
performance results. Compared to an UltraSPARC, a Garp of similar technology
could achieve speedups ranging from a factor of 2 to as high as a factor of
24 for some useful applications
- [Brandolese, 1998]
- Carlo
Brandolese.
System-level performance estimation strategy for SW and HW.
In ICCD '98: Proceedings of the International Conference on Computer
Design, page 48, Washington, DC, USA, 1998. IEEE Computer
Society.
- [Henkel and Ernst,
1998]
- Jörg Henkel and Rolf L. Ernst.
High-level estimation techniques for usage in hardware/software co-design.
In ASPDAC '98: Proceedings of the ASP-DAC'98. Asia and South Pacific
Design Automation Conference, pages 353–360, Princeton, NJ, USA,
February 1998.
(DOI)
High-level estimation techniques are of paramount importance for
design decisions like hardware/software partitioning or design space
explorations. In both cases an appropriate compromise between accuracy and
computation time determines the feasibility of those estimation
techniques. In this paper we present high-level estimation techniques for
hardware effort and hardware/software communication time. Our techniques
deliver fast results at sufficient accuracy. Furthermore, it is shown in
which way these techniques are applied in order to cope with contradictory
design goals like performance constraints and hardware effort constraints. As
a solution, we present a cost function for the purpose of hardware/software
partitioning that offers a dynamic weighting of its components. The conducted
experiments show that the usage of our estimation techniques in conjunction
with their efficient combination leads to reasonable hardware/software
implementations as opposed to approaches that consider single constraints
only
- [Henkel and Li,
1998]
- Jörg Henkel and Yanbing Li.
Energy-conscious HW/SW-partitioning of embedded systems: a case study on an
MPEG-2 encoder.
In CODES/CASHE '98: Proceedings of the 6th international workshop on
Hardware/software codesign, pages 23–27, Washington, DC, USA, 1998.
IEEE Computer Society.
(PDF)
(DOI)
Energy dissipation is a hot topic in the design of embedded systems,
especially mobile ones. This is because applications like digital video
cameras, cellular phones etc. draw their current from batteries that spend a
limited amount of energy only. In this paper we show that energy-conscious
HW/SW-partitioning can lead to drastic reductions of energy dissipation of a
whole embedded system. Subject of investigation is an MPEG-2 encoder.
Therefore, we introduce our framework for estimating and optimizing system
energy as well as all conducted design steps. The obtained results show
energy savings of up to 59% while the performance remains approximately the same
or becomes even slightly higher. As a main result, energy-conscious
HW/SW-partitioning is a promising method to be deployed in addition to
classical energy and/or power reduction methods
- [Khouri et al.,
1998]
- Kamal S. Khouri, Ganesh Lakshminarayana, and Niraj K. Jha.
Fast high-level power estimation for control-flow intensive design.
In ISLPED '98: Proceedings of the 1998 international symposium on Low
power electronics and design, pages 299–304, New York, NY, USA, 1998.
ACM Press.
(DOI)
- [Kokol and Brest, 1998]
- Peter
Kokol and Janez Brest.
Fractal structure of random programs.
SIGPLAN Notices, 33(6):33–38, 1998.
- [Maestro et al.,
1998]
- J. A. Maestro, D. Mozos, and H. Mecha.
A macroscopic time and cost estimation model allowing task parallelism and
hardware sharing for the codesign partitioning process.
In DATE '98: Proceedings of the Design, Automation and Test in Europe
Conference, pages 218–225, Madrid, Spain, February 1998.
(DOI)
This paper describes a method to estimate the implementation cost
of the hardware part in a mixed hardware/software system, as well as the
related performance. These estimations try to avoid the use of many
implementation details in order to keep the complexity order of the process
under control. The concepts of hardware sharing and parallelism are exploited
to make a picture of the whole hardware cost associated with a given
partition
- [Séméria and Micheli,
1998]
- Luc Séméria and Giovanni De Micheli.
SpC: synthesis of pointers in C: application of pointer analysis to the
behavioral synthesis from C.
In ICCAD '98: Proceedings of the 1998 IEEE/ACM international conference
on Computer-aided design, pages 340–346, New York, NY, USA, 1998. ACM
Press.
(DOI)
- [Stone and Gokhale, 1998]
- M. B.
Stone and J. M. Gokhale.
NAPA C: compiling for a hybrid RISC/FPGA architecture.
In FPGAs for Custom Computing Machines, 1998., pages 126–135,
Napa Valley, CA, 1998.
(DOI)
Hybrid architectures combining conventional processors with
configurable logic resources enable efficient coordination of control with
datapath computation. With integration of the two components on a single
device, loop control and data-dependent branching can be handled by the
conventional processor, while regular datapath computation occurs on the
configurable hardware. This paper describes a novel pragma-based approach to
programming such hybrid devices. The NAPA C language provides pragma
directives so that the programmer (or an automatic partitioner) can specify
where data is to reside and where computation is to occur with
statement-level granularity. The NAPA C compiler, targeting National
Semiconductor's NAPA1000 chip, performs semantic analysis of the
pragma-annotated program and co-synthesizes a conventional program executable
combined with a configuration bit stream for the adaptive logic. Compiler
optimizations include synthesis of hardware pipelines from pipelineable
loops
- [Arnout, 1999]
- Guido Arnout.
C for system level design.
In DATE '99: Proceedings of the conference on Design, automation and test
in Europe, page 81, New York, NY, USA, 1999. ACM Press.
(DOI)
- [Dave, 1999]
- Bharat P. Dave.
CRUSADE: hardware/software co-synthesis of dynamically reconfigurable
heterogeneous real-time distributed embedded systems.
In DATE '99: Proceedings of the conference on Design, automation and test
in Europe, page 22, New York, NY, USA, 1999. ACM Press.
(DOI)
- [Dave et al.,
1999]
- Bharat P. Dave, Ganesh Lakshminarayana, and Niraj K. Jha.
COSYN: hardware-software co-synthesis of heterogeneous distributed embedded
systems.
IEEE Trans. Very Large Scale Integr. Syst., 7(1):92–104, 1999.
(DOI)
Hardware-software co-synthesis starts with an embedded-system
specification and results in an architecture consisting of hardware and
software modules to meet performance, power, and cost goals. Embedded systems
are generally specified in terms of a set of acyclic task graphs. In this
paper, we present a co-synthesis algorithm COSYN, which starts with periodic
task graphs with real-time constraints and produces a low-cost heterogeneous
distributed embedded-system architecture meeting these constraints. It
supports both concurrent and sequential modes of communication and
computation. It employs a combination of preemptive and nonpreemptive static
scheduling. It allows task graphs in which different tasks have different
deadlines. It introduces the concept of an association array to tackle the
problem of multirate systems. It uses a new task-clustering technique, which
takes the changing nature of the critical path in the task graph into
account. It supports pipelining of task graphs and a mix of various
technologies to meet embedded-system constraints and minimize power
dissipation. In general, embedded-system tasks are reused across multiple
functions. COSYN uses the concept of architectural hints and reuse to exploit
this fact. Finally, if desired, it also optimizes the architecture for power
consumption. COSYN produces optimal results for the examples from the
literature while providing several orders of magnitude advantage in central
processing unit time over an existing optimal algorithm. The efficacy of
COSYN and its low-power extension COSYN-LP is also established through their
application to very large task graphs (with over 1000
tasks)
- [Hammes et al.,
1999]
- J. Hammes, R. Rinker, W. Böhm, and W. Najjar.
Compiling a high-level language to reconfigurable systems.
In Compiler and Architecture Support for Embedded Systems Conference (CASES
'99), October 1999.
- [Haruyama and Cummings,
1999]
- M. Haruyama and S. Cummings.
FPGA in the software radio.
IEEE Communications Magazine, 37(2):108–112, 1999.
(DOI)
As new radio standards are deployed without substantially
supplanting existing ones, the need for multimode multiband handsets and
infrastructure increases. This article describes how emerging FPGA
technology's unique combination of size and power efficiency plus field
programmability offers a transition of FPGAs from ASIC prototyping to
embedded products. Software-defined receiver examples suggest an enlarged
role for FPGAs in pragmatic paths toward the productization of software radio
technology
- [Micheli, 1999]
- Giovanni De Micheli.
Hardware synthesis from c/c++ models.
In DATE '99: Proceedings of the conference on Design, automation and test
in Europe, page 80, New York, NY, USA, 1999. ACM Press.
(DOI)
- [Shenoy et al.,
1999]
- U. Nagaraj Shenoy, Alok Choudhary, and Prithviraj Banerjee.
Symphany: A tool for automatic synthesis of parallel heterogeneous
adaptive systems.
Technical Report CPDC-TR-9903-002, Center for Parallel and Distributed
Computing, Northwestern University, Evanston, IL, USA, March 1999.
- [Shin, 1999]
- Tae-Woo Kim and Hyunchul Shin.
Hardware cost estimation techniques for C-level description.
In VLSI and CAD, 1999. ICVC '99. 6th International, pages
85–88, Seoul, 1999.
(DOI)
Recent trends in the hardware/software codesign and architectural
exploration bring us the need to develop sophisticated high-level estimation
tools. This paper describes hardware cost estimation techniques for
descriptions written in C language. This approach estimates the area and
performance of the system described in standard ANSI C language to be
implemented in hardware. Experimental results show that this approach has
some errors but gives the designer useful information concerning the hardware
for architectural exploration and hardware/software partitioning in
high-level codesign
- [Walker and Blythe,
1999]
- S. A. Walker and R. A. Blythe.
Efficiently searching the optimal design space.
In VLSI, 1999. Proceedings. Ninth Great Lakes, pages 192–195,
Ypsilanti, MI, 1999.
(DOI)
One of the primary advantages of a high-level synthesis system is
its ability to explore the design space. This paper presents several
methodologies for design space exploration that compute all optimal tradeoff
points for the combined problem of scheduling, clock length determination,
and module selection. We discuss how each methodology takes advantage of both
the structure within the design space itself as well as the structure of, and
interaction between, each of the three subproblems
- [Aigner et al., 2000]
- Gerald
Aigner, Amer Diwan, David L. Heine, Monica S. Lam, David L. Moore, Brian R.
Murphy, and Constantine Sapuntzakis.
An Overview of the SUIF2 Compiler Infrastructure.
Stanford University, Stanford, CA, USA, 2.2.0-4 edition, 2000.
(PS)
- [Arnold et al.,
2000]
- Matthew Arnold, Stephen Fink, Vivek Sarkar, and Peter F.
Sweeney.
A comparative study of static and profile-based heuristics for inlining.
In DYNAMO '00: Proceedings of the ACM SIGPLAN workshop on Dynamic and
adaptive compilation and optimization, pages 52–64, New York, NY,
USA, 2000. ACM Press.
(DOI)
- [Bammi et al.,
2000]
- Jwahar R. Bammi, Wido Kruijtzer, Luciano Lavagno, Edwin
Harcourt, and Mihai T. Lazarescu.
Software performance estimation strategies in a system-level design tool.
In CODES '00: Proceedings of the eighth international workshop on
Hardware/software codesign, pages 82–86, New York, NY, USA, 2000. ACM
Press.
(DOI)
- [Banerjee et al.,
2000]
- P. Banerjee, N. Shenoy, A. Choudhary, S. Hauck,
C. Bachmann, M. Haldar, P. Joisha, A. Jones, A. Kanhare, A. Nayak,
S. Periyacheri, M. Walkden, and D. Zaretsky.
A MATLAB compiler for distributed, heterogeneous, reconfigurable computing
systems.
In FCCM '00: Proceedings of the 2000 IEEE Symposium on Field-Programmable
Custom Computing Machines, page 39, Washington, DC, USA, 2000. IEEE
Computer Society.
- [Bilavarn et al.,
2000]
- S. Bilavarn, G. Gogniat, and J. Philippe.
Area time power
estimation for FPGA based designs at a behavioral level.
In ICECS'2K, Kaslik, Lebanon, December 2000.
- [Brandolese et al.,
2000]
- C. Brandolese, W. Fornaciari, L. Pomante, F. Salice, and
D. Sciuto.
A multi-level strategy for software power estimation.
In ISSS '00: Proceedings of the 13th international symposium on System
synthesis, pages 187–192, Washington, DC, USA, 2000. IEEE Computer
Society.
(DOI)
- [Buracchini, 2000]
- Enrico
Buracchini.
The software radio concept.
IEEE Communications Magazine, 38(9):138–143, September 2000.
- [Diguet et al., 2000]
- J.-P.
Diguet, G. Gogniat, P. Danielo, J.-L. Philippe, and M. Auguin.
System specification with the spf model.
In FDL '00: Proceedings of the Forum on Design Languages, Lester,
UBS University, Lorient, France, September 2000. ECSI Association.
Session 3.4.
- [Hosemann et al.,
2000]
- S. Hosemann, M. Reed, J. H. Athanas, and P. Srikanteswara.
Design and implementation of a completely reconfigurable soft radio.
In Radio and Wireless Conference, 2000. RAWCON, pages 7–11,
Denver, CO, 2000.
(DOI)
The advances in reconfigurable computing have now made it possible
to implement the concept of hardware paging, which has the potential to
greatly advance the design of soft radios. While many soft/software radio
architectures have been suggested and implemented there remains a lack of a
formal design methodology that can be used to design and implement these
radios on reconfigurable platforms that exploit the latest inventions. This
paper presents a unified architecture, called the layered radio architecture,
for design of soft radios on a reconfigurable platform. Using the assumptions
of the availability of run-time reconfigurable hardware and the use of
stream-based computing, the layered radio architecture defines a soft radio
architecture that is scalable in hardware and software, flexible, and capable
of supporting multi-mode radios along with over-the-air updates and software
validation
- [Khoshgoftaar et al.,
2000]
- Taghi M. Khoshgoftaar, Edward B. Allen, Wendell D. Jones,
and John P. Hudepohl.
Accuracy of software quality models over multiple releases.
Ann. Softw. Eng., 9(1-4):103–116, 2000.
- [Lay, 2000]
- David C. Lay.
Linear Algebra and its Applications.
Addison Wesley Longman, Inc., Boston, MA, USA, second edition, 2000.
- [Li et al.,
2000]
- Yanbing Li, Tim Callahan, Ervan Darnell, Randolph Harr,
Uday Kurkure, and Jon Stockwood.
Hardware-software co-design of embedded reconfigurable architectures.
In DAC '00: Proceedings of the 37th conference on Design
automation, pages 507–512, New York, NY, USA, 2000. ACM Press.
(DOI)
- [Séméria et al.,
2000]
- Luc Séméria, K. Sato, and Giovanni De Micheli.
Resolution of dynamic memory allocation and pointers for the behavioral
synthesis from C.
In DATE '00: Proceedings of the Design, Automation and Test in Europe
Conference and Exhibition, pages 312–319, Stanford University, CA,
USA, March 2000.
(DOI)
One of the greatest challenges in C/C++-based design methodology is
to efficiently map C/C++ models into hardware. Many of the networking and
multimedia applications implemented in hardware or mixed hardware/software
systems are making use of complex data structures stored in one or multiple
memories. As a result, many of the C/C++ features which were originally
designed for software applications are now making their way into hardware.
Such features include dynamic memory allocation and pointers used to manage
data. We present a solution for efficiently mapping arbitrary C code with
pointers and malloc/free into hardware. Our solution fits current memory
management methodologies. It consists of instantiating a hardware allocator
tailored to an application and a memory architecture. Our work also supports
the resolution of pointers without restriction on the data structures. An
implementation using the SUIF framework is presented, followed by some case
studies such as the realization of a video filter
- [Ye et al., 2000]
- Zhi Alex Ye, Nagaraj
Shenoy, and Prithviraj Banerjee.
A C compiler for a processor with a reconfigurable functional unit.
In FPGA '00: Proceedings of the 2000 ACM/SIGDA eighth international
symposium on Field programmable gate arrays, pages 95–100, New York,
NY, USA, 2000. ACM Press.
(DOI)
- [Bohm et al., 2001]
- A. P.
Bohm, B. Draper, W. Najjar, J. Hammes, R. Rinker, M. Chawathe, and C. Ross.
One-step compilation of image processing applications to FPGAs.
In FCCM '01: Proceedings of the 9th Annual IEEE Symposium on
Field-Programmable Custom Computing Machines, pages 209–218, Colorado
State University, CO, USA, May 2001. IEEE Computer Society.
(DOI)
- [Brandolese et al.,
2001]
- C. Brandolese, W. Fornaciari, F. Salice, and D. Sciuto.
Source-level execution time estimation of C programs.
In CODES '01: Proceedings of the ninth international symposium on
Hardware/software codesign, pages 98–103, New York, NY, USA, 2001.
ACM Press.
(DOI)
- [Chilimbi, 2001]
- Trishul M.
Chilimbi.
Efficient representations and abstractions for quantifying and exploiting data
reference locality.
In PLDI '01: Proceedings of the ACM SIGPLAN 2001 conference on
Programming language design and implementation, pages 191–202, New
York, NY, USA, 2001. ACM Press.
(DOI)
- [Haldar et al., 2001]
- Malay
Haldar, Anshuman Nayak, Alok Choudhary, Prith Banerjee, and Nagaraj Shenoy.
FPGA hardware synthesis from MATLAB.
In VLSID '01: Proceedings of the 14th International Conference on VLSI
Design (VLSID '01), page 299, Washington, DC, USA, 2001. IEEE Computer
Society.
- [Hirzel and Chilimbi,
2001]
- M. Hirzel and T. Chilimbi.
Bursty tracing: A
framework for low-overhead temporal profiling.
In 4th ACM Workshop on Feedback-Directed and Dynamic Optimization (FDDO-4),
December 2001.
- [Power et al.,
2001]
- J. Power, J. Waldron, and J. Horgan.
Measurement and analysis of runtime profiling data for Java programs.
In Source Code Analysis and Manipulation, 2001., pages 122–130,
Florence, 2001.
(DOI)
The authors examine a procedure for the analysis of data produced
by the dynamic profiling of Java programs. In particular, we describe the
issues involved in dynamic analysis, propose a metric for discrimination
between the resulting data sets, and examine its application over different
test suites and compilers
- [Srinivasan et al.,
2001]
- V. Srinivasan, S. Govindarajan, and R. Vemuri.
Fine-grained and coarse-grained behavioral partitioning with effective
utilization of memory and design space exploration for multi-FPGA
architectures.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
9(1):140–158, February 2001.
(DOI)
Reconfigurable computers (RCs) host multiple field programmable
gate arrays (FPGAs) and one or more physical memories that communicate
through an interconnection fabric. State-of-the-art RCs provide abundant
hardware and storage resources, but have tight constraints on FPGA pin-out
and inter-FPGA interconnection resources. These stringent constraints are the
primary impediment for multi-FPGA partitioning tools to generate high-quality
designs. In this paper, we present two integrated partitioning and synthesis
approaches for RCs. The first approach involves fine-grained partitioning of
a scheduled data-flow graph (DFG, or an operation graph), and the second
involves a coarse-grained partitioning of an unscheduled control data flow
graph (CDFG, or a block graph). A hardware design space exploration engine is
integrated with the block graph partitioner that dynamically contemplates
multiple schedules during partitioning. The novel feature in the partitioning
approaches is that the physical memory in the RC is effectively used to
alleviate the FPGA pin-out and inter-FPGA interconnection bottleneck.
Several experiments have been conducted, targeting commercial multi-FPGA
boards, to compare the two partitioning approaches, and detailed summaries
are presented
- [Villarreal et al.,
2001]
- Jason R. Villarreal, R Lysecky, S Cotterell, and F. Vahid.
A study on the loop behavior of embedded programs.
Technical Report UCR-CSE-01-03, University of California, Riverside,
Riverside, CA, USA, 2001.
(PDF)
- [Bjuréus et al.,
2002]
- Per Bjuréus, Mikael Millberg, and Axel Jantsch.
FPGA resource and timing estimation from MATLAB execution traces.
In CODES '02: Proceedings of the tenth international symposium on
Hardware/software codesign, pages 31–36, New York, NY, USA, 2002. ACM
Press.
(DOI)
- [Bodik et al., 2002]
- B. Bodik,
R. Hill, and M. D. Fields.
Slack: maximizing performance under technological constraints.
In Computer Architecture, 2002. Proceedings. 29th, pages 47–58,
Anchorage, AK, 2002.
(DOI)
Many emerging processor microarchitectures seek to manage
technological constraints (e.g., wire delay, power, and circuit complexity)
by resorting to non-uniform designs that provide resources at multiple
quality levels (e.g., fast/slow bypass paths, multi-speed functional units,
and grid architectures). In such designs, the constraint problem becomes a
control problem, and the challenge becomes designing a control policy that
mitigates the performance penalty of the non-uniformity. Given the increasing
importance of non-uniform control policies, we believe it is appropriate to
examine them in their own right. To this end, we develop slack for use in
creating control policies that match program execution behavior to machine
design. Intuitively, the slack of a dynamic instruction i is the number of
cycles i can be delayed with no effect on execution time. This property makes
slack a natural candidate for hiding non-uniform latencies. We make three
contributions in our exploration of slack. First, we formally define slack,
distinguish three variants (local, global and apportioned), and perform a
limit study to show that slack is prevalent in our SPEC2000 workload. Second,
we show how to predict slack in hardware. Third, we illustrate how to create
a control policy based on slack for steering instructions among fast (high
power) and slow (lower power) pipelines
- [Cotterell and Hughes,
2002]
- Mike Cotterell and Bob Hughes.
Software project management.
McGraw-Hill Publishing Company, Berkshire, England, UK, third edition,
2002.
- [Cousot and Cousot,
2002]
- Patrick Cousot and Radhia Cousot.
Modular static program analysis.
In CC '02: Proceedings of the 11th International Conference on Compiler
Construction, pages 159–178, London, UK, 2002. Springer-Verlag.
- [Dasu and Panchanathan,
2002]
- Aravind Dasu and Sethuraman Panchanathan.
Reconfigurable media processing.
Parallel Comput., 28(7-8):1111–1139, 2002.
(DOI)
- [Haggard and L., 2002]
- Jie Chen and
R. L. Haggard.
Extraction of parallel hardware during C to VHDL translation.
In System Theory, 2002. Proceedings of the, pages 334–338, 2002.
(DOI)
Translating C/C++ language into VHDL is an important step in
synthesizing hardware from C/C++. However, there is no explicit facility in
the general C/C++ language to declare concurrent parallel execution which is
a critical characteristic of hardware systems. This paper presents the
outline of a set of transformation algorithms. These algorithms are helpful
in the process of extracting parallel hardware during C to VHDL translation.
An example of extracting parallel hardware from an array addition routine
written in C is also presented in this paper.
- [Jones et al., 2002]
- Alex Jones,
Debabrata Bagchi, Sartajit Pal, Prith Banerjee, and Alok Choudhary.
PACT HDL: a compiler targeting ASICs and FPGAs with power and performance
optimizations, pages 169–190.
Series in Computer Science. Kluwer Academic Publishers, Norwell, MA, USA,
2002.
- [Kulkarni et al.,
2002]
- Dhananjay Kulkarni, Walid A. Najjar, Robert Rinker, and
Fadi J. Kurdahi.
Fast area estimation to support compiler optimizations in FPGA-based
reconfigurable systems.
In FCCM '02: Proceedings of the 10th Annual IEEE Symposium on
Field-Programmable Custom Computing Machines, page 239, Washington,
DC, USA, 2002. IEEE Computer Society.
- [Maas et al.,
2002]
- Elmar Maas, Dirk Herrmann, Rolf Ernst, Peter Rüffer,
Sieghard Hasenzahl, and Martin Seitz.
A processor-coprocessor architecture for high end video
applications, pages 688–691.
The Morgan Kaufmann Systems On Silicon Series. Kluwer Academic Publishers,
Norwell, MA, USA, 2002.
- [McPeak, 2002]
- Scott McPeak.
Elkhound: A fast, practical GLR parser generator.
Technical Report UCB/CSD-2-1214, University of California, Berkeley,
Berkeley, CA, USA, December 2002.
- [Munson, 2002]
- John C. Munson.
Software Engineering Measurement.
CRC Press, Inc., Boca Raton, FL, USA, 2002.
- [Nayak et al.,
2002]
- A. Nayak, M. Haldar, A. Choudhary, and P. Banerjee.
Accurate area and delay estimators for FPGAs.
In DATE '02: Proceedings of the conference on Design, automation and test
in Europe, page 862, Washington, DC, USA, 2002. IEEE Computer
Society.
- [Rubin et al., 2002]
- Shai
Rubin, Rastislav Bodík, and Trishul Chilimbi.
An efficient profile-analysis framework for data-layout optimizations.
In POPL '02: Proceedings of the 29th ACM SIGPLAN-SIGACT symposium on
Principles of programming languages, pages 140–153, New York, NY,
USA, 2002. ACM Press.
(DOI)
- [Shang and Jha,
2002]
- Li Shang and Niraj K. Jha.
Hardware-software co-synthesis of low power real-time distributed embedded
systems with dynamically reconfigurable FPGAs.
In ASP-DAC '02: Proceedings of the 2002 conference on Asia South Pacific
design automation/VLSI Design, page 345, Washington, DC, USA, 2002.
IEEE Computer Society.
- [Sima et al.,
2002]
- Mihai Sima, Stamatis Vassiliadis, Sorin Cotofana, Jos T. J.
van Eijndhoven, and Kees A. Vissers.
Field-programmable custom computing machines - a taxonomy -.
In FPL '02: Proceedings of the Reconfigurable Computing Is Going
Mainstream, 12th International Conference on Field-Programmable Logic and
Applications, pages 79–88, London, UK, 2002. Springer-Verlag.
- [Vahid and Gajski,
2002]
- Frank Vahid and Daniel D. Gajski.
Incremental hardware estimation during hardware/software functional
partitioning.
In Readings in hardware/software co-design, pages 516–521. Kluwer
Academic Publishers, Norwell, MA, USA, 2002.
- [Villarreal et al.,
2002]
- Jason R. Villarreal, D. Suresh, G. Stitt, F. Vahid, and
W. Najjar.
Improving software performance with configurable logic.
Journal on Design Automation of Embedded Systems, 7(4):325–339,
2002.
(DOI)
- [Zhu, 2002]
- Jianwen Zhu.
Symbolic pointer analysis.
In ICCAD '02: Proceedings of the 2002 IEEE/ACM international conference
on Computer-aided design, pages 150–157, New York, NY, USA, 2002. ACM
Press.
(DOI)
- [Becker and Hartenstein,
2003]
- Jürgen Becker and Reiner Hartenstein.
Configware and morphware going mainstream.
J. Syst. Archit., 49(4-6):127–142, 2003.
(DOI)
- [Bhasyam and Bazargan,
2003]
- Karthikeyan Bhasyam and Kia Bazargan.
HW/SW codesign incorporating edge delays using dynamic programming.
In DSD '03: Proceedings of the Euromicro Symposium on Digital Systems
Design, page 264, Washington, DC, USA, 2003. IEEE Computer
Society.
- [Catthoor et al., 2003]
- P. G.
Catthoor, F. Aas, and E. J. Kjeldsberg.
Data dependency size estimation for use in memory optimization.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and
Systems, 22(7):908–921, 2003.
(DOI)
A novel storage requirement estimation methodology is presented for
use in the early system design phases when the data transfer ordering is only
partly fixed. At that stage, none of the existing estimation tools are
adequate, as they either assume a fully specified execution order or ignore
it completely. This paper presents an algorithm for automated estimation of
strict upper and lower bounds on the individual data dependency sizes in
high-level application code given a partially fixed execution ordering. In
the overall estimation technique, this is followed by a detection of the
maximally combined size of simultaneously alive dependencies, resulting in
the overall storage requirement of the application. Using representative
application demonstrators, we show how our techniques can effectively guide
the designer to achieve a transformed specification with low storage
requirement.
- [Fornaciari et al.,
2003]
- W. Fornaciari, F. Salice, and D. P. Scarpazza.
Early estimation of the size of VHDL projects.
In Proceedings of the first IEEE/ACM/IFIP Intl. Conf. on
Hardware/Software Codesign and System Synthesis, pages 207–212,
Milano, Italy, 2003. Politecnico di Milano.
(PDF)
(DOI)
- [Guo et al.,
2003]
- Z. Guo, D. C. Suresh, and W. A. Najjar.
Programmability and efficiency in reconfigurable computer systems.
In Workshop on Software Support for Reconfigurable Systems, held in conjunction
with the 9th Int. Conf. of High-Performance Computer Architecture, Anaheim,
CA, February 2003.
(PDF)
- [Gupta et al., 2003]
- Sumit Gupta,
Nikil Dutt, Rajesh Gupta, and Alex Nicolau.
SPARK: A high-level synthesis framework for applying parallelizing compiler
transformations.
vlsid, 00:461, 2003.
(DOI)
This paper presents a modular and extensible high-level synthesis
research system, called SPARK, that takes a behavioral description in ANSI-C
as input and produces synthesizable register-transfer level VHDL. SPARK uses
parallelizing compiler technology, developed previously, to enhance
instruction-level parallelism and re-instruments it for high-level synthesis
by incorporating ideas of mutual exclusivity of operations, resource sharing
and hardware cost models. In this paper, we present the design flow through
the SPARK system, a set of transformations that include speculative code
motions and dynamic transformations and show how these transformations and
other optimizing synthesis and compiler techniques are employed by a
scheduling heuristic. Experiments are performed on two moderately complex
industrial applications, namely MPEG-1 and the GIMP image processing tool.
The results show that the various code transformations lead to up to 70%
improvement in performance without any increase in the overall area and
critical path of the final synthesized design.
- [Jha and Vallerio, 2003]
- K. S. Jha
and N. K. Vallerio.
Task graph extraction for embedded system synthesis.
In VLSI Design, 2003. Proceedings. 16th, pages 480–486, 2003.
(DOI)
Consumer demand and improvements in hardware have caused
distributed real-time embedded systems to rapidly increase in complexity. As
a result, designers faced with time-to-market constraints are forced to rely
on intelligent design tools to enable them to keep up with demand. These
tools are continually being used earlier in the design process when the
design is at higher levels of abstraction. At the highest level of
abstraction are hardware/software co-synthesis tools which take a system
specification as input. Although many embedded systems are described in C,
the system specifications for many of these tools are often in the form of
one or more task graphs. These tools are very effective at solving the
co-synthesis problem using task graphs but require that designers manually
transform the specification from C code to task graphs, a tedious and
error-prone job. The task graph extraction tool described in this paper
reduces the potential for error and the time required to design an embedded
system by automating the task graph extraction process. Such a tool can
drastically improve designer productivity. As far as we know, this is the
first tool of its kind. It has been made available on the
web.
- [Kaplan et al., 2003]
- Adam
Kaplan, Philip Brisk, and Ryan Kastner.
Data communication estimation and reduction for reconfigurable systems.
In DAC '03: Proceedings of the 40th conference on Design
automation, pages 616–621, New York, NY, USA, 2003. ACM Press.
(DOI)
- [Levine and Schmit,
2003]
- Benjamin A. Levine and Herman H. Schmit.
Efficient application representation for HASTE: Hybrid architectures with a
single, transformable executable.
In FCCM '03: Proceedings of the 11th Annual IEEE Symposium on
Field-Programmable Custom Computing Machines, page 101, Washington,
DC, USA, 2003. IEEE Computer Society.
- [Merrill, 2003]
- Jason Merrill.
GENERIC and GIMPLE: A new tree representation for entire functions.
In Proceedings of the 2003 GCC Summit. Red Hat, Inc., 2003.
(PS)
- [Neto and Cardoso,
2003]
- J. M. P. Neto and H. C. Cardoso.
Compilation for FPGA-based reconfigurable hardware.
IEEE Design & Test of Computers, 20(2):65–75, 2003.
(DOI)
This paper provides techniques for compiling software programs into
reconfigurable hardware which offer faster and more efficient performance
than the complex resource-sharing approaches typical of high-level synthesis
systems. The Java-based compiler presented in this paper uses intermediate
graph representations to embody parallelism at various
levels.
- [Panainte et al.,
2003]
- Elena Moscu Panainte, Koen Bertels, and Stamatis
Vassiliadis.
Compiling for the Molen programming paradigm.
In Proceedings of the 13th International Conference on Field Programmable
Logic and Applications (FPL'03), pages 900–910, Delft, Netherlands,
September 2003.
- [Soininen and P.,
2003]
- Yang Qu and J. P. Soininen.
Estimating the utilization of embedded FPGA co-processor.
In Digital System Design, 2003. Proceedings., pages 214–221,
2003.
(DOI)
Embedded FPGA co-processors will bring new alternatives for SoC
system designers. Comparison of software implementations and reconfigurable
hardware implementations will need fast and easy-to-use estimation
techniques. In this paper, we present an estimation approach for the resource
utilization of the embedded FPGA co-processor. Our approach is based on the
principles of high-level synthesis, such as force-directed scheduling,
resource allocation, operation assignment and interconnection binding. The
method has been applied to simple test cases and a C-language model of MPEG-2
decoder. The average hardware estimation error of MPEG-2 functions was
25%.
- [Srikanteswara et al.,
2003]
- S. Srikanteswara, R. C. Palat, J. H. Reed, and P. Athanas.
An overview of configurable computing machines for software radio handsets.
IEEE Communications Magazine, 41(7):134–141, 2003.
(DOI)
The advent of software radios has brought a paradigm shift to radio
design. A multimode handset with dynamic reconfigurability has the promise of
integrated services and global roaming capabilities. However, most of the
work to date has been focused on software radio base stations, which do not
have as tight constraints on area and power as handsets. Base station
software radio technology progressed dramatically with advances in system
design, adaptive modulation and coding techniques, reconfigurable hardware,
A/D converters, RF design, and rapid prototyping systems, and has helped
bring software radio handsets a step closer to reality. However, supporting
multimode radios on a small handset still remains a design challenge. A
configurable computing machine, which is an optimized FPGA with
application-specific capabilities, shows promise for software radio handsets
in optimizing hardware implementations for heterogeneous systems. In this
article contemporary CCM architectures that allow dynamic hardware
reconfiguration with maximum flexibility are reviewed and assessed. This is
followed by design recommendations for CCM architectures for use in software
radio handsets.
- [Suresh et al.,
2003]
- Dinesh C. Suresh, Walid A. Najjar, Frank Vahid, Jason R.
Villarreal, and Greg Stitt.
Profiling tools for hardware/software partitioning of embedded applications.
SIGPLAN Not., 38(7):189–198, 2003.
(PDF)
(DOI)
- [Swahn and Hassoun,
2003]
- Brian Swahn and Soha Hassoun.
Hardware scheduling for dynamic adaptability using external profiling and
hardware threading.
In ICCAD '03: Proceedings of the 2003 IEEE/ACM international conference
on Computer-aided design, page 58, Washington, DC, USA, 2003. IEEE
Computer Society.
(DOI)
- [Vassiliadis et al.,
2003]
- Stamatis Vassiliadis, Georgi N. Gaydadjiev, Koen Bertels,
and Elena Moscu Panainte.
The Molen programming paradigm.
In Proceedings of the Third International Workshop on Systems,
Architectures, Modeling, and Simulation, pages 1–10, Delft,
Netherlands, July 2003.
- [Banerjee and Dutt,
2004]
- Sudarshan Banerjee and Nikil Dutt.
Very fast simulated annealing for hw-sw partitioning.
Technical Report UCI–CECS–04–18, University of California, Irvine, Irvine,
CA, USA, June 2004.
- [Cardoso and Diniz,
2004]
- João M. P. Cardoso and Pedro C. Diniz.
Modeling Loop Unrolling: Approaches and Open Issues, volume
3133/2004 of Lecture Notes in Computer Science, page 224.
Springer, July 2004.
(DOI)
- [Cherkaskyy,
2004]
- M. Cherkaskyy.
Theoretical fundamentals software/hardware algorithms.
In TCSET '04: Proceedings of the International Conference on Modern
Problems of Radio Engineering, Telecommunications and Computer
Science, pages 9–13, Lviv, Ukraine, February 2004.
(DOI)
- [Faruque et al., 2004]
- M.A. Al
Faruque, K. Karuri, S. Kowalewski, and R. Leupers.
Fine grained application profiling for guiding application specific instruction
set processors (ASIPs) design.
Master's thesis, Rheinisch-Westfälische Technische Hochschule, Aachen, Germany, 2004.
(PDF)
Current Application Specific Instruction set Processor (ASIP)
design methodologies are mostly based on iterative architecture exploration
that uses Architecture Description Languages (ADLs) and retargetable software
development tools. However, for improved design efficiency, additional
pre-architecture exploration tools are required to help narrow down the huge
design space and make coarse-grained Instruction Set Architecture (ISA)
decisions before detailed ADL modeling. Extensive application code profiling
is the key in such early design stages. Based on a novel code instrumentation
technology, we present a microprofiling approach that fills the current gap
between source-level and instruction-level profilers and combines their
advantages w.r.t. speed and accuracy. We show how the microprofiler is
embedded into an advanced ASIP design flow and justify its use in a case
study to design an MP3 decoder ASIP.
- [Mukherjee et al.,
2004]
- Rajarshi Mukherjee, Alex Jones, and Prith Banerjee.
Handling data streams while compiling C programs onto hardware.
In ISVLSI '04: Proceedings of the IEEE Computer Society Annual Symposium
on VLSI Emerging Trends in VLSI Systems Design, pages 271–272. IEEE
Computer Society, 2004.
(DOI)
- [Niehaus et al.,
2004]
- D. Niehaus, D. Ashenden, and P. Andrews.
Programming models for hybrid CPU/FPGA chips.
Computer, 37(1):118–120, 2004.
(DOI)
Designers of embedded and real-time systems are continually
challenged to meet tighter system requirements at better price-performance
ratios. Best-practice methods have long promoted the use of
commercial-off-the-shelf components to reduce design costs and time to
market, but creating COTS components that are reusable in a wide range of
applications remains difficult. In part, the challenge lies in satisfying the
contradictory design forces of generalization and specialization. Systems
designers are all too familiar with the tension these opposing forces cause
in trying to balance cost versus performance. Adopting COTS components
reduces costs and time to market but often fails to meet the most demanding
performance requirements; custom-designed components can achieve
significantly higher performance but at greater development costs and longer
times to market. Emerging hybrid chips containing both CPU and
field-programmable gate array (FPGA) components are an exciting new development.
They promise COTS economies of scale while also supporting significant
hardware customization. Components that combine a CPU and reconfigurable
logic gates need a programming model that abstracts the computational
hardware.
- [Panainte et al.,
2004]
- Elena Moscu Panainte, Koen Bertels, and Stamatis
Vassiliadis.
Multimedia reconfigurable hardware design space exploration.
In Proceedings of the 16th IASTED International Conference on Parallel
and Distributed Computing and Systems (PDCS 2004), pages 398–403,
Delft, Netherlands, November 2004. ACTA Press.
(PDF)
In this paper we consider a set of multimedia applications and
investigate the potential performance impact a reconfigurable microcoded
processor can provide when added to a general purpose core processor. In a
design space exploration, considering MPEG2 and JPEG benchmarks, we
investigate performance boundaries, memory bottlenecks and the influence the
core and reconfigurable processor communication has on performance. Under
some realistic scenarios and serial FPGA execution, it is shown that a 53%
cycle reduction is expected when comparing a design having a core processor
and a design when the core processor is augmented with a reconfigurable
microcoded engine. In addition, we have found that transferring parameters
between the core processor and the reconfigurable processor may not severely
influence the overall performance. Finally we investigated the memory
bandwidth for operations mapped automatically on FPGA. The case study
indicates that small latency DCT hardware design performs well when interfaced
with 512 bytes/cycle. Our studies also indicate that about 64 bytes/cycle
will support high speed execution for SAD and IDCT.
- [Sima, 2004]
- Mihai Sima.
The ρ–TriMedia Processor.
PhD thesis, Delft University of Technology, Delft, Netherlands, March
2004.
- [Strelzoff, 2004]
- Al Strelzoff.
Functional programming for reconfigurable computing.
In IPDPS '04: Proceedings of the 18th International Parallel and
Distributed Processing Symposium, San Jose, CA, USA, April 2004. IEEE
Computer Society.
(DOI)
Summary form only given. Reconfigurable computing requires
organizing computation with mixtures of processors and discrete logic thus
presenting a difficult problem of hardware/software integration. An execution
model and adaptation of functional programming is proposed which removes the
distinction between hardware and software while offering the possibility of
"correct by construction" design. The resulting language is called "V"
because one way of creating it is to begin with the verifiable, synthesizable
subset of Verilog, and then add functional programming features. V generates
the net-list of elementary functions which are supported by an array. The
compiler has stages of compilation and instantiation so that recursion can be
supported in the early definition of a design. The execution model is cycle
based synchronous dataflow. V syntax looks much like Verilog or C without
pointers in order to facilitate adoption.
- [van Albada et al.,
2004]
- P. F. Spinnato, G. D. van Albada, and P. M. A. Sloot.
Performance modeling of distributed hybrid architectures.
IEEE Transactions on Parallel and Distributed Systems, 15(1):81–92, 2004.
(DOI)
Hybrid architectures are systems where a high performance general
purpose computer is coupled to one or more special purpose devices (SPDs).
Such a system can be the optimal choice for several fields of computational
science. Configuring the system and finding the optimal mapping of the
application tasks onto the hybrid machine often is not straightforward.
Performance modeling is a tool to tackle and solve these problems. We have
developed a performance model to simulate the behavior of a hybrid
architecture consisting of a parallel multiprocessor where some nodes are the
host of a GRAPE board. GRAPE is a very high performance SPD used in
computational astrophysics. We validate our model on the architecture at our
disposal, and show examples of predictions that our model can
produce.
- [Vuletic et al.,
2004]
- Miljan Vuletic, Laura Pozzi, and Paolo Ienne.
Programming transparency and portable hardware interfacing: Towards
general-purpose reconfigurable computing.
In ASAP '04: Proceedings of the 15th IEEE International Conference on
Application-Specific Systems, Architectures, and Processors, pages
339–351. IEEE Computer Society, September 2004.
(DOI)
- [Wong et al., 2004]
- S. Vassiliadis,
S. Wong, G. Gaydadjiev, and K. Bertels.
The Molen polymorphic processor.
IEEE Transactions on Computers, 53(11):1363–1375, 2004.
(DOI)
In this paper, we present a polymorphic processor paradigm
incorporating both general-purpose and custom computing processing. The
proposal incorporates an arbitrary number of programmable units, exposes the
hardware to the programmers/designers, and allows them to modify and extend
the processor functionality at will. To achieve the previously stated
attributes, we present a new programming paradigm, a new instruction set
architecture, a microcode-based microarchitecture, and a compiler
methodology. The programming paradigm, in contrast with the conventional
programming paradigms, allows general-purpose conventional code and hardware
descriptions to coexist in a program: In our proposal, for a given
instruction set architecture, a onetime instruction set extension of eight
instructions, is sufficient to implement the reconfigurable functionality of
the processor. We propose a microarchitecture based on reconfigurable
hardware emulation to allow high-speed reconfiguration and execution. To
prove the viability of the proposal, we experimented with the MPEG-2 encoder
and decoder and a Xilinx Virtex II Pro FPGA. We have implemented three
operations, SAD, DCT, and IDCT. The overall attainable application speedup
for the MPEG-2 encoder and decoder is between 2.64-3.18 and between
1.56-1.94, respectively, representing between 93 percent and 98 percent of
the theoretically obtainable speedups.
- [Bhansali, 2005]
- P. V. Bhansali.
Complexity measurement of data and control flow.
SIGSOFT Software Engineering Notes, 30(1):1, 2005.
(DOI)
- [Buyukkurt et al.,
2005]
- Z. Guo, B. Buyukkurt, W. Najjar, and K. Vissers.
Optimized generation of data-path from C codes for FPGAs.
Design, Automation and Test in Europe, 2005., pages 112–117,
2005.
(DOI)
FPGAs, as computing devices, offer significant speedup over
microprocessors. Furthermore, their configurability offers an advantage over
traditional ASICs. However, they do not yet enjoy high-level language
programmability, as microprocessors do. This has become the main obstacle for
their wider acceptance by application designers. ROCCC is a compiler designed
to generate circuits from C source code to execute on FPGAs, more
specifically on CSoCs. It generates RTL level HDLs from frequently executing
kernels in an application. In this paper, we give an overview of the ROCCC
system and focus on its data path generation. We compare the performance of
ROCCC-generated VHDL code with that of Xilinx IPs. The synthesis result
shows that the ROCCC-generated circuit takes around 2× to 3× the area and
runs at a comparable clock rate.
- [Calman and S., 2005]
- Jianwen Zhu
and S. Calman.
Context sensitive symbolic pointer analysis.
Computer-Aided Design of Integrated Circuits and, 24(4):516–531,
2005.
(DOI)
One of the bottlenecks in the recent movement of hardware synthesis
from behavioral C programs is the difficulty in reasoning about runtime
pointer values at compile time. The pointer analysis problem has been
investigated in the compiler community for two decades and has yielded
efficient, polynomial time algorithms for context-insensitive (CI) analysis.
However, at the accuracy level for which hardware synthesis is desired,
namely context and flow sensitive analysis, the time and space complexity of
the best algorithms reported grow exponentially with program size. In this
paper, we propose a new analysis technology to combat the inefficiency
encountered in traditional algorithms. The key idea is to implicitly encode
the pointer-to relation in the Boolean domain by Bryant's binary decision
diagram, thereby capturing the procedure transfer function completely,
compactly and canonically. With symbolic transfer functions, we can establish
a common framework to perform both CI and context-sensitive (CS) pointer
analysis efficiently. In addition, we propose a symbolic representation of
the invocation graph, which can otherwise be exponentially large. In contrast
to the classical frameworks, where CS point-to information of a procedure has
to be obtained by the application of its transfer function exponentially many
times, our method can obtain point-to information of all contexts in a
single application. Our experimental evaluation on a wide range of C
benchmarks indicates that our CS pointer analysis can be made almost as fast
as its CI counterpart.
- [Cardoso, 2005a]
- J. Cardoso.
Evaluating the process control-flow complexity measure.
In Web Services, 2005. ICWS 2005. Proceedings. 2005, 2005.
(DOI)
Process measurement is the task of empirically and objectively
assigning numbers to the attributes of processes in such a way as to describe
them. We define process complexity as the degree to which a process is
difficult to analyze, understand or explain. One way to analyze a process'
complexity is to use a process control-flow complexity measure. This measure
analyzes the control-flow of processes and can be applied to both Web
processes and workflows. In this paper, we discuss how to evaluate the
control-flow complexity measure to ensure that it can qualify as a good
and comprehensive one.
- [Cardoso, 2005b]
- Jorge Cardoso.
Control-flow complexity measurement of processes and Weyuker's properties.
Enformatika, 8:213–218, 10 2005.
(DOI)
- [Edwards, 2005]
- Stephen A.
Edwards.
The challenges of hardware synthesis from C-like languages.
In DATE '05: Proceedings of the Design, Automation and Test in Europe
Conference and Exposition, pages 66–67, Columbia University, NY, USA,
2005. IEEE Computer Society.
(DOI)
Many techniques for synthesizing digital hardware from C-like
languages have been proposed, but none have emerged as successful as Verilog
or VHDL for register-transfer-level design. Familiarity is the main reason
C-like languages have been proposed for hardware synthesis. Synthesize
hardware from C, proponents claim, and a C programmer can be turned into a
hardware designer. Another common motivation is hardware/software codesign:
today's systems usually contain a mix of hardware and software, and it is
often unclear initially which portions to implement in hardware. Here, using
a single language should simplify the migration task. The paper surveys
several C-like hardware synthesis languages and looks at two of the
fundamental challenges, concurrency and timing control.
- [Holzer and Rupp, 2005]
- M. Holzer
and M. Rupp.
Static estimation of execution times for hardware accelerators in
system-on-chips.
In System-on-Chip, 2005. Proceedings. 2005 International Symposium
on, pages 62–65, 2005.
(PDF)
- [Venkataramani et al.,
2005]
- Girish Venkataramani, Tiberiu Chelcea, and Seth Copen
Goldstein.
HLS support for unconstrained memory accesses.
In IWLS '05: IEEE 14th International Workshop on Logic Synthesis,
Carnegie Mellon University, Pittsburgh, PA, USA, June 2005.
- [Wolf, 2005]
- Wayne Wolf.
Building the software radio.
Computer, 38(3):87–89, 2005.
(DOI)
People have been working on software radio for a few years.
Software radio is just what it sounds like - a radio that uses software to
perform many of the signal processing tasks that analog circuits
traditionally handle. Software radio could turn out to be a paradigm shift
for communication systems. The US Defense Advanced Research Projects Agency
(DARPA) kicked off research into software radios to solve military problems,
but software radios can help solve some important problems in commercial
communication systems as well. Software radio offers the advantage of putting
many traditionally hard functions in modules whose characteristics can be
changed while the radio is running.
- [Meeuws, 2007]
- R. J. Meeuws.
A quantitative model for hardware/software partitioning.
Master's thesis, Delft University of Technology, Delft, Netherlands, May 2007.
System Development needs Hardware/Software Partitioning
performed early on in the development process. To do this, early
predictions of hardware resource usage and delay are necessary. In this
thesis a Quantitative Model is presented that can make early predictions to
support the partitioning process. The model is based on Software Complexity
Metrics, which capture important aspects of functions like control intensity,
data intensity, code size, etc. In order to remedy the interdependence of the
software metrics a Principal Component Analysis was performed. The hardware
characteristics were determined by automatically generating VHDL from C using
the DWARV C-to-VHDL compiler. Using the results from the principal component
analysis, the quantitative model was generated using linear regression. The
error of the model differs per hardware characteristic. We show that for
flip-flops the mean error for the predictions is 69%. In conclusion, our
quantitative model can make fast and sufficiently accurate area predictions
to support Hardware/Software Partitioning. In the future, the model can be
extended by introducing extra software metrics, using more advanced modeling
techniques, and using a larger collection of functions and
algorithms.
- [Meeuws et al.,
2007]
- R. J. Meeuws, Y. D. Yankova, K.L.M. Bertels, G. N.
Gaydadjiev, and S. Vassiliadis.
A quantitative prediction model for hardware/software partitioning.
In Proceedings of 17th International Conference on Field Programmable
Logic and Applications (FPL'07), page 5, August 2007.
An important step in Heterogeneous System Development is
Hardware/Software Partitioning. This process involves exploring a huge
design space. By using profiling to select hot-spots and estimate area and
delay we can prune the design space considerably. We present a Quantitative
Model that makes early predictions to prune the design space and support the
partitioning process. The model is based on Software Complexity Metrics,
which capture important aspects of functions such as control intensity, data
intensity, and code size. To remedy interdependence among software metrics,
we performed a Principal Component Analysis. The hardware characteristics
were determined by automatically generating VHDL from C using the DWARV
C-to-VHDL compiler. Linear regression on these data generated our model. The
model error differs per hardware characteristic. We show that for flip-flops
the mean error is 69%. In conclusion, our quantitative model makes fast and
sufficiently accurate area predictions in support of early Hardware/Software
Partitioning.
- [Virginia, 2007]
- Arcilio
Jaime-Raul Virginia.
Comparative study of VHDL generators.
Master's thesis, Delft University of Technology, Delft, The Netherlands, May
2007.
- [Yankova et al.,
2007]
- Y. D. Yankova, K.L.M. Bertels, S. Vassiliadis, R. J.
Meeuws, and A.J.R. Virginia.
Automated HDL generation: Comparative evaluation.
In Proceedings of International Symposium on Circuits and Systems
(ISCAS2007), May 2007.
- [Larsen]
- Pia Veldt Larsen.
St111 - regression and
analysis of variance.
- [weba]
- Delft workbench.
A semi-automatic tool platform for integrated hardware-software
co-design targeting heterogeneous computing systems containing reconfigurable
components. Delft Workbench addresses the entire design cycle rather than
isolated parts. It involves the development of compilers for reconfigurable
platforms, programming models, hardware software co-design, CAD and design
space exploration software, optimization algorithms and integration software
development. The Delft Workbench targets the MOLEN machine
organisation.
- [webb]
Elkhound and Elsa.
Elkhound is a parser generator, similar to Bison. The parsers it
generates use the Generalized LR (GLR) parsing algorithm. GLR works with any
context-free grammar, whereas LR parsers (such as Bison) require grammars to
be LALR(1).
- [webc]
- The
R project for statistical computing.
R is a language and environment for statistical computing and
graphics. It is a GNU project which is similar to the S language and
environment which was developed at Bell Laboratories (formerly AT&T, now
Lucent Technologies) by John Chambers and colleagues. R can be considered as
a different implementation of S. There are some important differences, but
much code written for S runs unaltered under R. R provides a wide variety of
statistical (linear and nonlinear modelling, classical statistical tests,
time-series analysis, classification, clustering, ...) and graphical
techniques, and is highly extensible. The S language is often the vehicle of
choice for research in statistical methodology, and R provides an Open Source
route to participation in that activity.
- [webd]
ROCCC website.
ROCCC is a C to hardware compilation project whose objective is the
FPGA-based acceleration of frequently executed code segments (loop nests). Its
focus is on extensive compile-time transformations and optimizations with the
aim of 1) Maximizing parallelism by exploiting functional, loop and operation
parallelism, 2) Maximizing the throughput of a computation, 3) Minimizing the
number of off-FPGA memory accesses, 4) Minimizing the area occupied by the
circuit.
- [webe]
- Spark: High-level synthesis using
parallelizing compiler techniques.
SPARK is a C-to-VHDL high-level synthesis framework that employs a
set of innovative compiler, parallelizing compiler, and synthesis
transformations to improve the quality of high-level synthesis results. The
compiler transformations have been re-instrumented for synthesis by
incorporating ideas of mutual exclusivity of operations, resource sharing and
hardware cost models. The SPARK parallelizing high-level synthesis
methodology is particularly targeted to multimedia and image processing
applications along with control-intensive microprocessor functional
blocks.