A Priori Implementation Effort Estimation for HW Design Based on Independent-Path Analysis Special Issue on Design and Architectures for Signal Image Processing

This paper presents a metric-based approach for estimating the hardware implementation effort (in terms of time) for an application in relation to the number of linear-independent paths of its algorithms. We exploit the relation between the number of edges and linear-independent paths in an algorithm and the corresponding implementation effort. We propose an adaptation of the concept of cyclomatic complexity, complemented with a correction function to take designers' learning curve and experience into account. Our experimental results, composed of a training and a validation phase, show that with the proposed approach it is possible to estimate the hardware implementation effort. This approach, part of our light design space exploration concept


Discussion of the problem
Companies developing embedded systems based on high-end technology in areas such as telecommunication, defence, consumer products, and healthcare equipment are evolving in an extremely competitive globalised market. In order to preserve their competitiveness, they have to deal with several contradicting objectives: on one hand, they have to face the ever-increasing need for shorter time-to-market; on the other hand, they have to develop and produce low-cost, high-quality, and innovative products.
This raises major challenges for most companies, especially for small- and medium-sized enterprises (SMEs). Although SMEs are under pressure due to the above-mentioned factors, they are either not applying the latest design methodologies or cannot afford the modern electronic system level (ESL) design tools. By limiting themselves to traditional design methodologies, SMEs make themselves more vulnerable to unforeseen problems in the development process, making the time-to-market factor one of the most critical challenges they have to deal with. A survey released at the Embedded Systems Conference (ESC 2006) [1] indicated that more than 50% of embedded design projects are running behind schedule (i.e., 25% are 1-2 months late, 18% 3-6 months). In the 2008 version of the survey [2], it is again shown that meeting the schedule is the greatest concern for design teams.
Moreover, a workshop [3] held for Danish SMEs working in the domain of embedded systems clearly indicates that there is a need for changing and improving their design trajectories in order to stay in front of the global market. More specifically, this calls for setting modern design, that is, hardware/software (HW/SW) codesign and ESL design, into actual practice in SMEs, so that they can reduce their time-to-market factor and keep up with their competitors by being more efficient in producing embedded systems.
Although HW/SW codesign and ESL design tools (both commercial and academic) have been available for several years, several barriers have so far prevented their wide adoption: (i) difficulty in transferring the methods and tools developed by academia into industry, because they are mostly developed for experimenting, validating, and proving new concepts rather than for being used in companies; adapting and transferring these methods and tools therefore requires additional and tedious effort, delaying their adoption; (ii) financial cost in terms of tool licenses, training, and so forth that many SMEs cannot afford, since the cost of a complete commercial tool chain can exceed 150 k€ per year; (iii) training cost and knowledge management issues, meaning that switching to a new design trajectory also involves the risk of losing momentum, that is, losing time and efficiency because of the training needed to master the new methods and tools; (iv) many modern design flows are not mature enough to generate efficient and automatic real-time code, which, combined with the previous item, causes potential adopters to wait until it is safe to switch.
Considerable research has been undertaken to estimate implementation factors such as area, power, and speedup that are subsequently used in HW/SW partitioning tools with different focuses related to granularity, architecture model, communication topology, and so on. None of these research projects includes the man-power cost, which is the most critical one for many companies, especially SMEs. This work takes its outset in a research framework facilitating the HW/SW partitioning step for SMEs. It focuses on a light design space exploration approach called "DSE-light" that combines the advances in terms of design methodologies found in academia and the ease of integration required by SMEs, that is, lowering the above-mentioned barriers.
The contribution presented in this paper is the development of a method for estimating the man-power cost (i.e., development time) for implementing hardware components and the integration of this method into our framework, so that HW/SW partitioning decisions can be wiser. Used iteratively and systematically, such a method will form the engine for precise development schedules. The following subsections present the rationale for this work and the idea enabling this contribution.

Parameters that influence the implementation effort
A common problem in both SMEs and larger companies is that of estimating the amount of time required to map and implement an algorithm onto an architecture, given parameters such as the following [4,5]: (i) manpower, that is, the available development team(s) and their size(s), (ii) quality of the social interactions between the team members and the teams, (iii) experience of the developers (e.g., years of experience, previously developed projects, novelty of the current project, etc.), (iv) skills of the developers, that is, their ability to solve problems (this is not the same as experience, which only reflects how often one has tried before), (v) availability of suitable and efficient tools and how easy they are to learn and use, (vi) availability of SW/HW IP code/cores, (vii) involvement of the designers, that is, whether they are working on other projects simultaneously, (viii) design constraints, that is, real-time requirements.
This work addresses the issue of adding a man-power cost parameter to the cost function, thereby guiding the HW/SW partitioning. More specifically, we concentrate on the mapping process, that is, the process of mapping a given algorithm onto a given architecture, and the implementation effort (i.e., time) related to the complexity of that algorithm. Our framework also addresses other issues of HW/SW partitioning, for example, [6].

Idea
In order to understand what makes an algorithm difficult to implement, five semistructured interviews have been conducted with engineers (hardware developers) with very little to 20 years of experience. (A semistructured interview is an information-gathering method of qualitative research. It is also an adequate tool to capture how a person thinks of a particular domain [7].) From the interviews, it was deduced that several parameters influence the hardware design difficulty. The hardware developers stated that available knowledge about worst cases, dependencies between variables, and the completeness of the design description of the entire system, including all communications, are important for the design time. However, according to them, the major parameter influencing a hardware design is the number of connections and signals between the internal components. The reason is that every time a signal enters a component, the component needs to act on it. More signals bring more parameters into the component, and that very often leads to an increased complexity.
Based on the interviews, we form our hypothesis, which is that a strong relation exists between what renders an algorithm complex to implement and the number of components as well as the number of signals/paths in the algorithm.
To ensure that not only the number of paths is counted but also that a high number of components is present, we choose to only measure the number of linear-independent paths. Furthermore, this ensures that components occurring several times during the execution are counted only once, which better reflects the actual implementation effort.
The remainder of the paper is organised as follows: Section 2 gives an overview of the state-of-the-art methods for estimating the implementation effort both for software and hardware designs and indicates the need for further work for hardware design. In Section 3 a new metric for estimating the development time is defined and combined with our research tool "Design-Trotter." Section 4 presents some test cases used to investigate the validity of the above-mentioned hypothesis and of the proposed metric. Furthermore, the experimental results are analysed. Finally we conclude in Section 5.

Software
Most research about estimating implementation effort is found in the software domain, especially within the COCOMO project [8]. The problem of estimating the implementation effort is twofold. First, a reasonable measure needs to be developed for being able to quantify the algorithm. Second, a model needs to be developed, describing a rational relation between the measure and the implementation effort.

COCOMO
To start with the model, a typical power model has been proposed inside the COCOMO experiment [8,9]:
$$\text{Effort} = A \cdot \text{Size}^{b}, \qquad (1)$$
where Size is an estimate of the project size, and $A$ and $b$ are adjustable parameters. These parameters are influenced by many external factors, which we previously discussed in Section 1.2, but they can be trained based on previous project data.
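As an illustration of how the two parameters could be trained from previous project data, the following minimal sketch assumes the usual COCOMO power form Effort = A · Size^b and fits it by ordinary least squares in log-log space. The function names and the project history are hypothetical and are not taken from the COCOMO tooling or from this paper's data.

```python
import math


def fit_power_model(sizes, efforts):
    """Fit effort = A * size**b by least squares on log(effort) vs log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)   # intercept gives the scale factor A
    return a, b


def predict_effort(a, b, size):
    """Evaluate the fitted power model for a new project size."""
    return a * size ** b


# Hypothetical history, generated from A = 2.0 and b = 1.1 purely for illustration.
history_sizes = [10.0, 20.0, 40.0, 80.0]
history_efforts = [2.0 * s ** 1.1 for s in history_sizes]
A, b = fit_power_model(history_sizes, history_efforts)
```

Fitting in log-log space keeps the problem linear, at the cost of minimising relative rather than absolute error; whether that trade-off suits a given data set is a judgment call.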
To use this COCOMO measure, there is a need for expressing the size of the project. Inside the software domain, the dominating metric is lines of code (LOC). Using LOC is not without difficulties, for example, how is a code line defined? Reference [10] discusses this issue and states that LOC is not consistent enough for that use; this is also supported by [11]. In addition, LOC is not a language-independent metric. Furthermore, hardware developers tend to disapprove of this measure, since they do not feel that it is representative of hardware designs.
However, we do not claim that there is no relation between LOC and the implementation effort. It is impossible to write 10 k lines in one day, but for VHDL the relation is not always straightforward. In the experiments that we have performed (data shown in Table 1) there is no unambiguous relation between the LOC in VHDL and the development time.
Reference [11] describes that making "a priori" determination of the size of a software project is difficult especially when using the traditional lines of code measure; instead function points-based estimation seems to be more robust.

Function points analysis
The function points metric was first introduced by Albrecht [12] and consists of two main stages. The first stage is counting and classifying the function types for the software. The identified functions need to be weighted to reflect their complexity, which is determined on the basis of the developers' perception. The second stage is the adjustment of the function points according to the application and environment, based on 14 parameters. The function points can then be converted into an LOC measure, based on an implementation language-dependent factor, and, for example, [11] reports that the function points metric can be used as an implementation effort estimation metric. The function points analysis has been criticised for being too heuristic, and [10] has proposed the SPQR/20 function points metric as an alternative. Reference [13] has compared the SPQR/20 and the function points analysis and found their accuracy comparable, even though the SPQR/20 metric is simpler to estimate.

VHDL function points
To the knowledge of the authors, limited research has been carried out in the field of estimating the implementation difficulty of hardware designs.
Fornaciari et al. [14] have taken up the idea from the function points analysis and modified it to fit VHDL. By counting the number of internal I/O signals and components, and classifying these counts into levels, they extract a function point value related to VHDL. They have related their measure to the number of source lines in the LEON-1 processor project, and their predictions are within 20% of the real size. However, as stated previously, estimating the size does not always give an accurate indication of the implementation difficulty and the necessary implementation time.
By measuring the number of internal I/O signals and components, their work follows the same direction as our initial observations. However, our approach aims at estimating the implementation effort, based on a behavioural description of the algorithm in the C language. Furthermore, it also takes the designer's experience into account.

METHODOLOGY
The proposed flow for estimating the implementation effort is illustrated in Figure 1. It takes its outset in a behavioural description of the algorithm in the C language (including library function source code), which is intended to be implemented in hardware. From this description, we use the Design-Trotter framework to generate a hierarchical control data flow graph (HCDFG), which is then measured to identify the number of independent paths. The resulting measure, combined with the experience of the developers, gives an estimate of the required implementation effort. The method is self-learning in the sense that after each successful implementation, new knowledge about the developers involved can be integrated to improve the accuracy of the estimates. The HCDFG and the approach for modelling the developers' experience are covered later in this section, but initially we investigate how the number of paths can be measured.
Figure 1: The starting point is a behavioural description in C of the algorithm to be implemented in hardware (e.g., via VHDL). From this description, an HCDFG is generated and measured to identify the number of independent paths in the algorithm. This measure, combined with the experience of the developers, gives an estimate of the required implementation effort (expressed in time).

Cyclomatic complexity
As described in Section 1.3, the number of independent paths is expected to correlate with the complexity that the engineers are facing when working on the implementation. Therefore, finding a method to measure the number of independent paths in an algorithm could help us investigate this issue. One such metric is the cyclomatic complexity proposed by McCabe [15], which measures the number of linear-independent paths in the algorithm.
The cyclomatic complexity was originally invented as a way to intuitively quantify the complexity of algorithms, but has later found use for other purposes, especially in the software domain. The cyclomatic complexity has been used for evaluating the quality of code in companies [16], where quality covers aspects from understandability over testability to maintainability. It has also been shown [17] that algorithms with a high cyclomatic complexity have errors more frequently than algorithms with lower cyclomatic complexity. The cyclomatic complexity has furthermore been used for evaluating programming languages for parallel computing [18], where languages that encapsulate control statements in instructions receive higher scores. All of these use the cyclomatic complexity measure under the assumption that it reflects the number of paths the developers need to inspect, the number of paths that need to be tested, or a combination of the two.
In the domain of hardware, the cyclomatic complexity has also found use, judging the readability and maintainability in the SAVE project [19]. It is worth noticing that they use a misinterpreted [20] definition of the cyclomatic complexity [21].
All these projects utilise the cyclomatic complexity's ability to measure the number of independent paths and relate them to their individual cases:
$$V(G) = \pi + 1, \qquad (2)$$
where $\pi$ represents the number of condition nodes in the graph $G$ representing the algorithm being analysed. Figure 2 shows two examples of graphs and the corresponding cyclomatic complexity.
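For a concrete feel of the measure, the sketch below computes McCabe's cyclomatic complexity for a small control flow graph in two ways: from the edge and node counts (e − n + 2 for a connected graph with one entry and one exit) and from the number of binary condition nodes plus one. The example graph and the function names are made up for illustration.

```python
def cyclomatic_from_graph(edges):
    """McCabe's measure e - n + 2 for a connected single-entry, single-exit graph."""
    nodes = {vertex for edge in edges for vertex in edge}
    return len(edges) - len(nodes) + 2


def cyclomatic_from_conditions(edges):
    """Number of binary condition nodes (out-degree 2) plus one."""
    out_degree = {}
    for source, _target in edges:
        out_degree[source] = out_degree.get(source, 0) + 1
    condition_nodes = sum(1 for degree in out_degree.values() if degree == 2)
    return condition_nodes + 1


# Hypothetical algorithm: an if/else followed by a while loop.
edges = [
    ("entry", "if"),
    ("if", "then"), ("if", "else"),
    ("then", "join"), ("else", "join"),
    ("join", "while"),
    ("while", "body"), ("body", "while"),
    ("while", "exit"),
]
by_graph = cyclomatic_from_graph(edges)            # 9 edges - 8 nodes + 2
by_conditions = cyclomatic_from_conditions(edges)  # 2 condition nodes + 1
```

Both counts agree here because the graph is structured and every condition node is binary.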
In this work, we propose an adapted version of the cyclomatic complexity definition to estimate, a priori, the number of independent paths on a hierarchical control data flow graph (HCDFG), defined in the following section.The cyclomatic complexity for an HCDFG is obtained by examining its subgraphs as explained in Section 3.3.

HCDFG
For this work we use the hierarchical control data flow graphs (HCDFGs), which are introduced in [22,23]. The HCDFGs are used to represent an algorithm with a graph-based model so the examination task of the algorithm is eased. Control/Data Flow Graphs (CDFGs) are well accepted by designers as a representation of an algorithm, where data flow graphs represent the data flow between different processes/operations, and the control flow layer encapsulates these data flows and adds control structures to the graphical notation. The hierarchy layered structure is added to help represent large algorithms as well as to enable the analysis mechanism to identify functions/blocks in the graph. Such an identified block can then be seen as a single HCDFG that can be instantiated several times.
In this work the design space exploration tool "Design-Trotter" is used as an engine for analysing the algorithms. The HCDFG model is used as "Design-Trotter's" internal representation.
The hierarchy of an HCDFG is shown in Figure 3. An HCDFG can consist of other HCDFGs, Control/Data flow graphs (CDFGs), and data flow graphs (DFGs), as well as elementary nodes (processing, memory, and control nodes). An HCDFG is connected via dependency edges. In this work we only explore the graph at levels above the DFGs, and therefore only concentrate on these when we define the graph types in what follows.
Let us consider the hierarchical control data flow graph, $G_{HCDFG} = (N_{HCDFG}, E_{HCDFG})$, where $N_{HCDFG}$ are the nodes denoted by $N_{HCDFG} = \{n_{HCDFG_1}, \ldots, n_{HCDFG_m}\}$ and the nodes are $N_{HCDFG} \in \{G_{HCDFG} \mid G_{CDFG} \mid G_{DFG} \mid Data\}$, meaning that the nodes in $G_{HCDFG}$ can be instances of its own type, encapsulated control data flow graphs, $G_{CDFG}$, encapsulated data flow graphs, $G_{DFG}$, or data transfer nodes, $Data$. The last one is introduced to avoid the duplication of data representations in the hierarchy when data is exchanged between the graphs. Thereby, data are only represented by their nodes and not by edges, as is common in many other types of DFGs.
The edges, $E_{HCDFG}$, connect the nodes such that $E_{HCDFG} = \{e_{n_{HCDFG_i}, n_{HCDFG_j}}\}$, where $i \neq j$ represent the indexes of the nodes, $E_{HCDFG} \in \{DD\}$, and every node can have multiple input and/or output edges. For $G_{HCDFG}$, only data dependencies, $DD$, are allowed, and no control dependencies, $CD$.
In this way the HCDFG forms a hierarchy of encapsulated HCDFGs, CDFGs, and DFGs, connected via exchanging data nodes.The HCDFG can be seen as a container graph for other graph types such as the CDFG.
We can define the CDFG as $G_{CDFG} = (N_{CDFG}, E_{CDFG})$, where $N_{CDFG}$ are the nodes denoted by $N_{CDFG} = \{n_{CDFG_1}, \ldots, n_{CDFG_m}\}$ and the nodes are $N_{CDFG} \in \{CC \mid G_{HCDFG} \mid G_{DFG} \mid Data\}$, where $CC \in \{\text{if} \mid \text{switch} \mid \text{for} \mid \text{while} \mid \text{do-while}\}$. In this way $G_{CDFG}$ is able to describe common control structures, where the actual data processing is encapsulated in either DFGs or HCDFGs. Again, the data exchange nodes are used to exchange data between the other nodes.
Beneath the control data flow graphs $G_{CDFG}$, the data flow graphs $G_{DFG}$ exist, but they are of no use in this work, so we will not define them further here.

Calculating the cyclomatic complexity on CDFGs
Now that the HCDFG has been defined, we explain our proposed method for measuring the cyclomatic complexity on the CDFGs.
Since the cyclomatic complexity only considers the control structure in finding the number of independent paths in the algorithm, the DFG part of the algorithm is, as mentioned earlier, of no interest for this task because it only gives a single path. On the other hand, what is of interest is how the cyclomatic complexity is measured on the CDFGs and HCDFGs which are built by the tool Design-Trotter. This leaves us with the following cases, which are described in detail afterwards: (i) If constructs, (ii) Switch constructs, (iii) For-loop, (iv) While/do-while loops, (v) Functions, (vi) HCDFGs in parallel, (vii) HCDFGs in serial sequence.

If constructs
The "if constructs" case is represented as a CDFG, $G_{CDFG}$, where one node is a control node of type if (see Figure 4(a)). Before arriving at the control node, a condition evaluation node $n_{eval} \in \{G_{HCDFG} \mid G_{DFG}\}$ is traversed to calculate the boolean variable stored in $n_{Data}$ (to maintain simplicity, these are not shown in Figure 4(a)) that is used in the condition node. If the variable is true, the algorithm follows the path through the true body node, $n_{true} \in \{G_{HCDFG} \mid G_{DFG} \mid \emptyset\}$. Otherwise it goes to the false body node, $n_{false} \in \{G_{HCDFG} \mid G_{DFG} \mid \emptyset\}$. Note that in some cases either the true body or the false body does not exist, but it still gives a path. In this case, according to the cyclomatic complexity measure, the number of independent paths is
$$P(n_{if}) = P(n_{true}) + P(n_{false}) + P(n_{eval}) - 1. \qquad (3)$$
The last part of (3), $+P(n_{eval}) - 1$, is included in case the evaluation graph is an HCDFG node.

Switch constructs
The "switch constructs" case is represented as a CDFG, $G_{CDFG}$, and has almost the same flow as the "if constructs" case discussed above. One node is a control node of switch type. Before arriving at the control node, a condition evaluation node $n_{eval} \in \{G_{HCDFG} \mid G_{DFG}\}$ is traversed. Depending on the output, the switch node leads the algorithm flow to the selected case node, $n_{case_i} \in \{G_{HCDFG} \mid G_{DFG}\}$. An example is shown in Figure 4(b). According to the cyclomatic complexity measure, the number of independent paths is as follows:
$$P(n_{switch}) = \sum_{i=1}^{N} P(n_{case_i}) + P(n_{eval}) - 1, \qquad (4)$$
where $N$ represents the number of cases and $i$ the index to the corresponding node on which the paths are measured. The same argument goes for the $P(n_{eval}) - 1$ part of (4); it is included in case the evaluation graph is an HCDFG node, and otherwise it is omitted. In Figure 4, the data exchange nodes between the (HC)DFGs are left out for simplicity, and the symbols are similar to those presented in Figure 3.

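The two branching rules can be sketched as small helper functions. This is an illustrative sketch only: the function names are hypothetical, and the switch rule is assumed to mirror the if rule, that is, the sum over all case bodies plus P(n_eval) − 1. A body or evaluation graph that is a plain DFG, or a missing branch, contributes a single path.

```python
def paths_if(p_true=1, p_false=1, p_eval=1):
    """Paths of an if construct: P(true) + P(false) + P(eval) - 1.

    A missing branch still counts as one path, and a plain DFG
    evaluation has P(eval) = 1, so its term vanishes.
    """
    return p_true + p_false + p_eval - 1


def paths_switch(p_cases, p_eval=1):
    """Assumed switch rule: sum of the case bodies plus P(eval) - 1."""
    return sum(p_cases) + p_eval - 1


simple_if = paths_if()                            # two plain branches
nested_if = paths_if(p_true=simple_if)            # an if inside the true branch
three_way = paths_switch([1, 1, 1])               # three plain cases
mixed = paths_switch([1, simple_if, 1], p_eval=simple_if)
```

Keeping the evaluation term explicit makes it easy to see that only evaluation graphs with internal control flow add paths.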
For-loop
The "for-loop" case is the most complex of the control structures. Strictly speaking, a "for loop" consists of three different parts: the evaluation body, the evolution body, and the for body, $n_{eval}$, $n_{evol}$, and $n_{for\text{-}body}$, respectively. The control node, $n_{for}$, determines, based on the output from the evaluation graph, whether the flow should go into the "for loop" or leave it. The evolution node updates the indexes. Since each iteration of the graph needs to pass through the evaluation and evolution nodes, the number of independent paths is calculated as
$$P(n_{for}) = P(n_{for\text{-}body}) + P(n_{eval}) - 1 + P(n_{evol}) - 1. \qquad (5)$$
In many cases, the evaluation and evolution parts of the "for loop" are quite simple indexing functions, meaning that $n_{eval} \in \{G_{DFG}\}$ and $n_{evol} \in \{G_{DFG}\}$, which leaves $P(n_{for}) = P(n_{for\text{-}body})$. The "for loop" is illustrated in Figure 4(d).

While loops and do-while loops
The "while loops" and "do-while loops" cases are described jointly, since it is only the entry to the loop structure that separates them and their cyclomatic complexities are equivalent. The "while loops" consist of two main parts: the while body, $n_{while\text{-}body} \in \{G_{HCDFG} \mid G_{DFG}\}$, and the while evaluation, $n_{eval} \in \{G_{HCDFG} \mid G_{DFG}\}$. This is illustrated in Figure 4(c). Whether to continue looping is decided by the control node $n_{while} \in \{\text{while}\}$ based on the output of $n_{eval}$. Similarly to the "for loop," each iteration of the graph needs to pass through the evaluation nodes, so the number of independent paths can be calculated as
$$P(n_{while}) = P(n_{while\text{-}body}) + P(n_{eval}) - 1. \qquad (6)$$
In many cases, the evaluation part of the while loop is a set of simple test functions, meaning that $n_{eval} \in \{G_{DFG}\}$, which leaves $P(n_{while}) = P(n_{while\text{-}body})$.
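As a worked illustration of the two loop rules (the graphs are hypothetical, not taken from the test cases), consider a for loop whose body holds a single if construct with plain DFG branches and whose evaluation and evolution parts are plain DFGs, and a while loop with a plain DFG body whose evaluation is an HCDFG containing one such if construct. The if construct contributes two paths and each plain DFG contributes one:

```latex
\begin{align*}
P(n_{for})   &= P(n_{for\text{-}body}) + P(n_{eval}) - 1 + P(n_{evol}) - 1
              = 2 + 1 - 1 + 1 - 1 = 2, \\
P(n_{while}) &= P(n_{while\text{-}body}) + P(n_{eval}) - 1
              = 1 + 2 - 1 = 2.
\end{align*}
```

In the first case the loop adds nothing beyond its body; in the second, the extra path comes entirely from the control flow inside the evaluation graph.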

Functions
The goal is to identify the number of independent paths in the algorithm/system. For this, reuse in terms of functions/blocks of code is important. When all independent paths through a function are known, reuse of this function does not change the number of independent paths in the system. From an implementation point of view, such functions represent an entity where the paths only need to be implemented once. In HCDFGs, a function/block can be seen as an encapsulated $G_{HCDFG}$. Therefore, the number of independent paths in functions/blocks of reused code should only count once. The paths can be calculated as
$$P(n_{HCDFG_{function}}) = \begin{cases} 0 & \text{if reuse}, \\ P(n_{HCDFG}) & \text{else}. \end{cases} \qquad (7)$$

HCDFGs in parallel and serial
Knowing how to handle all the HCDFGs that are identified for reuse (functions), together with all the CDFGs, is not sufficient. How the hierarchy of graphs should be combined is also of interest. For a parallel combination of two or more HCDFGs/CDFGs, as shown in Figure 4(e), the increase in the number of independent paths is additive. The number of paths can be calculated as
$$P(n_{parallel}) = \sum_{i=1}^{N} P(n_i), \qquad (8)$$
where $N$ represents the number of nodes in parallel and $i$ the index to the corresponding node where the paths are measured.
For a serial combination of two or more HCDFGs and/or CDFGs, the number of independent paths is a combination of the independent paths of the involved HCDFGs/CDFGs. Remembering that there always needs to be one path through the system, the number of independent paths in a serial combination is given as
$$P(n_{serial}) = \sum_{i=1}^{N} \bigl(P(n_i) - 1\bigr) + 1, \qquad (9)$$
where $N$ represents the number of nodes in serial and $i$ the index to the corresponding node where the paths are measured. An example of serial combination is shown in Figure 4(f). The number of independent paths for the entire algorithm, $P(n_{HCDFG_{Alg}})$, is equivalent to that of the top HCDFG node, which includes all the independent paths of its subgraphs.
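Putting the rules together, the number of paths of a whole algorithm can be obtained by a recursive walk over the graph hierarchy. The sketch below is a simplified stand-in and not Design-Trotter's actual data model: the `Node` class and its fields are hypothetical, the parallel rule is taken to be a plain sum, and the serial rule is assumed to be the sum of (P − 1) over the nodes plus one. How a reused function (P = 0) enters the serial rule is not spelled out in the text, so the sketch clamps its contribution at zero.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """Hypothetical stand-in for an HCDFG/CDFG node."""
    kind: str  # "dfg", "if", "switch", "for", "while", "parallel", "serial", "function"
    children: List[Optional["Node"]] = field(default_factory=list)
    evaluation: Optional["Node"] = None  # condition evaluation graph
    evolution: Optional["Node"] = None   # for-loop index update graph
    name: str = ""                       # identifies reusable functions


def count_paths(node: Optional[Node], seen: Optional[Set[str]] = None) -> int:
    """Number of independent paths P(n), following the per-construct rules."""
    if seen is None:
        seen = set()
    if node is None or node.kind == "dfg":
        return 1  # an absent branch or a pure data flow graph is a single path
    if node.kind == "function":
        if node.name in seen:
            return 0  # reused function: its paths were already counted
        seen.add(node.name)
        return count_paths(node.children[0], seen)
    child_paths = [count_paths(child, seen) for child in node.children]
    extra = count_paths(node.evaluation, seen) - 1
    if node.kind == "for":
        extra += count_paths(node.evolution, seen) - 1
    if node.kind in ("if", "switch", "parallel"):
        return sum(child_paths) + extra
    if node.kind in ("for", "while"):
        return child_paths[0] + extra
    if node.kind == "serial":
        # Assumed serial rule: sum(P_i - 1) + 1, with reused functions
        # clamped so that they add no paths instead of removing one.
        return sum(max(p - 1, 0) for p in child_paths) + 1
    raise ValueError(f"unknown node kind: {node.kind}")


# Hypothetical algorithm: filter(); switch with three cases; filter() again.
dfg = Node("dfg")
branch = Node("if", children=[dfg, None])
loop = Node("for", children=[branch])
filt = Node("function", children=[loop], name="filter")
algo = Node("serial", children=[filt, Node("switch", children=[dfg, dfg, dfg]), filt])
total = count_paths(algo)
loop_with_branching_test = count_paths(Node("while", children=[dfg], evaluation=branch))
```

In this example the first call to `filter` contributes its two paths, the switch contributes three, and the second call to `filter` adds nothing, which gives (2 − 1) + (3 − 1) + 0 + 1 = 4 independent paths.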

Experience impact
The experience of the designer has an impact on the challenge that he/she is facing when developing a system.A radical example is when a beginner and a developer with ten years of experience are asked to solve the same task.They will not see equal difficulty in the same task, and thereby do not need to put the same effort into the development.
Experience is influenced by many parameters but in this work we only focus on the time the developer has worked with the implementation language and the target architecture.
The impact of experience is a factor that slowly decreases over time: consider a new developer, for whom the experience obtained in the first months of working with the language and architecture improves his/her skills significantly. On the other hand, a developer who has worked with the language and architecture for five years, for example, will not improve her/his skills at the same rate by working an extra year. The impact of experience is therefore not linear but tends to have a negative acceleration or inverse logarithmic nature, with dramatic change in impact in the beginning, progressing towards little or no change as time increases. In the literature, for example, [24], many studies try to fit historical data to models. Examples of such models are a power function with negative slope and a negative exponential function. From the vast variety of models that have been proposed over the years, the only conclusion that can be drawn is that there are multiple curvatures, but they all appear to have a negative accelerating slope, which tends to be exponential/logarithmic.
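To make the notion of a negatively accelerating learning curve concrete, the sketch below evaluates the two generic shapes mentioned above, a power function and a negative exponential. It is not the experience model used in the experiments; the function names and all parameter values are hypothetical and chosen only to show the shape.

```python
import math


def power_law_curve(weeks, initial=1.0, rate=0.5):
    """Relative difficulty after `weeks` of practice: initial * (weeks + 1) ** -rate."""
    return initial * (weeks + 1) ** (-rate)


def exponential_curve(weeks, initial=1.0, floor=0.2, time_constant=26.0):
    """Relative difficulty decaying exponentially from `initial` towards `floor`."""
    return floor + (initial - floor) * math.exp(-weeks / time_constant)


def yearly_gain(curve, start_week):
    """How much the curve drops over one additional year of experience."""
    return curve(start_week) - curve(start_week + 52)


novice_gain = yearly_gain(power_law_curve, 0)      # first year of practice
senior_gain = yearly_gain(power_law_curve, 260)    # sixth year of practice
```

With either curve, the improvement during the first year is far larger than the improvement during the sixth year, which is the behaviour described in the text.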
In order to get the best possible outset for predicting the implementation effort, it is of vital importance to obtain some data of the developers' experiences, and also how they performed in the past.The parameters involved in the experience curve can then be trimmed to create the best possible fit.However, it has not been the purpose of this work to select the perfect nature for a learning curve nor to evaluate the accuracy of such one.The learning curve will be adapted to the individual developers, and as the model is used in subsequent projects, its accuracy will progressively improve.As a consequence, the experience here is only intended as an element in modelling the complexity and thereby a means for more accurate estimates.
For the experiments in this study we have chosen to use the following model: where α and β are trim parameters which can be used to optimise the curve to fit reality, and Experience is the number of weeks which the developer, Dev, has worked with the language and architecture. Figure 5 depicts the shape of the experience model. In this work, our initial experiments have shown that setting α = 1 and β = 1 makes our model sufficiently general, and therefore we have not further investigated the tuning of these two parameters.

RESULTS
In order to verify the hypothesis, a classical test has been conducted.The test is dual phased and consists of (i) a training phase using a first set of real-life data, during which the hypothesis is said to be true, and (ii) a validation phase during which a second set of real-life data is used to evaluate whether the hypothesis holds true or not.

Phase one-training
The real-life data used as training data originate from two different application types that are both developed as academic projects in universities in France. The first application is composed of five different video processing algorithms for an intelligent camera, which is able to track moving objects in a video sequence. The second application is a cryptographic system, able to encrypt data with different cryptographic/hashing algorithms, that is, MD5, AES, and SHA-1. The system consists of one combined engine [25] as well as individual implementations. These projects were selected since they all follow the methodology of using a behavioural specification in C as a starting point for the VHDL implementation. Common to this data is that none of the developers has made the behavioural specification in C. For the cryptographic algorithms the behavioural specification comes from the standards, and the video algorithms were based on a previous project.
Using the behavioural description as the starting point of the experiment, the exercise consists of studying the relationship between the complexity of the algorithms (as defined in Section 3) and the implementation effort (i.e., time) required to implement them in VHDL (including testbed and heuristic tests).
The developers involved in these projects have all been Master and Ph.D. students with electrical engineering backgrounds but no VHDL background other than what they obtained during their studies, see Table 2. All developers were taught VHDL by other instructors than the authors, but at our university. Table 3 summarises the training data. Figure 6 shows the relation between the implementation effort and the measured complexity for the individual algorithms. Please note that in this graph the complexity values are not yet corrected for the designers' experience.
A first examination of the data points indicates a possible relation between some of them. However, many other points are located far away from any relation. These data are not corrected for the designers' experience and, as mentioned earlier, we strongly believe that the experience of the individual designer has a nonnegligible influence on the development time. If we inspect the data more thoroughly, it is clear that the points of greatest divergence are those implementations where the developers have very limited knowledge and experience with the VHDL language.
Applying the proposed (nonlinear) experience transform of equation (10) to the data results in a significantly different picture, as depicted in Figure 7. A clear trend toward a relation is now visible in the plotted data. From the COCOMO II project [8], it is known that the relationship between the implementation time and the complexity measure (in their case lines of code, LOC) can be expressed as a power function with a weak slope. We showed its nature in (1), and with correction for experience it becomes

Phase two: validation
Having elaborated a model based on the training data, we proceeded with the validation of its correctness. For this, a new set of data provided by ETI A/S, a Danish SME, is used. The dataset originates from a networking system and consists of Ethernet applications that have been implemented on an FPGA, as well as the corresponding testbeds. This Ethernet application is part of an existing system with which it must interact. Table 1 shows additional implementation information with regard to these applications.
The system is a real-time system with hard timing constraints, and all algorithms were implemented so as to meet these constraints. As with the training data, the development flow for this application was as follows: a behavioural C++ model of the application was constructed before the implementation on the FPGA architecture. The developers responsible for the implementation obtained their VHDL skills from a professional course with no relation to our university in Denmark.
The time spent on the implementation process covers the design and implementation of the VHDL code for the functionalities and the testbed, as well as the tests of the different modules in the applications. These data are shown in the lower part of Table 3. The time data originate from the company's internal registration for the project and therefore correspond to the effective time used.
The relation between implementation effort and complexity is plotted in Figure 8. It can be seen that these data, corrected for the designers' experience (*), closely follow the model derived from the training data (dashed line). Figure 8 also shows the 95% confidence interval, indicating that, with 95% confidence, future predictions of implementation effort will lie within this interval, given that the model holds true.
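As a rough illustration of how a power-function model of this kind can be fitted to training points and then evaluated for a new algorithm, the following Python sketch performs an ordinary least-squares fit in log-log space. The data points and the names `fit_power_law` and `predict_effort` are hypothetical and are not taken from our data set or from Design-Trotter; the fitting procedure shown is an assumption chosen for simplicity, not necessarily the one used to obtain the curve in Figure 8.

```python
import math


def fit_power_law(complexities, efforts):
    """Least-squares fit of effort = a * complexity**b in log-log space.

    Both inputs must contain strictly positive values.
    """
    xs = [math.log(c) for c in complexities]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)    # intercept in log space = scale factor
    return a, b


def predict_effort(a, b, corrected_complexity):
    """Evaluate the fitted power function for one corrected complexity value."""
    return a * corrected_complexity ** b


# Hypothetical (corrected complexity, effort in weeks) pairs, NOT measured data.
training = [(5.0, 1.5), (12.0, 2.4), (20.0, 3.1), (35.0, 4.2), (50.0, 5.0)]
a, b = fit_power_law([c for c, _ in training], [w for _, w in training])
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"estimated effort at corrected complexity 30: {predict_effort(a, b, 30.0):.2f} weeks")
```

Fitting in log-log space turns the power function into a straight line, so the exponent can be read directly as the slope; a "weak slope" in the sense of COCOMO II then corresponds to a small value of `b`.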
Comparing the predicted effort (dashed line) to the real effort (*) indicates that there is an estimation error. The values are also shown in Table 4. The average estimation error is 0.2 week with a variance of 8. In the next section, we discuss the validity of the model.
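The way such summary figures can be derived from a table of real and estimated development times is sketched below in Python. The numbers are hypothetical placeholders rather than the values of Table 4, and both the sign convention (estimate minus real effort) and the use of the population variance are assumptions made for this illustration.

```python
from statistics import mean, pvariance


def estimation_errors(actual_weeks, predicted_weeks):
    """Signed error per algorithm: predicted minus actual effort, in weeks."""
    return [p - a for a, p in zip(actual_weeks, predicted_weeks)]


# Hypothetical development times in weeks, NOT the values of Table 4.
actual = [4.0, 6.0, 9.0, 12.0]
predicted = [5.0, 5.0, 10.0, 11.0]

errors = estimation_errors(actual, predicted)
print("errors:", errors)
print("average error:", mean(errors))
print("variance of error:", pvariance(errors))
```

A small average error combined with a large variance, as in this toy example, means that over- and under-estimates cancel out on average while individual estimates can still deviate noticeably.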

Validity discussion
Estimating the effort required to implement an algorithm in hardware involves many parameters. We discussed a number of these parameters in Section 1.2, but could not include them all in this study. The proposed model is therefore devised from the idea of a relation between implementation effort and the number of linear-independent paths. To validate the model, a classical two-phase hypothesis test has been performed, and the validity of this test depends on the following important factors: (i) the independence between training and validation data; (ii) the volume and variety of the experiments.
Regarding the first factor, not only were different applications used for the training and validation data, but the developers also had no relation to one another in terms of education, nationality, work, and so forth. Moreover, the validation data were not measured until after the model had been trained. All this strengthens the validity of the results. The only potential connection is that some of the developers involved in the implementation of the training and validation data were also among those interviewed. However, this accounts for a minority, and we see this as a minimal risk.
Secondly, we should ideally have had a large volume and variety of experimental data for training and validation. However, our set of data originates from a single company and a few developers. Strictly speaking, we can therefore only conclude that this model applies to the specific SME setup involved in the study and, partially, to the academic environment studied.
In order to generalise our model, more validation cases are needed. However, obtaining all the statistical data for this new methodology is time-consuming. We would therefore like to remind the reader that this paper proposes a methodology for estimating implementation effort and that the validation of the model concentrates on illustrating its usefulness. Looking at the graphs, we can identify a clear trend in the results. The curve identified in the training data is sustained for the validation data as well: both fall in line with the underlying rationale, and we are quite confident in the strength of the proposed model.
The results clearly show the necessity of the proposed correction function; its logarithmic nature works well, even though the correction function has not been tuned to fit the individual developers, owing to the lack of available data. In this light, our approach must be seen as the engine of a global methodology for the management of design projects that imposes a systematic registration of manpower. With such a registration, a database of the developers' experience can easily be constructed and the correction function can be tuned to fit the company's individual designers. Several iterations of this process would provide convergence towards a more precise estimation of the implementation effort.
The limited data set on which the model is constructed also limits the complexity window to which this model can be applied: since no algorithm has a corrected complexity value larger than 51, extrapolating the model further would weaken the current conclusion. More training data, from larger and more varied projects, would allow for a more refined model.
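In a tool that applies the model, this limitation could be enforced with a simple guard, as in the Python sketch below. The names `within_model_window` and `guarded_estimate` are hypothetical, the upper bound of 51 is taken from the data described above, and the lower bound of zero is an assumption made for this illustration.

```python
# Largest corrected complexity covered by the data described above.
MAX_CORRECTED_COMPLEXITY = 51.0


def within_model_window(corrected_complexity, upper=MAX_CORRECTED_COMPLEXITY):
    """Return True if the value lies inside the range covered by the data.

    The lower bound of zero is an assumption of this sketch.
    """
    return 0.0 < corrected_complexity <= upper


def guarded_estimate(estimator, corrected_complexity):
    """Apply an effort estimator only inside the covered complexity window."""
    if not within_model_window(corrected_complexity):
        raise ValueError(
            f"corrected complexity {corrected_complexity} is outside the "
            f"window (0, {MAX_CORRECTED_COMPLEXITY}] covered by the data"
        )
    return estimator(corrected_complexity)


# Usage with a placeholder estimator (hypothetical, for illustration only).
print(guarded_estimate(lambda c: 2.0 * c, 10.0))
```

Raising an error instead of silently extrapolating makes it explicit to the project manager that an estimate for a larger algorithm would rest on no supporting data.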
Nevertheless, the results described in this paper are very encouraging for all the real-life cases that we have examined, and we are reasonably confident that this model can be applied to other types of applications.

CONCLUSION
The contribution presented in this paper is a metric-based approach for estimating the time needed for hardware implementation in relation to the complexity of an algorithm. We have deduced that a relationship exists between the number of linear-independent paths in the algorithm and the corresponding implementation effort. We have proposed an original solution for estimating implementation effort that extends the concept of cyclomatic complexity.
To further improve our solution, we developed a more realistic estimation model that includes a correction function to take into account the designer's experience.
We have implemented this solution in our tool Design-Trotter, whose input is a behavioural description in the C language and whose output is the number of independent paths. Based on this output and the proposed model, we are able to predict the required implementation effort. Our experimental results, using industrial Ethernet applications, confirmed that the data, corrected for the designers' experience, follow the derived model closely and that all data fall inside its 95% confidence interval. Using this method iteratively paves the way for an implementation effort estimator whose accuracy improves continuously after each project.

Figure 1: The flow of estimating the required implementation effort. The starting point is a behavioural description in C of the algorithm to be implemented in hardware (e.g., via VHDL). From this description, an HCDFG is generated and measured to identify the number of independent paths in the algorithm. This measure, combined with the experience of the developers, gives an estimate of the required implementation effort (expressed in time).

Figure 2: Two examples of graphs for which the cyclomatic complexities have been calculated.

Figure 3: An overview of how the hierarchy in an HCDFG allows analysis of an algorithm on different levels and how the levels are related.

Figure 3 shows an example of a hierarchical control data flow graph (HCDFG). In this work the design space exploration tool "Design-Trotter" is used as an engine for analysing the algorithms. The HCDFG model is used as the internal representation of "Design-Trotter".

Figure 4: Overview of the different CDFGs and combined HCDFGs on which the cyclomatic complexity values are measured. Between the (HC)DFGs there is a set of data exchange nodes, which are left out here for simplicity. The symbols are similar to those presented in Figure 3.

Figure 5: An example of how the lack of experience impacts the difficulty the engineers are facing.

Figure 7: Relation between the implementation effort (number of weeks) and the complexity corrected according to the designers' experience model as shown in Figure 5.

In Figure 7, the dashed line illustrates the relationship, with the parameters given above.

Figure 8: Validation data plot: relation between implementation effort (number of weeks) and complexity, corrected according to the designers' experience model.

Table 1: Lines of code, area, and time constraints for the validation data.

Table 2: Facts about the developers. Developers for training data (top) and validation data (bottom).

Table 3: Training data (top) and validation data (bottom). Algorithms are related to the developers and their experience at the given time. Complexity is not corrected.

Table 4: Development time and estimated development time, measured in weeks, together with the error.