Team:Tianjin/Modeling/Human
From 2012.igem.org
Background
Gene contamination has been identified as a potential hazard by Chinese specialists. Transgenic technology carries three kinds of uncertainty: the chain reaction that may follow an alteration of an organism's structure, the potential risk to the food chain, and the spread, proliferation, and clean-up of transgenic contamination. By the beginning of the 21st century, transgenic contamination had been reported in many countries, for example StarLink corn in the United States, transgenic canola in Canada, and transgenic corn contamination in Mexico. These cases show that the hazard of transgenic contamination cannot be disregarded. Because the available statistics are limited, however, many predictions and analyses could not be carried out. In our modeling process, we start from the most fundamental problem and try to build a coherent path of proposing, analyzing, and solving the problem, moving from the detailed to the macroscopic level. Two models that propose the problem predict the hazard of genetic contamination. The analyzing model focuses on a variety of factors related to the attitudes of four major groups in society: government, industry, the public, and research institutes. The solving model emphasizes two things: the effectiveness of our project, and a publicity strategy based on the need for the public to master genetic safety knowledge. In addition, parts of our modeling can be adapted to most similar issues and can serve as a reference for follow-up teams.
1. iGEM Project evaluation model
The problem
Our project needs to be evaluated through the eyes of the public, especially by comparison with similar projects, because evaluation is difficult without comparison. To evaluate our genetic safety project, we compared our AegiSafe O-Key with two similar gene safety projects: those of Imperial College London 2011 and Yale University 2011. After thoroughly investigating how others evaluate scientific projects, we built an evaluation system that assesses a project through the eyes of people who have been introduced to it. This system can also be used to evaluate any other iGEM team project comprehensively, to position individual projects, and to simplify the lengthy evaluation work of iGEM judges.
Our evaluation system
In assessing projects from around the world, a comprehensive evaluation must contain diverse, measurable indexes. We carried out an extensive literature review to make sure that our evaluation system covers all related and useful perspectives, and on that basis established an overall, systematic evaluation system. The frame diagram of this system is shown in Figure 1. Because the system is shaped like a tree, it is also called an evaluation tree.
How to use the system
The evaluation system is easy to use. We only need to rank the teams under evaluation from the different perspectives at the leaf nodes (the nodes that have no child nodes) of the evaluation tree. After collecting the data, some basic calculation gives the evaluation result for these teams at each layer of the evaluation index. These results help us understand the teams' performance at every level and from every perspective.
The inner operating principles: AHP
All the data are obtained from the questionnaire, and the relative values are transformed through a comparison matrix. All other calculations follow the standard procedure of the AHP method.
Results and discussion
- The overall performance of the three teams
- The performance of the three teams in the aspect of overall economic factor
- The performance of the three teams in the aspect of social welfare factor
- The performance of the three teams in the aspect of academic level
- The performance of the three teams in the aspect of green and sustainability
Conclusion
An overall evaluation system for iGEM team projects is established and tested through our example (a comparison of Yale 2011, Imperial College London 2011 and Tianjin 2012).
To test the reasonableness of our assessment method, we carried out a test assessment of three iGEM teams: Yale 2011, Imperial College London 2011 and TJU 2012. In the results, TJU 2012 ranks first, followed by Imperial College London 2011 and then Yale 2011.
From the data in the above table, we can see that Yale 2011 ranks first in academic level, which agrees with our general expectation. TJU 2012 performs relatively poorly in academic level, which is its weakness.
In all other aspects, TJU 2012 ranks first. This result is not entirely reasonable. We have reflected on this problem and propose several explanations. The main reasons may be that our sample size is not large enough and that our questionnaire has some flaws that are not obvious. We can further refine our assessment in several respects, as listed below.
- Reallocate the weight of each assessment level. To do this, we need more surveys and data.
- Reconsider every question and identify the ones with a biased tendency.
- Increase the sample size. We need to survey a wider variety of people to reduce the random deviation introduced by individual assessments.
2. Risk evaluation system of genetic pollution
Background and Problem
With increasing public concern about genetic pollution and little knowledge about this newly emerging problem, there is an urgent need for a reasonable and comprehensive evaluation system to determine the actual level of genetic pollution. With this system, we can also evaluate the seriousness of genetic pollution at different times, and the evaluation results might be used to predict future genetic pollution.
Establishment of the System
The construction of this system is based on an analysis of the literature, which supplied the related and useful perspectives. The frame diagram of this system is shown in Figure 1. Because the system is shaped like a tree, it is also called an evaluation tree.
How to use the system
The evaluation system is easy to use. We only need to give a grade in the range 0 to 10 from the different perspectives at the leaf nodes (the nodes that have no child nodes) of the evaluation tree. After collecting the data, some basic calculation gives the evaluation result for genetic pollution at each layer of the evaluation index. These results help us understand the seriousness of genetic pollution at every level and from every perspective.
The inner operating principles
All the data are obtained from the questionnaire, and the relative values are transformed through a comparison matrix. The assessment has three levels. Four aspects influence the possible harm of genetic pollution. The goal level is the ranking of these four aspects. The second level holds the detailed information about each aspect. The third level contains further assessment questions. From bottom to top, each branch carries a weight, so we can calculate the score of each of the four aspects.
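The bottom-up calculation described above can be sketched as a weighted sum over the evaluation tree. This is only a minimal illustration: the node names, weights, and grades below are hypothetical and are not taken from our questionnaire, and the `aggregate` helper is an assumed name.

```python
# Hypothetical evaluation tree: all names, weights and grades are illustrative only.
def aggregate(node):
    """Return the score of a node, computed from the leaves upward."""
    if "grade" in node:
        # Leaf node: the grade (0 to 10) comes directly from the respondents.
        return node["grade"]
    total_weight = sum(child["weight"] for child in node["children"])
    # Normalise sibling weights so that they always sum to 1.
    return sum(child["weight"] / total_weight * aggregate(child)
               for child in node["children"])

tree = {
    "name": "aspect A",
    "children": [
        {"name": "sub-index 1", "weight": 0.6, "children": [
            {"name": "question 1", "weight": 0.5, "grade": 8.0},
            {"name": "question 2", "weight": 0.5, "grade": 6.0},
        ]},
        {"name": "sub-index 2", "weight": 0.4, "grade": 5.0},
    ],
}

score = aggregate(tree)
print(round(score, 2))  # 0.6 * 7.0 + 0.4 * 5.0
```

Because every parent score is a weighted average of its children, the score of an aspect stays within the same 0 to 10 range as the leaf grades.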
Result and discussion
Through this assessment, we can see that the range of academic use and the remedial possibility are the two most important factors.
Transgenic technology is a newly born technology that is used mostly in scientific fields. As a consequence, the main part of people's worry about transgenic technology comes from its wide use in research. Scientific research must strictly obey safety rules, especially when it involves a new technology whose hazards are uncertain.
Another source of worry is remedial possibility. Genetic pollution may endanger the whole system. Since we know little about its detailed risks, and at the same time the possibility of large-scale pollution cannot be ignored, remedial possibility becomes another main source of worry.
So far, transgenic technology has few applications in industrial and commercial fields. For this reason, the weights of the range of industrial use and of public emphasis are relatively small.
In the table, the grade of genetic pollution from each of the four perspectives lies in the range 0 to 10.
Conclusion
An overall risk evaluation system for genetic pollution is established and used to evaluate the relative seriousness of this problem. We can also use this system to evaluate the seriousness of genetic pollution at different times and to predict how the problem will develop.
3. Influence on public attitude towards genetic pollution
Background and Problem
Solving a social problem calls for support from different social forces, or at the very least for few barriers from the various social groups. Society should also pay enough attention to the newly emerging problem of genetic pollution, so it is urgent and important to find out which factors most influence public attention to it. We therefore designed a model to analyze the problem quantitatively. The source data of the model are derived from the questionnaire.
A further difficulty is that the factors that might affect public attitude are numerous and complicated, which makes an orderly analysis hard. Our other task is therefore to classify the factors so that factors sharing a certain primary character fall into one group.
Classification method and Evaluation system
To find factors from various perspectives and fields that might affect public attitude towards the genetic pollution problem, we searched many articles and recorded all the potential factors. After integration, we obtained 26 factors that may affect public attitude and attention towards genetic pollution.
First, we distributed questionnaires to four kinds of groups in society: 1) governmental leaders, 2) business decision makers, 3) the general public, and 4) researchers. In the questionnaire we asked them to grade the importance of the 26 potential factors for public attitude.
Secondly, we collect these data and use cluster analysis to analyze the grades that all the survey respondents gave to these factors. Principal component analysis can separate the 26 factors into groups. We can also find inherent connections among the factors within the same group and consequently find the significance of each group. After the groups are separated, the evaluation system is constructed. The final evaluation system is shown in Figure 1.
With such a system, we can learn which factors each of the four groups regards as decisive for public attitude, and thus which factors or groups of factors are treated as important in determining public attitude towards genetic pollution. This helps identify the factor or group that all four social groups commonly regard as most important. Such factors or groups deserve the most effort when drawing public attention to the problem, because energy and time devoted to them will probably meet the least resistance.
The mathematical method: PCA methods
All the data are obtained from the questionnaire, and the relative values are transformed through a comparison matrix. All other calculations follow the standard procedure of the PCA method.
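As a rough illustration of how PCA could separate questionnaire factors into groups, the sketch below assigns each factor to the principal component on which it has the largest absolute loading. This grouping rule is an assumption chosen for simplicity, not necessarily the procedure we will use, and the grades are synthetic (six factors instead of 26) because the real survey data are not yet available. The function name `group_factors` is hypothetical.

```python
import numpy as np

def group_factors(grades, n_components):
    """Assign each factor (column) to the component on which it loads most strongly."""
    # Correlation matrix of the factors (columns are variables).
    correlation = np.corrcoef(grades, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(correlation)
    # eigh returns ascending eigenvalues; keep the largest n_components.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    # Loadings: eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])
    groups = np.abs(loadings).argmax(axis=1)
    return groups, loadings

# Synthetic grades: 500 respondents, 6 factors.
# Factors 0-3 follow one hidden attitude, factors 4-5 follow another.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
grades = latent[:, [0, 0, 0, 0, 1, 1]] + 0.3 * rng.normal(size=(500, 6))

groups, loadings = group_factors(grades, n_components=2)
print(groups)
```

With real data the components are usually less clearly separated than in this synthetic case, so the resulting groups would need to be checked against the meaning of the factors.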
Result and Discussion
The data, especially those from governmental leaders, are difficult to obtain. As a result, we cannot yet carry out the analysis, and we are still searching for appropriate samples from the group of governmental leaders in order to collect data.
Future Conclusion
Once the data are available, we can obtain a comprehensive evaluation system for the factors that determine public attitude and attention towards genetic pollution. We can also learn which factors are influential and which have only a minute influence. This helps identify the factor or group that all four social groups commonly regard as most important. Such factors or groups deserve the most effort when drawing public attention to the problem, because energy and time devoted to them will probably meet the least resistance. We can also avoid the factors to which none of the four social groups pays much attention.
4. Determination of the Optimum propaganda method
Background and problem
Propaganda about genetic pollution is unquestionably an important means of preventing genetic pollution and protecting against it. However, there are many propaganda methods with various properties to choose from, so we need to search for the optimum one. A further difficulty is that the optimum propaganda method is hard to determine because many properties of each method have to be taken into account. We therefore also need to establish a system that evaluates propaganda methods through the eyes of the public.
Our evaluation system
In evaluating a propaganda method, at least three perspectives should be considered: cost, speed, and efficiency. However, these three perspectives are too abstract to evaluate quantitatively. We therefore established a modeling system that evaluates propaganda effects through many child evaluation indexes that are measurable and specific, in order to determine the overall performance of each propaganda method. The candidates are: newspaper, magazine and journal, leaflet, TV advertisement, online picture, online video, lecture, broadcast, and oral communication. The system is shown in Figure 1.
The inner operating principles: AHP
All the data are obtained from the questionnaire, and the relative values are transformed through a comparison matrix. All other calculations follow the standard procedure of the AHP method.
Result and discussion
- The overall performance grades of the nine propaganda methods are listed in the following figure, on a scale from 0 to 10.
- The performance grades for propagation effect of the nine propaganda methods are listed in the following figure, on a scale from 0 to 10.
- The performance grades for propagation cost of the nine propaganda methods are listed in the following figure, on a scale from 0 to 10.
Conclusion
The calculation result table is shown below.
This is the result of an assessment of different ways of propaganda. We can see that online video is the most effective way, because of the wide coverage of the internet and the vividness of video materials. Comparing the ways with high scores, we can find the advantages that effective ways share.
- Using attractive methods, such as internet, video, lecture etc.
- Changing from passive reception to active participation, as in lectures and oral communication. When people participate actively, they form a deeper impression, which makes the propaganda more effective.
- Novelty and attractiveness. People are more easily attracted by vivid and interesting information.
There are also some common characteristics that mark humdrum propaganda, such as small coverage and hackneyed design.
5. Universal Risk evaluation model of Environmental problem
Background and Problem
Nowadays, a variety of environmental concerns such as air and water pollution have surfaced as a result of progress in technology. The diversity of pollution makes it difficult to judge the seriousness of the many problems. The public calls for a universal, “valid everywhere” risk analysis model. Through risk evaluation with this systematic model, we can learn the hazard potential of any environmental problem in all aspects. With this system, we can also compare unfamiliar problems with well-known ones to understand how they perform from various perspectives.
Our evaluation system
In assessing problems from divergent fields, a comprehensive evaluation must contain all related, specific, and measurable indexes. We carried out an extensive literature review to make sure that our evaluation system covers all related and useful perspectives, and on that basis established an overall, systematic evaluation system. The frame diagram of this system is shown in Figure 1. Because the system is shaped like a tree, it is also called an evaluation tree.
To test the system and to gauge how dangerous genetic pollution is, we included land desertification, the greenhouse effect, and water eutrophication together with genetic pollution, and compared the relative performance of the four environmental problems in different aspects.
How to use the system
A questionnaire survey should be conducted to collect data on public attitude towards the four problems. The evaluation system is easy to use. We only need to rank the four problems under evaluation from the different perspectives at the leaf nodes (the nodes that have no child nodes) of the evaluation tree. After collecting the data, some basic calculation gives the evaluation result for these problems at each layer of the evaluation index. These results help us understand the problems' performance at every level and from every perspective.
The inner operating principles: AHP
All the data are obtained from the questionnaire, and the relative values are transformed through a comparison matrix. All other calculations follow the standard procedure of the AHP method.
Result and Discussion
- The overall performance grade of four environmental problems
- The performance grade of four environmental problems from the economic perspective
- The performance grade of four environmental problems from the social perspective
- The performance grade of four environmental problems from the environmental perspective
- The performance grade of four environmental problems from the prevention and cure perspective
- The performance grade of four environmental problems from the public life and health perspective
Conclusion
Finally, we have established a universal evaluation system that can assess various environmental problems, especially newly emerging ones such as genetic pollution. This system can also help the public understand a new problem better from various perspectives by comparing it with familiar problems.
Analytical hierarchy process (AHP)
What is AHP?
The analytic hierarchy process (AHP) is a structured technique for organizing and analyzing complex decisions. Based on mathematics and psychology, it was developed by Thomas L. Saaty in the 1970s and has been extensively studied and refined since then.
It has particular application in group decision making, and is used around the world in a wide variety of decision situations, in fields such as government, business, industry, healthcare, and education.
Rather than prescribing a "correct" decision, the AHP helps decision makers find one that best suits their goal and their understanding of the problem. It provides a comprehensive and rational framework for structuring a decision problem, for representing and quantifying its elements, for relating those elements to overall goals, and for evaluating alternative solutions.
Is there any application in real life?
While it can be used by individuals working on straightforward decisions, the Analytic Hierarchy Process (AHP) is most useful where teams of people are working on complex problems, especially those with high stakes, involving human perceptions and judgments, whose resolutions have long-term repercussions. It has unique advantages when important elements of the decision are difficult to quantify or compare, or where communication among team members is impeded by their different specializations, terminologies, or perspectives. Decision situations to which the AHP can be applied include:
- Choice - The selection of one alternative from a given set of alternatives, usually where there are multiple decision criteria involved.
- Ranking - Putting a set of alternatives in order from most to least desirable
- Prioritization - Determining the relative merit of members of a set of alternatives, as opposed to selecting a single one or merely ranking them
- Resource allocation - Apportioning resources among a set of alternatives
- Benchmarking - Comparing the processes in one's own organization with those of other best-of-breed organizations
- Quality management - Dealing with the multidimensional aspects of quality and quality improvement
- Conflict resolution - Settling disputes between parties with apparently incompatible goals or positions
The AHP procedure
Users of the AHP first decompose their decision problem into a hierarchy of more easily comprehended sub-problems, each of which can be analyzed independently. The elements of the hierarchy can relate to any aspect of the decision problem—tangible or intangible, carefully measured or roughly estimated, well- or poorly-understood—anything at all that applies to the decision at hand.
Once the hierarchy is built, the decision makers systematically evaluate its various elements by comparing them to one another two at a time, with respect to their impact on an element above them in the hierarchy. In making the comparisons, the decision makers can use concrete data about the elements, but they typically use their judgments about the elements' relative meaning and importance. It is the essence of the AHP that human judgments, and not just the underlying information, can be used in performing the evaluations.
The AHP converts these evaluations to numerical values that can be processed and compared over the entire range of the problem. A numerical weight or priority is derived for each element of the hierarchy, allowing diverse and often incommensurable elements to be compared to one another in a rational and consistent way. This capability distinguishes the AHP from other decision making techniques.
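One common way to derive such numerical weights is to take the principal eigenvector of the pairwise comparison matrix and to check the judgments with Saaty's consistency ratio. The sketch below uses plain power iteration. The 3 by 3 comparison matrix is a hypothetical example, not data from our questionnaires, and the function names are assumptions.

```python
# Saaty's random consistency index for matrices of order 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_priorities(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison matrix."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]  # keep the weights summing to 1
    # Estimate the principal eigenvalue from the ratio (A w)_i / w_i.
    product = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    return weights, lambda_max

def consistency_ratio(lambda_max, n):
    """Consistency index divided by the random index for order n."""
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]

# Hypothetical judgments: criterion 1 is 3 times as important as criterion 2
# and 5 times as important as criterion 3; criterion 2 is twice criterion 3.
comparison = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights, lambda_max = ahp_priorities(comparison)
ratio = consistency_ratio(lambda_max, len(comparison))
print([round(w, 3) for w in weights], round(ratio, 4))
```

A consistency ratio below about 0.1 is conventionally treated as acceptable. Larger values suggest that the pairwise judgments should be revisited.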
In the final step of the process, numerical priorities are calculated for each of the decision alternatives. These numbers represent the alternatives' relative ability to achieve the decision goal, so they allow a straightforward consideration of the various courses of action.
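This final step can be sketched as a weighted sum: each alternative's priority under a criterion is multiplied by that criterion's weight, and the products are added. The criteria, alternatives, and numbers below are hypothetical placeholders, and `synthesize` is an assumed helper name.

```python
def synthesize(criterion_weights, local_priorities):
    """Combine per-criterion priorities into one overall priority per alternative."""
    alternatives = next(iter(local_priorities.values())).keys()
    return {
        alternative: sum(criterion_weights[criterion] * local_priorities[criterion][alternative]
                         for criterion in criterion_weights)
        for alternative in alternatives
    }

# Hypothetical criterion weights (sum to 1).
criterion_weights = {"cost": 0.5, "speed": 0.3, "effect": 0.2}

# Hypothetical priorities of alternatives A, B, C under each criterion (each row sums to 1).
local_priorities = {
    "cost":   {"A": 0.2, "B": 0.5, "C": 0.3},
    "speed":  {"A": 0.6, "B": 0.1, "C": 0.3},
    "effect": {"A": 0.5, "B": 0.2, "C": 0.3},
}

global_priorities = synthesize(criterion_weights, local_priorities)
ranking = sorted(global_priorities, key=global_priorities.get, reverse=True)
print(ranking)
```

Since both the criterion weights and the priorities under each criterion sum to 1, the overall priorities also sum to 1 and can be read as shares.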
Several firms supply computer software to assist in using the process.
Calculation result through AHP method
- iGEM Project evaluation model
- Determination of the Optimum propaganda method
- Universal Risk evaluation model of Environmental problem
Principal component analysis (PCA)
What is the problem?
In assessing the main ways in which genetic pollution affects the world, we can analyze it from different perspectives, but the evaluation indexes are too numerous and complicated.
To make the assessment easier, we need to classify all these evaluation indexes into independent types according to their inherent characteristics. This is also called dimensionality reduction, and it simplifies our problem into a few types. Through classification, we can also evaluate our projects from different independent perspectives. What is more, this type of classification can also provide some basic knowledge about the classes themselves.
After the classification, we can analyze the problem from the main perspectives, which avoids the trouble of assessing the same problem with too many indexes, and we can also learn how a problem performs from each single perspective.
To accomplish this task, we use the principal component analysis (PCA) method.
What is principal component analysis?
Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT), the Hotelling transform or proper orthogonal decomposition (POD).
PCA was invented in 1901 by Karl Pearson. Now it is mostly used as a tool in exploratory data analysis and for making predictive models. PCA can be done by eigenvalue decomposition of a data covariance (or correlation) matrix or singular value decomposition of a data matrix, usually after mean centering (and normalizing or using Z-scores) the data matrix for each attribute. The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score).
PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture, a "shadow" of this object when viewed from its (in some sense) most informative viewpoint. This is done by using only the first few principal components so that the dimensionality of the transformed data is reduced.
PCA is closely related to factor analysis. Factor analysis typically incorporates more domain specific assumptions about the underlying structure and solves eigenvectors of a slightly different matrix.
How to use PCA?
Principal component analysis (PCA) involves a mathematical procedure that transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
- Determine the Objectives
Generally, PCA has the following two objectives:
- To discover or to reduce the dimensionality of the data set.
- To identify new meaningful underlying variables.
- How to start?
We assume that the multi-dimensional data have been collected in a Table Of Real data matrix, in which the rows are associated with the cases and the columns with the variables.
Traditionally, principal component analysis is performed on the symmetric Covariance matrix or on the symmetric Correlation matrix. These matrices can be calculated from the data matrix. The covariance matrix contains scaled sums of squares and cross products. A correlation matrix is like a covariance matrix but first the variables, i.e. the columns, have been standardized. We will have to standardize the data first if the variances of variables differ much, or if the units of measurement of the variables differ. You can standardize the data in the Table Of Real by choosing Standardize columns.
To perform the analysis, we select the Table Of Real data matrix in the list of objects and choose To PCA. This results in a new PCA object in the list of objects.
We can now make a scree plot of the eigenvalues, Draw eigenvalues... to get an indication of the importance of each eigenvalue. The exact contribution of each eigenvalue (or a range of eigenvalues) to the "explained variance" can also be queried: Get fraction variance accounted for.... You might also check for the equality of a number of eigenvalues: Get equality of eigenvalues....
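The relation between standardizing the columns and using the correlation matrix, and the fraction of variance attached to each eigenvalue, can be checked numerically. The following is a NumPy sketch rather than the menu commands of the software described above, and the data matrix is a hypothetical example with columns in very different units.

```python
import numpy as np

# Hypothetical data matrix: rows are cases, columns are variables
# measured in very different units.
data = np.array([
    [1.70, 65000.0, 3.0],
    [1.82, 72000.0, 5.0],
    [1.65, 58000.0, 2.0],
    [1.75, 80000.0, 7.0],
    [1.90, 61000.0, 4.0],
])

# Standardize each column: subtract the mean, divide by the sample standard deviation.
standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

cov_raw = np.cov(data, rowvar=False)                    # dominated by the large-unit column
cov_standardized = np.cov(standardized, rowvar=False)   # equals the correlation matrix
corr_raw = np.corrcoef(data, rowvar=False)

# Eigenvalues of the correlation matrix in descending order,
# and the fraction of the total variance each one accounts for.
eigenvalues = np.linalg.eigvalsh(corr_raw)[::-1]
fraction_explained = eigenvalues / eigenvalues.sum()

print(np.diag(cov_raw))
print(fraction_explained)
```

Without standardization the second column would dominate the covariance matrix simply because of its units, which is why the correlation matrix is preferred when variances differ much.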
- Determining the number of components
There are two methods to help you to choose the number of components. Both methods are based on relations between the eigenvalues.
- Plot the eigenvalues, Draw eigenvalues.... If the points on the graph tend to level out (show an "elbow"), these eigenvalues are usually close enough to zero that they can be ignored.
- Limit the number of components to that number that accounts for a certain fraction of the total variance. For example, if you are satisfied with 95% of the total variance explained then use the number you get by the query Get number of components (VAF)... 0.95.
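The second method amounts to finding the smallest number of leading eigenvalues whose cumulative share of the total reaches the chosen fraction. A minimal NumPy sketch follows. The function name `n_components_for_vaf` and the eigenvalues are hypothetical, and the code is not the query command mentioned above.

```python
import numpy as np

def n_components_for_vaf(eigenvalues, fraction):
    """Smallest number of components whose variance accounted for reaches `fraction`."""
    values = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending order
    cumulative = np.cumsum(values) / values.sum()
    # Index of the first cumulative share that is >= fraction, converted to a count.
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(values))  # guard against rounding when fraction is 1.0

# Hypothetical eigenvalues; cumulative shares are 0.50, 0.80, 0.95, 0.99, 1.00.
eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]

print(n_components_for_vaf(eigenvalues, 0.90))
```

The scree plot of the first method shows the same eigenvalues graphically, so the two methods can be used together as a cross-check.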
- How to get the principal components
Principal components are obtained by projecting the multivariate data vectors on the space spanned by the eigenvectors. This can be done in two ways:
- Directly from the Table Of Real without first forming a PCA object: To Configuration (pca).... You can then draw the Configuration or display its numbers.
- Select a PCA and a Table Of Real object together and choose To Configuration.... In this way you project the TableOfReal onto the PCA's eigenspace.
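In matrix terms, the projection is a product of the centered data matrix with the selected eigenvectors. The NumPy sketch below illustrates this with synthetic data. It is not the implementation used by the software described above, and the function name is an assumption.

```python
import numpy as np

def project_onto_components(data, n_components):
    """Project centered data onto the eigenvectors with the largest eigenvalues."""
    centered = data - data.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending order
    order = np.argsort(eigenvalues)[::-1][:n_components]
    scores = centered @ eigenvectors[:, order]
    return scores, eigenvalues[order]

# Synthetic correlated data: 100 cases, 3 variables.
rng = np.random.default_rng(1)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
data = rng.normal(size=(100, 3)) @ mixing

scores, variances = project_onto_components(data, n_components=2)
print(scores.shape)
print(np.cov(scores, rowvar=False))
```

The covariance matrix of the scores is diagonal, with the selected eigenvalues on the diagonal, which is the sense in which the principal components are uncorrelated.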
- Mathematical background on principal component analysis
The mathematical technique used in PCA is called eigen analysis: we solve for the eigenvalues and eigenvectors of a square symmetric matrix with sums of squares and cross products. The eigenvector associated with the largest eigenvalue has the same direction as the first principal component. The eigenvector associated with the second largest eigenvalue determines the direction of the second principal component. The sum of the eigenvalues equals the trace of the square matrix and the maximum number of eigenvectors equals the number of rows (or columns) of this matrix.
- Algorithms
If our starting point happens to be a symmetric matrix like the covariance matrix, we solve for the eigenvalue and eigenvectors by first performing a Householder reduction to tridiagonal form, followed by the QL algorithm with implicit shifts.
If, conversely, our starting point is the data matrix A, we do not have to form explicitly the matrix with sums of squares and cross products, A′A. Instead, we proceed by a numerically more stable method, and form the singular value decomposition of A, U Σ V′. The matrix V then contains the eigenvectors, and the squared diagonal elements of Σ contain the eigenvalues.
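The equivalence of the two routes, and the statement that the eigenvalues sum to the trace, can be verified on a small example. The sketch below uses NumPy's general-purpose routines, which are not necessarily the Householder and QL implementation described above. The data matrix is synthetic.

```python
import numpy as np

# Synthetic data matrix A (50 cases, 4 variables), scaled so that the
# eigenvalues are well separated, then centered column by column.
rng = np.random.default_rng(2)
A = rng.normal(size=(50, 4)) * np.array([4.0, 3.0, 2.0, 1.0])
A = A - A.mean(axis=0)

# Route 1: singular value decomposition of A itself.
U, singular_values, Vt = np.linalg.svd(A, full_matrices=False)

# Route 2: eigen analysis of the matrix of sums of squares and cross products.
cross_products = A.T @ A
eigenvalues, eigenvectors = np.linalg.eigh(cross_products)
eigenvalues = eigenvalues[::-1]         # descending, to match the SVD ordering
eigenvectors = eigenvectors[:, ::-1]

print(np.allclose(singular_values ** 2, eigenvalues))
print(np.allclose(np.abs(Vt.T), np.abs(eigenvectors)))  # equal up to sign
print(np.isclose(eigenvalues.sum(), np.trace(cross_products)))
```

The eigenvectors are compared by absolute value because each one is only determined up to its sign.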