# DATA HANDLING LINKED TO MOLECULAR MEDICINE; Mathematical Question in biology

Coursework 1:

Using statistical methods for data analysis

A series of experimental results has been collected and should be analysed by statistical methods. Data sets will be provided, and the task is to:

(1)    Select adequate statistical methods to analyse the data,

(2)    Briefly explain why each method is reasonable and useful for the problem,

(3)    Perform the analysis,

(4)    Answer the question(s) associated with the problems, and

(5)    Draw adequate conclusions on the biological problems given.

The final results, and any conclusions that can be drawn from them, should be discussed briefly but conclusively. The report should focus on a clear description, rationale and presentation of the data analysis. This is not an essay, but it is useful to introduce the question and the main conclusions from the data. Discussions should be short and should clearly state which major conclusions can be drawn from the data and which cannot.

Calculations

•    Calculations can be done “by hand” using the appropriate formulas presented in the lectures/seminars. These can also be found in statistics textbooks.

•    Results should be reported to an appropriate number of decimal places.

•    Statistical analysis packages (such as SPSS) or online calculation programs may be used.

•    Raw data are also posted for easier use on Blackboard (Excel format)

(>BIO-M209 Coursework > Coursework 1 > File: Coursework 1_Supplementary Data_Raw Data)

•    Tables of critical values for different distributions (t-test, F-test, Chi-square test) can be found in most statistics textbooks and in freely available programs.

•    A selected list of useful programs and potentially needed tables is also available on Blackboard

(>BIO-M209 Coursework > Coursework 1 > Coursework 1_Supplementary Data_Tables.Formulas)

Presentation in general:

Do not simply present the calculations; explain the rationale for using each statistical method and draw adequate biological conclusions from the results. The report should state each result, its significance and any other relevant conclusion(s).

Part 1: Genetics and phenotype of a novel mouse mutant

Introduction: A novel mouse mutant and the corresponding protein, mKIAA058 (referred to below as ‘KIAA’), were identified and further analysed. To understand the functions of this gene/protein, a mouse strain carrying a non-functional allele (mKIAA058-; null allele) was successfully established. Initial data indicated autosomal recessive inheritance, as both heterozygous (KIAA+/-) and homozygous knockout (KIAA-/-) animals were identified. Deficiency of this protein (‘Knockout KIAA’) affects multiple tissues, causing skeletal defects (delayed/reduced development of bone and cartilage; growth retardation) as well as a progressive form of vascular degeneration. Later, a corresponding disease was found in a small number of very young human patients. The prognosis for these patients is not yet clear, and analysis of the mouse model may provide some indication of the severity of the disease. Analyses of the molecular mechanisms underlying the disease are still ongoing; some problems and experiments linked to these studies are given below. Statistical tests may be used to answer some of the questions.

Experiments 1 and 2: Genetics

To define potential effects of the presence/absence of the KIAA protein on inheritance patterns, a number of breeding crosses were performed. The genotypes of the parents were known, and the genotypes of the litters (age: 14-16 days) were determined by allele-specific PCR.

Experiment 1:    (Marks: 10)

In three parallel experiments (1.1, 1.2, 1.3), 5 wildtype males (KIAA+/+) were crossed with 10 heterozygous females (KIAA+/-). All litters were genotyped; Table 1 gives the total number of mice and the number of mice of each possible genotype. All tested mice appeared normal and showed no altered phenotype at the tested age (day 14-16).

Breeding: male(KIAA+/+) x female(KIAA+/- )

Table 1:

| | Experiment | 1.1 | 1.2 | 1.3 |
|---|---|---|---|---|
| Litters | Total number | 62 | 90 | 98 |
| Genotypes | KIAA+/+ | 34 | 37 | 42 |
| | KIAA+/- | 28 | 53 | 56 |
| | KIAA-/- | 0 | 0 | 0 |

Experiment 2:    (Marks: 20)

In three parallel experiments (2.1, 2.2, 2.3), 5 heterozygous males (KIAA+/-) were crossed with 10 heterozygous females (KIAA+/-) from experiment 1. All litters were again genotyped (age: day 14-16); Table 2 gives the number of mice of each possible genotype.

Breeding: male(KIAA+/-) x female(KIAA+/- )

Table 2:

| | Experiment | 2.1 | 2.2 | 2.3 | Sum (2.1-2.3) |
|---|---|---|---|---|---|
| Litters | Total | 71 | 84 | 59 | |
| Genotypes | KIAA+/+ | 21 | 23 | 18 | |
| | KIAA+/- | 39 | 46 | 35 | |
| | KIAA-/- | 11 | 15 | 6 | |

(A)    Discuss briefly which type of genetic inheritance best describes the distribution of genotypes.

(B)    Do the distributions of all genotypes (numbers of mice) in experiments 1 and 2 correspond to the numbers expected according to Mendelian laws?

(C)    Are there any differences between the numbers of wildtype and heterozygous mice? Are the overall numbers of genotypically normal mice (+/+ and +/-) versus knockout mice (-/-) significantly different from expectation?

(D)    Are any further conclusions possible? If yes, what could be possible biological explanations of the observed distribution(s)? Which other experiments would be needed?
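One possible approach to question (B) is a chi-square goodness-of-fit test against the Mendelian ratio expected for each cross. The following is a minimal sketch using only the Python standard library; the function name `chi_square_gof` and the choice of experiments 1.1 and 2.1 as examples are illustrative assumptions, not part of the coursework. The critical values are the standard 5% chi-square values for 1 and 2 degrees of freedom.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic for counts against an expected ratio.

    Returns (statistic, degrees_of_freedom).
    """
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

# Standard critical values of the chi-square distribution at alpha = 0.05.
CRITICAL_5PCT = {1: 3.841, 2: 5.991}

# Experiment 1.1 (+/+ x +/-): expected 1:1 for +/+ : +/-.
# The -/- class is left out because its expected count is zero for this cross.
stat1, df1 = chi_square_gof([34, 28], [1, 1])

# Experiment 2.1 (+/- x +/-): expected 1:2:1 for +/+ : +/- : -/-.
stat2, df2 = chi_square_gof([21, 39, 11], [1, 2, 1])

for label, stat, df in [("1.1", stat1, df1), ("2.1", stat2, df2)]:
    verdict = "deviates" if stat > CRITICAL_5PCT[df] else "does not deviate"
    print(f"Experiment {label}: chi2 = {stat:.3f}, df = {df}, "
          f"{verdict} significantly from the Mendelian ratio at 5%")
```

The same function can be reused for the remaining experiments, for the summed counts, or for question (C) by grouping the counts as (+/+ and +/-) versus (-/-) with an expected ratio of 3:1.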

Experiment 3: Bodyweight of various genotypes    (Marks: 30)

The lack of KIAA seems to affect various organ systems (see above). Therefore, a series of experiments was performed to characterise the growth of mice with various genotypes. Wildtype (KIAA+/+), heterozygous carriers (KIAA+/-) and homozygous knockouts (KIAA-/-) were analysed for their body weight (in grams, g) at three ages: 10-11 days, 20-22 days and 40-42 days. Only female animals were included in this analysis to avoid sex-specific variability.

The following tables contain the collected raw data.
(Raw data are available: >BIO-M209 >Courseworks >Coursework 1 >2014_Coursework 1_Supplementary Data_Raw data)
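Bodyweight (gram) by age and genotype:

| Day 10-11 wt | Day 10-11 (+/-) | Day 10-11 (-/-) | Day 20-22 wt | Day 20-22 (+/-) | Day 20-22 (-/-) | Day 40-42 wt | Day 40-42 (+/-) | Day 40-42 (-/-) |
|---|---|---|---|---|---|---|---|---|
| 6.5 | 6.1 | 5.7 | 16 | 16 | 14.3 | 19.8 | 17.1 | 14.3 |
| 6.9 | 6.1 | 6.3 | 14 | 15.1 | 15.7 | 18.2 | 17.2 | 15.9 |
| 5.4 | 6.9 | 5.8 | 13.9 | 14.7 | 13.9 | 18.8 | 17.9 | 15.1 |
| 6.6 | 5.6 | 6.5 | 14.7 | 14.8 | 14.7 | 19.1 | 17.9 | 16.1 |
| 6.2 | 6 | 5.6 | 16.2 | 13.7 | 15.4 | 17.8 | 17.8 | 15.6 |
| 5.6 | 5.6 | 5.6 | 15.2 | 15.5 | 12.8 | 18.9 | 18.4 | 14.8 |
| 6.3 | 6.3 | 5.8 | 14.2 | 15.7 | 13.5 | 16.9 | 14.4 | 14.8 |
| 6.2 | 6.4 | 6.5 | 15.8 | 15.2 | 14.1 | 17.3 | 18.7 | 13.1 |
| 6.9 | 6.1 | 6.1 | 16.2 | 13.4 | 13.7 | 17 | 16.9 | |
| 5.9 | 6.3 | | 16.2 | 15.9 | | | 17.2 | |
| 6 | 6.5 | | 15.3 | 13.8 | | | 18 | |
| 5.9 | 6.5 | | | 14.9 | | | 17.8 | |
| 6.1 | 6.6 | | | 16.4 | | | | |
| 6.5 | | | | 14.9 | | | | |
| 6 | | | | | | | | |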

(A)    Are there differences in the bodyweight of females between genotypes at the various ages? Are these differences significant? Which data sets can reasonably be compared? What conclusions can be drawn?

(B)    Which alternative methods could be used to analyse the data? What are the limitations of these methods? Are there additional tests that should be done?

(C)    Plot the data as bar diagrams, add adequate labelling and indicate statistically significant differences (e.g. if p<0.01).

(D)    What (further) conclusions may be drawn from these results? Discuss briefly in the context of the disease phenotype.
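One candidate method for comparing the three genotypes at a single age is a one-way analysis of variance (ANOVA); other choices, such as pairwise t-tests, are equally possible and are part of what question (B) asks about. The sketch below computes the F statistic by hand for the day 10-11 data, using only the Python standard library. The function name `one_way_anova` is an illustrative assumption. The resulting F value still has to be compared with the critical value from an F table for the reported degrees of freedom.

```python
from statistics import mean

def one_way_anova(groups):
    """One-way ANOVA by hand. Returns (F, df_between, df_within)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Bodyweight (g) at day 10-11, taken from the raw data table.
wt_day10 = [6.5, 6.9, 5.4, 6.6, 6.2, 5.6, 6.3, 6.2, 6.9, 5.9, 6.0, 5.9, 6.1, 6.5, 6.0]
het_day10 = [6.1, 6.1, 6.9, 5.6, 6.0, 5.6, 6.3, 6.4, 6.1, 6.3, 6.5, 6.5, 6.6]
ko_day10 = [5.7, 6.3, 5.8, 6.5, 5.6, 5.6, 5.8, 6.5, 6.1]

f_day10, df_b, df_w = one_way_anova([wt_day10, het_day10, ko_day10])
print(f"Day 10-11: F = {f_day10:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

ANOVA assumes roughly normal data with similar variances in each group, so a check of these assumptions (or a non-parametric alternative) may be worth discussing alongside the result.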

Experiment 4: Comparison to expected normal bodyweight    (Marks: 10)

The mouse strain used in experiments 1-3 has a C57BL/6 genetic background. Published data from long-term studies at the Jackson Laboratory, USA, show that female mice of this genetic background have a mean body weight of µ = 18.51 g with a standard deviation of s = 1.44 g at the age of about 40 days.

(A)    Does the bodyweight of females of any genotype (at 40 days) differ significantly from the previously known data?

(B)    What can be concluded from your result?
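If the published mean and standard deviation are treated as known population values (an assumption; the text denotes the standard deviation as s), a one-sample z-test is one way to compare a genotype with the reference. The sketch below applies it to the day 40-42 knockout (-/-) weights from the raw data table, using only the Python standard library. The function name `one_sample_z` is illustrative.

```python
from math import erfc, sqrt
from statistics import mean

MU_REFERENCE = 18.51     # published mean bodyweight (g) at about 40 days
SIGMA_REFERENCE = 1.44   # published standard deviation (g), assumed to be the population value

def one_sample_z(sample, mu=MU_REFERENCE, sigma=SIGMA_REFERENCE):
    """Return (z, two-sided p-value) for a sample mean against a known mu and sigma."""
    n = len(sample)
    z = (mean(sample) - mu) / (sigma / sqrt(n))
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = erfc(abs(z) / sqrt(2))
    return z, p_two_sided

# Bodyweight (g) of knockout (-/-) females at day 40-42.
ko_day40 = [14.3, 15.9, 15.1, 16.1, 15.6, 14.8, 14.8, 13.1]

mean_ko = mean(ko_day40)
z_ko, p_ko = one_sample_z(ko_day40)
print(f"mean = {mean_ko:.4f} g, z = {z_ko:.3f}, two-sided p = {p_ko:.2e}")
```

The same function can be applied to the wildtype and heterozygous samples. If the published standard deviation is instead regarded as a sample estimate, a t-based test would be the more cautious choice.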

Experiment 5: Genotyping    (Marks: 10)

During the initial breeding experiments, litters totalling 40 mice were genotyped, and 24 wildtype and 16 heterozygous (KIAA+/-) females were found. Because a faulty batch of clips (used for marking individual mice) had been used, over the next 3 weeks the clips either broke or were ripped off by the mice in the cage. As a result, the genotypes of individual mice were no longer known. As only a few heterozygotes were needed, a small number of mice were to be retested, marked and then kept individually in cages. To keep the number of tests low, the probability of identifying heterozygotes should be predicted.

(A)    If 4 mice are tested, what is the probability that all of them are heterozygotes?

(B)    If 3 mice are tested, what is the probability that none of them is a heterozygote?

(C)    How many mice must be selected so that the retested group contains at least 1 heterozygous animal with a probability of more than 95%?
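Because mice are drawn from a fixed group of 40 without being put back, one natural model is the hypergeometric distribution (this modelling choice is an assumption that the report should justify). The sketch below evaluates the three questions with `math.comb` from the Python standard library; the function names are illustrative.

```python
from math import comb

N_TOTAL = 40  # genotyped mice in total
N_HET = 16    # heterozygous (KIAA+/-) mice among them

def p_het_count(k, n, total=N_TOTAL, het=N_HET):
    """Hypergeometric probability of exactly k heterozygotes among n mice drawn without replacement."""
    return comb(het, k) * comb(total - het, n - k) / comb(total, n)

# (A) all 4 tested mice are heterozygotes
p_all_het_of_4 = p_het_count(4, 4)

# (B) none of 3 tested mice is a heterozygote
p_no_het_of_3 = p_het_count(0, 3)

# (C) smallest group size with P(at least 1 heterozygote) > threshold
def smallest_group(threshold=0.95):
    n = 1
    while 1 - p_het_count(0, n) <= threshold:
        n += 1
    return n

n_needed = smallest_group()
print(f"(A) {p_all_het_of_4:.4f}  (B) {p_no_het_of_3:.4f}  (C) n = {n_needed}")
```

Changing `threshold` shows how quickly the required group size grows when a higher level of certainty is demanded.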

Part 2: Sporadic diseases in a mouse colony    (Marks: 20)

Introduction: Recently, sporadic cases of an unusual (potentially infectious?) eye disease (associated with blindness, itching and bleeding in older animals) were detected in individual mice in the animal house population, including the Anxa5 strain. At present, the various culture rooms hold 40 racks (= shelves holding cages), each with up to 20 mouse cages. As individual cages contain up to 5-6 mice, each rack may host more than 100 mice. Over a period of 6 months (01/2014-06/2014), the mice were routinely checked and the number of cages with at least 1 affected mouse was noted.

Due to space limitations, some racks have to be moved to a clean culture room. The task is to define numbers of cages that (1) guarantee the survival of the strain (i.e. a sufficient number of mice for breeding) and (2) minimize the risk of spreading the disease.

Use an adequate statistical test to calculate the risks of moving infected cages and carrying over the disease. Briefly explain your strategy and the reasons for selecting a calculation method.

Data table:

The table below gives the number of infected cages per rack. Each rack may contain up to 20 cages. Cages identified as ‘infected cages’ contain at least 1 infected mouse. The dates of testing are indicated.

The table of affected cages is also found in the supplementary data:

(Raw data are available: >BIO-M209 >Courseworks >Coursework 1 >2014_Coursework 1_Supplementary Data_Raw data)

(A)    Which distribution is most adequate to describe the probability of events? Define the relevant criteria for your choice.

(B)    When 1 randomly selected rack is moved to a clean room, what is the probability that this rack contains no affected cages (with diseased mice)? What is the probability that 2 transferred racks both contain no cages with diseased animals?

(C)    What is the probability that an individual rack contains 1, 2, 3, 4 or 5 cages with affected animals?

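| Racks | 01/2014 | 02/2014 | 03/2014 | 04/2014 | 05/2014 | 06/2014 |
|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | – | – |
| 2 | 0 | 0 | 0 | 0 | 0 | – |
| 3 | 0 | 2 | 0 | – | – | – |
| 4 | 0 | 0 | 0 | 0 | – | – |
| 5 | 3 | 0 | 0 | 0 | – | – |
| 6 | 0 | 0 | 0 | 0 | – | – |
| 7 | 0 | 0 | 3 | 0 | – | – |
| 8 | 0 | 0 | 0 | 3 | – | – |
| 9 | 0 | 0 | 0 | 0 | 0 | – |
| 10 | 0 | 3 | 0 | 0 | 0 | – |
| 11 | 0 | 0 | 0 | 0 | 0 | – |
| 12 | 0 | – | 0 | 0 | 0 | – |
| 13 | 1 | – | 0 | 0 | 0 | – |
| 14 | 0 | – | – | 0 | 0 | – |
| 15 | 0 | – | 0 | 2 | 0 | – |
| 16 | 0 | – | – | 0 | 0 | – |
| 17 | 0 | – | – | 0 | 1 | – |
| 18 | 0 | 0 | – | 0 | 0 | – |
| 19 | 0 | 0 | 0 | 0 | 0 | – |
| 20 | 0 | 0 | 0 | 0 | 0 | – |
| 21 | 0 | 0 | 0 | 0 | 0 | – |
| 22 | 0 | 1 | 2 | 0 | 0 | – |
| 23 | 0 | 0 | 0 | 0 | 0 | – |
| 24 | 0 | 0 | – | 1 | 0 | – |
| 25 | 1 | 0 | – | 0 | 0 | – |
| 26 | 0 | 0 | – | 1 | 0 | – |
| 27 | 2 | 0 | 0 | 0 | 0 | – |
| 28 | 0 | 0 | 0 | 0 | 0 | – |
| 29 | 0 | 0 | 0 | 0 | 0 | – |
| 30 | 0 | 0 | 0 | 0 | 0 | – |
| 31 | 0 | 1 | 0 | 0 | 0 | – |
| 32 | 0 | 0 | – | 0 | 0 | – |
| 33 | 0 | 0 | – | 0 | 0 | – |
| 34 | – | 0 | – | 0 | 0 | – |
| 35 | – | 0 | – | 0 | 0 | – |
| 36 | – | 0 | – | 0 | – | – |
| 37 | – | 0 | 0 | 2 | 1 | – |
| 38 | 0 | 0 | 0 | 0 | 0 | – |
| 39 | 0 | 0 | 0 | 2 | 0 | – |
| 40 | 0 | 0 | 0 | 0 | 0 | – |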
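Since affected cages are rare events counted per rack, a Poisson distribution is one candidate model for questions (A) to (C); a binomial model with up to 20 cages per rack would be an alternative worth discussing. The sketch below estimates the mean number of affected cages per rack check from the table and evaluates the Poisson probabilities with the Python standard library. It rests on several assumptions that are not stated in the coursework: a “–” entry is treated as “not checked” and skipped, every rack check is treated as an independent observation, and all months are pooled. The names `parse_counts` and `poisson_pmf` are illustrative.

```python
from math import exp, factorial

# Affected cages per rack (rows: racks 1-40; columns: 01/2014 to 06/2014).
# "-" marks an entry without a count; it is skipped here (assumption).
RACK_COUNTS = """
0 0 0 0 - -
0 0 0 0 0 -
0 2 0 - - -
0 0 0 0 - -
3 0 0 0 - -
0 0 0 0 - -
0 0 3 0 - -
0 0 0 3 - -
0 0 0 0 0 -
0 3 0 0 0 -
0 0 0 0 0 -
0 - 0 0 0 -
1 - 0 0 0 -
0 - - 0 0 -
0 - 0 2 0 -
0 - - 0 0 -
0 - - 0 1 -
0 0 - 0 0 -
0 0 0 0 0 -
0 0 0 0 0 -
0 0 0 0 0 -
0 1 2 0 0 -
0 0 0 0 0 -
0 0 - 1 0 -
1 0 - 0 0 -
0 0 - 1 0 -
2 0 0 0 0 -
0 0 0 0 0 -
0 0 0 0 0 -
0 0 0 0 0 -
0 1 0 0 0 -
0 0 - 0 0 -
0 0 - 0 0 -
- 0 - 0 0 -
- 0 - 0 0 -
- 0 - 0 - -
- 0 0 2 1 -
0 0 0 0 0 -
0 0 0 2 0 -
0 0 0 0 0 -
"""

def parse_counts(text):
    """Collect all numeric entries, skipping the '-' placeholders."""
    return [int(tok) for line in text.strip().splitlines()
            for tok in line.split() if tok != "-"]

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean is lam."""
    return lam ** k * exp(-lam) / factorial(k)

counts = parse_counts(RACK_COUNTS)
lam = sum(counts) / len(counts)       # mean affected cages per rack check

p_zero_one_rack = poisson_pmf(0, lam)     # (B) one rack without affected cages
p_zero_two_racks = p_zero_one_rack ** 2   # (B) two independent racks, both clean
p_one_to_five = {k: poisson_pmf(k, lam) for k in range(1, 6)}  # (C)

print(f"lambda = {lam:.4f} from {len(counts)} rack checks")
print(f"P(0 in 1 rack) = {p_zero_one_rack:.4f}, P(0 in 2 racks) = {p_zero_two_racks:.4f}")
for k, p in p_one_to_five.items():
    print(f"P({k} affected cages) = {p:.5f}")
```

Whether pooling across months is appropriate, for example if the disease spreads over time, is a limitation that the report should address.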

Coursework 1:

Marking scheme

| Part | Experiment | | Max. Marks |
|---|---|---|---|
| Part 1 | Experiment 1 | Genetics | 10 |
| | Experiment 2 | Genetics | 20 |
| | Experiment 3 | Bodyweight | 30 |
| | Experiment 4 | Bodyweight | 10 |
| | Experiment 5 | Genotypes | 10 |
| Part 2 | – | Sporadic disease | 20 |
| Total | | | 100 |
