Welcome to Swayam. This course is called Biostatistics and Mathematical Biology. Thank you for choosing
this web course. I welcome you all, and I hope you will have a great time
attending it.
First of all, let me tell you why you would want to learn this subject and what benefits
you are going to get out of this course. In other words, why study
Biostatistics and Mathematical Biology? There are three main benefits that you
can gain by learning Biostatistics and Mathematical Biology if you are a biology student.
First of all, you can analyze your own data, which is very important, and draw meaningful
conclusions from it. The second is that learning biostatistics
will help you to read and understand primary research articles in your subject,
in your field.
For example, suppose you are reading a scientific article. This kind of article is
also called scientific literature, a primary research paper, or scholarly literature;
there are a lot of names for it. These are original research articles. If you read this kind
of original research article, it is very difficult to understand it
if you do not understand how the statistical operations in it are performed.
So, this course will help you to understand and decipher primary research articles.
And the third important point is that it helps you make informed decisions in your life. For example,
when purchasing an annuity or an insurance policy, you can be an informed customer and
buy things in an informed way. You can understand the risks associated with investing,
or the mathematical expectation in a lottery or in gambling, and so on. So, the course will not only help
biologists; it will also help with much of the general reasoning of everyday life.
There is a very famous quote by Ronald Aylmer Fisher, who is often called the father of modern statistics.
His famous statement is: "To call in the statistician after the experiment is done
may be no more than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of."
A common strategy for most biology students and investigators is to collaborate
with a statistician to analyze the data only after the data has been generated.
That is not the right way. So, my inspiration for starting this course is to
enable students to know how to design an experiment
properly and how to analyze the data, so that statistics comes into the picture
much earlier, during the design of the scientific experiment itself.
The duration of this course is fifteen weeks. It is a three-credit course, and in each of the fifteen
weeks we are going to cover two modules. That is
the structure of the course. The level of this course is postgraduate,
so for postgraduate students this will be of great help.
The transfer of credits is possible across UGC-recognized universities
all over India. So, you can take these credits as part of your ongoing postgraduate
program and earn the credits from this course, because this is
the UGC Swayam platform. The course is also absolutely free to take, so
there are many advantages to taking this MOOC.
The course may also be taken by anyone, irrespective of educational background,
for understanding probability and statistics. So, as part of your lifelong learning, you
can take this course. There are no formal prerequisites for taking it.
Let me explain the course to you. The course in one sentence is
a non-mathematical, intuitive introduction to mathematics and statistics for biologists.
The target audience is biologists, and it is a non-mathematical and
intuitive introduction to the disciplines of statistics and mathematical biology.
The target audience, as I told you, is postgraduate-level students of the sciences with limited or
no background in mathematics other than high school mathematics. I do not
expect you to have a strong background in mathematics, so we will start from ground zero.
So take it easy; the course is going to be a lot of fun. You can take this
course without any problem. It is suitable for postgraduate-level
students in science and medicine.
As I am not looking for any special mathematical background from you, I am
going to make it as simple as possible. The course is going to be very
simple and very non-technical, and it will be great for those students who have
never been exposed to college mathematics, for example.
The prerequisite for the course is nothing but a basic understanding of
high school mathematics. A degree in the sciences would be beneficial, because most
of the examples that I am going to cover in this course will be from the sciences,
especially from biology. So the course is targeted at biological sciences
students, biomedical students, and medical students.
Now, coming to the course objectives.
The first objective is to introduce the basic concepts of probability, statistics, and
statistical hypothesis testing to students of biology.
The next is to elaborate how
to interpret the statistical results of a published study, that is, how to read and
understand the published literature. So, you can
go through published papers in a much more informed manner after taking
this course.
Another objective is choosing the right statistical test for the scientific problem and interpreting the
research. You will have a specific scientific problem at hand, so how do you
choose and perform the statistical test that best suits your requirements?
This course will enable you to do exactly that. It will also sensitize students
to the various statistical pitfalls to avoid, which will be elaborated as part
of this course. And it will provide a brief framework of mathematical biology for students
of biology, so mathematical biology will also be introduced as part of this course.
The total number of modules in this course will be thirty, and each week we
are going to cover two modules. Each module in turn is subdivided into three
sections, and each section will consist of a video of around thirteen minutes and an e-text of around
a thousand words. I hope everybody will go through the whole course.
The weekly time commitment will be approximately three hours. Each week, I expect
you to commit around three hours, which means a total of eighty
minutes of video per week, plus eighty minutes of reading and problem solving,
plus twenty minutes
of assessment. So the overall time commitment for the entire
course is around forty-five hours, which includes twenty hours of video.
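The time figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch: the numbers come from the lecture, while the variable names are illustrative.

```python
# Figures quoted in the lecture; the variable names are illustrative.
WEEKS = 15
MODULES_PER_WEEK = 2
SECTIONS_PER_MODULE = 3
VIDEO_MIN_PER_SECTION = 13      # "around thirteen minutes" per section

# Weekly budget as stated: video + reading/problem solving + assessment.
video_min = 80
reading_min = 80
assessment_min = 20

weekly_min = video_min + reading_min + assessment_min
weekly_hours = weekly_min / 60
total_hours = WEEKS * weekly_hours
total_video_hours = WEEKS * video_min / 60

# Cross-check: six sections of about thirteen minutes is close to eighty minutes.
estimated_video_min = MODULES_PER_WEEK * SECTIONS_PER_MODULE * VIDEO_MIN_PER_SECTION

print(weekly_hours)         # 3.0
print(total_hours)          # 45.0
print(total_video_hours)    # 20.0
print(estimated_video_min)  # 78
```

The seventy-eight minute estimate agrees with the stated eighty minutes of video per week once the "around thirteen minutes" is read as approximate.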
Here is an example week. As you can see here, on Sunday I am going to release module
one, section one, with its video and e-text. On Monday I will release
section two of the same module, and the next day section three.
So on the third day, I am going to release an ungraded test for this module. Remember that there
are two types of tests, ungraded and graded. In the case of the ungraded tests, your marks will
not be counted towards your final performance.
Now, on Wednesday, we will start the second module with section one, then section
two, then section three, and then, on the same day, we are going to have an ungraded test
for that module. And on Saturday we are going to have the graded test. That
is how we are going to cover two modules each week.
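One way to picture the example week is as a small lookup table. This is a sketch under assumptions, not the official release calendar: it assumes the three sections of each module come out on consecutive days and that each ungraded test is released on the same day as section three. The names `WEEKLY_SCHEDULE` and `releases_on` are made up for illustration.

```python
# Assumed layout: sections on consecutive days, ungraded test with section three.
WEEKLY_SCHEDULE = {
    "Sunday": ["Module 1, Section 1"],
    "Monday": ["Module 1, Section 2"],
    "Tuesday": ["Module 1, Section 3", "Ungraded test (Module 1)"],
    "Wednesday": ["Module 2, Section 1"],
    "Thursday": ["Module 2, Section 2"],
    "Friday": ["Module 2, Section 3", "Ungraded test (Module 2)"],
    "Saturday": ["Graded test"],
}


def releases_on(day):
    """Return the items released on a given day (empty list if none)."""
    return WEEKLY_SCHEDULE.get(day, [])


# Count the video/e-text sections released in one week.
section_count = sum(
    1 for items in WEEKLY_SCHEDULE.values() for item in items if "Section" in item
)

print(releases_on("Tuesday"))
print(section_count)  # 6
```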
So let us go through the modules one by one. From the first week to
week five, we are going to cover ten modules. In the first week we are going
to cover the module entitled Biostatistics and Mathematical Biology: An Introduction, and
subsequently, in the same week, we are going to cover the Types of Studies.
In the second week we are going to cover Levels of Measurements and Summarizing the Data - The Tabular Presentation.
In the third week we are going to cover Summarizing the Data -
Graphical Presentation and Charting with Excel. Coming to the fourth week, we
are going to cover Descriptive Statistics - Point Estimates and then Interval Estimates.
And in the fifth week, we are going to cover Error Bars, Moments, Normality Test and Outliers.
Now coming to weeks six to ten. In the sixth week we are going to cover Concepts
of Population, Sample, Confidence Interval, and subsequently, in the same week, Statistical
Hypothesis Testing. In the seventh week we are going to cover Statistical Significance and P-Values,
and the Relationship between Confidence Intervals and Statistical Significance.
Subsequently, in the eighth week, we are going to cover Statistical Power and Choosing the
Right Sample Size, where I am going to elaborate how to choose the best sample size for your
data. Then the t-Distribution and the tests of significance based on the t-distribution
will be elaborated.
In the ninth week we are going to cover the F-distribution and the tests of significance
based on the F-distribution, and in the same week we are going to cover the Chi-squared distribution
and the tests of significance based on the Chi-squared distribution.
And finally, in the tenth week, we are going to cover Comparing Proportions, and
in the same week we are also going to cover the Gaussian, Binomial, Lognormal and Poisson Distributions.
So, different kinds of distributions will be elaborated in the tenth week.
In the eleventh week we are going to cover Pearson's Correlation and Simple Linear Regression, and
in the twelfth week we are going to cover Non-Linear Regression as well as Nonparametric Tests.
Then in the thirteenth week we are going to cover Permutations and Combinations, and
in the same week we are also going to cover Probability. In the fourteenth
week, we are going to cover Bayes' Theorem and Maximum Likelihood. In the same week
we are going to cover Statistics with MS Excel and GraphPad Prism, so these two
important software packages will be covered comprehensively in this course.
And finally, in the fifteenth week, the last week of this program, we are going to cover
Key Concepts of Statistics. This is a kind of summing-up of the whole course, so
we are going to cover the key takeaways from this course, and finally Statistical Pitfalls
to Avoid. Those are the main things that we are going
to cover in the fifteenth week.
We will be covering two of the most widely used software packages for biostatistical analysis.
The first one is Microsoft Excel, and the second one is GraphPad Prism.
Version seven is being used for this MOOC. So, let us first see Microsoft Excel.
I click here on the Microsoft Excel icon. Here you can see four groups: uranium, lead, arsenic, and mercury.
I will just show you how to perform a commonly used statistical analysis, ANOVA.
I click here on Data, then I click on Data Analysis, and I click on Anova: Single Factor. You
can see there are three types of ANOVA here: two factor with replication, two factor
without replication, and single factor. So, I select single factor. I tell you again,
don't worry, we are going to cover all about ANOVA later in this course.
I click first on Input Range and define the input range, which also includes the labels.
Since it contains labels, I tick Labels in First Row.
I select New Worksheet Ply and click OK. So we get the results
of the ANOVA single factor, which also show the P value. The obtained P
value is three point three six e minus zero six, which means three point three six multiplied
by ten to the power minus six. So this is the P value. And again, don't worry about it;
I will teach you how to interpret this P value. But this is how to perform a one-way ANOVA
in a nutshell.
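For readers who want to see what sits behind the single-factor ANOVA output, here is a minimal sketch of the F statistic in plain Python, together with a reading of the e-notation used for the P value. The function name `one_way_anova_f` and the three small groups are made up for illustration; they are not the uranium, lead, arsenic and mercury data shown in the lecture. Converting F into a P value needs the F-distribution, which is left out here.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up measurements for illustration only.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(round(f_stat, 2), df_b, df_w)  # 13.0 2 6

# Reading the e-notation from the spreadsheet: 3.36E-06 is 3.36 x 10^-6.
p_value = float("3.36e-06")
print(f"{p_value:.8f}")  # 0.00000336
```

A large F means the group means differ by more than the scatter within the groups would suggest, which is what a very small P value such as 3.36E-06 reflects.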
Now, let us see GraphPad Prism. Here is one example data sheet, where we have
group A and group B. These are nothing but the marks that the students got in MST one
and MST two. We will check the column statistics
for these two groups.
First I click here to highlight group A and group B. Then I go to
Insert at the top, new graph from the existing data, then I click here on the column statistics,
the scatter plot with bar. So this is the scatter plot with bar, or I can also click
on the scatter plot. I simply click on the scatter plot and click OK, and we
have got this scatter plot. We can see here MST one and MST two. Each dot represents
one data point, that is, the marks that one student
got. The Y axis shows the marks, while the X axis shows MST one and MST two. The middle
line is the average, while the plus and minus bars are the ninety-five percent confidence interval.
Again, don't worry; I will tell you all about this confidence interval
and how to calculate this ninety-five percent confidence interval, and so on.
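As a preview of that calculation, here is a minimal sketch of a mean with its ninety-five percent confidence interval, using only the Python standard library. The ten marks are made up for illustration and are not the MST data shown in the lecture. The critical value 2.262 is the standard two-sided ninety-five percent value of the t distribution for nine degrees of freedom, taken from a t table, and it is only valid for a sample of ten.

```python
import math
import statistics

# Made-up marks for ten students; not the data shown in the lecture.
marks = [12, 15, 11, 18, 14, 16, 13, 17, 15, 19]

n = len(marks)
mean = statistics.mean(marks)
sd = statistics.stdev(marks)   # sample standard deviation (divides by n - 1)
sem = sd / math.sqrt(n)        # standard error of the mean

# Two-sided 95% critical value of the t distribution with n - 1 = 9 degrees
# of freedom, taken from a standard t table (assumes n == 10).
T_CRIT_DF9 = 2.262

margin = T_CRIT_DF9 * sem
ci_low, ci_high = mean - margin, mean + margin

print(mean)                                # 15
print(round(ci_low, 2), round(ci_high, 2))  # 13.15 16.85
```

The middle line in the scatter plot corresponds to `mean`, and the plus and minus bars correspond to `ci_low` and `ci_high`.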
The course textbook that we are going to follow is Intuitive Biostatistics,
which is available in bookstores all around the country, or you can order it
online. You don't really need to buy this book, as we are going to cover
most of its contents and how to perform the operations
as outlined in the book. Anyway, this is our course textbook:
Intuitive Biostatistics: A Nonmathematical Guide to Statistical Thinking, by
Harvey Motulsky, published by Oxford University Press.
Coming to the assessment of this course,
you are going to earn twenty percent of the total credit from the online
tests, that is, the graded test that we are going to have each week.
So from those graded assignments and graded tests you are going to earn twenty percent
of the total score of this course, and the remaining eighty percent you will
earn through the proctored examination in select centers.
Most probably the centers will be decided later by the UGC Swayam platform. You will
have to go to that particular center and take the examination, on pen and
paper or computer-based; that will be decided later on. So eighty percent comes
through the proctored test, while twenty percent comes through the online examinations, which
you have total freedom to take.
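The twenty/eighty split can be written as a simple weighted sum. This is a sketch under assumptions: it assumes both components are reported as percentages on a 0 to 100 scale, and the function name `final_score` and the example marks are made up. How the platform actually combines the marks is not stated in the lecture.

```python
ONLINE_WEIGHT = 0.20      # weekly graded online tests
PROCTORED_WEIGHT = 0.80   # proctored examination at a center


def final_score(online_pct, proctored_pct):
    """Combine the two components, both assumed to be on a 0-100 scale."""
    return ONLINE_WEIGHT * online_pct + PROCTORED_WEIGHT * proctored_pct


# Hypothetical student: 90% in the online tests, 70% in the proctored exam.
example = final_score(online_pct=90, proctored_pct=70)
print(round(example, 1))  # 74.0
```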
There are several learning outcomes of this course. The first is to learn the scope
and application of the fields of biostatistics and mathematical biology.
The second is to learn the correct way to interpret data using tables as well
as diagrams. The third is to learn how to choose
the right test out of the repertoire of different statistical tests for the scientific
problem at hand, that is, how to choose the best statistical test for your scientific problem.
The fourth learning outcome of this
course is to learn how to interpret the statistical results of a published scientific study.
The fifth learning outcome is to learn how to
perform the commonly used descriptive and inferential statistical tests on scientific
data, and how to interpret that data correctly.
And finally, to learn how to perform commonly used statistical tests online and using
MS Excel, and also to learn about the statistical pitfalls to avoid. So there are several learning
outcomes of this course. Remember, this course is going to be as non-mathematical
and as non-technical as possible. It is going to be a cool course, and there is
absolutely no problem associated with it, other than a lot of fun. There are no special prerequisites
that I am looking for. Once again, thank you for
choosing the course, and a warm welcome. I suggest you interact with other
students and meet through the discussion forums.