You will be using a file (
BlackBoard\Enrolments\MScs\PYMOS1\survey1) containing much more of the
Workers Survey data you worked with last week. There are 50 cases and 21 variables, listed below.
Note: For all variables except daysabs, "0" denotes a missing value; for daysabs, "999" is the missing value code. Copy the file onto your drive.
subj: Subject identification number
ethnicgp: 1=White; 2=Asian; 3=Afro-Caribbean; 4=Other
sex: 1=male; 2=female
income: Annual income in £
age: Age in years
yrs: No of years working for this company
quest1-quest13: Answers to 13 questions about the person's job, each on a scale of 1-5
skill: 1=Unskilled; 2=Semi-skilled; 3=Fairly skilled; 4=Highly skilled
daysabs: No of days absent from work in past year
In SPSS, open the new data file. In the
Data Editor window, open the
Variable View tab. The new variables have no labels or missing values, etc., defined. You must define these variable characteristics in the same way as you did last session. To speed things up:
Copy from last session's file, if you saved it: You can select, copy, & paste whole rows in
Variable View, without changing the data, so long as you copy into the correct variables.
Check the list in Section 1: Which variables still need to be defined? In Variable View:
Missing value codes for all variables.
Variable labels are probably not necessary (unless you forget what the variable names mean...).
Value labels are not yet needed for the quest variables, but one other new variable does need them...
Set each variable's Measure (Scale, Ordinal, or Nominal), as appropriate.
Save the survey1.sav file in your drive, under a new name.
There are two basic types of variable: non-numeric (also "categorical" or "nominal") which are summarised in terms of the frequencies or numbers of cases that have particular values, & (truly) numeric ("ordinal" or "interval") variables, which can be summarised by statistics such as the mean or median.
Next, we explore data of these two types, then test some simple statistical hypotheses about them.
Exploring data is describing & examining them before analysis. Exploration is always advisable because it can help you to choose an appropriate method of analysis, or to avoid an inappropriate one. It also helps to interpret the results of a statistical analysis if you have previously examined the pattern of data (e.g., frequencies, means, variability, distributions).
We first look at frequency data. These commonly come from categorical variables, but can be used to summarise any discrete variable (e.g., an ordinal variable with a small number of possible values).
From the list of variables in Section 1, name one categorical variable
And name one discrete ordinal variable
Frequency tables can be one-way, showing the numbers of cases that have each value of a single variable, or two-way, showing the numbers of cases that have combinations of the values of two variables.
Two-way tables are known as contingency tables or cross-tabulations (cross-tabulations can also be three- or more-way). Two-way tables allow you to look for associations between the two variables. The analysis of one-way tables is discussed at the end of this section.
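The idea behind a cross-tabulation can be sketched outside SPSS. This minimal Python sketch (with made-up coded data, not the survey file) counts the cases for each combination of two coded variables:

```python
# Sketch (not SPSS): building a two-way frequency table (cross-tabulation)
# from raw coded data, using only the Python standard library.
from collections import Counter

# Hypothetical coded data: sex (1=male, 2=female) & ethnicgp (1-4), one per case
sex      = [1, 1, 2, 2, 1, 2, 1, 2, 1, 1]
ethnicgp = [1, 2, 1, 1, 1, 2, 3, 1, 1, 2]

# Count cases for each (row value, column value) combination
table = Counter(zip(sex, ethnicgp))

print(table[(1, 1)])  # males in ethnic group 1 -> 3
```

Each cell of the table is just the number of cases that share a particular pair of values, which is exactly what Crosstabs displays.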
Contingency table analysis is done with the
Crosstabs option. The
Crosstabs procedure allows you to tabulate and explore relationships between two or more discrete variables, two at a time. For example, you might ask for tables of
sex by ethnicgp, which would produce frequency tables for each combination of sex & ethnic group. The procedure also provides statistical tests, but first examine the frequencies.
Crosstabs allows you to display other descriptives, such as percentages & expected frequencies (i.e., the frequencies that would be expected under the null hypothesis of no association).
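The expected frequency for a cell is computed from the table's margins: its row total times its column total, divided by the grand total. A small Python sketch with made-up counts:

```python
# Sketch: expected counts under the null hypothesis of no association.
# For cell (r, c): expected = row_total[r] * column_total[c] / grand_total.
observed = [[15, 5],   # e.g. males:   group A, group B (made-up counts)
            [10, 20]]  # e.g. females: group A, group B

row_totals = [sum(row) for row in observed]        # [20, 30]
col_totals = [sum(col) for col in zip(*observed)]  # [25, 25]
n = sum(row_totals)                                # 50

expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)  # [[10.0, 10.0], [15.0, 15.0]]
```

Comparing these expected counts with the observed counts is the basis of the chi-square test below.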
From the main menu, pick
Descriptive Statistics | Crosstabs...
The dialogue box asks you for a row and column variable. Ignore the
Control variable for now.
Set up the dialogue box to show the frequencies of different ethnic groups within each sex. Choose which variable will make the rows and which the columns of the table (it doesn't matter which).
To obtain other descriptives, click
Cells. You see that
Observed counts (i.e., the actual frequencies) are selected by default. Select
Expected counts too.
Percentages can also be selected, for rows or columns. For example, Row percentages will show what percentage of cases in each row of the table come from each category of the other variable; e.g., if sex is the row variable, they will show what percentage of each sex come from each ethnic group.
Click OK, then examine the output. Note that the
Value Labels you defined appear in the table, making it easier to understand.
What does the
Case Processing Summary show you?
Look at the observed counts (ignore expected counts for the moment) & the percentages. Is there any evidence of association, & of what kind? (e.g., does one sex have a relatively higher proportion of a particular ethnic group?)
What do the expected counts show?
Compare the observed to the expected counts. What does this comparison tell you?
One ethnic group category is not very useful. Which one and why?
You should see that the males have a higher proportion of one ethnic group (which?) than the females do, or conversely that one ethnic group is relatively more likely to be male than the other ethnic groups. Also, for these 'more likely combinations' the observed count is larger than the expected count.
But is this apparent association significant, or could it arise from chance variation? This can be tested by means of two-way chi-square or similar tests such as Fisher's Exact (for small samples).
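The Pearson chi-square statistic sums, over all cells, the squared difference between observed and expected counts divided by the expected count. A sketch with made-up counts (SPSS also reports the p-value; here we only compare the statistic to the df=1 critical value, 3.84 at p=.05):

```python
# Sketch: Pearson chi-square statistic for a hypothetical 2x2 table.
observed = [[15, 5], [10, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# chi-square = sum over cells of (O - E)^2 / E
chi2 = sum((o - e) ** 2 / e
           for orow, erow in zip(observed, expected)
           for o, e in zip(orow, erow))
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi2, 2), df)  # 8.33 1 -> exceeds 3.84, so p < .05
```

For an r x c table the degrees of freedom are (r-1)(c-1), which is why a 2x2 table has df=1.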
Open Crosstabs again & click the Statistics button. A list of statistical tests is given: chi-square,
phi, etc. Note that some of the tests offered are not suitable here because they assume ordinal variables. To get more information about the various statistics, click
Help in the dialogue box.
The output shows a frequency table, followed by statistics and their p-values.
Is the association significant by the Pearson chi-square test? Find the chi-square, df, its p-value (
Asymp. Sig). This p-value is for a 2-tailed test, which does not assume any particular pattern of association: It will detect any pattern of deviation from H0.
Chi-square is not significant: What does that tell you about the data?
The footnote below the statistical output says that some cells have very low expected counts (less than 5). It is low expected, not observed, counts that matter. This warns you that chi-square is not really valid for these data, because you have split your 50 cases into too many cells with small expected counts. You could make the expected frequencies larger by combining categories, or omitting uninformative categories, so as to have fewer cells. You cannot do that with
sex, but you could with ethnicgp.
Which value of
ethnicgp might you omit?
If you were to combine ethnicgp values, which ones might you combine, & what would you call the resulting combined category?
Later you will learn how to omit ("filter out") cases with unwanted values.
Note: Imagine you want to do all the above separately for groups who have different skill levels. This is called using skill as a
control variable. The bottom section of the dialogue box is used to define a third control variable for that purpose. There are other options, e.g., to produce a
clustered bar chart of the cell frequencies in addition to, or instead of, the frequency table.
The Exact button is sometimes useful.
Fisher's Exact Test is like two-way chi-square but is more accurate when samples are small (less than ~20 cases).
SPSS automatically computes
Fisher's Exact, in addition to chi-square, when the table is 2x2. (It also computes a Yates-corrected chi-square for 2x2 tables. Some people - but not Howell or Field - believe that this is preferable to the basic chi-square.) So, for 2x2 tables, there is no need to click the Exact button.
If your table is larger (e.g., 2x4), but the sample is small, you can click
Exact to compute something similar to
Fisher's Exact test. Click
Help in the
Exact dialogue box for more information.
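Fisher's exact test can be computed directly from the hypergeometric distribution: with the table's margins fixed, sum the probabilities of all tables no more likely than the one observed. A self-contained Python sketch (made-up counts; SPSS computes this for you):

```python
# Sketch: two-tailed Fisher exact p-value for a 2x2 table [[a, b], [c, d]].
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Sum the hypergeometric probabilities of all tables (same margins)
    that are no more likely than the observed table."""
    n = a + b + c + d
    r1, c1 = a + b, a + c                  # fixed row/column margins
    def p(x):                              # P(first cell = x)
        return comb(r1, x) * comb(n - r1, c1 - x) / comb(n, c1)
    p_obs = p(a)
    lo, hi = max(0, c1 - (n - r1)), min(r1, c1)
    return sum(p(x) for x in range(lo, hi + 1) if p(x) <= p_obs + 1e-12)

# Made-up small-sample 2x2 table
print(round(fisher_exact_2x2(8, 2, 1, 5), 4))
```

Because it enumerates exact probabilities rather than relying on a large-sample approximation, this test remains valid when expected counts are small.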
We can compare the frequency distribution of a single discrete variable to some specified or expected pattern using
one-way chi-square. For example, we could test whether the distribution of ethnic groups in this sample differed from the proportions expected in the UK population generally; or whether the proportions of males and females differ significantly from specified proportions, e.g. 50:50.
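The one-way chi-square works the same way as the two-way version, except that the expected counts come from the specified proportions rather than from table margins. A sketch with hypothetical counts:

```python
# Sketch: one-way (goodness-of-fit) chi-square statistic, comparing observed
# category counts to expected proportions (hypothetical data).
observed = [30, 20]          # e.g. males, females in a sample of 50
proportions = [0.5, 0.5]     # null hypothesis: a 50:50 split

n = sum(observed)
expected = [p * n for p in proportions]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(chi2)  # 2.0 with df = 1; below 3.84, so not significant at p = .05
```

Here df = (number of categories - 1), since the counts must sum to N.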
One-way chi-square is found under
Nonparametric Tests |
Chi-Square. It is not very commonly used, so there is no need to try it now. If you want to try it, use the
This session & last, you used Frequencies, Descriptives, & Crosstabs under the
Descriptive Statistics menu to examine variables of different kinds.
EXPLORE is also under the
Descriptive Statistics menu. It provides a wide range of plots and descriptive & diagnostic statistics for distributions of numeric variables, including confidence intervals. It has other uses (e.g., normality testing) as you will see later.
Pick Descriptive Statistics | Explore. Put age & income into the
Dependent List. You will be examining the distributions of both these numeric variables.
Factor List box is used to split the sample (e.g., by sex) & examine distributions separately for the sub-samples. Do not put any variable into this box now. This means you will examine distributions for the whole sample.
Under Display, the default is
Both (statistics & plots), but start by selecting
Statistics. Click the Statistics button. The default is to provide
Descriptive Statistics &
95% Confidence Intervals. There are other options which you can try later, but do not use them yet: The
Descriptives are a long list, as you will see. Keep the default settings.
An important decision when analysing several variables at the same time is how to handle missing values. In
SPSS, this is usually controlled by the Options button. Click
Options & you see three choices. The first two are
Exclude cases listwise &
Exclude cases pairwise (ignore the third option for now). In the present case,
listwise means that any case which has a missing value on either age or income will be excluded from the analysis of both of them.
Pairwise means that cases with a missing value only on
age will be excluded from the analysis of
age but not from analysis of
income, & vice versa.
Here we are analysing age & income independently, so
Pairwise is better: Select that. Click OK.
How can you see from the
Case Processing Summary that the samples analysed for age & income are not identical?
(This is because you selected
Exclude missing values pairwise. If you had chosen
Listwise, the two samples would be identical.)
Extract the following information from the Descriptives table:
Variable | Mean | SEM | 95% Confidence interval | Skewness
Note that the CI limits are approximately 2 SE below & above the mean.
The CIs show you that the mean for each variable is significantly different (p<.05) from zero: Why?
Also note that both skewness scores are positive, and one variable (which one?) has a much stronger positive skew than the other. This will become clear when you look at plots of the distributions. We will look at skewness (and kurtosis) again in Section 8.
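The SEM and the "mean ± 2 SE" approximation to the 95% CI are easy to verify by hand. A Python sketch with hypothetical ages (the exact CI uses a t critical value rather than 2, but the result is close):

```python
# Sketch: mean, SEM, and an approximate 95% CI (mean +/- 2 SE), as in the
# Descriptives table (hypothetical data, not the survey file).
import statistics as st

ages = [22, 25, 31, 28, 40, 35, 29, 33, 27, 45]

mean = st.mean(ages)
sem = st.stdev(ages) / len(ages) ** 0.5     # SEM = SD / sqrt(N)
ci = (mean - 2 * sem, mean + 2 * sem)       # approximate 95% CI

print(round(mean, 1), round(sem, 2))        # 31.5 2.21
```

If this CI excludes zero, the mean is significantly different from zero at roughly p<.05, which is why the CIs in the Descriptives table answer that question directly.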
We will now inspect the plots. Re-open the
EXPLORE box. Under Display, select Plots, then click the Plots button to open the Plots dialogue box.
Boxplots are useful. Retain the default setting
Factor levels together. We have no "factor variable", so the plots will show the whole sample.
Under Descriptive, the default is
Stem-and-leaf. I suggest you select
Histogram. If you are curious, keep
Stem-and-leaf. If not, de-select it.
Do not click
Normality plots now: You will use it later. Click
OK. The output shows plots for
age (histogram, stem-and-leaf if requested, boxplot), & then the same plots for income.
Look at the histogram for age. How can you see that it has a positive skew?
The stem-and-leaf plot is a "horizontal histogram" showing individual scores. On the left of each row are the numbers of cases, in bands of 5 years. The right side of each row lists the values of individual cases. For example, the first row shows 4 cases in the first 5-year band (15-19 years, with ages 18, 18, 19, & 19). The second row shows cases in the second, 20-24 year band, & so on. Some people prefer these to histograms.
The Boxplot provides the most useful visual summary of distribution characteristics. The thick black line marks the
median, the box encloses the
first & third quartiles of the distribution, & the
whiskers show the highest & lowest individual scores that are not
outliers. Outliers were defined by Tukey in his book Exploratory Data Analysis. They are scores which are "outside the expected range" or "atypical". Sometimes one may decide to omit them from analysis.
The boxplot shows any
extremes (which are even more way-out scores than outliers) by special symbols,
x, labelled with their case numbers. If you don't know what any of the terms in bold means, the
Help will tell you. Note that they are all based on ranking of the scores, so they are suitable for either ordinal-level or interval-level data.
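Tukey's rule, which boxplots use, can be sketched numerically: scores more than 1.5 x IQR beyond the quartiles count as outliers, and more than 3 x IQR as extremes (hypothetical data below; SPSS's exact quartile method may differ slightly):

```python
# Sketch: Tukey fences for outliers & extremes (hypothetical skewed data).
import statistics as st

incomes = [12, 14, 15, 15, 16, 17, 18, 19, 20, 80]

q1, _, q3 = st.quantiles(incomes, n=4)   # first & third quartiles
iqr = q3 - q1
outliers = [x for x in incomes if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
extremes = [x for x in incomes if x < q1 - 3 * iqr or x > q3 + 3 * iqr]

print(outliers, extremes)  # [80] [80]
```

Because the fences are based on quartiles (ranks), not on the mean and SD, a single wild score cannot widen them and hide itself.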
Boxplot gives a clear impression of whether the distribution is symmetrical or not, &, if asymmetrical, where the scores are relatively heaped up or spread out.
How can you tell from the
age boxplot that the distribution is slightly non-symmetrical, having a slight positive skew?
Note that there are no outliers on
age: No scores fall far enough beyond the quartiles to count as atypical.
Look at the histogram of income. What makes it have such a strong positive skewness?
Look at the boxplot of income. How does the positive skew appear in this?
An outlier is shown in the boxplot - approximately what is its value of income?
(If you made a stem-and-leaf plot of income, this case will be listed below it as an Extreme.)
Note: There is another way to obtain information about "extreme" cases, using the
Descriptive Statistics. If you have time & want to try it, open the
EXPLORE dialogue box again. Under
Display, select Statistics, & click the
Statistics button. Remove the tick by
Descriptives, & instead select
Outliers. This produces a different kind of display which can sometimes be useful.
You can learn a lot about your data with
EXPLORE. It can suggest whether data are normally or non-normally distributed, & whether any cases may be atypical, so we can decide what to do about them.
EXPLORE can also carry out a statistical test of whether samples deviate from the normal distribution, see Section 8. This will provide a partial check on whether data meet the assumptions of the well-known parametric significance tests, e.g., Student's t-tests.
The familiar Student's t-test can be applied to independent samples (comparisons between 2 groups) or paired samples (within-subject comparisons). In addition, a one-sample t-test can be used to compare the mean of a single sample to a specified value.
The independent-samples t-test, as you probably know, is a parametric test which is based on certain assumptions, namely: (1) the observations are independent of one another; (2) the two groups have approximately equal variances (homogeneity of variance); (3) the data are approximately normally distributed within each group.
We can assume that 1 is true for the present data. Assumption 2 is automatically checked when you run the t-test (see Section 10). Assumption 3 can be tested with the
EXPLORE procedure, see Section 8.
The one-sample t-test only requires Assumption 3, that the data approximate a normal distribution within the single sample.
EXPLORE can of course be used to check that too. The paired-sample t-test is a little different, we will discuss it in Section 11.
In forthcoming sections you will be running procedures (
EXPLORE, & independent-samples t) where data from two groups,
Males & Females, are compared. It is helpful if the output displays value labels for the groups (e.g., Male & Female), not just values (e.g., "1" & "2"). To ensure that
SPSS does this: From the menu, select Edit | Options..., & then the
Output Labels tab.
You see two panels,
Outline Labeling &
Pivot Table Labeling. In each panel, the lower box is
Variable values in labels shown as.... If either of the lower text boxes says
Values, click the down-arrow on the right, & change the setting to
Values and Labels. Click OK.
See Field (2009), Chapter 5.4-5.5.
First, we check normality of the distribution of some variables in the whole sample, in preparation for carrying out a one-sample t-test on a variable. Then we will check normality for male & female sub-samples, in preparation for carrying out an independent-samples t-test to compare male & female means on a variable.
One way to check the shape of distributions is to test whether their skewness & kurtosis are significantly different from what's expected of a normal distribution. (skewness measures the asymmetry of a distribution; kurtosis measures how peaked or flat it is compared to a normal distribution.) We need to know skewness, kurtosis, & their SEs. Several
SPSS procedures provide these (e.g., it's an option in
FREQUENCIES which we used last week) but
EXPLORE does it automatically.
EXPLORE also offers more detailed tests of whether distributions deviate from normality.
Open the EXPLORE dialogue box again.
Insert some dependent variables. I suggest you try
age (which you previously examined in Section 5) &, for a change,
daysabs. Try a third one if you like.
You are examining the whole sample, so no
Factor variable is required.
For testing skewness & kurtosis you need
Statistics; for testing normality you need
Plots. So, in the
Display panel, select Both, & then click the
Plots button. In order to carry out the normality tests, select
Normality plots with tests. This will run the tests & draw some specialised plots.
It is also worth having simple plots (e.g.,
boxplots). You have already seen one for
age but not for
daysabs or any other new variables, so this option is useful. Click OK.
The output sequence depends on exactly what options you selected. First, the list of
Statistics; then the
Tests of Normality; then
Histogram (if any); then the specialised plots linked to the normality tests, called Normal Q-Q plots & De-trended Normal Q-Q plots; finally the
Boxplot (if any).
Look at the statistics, particularly at the skewness & kurtosis statistics, & SEs:
Name of variable | Skewness | SE of skewness | Kurtosis | SE of kurtosis
You can use the SEM to test the difference between a mean & an expected value. This is a (one-sample) t-test: The difference between the actual & expected means is divided by the SEM, & the result is a t-statistic. If t (either positive or negative) is greater than the critical value with df=(N-1), then the difference is significant.
You can do the same with skewness, kurtosis, & their standard errors. If a distribution is normal, its expected skewness & kurtosis are both 0. Divide the difference between skewness & 0 (i.e., the skewness value) by its SE, & you get a t-statistic which can be compared to a critical t-value with df=(N-1). An exact t-test is best, but if you don't have tables, or don't want to do detailed calculations, you can use the approximate rule that, if |t| ≥ 2, the skewness may be significantly different from 0. The same applies to kurtosis.
For your variable, is the value of (skewness / SE of skewness) ≥2?
Is the value of (kurtosis / SE of kurtosis)≥2?
If both answers are NO, the distribution of the variable is not grossly different in skewness or kurtosis from what you'd expect if the distribution were normal. If either answer is YES, there's a possible deviation from normality.
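The rule of thumb above can be sketched numerically. This Python sketch uses made-up data and the simple moment-based skewness formula with SE ≈ sqrt(6/N); SPSS's adjusted formulas differ slightly for small samples:

```python
# Sketch: skewness divided by its SE; |ratio| >= 2 flags possible skew.
from math import sqrt

data = [1, 1, 2, 2, 2, 3, 3, 4, 7, 10]     # hypothetical, positively skewed
n = len(data)
mean = sum(data) / n
m2 = sum((x - mean) ** 2 for x in data) / n  # second moment (variance)
m3 = sum((x - mean) ** 3 for x in data) / n  # third moment

skew = m3 / m2 ** 1.5                        # moment-based skewness
se_skew = sqrt(6 / n)                        # approximate SE of skewness
ratio = skew / se_skew

print(round(ratio, 2))                       # below 2 -> no strong evidence
```

The same logic applies to kurtosis, with SE ≈ sqrt(24/N).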
At the beginning of the
Plots output are the
Tests of Normality. Two tests are provided, Kolmogorov-Smirnov, and Shapiro-Wilk. They are both "goodness-of-fit" tests, which ask how closely the distributions fit a normal distribution. If a distribution is not close to normal (i.e. it deviates significantly from normal), the test will be significant. Conversely, if the distribution does not deviate from normality the test will be non-significant. The two tests do not always exactly agree with the simpler skewness and kurtosis tests, or with each other, but the results are usually similar. Like all statistical tests, the likelihood of significance depends on how large the samples are. (See Field (2009, p.147) for notes on the two tests.)
Examine the p-values, which are labelled
Sig. For each of your variables, check whether there is a significant deviation from normality by either test. Is there?
Which variable(s) if any would be OK (on this evidence) to analyse with a one-sample t-test?
Now look at your simple plots (
boxplots). They reflect the shape of the data distributions. Do not be too concerned with the details of the shapes: A distribution can look fairly bumpy, & yet not deviate significantly from normal, as you can tell by checking the statistical tests. But if any of your variables is highly deviant from normality, inspecting the plot will help show you why. It may also help to illustrate the skewness and kurtosis values.
(You can ignore the
Detrended Normal Q-Q plots for now; ask us (or check the
Help, or Field's book) if you want to know about them.)
Now use EXPLORE to test the normality of the distribution of a variable within two sub-samples. In Section 11 you will run t-tests to compare males & females on some numeric variables, so you want to check the distribution of these variables separately for males & females (testing Assumption 3 of the two-sample t-test, see Section 6).
In EXPLORE you do this by defining a
Factor variable which in this case is
sex; insert it in the
Factor List as shown above.
Choose one or more variables as
dependent variables - these are the ones for which the distributions will be checked in the male & female groups. Test some of the same variables that you have already tested in the whole sample.
In the Display panel, select Both. Click the
Plots button & ensure that
Normality plots with tests is selected. When comparing groups, the most useful display is
Factor levels together: This will produce male & female boxplots side-by-side.
Run the procedure, then examine the Tests of Normality. What are the results for each of your variables? Does any variable show significant deviation from normality within either group (males or females)?
Look at the boxplots. Are there outliers on any variable, and in which group, male or female?
Does it look from the boxplots as if the male & female groups differ in their central tendency on any variable (median, shown by thick black line)?
Does it look as if the groups differ in variability? (This is relevant to Assumption 2 of the t-test, Section 6.) Remember that the variability will be increased if there are outliers.
If you want to plot the means of males vs. females on a single variable & draw CIs or SE bars around the means...
Try the default setting, which makes bars that represent the 95% CI around each mean. (You can change this to a 99% CI by typing 99 in the
Drag the variable whose mean you want to plot on to the vertical axis, & drag
Sex on to the horizontal axis.
No need to click the
Error Bars tab, but if you do, it offers options to change the style of error bar, add labels, etc. Click OK.
The resulting display shows you whether one group's mean falls within the other group's CI, whether the CIs overlap, & whether the CI includes a specific value. These are useful descriptives, though only the last is a 'formal' statistical test.
If the 95% CIs of the means of two groups overlap, does this suggest the group means are significantly different, or not?
(To make SE bars: Open the same dialogue box again, click the down-arrow on
Bars Represent & select
Standard Error of Mean. Next to this is a
Multiplier equal to 2, meaning that the default is to plot bars that are 2 SE around the mean. More usually, bars show 1 SE, so change the
Multiplier to 1, & click OK.)
This enables you to see whether the SEMs overlap, which is often used as a crude visual test of difference. You should remember that this visual test is often not reliable...
We will use this test to compare the mean of the whole sample on some variable to a specified value.
Look back at Section 8. Which variable met (or approximately met) the assumption of normality for the whole sample?
The one-sample t-test allows you to test whether the mean of the normally distributed variable is significantly different from a specified value. In Section 5 you used the CIs to judge whether the mean differed from zero but that is not useful for the present data. Instead, test whether the mean differs from 34.5.
Compare Means | One-Sample T Test...
Make your chosen (normally distributed) variable the Test Variable. The
Test Value box specifies the value that the mean will be compared to. The default is 0. Replace this with 34.5 & run the test.
The output first shows the sample mean etc., then the t-test results.
Note the t-value, its df, & p-value (2-tailed Sig.) Is the result significant (i.e. is the mean significantly greater or less than 34.5, 2-tailed)?
Also shown are the
mean difference of 3.11 (difference between the sample mean, & the
Test Value of 34.5) & a 95% CI for the difference.
Is this mean difference significantly different (at the 5% level, 2-tailed) from zero?
The answer to these two questions will be the same, because
SPSS uses t when calculating CIs.
All the tests so far were 2-tailed. This means that the p-value reported is the total probability of results as extreme as this in both the upper & the lower tails of the t-distribution.
But suppose your hypothesis is 1-tailed: For example, that the sample mean is greater than 34.5 (i.e., you are only interested in the results that lie in the upper tail of the t-distribution). So, you are only interested in half the previous p-value: The upper-tail half. Therefore, provided that the obtained result actually does lie in the specified direction (i.e. above 34.5), the p-value for a 1-tailed test is half that reported for the 2-tailed test.
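The one-sample t statistic itself is easy to compute by hand, and the 1-tailed adjustment is just a halving of the 2-tailed p. A sketch with hypothetical ages (the p-value itself needs t-tables or software):

```python
# Sketch: one-sample t statistic against a test value (hypothetical data).
import statistics as st

ages = [30, 41, 35, 38, 44, 39, 36, 42, 33, 38]
test_value = 34.5

mean = st.mean(ages)
sem = st.stdev(ages) / len(ages) ** 0.5
t = (mean - test_value) / sem              # df = N - 1 = 9

print(round(t, 2))
# 1-tailed test that the mean is ABOVE test_value: if the sample mean lies
# above it, the 1-tailed p is half the 2-tailed p; otherwise it is 1 - p/2.
```

Note that the direction check comes first: halving the p-value is only legitimate when the result actually lies in the hypothesised direction.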
Does the sample mean lie in the specified direction (i.e., above the test value)?
If YES, what is the p-value for the one-tailed test?
Is the 1-tailed result significant (p<.05)?
This test allows you to compare the mean scores of two independent samples of subjects, e.g., males & females, on some dependent variable.
Compare Means |
Independent Samples T-Test...
age as the
Test Variable. From your analysis with
EXPLORE in Section 8, you should already know that it is suitable.
sex as the Grouping Variable. It appears as sex(? ?) because you have not told SPSS which pair of groups to compare. In this case there are only two groups, but that's not always true: Sometimes you want to compare two groups taken from a larger set, e.g., two of the four ethnic groups.
Click Define Groups. In the Define Groups box, under
Use specified values, type 1 & 2, which are the numeric values of the two groups, Male & Female. Click Continue, then
Options. The default is to provide a 95% CI (this will be a CI for the difference between means not for the individual group means), & to exclude missing values analysis by analysis (i.e., if you ask for more than one test variable, cases will only be excluded if they have a missing value on that particular test variable). Retain the default options, click
The output starts with a summary of descriptive statistics for the two samples. The main output is in 3 parts:
Levene's test is an F-test comparing the variances of the two groups. If it is significant, it implies that the groups have different variances, which violates the equal-variance assumption of the t-test (Assumption 2, homogeneity of variance, in Section 6 above).
What is the p-value (
Sig.) of Levene's test for the present 2 groups?
Is Levene's test significant (p<.05)?
Do the data violate the equal-variance assumption?
Even if data violate the equal-variance (homogeneity) assumption, all is not lost, because the t-test can be calculated two ways: Assuming equal variance, or not assuming it. If variances are not equal, use the "Equal variances not assumed" version.
What are the two t-test results for the present data (note also the df & p-values):
Equal variance assumed?
Equal variance not assumed
What are the differences between the two results?
Note that the equal-variance version is usually preferable. If the variances are very different, a nonparametric test such as the Mann-Whitney (next statistics session) would be better than the unequal-variance t-test which is an approximation (note the "approximate" degrees of freedom). Unequal variances are particularly problematic if you also have very unequal sample sizes.
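The two versions of the statistic can be sketched side by side: the pooled ("equal variances assumed") t, and the Welch ("not assumed") t with its approximate degrees of freedom (hypothetical data below):

```python
# Sketch: pooled vs Welch independent-samples t statistics (made-up data).
import statistics as st

males   = [25, 30, 35, 28, 40, 33]
females = [38, 45, 50, 41, 36, 48, 44]

m1, m2 = st.mean(males), st.mean(females)
v1, v2 = st.variance(males), st.variance(females)
n1, n2 = len(males), len(females)

# Pooled (equal variances assumed); df = n1 + n2 - 2
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
t_pooled = (m1 - m2) / (sp2 * (1 / n1 + 1 / n2)) ** 0.5

# Welch (equal variances not assumed), with approximate df
se2 = v1 / n1 + v2 / n2
t_welch = (m1 - m2) / se2 ** 0.5
df_welch = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

print(round(t_pooled, 2), round(t_welch, 2), round(df_welch, 1))
```

The Welch df is generally a non-integer, which is why SPSS reports "approximate" degrees of freedom for the unequal-variance row.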
Now look at the last section - the
Mean Difference & its 95% CI. The
Mean Difference is the difference between the sample means, also shown in the descriptive statistics table.
Look at the 95% CI for the
Mean Difference. Does this show that the mean difference is significantly different from zero (p<.05, 2-tailed)?
The answer to this should be the same as the result of the 2-tailed t-test.
Imagine you are testing a 1-tailed hypothesis that the Females are older than the Males. Does the result go in the specified direction?
What is the p-value for the 1-tailed test? What do you conclude?
Suppose the Male & Female means had been the other way around, would your report of the result be any different?
You can of course also run a t-test to compare two of the groups from a variable which defines more than two groups: e.g.,
ethnicgp, & you can run the test simultaneously on more than one
Test Variable (dependent variable).
(If you used a variable other than
age when graphing CIs & error bars in Section 9, & you have time to spare, try running the t-test with that variable & see if the result agrees with your visual estimates in Section 9.)
For this, you need a pair of variables, measured on the same subjects, where it makes sense to ask whether their means are different. Many pairs are not suitable (e.g., it is silly to ask whether a group's mean
income is higher or lower than their mean
age, although a correlation here might be interesting).
The quest variables are suitable: All have a range of 1-5 & all are concerned with related issues about job quality. Although they may well not be normally distributed, the assumption for paired-samples tests is that the distribution of differences is normal. Let us suppose this assumption is met.
Compare Means | Paired-Samples T Test...
Choose the first pair, e.g., quest2 & quest4: First, highlight both variables in the left panel (hold down the
Ctrl key to select multiple options). Click the arrow to insert them into the
Paired Variables box.
Now select a second pair (e.g., quest3 & quest4). Add them to the box.
Click Options. As before, the default settings,
95% Confidence intervals &
Exclude missing values analysis by analysis are appropriate. Continue, & run the procedure.
Homogeneity of variance is not an assumption of the paired-sample test, so there is no Levene's test. The output shows, in this order:
Note that the pairs of variables are correlated, one pair negatively, & one pair positively. This is very common, when similar variables are measured on the same people.
What are the means for quest2, quest3, & quest4?
Is either correlation significant (p<.05, 2-tailed)?
Note that the correlation tells you nothing about the difference. Two samples can be highly correlated, yet have no difference between their means, or a large difference.
Next come the means & 95% CIs of the paired differences. The null hypothesis is that the mean paired difference is zero. If you checked whether the CIs include zero, you should reach the same conclusion as you do from the t-tests themselves (2-tailed at p<.05).
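The paired-samples t-test is equivalent to a one-sample t-test on the within-subject differences, tested against zero. A sketch with hypothetical 1-5 ratings:

```python
# Sketch: paired-samples t statistic as a one-sample test on the
# within-subject differences (hypothetical ratings, not the survey file).
import statistics as st

quest3 = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
quest4 = [4, 4, 3, 5, 4, 5, 3, 3, 5, 4]

diffs = [a - b for a, b in zip(quest3, quest4)]   # per-subject differences
mean_d = st.mean(diffs)
sem_d = st.stdev(diffs) / len(diffs) ** 0.5
t = mean_d / sem_d                                # df = N - 1 = 9

print(round(t, 2))
```

This is why only the distribution of the differences, not of each variable separately, needs to be approximately normal.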
Look at the (2-tailed) t-test results. What are the t, df & p-values:
quest2 vs. quest4:
quest3 vs. quest4:
Imagine you are testing the directional hypothesis that quest4 scores are higher than quest3 scores. What is your conclusion?
Imagine instead you are testing the 1-tailed hypothesis that quest4 scores are lower than quest3 scores. What is your conclusion? (Hint: how does the p-value differ for a 1-tailed & 2-tailed test?)
Before you exit SPSS, save the edited data file & output files to your drive or USB device: We will use them in the next session.
If you did not finish during the class, it is worth completing later, as it covers some important statistical methods & concepts. Make sure you have copies of the data if you will use computers outside the School, where the drives may not be available.