Teaching home - Stats - Department - University - Resources - Contact - NPHolmes' lab - NPHolmes' personal
Teaching home - Course #PYM0S1-2

Before starting this section, please make sure you have followed these instructions

Microsoft Excel/OpenOffice Calc: Data handling & graphs

Microsoft Excel (see also OpenOffice Calc) is a powerful spreadsheet program. It allows you to enter data, either numbers or text, into an array of cells, structured in rows and colums (a spreadsheet). You can then edit, sort, & perform calculations on the data. In particular, Excel produces graphs more flexibly than the SPSS statistical package we will use later.

Excel is versatile, having many features. Even if you have used Excel before, you may well find things in this handout that you didn't know. So, detailed guidance is given here for those who are less experienced.

The more experienced might want to use alternative data processing programs, for example Matlab/Octave, R. This may be excellent for your research, but in general the School's computers will not be set up for this. Ask me for more information...

Contents

  1. Introduction: Excel 2007 & earlier versions
  2. Keyboard short-cuts
  3. Start-up & log on
  4. Opening an Excel workbook file
  5. Basic data handling: Editing, copying, moving
  6. Sorting data
  7. Worksheets, copying rows & columns
  8. Functions: Mean & standard deviation
  9. Setting up data for a graph
  10. Drawing a basic graph
  11. Moving, re-sizing, printing, or deleting a graph
  12. Using applications together: Excel & Word
  13. Inserting Excel cells into Word as a table
  14. Improving & extending your graph
  15. Converting numbers to percentages
  16. Chart Tools: Titles and legends
  17. Adding another data series
  18. More chart editing: Changing type, axes, lines etc.
  19. Adding standard error bars
  20. A test of your Excel expertise

  1. Introduction: Excel 2007 & earlier versions
  2. The version of Microsoft Office that is installed on the computers in G37, & other University labs, is Office 2007. This booklet covers Excel 2007. If you have previously used earlier versions of Office programs (e.g., Word 2003, Excel 2003) you may find the new menu system unfamiliar, although you should get used to it quickly. This booklet tells you the basics.

    If you want to continue to use Office 2003 you can save files from the 2007 versions in a form which is readable by the 2003 versions. The 2007 versions can, fortunately, open files from earlier versions.

    You may also wish to use alternative software such as OpenOffice (free, available for many operating systems, & installed on at least some of the University computers). So long as you can perform the functions you need, you should use any software you wish.

    If you have never used Office 2007, or want a brief introduction to the Excel 2007 menu, read the following hints. Otherwise, go straight to Section 1 of the booklet.

    For absolute beginners in Office 2007

    When you start Excel 2007, you see this menu (or 'ribbon').

  3. Keyboard short-cuts
  4. Most operations in most Office & other programs can be performed using the keyboard alone. Learning these 'short-cuts' takes some time & some exploration, but once you know them, you can save yourself a great amount of time in the future, as well as decreasing the distance your hands travel across your desktop when working.

    I strongly recommend that you all learn as many short-cuts as you can!

    Some of the most useful keyboard short-cuts

    (annoyingly, these may not work in Office 2007, & work slightly differently across operating systems, but previous versions, & most other software on Windows & Linux operating systems use these (sorry, I know nothing about Macs...)).

    Ctrl is the 'Control' key (usually found at the bottom left on the keyboard)

  5. Start up & log on
  6. This handout assumes you know how to find & copy files on the Psychology network, using Windows XP. You should have practised this recently, but if you have difficulty, ask a demonstrator.

    Log on to Windows XP and open My Computer. In the BlackBoard MSc folder, you will find two files ExcelData 1 & ExcelData 2. Excel files are known as worksheets or workbooks.

    Save these files to your My Documents or another folder which

  7. Opening the ExcelData 1 & 2 workbook files
  8. Start Excel

    The Excel opening screen shows an array of empty cells (an empty spreadsheet) with the menu above it.

    Click the Office button), then Open, and browse to wherever you copied the files. The first few lines of data in the Excel Data 1

    Data can be arranged into Excel in many ways. This file shows a common method, with the names of the variables (Gender, etc.) in the top row, & each of the other rows containing one person's data. Income gp values 1, 2, 3 denote High, Middle, & Low income respectively.

    Note that the columns are labelled with letters and the rows with numbers. Individual cells are identified by a letter & number, e.g., the first person's gender is in cell B2.

    Use these data to practise or revise basic data handling procedures. If you are already experienced with Excel, you can go through sections 2-5 quickly, checking what you already know & what is new.

  9. Basic data handling: Editing, copying, & moving
  10. To replace the contents of a cell, click on it, type the new contents and press Enter. Alternatively, double-click on the cell and move the cursor to edit the contents.

    To copy & move contents of cells, use the Copy, Cut & Paste icons (under the Home tab, at the left-hand end of the ribbon, in the section called Clipboard. If you're unsure what an icon does, point the mouse at it & wait: Its label will appear.

    Select a row or block of cells by clicking on the first cell, & dragging the mouse over the desired area. To copy a highlighted area, click Copy, and to copy or move these cells elsewhere on the spreadsheet, click on a cell which is at the top left of the destination area, then click Paste. To move a block of cells to a different place, click on Cut instead of Copy.

    Find or Replace can be used in a similar way in Excel as in Word. The relevant icon is under the Home tab, labelled Editing. For example, to replace the Income gp value "1" by the letter "H", select the Income gp column. Then click the Find & Select icon, select Replace from the drop-down menu, enter the old & new values in the appropriate boxes, and click Replace All.

    To select non-adjacent blocks: select the first block of cells, hold down the Ctrl key while selecting the next block(s).

    To delete the contents of a cell or block of cells, select it & press Delete on the keyboard (or the Cut icon).

    If you make a mistake, you can undo what you've just done by clicking Undo which is the second icon on the Quick Access toolbar.

  11. Sorting data
  12. If you simply want to sort (group) the data by one variable, e.g., Gender: Click anywhere in the Gender column, then click the Sort & Filter icon (in the Editing section, next to Find & Select). From the drop-down menu, select Sort A to Z. Excel re-groups the data alphabetically by gender, listing first all Females, then all Males.

    You can sort data by more than one variable, e.g., first by Gender, then by Age. Click anywhere in the table, then click Sort & Filter and choose Custom Sort, which produces the following dialogue box:

    Use the down-arrow in the Sort by box to select the first variable (Gender). Then click Add Level which produces a second Sort by box. Select Age, and Smallest to Largest or the opposite. Note that My data has headers is checked because Excel assumes that the top row of the block of data contains variable names: That is true in this case. (If the top row contained data, you should un-check this box.) Click OK. Excel sorts only the data, not the header row.

  13. Worksheets, & copying rows & columns
  14. An Excel workbook can contain several pages, or worksheets, & it is often convenient to place separate sets of data, graphs, etc. on different sheets. At the bottom of the page there are tabs marked Sheet1, Sheet2, etc. Click on the tabs to move between sheets. To add a new sheet, click the asterisked tab below the spreadsheet.

    To perform actions on a whole sheet, e.g., to re-name it, delete it, or move, or copy it to another location, or to another workbook file; right-click on the sheet tab, & select the action from the pop-up menu, e.g., Rename, Delete, Move, or Copy. NB: if you want to copy a sheet, without losing the original, ensure that you have clicked Make a copy in the Move or Copy Sheet dialogue box.

    You can copy, cut, & paste material from one sheet to another, whether cells, rows, columns, or blocks. Select the area in one sheet, then cut/copy and paste into the other as before.

    For the rest of this exercise...

    The later sections will use the ExcelData 2 file, so first save your edited version of the ExcelData 1 workbook (click on the Office button then then Save As...)

    At this point you have three choices, depending on your Excel expertise:

    1. Beginner: Try to work through all the sections
    2. Intermediate: Read quickly through all the sections to check if you are familiar with all the procedures, and work only on those with which you are unfamiliar
    3. Expert: Go straight to the last section & prove it!

  15. Functions: Mean & standard deviation
  16. Sections 6-19 use file ExcelData 2, so open it.

    These data come from an experiment in which 19 people did a memory test. Each person was tested under 4 conditions, called same1, same2, same3, and diff. Each row shows the test scores from one person. The scores are numbers of correct responses out of a maximum of 6.

    In Section 9 you will make a graph showing the mean scores from the 4 conditions, & eventually add standard error bars to it. First, the relevant descriptive statistics (mean, standard deviation, & standard error) must be calculated for each condition. Excel makes it easy to do this with its built-in functions.

    Mean

    1. Select the cell where you want the mean of a range of numbers to appear (e.g., the cell just below a column)
    2. Click the Function button , immediately above the top row of the spreadsheet
    3. An Insert Function box appears, looking something like this (NB:the "=" sign shows that the result of a calculation will go in the cell):

    4. The function you want is called AVERAGE - find it...
    5. Click AVERAGE, then click OK
    6. A Function Arguments box appears: Define the range (the cells for which you want the average), by writing in the box (e.g., B2:B20), or use the mouse to select the cells, then click OK. The mean average should now appear in the cell

    7. You can copy the formula for the mean to adjacent rows or columns, but when working with data in Excel you should always check & double-check that formulae are entered correctly: Copy & paste errors are very common...
    8. Formulae & other cell contents can also be copied to adjacent cells by "dragging" across cells. When the mouse cursor is over the lower left corner of the cell, a small "+" appears: . Click & hold on the plus, & drag to adjacent cells to copy their contents (Beware: Excel has a number of "auto-complete" functions that operate during this procedure: Always check the results carefully!)

    Standard deviation (SD)

    Follow a similar procedure to compute the standard deviation (STDEV in Excel) for a range of cells. The only difference is in the selection of formulae. In Excel, there are a number of different "standard deviations" - ask a demonstrator if you are unsure which to use...

    If you compute the mean & SD for the four columns of data in the ExcelData_2 file, the final results should be like those shown here:

    If your results do not agree with these, check your calculations, then panic (or ask someone for help).

    Standard error of the mean

    One very important statistical function is the standard error of the mean (SEM), which you can use to put error bars on graphs. You may be familiar with standard errors from your basic statistical training, but if not, it will be covered in the statistics revision sessions.

    Excel does not include the SEM among its statistical functions. However, once you have the standard deviation (SD), the standard error is easy to obtain. The SEM formula is:

    n is the number of subjects (19 for the present data). So, in Excel, the SEM is written =STDEV([DATA RANGE])/sqrt(subjects).

    An even better (safer & easier to maintain) way to calculate it is by using the COUNT function to find n: =STDEV([DATA RANGE])/SQRT(COUNT([DATA RANGE])

    Calculate the SEM for the four columns. You should get this:

  17. Setting up data for a graph
  18. Excel allows to you produce many types of graph, of reasonably high quality (so long as you change the default settings & add labels, error bars...!) It is also *relatively* simple & bug-free to insert a copy of an Excel graph into a Word document, as shown later.

    Before drawing the graph, you should create a block of cells with the name of each condition (same1, same2, etc.), and, immediately below, the data for that condition (the mean, in this case). E.g.:

    If you want to move the mean values to create a table for the graph, you could use Copy | Paste Special or Paste Values. However, copying and pasting data always carries a risk: If you ever update the original raw data values (e.g., to add subjects, correct mistakes, or change formulae), you will also have to re-copy all the other information that you copied before. The more data-processing steps you add to your analysis, the more likely it is that you will make mistakes (& then miss them, & potentially waste hours, days, weeks (or worse!), of your (and others') time)

    KEEP IT SIMPLE!!!

    I strongly recommend that you never use the Paste values option. If you want to re-create values at a different location in the spreadsheet, then simply use the =[CELL ADDRESS] formula (e.g., =B21), which will then always contain the current values of any calculations that you perform in the target cell.

  19. Drawing a basic graph
  20. To create graphs, or charts as Excel prefers to call them, you use the Insert tab.

  21. Moving, re-sizing, printing, editing, or deleting charts
  22. For simple editing of the chart as a whole, e.g., moving it or changing its size or shape, select it by clicking on the chart. A border appears round the chart with "selection handles" (small dots) at the corners and edges. Then, drag with the mouse on any of the selection handles. To move it around the page, click inside, near the border, & drag. To delete the chart, click on the chart, then press the Delete key.

    All the elements of the chart are editable, and, by default, Excel typically does not create good graphs! Producing good scientific graphs is a skill that has to be learnt. The APA Publication Guide may be a good place to start learning how...

    Printing the chart from Excel is possible, but not recommended: Printing is more satisfactory & flexible if you copy the chart into a Word document & print from that. When finished with editing, click anywhere outside the graph.

  23. Using applications together: Excel & Word
  24. As mentioned above, the best way to print an Excel graph is to copy it into Word first. To do this, you must have both applications running at the same time, & switch between them.

    With Excel still running, start up Word

    You can switch between applications by clicking on the taskbar at the bottom of the screen, or by using Alt+Tab

    Now you have both Excel & Word running, you can switch between them, & Copy & Paste your graph from Excel into the Word document. You can place the graph wherever you want within the text. You can move it around, or re-size it, within Word, & print it. This works better than printing graphs from Excel itself.

    When positioning graphs in Word, it can be very useful to see the object placement anchors. Go to Tools | Options | View.

    The graph will be normally be pasted as an embedded object, meaning it can be edited from within Word. I prefer not to do this, but to format the graph properly in Excel, then only paste it into word or a graphics program when it is needed. Embedded objects are likely to cause more problems than un-embedded ones, particularly when moving files between machines, operating systems, etc...

    Formatting the chart in Word

    You can move it, resize it, etc. Once the chart is in Word, it may "float" over surrounding text. It is preferable for the text to "wrap around" the chart. Click on the chart in Word, then look at the Chart Tools heading, with three tabs underneath, Design, Layout, & Format. Under the Format tab, one of the icons is Text Wrapping. Click this to see a drop-down menu with various wrapping options: e.g., Top & bottom. Try these out.

    You can select the chart by single-clicking with the mouse; move it by dragging, & re-size it by dragging one side or corner. You can copy (or delete) it by using the standard Word functions. If you pasted it as an embedded object, you can double-click on the chart to edit it directly.

    To print the document, or the page containing the chart, click the Office button, then Print. Select Print Preview to see how it will look.

  25. Inserting Excel cells into Word as a table
  26. Select a block of cells in Excel, containing either numbers or a mixture of numbers and text. Paste the block into Word.

    The cells are pasted into a Word table, which provides a powerful & convenient tool, & well worth experimenting with.

    Formatting scientific tables is a skill that has to be learnt. One way to learn is to look at tables published in professional scientific journals, & replicate their style and formatting.

  27. Improving & extending your graph
  28. You have now carried out the most essential operations with Excel.

    The rest of this handout describes other useful functions, try if you have time. (If you don't, save what you need and come back later (or wait until a sensible opportunity arises to apply the functions below...))

    The graph created above is fairly basic. There are a number of things you could do to improve it:

    The following sections assume that you are in Excel with the same workbook opened as before.

  29. Converting numbers to percentages
  30. You can use Excel's formula-calculation facility for many purposes. For example: The present data are scores out of a maximum 6. It may be easier to understand them if they are converted to percentages. To convert scores to percentages, divide each score by the maximum score, & multiply by 100. Alternatively, since Excel can treat proportions as percentages (i.e., 0=0%, 1=100%), you can simply divide each score by the maximum score, then Format the cell as a percentage. One problem with this latter method is that Excel will then insert the % label, for example, to all the values on the y-axis of graphs which are based on data formatted as percentages. This does not always look pretty enough for science (the units should be given once in the y-axis label, not on every value on the y-axis...)

  31. Chart Tools: Titles & legends
  32. The Layout tab provides a number of options. For example, to add a title to the vertical axis, click Axis titles, then Primary vertical axis title, then one of the options, e.g., Rotated title. This will insert a title box. To change the text, drag with the mouse over the title and re-type. To change the font, right-click on the title box. Format axis title enables additional formatting options.

    The Legend option under the Layout tab allows you to change the position and appearance of the legend, but not its text: Legend text must be on the spreadsheet itself. Here is how to add, and change, the legend text.

    You did not originally provide a legend on the spreadsheet so, by default, Excel labels the lines on your graph Series N, where N is a number from 1 to however many data series your graph has. To provide a more meaningful legend, the legend text should be written in the cell immediately to the left of the data series itself, and the legends must be included in the block containing the graph data. Alternatively, legends can be added after the graph has been created. Make sure you know how both of these methods work...

  33. Adding another data series
  34. Your present graph has only one data series, i.e., only one graph line. However, you can add more series to the block from which the graph is drawn, and then more lines will be added.

    One reason for adding a second line would be to show "chance level" performance, in order to compare the actual performance to a chance or guessing level. In the present task, "chance" means 50% correct.

    On the line below the data for the experimental conditions, add the legend text Chance in the first column underneath the legend text Observed (or whatever you called it). Under each of the 4 means representing scores from the 4 conditions, type in the score representing chance performance, i.e., 50 (see illustration). You could now recreate the graph, including two rows of data not one. Alternatively, the new row can be added to it as follows. Select the chart. Under Chart Tools, click the Design tab, then click Select Data. The following dialogue box appears.

    Click Add to add a new series. In the Edit Series box (below), type "chance" in the Series name box or click on this cell in the spreadsheet. In the Series values box, delete its present contents, and drag the mouse over the 4 cells containing the desired values (the row of 50s). The values are inserted in the box, and added as a second line on the chart, as shown.

  35. More chart editing: Chart type, axes, lines...
  36. There are many options for changing graphs, e.g., the type of chart (column, bar, pie-chart), the minimum, maximum, & interval values on axes, the colour and pattern of lines, bars, & symbols.

    E.g., under Chart Tools, click on the Design, Layout, or Format tab.

  37. Adding error bars
  38. It is useful, & expected, for example by many journals, to add error bars on graphs. The statistics course will explain more about this later... Excel was not designed for scientific graphs, so its method for error bars is not ideal, & the default options are usually no good. OpenOffice Calc is next to useless for error bars (at present. This may change one day...)

    Error bars can be of a number of kinds. The most useful are likely to be those showing the standard deviation (SD), standard error (SE), or 95% confidence intervals (95%CI). If you don't understand these terms yet, don't worry, all will be explained...

    To add error bars, add a group of cells to the Excel sheet containing the error bar values (i.e., SD, SE, or 95%CI, or any other value) for each condition in the graph. NB: The error bar values must be in the same units as the means (whether raw, percentage, etc.). Then:

  39. A test of your Excel expertise
  40. To test your ability to use and adapt the techniques covered in this handout, try the following (for guidance, see the relevant sections).

    1. Open the original file ExcelData 1
    2. Copy all the data to Sheet 2, rename Sheet 2 "Original"
    3. Go back to Sheet 1, sort the data by Gender and Income gp, rename Sheet 1 as Sorted
    4. Compute the mean & standard deviation of Recog for each Gender & Income gp combination: There should be 6 means & 6 standard deviations
    5. Paste the 6 means into a block of cells with 3 labelled columns (Income gps) & 2 labelled rows (Genders)
    6. From this block of cells, make a Column chart of the mean Recog scores, which has the 3 Income gps along the horizontal axis, & columns (vertical bars) representing Female & Male means for each Income gp. The legend should show which bars represent Females & which Males
    7. Edit the chart so that it has no gridlines, has a legend at the bottom, a better scale on the y-axis, & new colours or patterns for the bars
    8. Compute the standard errors for each of the means shown in the graph
    9. Add standard error bars to the graph
    10. While keeping the Excel file open, start a new Word document. Paste the chart into the Word document. Write a legend for the figure. Figure legends should describe what the figure shows, the x- & y-axes, the units, the sample size, & what any error bars show.

    If you have successfully done all these, you have a good basic competence with Excel. Excel can do many more things, including simple statistical analysis...

    However, Excel & other spreadsheet software should only be thought of as a first step in learning how to store, process, & analyse scientific research data, rather than as a complete solution - Spreadsheets were designed for accountants, not for scientists!

    There are many other specific (& often free) software applications for scientific data processing, & as a researcher you should strive to use the best & most appropriate software for your scientific research needs. In my opinion, this is rarely Microsoft Excel!!


Teaching home - Stats