| I. Descriptive Statistics and Probability
        Why statistics?
            Present and describe information - Chs. 1, 2, 3
            Draw conclusions on a larger population based on sample - Chs. 4, 14, 5, 6, 7, 8, 9, 10
            Improve processes - Ch. 15
            Obtain forecasts for variables of interest - Chs. 11, 12, 13
        A. Collecting and Presenting Data
            1. Definitions 
                a. Population - totality of items under consideration
                b. Sample - subset of population selected for analysis
                c. Parameter - summary measure that describes a characteristic of a population
                d. Statistic - summary measure that describes a characteristic of a sample from a population
                e. Descriptive statistics - describe the characteristics of a set of data
                f. Inferential statistics - estimate characteristics of a population based on sample results
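The distinction between a parameter and a statistic can be illustrated with a short Python sketch. The population below is hypothetical (randomly generated, not real data), and the sample size of 30 is an arbitrary assumption; the point is only that the same summary measure is a parameter when computed on the whole population and a statistic when computed on a sample.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 made-up measurements (not real data).
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameter: summary measure computed from the entire population.
population_mean = statistics.fmean(population)

# Statistic: the same summary measure computed from a sample
# (a subset of the population selected for analysis).
sample = rng.sample(population, 30)
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Inferential statistics uses the sample mean as an estimate of the population mean, which in practice is usually unknown.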
            2. Data 
                a. Sources
                    (1) Published sources - government, industrial, or individual
                    (2) Experimentation
                    (3) Survey
                    (4) Observation
                    (5) Point of service
                b. Types of data 
                    (1) Time-series - hold unit constant, vary across time
                    (2) Cross-section - hold time constant, vary across units
                    (3) Categorical data - categories
                    (4) Numerical data - numeric results
                        (a) Discrete data - results from a counting process
                        (b) Continuous data - results from a measuring process
            3. Samples 
                a. Concepts
                    (1) Frame - list of all items from which sample will be drawn
                    (2) Replacement
                        Sampling with replacement - observation returned to frame
                        Sampling without replacement - observation not returned to frame
                    (3) Types of samples
                        Probability sample - sample chosen on basis of known probabilities
                        Nonprobability sample - probability of sample being chosen unknown
                    (4) Randomness
                        Random number table - Table E.1, p. 832 - 833
                        Excel function: =RANDBETWEEN(#1,#2)
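For readers working outside Excel, the replacement and randomness concepts above might be sketched in Python as follows. The frame of 20 item IDs and the sample size of 5 are assumptions made for illustration; `randint` is used here as a rough analogue of `=RANDBETWEEN(#1,#2)`, since both include the two endpoints.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical frame: item IDs 1 through 20 (made up for illustration).
frame = list(range(1, 21))

# Rough analogue of Excel's =RANDBETWEEN(1,20): a random integer
# with both endpoints included.
one_draw = rng.randint(1, 20)

# Sampling with replacement: each observation is returned to the
# frame, so the same item can be selected more than once.
with_replacement = rng.choices(frame, k=5)

# Sampling without replacement: an observation is not returned to
# the frame, so all selected items are distinct.
without_replacement = rng.sample(frame, k=5)

print("single draw:        ", one_draw)
print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```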
            
                b. Sampling methods
                    (1) Simple random sample - each item equally likely to be chosen
                    (2) Systematic sample - choose every kth item from a list
                        Easier to do if data already in the form of a list
                        Also easier if one item produced at a time
                    (3) Stratified sample - divide into categories, random sample from each category
                        Want sample to match characteristics of population
                    (4) Cluster sample
                        Divide population into clusters
                        Choose clusters at random
                        Random sample from each cluster
                        Should be homogeneous across clusters, heterogeneous within clusters
                        Less costly if observations scattered geographically
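The four sampling methods above can be compared side by side in a minimal Python sketch. Everything about the frame is hypothetical: the 100 items, the "category" field used as the stratum, and the "region" field used as the cluster are made up for illustration, and the function names are not from any library. The systematic sample uses a random starting point within the first k items, which is a common convention but an assumption here.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical frame of 100 items, each tagged with a made-up category
# (the stratum) and a made-up region (the cluster).
frame = [
    {"id": i, "category": "A" if i % 4 == 0 else "B", "region": i // 10}
    for i in range(100)
]

def simple_random_sample(items, n):
    """Each item is equally likely to be chosen."""
    return rng.sample(items, n)

def systematic_sample(items, n):
    """Random start within the first k items, then every kth item."""
    k = len(items) // n
    start = rng.randrange(k)
    return items[start::k][:n]

def stratified_sample(items, key, fraction):
    """Random sample from each category, in proportion to its size."""
    strata = {}
    for item in items:
        strata.setdefault(item[key], []).append(item)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, round(len(members) * fraction)))
    return chosen

def cluster_sample(items, key, n_clusters, n_per_cluster):
    """Choose clusters at random, then a random sample from each."""
    clusters = {}
    for item in items:
        clusters.setdefault(item[key], []).append(item)
    chosen = []
    for name in rng.sample(sorted(clusters), n_clusters):
        chosen.extend(rng.sample(clusters[name], n_per_cluster))
    return chosen

srs = simple_random_sample(frame, 10)
systematic = systematic_sample(frame, 10)
stratified = stratified_sample(frame, "category", 0.2)
clustered = cluster_sample(frame, "region", 3, 4)

print("simple random:", sorted(x["id"] for x in srs))
print("systematic:   ", [x["id"] for x in systematic])
print("stratified:   ", sorted(x["id"] for x in stratified))
print("cluster:      ", sorted(x["id"] for x in clustered))
```

Because the stratified sample draws the same fraction from each category, its category mix matches that of the frame, which is the "match characteristics of population" goal noted above.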
            4. Sources of error 
                a. Coverage error - exclude part of population
                    Selection bias
                    Ex. - Literary Digest
                    How Polls are Conducted
                b. Nonresponse error - some people don't respond
                    Ex. - Call screening
                    Upper and lower classes less likely to respond
                c. Sampling error - wrong individuals chosen by chance
                d. Measurement error
                    (1) Question wording - ambiguous or leading
                        Ex. - Unemployment rate
                        Microsoft Rigged the Survey?
                    (2) Interviewer's effect on respondent - try to please interviewer
                        "Halo" effect
                        Ex. - Race
                    (3) Effort made by respondent - exaggeration, lack of effort
                        Ex. - TV ratings, consumer surveys
                Key ethical issue is intent - okay if errors made unintentionally, unethical if deliberately done
            5. Presenting data 
                a. Ordered array - raw data in rank order
                    Use Sort function: Data | Sort
                b. Stem-and-leaf display
                    Stem-and-leaf option in PHStat
                c. Frequency distribution - table of class groupings or categories
                    (1) Need sufficient number of classes (5 - 15, 3 - 10)
                    (2) Class interval
                        Width = range / number of classes
                        Better to round
                    (3) Class boundaries
                        Avoid overlapping
                    Use Histogram function of Excel: Tools | Data Analysis | Histogram
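The ordered array and the class-width rule (width = range / number of classes, then round) can be worked through in a short Python sketch. The 15 data values and the choice of 5 classes are assumptions for illustration. Classes are treated as "lower boundary up to but not including upper boundary" so that no observation falls into two classes; other boundary conventions are possible.

```python
import math

# Hypothetical raw data (made-up values, for illustration only).
data = [41, 12, 58, 27, 33, 15, 47, 21, 52, 30, 38, 24, 55, 35, 44]

# Ordered array: raw data in rank order.
ordered = sorted(data)

# Class interval: width = range / number of classes, rounded up
# to a convenient value.
num_classes = 5
data_range = ordered[-1] - ordered[0]           # 58 - 12 = 46
width = math.ceil(data_range / num_classes)     # 9.2 rounds up to 10

# Class boundaries: start at a multiple of the width at or below the
# minimum. Each class is [lower, upper), so classes do not overlap.
start = (ordered[0] // width) * width           # 10
boundaries = [start + i * width for i in range(num_classes + 1)]
assert boundaries[-1] > ordered[-1], "classes must cover the maximum"

# Frequency distribution: count the observations in each class.
frequencies = [0] * num_classes
for x in data:
    frequencies[(x - start) // width] += 1

for i, count in enumerate(frequencies):
    print(f"{boundaries[i]} to under {boundaries[i + 1]}: {count}")
```

Rounding the width up (here from 9.2 to 10) gives boundaries that are easier to read and ensures the classes span the full range of the data.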
            
                d. Relative frequency distribution
                    Percentage distribution - convert relative frequencies to percentages
                e. Cumulative distribution
                    Cumulative relative frequency distribution
                    Cumulative percentage distribution
                f. Summary table
                    Frequency distribution for categorical data
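The relative, percentage, and cumulative distributions, along with a summary table for categorical data, all follow from simple arithmetic on frequency counts. The sketch below uses hypothetical class frequencies and hypothetical survey responses (both made up for illustration) to show the conversions.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical class frequencies (made up for illustration).
classes = ["10 to under 20", "20 to under 30", "30 to under 40",
           "40 to under 50", "50 to under 60"]
frequencies = [2, 3, 4, 3, 3]
total = sum(frequencies)

# Relative frequency: each class frequency divided by the total.
relative = [f / total for f in frequencies]

# Percentage distribution: relative frequencies times 100.
percentage = [100 * r for r in relative]

# Cumulative distribution: running total of the frequencies,
# also expressed as a cumulative percentage.
cumulative = list(accumulate(frequencies))
cumulative_pct = [100 * c / total for c in cumulative]

for row in zip(classes, frequencies, percentage, cumulative_pct):
    print(f"{row[0]:<15} {row[1]:>3} {row[2]:>6.1f}% {row[3]:>6.1f}%")

# Summary table: frequency distribution for categorical data.
# Hypothetical survey responses (made up for illustration).
responses = ["yes", "no", "yes", "undecided", "yes", "no"]
summary = Counter(responses)
for category, count in summary.most_common():
    print(f"{category:<10} {count:>2} {100 * count / len(responses):>6.1f}%")
```

The last cumulative percentage is always 100, which is a quick check that no observation was dropped.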
            
                g. Contingency table (cross-classification table)
                    Two simultaneous categorical variables
                    Use PivotTable function of Excel
            6. Graphical presentation
                a. Types of graphs 
                    (1) Histogram
                        Use Histogram function of Excel: Tools | Data Analysis | Histogram
                    (2) Percentage polygon
                    (3) Cumulative percentage polygon (ogive)
                    (4) Pareto diagram
                        Use Histogram function of Excel: Tools | Data Analysis | Histogram
                b. Principles of graphical excellence
                    (1) Well-designed presentation of data that provides substance, statistics, and design
                    (2) Communicates complex ideas with clarity, precision, and efficiency
                    (3) Gives the viewer the largest number of ideas in the shortest time with the least ink
                    (4) Almost always involves several dimensions
                    (5) Requires telling the truth about the data |