2.9 Intro to Data Science: Basic Descriptive Statistics

  • In data science, you’ll often use statistics to describe and summarize your data.
  • Here, we introduce several descriptive statistics:
    • minimum—the smallest value in a collection of values.
    • maximum—the largest value in a collection of values.
    • range—the range of values from the minimum to the maximum.
    • count—the number of values in a collection.
    • sum—the total of the values in a collection.
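As a quick preview, each of the five statistics above can be computed with Python's built-in functions. This is a minimal sketch: the list name `values` and its contents are made up for illustration, and the range is reported as a (minimum, maximum) pair to match the definition above.

```python
# Hypothetical sample data, chosen only for illustration
values = [36, 27, 12]

minimum = min(values)              # smallest value in the collection
maximum = max(values)              # largest value in the collection
value_range = (minimum, maximum)   # range: minimum through maximum
count = len(values)                # number of values
total = sum(values)                # total of the values

print('Minimum:', minimum)
print('Maximum:', maximum)
print('Range:', value_range[0], 'through', value_range[1])
print('Count:', count)
print('Sum:', total)
```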

Determining the Minimum of Three Values

  • Determine the minimum of three values manually in a script that prompts for and inputs three values, uses if statements to determine the minimum value, then displays it.
# fig02_02.py
"""Find the minimum of three values."""

number1 = int(input('Enter first integer: '))
number2 = int(input('Enter second integer: '))
number3 = int(input('Enter third integer: '))

minimum = number1

if number2 < minimum:
    minimum = number2

if number3 < minimum:
    minimum = number3

print('Minimum value is', minimum)
  • Enter 12, 27 and 36
In [1]:
run fig02_02.py
Minimum value is 12
  • Enter 12, 36 and 27
In [2]:
run fig02_02.py
Minimum value is 12
  • Enter 36, 27 and 12
In [3]:
run fig02_02.py
Minimum value is 12
  • Logic of determining minimum:
    • First, assume that number1 contains the smallest value.
    • The first if statement then tests number2 < minimum, and if this condition is True assigns number2 to minimum.
    • The second if statement then tests number3 < minimum, and if this condition is True assigns number3 to minimum.
  • Now, minimum contains the smallest value, so we display it.
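
The same logic can be wrapped in a function so that all three input orderings from the sample runs can be checked without typing values at a prompt. This is only a sketch: the function name `minimum_of_three` is hypothetical and is not part of fig02_02.py.

```python
def minimum_of_three(number1, number2, number3):
    """Return the smallest of three values using two if statements."""
    minimum = number1  # assume the first value is the smallest

    if number2 < minimum:  # the second value is smaller than the current minimum
        minimum = number2

    if number3 < minimum:  # the third value is smaller than the current minimum
        minimum = number3

    return minimum


# The three orderings entered in the sample runs
print('Minimum value is', minimum_of_three(12, 27, 36))
print('Minimum value is', minimum_of_three(12, 36, 27))
print('Minimum value is', minimum_of_three(36, 27, 12))
```

The position of the smallest value does not matter, because every value is compared against the running minimum.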

Determining the Minimum and Maximum with Built-In Functions min and max

  • Python has many built-in functions for performing common tasks.
  • min and max calculate the minimum and maximum, respectively, of a collection of values.

In [4]:
min(36, 27, 12)
Out[4]:
12
In [5]:
max(36, 27, 12)
Out[5]:
36

Determining the Range of a Collection of Values

  • The range of values is simply the minimum through the maximum value.
  • Much data science is devoted to getting to know your data.
  • You have to understand how to interpret statistics.
    • If you have 100 numbers with a range of 12 through 36, those numbers could be distributed evenly over that range.
    • At the opposite extreme, you could have clumping with 99 values of 12 and one 36, or one 12 and 99 values of 36.
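
To make the point concrete, the sketch below builds three collections of 100 numbers that share the range 12 through 36 but are distributed very differently. The variable names and the way the evenly spread data is generated are assumptions made for illustration.

```python
# Hypothetical data sets, each holding 100 values from 12 through 36
evenly_spread = [12 + i % 25 for i in range(100)]  # cycles through 12, 13, ..., 36
mostly_low = [12] * 99 + [36]                      # 99 values of 12 and one 36
mostly_high = [12] + [36] * 99                     # one 12 and 99 values of 36

for name, data in [('evenly spread', evenly_spread),
                   ('mostly low', mostly_low),
                   ('mostly high', mostly_high)]:
    # min, max and count are identical for all three; only the sum differs
    print(name, '-> count:', len(data), 'min:', min(data),
          'max:', max(data), 'sum:', sum(data))
```

The minimum, maximum and count match in all three cases, so the range alone cannot tell these collections apart. The sum differs, which hints at why more than one statistic is needed to get to know the data.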

Functional-Style Programming: Reduction

  • We introduce various functional-style programming capabilities.
  • These enable you to write code that can be more concise, clearer and easier to debug.
  • Functions min and max are examples of a functional-style programming concept called reduction: each reduces a collection of values to a single value.
  • Other reductions you'll see: sum, average, variance and standard deviation.
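
One way to see what a reduction does is to write it out explicitly. The sketch below uses the standard library's `functools.reduce`, which is not mentioned in the text above and is used here only to illustrate the idea; the sample data and variable names are assumptions.

```python
from functools import reduce

values = [36, 27, 12]  # hypothetical sample data

# Built-in reductions: each turns the whole collection into one value
print('min:', min(values), 'max:', max(values), 'sum:', sum(values))

# The same idea spelled out: combine values pairwise until one remains
smallest = reduce(lambda a, b: a if a < b else b, values)
total = reduce(lambda a, b: a + b, values)

# The average is a reduction built from two others: sum and count
average = total / len(values)

print('smallest:', smallest, 'total:', total, 'average:', average)
```

In practice the built-in functions are the clearer choice; the explicit form only shows that each one collapses many values into a single result.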

©1992–2020 by Pearson Education, Inc. All Rights Reserved. This content is based on Chapter 2 of the book Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud.
