Interval Data

The first type of numerical data that we encounter in data science are interval data. Interval data are a type of numerical data. That is, they represent measured quantities of things. Interval data allow for a degree of difference between two values (i.e. we can add or subtract the values in meaningful ways). However, interval scales have an arbitrary zero point on their scale (i.e. the place were zero appears on the scale was chosen for convenience not because it represents a true absence of the thing being measured. So there is no concept of a ratio between two numbers or the ability to multiply or divide two numbers in any meaningful way. For example, imagine a thermometer measuring outdoor temperature. The zero point on a Celsius thermometer represents the temperature where water freezes. This is simply for convenience zero on this scale does not represent absolute zero heat, like it does on the Kelvin scale. The difference between 20°C and 30°C (which is a 10° change) is the same difference in temperature as a change from 40° to 50° (also a 10° change). So we can perform addition and subtraction with this interval scale. However, it doesn’t make sense to say that 20°C is half as hot as 40°C or that 40°C is twice as hot as 20°C. This is because 0°C isn’t the absence of all heat but rather was an arbitrarily chosen point on the scale where water freezes. So it simply doesn’t make sense to discuss ratios, multiplication, or division with the Celsius temperature scale or other interval scales. Other examples of interval data include: IQ scores, dates on a calendar, and longitudes on a map. The key distinction is that the zero point on an interval scale is arbitrarily chosen; it doesn’t represent a natural minimum quantity of the thing being measured. We can perform a few more mathematical operations on interval data than we can on nominal and ordinal data. In addition to testing for equality, sorting by order, and determining both the mode and the median. We can also add or subtract interval data. In addition, we can also determine the arithmetic mean (i.e. the average value in a set of interval values). Interval data are a bit more powerful than nominal and ordinal data in terms of mathematical operations, but still not as powerful as ratio data.