How do we extract information from tabular data?
The answer is a query!
A query is a computer representation of a question
we want to answer using a table of data.
Queries allow us to ask questions of the data
and return answers as results.
Queries are created using programming languages.
More specifically, we use a special type of
programming language called a query language.
The most popular query language is Structured
Query Language (or SQL for short).
However, you can also perform queries using
other programming languages like Python and R.
For example, let’s say we want to answer the question,
“What was Bill’s average body temperature
over the past 5 years?”
We could write this question in the form of
a SQL query.
The SQL query allows us to express that we
select the average temperature from the
Vital Signs table
where the patient’s name is “Bill”
and the date is greater than or equal to January 1, 2015
(which we’re assuming is five years ago).
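Here is a minimal sketch of what that query might look like, run from Python with the built-in sqlite3 module. The table name VitalSigns, the column names PatientName, Date and Temperature, and the sample rows are all assumptions made up for illustration; they are not taken from a real dataset.

```python
import sqlite3

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VitalSigns (PatientName TEXT, Date TEXT, Temperature REAL)"
)
conn.executemany(
    "INSERT INTO VitalSigns VALUES (?, ?, ?)",
    [
        ("Bill", "2014-06-01", 38.0),   # too old, excluded by the date filter
        ("Bill", "2016-03-15", 36.5),
        ("Bill", "2018-09-20", 37.0),
        ("Bill", "2019-11-02", 37.5),
        ("Alice", "2017-05-05", 36.0),  # different patient, excluded by name
    ],
)

# Select the average temperature from the Vital Signs table
# where the patient's name is "Bill" and the date is on or after
# January 1, 2015. Dates are stored as ISO text (YYYY-MM-DD), so a
# plain string comparison orders them correctly.
query = """
SELECT AVG(Temperature)
FROM VitalSigns
WHERE PatientName = 'Bill'
  AND Date >= '2015-01-01'
"""

average_temperature = conn.execute(query).fetchone()[0]
print(average_temperature)  # 37.0 for the sample rows above
```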
I’ve intentionally kept this query simple
for those of you who have never seen a SQL query.
However, if you’ve had some experience with
SQL, you may see several ways we could improve it.
When we execute this query:
First, the database will scan all of the records
in the Vital Signs table.
Next, it will filter out anyone who is not
our patient named Bill.
Then, it will filter out any row that is older
than January 1, 2015.
Next, it will select just the Temperature
column from the Vital Signs table.
Finally, it will compute the average of the
remaining temperature values.
The computer will then return this value as the result.
As we can see, the result of executing this
query is 37 degrees Celsius.
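The execution steps just described can be sketched in plain Python, one step per line. This is only a conceptual illustration: the records and field names are made up, and a real database engine is free to carry out these steps in a different order or to use indexes instead of a full scan.

```python
from datetime import date
from statistics import mean

# Made-up records standing in for the Vital Signs table (illustration only).
vital_signs = [
    {"name": "Bill", "date": date(2014, 6, 1), "temperature": 38.0},
    {"name": "Bill", "date": date(2016, 3, 15), "temperature": 36.5},
    {"name": "Bill", "date": date(2018, 9, 20), "temperature": 37.0},
    {"name": "Bill", "date": date(2019, 11, 2), "temperature": 37.5},
    {"name": "Alice", "date": date(2017, 5, 5), "temperature": 36.0},
]

# 1. Scan all of the records in the table.
records = list(vital_signs)

# 2. Filter out anyone who is not our patient named Bill.
bills_records = [r for r in records if r["name"] == "Bill"]

# 3. Filter out any row dated before January 1, 2015.
recent_records = [r for r in bills_records if r["date"] >= date(2015, 1, 1)]

# 4. Select just the temperature column.
temperatures = [r["temperature"] for r in recent_records]

# 5. Compute the average of the remaining temperature values.
result = mean(temperatures)
print(result)  # 37.0 for the sample records above
```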
Essentially, this is how we get from a question
to an answer, using a query.
We start with a question in our natural language.
We construct a SQL query to express that question
to the computer.
We execute the query, which returns a result.
And then we express that result in the form
of an answer to the question.
Queries are the primary tool used for extracting
information from tabular data in data science.