Repeat

The final stage in the data lifecycle is to repeat the process. As I mentioned in the previous step, it’s important that we observe the outcome of our actions to see whether they had a positive impact, a negative impact, or no impact at all. We then use this information as feedback to drive the next iteration of the process.

Feedback is very important in data science: it tells us whether we’re steering the ship in the right direction or headed towards a giant cliff. Data science is a highly iterative process, so we typically repeat this feedback loop on a regular basis. The faster we receive valuable feedback, the quicker we can make course corrections, and the sooner we can achieve our goals. Essentially, this is how we optimize any business process: by continuously improving it over time using feedback.

However, it’s critically important to note that the success of this data-driven process depends on all of the steps that came before it. So we need to ensure that:

– we’ve collected reliable data from our observations,
– we’ve stored it correctly in the right type of persistent storage,
– we’ve processed it correctly using the right tools and methods,
– we’ve analyzed it correctly using the right tools and methods,
– we’ve made a rational decision based on the results of our analysis,
– and we’ve repeated this process using the outcomes of our actions as feedback.

Performing all of these steps correctly is more difficult than it sounds. This is why it’s so important that you learn the rest of the details of data science: so that you can always choose the best possible action given the data.
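To make the feedback loop a little more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: the function name `run_feedback_loop`, its parameters, and the toy outcome function are all hypothetical, and a real business process would measure outcomes from real observations rather than from a formula. The loop takes an action, observes the outcome, and uses it to choose the next action, handling the three cases described above (positive impact, negative impact, no change).

```python
def run_feedback_loop(measure_outcome, initial_action, step, iterations):
    """Repeat act -> observe -> adjust, keeping only the changes that help.

    All names here are hypothetical; this is a toy hill-climbing loop,
    not a prescribed data science method.
    """
    action = initial_action
    best = measure_outcome(action)
    history = [(action, best)]

    for _ in range(iterations):
        candidate = action + step
        outcome = measure_outcome(candidate)

        if outcome > best:
            # Positive impact: keep the change and continue in this direction.
            action, best = candidate, outcome
        elif outcome < best:
            # Negative impact: discard the change, then reverse direction
            # with a smaller step (a course correction).
            step = -step / 2
        else:
            # No change: the feedback carries no new information, so stop.
            break

        history.append((action, best))

    return action, history


# Toy outcome (an assumption for illustration): results peak when the action is 3.
def toy_outcome(action):
    return -(action - 3) ** 2


best_action, history = run_feedback_loop(toy_outcome, initial_action=0, step=1, iterations=20)
print(best_action)  # → 3
```

Notice that the loop is only as good as `measure_outcome`: if the observations feeding it are unreliable, every later iteration is steered by bad feedback, which is why the earlier steps in the lifecycle matter so much.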