DDDM

  • We will only cover the basics of DDDM.

  • Many facets of “Data to Decision” can be placed under the DDDM umbrella.

Data-Driven

  • Make informed decisions that are based on facts rather than intuition or guesswork, i.e., without bias or emotion.

  • Clive Humby, 2006: “Data is the new oil”

    • Big data

    • Blindly collecting data and looking for patterns is unlikely to yield long-term profits or revolutionary discoveries.

1. Know your objectives

  • A well-rounded data analyst:

    • Understands the business and the competitive market.

    • Asks the right questions about the problems in the industry.

    • Identifies and understands the problems thoroughly.

    • Makes better inferences with data based on the foundational knowledge.

  • Before collecting data, a data analyst should:

    • Identify the business questions that need to be answered to achieve organizational goals.

    • Determine the precise questions that need to be answered to inform the strategy.

    • Streamline the data collection process and avoid wasting resources.

2. Find relevant data

  • Identify the data sources. This could include databases, web-driven feedback forms, social media, and other sources.

  • Coordinate the data sources. This may involve identifying common variables and ensuring that the data is in a consistent format.

  • Consider the future use of the data. Is it only needed for this project, or could it be used for other projects in the future? If so, you should develop a strategy to present the data in a way that is accessible in other scenarios.

  • In “Data to Decision” we will call this part Data sources.
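The coordination step above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the two sources, their field names (`CustomerID`, `customer_id`, `signup`, `date`, `score`) and the date formats are all invented for the example. It shows the two tasks named above: bringing records into a consistent format and joining them on a common variable.

```python
from datetime import datetime

# Hypothetical records from two sources; all field names and values are made up.
crm_rows = [
    {"CustomerID": "001", "signup": "2023-01-15"},
    {"CustomerID": "002", "signup": "2023-02-03"},
]
survey_rows = [
    {"customer_id": 1, "date": "15/01/2023", "score": "4"},
    {"customer_id": 3, "date": "20/02/2023", "score": "5"},
]


def normalise_crm(row):
    """Map a CRM record onto the shared schema (integer id, ISO date string)."""
    return {
        "customer_id": int(row["CustomerID"]),
        "signup_date": datetime.strptime(row["signup"], "%Y-%m-%d").date().isoformat(),
    }


def normalise_survey(row):
    """Map a survey record onto the shared schema (integer id, ISO date, integer score)."""
    return {
        "customer_id": int(row["customer_id"]),
        "survey_date": datetime.strptime(row["date"], "%d/%m/%Y").date().isoformat(),
        "score": int(row["score"]),
    }


def inner_join(left, right, key):
    """Join two lists of dicts on a common variable, keeping only matching keys.

    Assumes `key` is unique within `right`.
    """
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


crm = [normalise_crm(r) for r in crm_rows]
survey = [normalise_survey(r) for r in survey_rows]
combined = inner_join(crm, survey, "customer_id")
print(combined)
```

Only customer 1 appears in both sources, so the joined result holds a single record. Whether to keep unmatched records (an outer join) depends on the business question.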

3. Pre-process data

  • Data cleaning is important: a commonly cited rule of thumb is that about 80% of a data analyst’s time goes to cleaning and organizing data and only about 20% to analyzing it. Whatever the exact split, an analysis can only be as accurate as the data it is built on.

  • Data cleaning is the process of preparing raw data for analysis. Data cleaning involves a variety of tasks, such as:

    • Identifying and removing incorrect data

    • Filling in missing data

    • Formatting the data in a consistent way

    • Standardizing the data

  • To start data cleaning:

    • Build tables to organize and catalog the data. This will help you better understand the data and identify any problems.

    • Create a data dictionary: a table that catalogs each of your variables and states what it means in the context of this particular project.

  • In “Data to Decision” we will call this Data quality.
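The cleaning tasks listed above can be illustrated with a small standard-library sketch. The records, the plausible age range (0 to 120), the choice of mean imputation and the reading of "standardizing" as z-scores are all assumptions made for this example; a real project would choose them to suit its own data.

```python
from statistics import mean, pstdev

# Hypothetical raw records; names, ages and cities are made up.
raw = [
    {"name": " Alice ", "age": "34", "city": "oslo"},
    {"name": "Bob", "age": "", "city": "Bergen"},      # missing age
    {"name": "carol", "age": "-5", "city": "OSLO"},    # impossible age
    {"name": "Dave", "age": "46", "city": "bergen "},
]

# A data dictionary: what each variable means in this (invented) project.
data_dictionary = {
    "name": "Customer first name, title case",
    "city": "City of residence, title case",
    "age": "Age in years; missing values filled with the mean of known ages",
    "age_z": "Age standardized to zero mean and unit variance",
}


def clean(rows):
    cleaned = []
    for r in rows:
        age_text = r["age"].strip()
        age = int(age_text) if age_text else None  # None marks a missing value
        # Identify and remove incorrect data: drop rows with an impossible age.
        if age is not None and not 0 <= age <= 120:
            continue
        # Format consistently: trim whitespace and use title case.
        cleaned.append({
            "name": r["name"].strip().title(),
            "city": r["city"].strip().title(),
            "age": age,
        })

    # Fill in missing data with the mean of the known ages (one simple choice).
    fill = mean(r["age"] for r in cleaned if r["age"] is not None)
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill

    # Standardize: z-scores; assumes the ages are not all identical (sigma > 0).
    ages = [r["age"] for r in cleaned]
    mu, sigma = mean(ages), pstdev(ages)
    for r in cleaned:
        r["age_z"] = (r["age"] - mu) / sigma
    return cleaned


cleaned = clean(raw)
for row in cleaned:
    print(row)
```

Each choice here changes the data that the analysis later sees, which is one reason to record it in the data dictionary.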

4. Analyse data

  • Once you’ve cleaned the data, you can start to analyze it using statistical models. This involves building models to test the data and answer the business questions you identified earlier in the process.

  • There are three different ways to present your findings:

    • Descriptive information: This is just the facts.

    • Inferential information: This includes the facts, plus an interpretation of what those facts indicate in the context of a particular project.

    • Predictive information: This is an inference based on the facts, plus advice for further action that follows from your reasoning.

  • Clarifying how the information will be most effectively presented will help you remain organized when it comes time to interpret the data.

  • We will use graphics to condense data into interpretable information.

  • In “Data to Decision” we will call this Machine Learning.
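To make the three kinds of findings concrete, here is a minimal sketch on invented monthly sales figures. The numbers, the variable names and the use of a straight-line trend are assumptions for illustration only. The same data yields a descriptive summary, an inferential statement (the fitted trend, read in context) and a predictive statement (a forecast with a suggested action).

```python
from statistics import mean

# Hypothetical data: units sold in six consecutive months.
months = [1, 2, 3, 4, 5, 6]
sales = [10, 12, 13, 15, 16, 18]

# Descriptive: just the facts.
descriptive = {"mean": mean(sales), "min": min(sales), "max": max(sales)}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Inferential: the facts plus an interpretation in context.
slope, intercept = fit_line(months, sales)
inferential = f"Sales rose by about {slope:.2f} units per month over the period."

# Predictive: an inference plus advice for further action.
forecast = intercept + slope * 7
predictive = (
    f"If the trend holds, expect about {forecast:.1f} units in month 7; "
    "plan stock accordingly."
)

print(descriptive)
print(inferential)
print(predictive)
```

A six-point linear trend is far too little evidence for a real forecast; the point is only how the same facts are framed at each level.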

5. Make decisions

  • Did the analysis:

    • confirm suspicions?

    • shed new light on previously accepted “facts”?

    • find something deviating from expectations?

    • reveal flaws or weaknesses?

  • In “Data to Decision” we might not make decisions ourselves, but we will perform Deployment.

Data and decision making

  • Simon Jackson defines the following five flavours of decision making:

    • Data-driven: Committing, before seeing the data, to let the data determine the decision. This means trusting that the DDDM pipeline is well designed and working.

    • Data-inspired: Making a decision after looking at data or visualisations, believing that you see trends, patterns or anomalies. This means you have used data, but have based your choices on vague notions rather than testable assumptions.

    • Data-aspired: Making decisions without data, but having a plan to collect and analyse data at a later stage. This means you at least wish to collect data, but do not commit to changing your decision if data-based evidence arrives later.

    • Data-ignorant: Making decisions without considering data. This means you either do not have data available, have not been able to leverage data, or go against the evidence found by analysis.

    • Data-tortured: First making a decision, then looking through data until you find something that seems to support your case. This means you are biased by your own beliefs when looking at data, possibly grasping at straws. “If you torture the data long enough, it will confess to anything” ~ loosely after Ronald Coase in the early 1960s.
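The difference between the data-driven and data-tortured flavours can be illustrated with a small simulation. This is a sketch under stated assumptions: every "metric" compares two groups drawn from the same distribution, so any apparent effect is a false alarm; the group size (30), the threshold (an absolute test statistic above 2, roughly a 5% significance level) and the function names are invented for the example. The data-driven analyst commits to one metric in advance, while the data-tortured analyst examines twenty and stops at the first that "confesses".

```python
import random
from statistics import mean, variance


def looks_significant(rng, n=30, threshold=2.0):
    """Compare two groups drawn from the SAME distribution.

    Returns True when the difference looks significant, which here is
    always a false positive because no real effect exists.
    """
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    standard_error = ((variance(a) + variance(b)) / n) ** 0.5
    return abs(mean(a) - mean(b)) / standard_error > threshold


def false_alarm_rate(metrics_examined, experiments=200, seed=0):
    """Fraction of experiments in which at least one metric looks significant."""
    rng = random.Random(seed)
    hits = sum(
        any(looks_significant(rng) for _ in range(metrics_examined))
        for _ in range(experiments)
    )
    return hits / experiments


data_driven = false_alarm_rate(metrics_examined=1)     # one pre-committed metric
data_tortured = false_alarm_rate(metrics_examined=20)  # keep looking until one fits

print(f"one pre-committed metric: {data_driven:.0%} false alarms")
print(f"best of twenty metrics:   {data_tortured:.0%} false alarms")
```

With independent tests at a 5% level, the chance of at least one false alarm among twenty is about 1 − 0.95^20 ≈ 64%, so the second figure should come out far higher than the first. The data have not changed; only the willingness to keep searching has.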