Thinking with Data - Max Shron

Thinking with Data

  • I work as a data strategy consultant. I help people figure out what problems they are trying to solve, how to solve them, and what to do with them once the problems are “solved.”
  • What is missing from most conversations is how important the “soft skills” are for making data useful. Determining what problem one is actually trying to solve, organizing results into something useful, translating vague problems or questions into precisely answerable ones, trying to figure out what may have been left out of an analysis, combining multiple lines of argument into one useful result…the list could go on. These are the skills that separate the data scientist who can take direction from the data scientist who can give it, as much as knowledge of the latest tools or newest algorithms.
  • There are four parts to a project scope. The four parts are the context of the project; the needs that the project is trying to meet; the vision of what success might look like; and finally what the outcome will be, in terms of how the organization will adopt the results and how its effects will be measured down the line.
  • Context (Co): Every project has a context, the defining frame that is apart from the particular problems we are interested in solving. Who are the people with an interest in the results of this project? What are they generally trying to achieve? What work, generally, is the project going to be furthering?
  • A mnemonic for these four areas is CoNVO: context, need, vision, outcome.
  • All stories have a structure, and a project scope is no different. Like any story, our scope will have exposition (the context), some conflict (the need), a resolution (the vision), and hopefully a happily-ever-after (the outcome). Practicing telling stories is excellent practice for scoping data problems.
  • We learn the context from talking to people, and continuing to talk to them until we understand what their long-term goals are.
  • If working with data begins as a design process, what are we designing? We are designing the steps to create knowledge. A need that can be met with data is fundamentally about knowledge, fundamentally about understanding some part of how the world works. Data fills a hole that can only be filled with better intelligence. When we correctly explain a need, we are clearly laying out what it is that could be improved by better knowledge. What will this spreadsheet teach us? What will the tool let us know? What will we be able to do after making this graph that we could not do before?
  • And here are some famous needs from within the data world: We want to sell more goods to pregnant women. How do we identify them from their shopping habits? We want to reduce the amount of illegal grease dumping in the sewers. Where might we look to find the perpetrators?
  • We are generally better at criticizing than we are at making things, but when we criticize our own work, it helps us create things that make more sense.
  • Like designers, the process of discovering needs largely proceeds by listening to people, trying to condense what we understand, and bringing our ideas back to people again. Some partners and decision makers will be able to articulate what their needs are. More likely they will be able to tell us stories about what they care about, what they are working on, and where they are getting stuck. They will give us places to start. Sometimes those we talk with are too close to their task to see what is possible.
  • When possible, a well-framed need relates directly back to some particular action that depends on having good intelligence. A good need informs an action rather than simply informing. Rather than saying, “The manager wants to know where users drop out on the way to buying something,” consider saying, “The manager wants more users to finish their purchases. How do we encourage that?” Answering the first question is a component of doing the second, but the action-oriented formulation opens up more possibilities, such as testing new designs and performing user experience interviews to gather more data.
  • Note that the need is never something like, “the decision makers are lacking in a dashboard,” or predictive model, or ranking, or what have you.
  • A data science need is a problem that can be solved with knowledge, not a lack of a particular tool. Tools are used to accomplish things; by themselves, they have no value except as academic exercises.
  • The vision is a glimpse of what it will look like to meet the need with data. It could consist of a mockup describing the intended results, or a sketch of the argument that we’re going to make, or some particular questions that narrowly focus our aims.
  • A mockup primes our imagination and starts the wheels turning about what we need to assemble to meet the need. Mockups, in one form or another, are the single most useful tool for creating focused, useful data work (see Figure 1-1).
  • It’s possible to know everything you need to know for a small, personal project before you even begin. Larger projects, which are more likely to cause something important to change, always have messier beginnings. Information is incomplete, expectations are miscalibrated, and definitions are too loose to be useful. In the same way that the nitty-gritty of data science presumes messier data than is given for problems in a statistics course, the problem definition for large, applied problems is always messier than the toy problems we think up ourselves.
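
The four CoNVO parts noted above (context, need, vision, outcome) can be kept as a simple record while scoping. The following is only a minimal sketch and not something from the book: the `ProjectScope` class, its `missing_parts` helper, and the example context are hypothetical names and values chosen for illustration. The need is the action-oriented one quoted in the notes.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ProjectScope:
    """Hypothetical container for the four CoNVO parts of a project scope."""

    context: str  # who has an interest in the results and what they are trying to achieve
    need: str     # the knowledge gap, ideally phrased around an action
    vision: str   # what meeting the need with data would look like (mockup, argument sketch)
    outcome: str  # how the results will be adopted and how their effects will be measured

    def missing_parts(self) -> List[str]:
        """Return the names of any CoNVO parts that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative scope: the context is an assumption made up for this example.
scope = ProjectScope(
    context="An online store whose managers want to grow completed sales.",
    need="The manager wants more users to finish their purchases. How do we encourage that?",
    vision="",
    outcome="",
)

print(scope.missing_parts())  # ['vision', 'outcome']
```

A blank field here is a prompt to go back and keep talking to people, not a validation failure; the point is only to make the unfinished parts of the scope visible.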
