Making an Effective Analysis
A good analysis has five traits:
- It answers the question
- It is made quickly
- It can be shared
- It is self-contained
- It can be revisited
Generally, a good analysis is one that helps non-data scientists do their jobs.
The Request
Business Question -> Data Question -> Data Answer -> Business Answer
Ask the requestor plenty of questions until you thoroughly understand the business question they want answered. A 30-60 minute meeting is usually enough. Examples:
- Who’s requesting the analysis?
- What’s the motive?
- What’s the request?
- What decision will be made?
- Do we have the required data?
The Plan
Make a plan before running away with the data! Don't waste precious time doing something your customer (requestor) doesn't care about.
Use a template for the write-up:
- The top: List the title of the analysis, who you are (in case the analysis is shared with others), and the objective of the analysis.
- Sections: Each section should cover one general topic in the analysis. The work within each section should be self-contained (not relying on the work of other sections), so that a different person could do each section. Each section should have a list of tasks.
- First level of section lists: Each first-level item should be a question that was posed. These questions help everyone remember why you're doing that specific work; once all of them are answered, the section's topic can be considered understood.
- Second level of section lists: Each second-level item should be an actual task that can be checked off as the work is done, such as a type of model to run. Descriptions should be specific enough that, at any time, you can say concretely whether the work has been completed.
Share your plan with your supervisor and requestor to get it approved. You may need to make changes as you go, but changes are easier to manage once you already have a plan in place.
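The template described above can be sketched as a small data structure, which makes the two list levels explicit. This is only an illustration: the class names (`Section`, `Question`), the `render_plan` function, and the churn example content are all hypothetical and not part of the original notes.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    """A first-level item: one question the requestor posed."""
    text: str
    tasks: list[str] = field(default_factory=list)  # second-level, checkable tasks


@dataclass
class Section:
    """A self-contained topic that one person could own."""
    topic: str
    questions: list[Question] = field(default_factory=list)


def render_plan(title: str, author: str, objective: str, sections: list[Section]) -> str:
    """Render the plan as plain text with an unchecked box per task."""
    lines = [f"Title: {title}", f"Author: {author}", f"Objective: {objective}", ""]
    for section in sections:
        lines.append(section.topic)
        for question in section.questions:
            lines.append(f"- {question.text}")
            for task in question.tasks:
                lines.append(f"  - [ ] {task}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical example content, invented for illustration only.
plan = render_plan(
    title="Customer churn analysis",
    author="A. Analyst",
    objective="Decide whether to fund a retention campaign",
    sections=[
        Section(
            topic="Churn by customer segment",
            questions=[
                Question(
                    text="Which segments churn the most?",
                    tasks=[
                        "Compute 90-day churn rate per segment",
                        "Fit a logistic regression of churn on segment",
                    ],
                )
            ],
        )
    ],
)
print(plan)
```

Keeping each task as its own checkbox line makes it easy to tell, at any point, which pieces of work are finished.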
Doing the Analysis
Importing and Cleaning Data
- Spend as little time as possible on anything that won't be needed and as much time as possible on work that will help down the line.
- When you come across weird data points, talk to people and find out where they come from.
Data Exploration and Modeling
- Follow the approved plan and try to complete all the work you had planned. Pivot when necessary.
- Focus on answering the question above all else
- Simple methods are better than complex ones
- Keep your report ready to share at all times, and make the whole analysis rerunnable with one button
- Check in regularly with stakeholders to keep them up to date on your progress and pitfalls
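One way to read "one-button run" is a single entry point that executes every step in order, from raw data to report. The sketch below assumes this interpretation; the function names, the inline CSV, and the revenue-by-region question are hypothetical stand-ins rather than anything from the original notes.

```python
import csv
import io
from statistics import mean

# Hypothetical inline data so the sketch runs anywhere; a real analysis
# would read from the team's agreed data location instead.
RAW_CSV = """region,revenue
north,120
south,
north,80
south,100
"""


def import_data(raw: str) -> list[dict]:
    """Parse the raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))


def clean_data(rows: list[dict]) -> list[dict]:
    """Drop rows with a missing revenue and convert revenue to float."""
    return [
        {"region": row["region"], "revenue": float(row["revenue"])}
        for row in rows
        if row["revenue"].strip()
    ]


def analyze(rows: list[dict]) -> dict[str, float]:
    """Use a simple method: mean revenue per region."""
    by_region: dict[str, list[float]] = {}
    for row in rows:
        by_region.setdefault(row["region"], []).append(row["revenue"])
    return {region: mean(values) for region, values in sorted(by_region.items())}


def build_report(results: dict[str, float]) -> str:
    """Render a shareable plain-text report."""
    lines = ["Mean revenue by region"]
    lines += [f"- {region}: {value:.1f}" for region, value in results.items()]
    return "\n".join(lines)


def main() -> str:
    """The single entry point: every step runs in order, from raw data to report."""
    report = build_report(analyze(clean_data(import_data(RAW_CSV))))
    print(report)
    return report


if __name__ == "__main__":
    main()
```

Because nothing is done by hand between steps, the report can be regenerated and shared at any check-in.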
Wrapping It Up
- Polish your stuff! Whether it's code or slides, make it nice.
- Set up a meeting with the stakeholder to present final findings
- Mothball your work:
  - Double-check that you can rerun the whole analysis
  - Comment your code
  - Add a README file
  - Store your code safely
  - Ensure that the data is stored safely
  - Store the output in a shared location
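Part of the mothballing checklist can be checked automatically. The sketch below only tests whether expected files and folders exist; it cannot tell whether code is commented or data is stored safely. The file names in `REQUIRED_ITEMS` and the `missing_items` helper are hypothetical and should be adapted to your own project layout.

```python
from pathlib import Path
import tempfile

# Hypothetical project layout; adjust the names to your team's conventions.
REQUIRED_ITEMS = {
    "README file": "README.md",
    "one-button entry point": "run_analysis.py",
    "output folder": "output",
}


def missing_items(project_dir: Path) -> list[str]:
    """Return the checklist labels whose files or folders are absent."""
    return [
        label
        for label, relative_path in REQUIRED_ITEMS.items()
        if not (project_dir / relative_path).exists()
    ]


# Demonstrate on a throwaway directory that only has a README so far.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "README.md").write_text("How to rerun this analysis\n")
    print(missing_items(project))  # ['one-button entry point', 'output folder']
```

An empty list means every item in the hypothetical layout is present; the remaining checklist items still need a human review.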