Foundations: Data, Data, Everywhere
Introducing data analytics
Thinking analytically
Exploring the wonderful world of data
Setting up a data toolbox
Discovering data career possibilities
Course Challenge
Video 01 Introducing data analytics
Data helps us make decisions in everyday life and in business.
In this first part of the course, you’ll learn how data analysts use data analytics and the tools of their trade to inform those decisions.
You’ll also discover more about this course and the overall program expectations.
Define key concepts involved in data analytics including data, data analysis, and data ecosystem
Discuss the use of data in everyday life decisions
Identify the key features of the learning environment and their uses
Describe principles and practices that will help to increase one's chances of success in this certificate
Explain the use of data in organizational decision-making
Describe the key concepts to be discussed in the program, including learning outcomes
A data analyst collects, processes, and performs statistical analyses on large datasets.
They discover how data can be used to answer questions and solve problems.
The analyst’s role is to identify patterns, make connections and help organizations make decisions.
However, the data analyst is not typically a trained statistician and therefore may not be able to interpret the statistical results of their analysis.
We are developing a new data analyst role to help address these challenges.
Data analysts work on a wide range of projects and tasks (data mining) to find patterns and insights.
They use data to answer questions, assess trends, and understand relationships in the data.
Data analysts work in a variety of disciplines ranging from business and management to biomedical research and education.
The data analyst is often described as a gatekeeper to the organization's data; the role includes analyzing that data to extract meaning, identify trends, and identify areas of opportunity.
This role is often described as the “data analyst”.
“Organizations are using data analytics to make their business better.” - Jim Collins
According to the Google Data Analytics Certificate course, there are six phases of data analysis.
These six phases will help you grow as a data analyst.
1. Ask
2. Prepare
3. Process
4. Analyze
5. Share
6. Act
The Six Data Analysis Phases
Data Analysis is never easy, yet we are all Data Analysts. Remember the time you opened the Excel file and compared two options for purchasing a new laptop? Or the time you looked at the stock market to understand where we are heading?
Congratulations, you are a Data Analyst. But analysis is a skill that is refined over years and years of hard work. That does not mean there are no ways to make it a bit easier to start.
I have been looking closely at how Google does things, and I really like their framework, so let me use this article to reflect on it.
According to Google, there are six data analysis phases or steps: ask, prepare, process, analyze, share, and act.
Following them should result in a frame that makes decision-making and problem solving a little easier.
Please let us not mix them with the data life cycle, let's keep that for another time. But to understand how the six phases would help in decision-making, let's review them.
Step 1: Ask - Understand the problem
It is always important to understand what the problem or question actually is.
Making assumptions or not fully understanding the problem leads to wrong conclusions and, in turn, wrong actions.
Identifying the problem is naturally also one of the hardest tasks. As a quote often attributed to Albert Einstein goes: 'If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and 5 minutes thinking about solutions.'
So what would help to identify the problem? The following actions should help:
State the problem. This will become your first cornerstone. If it should change over the course of time, it is very natural. The more we know the wiser we are.
All the problem statements should be measurable, clear and concise.
Our statement of the problem is our focus. Everything else should be an afterthought and avoided.
Try to see the bigger picture! Take a step back and see the whole situation in context. And context is crucial here. Different settings can give different meanings.
Never set sail alone, and make sure you fully understand the collaborators' expectations. This means getting people involved and getting their views and interests.
Once that is done, also make clear what they expect! Be open in all conversations and do not play the telephone game*: try to communicate with everyone in the same manner and, if possible, in the same conversation.
*Chinese whispers (Commonwealth English) or telephone (North American English) is an internationally popular children's game. Players form a line or circle, and the first player comes up with a message and whispers it to the ear of the second person in the line.
Questions to ask:
What are the stakeholders stating as their problems?
How can the stakeholders' questions be resolved? (This is a bit of reverse engineering.)
Is the stated problem really the root cause?
Step 2: Prepare - What do I need?
Once there is an understanding of the problem, one can think about how to solve this. Time to decide what data needs to be collected in order to answer the questions and how to organize it so that it is useful.
One should think about the following aspects:
What metrics to measure? (*Metrics are quantitative measurements).
Answering this question might also require answering sub-questions (e.g., is our time-to-market competitive for product X? If not, what process improvements would help?).
What factors should be taken into account?
Where is the data located (files, database, external system, internal system)?
If the data will be moved, how will it be stored, and what security measures are needed to protect it?
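To make the idea of a metric concrete, here is a minimal sketch in Python. The product names, dates, and the 180-day target are all made up for illustration; the only point is that a metric such as time-to-market becomes a number that can be computed and compared.

```python
from datetime import date

# Hypothetical records: when work on each product started and when it launched.
products = [
    {"name": "Product X", "started": date(2023, 1, 10), "launched": date(2023, 7, 19)},
    {"name": "Product Y", "started": date(2023, 3, 1), "launched": date(2023, 6, 29)},
]

TARGET_DAYS = 180  # assumed benchmark for illustration, not a real industry figure


def time_to_market_days(record):
    """Metric: number of days between project start and launch."""
    return (record["launched"] - record["started"]).days


for record in products:
    days = time_to_market_days(record)
    status = "within target" if days <= TARGET_DAYS else "over target"
    print(f"{record['name']}: {days} days ({status})")
```

Writing the metric down as a small function forces a decision on exactly what is being measured (here, calendar days from start to launch), which is the kind of clarity this step is after.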
Questions to ask yourself in this step:
What needs to be figured out in order to solve this problem?
What would help to measure the outcome of any change to the problematic area?
What research is needed?
Where is the information held?
Step 3: Process - Make it usable!
When we start using the data, it might be a combination from different sources or it might not be of the highest quality. Data cleaning is the process of fixing or removing incorrect, corrupted, incorrectly formatted, duplicate, or incomplete data within a dataset. What we aim to achieve is clean data.
And to tell the truth, that is a science on its own.
There are plenty of tools, theories, and methods to use, but let's keep everything basic here. Data cleaning does not require fancy tools or words; a simple spreadsheet program (yes, that is Excel) will suffice, although my preference lies with others (Pandas!). So during this step one might:
Use proper tools to find incorrect and incomplete data.
Remove inconsistencies in the data. Sometimes there might be duplicated entries.
Identify whether the data is biased, one of the most important aspects to keep in mind. Essentially, biased data will not be representative of the population or phenomenon under study, that is, of the issue we are trying to solve.
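As a rough illustration of the first two points, here is a small sketch in plain Python (no Pandas needed to follow it; in Pandas the same ideas map onto calls such as `DataFrame.drop_duplicates` and `DataFrame.isna`). The rows, column names, and the rule for what counts as "incomplete" are assumptions made up for this example.

```python
# Hypothetical raw rows, e.g. loaded from a spreadsheet export.
raw_rows = [
    {"customer": "Ana", "city": "Lisbon", "amount": "120.50"},
    {"customer": "Ana", "city": "Lisbon", "amount": "120.50"},  # exact duplicate
    {"customer": "Ben", "city": "", "amount": "80"},            # missing city
    {"customer": "Cara", "city": "Oslo", "amount": "n/a"},      # not a number
    {"customer": "Dev", "city": "Pune", "amount": "42"},
]


def is_number(text):
    """Return True if the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def clean(rows):
    """Split rows into clean ones and ones that need a closer look."""
    seen = set()
    clean_rows, flagged_rows = [], []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:  # drop exact duplicates
            continue
        seen.add(key)
        incomplete = any(value.strip() == "" for value in row.values())
        if incomplete or not is_number(row["amount"]):
            flagged_rows.append(row)  # keep for review instead of silently deleting
        else:
            clean_rows.append(row)
    return clean_rows, flagged_rows


clean_rows, flagged_rows = clean(raw_rows)
print(len(clean_rows), "clean rows;", len(flagged_rows), "rows to review")
```

Flagging suspicious rows rather than deleting them is a deliberate choice in this sketch: it leaves a trail that can be checked later, which also helps when asking whether the cleaning itself introduced bias.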
Questions to ask yourself in this step:
Is the data source trustworthy and the data quality high?
What data errors or inaccuracies could occur within the given dataset?
What is the best possible answer to the problem being solved?
How can the data be cleaned so the information is more consistent? (e.g., replace missing values with mean values, et cetera)
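The mean-value idea from the last question can be sketched in a few lines of Python. The sales figures are invented, and `None` is simply the marker chosen here for a missing value.

```python
from statistics import mean

# Hypothetical daily sales figures; None marks a missing value.
daily_sales = [200.0, None, 180.0, 220.0, None]

# Compute the mean from the observed values only.
known = [value for value in daily_sales if value is not None]
fill_value = mean(known)

# Replace each missing entry with that mean; observed values are left untouched.
filled = [fill_value if value is None else value for value in daily_sales]
print(filled)
```

Keep in mind that filling gaps with the mean makes the data look less varied than it really is, so it is worth noting in your documentation whenever this has been done.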
Step 4: Analyze - Tell me the story!
Next up is drawing some conclusions based on the trustworthy data. Data analysis is a skill that takes time to master, but over time the patterns will emerge faster and the methods one uses will develop.
The main idea is to think analytically about your data, be critical, and be creative. There might be a need to sort and format the data to make it easier to process, make a pivot table, or create awesome graphs! Remember, it is a story that must unfold. Further processing might include:
Performing different calculations to get additional metrics.
Combining additional data attributes from a variety of sources to get a more comprehensive story.
Creating different views of the data, such as tables with your results that you can filter and pivot.
Make it visual if possible! Charts tell more than a thousand words.
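To show what "pivoting" and "additional metrics" can look like in practice, here is a minimal sketch in plain Python. The regions, quarters, and revenue numbers are made up; a spreadsheet pivot table or Pandas would do the same job with less typing.

```python
from collections import defaultdict

# Hypothetical cleaned sales records.
sales = [
    {"region": "North", "quarter": "Q1", "revenue": 100},
    {"region": "North", "quarter": "Q2", "revenue": 150},
    {"region": "South", "quarter": "Q1", "revenue": 80},
    {"region": "South", "quarter": "Q2", "revenue": 60},
    {"region": "North", "quarter": "Q1", "revenue": 50},
]

# Pivot: rows are regions, columns are quarters, cells are summed revenue.
pivot = defaultdict(lambda: defaultdict(int))
for record in sales:
    pivot[record["region"]][record["quarter"]] += record["revenue"]

# Print the pivot as a small text table.
quarters = sorted({record["quarter"] for record in sales})
print("region".ljust(8) + "".join(q.rjust(6) for q in quarters))
for region in sorted(pivot):
    cells = "".join(str(pivot[region][q]).rjust(6) for q in quarters)
    print(region.ljust(8) + cells)

# An additional metric derived from the pivot: change from Q1 to Q2 per region.
growth = {region: pivot[region]["Q2"] - pivot[region]["Q1"] for region in pivot}
print(growth)
```

Even in this tiny table the story starts to show: one region is flat while the other is shrinking, which is exactly the kind of pattern that should prompt the next question.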
Questions to ask yourself in this step:
What story is my data telling me?
Why can’t it be done?
Will X (e.g. time, money, manpower or expertise) allow us to solve the issue?
How will my data help me solve this problem?
Who needs my company’s product or service?
What type of person is most likely to use it?
Step 5: Share - Get different views
One thing still to remember: whatever we do, we are biased. So as the next step, get additional opinions about the findings. This will significantly help to improve the results and ensure that the main aspects were taken into account.
There are many ways to share the findings, and each person has their preference, as does each company. However, many studies indicate that the story is better understood when the analysis results come with clear and enticing visuals.
(A good article on this: https://hdsr.mitpress.mit.edu/pub/zok97i7p/release/3).
The tools do not really matter here; it can be Tableau, Excel, or even good old paper and pencil! But take this as a chance to show the stakeholders how their problem was solved. Sharing will certainly help with:
Making better decisions. The feedback will help to answer questions that were not thought of initially.
Making more informed decisions. Feedback will not be mere criticism, but also suggestions and additional information on the matter.
Improving the general outcome. Not only will the decision most likely be better informed, but the transparency will also build more support for the findings.
Questions to ask yourself in this step:
How can I make what I present to the stakeholders engaging and easy to understand?
What would help me understand this if I were the listener?
What makes a data visualisation good?
Step 6: Act - We know the problem, Let's solve it!
No analysis conclusion should be left to collect dust on a shelf! Rather, some action should be taken.
Taking the results and depending on the problem statement, recommendations for further actions can be made. And once the recommendations are ready, the actual decision can be made!
The person who conducted the analysis is not necessarily the one who makes the decision; acting can also mean providing the decision-makers (stakeholders) with recommendations based on the findings so they can make data-driven decisions. The key here is data-driven decisions.
Questions to ask yourself in this step:
How can the feedback received during the sharing phase (step 5) be used to meet the stakeholder’s needs and expectations?
What potential solutions to the outlined problem could there be?
Is this problem worth solving? (Yes, that is also a potential outcome)
And done you are!
These are the six steps that Google has outlined for Data Analysis, and they do help to structure the thinking when conducting Data Analysis. Structured thinking, in turn, means breaking the (data analysis) process into smaller, manageable parts.
According to Googlers, this process involves four basic activities:
Recognizing the current problem or situation.
Organizing available information.
Revealing gaps and opportunities.
Identifying your options.
But with that, let's go and make some data-driven decisions!
Start the Program
Data analysis is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.
A data analyst is someone who collects, transforms, and organizes data in order to help make informed decisions.
Program description and course syllabus for the Google Data Analytics Certificate
The program you are about to explore is specifically designed to help every type of learner successfully finish the certificate and become an entry-level junior or associate data analyst.
No previous data analytics, mathematics, or statistical experience is required.
To succeed, you just need to be open to learning how data influences the world.
Become job-ready
Every day, the amount of data out there gets bigger and bigger.
So the ability to interpret it effectively is more important than ever before.
Data analytics is becoming one of the fastest-growing and most rewarding career choices in the world.
In the next decade, the demand for business analytics skills will probably be higher than the demand for any other career (10.9% vs. 5.2%) (Source: Bureau of Labor Statistics). All kinds of companies all over the world need qualified data analysts to solve problems and help them make the best possible business decisions.
And right now, fifty-nine percent of companies have plans to add even more positions requiring data analysis skills (Source: SHRM).
By the time you are done with this program, you will be well-prepared to make smart, strategic, data-driven recommendations for organizations in all kinds of industries.
During each course of the program, you will complete lots of hands-on assignments and projects based on both day-to-day life and the practical activities of a data analyst.
Along the way, you will learn how to ask the right questions and understand objectives.
You will also learn how to effectively clean and organize large amounts of data to make it ready for high-quality analysis.
On top of that, you will get hands-on experience using all kinds of tools and techniques that will help you recognize patterns and uncover relationships between data points.
And to help you communicate the results of your analysis, you will learn how to design visuals and dashboards. There is even an opportunity to create a case study, which you can highlight in your resume to show what you have learned to potential employers.
Course content
Course 1– Foundations: Data, Data, Everywhere
1. Introducing data analytics: Data helps us make decisions, in everyday life and in business. In this first part of the course, you will learn how data analysts use tools of their trade to inform those decisions. You will also get to know more about this course and the overall program expectations.
2. Thinking analytically: Data analysts balance many different roles in their work. In this part of the course, you will learn about some of these roles and the key skills that are required. You will also explore analytical thinking and how it relates to data-driven decision making.
3. Exploring the wonderful world of data: Data has its own life cycle, and data analysts use an analysis process that cuts across and leverages this life cycle. In this part of the course, you will learn about the data life cycle and data analysis process.
They are both relevant to your work in this program and on the job as a future data analyst. You will be introduced to applications that help guide data through the data analysis process.
4. Setting up a data toolbox: Spreadsheets, query languages, and data visualization tools are all a big part of a data analyst’s job. In this part of the course, you will learn the basic concepts to use them for data analysis. You will understand how they work through examples provided.
5. Discovering data career possibilities: All kinds of businesses value the work that data analysts do. In this part of the course, you will examine different types of businesses and the jobs and tasks that analysts do for them.
You will also learn how a Google Data Analytics Certificate will help you meet many of the requirements for a position with these organizations.
6. Completing the Course Challenge: At the end of this course, you will be able to put everything you have learned into perspective with the Course Challenge. The Course Challenge will ask you questions about the main concepts you have learned and then give you an opportunity to apply those concepts in two scenarios.
What to expect
Each week of the course includes a series of lessons with many types of learning opportunities. These include:
Videos for instructors to teach new concepts and demonstrate the use of tools
Readings to introduce new ideas and build on the concepts from the videos
Discussion forums to share, explore, and reinforce lesson topics for better understanding
Discussion prompts to promote thinking and engagement in the discussion forums
Practice quizzes to prepare you for graded quizzes
Graded quizzes to measure your progress and give you valuable feedback
Also, be sure to pay attention to the in-video questions that will pop up from time to time. They are designed for you to check your learning.
Everyone learns differently, so this program has been designed to let you work at your own pace. Although your personalized deadlines start when you enroll, they are just a guide. Feel free to move through the program at the speed that works best for you.
There is no penalty for late assignments; to earn your certificate, all you have to do is complete all of the work. If you prefer, you can extend your deadlines by returning to Overview in the navigation panel and clicking Switch Sessions. Assessments are based on the approach taken by the course to offer a wide variety of learning materials and activities that reinforce important skills.
Graded and ungraded quizzes will help the content sink in. Ungraded practice quizzes are a chance for you to prepare for the graded quizzes. Both types of quizzes can be taken more than one time.
Optional speed track for those experienced in data analytics
The Google Data Analytics Certificate provides instruction and feedback for learners hoping to earn a position as an entry-level data analyst. While many learners will be brand new to the world of data analytics, others may be familiar with the field and simply wanting to brush up on certain skills.
If you believe this course will be primarily a refresher for you, we recommend taking the practice diagnostic quiz (you can find it in this week's content).
It will enable you to determine if you should follow the speed track, which is an opportunity to proceed to Course 2 after having taken each of the Course 1 Weekly Challenges and the overall Course Challenge.
Learners who score 100% on the diagnostic quiz can treat Course 1 videos, readings, and activities as optional. Learners following the speed track are still able to earn the certificate.
Tips
It is strongly recommended to take these courses—and go through the items in each lesson—in the order they appear because new information and concepts build on previous knowledge.
Use the additional resources that are linked throughout the program. They are designed to support your learning.
When you encounter useful links in the course, remember to bookmark them so you can refer to the information for study or review.
Additional resources are free, but some sites place limits on how many articles can be accessed for free each month. Sometimes you can register on the site for full access, but you can always bookmark a resource and come back to view it later.
If something is confusing, don’t hesitate to re-watch a video, go through a reading again, and so on.
Take part in all learning opportunities to gain as much knowledge and experience as possible.
Congratulations on choosing to take this first step toward becoming part of the wonderful world of data analytics. Enjoy the journey!
Learning Log: Think about data in daily life Overview
By now, you've started to discover how powerful data can be. Throughout this course, you’ll be asked to make entries in a learning log.
Your log will be a personal space where you can keep track of your thinking and reflections about the experiences you will have collecting and analyzing data.
Reflections may include what you liked, what you would change, and questions that were raised.
By the time you complete the entry for this activity, you will have a stronger understanding of data analytics.
Everyday data
Before you write an entry in your learning log, think about where and how you use data to make decisions. You will create a list of at least five questions that you might use data to answer. Here are a few examples to inspire you:
A. What’s the best time to go to the gym?
B. How does the length of your commute to work vary by day of the week?
C. How many cups of coffee do you drink each day?
D. What flavor of ice cream do customers buy?
E. How many hours of sleep do you get each day?
Then, you will select one of the five questions from your list to explore further and write down the types of data you might collect in order to make a decision. That’s data analysis in action!
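As an optional illustration, question B above could be explored with a few lines of Python once you have logged some data. The commute times below are invented for the example; your own log would replace them.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical commute log: (day of week, minutes door to door).
commute_log = [
    ("Mon", 42), ("Tue", 35), ("Wed", 33), ("Thu", 36), ("Fri", 28),
    ("Mon", 46), ("Tue", 37), ("Wed", 35), ("Thu", 34), ("Fri", 30),
]

# Group the logged minutes by day of the week.
minutes_by_day = defaultdict(list)
for day, minutes in commute_log:
    minutes_by_day[day].append(minutes)

# Average commute per day, and the day with the longest average.
average_by_day = {day: mean(values) for day, values in minutes_by_day.items()}
slowest_day = max(average_by_day, key=average_by_day.get)

for day, avg in average_by_day.items():
    print(f"{day}: {avg:.1f} min")
print("Longest average commute:", slowest_day)
```

The same pattern (collect, group, summarize, compare) applies to the other example questions as well, whether you work in code, a spreadsheet, or on paper.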
Access your learning log
To use the learning log for this course item, click the link below and select Use Template.
Reflection
After you consider how you use data analysis in your own life, take a moment to reflect on what you discovered. Reflections may include what you liked, what you would change, and questions that were raised. In your new learning log entry, you will write 2-3 sentences (40-60 words) in response to each question below:
What are some considerations or preferences you want to keep in mind when making a decision?
What kind of information or data do you have access to that will influence your decision?
Are there any other things you might want to track associated with this decision?
When you’ve finished your entry in the learning log template, make sure to save the document so your response is somewhere accessible. This will help you continue applying data analysis to your everyday life. You will also be able to track your progress and growth as a data analyst.
Video 02 Introduction to the course
"Data! Data! Data! I can't make bricks without clay." Any guesses who said this? I'll give you a hint. It wasn't a famous tech CEO, or a data analyst.
The person who said this lived long before the tech companies even existed. But I bet you've still heard of him.
This line was said by Sherlock Holmes, the famous detective created by Sir Arthur Conan Doyle. What Doyle meant was that Holmes couldn't draw any conclusions, which would be the bricks he mentioned, without data, or the clay. You're probably not here to become a world famous detective, but data is still the building block that you'll use for everything you do in your new data analyst career. Sherlock Holmes would agree. By starting this program, you've shown that you and Sherlock Holmes have something in common: you both have an interest in learning more.
That's one of the most important qualities that data analysts can have. Now, there are a bunch of different ways to explore data, but one of the great things about data analytics is that you can often learn how you want, when you want. That might mean doing your own research, talking with people in the industry, or taking online courses. With that said, welcome to your first course.
This is your introduction to the wonderful world of data analytics. Since data analytics is the science of data, you'll use this course to begin to learn all about data. Data is basically a collection of facts or information, and through analysis, you'll learn how to use the data to draw conclusions, and make predictions, and decisions. Personally, I didn't jump right into the data analytics field. I thought data analysis was for computer engineers. Instead, I started off with dreams of working in finance. Once I got through an internship though, I realized it wasn't the career path I wanted to take.
I started to learn about financial planning and analysis, and all of the work finance analysts were doing with data. I realized that finance analysts are really just data analysts working in a finance department. These analysts were helping to guide business decisions by knowing how to use data. It was then I realized how powerful data is, and I started to embrace it. Soon enough, I realized I could do this data analysis myself. Data analytics is a big open world of opportunity. There are so many areas that your analysis skills can be applied and in all different ways.
If you're new to this world, you'll learn how to identify which path and industry might suit your skills, and your interests the best. For those of you who already have some experience, we'll help you open doors to new and exciting opportunities.
One of the skills you'll gain from the program is how to follow the best practices that analysts use to help make data-driven decisions. Computers are one part of the process, but analysts rely on so much more to make decisions. That's why learning how to think analytically, and using your other skills and traits on the job will make your work easier. I know you already know how to make good decisions, you chose to be here after all. In this first course, you'll learn more about each phase of the data analysis process: ask, prepare, process, analyze, share, and act. As a data analyst, you'll go through these steps as you use data to inform your decisions. Eventually, you'll see how this program itself is in a way, its own version of this process. While I know you'll enjoy watching these videos, your trip through the first course will include a whole lot more.
Other videos will take the form of vignettes, where you'll learn from data analytics professionals, who are already established in their careers. They'll offer words of wisdom as well as tales of their own experiences starting off on their career path. You'll start your own data journal that will help you keep track of what you've learned throughout the course. You'll also add your own thoughts about what you're learning as well, throughout the program.
You'll read up on how to navigate this program and the world of data analytics. You'll complete activities, including some that will help you get in the mindset of a data analyst. Along the way, you'll also have the chance to connect with your fellow learners.
Discussion prompts will give you a chance to share your thoughts, and at the same time see what your peers think about all that you're learning. These prompts will help you build a community support system to use throughout the program. Enough talking, let's get started on this exciting path. Your next step awaits.
Helpful resources to get started
The Google Data Analytics Certificate is designed to provide you with new lessons every week. As you’ve learned, each one includes a series of videos, readings, peer discussions, in-video questions, practice quizzes, and graded quizzes.
In this reading, you’ll learn about providing feedback on course content, obtaining the Google Data Analytics Certificate, and helpful habits for successfully completing the certificate.
Providing feedback or getting help on course content
Please remember to give feedback on videos, readings, and materials. Just open the resource, and look for the thumbs-up and thumbs-down symbols.
Click thumbs-up for materials that are helpful.
Click thumbs-down for materials that are not helpful.
That feedback goes to the course developers, not other learners, and helps improve this course.
For technical help on Coursera, visit the Learner Help Center. For help accessing course materials, click the Contact us link at the bottom of the page.
Obtaining the Google Data Analytics Certificate
After you complete all eight courses, you qualify for the Google Data Analytics Certificate.
To receive your certificate, you must:
Pass all required assignments in the course or meet the course-passing threshold. Each graded assignment is part of a cumulative graded score, and the passing grade for the Google Data Analytics Certificate is 80%. AND
Pay the Course Certificate fee ($39/month, with most learners completing the material in 6 months or less), or apply and be approved for a Coursera scholarship.
You can review videos, readings, discussion forums, in-video questions, and practice quizzes in the program for free.
However, you won’t have access to graded assignments. If you choose to go ahead and earn your certificate, you’ll need to upgrade to the certificate program, unlock the graded assessments, and finish those steps.
Helpful habits for successfully completing the certificate
As a learner, you’re bringing all of your past experiences and best learning practices to this program.
The designers of this course have also put together a list of helpful habits that they believe will help you to be the most successful:
Plan your time: Setting regular study times and sticking with them each week can help you make learning a part of your routine.
Use a calendar or timetable to create a schedule.
Listing what you plan to do each day will break your work down into achievable goals.
And creating a quiet place to watch the videos, review the readings, and complete the activities is important, so you can really focus on the material.
Learn in order: We recommend taking these courses — and the items in each lesson — in the order they appear, as new information and concepts build on previous ones. By following the order, you’ll be able to get comfortable with ideas, then practice and build on them.
Be curious: If you find an idea that gets you excited, please act on it! Ask questions, search for more details online, check out the links that interest you, and take notes on your discoveries.
The little things you do to support your learning along the way will take your knowledge even further; open more doors in this new, high-growth field; and help you qualify for all kinds of new jobs.
Take notes: Notes are useful when researching something you’re curious about. This is especially helpful when a task seems important and you think it might be useful later. Or, sometimes you might come across a subject that you want to explore in more detail. Keeping notes can help you keep track of what you learn.
Finally, taking notes is an effective way to help make connections between topics and gain a better understanding of them.
You can use your notes to build your very own data analytics journal — a place where you can capture ideas, information, and any questions you might have.
You’ll probably want to keep your notes together in one place, whether that’s a physical journal or a document on your computer.
This will make it easier to stay organized. Feel free to revisit your journal as you progress through the program, during your job hunt, and even as you settle into your new role as a data analyst.
Chat (responsibly) with other learners: If you have a question, chances are, you’re not alone. Feel free to reach out in the discussion forum to ask for help from other learners taking this program.
You can also visit Coursera’s Global Online Community. Other important things to know while you’re making friends can be found in the Coursera Honor Code and the Code of Conduct.
Data analysis process:
A. Ask,
B. prepare,
C. process,
D. analyze,
E. share,
F. and act.
Fill in the blank: Data is a collection of _____ that can be used to draw conclusions, make predictions, and assist in decision-making. Answer: facts.
Google Data Analytics Certificate roadmap
Use this guide to review the topics covered, tools used, and skills you will use in each course.
FAQ: What tools or platforms are included in the curriculum?
Spreadsheets, SQL, presentation tools, Tableau, RStudio, and Kaggle.
Will you be teaching R or Python? This program teaches the open-source programming language, R, which is great for foundational data analysis, and offers helpful packages for beginners to apply to their projects. We do not cover Python in the curriculum.
1. Foundations
What you will learn:
Real-life roles and responsibilities of a junior data analyst
How businesses transform data into actionable insights
Spreadsheet basics
Database and query basics
Data visualization basics
Skill sets you will build:
Using data in everyday life
Thinking analytically
Applying tools from the data analytics toolkit
Showing trends and patterns with data visualizations
Ensuring your data analysis is fair
2. Ask
What you will learn:
How data analysts solve problems with data
The use of analytics for making data-driven decisions
Spreadsheet formulas and functions
Dashboard basics, including an introduction to Tableau
Data reporting basics
Skill sets you will build:
Asking SMART and effective questions
Structuring how you think
Summarizing data
Putting things into context
Managing team and stakeholder expectations
Problem-solving and conflict-resolution
3. Prepare
What you will learn:
How data is generated
Features of different data types, fields, and values
Database structures
The function of metadata in data analytics
Structured Query Language (SQL) functions
Skill sets you will build:
Ensuring ethical data analysis practices
Addressing issues of bias and credibility
Accessing databases and importing data
Writing simple queries
Organizing and protecting data
4. Process
What you will learn:
Data integrity and the importance of clean data
The tools and processes used by data analysts to clean data
Data-cleaning verification and reports
Statistics, hypothesis testing, and margin of error
Resume building and interpretation of job postings (optional)
Skill sets you will build:
Connecting business objectives to data analysis
Identifying clean and dirty data
Cleaning small datasets using spreadsheet tools
Cleaning large datasets by writing SQL queries
Documenting data-cleaning processes
5. Analyze
What you will learn:
Steps data analysts take to organize data
How to combine data from multiple sources
Spreadsheet calculations and pivot tables
SQL calculations
Temporary tables
Data validation
Skill sets you will build:
Sorting data in spreadsheets and by writing SQL queries
Filtering data in spreadsheets and by writing SQL queries
Converting data
Formatting data
Substantiating data analysis processes
Seeking feedback and support from others during data analysis
6. Share
What you will learn:
Design thinking
How data analysts use visualizations to communicate about data
The benefits of Tableau for presenting data analysis findings
Data-driven storytelling
Dashboards and dashboard filters
Strategies for creating an effective data presentation
Skill sets you will build:
Creating visualizations and dashboards in Tableau
Addressing accessibility issues when communicating about data
Understanding the purpose of different business communication tools
Telling a data-driven story
Presenting to others about data
Answering questions about data
7. Act
What you will learn:
Programming languages and environments
R packages
R functions, variables, data types, pipes, and vectors
R data frames
Bias and credibility in R
R visualization tools
R Markdown for documentation, creating structure, and emphasis
Skill sets you will build:
Coding in R
Writing functions in R
Accessing data in R
Cleaning data in R
Generating data visualizations in R
Reporting on data analysis to stakeholders
8. Capstone
What you will learn:
How a data analytics portfolio distinguishes you from other candidates
Practical, real-world problem-solving
Strategies for extracting insights from data
Clear presentation of data findings
Motivation and ability to take initiative
Skill sets you will build:
Building a portfolio
Increasing your employability
Showcasing your data analytics knowledge, skill, and technical expertise
Sharing your work during an interview
Communicating your unique value proposition to a potential employer
Video 03 Data analytics in everyday life
Welcome back. At this point, you've been introduced to the world of data analytics and what data analysts do. You've also learned how this course will prepare you for a successful career as an analyst. Coming up, you'll learn all the ways data can be used, and you'll discover why data analysts are in such high demand.
I'm not exaggerating when I say every goal and success that my team and I have achieved couldn't have been done without data. Here at Google, all of our products are built on data and data-driven decision making.
From concept to development to launch, we're using data to figure out the best way forward. And we're not alone. Countless other organizations also see the incredible value in data and, of course, the data analysts who help them make use of it.
So we know data opens up a lot of opportunities. But to help you wrap your head around all the ways you can actually use data, let's go over a few examples from everyday life. You might not realize it, but people analyze data all the time.
For instance, I'm a morning person. A long time ago, I realized that I'm happier and more productive if I get to bed early and wake up early. I came to this conclusion after noticing a pattern in my day-to-day experiences. When I got seven hours of sleep and woke up at 6:30, I was the most successful.
So I thought about the relationship between this pattern and my daily life, and I predicted that early to bed early to rise would be the right choice for me. And I'm definitely my best self when I wake up bright and early. I bet you've identified patterns and relationships in your life, too.
Maybe about your own sleep cycle or how you feel after eating certain foods, or what time of day you like to work out. All of these are great examples of real-life patterns and relationships that you can use to make predictions about the right actions to take, and that is a huge part of data analysis right there.
Now, let's put this process into a business setting. You may remember from an earlier video that there's a ton of data out there. And every minute of every hour of every day, more data is being created.
Businesses need a way to control all that data so they can use it to improve processes, identify opportunities and trends, launch new products, serve customers, and make thoughtful decisions. For businesses to be on top of the competition, they need to be on top of their data.
That's why these companies hire data analysts to control the waves of data they collect every day, make sense of it, and then draw conclusions or make predictions. This is the process of turning data into insights, and it's how analysts help businesses put all their data to good use. This is actually a good way to think about analysis: turning data into insights.
As a reminder, the more detailed definition you learned earlier is that data analysis is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making. So after analysts have created insights from data, what happens? Well, a lot. Those insights are shared with others, decisions are made, and businesses take action.
And here's where it can get really exciting. Data analytics can help organizations completely rethink something they do or point them in a totally new direction. For example, maybe data leads them to a new product or unique service, or maybe it helps them find a new way to deliver an incredible customer experience. It's these kinds of aha moments that can help businesses reach another level, and that makes data analysts vital to any business.
Now that you know more of the amazing ways data is being used every day, you can see why data analysts are in such high demand. We'll continue exploring how analysts can transform data into insights that lead to action. And before you know it, you'll be ready to help any organization find new and exciting ways to transform their data.
Case Study: New data perspectives
As you’ve been learning, you can find data pretty much everywhere. Any time you observe and evaluate something in the world, you’re collecting and analyzing data.
Your analysis helps you find easier ways of doing things, identify patterns to save you time, and discover surprising new perspectives that can completely change the way you experience things.
Here’s a real-life example of how one group of data analysts used the six steps of the data analysis process to improve their workplace and its business processes. Their story involves something called people analytics — also known as human resources analytics or workforce analytics.
People analytics is the practice of collecting and analyzing data on the people who make up a company’s workforce in order to gain insights to improve how the company operates.
Being a people analyst involves using data analysis to gain insights about employees and how they experience their work lives. The insights are used to define and create a more productive and empowering workplace.
This can unlock employee potential, motivate people to perform at their best, and ensure a fair and inclusive company culture.
The six steps of the data analysis process that you have been learning in this program are: ask, prepare, process, analyze, share, and act.
These six steps apply to any data analysis. Continue reading to learn how a team of people analysts used these six steps to answer a business question.
An organization was experiencing a high turnover rate among new hires. Many employees left the company before the end of their first year on the job.
The analysts used the data analysis process to answer the following question: how can the organization improve the retention rate for new employees?
Let’s break down what this team did, step-by-step.
1. Ask
First up, the analysts in our example needed to define what the project would look like and what would qualify as a successful result. So, to determine these things, they asked effective questions and collaborated with leaders and managers who were interested in the outcome of their people analysis.
These were the kinds of questions they asked:
What do you think new employees need to learn to be successful in their first year on the job?
Have you gathered data from new employees before? If so, may we have access to the historical data?
Do you believe managers with higher retention rates offer new employees something extra or unique?
What do you suspect is a leading cause of dissatisfaction among new employees?
By what percentage would you like employee retention to increase in the next fiscal year?
2. Prepare
It all started with solid preparation. The group built a timeline of three months and decided how they wanted to relay their progress to interested parties.
Also during this step, the analysts identified what data they needed to achieve the successful result they identified in the previous step. In this case, the analysts chose to gather the data from an online survey of new employees. These were the things they did to prepare:
They developed specific questions to ask about employee satisfaction with different business processes, such as hiring and onboarding, and their overall compensation.
They established rules for who would have access to the data collected. In this case, anyone outside the group wouldn't have access to the raw data, but could view summarized or aggregated data.
For example, an individual's compensation wouldn't be available, but salary ranges for groups of individuals would be viewable.
They finalized what specific information would be gathered, and how best to present the data visually. The analysts brainstormed possible project- and data-related issues and how to avoid them.
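The access rule above (raw data stays inside the group, others see only aggregates) can be sketched in a few lines. This is a minimal illustration, not the team's actual method: the records, team names, salary figures, and the function `salary_ranges_by_team` are all invented for the example.

```python
from collections import defaultdict

# Hypothetical raw records; every name and figure here is invented.
raw_records = [
    {"employee": "A", "team": "Sales", "salary": 52000},
    {"employee": "B", "team": "Sales", "salary": 61000},
    {"employee": "C", "team": "Support", "salary": 48000},
    {"employee": "D", "team": "Support", "salary": 55000},
]

def salary_ranges_by_team(records):
    """Return a (lowest, highest) salary pair per team, with no individual rows."""
    by_team = defaultdict(list)
    for row in records:
        by_team[row["team"]].append(row["salary"])
    return {team: (min(values), max(values)) for team, values in by_team.items()}

# Only this summary would be shared outside the analyst group.
summary = salary_ranges_by_team(raw_records)
print(summary)
```

The summary shows a range for each group, so nobody outside the analyst group sees a row tied to one person.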
3. Process
The group sent the survey out. Great analysts know how to respect both their data and the people who provide it. Since employees provided the data, it was important to make sure all employees gave their consent to participate.
The data analysts also made sure employees understood how their data would be collected, stored, managed, and protected.
Collecting and using data ethically is one of the responsibilities of data analysts. In order to maintain confidentiality and protect and store the data effectively, these were the steps they took:
They restricted access to the data to a limited number of analysts.
They cleaned the data to make sure it was complete, correct, and relevant. Certain data was aggregated and summarized without revealing individual responses.
They uploaded raw data to an internal data warehouse for an additional layer of security.
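As a rough sketch of what "complete, correct, and relevant" could look like in practice, the example below filters a handful of invented survey responses. The field names, the 1 to 5 satisfaction scale, and the 12-month cutoff for a "new" employee are assumptions made for illustration; the case study does not give these details.

```python
# Hypothetical survey responses; satisfaction is assumed to be on a 1-5 scale.
responses = [
    {"id": 1, "hiring_satisfaction": 4, "tenure_months": 3},
    {"id": 2, "hiring_satisfaction": None, "tenure_months": 5},  # incomplete
    {"id": 3, "hiring_satisfaction": 9, "tenure_months": 2},     # out of range
    {"id": 4, "hiring_satisfaction": 2, "tenure_months": 30},    # not a new hire
]

def clean(rows, max_tenure_months=12):
    """Keep rows that are complete, within the valid range, and about new hires."""
    cleaned = []
    for row in rows:
        score = row["hiring_satisfaction"]
        if score is None:                      # complete: no missing answers
            continue
        if not 1 <= score <= 5:                # correct: score within the scale
            continue
        if row["tenure_months"] > max_tenure_months:  # relevant: new hires only
            continue
        cleaned.append(row)
    return cleaned

cleaned_responses = clean(responses)
print([row["id"] for row in cleaned_responses])
```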
4. Analyze
Then, the analysts did what they do best: analyze! From the completed surveys, the data analysts discovered that a new employee’s experience with the hiring process was a key indicator of overall job satisfaction.
The analysts found that employees who experienced an efficient and transparent hiring process were most likely to remain with the company. Employees who experienced a long and complicated hiring process were most likely to leave the company.
The group knew it was important to document exactly what they found in the analysis, no matter what the results. To do otherwise would decrease trust in the survey process and reduce their ability to collect truthful data from employees in the future.
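A comparison like the one described in this step can be expressed as a retention rate per group. The sketch below uses invented records and a hypothetical `hiring_experience` label; it only shows the shape of the calculation, not the team's real data or results.

```python
from collections import Counter

# Invented records: how each new hire rated the hiring process and
# whether they were still with the company after one year.
employees = [
    {"hiring_experience": "efficient", "stayed": True},
    {"hiring_experience": "efficient", "stayed": True},
    {"hiring_experience": "efficient", "stayed": True},
    {"hiring_experience": "efficient", "stayed": False},
    {"hiring_experience": "complicated", "stayed": True},
    {"hiring_experience": "complicated", "stayed": False},
    {"hiring_experience": "complicated", "stayed": False},
    {"hiring_experience": "complicated", "stayed": False},
]

def retention_rate_by_group(rows):
    """Return the share of employees who stayed, for each hiring experience."""
    total = Counter()
    stayed = Counter()
    for row in rows:
        total[row["hiring_experience"]] += 1
        if row["stayed"]:
            stayed[row["hiring_experience"]] += 1
    return {group: stayed[group] / total[group] for group in total}

rates = retention_rate_by_group(employees)
print(rates)
```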
5. Share
Just as they made sure the data was carefully protected, the analysts were also careful sharing the report. For example, in order for a manager to receive the survey report, a minimum number of their team members had to have participated in the survey.
The group presented the results to leaders first to make sure they had the full picture, then asked them to deliver the results to their teams.
This gave leaders an opportunity to communicate the results with the right context and have productive team conversations about next steps to improve employee engagement.
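The minimum-participation rule for sharing reports could be sketched as a simple filter. The threshold value, team names, and counts below are hypothetical; the case study does not say what the minimum was.

```python
# Hypothetical threshold: the source does not say what the minimum was.
MIN_RESPONDENTS = 5

# Invented counts of survey respondents on each manager's team.
respondents_per_team = {"Team A": 8, "Team B": 3, "Team C": 5}

def teams_eligible_for_report(counts, minimum=MIN_RESPONDENTS):
    """Return the teams with enough respondents for a manager report."""
    return sorted(team for team, n in counts.items() if n >= minimum)

eligible = teams_eligible_for_report(respondents_per_team)
print(eligible)
```

A threshold like this makes it harder to trace a summarized result back to one person on a small team.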
6. Act
The last stage of the process for the team of analysts was to work with leaders within their company and decide how best to implement changes and take actions based on the findings. The analysts recommended standardizing the hiring process for all new hires based on the most efficient and transparent hiring practices.
A year later, the same survey was distributed to employees. Analysts anticipated that a comparison between the two sets of results would indicate that the action plan worked. Turns out, the changes improved the retention rate for new employees and the actions taken by leaders were successful!
Is people analytics right for you?
One of the many things that makes data analytics so exciting is that the problems are always different, the solutions need creativity, and the impact on others can be great — even life-changing or life-saving.
As a data analyst, you can be part of these efforts. Maybe you’re even inspired to learn more about the field of people analytics. If so, consider learning more about this field and adding that research to your data analytics journal.
You never know: One day soon, you could be helping a company create an amazing work environment for you and your colleagues!
Additional Resource
To learn more about some recent applications of data analytics in the business world, check out the article “3 Examples of Business Analytics in Action” from Harvard Business School. The article reveals how corporations use data insights to optimize their decision-making process.
Please note that the first example in the article contains a minor error in the second paragraph, but the example is still a valid one.
Correction to article in bold below: Microsoft’s Workplace Analytics team hypothesized that moving the 1,200-person group from five buildings to four could improve collaboration by increasing the number of employees per building and by reducing the distance that staff needed to travel for meetings.
Learning Log: Consider how data analysts approach tasks
Overview
Earlier you learned about how data analysts at Google used data to improve employee retention. Now, you’ll complete an entry in your learning log to track your thinking and reflections about those data analysts' process and how they approached this problem.
By the time you complete this activity, you will have a stronger understanding of how the six phases of the data analysis process can be used to break down tasks and tackle big questions. This will help you apply these steps to future analysis tasks and start tackling big questions yourself.
Review the six phases of data analysis
Before you write your entry in your learning log, reflect on the case study from earlier. The data analysts at Google wanted to use data to improve employee retention. In order to do that, they had to break this larger project into manageable tasks.
The analysts organized those tasks and activities around the six phases of the data analysis process:
1. Ask
2. Prepare
3. Process
4. Analyze (Excel and SQL)
5. Share (Tableau)
6. Act (R programming)
The analysts asked questions to define both the issue to be solved and what would equal a successful result.
Next, they prepared by building a timeline and collecting data with employee surveys that were designed to be inclusive.
They processed the data by cleaning it to make sure it was complete, correct, relevant, and free of errors and outliers.
They analyzed the clean employee survey data. Then the analysts shared their findings and recommendations with team leaders. Afterward, leadership acted on the results and focused on improving key areas.
Reflection
In your learning log template, write 2-3 sentences (40-60 words) reflecting on what you’ve learned from the case study by answering each of the questions below:
Did the details of the case study help to change the way you think about data analysis? Why or why not?
Did you find anything surprising about the way the data analysts approached their task?
What else would you like to learn about data analysis?
When you’ve finished your entry in the learning log template, make sure to save the document so your response is somewhere accessible. This will help you continue applying data analysis to your everyday life. You will also be able to track your progress and growth as a data analyst.
Video 04 Cassie Dimensions of data analytics
Video 05 What is the data ecosystem?
What is the data ecosystem?
To put it simply, an ecosystem is a group of elements that interact with one another.
Data ecosystems are made up of various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data.
Data can also be found in something called the cloud. The cloud is a place to keep data online, rather than on a computer hard drive.
For example,
You could tap into your retail store's database, which is an ecosystem filled with customer names, addresses, previous purchases, and customer reviews. As a data analyst, you could use this information to predict what these customers will buy in the future, and make sure the store has the products and stock when they're needed.
Let’s think about a data ecosystem used by a human resources department. This ecosystem would include information like postings from job websites, stats on the current labor market, employment rates, and social media data on prospective employees.
A data analyst could use this information to help their team recruit new workers and improve employee engagement and retention rates.
Agricultural companies regularly use data ecosystems that include information such as geological patterns and weather movements. Data analysts can use this data to help farmers predict crop yields.
Some data analysts are even using data ecosystems to save real environmental ecosystems. At the Scripps Institution of Oceanography, coral reefs all over the world are monitored digitally, so they can see how organisms change over time, track their growth, and measure any increases or declines in individual colonies.
Q. In data analytics, what is the term for elements that interact with one another in order to produce, manage, store, organize, analyze, and share data? Answer: data ecosystem.
Data ecosystems
Elements that interact with one another in order to produce, manage, store, organize, analyze, and share data are data ecosystems.
These elements include hardware and software tools, as well as the people who use them.
Data can also be found in something called the cloud.
Cloud
A place to keep data online, rather than on a computer hard drive.
The cloud plays a big part in the data ecosystem, and as a data analyst, it's your job to harness the power of that data ecosystem, find the right information and provide the team with analysis that helps them make smart decisions.
Difference between data scientists and data analysts
It's easy to confuse the two, but what they do is actually very different.
Data science is defined as creating new ways of modeling and understanding the unknown by using raw data.
Data scientists create new questions using data, while analysts find answers to existing questions by creating insights from data sources.
There are also many words and phrases you'll hear throughout this course that are easy to get mixed up.
For example, data analysis and data analytics sound the same, but they're actually very different things.
Data analysis is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.
Data analytics in the simplest terms is the science of data.
When you think about data, data analysis and the data ecosystem, it's important to understand that all of these things fit under the data analytics umbrella.
Video 06 How data informs better decisions
One of the most powerful ways you can put data to work is with data-driven decision-making.
Data-driven decision-making is defined as using facts to guide business strategy.
Organizations in many different industries are empowered to make better, data-driven decisions by data analysts all the time.
The first step in data-driven decision-making is figuring out the business need.
Usually, this is a problem that needs to be solved.
For example:
A problem could be a new company needing to establish better brand recognition, so it can compete with bigger, more well-known competitors.
Maybe an organization wants to improve a product and needs to figure out how to source parts from a more sustainable or ethically responsible supplier.
Or it could be a business trying to solve the problem of unhappy employees and low levels of engagement, satisfaction, and retention.
Whatever the problem is, once it's defined, a data analyst finds data, analyzes it and uses it to uncover trends, patterns and relationships.
Sometimes the data-driven strategy will build on what's worked in the past. Other times, it can guide a business to branch out in a whole new direction.
Let's look at a real-world example. Think about a music or movie streaming service. How do these companies know what people want to watch or listen to, and how do they provide it?
Well, using data-driven decision-making, they gather information about what their customers are currently listening to, analyze it, then use the insights they've gained to make suggestions for things people will most likely enjoy in the future.
This keeps customers happy and coming back for more, which in turn means more revenue for the company.
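The gather, analyze, suggest loop described for streaming services can be illustrated with a deliberately simple sketch. Real services use far more sophisticated methods; the titles, genres, and the "most-played genre" rule here are assumptions chosen only to make the idea concrete.

```python
from collections import Counter

# Invented listening history and catalog, as (title, genre) pairs.
history = [("Song A", "jazz"), ("Song B", "jazz"), ("Song C", "rock")]
catalog = [
    ("Song A", "jazz"),
    ("Song D", "jazz"),
    ("Song E", "rock"),
    ("Song F", "pop"),
]

def recommend(listened, available):
    """Suggest unheard titles from the listener's most-played genre."""
    genre_counts = Counter(genre for _, genre in listened)   # analyze
    top_genre = genre_counts.most_common(1)[0][0]
    heard = {title for title, _ in listened}
    return [
        title
        for title, genre in available
        if genre == top_genre and title not in heard          # suggest
    ]

suggestions = recommend(history, catalog)
print(suggestions)
```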
Another example of data-driven decision-making can be seen in the rise of e-commerce. It wasn't long ago that most purchases were made in a physical store, but the data showed people's preferences were changing.
So a lot of companies created entirely new business models that remove the physical store, and let people shop right from their computers or mobile phones with products delivered right to their doorstep.
In fact, data-driven decision-making can be so powerful, it can make entire business methods obsolete.
It's important to note that no matter how valuable data-driven decision-making is, data alone will never be as powerful as data combined with human experience, observation, and sometimes even intuition.
To get the most out of data-driven decision-making, it's important to include insights from people who are familiar with the business problem.
These people are called subject matter experts, and they have the ability to look at the results of data analysis and identify any inconsistencies, make sense of gray areas, and eventually validate choices being made.
As a data analyst, you play a key role in empowering these organizations to make data-driven decisions, which is why it's so important for you to understand how data plays a part in the decision-making process.
Data and gut instinct
Detectives and data analysts have a lot in common. Both depend on facts and clues to make decisions. Both collect and look at the evidence.
Both talk to people who know part of the story. And both might even follow some footprints to see where they lead. Because whether you’re a detective or a data analyst, your job is all about following steps to collect and understand facts.
Analysts use data-driven decision-making and follow a step-by-step process. You have learned that there are six steps to this process:
1. Ask questions and define the problem.
2. Prepare data by collecting and storing the information.
3. Process data by cleaning and checking the information.
4. Analyze data to find patterns, relationships, and trends.
5. Share data with your audience.
6. Act on the data and use the analysis results.
Analyzing facts is a key part of data-driven decision making because facts lead to patterns that help guide the decisions we make — big and small. Data-driven decision-making is rooted in using facts to guide business strategy.
As an analyst, you will be tasked with creating a verified story about the data and sharing it with stakeholders. These stakeholders use your story to make choices based on facts, and make sure that the company is focused on the right goals.
Gut instinct can be a problem
There are other factors influencing the decision-making process, too, though. You may have read mysteries where the detective used their gut instinct, and followed a hunch that helped them solve the case. Gut instinct is an intuitive understanding of something with little or no explanation. This isn’t always something conscious; we often pick up on signals without even realizing. You just have a “feeling” it’s right.
Why gut instinct can be a problem
At the heart of data-driven decision making is data. Therefore, it's essential that data analysts focus on the data to ensure they make informed decisions.
If you ignore data by preferring to make decisions based on your own experience, your decisions may be biased. But even worse, decisions based on gut instinct without any data to back them up can cause mistakes.
Consider an example of a real estate developer bidding to redevelop a part of a city's central district.
They were well-known for preservation of historical buildings. Banking on their reputation, the agency's planners followed gut instinct and included the preservation of several buildings to gain support and win approval for the project.
However, private donations fell short and a partnership failed to materialize and save the day. The buildings eventually had to be torn down after much delay and an expensive dispute with the city.
The more you understand the data related to a project, the easier it will be to figure out what is required.
These efforts will also help you identify errors and gaps in your data so you can communicate your findings more effectively.
Sometimes past experience helps you make a connection that no one else would notice. For example, a detective might be able to crack open a case because they remember an old case just like the one they’re solving today. It's not just gut instinct.
But for data analysts, just trusting our gut instinct can be a problem. At the heart of data-driven decision making is data, so we always want to focus on the data to ensure that we’re making informed decisions. When we make decisions based on our gut instinct without any data to back it up, it can lead to mistakes.
Or worse, when we ignore the data based on our own personal experiences, we can create bias in our analysis. Businesses that rely on gut instinct to make decisions often make bad choices because they aren’t considering the story their data is actually telling.
Instead of relying on gut instinct, you can build your business knowledge and experience over time. The more you know about how a business works, the easier it will be to figure out what that business needs.
And that business knowledge and experience can also help you identify errors and gaps in your data and communicate your findings.
For example, a detective might be able to crack open a case because they remember an old case just like the one they’re solving today.
Their past experience could help them make a connection that no one else would notice. Maybe their unique background knowledge helps them discover someone is lying, or it could help them uncover new clues.
Your business knowledge and experience may help you understand problems intuitively. But, unlike gut instinct, it will give you more than just a feeling to go on.
Data + business knowledge = mystery solved
Blending facts and data with your business knowledge will be a common part of your process. The key is figuring out the exact mix of data and business knowledge for each particular project. A lot of times it will depend on the goals of your analysis. That is why analysts often ask, “How do I define success for this project?”
Successful analysis needs to be accurate, and fast enough to help decision-makers. So try asking yourself these questions about a project:
What kind of results are needed?
Who will be informed?
Am I answering the question being asked?
How quickly does a decision need to be made?
For example, if you are working on a rush project, you might need to rely on your own knowledge and experience more than usual.
There just isn’t enough time to thoroughly analyze all of the available data. But if you get a project that involves plenty of time and resources, then the best strategy would be to be more data-driven.
It’s up to you, the data analyst, to think about the situation and make the best possible choice.
You will probably blend facts and knowledge a million different ways over the course of your data analytics career. And the more you practice, the better you will get at finding that perfect blend.
Origins of the data analysis process
When you decided to join this program, you proved that you are a curious person. So let’s tap into your curiosity and talk about the origins of data analysis.
We don’t fully know when or why the first person decided to record data about people and things. But we do know it was useful because the idea is still around today!
We also know that data analysis is rooted in statistics, which has a pretty long history itself. Archaeologists mark the start of statistics in ancient Egypt with the building of the pyramids. The Ancient Egyptians were masters of organizing data.
They documented their calculations and theories on papyri (paper-like materials), which are now viewed as the earliest examples of spreadsheets and checklists.
Today’s data analysts owe a lot to those brilliant scribes, who helped create a more technical and efficient process.
It is time to enter the data analysis life cycle—the process of going from data to decision. Data goes through several phases as it gets created, consumed, tested, processed, and reused. With a life cycle model, all key team members can drive success by planning work both up front and at the end of the data analysis process.
While the data analysis life cycle is well known among experts, there isn't a single defined structure of those phases.
There might not be one single architecture that’s uniformly followed by every data analysis expert, but there are some shared fundamentals in every data analysis process.
This reading provides an overview of several, starting with the process that forms the foundation of the Google Data Analytics Certificate.
The process presented as part of the Google Data Analytics Certificate is one that will be valuable to you as you keep moving forward in your career:
A. Ask: Business Challenge/Objective/Question
B. Prepare: Data generation, collection, storage, and data management
C. Process: Data cleaning/data integrity
D. Analyze: Data exploration, visualization, and analysis
E. Share: Communicating and interpreting results
F. Act: Putting your insights to work to solve the problem
Understanding this process—and all of the iterations that helped make it popular—will be a big part of guiding your own analysis and your work in this program. Let’s go over a few other variations of the data analysis life cycle.
EMC's data analysis life cycle
EMC Corporation's data analytics life cycle is cyclical with six steps:
A. Discovery
B. Pre-processing data
C. Model planning
D. Model building
E. Communicate results
F. Operationalize
EMC Corporation is now Dell EMC. This model, created by David Dietrich, reflects the cyclical nature of real-world projects.
The phases aren’t static milestones; each step connects and leads to the next, and eventually repeats.
Key questions help analysts test whether they have accomplished enough to move forward and ensure that teams have spent enough time on each of the phases and don’t start modeling before the data is ready.
It is a little different from the data analysis life cycle this program is based on, but it has some core ideas in common: the first phase is interested in discovering and asking questions; data has to be prepared before it can be analyzed and used; and then findings should be shared and acted on.
For more information, refer to The Genesis of EMC's Data Analytics Lifecycle.
SAS' iterative life cycle
An iterative life cycle was created by a company called SAS, a leading data analytics solutions provider. It can be used to produce repeatable, reliable, and predictive results:
A. Ask
B. Prepare
C. Explore
D. Model
E. Implement
F. Act
G. Evaluate
SAS emphasizes the cyclical nature of its model by visualizing it as an infinity symbol. The life cycle has seven steps, many of which we have seen in the other models, like Ask, Prepare, Model, and Act. But this life cycle is also a little different; it includes a step after the Act phase designed to help analysts evaluate their solutions and potentially return to the Ask phase again.
For more information, refer to Managing the Analytics Life Cycle for Decisions at Scale.
Project-based data analytics life cycle
A project-based data analytics life cycle has five simple steps:
A. Identifying the problem
B. Designing data requirements
C. Pre-processing data
D. Data analysis
E. Data visualizing
This data analytics project life cycle was developed by Vignesh Prajapati. It doesn’t include the sixth phase, or what we have been referring to as the Act phase. However, it still covers a lot of the same steps as the life cycles we have already described. It begins with identifying the problem, continues with preparing and processing data before analysis, and ends with data visualization.
For more information, refer to Understanding the data analytics project life cycle.
Big data analytics life cycle
Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics life cycle in their book, Big Data Fundamentals: Concepts, Drivers & Techniques.
Their life cycle suggests phases divided into nine steps:
1. Business case evaluation
2. Data identification
3. Data acquisition and filtering
4. Data extraction
5. Data validation and cleaning
6. Data aggregation and representation
7. Data analysis
8. Data visualization
9. Utilization of analysis results
This life cycle appears to have three or four more steps than the previous life cycle models. But in reality, they have just broken down what we have been referring to as Prepare and Process into smaller steps.
It emphasizes the individual tasks required for gathering, preparing, and cleaning data before the analysis phase.
For more information, refer to Big Data Adoption and Planning Considerations.
Data life cycle based on research
One final data life cycle informed by Harvard University research has eight phases:
1. Generation
2. Collection
3. Processing
4. Storage
5. Management
6. Analysis
7. Visualization
8. Interpretation
This version includes storage, management, and interpretation phases, and excludes the Act phase that has appeared in other models.
For more information, refer to 8 Steps in the Data Life Cycle.
Key takeaway
From our journey to the pyramids and data in Ancient Egypt to now, the way we analyze data has evolved (and continues to do so).
The data analysis process is like real-life architecture: there are different ways to do things, but the same core ideas still appear in each model of the process.
Whether you use the structure of this Google Data Analytics Certificate or one of the many other iterations you have learned about, we are here to help guide you as you continue on your data journey.
Video 07 What to expect moving forward
We've covered a lot. I'm sure you have so much to think about already. That's a good thing. It means you've started collecting data and you're doing your own personal analysis. That's what it's all about. You've built a great base already. As this course continues, your knowledge and data analysis skills will continue to grow. Once you've established a solid foundation, you'll apply what you've learned to the rest of the program. The data analysis process will help provide a framework for everything you do. Soon, you'll take your first graded assessment. It's a great way to check your understanding of the concepts and build confidence in your knowledge. Everyone learns at different speeds. So take your time. Get familiar with the concepts. As soon as you feel ready, you can go ahead and get started. Keep in mind, if at any point, you're not sure about a question, you can always review the videos and readings to remind yourself of the answer. We're all about open-book tests here. Once you've passed, you'll be all set to move on. You've got this. Before you know it, you'll be done with all of the courses, and you'll be ready to create your own case study. Then, if it's what you want to do, you'll start your job search, equipped with the tools and skills that will wow any company you talk to. I can't wait to see where you go with data analytics. For now though, give yourself a pat on the back for a job well done. See you soon.
Week 2 Video 08 Discovering data skill sets
Welcome. Now that you have a solid foundation on the basics of data, it's time to focus on some particular skills and characteristics that will be key to your future career as a data analyst. We'll begin with five key skills, move on to the characteristics of analytical thinking and then learn how data analysts balance their roles and responsibilities. Along the way, you'll also discover how to tap into your own natural abilities for strategy, technical expertise, and data design. These are incredibly helpful skills to have and you'll learn how to make them even stronger. Finally, you'll be introduced to some fascinating real-world examples of how data is influencing the lives of people all around the world. All right. Let's get started.
Thinking analytically
Data analysts balance many different roles in their work. In this part of the course, you’ll learn about some of these roles and the key skills used by analysts. You’ll also explore analytical thinking and how it relates to data-driven decision-making.
Explain the concept of data-driven decision-making including specific examples
Describe the key characteristics of analytical thinking
Conduct an analytical thinking self-assessment, giving specific examples of the application of analytical thinking
Demonstrate an understanding of the five key analytical skills used by data analysts
Explain how analytical thinking enables decision-making
Begin asking more effective questions
Video 09 Key data analyst skills
Earlier, I told you that you already have analytical skills. You just might not know it yet. When learning new things, sometimes people overlook their own skills, but it's important you take the time to acknowledge them, especially since these skills are going to help you as a data analyst.
In fact, you're probably more prepared than you think. Don't believe me? Well, let me prove it. Let's start by defining what I'm talking about here.
Analytical skills are qualities and characteristics associated with solving problems using facts.
There are a lot of aspects to analytical skills, but we'll focus on five essential points. They are curiosity, understanding context, having a technical mindset, data design, and data strategy. Now, you may be thinking, "I don't have these kinds of skills," or "I only have a couple of them." But stay with me, and I bet you'll change your mind. Let's start with curiosity.
Curiosity is all about wanting to learn something. Curious people usually seek out new challenges and experiences. This leads to knowledge. The very fact that you're here with me right now demonstrates that you have curiosity. That was an easy one. Now think about understanding context.
Context is the condition in which something exists or happens.
This can be a structure or an environment. A simple way of understanding context is by counting to 5. One, two, three, four, five.
All of those numbers exist in the context of one through five. But what if a friend of yours said to you, one, two, four, five, three? Well, the three will be out of context. Simple, right? But it can be a little tricky. There's a good chance that you might not even notice the three being out of context if you aren't paying close attention. That's why listening and trying to understand the full picture is critical. In your own life, you put things into context all the time.
For example, let's think about your grocery list. If you group together items like flour, sugar, and yeast, that's you adding context to your groceries. This saves you time when you're at the baking aisle at the grocery store. Let's look at another example. Have you ever shuffled a deck of cards and noticed the joker? If you're playing a game that doesn't include jokers, identifying that card means you understand it's out of context. Remove it, and you're much more likely to play a successful game.
Now we know you have both curiosity and the ability to understand context. Let's move on to the third skill, a technical mindset. A technical mindset involves the ability to break things down into smaller steps or pieces and work with them in an orderly and logical way.
For instance, when paying your bills, you probably already break down the process into smaller steps. Maybe you start by sorting them by the date they're due. Next, you might add them up and compare that amount to the balance in your bank account.
This would help you see if you can pay your bills now, or if you should wait until the next paycheck. Finally, you'd pay them. When you take something that seems like a single task, like paying your bills, and break it into smaller steps with an orderly process, that's using a technical mindset. Now let's explore the fourth part of an analytical skill set, data design.
Data design is how you organize information. As a data analyst, design typically has to do with an actual database. But, again, the same skills can easily be applied to everyday life. For example, think about the way you organize the contacts in your phone. That's actually a type of data design. Maybe you list them by first name instead of last, or maybe you use email addresses instead of their names. What you're really doing is designing a clear, logical list that lets you call or text a contact in a quick and simple way.
Last, but definitely not least, the fifth and final element of analytical skills is data strategy.
Data strategy is the management of the people, processes, and tools used in data analysis. Let's break that down. You manage people by making sure they know how to use the right data to find solutions to the problem you're working on.
For processes, it's about making sure the path to that solution is clear and accessible. For tools, you make sure the right technology is being used for the job. Now, you may be doubting my ability to give you an example from real life that demonstrates data strategy. But check this out. Imagine mowing a lawn.
Step 1 would be reading the owner's manual for the mower.
That's making sure the people involved, or you, in this example, know how to use the data available. The manual would instruct you to put on protective eyewear and closed-toe shoes.
Then, it's on to step 2: making the process, the path, clear and accessible. This will involve you walking around the lawn, picking up large sticks or rocks that might get in your way.
Finally, for step 3, you check the lawn mower, your tool, to make sure it has enough gas and oil, and is in working condition, so the lawn can be mown safely. There you have it. Now you know the five essential skills of a data analyst. Curiosity, understanding context, having a technical mindset, data design, and data strategy. I told you that you are already an analytical thinker. Now, you can start actively practicing these skills as you move through the rest of this course. Curious about what's next?
Analytical skills
Analytical skills are qualities and characteristics associated with solving problems using facts.
There are a lot of aspects to analytical skills, but we'll focus on five essential points.
They are
1. curiosity,
2. understanding context,
3. technical mindset,
4. data design,
5. data strategy.
5 essential points
Curiosity: Curiosity is all about wanting to learn something. Curious people usually seek out new challenges and experiences. This leads to knowledge. The very fact that you're here with me right now demonstrates that you have curiosity.
Understanding context: Context is the condition in which something exists or happens. This can be a structure or an environment.
A simple way of understanding context is by counting to 5. One, two, three, four, five. All of those numbers exist in the context of one through five. But what if a friend of yours said to you, one, two, four, five, and three? Well, the three will be out of context. Simple, right? But it can be a little tricky. There's a good chance that you might not even notice the three being out of context if you aren't paying close attention.
That's why listening and trying to understand the full picture is critical. In your own life, you put things into context all the time. For example, let's think about your grocery list. If you group together items like flour, sugar, and yeast, that's you adding context to your groceries. This saves you time when you're at the baking aisle at the grocery store.
Let's look at another example. Have you ever shuffled a deck of cards and noticed the joker? If you're playing a game that doesn't include jokers, identifying that card means you understand it's out of context.
Having a technical mindset: A technical mindset involves the ability to break things down into smaller steps or pieces and work with them in an orderly and logical way.
For instance, when paying your bills, you probably already break down the process into smaller steps. Maybe you start by sorting them by the date they're due.
Next, you might add them up and compare that amount to the balance in your bank account. This would help you see if you can pay your bills now, or if you should wait until the next paycheck. Finally, you'd pay them.
When you take something that seems like a single task, like paying your bills, and break it into smaller steps with an orderly process, that's using a technical mindset.
Data design: Data design is how you organize information. As a data analyst, design typically has to do with an actual database.
Data strategy: Data strategy is the management of the people, processes, and tools used in data analysis. You manage people by making sure they know how to use the right data to find solutions to the problem you're working on.
For processes, it's about making sure the path to that solution is clear and accessible. For tools, you make sure the right technology is being used for the job.
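The bill-paying example under the technical mindset can be written out as a short Python sketch. The bill names, amounts, due dates, and bank balance below are made-up values for illustration only; the three steps (sort by due date, add up, compare to the balance) come from the example itself.

```python
from datetime import date

# Hypothetical bills as (name, amount, due date); all values are invented.
bills = [
    ("internet", 60.00, date(2024, 3, 20)),
    ("rent", 900.00, date(2024, 3, 1)),
    ("electricity", 75.50, date(2024, 3, 12)),
]
balance = 1200.00  # hypothetical bank balance

# Step 1: sort the bills by the date they are due.
bills_by_due_date = sorted(bills, key=lambda bill: bill[2])

# Step 2: add the amounts up.
total_due = sum(amount for _, amount, _ in bills)

# Step 3: compare the total to the balance to decide whether to pay now or wait.
can_pay_now = total_due <= balance

print([name for name, _, _ in bills_by_due_date])
print(total_due, can_pay_now)
```

Each step is small enough to check on its own, which is the point of breaking a single task into an orderly process.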
Learning Log: Explore data from your daily life
Overview
In a previous learning log, you reflected on how you use data analysis in your own life to make everyday decisions. Now, you’ll complete an entry in your learning log exploring data from an area of your life. By the time you complete this activity, you will have a stronger understanding of how you can apply your data analysis skills to more specific activities and situations in your life, starting with your own everyday decisions! Later, you are going to use the data you generate for this entry to practice organizing data to draw insights from it.
Create a list
Before you start, pick one area of your everyday life you would like to explore further. Think about how many times in the past few weeks you made decisions about anything related to this area. Then, create a list and include details, such as the date, time, cost, quantity, size, etc. Try to focus on things that can be represented by a number or category.
Here are a few thought-starters:
A. Number of cups of coffee you drink daily
B. Popular workout times at the gym
C. Nightly bedtime
For example, you could create a list exploring your daily coffee intake like this:
Daily coffee intake
A. Jan. 8th 8 am - bought coffee - one 10 oz. cup
B. Jan. 8th 10 am - made coffee at home - one 12 oz. cup
C. Jan. 9th 8 am - bought coffee - mug
D. Jan. 10th 11 am - bought large coffee - 20 oz.
E. Jan. 11th 8 am - made coffee at home - mug
This example includes a few different details like date and time, whether the coffee was purchased or homemade, and the quantity. You can choose to focus on any area of your life you want and track the details you are interested in exploring. Then, you will compile this list in a learning log template, linked below.
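As a rough sketch of what organizing this list might look like later on, the five coffee entries above can be stored as records and counted in Python. The field names (`date`, `time`, `source`, `ounces`) are assumptions chosen for this illustration, and entries that only say "mug" are stored with no size rather than a guessed one.

```python
from collections import Counter

# The five entries from the example list. "ounces" is None where the list only says "mug".
coffee_log = [
    {"date": "Jan 8", "time": "8 am", "source": "bought", "ounces": 10},
    {"date": "Jan 8", "time": "10 am", "source": "home", "ounces": 12},
    {"date": "Jan 9", "time": "8 am", "source": "bought", "ounces": None},
    {"date": "Jan 10", "time": "11 am", "source": "bought", "ounces": 20},
    {"date": "Jan 11", "time": "8 am", "source": "home", "ounces": None},
]

# Count how often coffee was bought versus made at home.
by_source = Counter(entry["source"] for entry in coffee_log)

# Count cups per day to spot days with more than one cup.
cups_per_day = Counter(entry["date"] for entry in coffee_log)

# Total ounces can only use the entries that have a recorded size.
known_sizes = [entry["ounces"] for entry in coffee_log if entry["ounces"] is not None]
total_known_ounces = sum(known_sizes)

print(by_source)
print(cups_per_day)
print(total_known_ounces, "oz across", len(known_sizes), "cups with a recorded size")
```

Keeping the missing sizes as `None` makes the gap in the data visible, which is itself something to note when reflecting on the list.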
Reflection
After you have finished creating your detailed list exploring data from your own life, take a moment to reflect on that data. In your learning log entry, write 2-3 sentences (40-60 words) in response to each question below:
Are there any trends you noticed in your behavior?
Are there factors that influence your decision-making?
Is there anything you identified that might influence your future behavior?
When you’ve finished your entry in the learning log template, make sure to save the document so your response is somewhere accessible. This will help you continue applying data analysis to your everyday life. You will also be able to track your progress and growth as a data analyst.
Skills for data analysts
Name that skill: identify the five key skills used by data analysts.
Q. Which skill matches this description?
"The qualities and characteristics associated with solving problems using facts"
A: Analytical skills
"The analytical skill that involves breaking processes down into smaller steps and working with them in an orderly, logical way"
A: A technical mindset
"Analytical skills that involve how you organize information"
A: Data design
"The analytical skill that has to do with how you group things into categories"
A: Understanding context
"The analytical skill that involves managing the processes and tools used in data analysis"
A: Data strategy
Video 10 All about thinking analytically
Now that you know the five essential skills of a data analyst, you're ready to learn more about what it means to think analytically. People don't often think about thinking.
Thinking is second nature to us. It just happens automatically, but there are actually many different ways to think. Some people think creatively, some think critically, and some people think in abstract ways. Let's talk about analytical thinking.
Analytical thinking involves identifying and defining a problem and then solving it by using data in an organized, step-by-step manner. As data analysts, how do we think analytically? Well, to answer that question, we will now talk about a second set of five.
The five key aspects to analytical thinking.
They are:
A. visualization,
B. strategy,
C. problem-orientation,
D. correlation, and
E. big-picture and detail-oriented thinking.
Let's start with visualization.
In data analytics, visualization is the graphical representation of information. Some examples include graphs, maps, or other design elements. Visualization is important because visuals can help data analysts understand and explain information more effectively.
Think about it like this. If you are trying to explain the Grand Canyon to someone, using words would be much more challenging than showing them a picture.
A visualization of the Grand Canyon would help you make your point much quicker.
Strategy
Now let's talk about the second part of analytical thinking, being strategic. With so much data available, having a strategic mindset is key to staying focused and on track.
Strategizing helps data analysts see what they want to achieve with the data and how they can get there. Strategy also helps improve the quality and usefulness of the data we collect. By strategizing, we know all our data is valuable and can help us accomplish our goals.
Next step on the analytical thinking checklist: being problem-oriented. Data analysts use a problem-oriented approach in order to identify, describe, and solve problems. It's all about keeping the problem top of mind throughout the entire project. For example, say a data analyst is told about the problem of a warehouse constantly running out of supplies.
They would move forward with different strategies and processes. But the number one goal would always be solving the problem of keeping inventory on the shelves.
Data analysts also ask a lot of questions. This helps improve communication and saves time while working on a solution. An example of that would be surveying customers about their experiences using a product and building insights from those questions to improve their product. This leads us to the fourth quality of analytical thinking: being able to identify a correlation between two or more pieces of data.
A correlation is like a relationship. You can find all kinds of correlations in data. Maybe it's the relationship between the length of your hair and the amount of shampoo you need. Or maybe you notice a correlation between a rainier season leading to a high number of umbrellas being sold. But as you start identifying correlations in data, there's one thing you always want to keep in mind: Correlation does not equal causation. In other words, just because two pieces of data are both trending in the same direction, that doesn't necessarily mean they are all related. We'll learn more about that later. Now the final piece of the analytical thinking puzzle: big-picture thinking. This means being able to see the big picture as well as the details.
A jigsaw puzzle is a great way to think about this. Big-picture thinking is like looking at a complete puzzle. You can enjoy the whole picture without getting stuck on every tiny piece that went into making it. If you only focus on individual pieces, you wouldn't be able to see past that, which is why big-picture thinking is so important. It helps you zoom out and see possibilities and opportunities. This leads to exciting new ideas or innovations. On the flip side, detail-oriented thinking is all about figuring out all of the aspects that will help you execute a plan. In other words, the pieces that make up your puzzle.
There are all kinds of problems in the business world that can benefit from employees who have both a big-picture and a detail-oriented way of thinking. Most of us are naturally better at one or the other. But you can always develop the skills to fit both pieces together. Now that you know the five aspects of analytical thinking, visualization, strategy, problem-orientation, correlation, and big-picture and detail-oriented thinking, you can put them to work for you when you're working with data. As you continue through this course, you'll learn how.
Analytical thinking involves identifying and defining a problem and then solving it by using data in an organized, step-by-step manner.
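To make the correlation idea from this video a little more concrete, here is a minimal Python sketch that measures how closely two lists of numbers move together using the Pearson correlation coefficient. The rainy-day and umbrella-sales figures are invented for illustration, and the `pearson` helper is written here for the example rather than taken from the course.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical monthly figures, invented for illustration.
rainy_days = [2, 5, 8, 12, 15]
umbrellas_sold = [10, 22, 35, 50, 64]

r = pearson(rainy_days, umbrellas_sold)
# A value close to 1 means the two lists rise together.
# It describes a relationship only; it does not show that one causes the other.
print(round(r, 3))
```

A coefficient near 1 or -1 signals a strong relationship and a value near 0 signals a weak one, but as noted above, correlation does not equal causation.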
Complete the table
Recall the analytical skills discussed in the last lesson:
Curiosity - a desire to know more about something, asking the right questions
Understanding context - understanding where information fits into the big picture
Having a technical mindset - breaking big things into smaller steps
Data design - thinking about how to organize data and information
Data strategy - thinking about the people, processes, and tools used in data analysis
In the Analytical Skills Table, each row contains one of the analytical skills above. Put an X in the column that you think best describes your current level with each area. The three ratings are:
Strength - an area you feel is one of your strengths
Developing - you have some experience with this area, but there’s still significant room for growth
Emerging - this is new to you, and you will be gaining experience in this area from this course
Update the Comments/Plans/Goals column with a quick note on why you chose the rating for each area.
Reflection
Consider the ratings you gave yourself in the Analytical Skills Table. How many times did you rate a skill as a strength? What about developing or emerging? Reflect on why you chose that rating for those categories.
Think about your past growth in each category and how you can use analytical thinking to foster growth in a weaker area. Write 5-7 sentences (100-150 words) reflecting on these questions and the ratings you gave yourself.
I think I'm good at the skill of curiosity. Many times I want to know and learn more about something, but at the same time, I think I need to develop the ability to ask the right questions. Understanding context and having a technical mindset are skills that I need to develop further.
I like to find and collect information, but that doesn't always mean that I'm good at fitting the information into the big picture or breaking a big task into smaller steps. Data design and data strategy are new to me. That's why I decided to take this course.
Video 11 Exploring core analytical skills
The more ways you can think, the easier it is to think outside the box and come up with fresh ideas. But why is it important to think in different ways? Well because in data analysis, solutions are almost never right in front of you.
You need to think critically to find out the right questions to ask. But you also need to think creatively to get new and unexpected answers.
What is the root cause of a problem? A root cause is the reason why a problem occurs. If we can identify and get rid of a root cause, we can prevent that problem from happening again. A simple way to wrap your head around root causes is with the process called the Five Whys. In the Five Whys you ask "why" five times to reveal the root cause. The fifth and final answer should give you some useful and sometimes surprising insights.
Here's an example of the Five Whys in action. Let's say you wanted to make a blueberry pie but couldn't find any blueberries. You'd start solving the problem by asking, why can't I make a blueberry pie? The answer would be, there are no blueberries at the store.
There's Why Number 1. You then ask, why were there no blueberries at the store? Then you discover that the blueberry bushes don't have enough fruit this season.
That's Why Number 2. Next, you'd ask, why was there not enough fruit? This would lead to the fact that birds were eating all the berries. Why Number 3, asked and answered.
Now we get to Why Number 4. Ask why a fourth time and the answer would be that, although the birds normally prefer mulberries and don't eat blueberries, the mulberry bushes didn't produce fruit this season, so the birds are eating blueberries instead.
Finally, we get to Why Number 5, which should reveal the root cause. A late frost damaged the mulberry bushes, so they didn't produce any fruit. You can't make a blueberry pie because of the late frost months ago. See how the Five Whys can reveal some very surprising root causes? This is a great trick to know, and it can be a very helpful process in data analysis.
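The blueberry pie example can also be sketched as a small Python loop that keeps asking "why?" until it has an answer for the fifth time. The dictionary and the `five_whys` function are illustrative names made up for this sketch; the answers themselves are paraphrased from the example above.

```python
# Each key is a problem statement; each value is the answer to "why?".
why_answers = {
    "I can't make a blueberry pie": "There are no blueberries at the store",
    "There are no blueberries at the store": "The blueberry bushes don't have enough fruit this season",
    "The blueberry bushes don't have enough fruit this season": "Birds were eating all the berries",
    "Birds were eating all the berries": "The mulberry bushes didn't produce fruit, so the birds ate blueberries instead",
    "The mulberry bushes didn't produce fruit, so the birds ate blueberries instead": "A late frost damaged the mulberry bushes",
}

def five_whys(problem, answers, max_whys=5):
    """Follow the chain of answers, asking "why?" up to max_whys times."""
    chain = []
    current = problem
    for _ in range(max_whys):
        if current not in answers:
            break  # no recorded answer, so stop early
        current = answers[current]
        chain.append(current)
    return chain

chain = five_whys("I can't make a blueberry pie", why_answers)
for number, answer in enumerate(chain, start=1):
    print(f"Why {number}: {answer}")

# The last answer in the chain is treated as the root cause.
root_cause = chain[-1]
```

In real analysis the answers come from investigation rather than a lookup table, so the sketch only shows the shape of the process: each answer becomes the next question.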
Another question commonly asked by data analysts is, where are the gaps in our process? For this, many people will use something called gap analysis.
Gap analysis lets you examine and evaluate how a process works currently in order to get where you want to be in the future. Businesses conduct gap analysis to do all kinds of things, such as improve a product or become more efficient.
The general approach to gap analysis is understanding where you are now compared to where you want to be. Then you can identify the gaps that exist between the current and future state and determine how to bridge them.
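As a minimal sketch of that approach, the Python below compares a current state with a desired future state and lists the gaps between them. The metric names and numbers are hypothetical, chosen only to show the comparison; a real gap analysis would use whatever measures matter to the business.

```python
# Hypothetical metrics; all names and numbers are invented for illustration.
current_state = {
    "orders_shipped_per_day": 120,
    "on_time_delivery_pct": 82,
    "customer_satisfaction_pct": 70,
}
future_state = {
    "orders_shipped_per_day": 150,
    "on_time_delivery_pct": 95,
    "customer_satisfaction_pct": 85,
}

# The gap is the difference between where you want to be and where you are now.
gaps = {metric: future_state[metric] - current_state[metric] for metric in future_state}

# Express each gap relative to its target so different metrics can be compared.
relative_gaps = {metric: gaps[metric] / future_state[metric] for metric in gaps}
largest_gap = max(relative_gaps, key=relative_gaps.get)

for metric, gap in gaps.items():
    print(f"{metric}: {current_state[metric]} -> {future_state[metric]} (gap {gap})")
print("Largest relative gap:", largest_gap)
```

Ranking by relative gap is one possible design choice; deciding how to bridge each gap still depends on context and business knowledge.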
A third question that data analysts ask a lot is, what did we not consider before? This is a great way to think about what information or procedure might be missing from a process, so you can identify ways to make better decisions and strategies moving forward.
The way data analysts think and ask questions plays a big part in how businesses make decisions. That's why analytical thinking and understanding how to ask the right questions can have such a huge impact on the overall success of a business.
Learning Log: Reflect on your skills and expectations
Overview
You have already learned about the five essential aspects of analytical skills: curiosity, understanding context, having a technical mindset, data design, and data strategy. You have also discovered that you’re already practicing these skills. Now, you’ll complete an entry in your learning log exploring your own analytical strengths and weaknesses and your goals for the future.
By the time you complete this activity, you will have a stronger understanding of your analytical skill set and how you can practice and improve them. These analytical skills are key to helping you solve problems and create insights using data analysis. Thinking about them now will help you grow as a data analyst!
The analytical skills table
First, you’ll fill out an Analytical Skills Table in your learning log entry. The table will appear like this in the template:
The table has a row for each essential aspect of analytical skills:
Curiosity: a desire to know more about something, asking the right questions
Understanding context: understanding where information fits into the “big picture”
Having a technical mindset: breaking big things into smaller steps
Data design: thinking about how to organize data and information
Data strategy: thinking about the people, processes, and tools used in data analysis
You will put an X in the column that you think best describes your current level with each aspect. The three ratings are:
Strength: This is an area you feel is one of your strengths
Developing: You have some experience with this area, but there’s still significant room for growth
Emerging: This is new to you, and you will gain experience in this area from this course
Then update the Comments/Plans/Goals column with a quick note to yourself about why you chose those ratings.
Access your learning log
To use the template for this course item, click the link below and select “Use Template.”
Link to learning log template: Reflect on your skills and expectations
Reflection
After you have completed the Analytical Skills Table, take a moment to reflect on your evaluations. In your learning log entry, write 2-3 sentences (40-60 words) in response to each question below:
What do you notice about the ratings you gave yourself in each area? How did you rate yourself in the areas that appeal to you most?
If you are asked to rate your experience level in these areas again in a week, what do you think the ratings will be, and why do you think that?
How do you plan on developing these skills from now on?
When you’ve finished your entry in the learning log template, make sure to save the document so your response is somewhere accessible. This will help you continue applying data analysis to your everyday life. You will also be able to track your progress and growth as a data analyst.
Video 12 Using data to drive successful outcomes
As a reminder, the five analytical skills are curiosity, understanding context, having a technical mindset, data design, and data strategy.
Data-driven decision-making involves using facts to guide business strategy.
It gives you greater confidence about your choice and your abilities to address business challenges. It helps you become more proactive when an opportunity presents itself, and it saves you time and effort when working towards a goal.
Curiosity and Context - The more you learn about the power of data, the more curious you're likely to become. You'll start to see patterns and relationships in everyday life, whether you're reading the news, watching a movie, or going to an appointment across town.
The analysts take their thinking a step further by using context to make predictions, research answers, and eventually draw conclusions about what they've discovered. This natural process is a great first step in becoming more data-driven.
Having a technical mindset - Data analysts have gut feelings too. But they've trained themselves to build on those feelings and use a more technical approach to explore them.
They do this by always seeking out the facts, putting them to work through analysis, and using the insights they gain to make informed decisions.
Data design - which has a strong connection to data-driven decision-making. To put it simply, designing your data so that it is organized in a logical way makes it easy for data analysts to access, understand, and make the most of available information.
And it's important to keep in mind that data design doesn't just apply to databases. If you make decisions that are informed by data, you are more likely to make more informed and effective decisions.
Data strategy - which incorporates the people, processes, and tools used to solve a problem. This is a big one to remember because data strategy gives you a high-level view of the path you need to take to achieve your goals.
Also, data-driven decision-making isn't a one-person job. It's much more likely to be successful if everyone is on board and on the same page, so it's important to make sure specific procedures are in place and that the technology being used is aligned with your data-driven strategy.
Video 13 Real-world data magic
Here at Google, our mission is to organize the world's information and make it universally accessible and useful. All of our products, from idea to development to launch, are built on data and data-driven decision-making.
The HR department wanted to know if there was value in having managers. Were their contributions worthwhile? Or should everyone just be an individual contributor? To answer that question, Google's people analytics team looked at past performance reviews and employee surveys.
The data they found was plotted on a graph because as you've learned, visuals are extremely helpful when trying to understand a problem or concept. The graph revealed that Googlers had positive feelings about their managers, but the data was pretty general and the team wanted to learn more.
So they dug deeper and split the data into quartiles. A quartile divides data points into four equal parts or quarters. Here's where the really cool stuff started happening.
The data analysts discovered that there was a big difference between the very top and the very bottom quartiles.
As it turned out, the teams with the best managers were significantly happier, more productive, and more likely to want to keep working at Google. This confirmed that managers were valued and made a big difference.
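To make the quartile idea concrete, here is a small sketch that splits a list of scores into quarters and picks out the bottom and top groups. The scores are hypothetical (they are not Google's data), and using Python's statistics.quantiles is just one convenient way to find the cut points.

```python
import statistics

# Hypothetical manager ratings, one score per manager (invented for illustration).
scores = [62, 68, 71, 74, 77, 80, 83, 85, 88, 91, 94, 97]

# With n=4, statistics.quantiles returns the three cut points between the quarters.
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Managers at or below the first cut point form the bottom quartile;
# managers above the third cut point form the top quartile.
bottom_quartile = [s for s in scores if s <= q1]
top_quartile = [s for s in scores if s > q3]

print("Cut points:", q1, q2, q3)
print("Bottom quartile:", bottom_quartile)
print("Top quartile:", top_quartile)
```

Comparing the two extreme groups, as the analysts did, tends to show differences that an overall average hides.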
Therefore, the idea of having only individual contributors was not implemented. But there was still more work to do. Just knowing that great managers create great results doesn't lead to actionable insights.
You have to identify what exactly makes a great manager, so the team took two additional steps to collect more data.
First, they launched an awards program where employees could nominate their favorite managers. For every submission, the nominator had to provide examples or data about what makes that manager great.
The second step involved interviewing managers who fell in the top and bottom quartiles.
This helped the analytics team see the differences between successful and less successful management behaviors.
The best behaviors were identified as were the most common reasons for a manager needing improvement. The final step was sharing these insights and putting a procedure in place for evaluating managers with these qualities in mind.
This data-driven decision continues to create an exceptional company culture for my colleagues and me. Thanks, data.
Another interesting example comes from the nonprofit sector. Nonprofits are organizations dedicated to advancing a social cause or advocating for a particular effort, such as food security, education or the arts.
In this case, data analysts researched how journalists can make a more meaningful impact for the nonprofits they would write about. Because journalists write for newspapers, magazines, and other news outlets, they can help nonprofits reach readers like you and me, who then take action to help nonprofits reach their goals.
The data analysts used a tracker to monitor story topics, clicks, web traffic, comments, shares and more.
Then they evaluated the information to make recommendations for how the journalists could do their jobs even better. In the end, they came up with some great ideas for how nonprofits and journalists can motivate people everywhere to work together and make the world a better place.
Exploring the wonderful world of data
Data has its own life cycle, and the work of data analysts often intersects with that cycle. In this part of the course, you’ll learn how the data life cycle and data analysts' work both relate to your progress through this program. You’ll also be introduced to applications used in the data analysis process.
Identify key software applications critical to the work of a data analyst including spreadsheets, databases, query languages, and visualization tools
Identify relationships between the data analysis process and the courses in the Google Data Analytics Certificate
Explain the data analysis process, making specific reference to the ask, prepare, process, analyze, share, and act phases
Discuss the use of data in everyday life decisions
Discuss the role of spreadsheets, query languages, and data visualization tools in data analytics
Discuss the phases of the data life cycle
Video 14 Learning about data phases and tools
It's great to have you back. We've talked a little bit about the data analysis process. As a quick refresher, the data analysis process phases are:
A. Ask
B. Prepare
C. Process
D. Analyze (Excel and SQL)
E. Share (Tableau)
F. Act (R programming)
You might remember me saying earlier that this entire program is modeled after these steps.
Now, we're going to really dig in and explore how each of these phases work together.
But I'm getting a little ahead of myself. First, let's spend a little time understanding the data life cycle.
No, data isn't actually alive, but it does have a life cycle.
How do data analysts bring data to life? Well, it starts with the right data analysis tool.
These include:
A. Spreadsheets
B. Databases
C. Query languages
D. Visualization software
Don't worry if you don't know how these work, or even what they are. At one point, every data analyst has been right where you are right now, and they probably had a lot of the same questions.
After a few weeks, I noticed that even the people who were further in their careers were not as technically minded as I was. That became a great opportunity for me to add value.
My aha spreadsheet moment came when I started researching shortcuts that I could use to work with the spreadsheets more efficiently. This would really streamline the process of getting those reports moved over to the new system.
Once everything started flowing, I remember getting emails from other finance analysts at the company. They were so grateful that someone had come in and fixed a problem that no one else could. That inspired me to go even further and learn how to use spreadsheets in all sorts of incredible ways. As you continue through this course, I bet you'll be just as impressed as I was. And before you know it, you'll bring data to life too. Let's get started.
Video 15 Stages of the data life cycle
Following the data life cycle
Organize the six phases of data analysis
In this categorization exercise, you’ll confirm the correct order of the six phases of data analysis.
1. Ask: define the problem and confirm stakeholder expectations
2. Prepare: collect and store data for analysis
3. Process: clean and transform data to ensure integrity
4. Analyze: use data analysis tools to draw conclusions
5. Share: interpret and communicate results to others to make data-driven decisions
6. Act: put your insights to work in order to solve the original problem
Phases of the data life cycle
The stages of the data life cycle are:
1. Plan
2. Capture
3. Manage
4. Analyze
5. Archive
6. Destroy
Plan
This actually happens well before starting an analysis project.
During planning:
A. a business decides what kind of data it needs,
B. how it will be managed throughout its life cycle,
C. who will be responsible for it,
D. and the optimal outcomes.
For example, let's say an electricity provider wanted to gain insights into how to save people energy.
In the planning phase, the company might decide to capture information on how much electricity its customers use each year, what types of buildings are being powered, and what types of devices are being powered inside of them.
The electricity company would also decide which team members will be responsible for collecting, storing, and sharing that data. All of this happens during planning, and it helps set up the rest of the project.
Capture
This is where data is collected from a variety of different sources and brought into the organization. With so much data being created every day, the ways to collect it are truly endless. One common method is getting data from outside resources.
For example, if you were doing data analysis on weather patterns, you'd probably get data from a publicly available dataset like the National Climatic Data Center.
Another way to get data is from a company's own documents and files, which are usually stored inside a database. While we've mentioned databases before, we haven't gone into too much detail about what they are.
A database is a collection of data stored in a computer system. In the case of our electricity provider, the business would probably measure data usage among its customers within a database that it owns.
As a quick note, when you maintain a database of customer information, data integrity, credibility, and privacy are all important concerns.
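To give a feel for what "a collection of data stored in a computer system" can look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and usage figures are all hypothetical and chosen only to echo the electricity provider example. A real company database would be much larger and would need proper security controls.

```python
import sqlite3

# In-memory database so nothing is written to disk.
# The table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE electricity_usage ("
    "customer_id INTEGER, building_type TEXT, kwh_per_year REAL)"
)

# Invented sample rows: (customer_id, building_type, kwh_per_year).
rows = [
    (1, "house", 9800.0),
    (2, "apartment", 4200.0),
    (3, "house", 11200.0),
    (4, "office", 25600.0),
]
conn.executemany("INSERT INTO electricity_usage VALUES (?, ?, ?)", rows)

# Ask the database for the average yearly usage per building type.
query = (
    "SELECT building_type, AVG(kwh_per_year) "
    "FROM electricity_usage GROUP BY building_type ORDER BY building_type"
)
averages = conn.execute(query).fetchall()
print(averages)
conn.close()
```

The query in the middle is written in SQL, one of the query languages mentioned earlier in these notes.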
Manage
Here we're talking about how we care for our data, how and where it's stored, the tools used to keep it safe and secure, and the actions taken to make sure that it's maintained properly. This phase is very important to data cleansing, which we'll cover later on.
Analyze
This is where data analysts really shine. In this phase, the data is used to solve problems, make great decisions, and support business goals. For example, one of our electricity company's goals might be to find ways to help customers save energy.
Archive
Archiving means storing data in a place where it's still available, but may not be used again. During analysis, analysts handle huge amounts of data. Can you imagine if we had to sort through all of the available data that's out there, even if it was no longer useful and relevant to our work?
It makes way more sense to archive it than to keep it around.
Destroy
Let's get back to our electricity provider example. They would have data stored on multiple hard drives. To destroy it, the company would use secure data erasure software. If there were any paper files, they would be shredded too. This is important for protecting a company's private information, as well as private data about its customers.
Q. In the data life cycle, which phase involves using data to solve problems, make good decisions, and support business goals?
A. Analyze
Variations of the data life cycle
You learned that there are six stages to the data life cycle. Here is a recap:
1. Plan: Decide what kind of data is needed, how it will be managed, and who will be responsible for it.
2. Capture: Collect or bring in data from a variety of different sources.
3. Manage: Care for and maintain the data. This includes determining how and where it is stored and the tools used to do so.
4. Analyze: Use the data to solve problems, make decisions, and support business goals.
5. Archive: Keep relevant data stored for long-term and future reference.
6. Destroy: Remove data from storage and delete any shared copies of the data.
Warning: Be careful not to mix up or confuse the six stages of the data life cycle
(Plan, Capture, Manage, Analyze, Archive, and Destroy)
with the six phases of the data analysis life cycle
(Ask, Prepare, Process, Analyze, Share, and Act).
They shouldn't be used or referred to interchangeably.
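One quick way to see why the two lists are easy to confuse is to write them side by side and check which names they share. The sketch below simply stores the two lists from the warning above; the variable names are made up for this example.

```python
# The six stages of the data life cycle, in order.
data_life_cycle = ["Plan", "Capture", "Manage", "Analyze", "Archive", "Destroy"]

# The six phases of the data analysis process, in order.
data_analysis_process = ["Ask", "Prepare", "Process", "Analyze", "Share", "Act"]

# Names that appear in both lists.
shared = sorted(set(data_life_cycle) & set(data_analysis_process))
print(shared)
```

Only "Analyze" appears in both, which is a good reminder that the rest of the steps are different and serve different purposes.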
The data life cycle provides a generic or common framework for how data is managed. You may recall that variations of the data analysis life cycle were described in Origins of the data analysis process. The same can be done for the data life cycle. The rest of this reading provides a glimpse of how government, finance, and education institutions can view data life cycles a little differently.
U.S. Fish and Wildlife Service
The U.S. Fish and Wildlife Service uses the following data life cycle:
1. Plan
2. Acquire
3. Maintain
4. Access
5. Evaluate
6. Archive
For more information, refer to U.S. Fish and Wildlife's Data Management Life Cycle page.
The U.S. Geological Survey (USGS)
The USGS uses the data life cycle below:
1. Plan
2. Acquire
3. Process
4. Analyze
5. Preserve
6. Publish/Share
Several cross-cutting or overarching activities are also performed during each stage of their life cycle:
Describe (metadata and documentation)
Manage Quality
Backup and Secure
For more information, refer to the USGS Data Lifecycle page.
Financial institutions
Financial institutions may take a slightly different approach to the data life cycle as described in The Data Life Cycle, an article in Strategic Finance magazine:
1. Capture
2. Qualify
3. Transform
4. Utilize
5. Report
6. Archive
7. Purge
Harvard Business School (HBS)
One final data life cycle informed by Harvard University research has eight stages:
1. Generation
2. Collection
3. Processing
4. Storage
5. Management
6. Analysis
7. Visualization
8. Interpretation
For more information, refer to 8 Steps in the Data Life Cycle.
Key takeaway
Understanding the importance of the data life cycle will set you up for success as a data analyst. Individual stages in the data life cycle will vary from company to company or by industry or sector.
Historical data is important to both the U.S. Fish and Wildlife Service and the USGS, so their data life cycles focus on archiving and backing up data.
Harvard's interests are in research and teaching, so its data life cycle includes visualization and interpretation even though these are more often associated with a data analysis life cycle.
The HBS data life cycle also doesn't call out a stage for purging or destroying data. In contrast, the data life cycle for finance clearly identifies archive and purge stages.
To sum it up, although data life cycles vary, one data management principle is universal. Govern how data is handled so that it is accurate, secure, and available to meet your organization's needs.
Self-Reflection: Collecting data
Overview
Now that you are familiar with the phases of the data life cycle, you can take a moment to think about your learning. In this self-reflection, you will consider your thoughts about collecting data and how data collection fits into the data life cycle.
To start, you will consider a simple scenario: discussing the data life cycle in a mock interview for a data analyst role. Then, you will respond to three brief questions. You’ve done the hard work to learn the basics of the data life cycle, so get the most out of it: This reflection will help your knowledge stick!
The scenario: interview for a data analyst position
Imagine that you interview for a data analyst role at a local ice cream company. The hiring manager explains that the company needs a data analyst because they want to learn more about their customers. First, they want to understand their customers’ ice cream flavor preferences. Then, they will use this customer data to help make important decisions.
The hiring manager explains that they do not collect any customer data, and they don’t know where to begin. The hiring manager asks you: Can you please explain how you would approach this task?
Before responding to the question, you consider each step of the data life cycle.
Recap: The data life cycle
The steps of the data life cycle are:
Plan: What plans and decisions do you need to make? What data do you need to answer your question?
Capture: Where does your data come from? How will you get it?
Manage: How will you store your data? What should it be used for? How do you keep this data secure and protected?
Analyze: How will the company analyze the data? What tools should they use?
Archive: What should they do with their data when it gets old? How do they know when it's time?
Destroy: Should they ever dispose of any data? If so, when and how?
In this activity, you will use what you learned about the data life cycle in a mock interview for a data analyst role at an ice cream company.
Instructions
Read the scenario below and then share your response in the reflection section.
You are interviewing for a data analyst role at a local ice cream company.
They are interested in using data on customer ice cream flavor preferences to help drive important decisions.
The hiring manager asks you:
“We want to better understand our customers’ ice cream flavor preferences, but honestly we don’t even know where to start. How would you approach this if you were part of our team?”
Reflection
Before responding to the hiring manager’s question, consider each step of the data life cycle:
Plan - What plans and decisions do you need to make? What data do you need to answer your question?
Capture - Where does your data come from? How will you get it?
Manage - How will you store your data? What should it be used for, and how do you keep this data secure and protected?
Analyze - How will the company analyze the data? What tools should they use?
Archive - What should they do with their data when it gets old? How do they know when it's time?
Destroy - Should they ever dispose of any data? If so, when and how?
Write 5-10 sentences explaining your recommendations for collecting customer flavor preference data.
Here are some topics to consider for your response:
What kind of data should they be gathering?
How should they gather this data?
Where will the data live? How will they store the data?
Once they have the data, how will they use it?
How do they keep their data secure and protected?
What should they do with old data? What are their options?
First, we need to know which flavors of ice cream are the best sellers. This data can be obtained from the company's sales records. We can also conduct a survey of customers' preferences.
Combining the survey results with the sales records will help us rank taste preferences and give us a more direct answer about which flavors customers prefer.
It makes sense to keep this data for about a year and then repeat the analysis, because customers' tastes can change more easily than we think.
Correct
Thank you for the submission! Understanding the life cycle of the data you’re working with is crucial to any project.
Your response should focus on helping the ice cream company come up with specific answers to the questions associated with each step of the data life cycle.
To use their data successfully:
The ice cream company must first figure out what data they need and where they can get it.
Once they have the data, they have to be sure of what they will (and won’t) use it for.
The ice cream company also has to be mindful of how they will keep the data secure, and how to deal with old data that has outlived its usefulness.
As a data analyst, these are the types of questions you should always be seeking to answer about your data.
My submission
To use their data successfully:
The ice cream company must first figure out what data they need and where they can get it.
Once they have the data, they have to be sure of what they will (and won’t) use it for.
The ice cream company also has to be mindful of how they will keep the data secure, and how to deal with old data that has outlived its usefulness.
Here are some popular ways to protect your data and practical steps you can take today to tighten up your data security:
Back up your data on a different hard drive and also in the cloud.
Use strong passwords and make sure two-factor authentication is enabled.
Take care when working remotely.
Be aware of suspicious emails in your inbox.
Install antivirus and malware protection on your system.
Don't leave paperwork or laptops unattended.
Make sure your Wi-Fi is secure.
If your data is old, here are some helpful points for dealing with it:
If your company has sensitive documents on paper, shred them. If your data is on a hard drive, make sure you use Secure Empty Trash (for Apple OS) or a third-party overwriting program (for Windows OS) to remove data on electronic devices.
External hard drives are also best erased using secure deletion software. For large-scale deletion of data, IT personnel need to step in.
As a data analyst, these are the types of questions you should always be seeking to answer about your data.
Outlining the data analysis process
Video 16 six phases of data analysis
Data analysis isn't a life cycle. It's the process of analyzing data.
Ask, Prepare, Process, Analyze, Share, Act
ASK
In this phase, we do two things. We define the problem to be solved and we make sure that we fully understand stakeholder expectations. Stakeholders hold a stake in the project. They are people who have invested time and resources into a project and are interested in the outcome.
Let's break that down. First, defining a problem means you look at the current state and identify how it's different from the ideal state. Usually there's an obstacle we need to get rid of or something wrong that needs to be fixed. For instance, a sports arena might want to reduce the time fans spend waiting in the ticket line.
The obstacle is figuring out how to get the customers to their seats more quickly. Another important part of the ask phase is understanding stakeholder expectations.
The first step here is to determine who the stakeholders are. That may include your manager, an executive sponsor, or your sales partners. There can be lots of stakeholders. But what they all have in common is that they help make decisions, influence actions and strategies, and have specific goals they want to meet.
They also care about the project and that's why it's so important to understand their expectations. For instance, if your manager assigns you a data analysis project related to business risk, it would be smart to confirm whether they want to include all types of risks that could affect the company, or just risks related to weather such as hurricanes and tornadoes.
Communicating with your stakeholders is key in making sure you stay engaged and on track throughout the project. So as a data analyst, developing strong communication strategies is very important. This part of the ask phase helps you keep focused on the problem itself, not just its symptoms. As you learned earlier, the five whys are extremely helpful here.
PREPARE
This is where data analysts collect and store data they'll use for the upcoming analysis process. You'll learn more about the different types of data and how to identify which kinds of data are most useful for solving a particular problem. You'll also discover why it's so important that your data and results are objective and unbiased. In other words, any decisions made from your analysis should always be based on facts and be fair and impartial.
PROCESS
Here, data analysts find and eliminate any errors and inaccuracies that can get in the way of results. This usually means cleaning data, transforming it into a more useful format, combining two or more datasets to make information more complete and removing outliers, which are any data points that could skew the information.
After that, you'll learn how to check the data you prepare to make sure it's complete and correct.
This phase is all about getting the details right. So you'll also fix typos, inconsistencies, or missing and inaccurate data. To top it off, you'll gain strategies for verifying and sharing your data cleansing with stakeholders.
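The cleaning steps described in this phase can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the records are hypothetical (loosely based on the ticket-line example from the Ask phase), and the 120-minute cutoff for treating a value as an outlier is an arbitrary choice for this sketch, not a general rule.

```python
# Hypothetical raw records; names and values are invented for illustration.
raw_records = [
    {"city": " Austin", "wait_minutes": "12"},
    {"city": "austin", "wait_minutes": "15"},
    {"city": "Dallas ", "wait_minutes": ""},     # missing value
    {"city": "DALLAS", "wait_minutes": "14"},
    {"city": "Austin", "wait_minutes": "480"},   # likely a data-entry error
]

MAX_PLAUSIBLE_WAIT = 120  # assumed cutoff for this sketch only


def clean(records):
    """Fix inconsistent text, drop missing values, and remove outliers."""
    cleaned = []
    for record in records:
        # Fix inconsistencies: trim stray spaces and standardize capitalization.
        city = record["city"].strip().title()
        raw_wait = record["wait_minutes"].strip()
        # Skip rows where the measurement is missing.
        if not raw_wait:
            continue
        # Transform the text into a number so it can be analyzed.
        wait = int(raw_wait)
        # Remove outliers that would skew the results.
        if wait > MAX_PLAUSIBLE_WAIT:
            continue
        cleaned.append({"city": city, "wait_minutes": wait})
    return cleaned


clean_records = clean(raw_records)
for row in clean_records:
    print(row)
```

In practice, an analyst would also record which rows were dropped and why, so the cleaning can be verified and shared with stakeholders.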
ANALYZE
Analyzing the data you've collected involves using tools to transform and organize that information so that you can draw useful conclusions, make predictions, and drive informed decision-making. There are lots of powerful tools data analysts use in their work and in this course you'll learn about two of them, spreadsheets and structured query language, or SQL, which is often pronounced "sequel."
SHARE
Here you'll learn how data analysts interpret results and share them with others to help stakeholders make effective data-driven decisions.
In the share phase, visualization is a data analyst's best friend. So this course will highlight why visualization is essential to getting others to understand what your data is telling you. With the right visuals, facts and figures become so much easier to see and complex concepts become easier to understand.
We'll explore different kinds of visuals and some great data visualization tools. You'll also practice your own presentation skills by creating compelling slideshows and learning how to be fully prepared to answer questions.
ACT
This is the exciting moment when the business takes all of the insights you, the data analyst, have provided and puts them to work in order to solve the original business problem. You will also be acting on what you've learned throughout this program. This is when you prepare for your job search and have the chance to complete a case study project.
It's a great opportunity for you to bring together everything you've worked on throughout this course. Plus adding a case study to your portfolio helps you stand out from the other candidates when you interview for your first data analyst job.
Learn about the process through the program:
1. Learn more about the Ask phase of the process in the Ask Questions to Make Data-Driven Decisions course.
2. Learn more about the Prepare phase of the process in the Prepare Data for Exploration course.
3. Learn more about the Process phase of the process in the Process Data from Dirty to Clean course.
4. Learn more about the Analyze phase of the process in the Analyze Data to Answer Questions and Data Analysis with R Programming courses.
5. Learn more about the Share phase of the process in the Share Data Through the Art of Visualization and Data Analysis with R Programming courses.
6. Learn more about the Act phase of the process in the Google Data Analytics Capstone: Complete a Case Study course.
Note: The course links are for you to preview and not complete the courses at this time. You may mark this activity as complete after you understand how the courses align to the data analysis process.
Video 17 Example of the data process
Ask
"You want to ask all of the right questions at the beginning of the engagement so that you better understand what your leaders and stakeholders need from this analysis." What is the problem that we're trying to solve? What is the purpose of this analysis? What are we hoping to learn from it?
Prepare
"We need to be thinking about the type of data we need in order to answer the questions that we've set out to answer based on what we learned when we asked the right questions." We also need to be thinking about how we're going to collect that data or if we need to collect that data. It may be the case that we need to collect this data brand-new. So we need to think about what type of data we're going to be collecting and how.
Process
It begins with cleaning. "This is where you get a chance to understand its structure, its quirks, its nuances, and you really get a chance to understand deeply what type of data you're going to be working with and understanding what potential that data has to answer all of your questions." Do we have all of the data that we anticipated we would have?
Are we missing data at random or is it missing in a systematic way such that maybe something went wrong with our data collection effort? If needed, did we code all of our data the right way? Are there any outliers that we need to treat differently?
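One simple way to start checking whether data is missing at random or in a systematic way is to compare the share of missing values across groups. The sketch below assumes hypothetical survey responses with a made-up "collector" field that records how each response was gathered. A much higher missing rate in one group hints that something went wrong with that collection method, although it is not proof on its own.

```python
from collections import Counter

# Hypothetical survey responses; None marks a missing answer.
responses = [
    {"collector": "online", "rating": 4},
    {"collector": "online", "rating": 5},
    {"collector": "online", "rating": None},
    {"collector": "kiosk", "rating": None},
    {"collector": "kiosk", "rating": None},
    {"collector": "kiosk", "rating": None},
    {"collector": "kiosk", "rating": 3},
]

# Count all responses and missing responses per collection method.
totals = Counter(r["collector"] for r in responses)
missing = Counter(r["collector"] for r in responses if r["rating"] is None)

# Share of missing ratings for each collection method.
missing_rate = {source: missing[source] / totals[source] for source in totals}

for source, rate in missing_rate.items():
    print(f"{source}: {rate:.0%} missing")
```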
Analyze
The first thing we do is run through a series of analyses that we've already planned ahead of time based on the questions that we know we want to answer from the very, very beginning of the process. One thing that's probably the hardest about this particular process, the hardest thing about analyzing data, is that we as analysts are trained to look for patterns.
Over time as we become better and better at our jobs, what we'll often find is that we can start to intuit what we might see in the data. "This is the point where we have to take a step back and let the data speak for itself."
"As data analysts, we are storytellers, but we also have to keep in mind that it is not our story to tell. That story belongs to the data, and it is our job as analysts to amplify and tell that story in as unbiased and objective a way as possible."
Act
"All of this work from asking the right questions to collecting your data, to analyzing and sharing, doesn't mean much of anything if we aren't taking action on what we've just learned." This is where we use all of those data-driven insights to decide what types of interventions we want to introduce, not only at the organizational level, but also at the team level as well.
The data analysis process is rigorous, but it is lengthy. I can completely appreciate that we, as data analysts, get so excited about just diving right into the data and doing what we do best.
Learning Log: Organize your data in a table
Overview
By now, you have started to think about data in your daily life and how you use this data to make decisions. Earlier in this course, you completed a learning log where you recorded some data from your daily life. Next, you will consider how to organize this data. In this activity, you’ll write an entry in your learning log to track your thinking and reflections about how to organize data.
By the time you complete your entry, you will understand how to create and format a table to store the data that you collect. Tables are one of the most common ways data is organized for analysis. This foundational skill will help you more easily analyze data, and will serve as a go-to tool in your data analyst’s toolkit.
Structuring your data
To get started, consider the data you have collected in your learning log entries so far in this course. Now, take a moment and prepare to organize this data. One of the simplest ways to add structure to your data is to put it in a table.
To record your data in a table, you need to understand how a table is structured:
A table consists of rows and columns
Each row is a different observation
Each column is a different attribute of that observation
For example, here is a collection of observations in a learning log about how many cups of coffee are consumed each day:
10/19, 2.5 cups of coffee
10/20, 2 cups of coffee
10/21, 1 cup of coffee
10/22, 1.5 cups of coffee
10/23, 1.5 cups of coffee
There are five data points. Each piece of data consists of a date and the number of cups of coffee consumed that day. You can structure this as a table with six rows and two columns. This includes five rows of data and one header row with titles:
Date | Cups of Coffee / Day
10/19 | 2.5
10/20 | 2
10/21 | 1
10/22 | 1.5
10/23 | 1.5
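The row-and-column structure described above can be sketched in Python. This is only an illustration: the `parse_entry` helper and the list-of-lists layout are assumptions made for this sketch, not part of the course material.

```python
# Raw learning-log entries, copied from the coffee example.
log_entries = [
    "10/19, 2.5 cups of coffee",
    "10/20, 2 cups of coffee",
    "10/21, 1 cup of coffee",
    "10/22, 1.5 cups of coffee",
    "10/23, 1.5 cups of coffee",
]

header = ["Date", "Cups of Coffee / Day"]


def parse_entry(entry):
    """Split one log entry into a [date, cups] row (hypothetical helper)."""
    date, rest = entry.split(", ")
    cups = float(rest.split()[0])  # first token of "2.5 cups of coffee"
    return [date, cups]


# One header row plus one row per observation.
table = [header] + [parse_entry(entry) for entry in log_entries]

for row in table:
    print(f"{row[0]:<6} {row[1]}")

print(len(table), "rows,", len(table[0]), "columns")
```

Each inner list is one row (an observation), and each position in the list is one column (an attribute).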
You can also create a table with more detailed data. For instance, if your data also contained information about whether there was cream and sugar in the coffee, it might appear like this:
10/19, 2.5 cups, cream, sugar
10/20, 2 cups, no cream, no sugar
10/21, 1 cup, cream, sugar
10/22, 1.5 cups, cream, no sugar
10/23, 1.5 cups, cream, sugar
You can represent this by adding two more columns to your table, one titled “Cream” and one titled “Sugar.”
Date | Cups of Coffee / Day | Cream | Sugar
10/19 | 2.5 | yes | yes
10/20 | 2 | no | no
10/21 | 1 | yes | yes
10/22 | 1.5 | yes | no
10/23 | 1.5 | yes | yes
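Adding attributes means adding columns. As a rough sketch, the detailed log entries can be turned into rows with four named columns; the `parse_detailed` helper and the dictionary layout are assumptions for illustration only.

```python
# Detailed log entries: date, cups, cream, sugar.
detailed_entries = [
    "10/19, 2.5 cups, cream, sugar",
    "10/20, 2 cups, no cream, no sugar",
    "10/21, 1 cup, cream, sugar",
    "10/22, 1.5 cups, cream, no sugar",
    "10/23, 1.5 cups, cream, sugar",
]


def parse_detailed(entry):
    """Turn one entry into a row with one value per column (hypothetical helper)."""
    date, cups, cream, sugar = [part.strip() for part in entry.split(",")]
    return {
        "Date": date,
        "Cups of Coffee / Day": float(cups.split()[0]),
        "Cream": "no" if cream.startswith("no ") else "yes",
        "Sugar": "no" if sugar.startswith("no ") else "yes",
    }


rows = [parse_detailed(entry) for entry in detailed_entries]

for row in rows:
    print(row)
```

Every row has the same four keys, which is what makes the data easy to sort, filter, and chart later.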
The data analysis toolbox
V18 Exploring data analyst tools
I'm looking forward to introducing you to some of the tools data analysts use each and every day.
There are tons of options out there.
But the most common ones you'll see analysts use are:
1. Spreadsheets
2. Query languages
3. Visualization tools
And this video is going to give you a quick look at how these tools are being used by data analysts every day.
Spreadsheets
The usefulness of your data depends on how well it's structured. When you put your data into a spreadsheet, you can see patterns, group information and easily find the information you need.
Formula: a set of instructions that performs a specific calculation using the data in a spreadsheet.
Function: a preset command that automatically performs a specific process or task using the data in a spreadsheet.
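The difference between a formula and a function can be illustrated with a Python analogy. In a spreadsheet, a formula might be written as =B2+B3+B4+B5+B6 and a function as =SUM(B2:B6); the cell references are assumed here for illustration, and the values are the coffee counts from the learning log.

```python
cups = [2.5, 2, 1, 1.5, 1.5]  # imagine these sit in cells B2 through B6

# "Formula" style: spell out the calculation yourself, like =B2+B3+B4+B5+B6
total_by_formula = cups[0] + cups[1] + cups[2] + cups[3] + cups[4]

# "Function" style: call a preset command, like =SUM(B2:B6)
total_by_function = sum(cups)

# Functions and formulas can be combined, like =SUM(B2:B6)/COUNT(B2:B6)
average = total_by_function / len(cups)

print(total_by_formula, total_by_function, average)
```

Both approaches give the same total; the function is simply shorter and keeps working if more values are added to the range.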
Query languages for databases
Query language: a computer programming language that allows you to retrieve and manipulate data from a database. Structured Query Language (SQL) is one example.
Visualization tools
Data visualization: the graphical representation of information. This makes it easier for stakeholders to draw conclusions, make decisions, and come up with strategies. Some popular visualization tools are Tableau and Looker.
Q. Fill in the blank: A query language is a computer programming language that enables data analysts to retrieve and manipulate data from a _____.
Answer: database
Key data analyst tools
As you are learning, the most common programs and solutions used by data analysts include spreadsheets, query languages, and visualization tools. In this reading, you will learn more about each one. You will cover when to use them, and why they are so important in data analytics.
Spreadsheets
Data analysts rely on spreadsheets to collect and organize data. Two popular spreadsheet applications you will probably use a lot in your future role as a data analyst are Microsoft Excel and Google Sheets.
Digital worksheets structure data in a meaningful way by letting you
Collect, store, organize, and sort information
Identify patterns and piece the data together in a way that works for each specific data project
Create excellent data visualizations, like graphs and charts.
Databases and query languages
A database is a collection of structured data stored in a computer system. Some popular Structured Query Language (SQL) programs include MySQL, Microsoft SQL Server, and BigQuery.
Query languages
Allow analysts to isolate specific information from a database(s)
Make it easier for you to learn and understand the requests made to databases
Allow analysts to select, create, add, or download data from a database for analysis
Visualization tools
Data analysts use a number of visualization tools, like graphs, maps, tables, charts, and more. Two popular visualization tools are Tableau and Looker.
These tools
Turn complex numbers into a story that people can understand
Help stakeholders come up with conclusions that lead to informed decisions and effective business strategies
Have multiple features
Tableau's simple drag-and-drop feature lets users create interactive graphs in dashboards and worksheets
Looker communicates directly with a database, allowing you to connect your data right to the visual tool you choose
A career as a data analyst also involves using programming languages, like R and Python, which are used a lot for statistical analysis, visualization, and other data analysis.
As you will continue to learn, data analysts have a lot of tools to choose from. This is a first look at some of the possibilities, and you will explore all of these tools in-depth throughout this program.
Choosing the right tool for the job
As a data analyst, you will usually have to decide which program or solution is right for the particular project you are working on. In this reading, you will learn more about how to choose which tool you need and when.
Depending on which phase of the data analysis process you’re in, you will need to use different tools. For example, if you are focusing on creating complex and eye-catching visualizations, then the visualization tools we discussed earlier are the best choice.
But if you are focusing on organizing, cleaning, and analyzing data, then you will probably be choosing between spreadsheets and databases using queries.
Spreadsheets and databases both offer ways to store, manage, and use data. The basic content for both tools is sets of values. Yet, there are some key differences, too:
Spreadsheets | Databases
Software applications | Data stores - accessed using a query language (e.g. SQL)
Structure data in a row and column format | Structure data using rules and relationships
Organize information in cells | Organize information in complex collections
Provide access to a limited amount of data | Provide access to huge amounts of data
Manual data entry | Strict and consistent data entry
Generally one user at a time | Multiple users
Controlled by the user | Controlled by a database management system
You don’t have to choose one or the other because each serves its own purpose. Generally, data analysts work with a combination of the two, as both tools are very useful in data analytics. For example, you can store data in a database, then export it to a spreadsheet for analysis.
Or, if you are collecting information in a spreadsheet, and it becomes too much for that particular platform, you can import it into a database.
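As a minimal sketch of moving data from a database to a spreadsheet, the following uses Python's built-in sqlite3 and csv modules. The in-memory SQLite database and the `coffee_log` table name are assumptions standing in for whatever database you actually use; the resulting CSV text is a format that Google Sheets and Microsoft Excel can both open.

```python
import csv
import io
import sqlite3

# Assumption: a small in-memory SQLite database stands in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coffee_log (date TEXT, cups REAL)")
conn.executemany(
    "INSERT INTO coffee_log VALUES (?, ?)",
    [("10/19", 2.5), ("10/20", 2.0), ("10/21", 1.0), ("10/22", 1.5), ("10/23", 1.5)],
)

# Query the database, then write the result as CSV for a spreadsheet.
cursor = conn.execute("SELECT date, cups FROM coffee_log ORDER BY date")
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([column[0] for column in cursor.description])  # header row
writer.writerows(cursor)  # one CSV line per database row
csv_text = buffer.getvalue()
conn.close()

print(csv_text)
```

Going the other direction (spreadsheet to database) would read the CSV rows and insert them with `executemany`.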
And, later in this course, you will learn about programming languages like R that give you even greater control of your data, its analysis, and the visualizations you create.
As you continue learning about these important tools, you will gain the knowledge to choose the right tool for any data job.
Self-Reflection: Reviewing past concepts
Overview
Now that you have been introduced to working with data, you can pause for a moment and think about what you are learning. In this self-reflection, you will consider your thoughts about the data analysis process and data life cycle, then respond to brief questions.
This self-reflection will help you develop insights into your own learning and prepare you to apply your knowledge of the phases of data analysis to your data analysis toolbox. As you answer questions—and come up with questions of your own—you will consider concepts, practices, and principles to help refine your understanding and reinforce your learning. You’ve done the hard work, so make sure to get the most out of it: This reflection will help your knowledge stick!
So far we’ve learned about the data life cycle and the data analysis process. They cover the following steps:
Data life cycle:
1. Plan
2. Capture
3. Manage
4. Analyze
5. Archive
6. Destroy
Data analysis process:
1. Ask
2. Prepare
3. Process
4. Analyze
5. Share
6. Act
Week 4 Setting up a data toolbox
As you're learning, spreadsheets, query languages, and data visualization tools are all a big part of a data analyst’s job. In this part of the course, you’ll learn more about the basic concepts involved and explore some examples of how these tools work.
Describe spreadsheets, query languages, and data visualization tools, giving specific examples
Demonstrate an understanding of the uses, basic features, and functions of a spreadsheet
Explain the basic concepts involved in the use of SQL including specific examples of queries
Identify the basic concepts involved in data visualization, giving specific examples
V19 the ins and outs of core data tools
Spreadsheet
SQL
Data visualization
V20 Columns and rows and cells, oh my!
An attribute is a characteristic or quality of data used to label a column in a table.
An observation includes all of the attributes for something contained in a row of a data table.
A formula is a set of instructions that performs a specific calculation using the data in a spreadsheet.
Q. In a table, an attribute is a characteristic or quality of data used for what purpose?
To gather related data
To reference a cell
To perform a calculation
To label a column
Correct. In a table, an attribute is a characteristic or quality of data used to label a column.
Hands-On Activity: Generating a chart from a spreadsheet
Activity overview
So far, you have planned a project, identified the data you need, and collected the data. Earlier in this course, you completed a learning log where you recorded some data from your daily life, then took the practical step of organizing it. Now, you’re ready for the most satisfying step of the data analysis project: visualizing your data!
For this activity, you will move your data to a spreadsheet and bring it to life in a chart. By the time you complete this activity, you will understand how to create a simple graphical representation of information. This is a skill data analysts use to make data easy to understand and interesting to look at. It’s important for reports, presentations, infographics, and more.
What you will need
To get started, first determine what software you’d like to use. We suggest using Google Sheets or Microsoft Excel to create your chart.
Save the spreadsheet with your preferred file naming convention, and store it in a folder to help you stay organized.
To use the template for this course item, click the link below and select “Use Template.”
Link to template: Data Chart Template
Working with spreadsheets
Now that you have a template ready, you can start the activity:
Step 1: Open a spreadsheet
Open your spreadsheet in Google Sheets or Microsoft Excel.
Step 2: Familiarize yourself with the spreadsheet
If you are already familiar with spreadsheets, that’s great! If spreadsheets are new to you, don’t worry—they are just like the table you created in a previous activity.
To get familiar with spreadsheets, you should consider a spreadsheet’s format:
Each rectangular block is a cell.
Each cell is meant for one data point, just like in the table you created previously.
Now, consider the cells that run across the top of the spreadsheet (horizontally) and along the left side of the spreadsheet (vertically):
Cells are organized by columns and rows.
Each column has a distinct letter, and each row has a distinct number.
Each cell has a unique identifier composed of the column letter and row number. This identifier is like the cell’s address.
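The cell address idea can be sketched in a few lines of Python. The function names below are hypothetical and only illustrate how a column letter and a row number combine into an identifier such as A1 or B1.

```python
def column_letter(index):
    """Convert a 1-based column index to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def cell_address(column_index, row_number):
    """Combine the column letter and the row number into a cell address."""
    return f"{column_letter(column_index)}{row_number}"


print(cell_address(1, 1))   # first column, first row
print(cell_address(2, 1))   # second column, first row
print(cell_address(27, 3))  # columns past Z use two letters
```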
You also have a chart embedded in the spreadsheet. However, the chart is blank, because it doesn’t have any data yet. Next, you will add data!
Step 3: Add your data
Now, you can add your own data from previous learning logs to the spreadsheet. Notice that cell A1 contains the label “Date”, and cell B1 contains the label “Value”.
This lines up with the same structure you used in the table you created in a previous learning log entry, where you recorded data from your daily life. Just like your table, all the “date” parts of your data points go in the cells in column A, and the values you recorded on those dates go in the corresponding cells in column B. It should display like this:
Date
Value
10/19
2.5
10/20
2
10/21
1
10/22
1.5
10/23
1.5
Next, take all of the data that you previously recorded in your learning log and use it to populate the spreadsheet in the appropriate columns: Add the dates in column A, and the values in column B.
Displaying a chart in a spreadsheet
Next, you will update the chart in your spreadsheet based on the data you entered.
Step 4: Reviewing your chart
Now that you’ve entered your data into the Date and Value columns, your spreadsheet should display a chart of that data.
Did you notice what happened to the chart when you entered the data in the spreadsheet? The spreadsheet visualized the data for you by making it into a basic bar chart. There are many different ways to customize charts, but first you need to clean up your chart. Then, you can interpret the data visualization!
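To get a feel for what the spreadsheet does when it draws a bar chart, here is a rough text-only sketch in Python. It is not how Google Sheets or Excel build charts; the `bar` helper and the scale of four characters per cup are arbitrary choices for this illustration.

```python
data = [("10/19", 2.5), ("10/20", 2), ("10/21", 1), ("10/22", 1.5), ("10/23", 1.5)]

SCALE = 4  # characters per cup; an arbitrary choice for this sketch


def bar(value):
    """Draw one bar whose length is proportional to the value."""
    return "#" * round(value * SCALE)


chart_lines = [f"{date} | {bar(value)} {value}" for date, value in data]

print("Cups of Coffee / Day")
for line in chart_lines:
    print(line)
```

The longer the bar, the larger the value, which is the same idea the spreadsheet chart expresses graphically.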
Step 5: Cleaning up your chart
Now that you have a chart, it’s time to clean it up. It’s generally a good idea to tidy your chart by making it descriptive and aesthetically pleasing. Keep in mind that good data analysts are data storytellers!
To change the title of your chart, double click on the chart. You should find that a “Chart Editor” menu displays on the side of your screen. Click on “Chart & axis titles”, and then enter the name of your chart in the “Title Text” box:
If you’re using Microsoft Excel, you can double click on the title in the chart to edit directly.
Don’t be afraid to play around with the options here: You can always download another copy of the template if you make a mistake you can’t fix!
More spreadsheet resources
In the spirit of lifelong learning, it is good to have resources to turn to when you want to know more about using spreadsheets. Two of the most well known and used spreadsheet platforms are Google Sheets and Microsoft Excel. Both provide free online training resources that you can access anytime you need them. Bookmark these links if you want to access them later.
Google Sheets Training and Help
Learn even more ways to move, store, and analyze your data with the Google Sheets Training and Help page, located in the Google Workspace Learning Center. This hub offers an expanded list of tips, from beginner to advanced, along with cheat sheets, templates, guides, and tutorials.
Google Sheets Cheat Sheet
Want to learn more about Google Sheets? This online help article features a short list of the most important functions you will use, including rows, columns, cells, and functions.
Microsoft Excel for Windows Training
Get to know Excel spreadsheets a little better by visiting this free online training center. Offering everything from a quick-start guide and introduction to tutorials and templates, you will find everything you need to know, all in one place.
V21 SQL in action
SQL Guide: Getting started
Just as humans use different languages to communicate with others, so do computers. Structured Query Language (or SQL, sometimes pronounced as “sequel”) lets data analysts talk to their databases.
SQL is one of the most useful data analyst tools, especially when you are working with large datasets in tables.
It can help you investigate huge databases, track down text (referred to as strings) and numbers, and filter for the exact kind of data you need—much faster than a spreadsheet can.
In this reading, we will go over the basics of using SQL and explore an example query to demonstrate how SQL works in action.
Basic syntax for SQL
Every programming language, including SQL, has to follow a unique set of guidelines known as syntax. As soon as you enter your search criteria using the correct syntax, the query should start working to pull the information you have requested from the target database.
What is a query?
A query is basically a request for data or information from a database. For example, “Tell me how many comedy movies were made in 1985” or “How many people live in Puerto Rico?” When we query databases, we use SQL to communicate our question or request. The user and the database can exchange information as long as they “speak” the same language.
The foundation of every SQL query is the same:
Use SELECT to choose the columns you want to return.
Use FROM to choose the tables where the columns you want are located.
Use WHERE to filter for certain information.
A SQL query is like filling in a template. You will find that if you are writing a SQL query from scratch, it is helpful to start a query by writing the SELECT, FROM, and WHERE keywords in the following format:
Next, enter the table name after the FROM; the table columns you want after the SELECT; and, finally, the conditions you want to place on your query after the WHERE. Make sure to add a new line and indent when adding these, as shown below:
Following this method each time makes it easier to write SQL queries. It can also help you make fewer syntax errors.
Example of a query
Here is how a simple query would appear in BigQuery, a data warehouse on the Google Cloud Platform.
SELECT first_name
FROM customer_data.customer_name
WHERE first_name = 'Tony'
The above query uses three commands to locate customers with the first name Tony:
SELECT the column named first_name
FROM a table named customer_name (in a dataset named customer_data) (The dataset name is always followed by a dot, and then the table name.)
But only return the data WHERE the first_name is Tony
The results from the query might be similar to the following:
first_name
Tony
Tony
Tony
As you can conclude, this query had the correct syntax, but wasn't very useful after the data was returned.
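To try the SELECT, FROM, WHERE pattern without a BigQuery account, here is a minimal sketch using Python's built-in sqlite3 module. It rests on several assumptions: SQLite stands in for BigQuery, an attached in-memory database named customer_data plays the role of the dataset, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: an attached in-memory database named customer_data stands in
# for the BigQuery dataset, so the dataset.table form can be written as-is.
conn.execute("ATTACH DATABASE ':memory:' AS customer_data")
conn.execute(
    "CREATE TABLE customer_data.customer_name "
    "(customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
# Hypothetical sample rows, invented for this sketch.
conn.executemany(
    "INSERT INTO customer_data.customer_name VALUES (?, ?, ?)",
    [
        (1967, "Tony", "Magnolia"),
        (7689, "Tony", "Magnolia"),
        (2210, "Tony", "Rivera"),
        (3051, "Maria", "Chavez"),
    ],
)

# The template: SELECT the column, FROM the table, WHERE the condition holds.
query = """
SELECT
    first_name
FROM
    customer_data.customer_name
WHERE
    first_name = 'Tony'
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row[0])
```

Each returned row contains only the first_name column, which is why the output is just a repeated name.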
Multiple columns in a query
In real life, you will need to work with more data beyond customers named Tony. Multiple columns that are chosen by the same SELECT command can be indented and grouped together.
If you are requesting multiple data fields from a table, you need to include these columns in your SELECT command. Each column is separated by a comma as shown below:
SELECT
  ColumnA,
  ColumnB,
  ColumnC
FROM
  Table where the data lives
WHERE
  Certain condition is met
Here is an example of how it would appear in BigQuery:
SELECT
  customer_id,
  first_name,
  last_name
FROM
  customer_data.customer_name
WHERE
  first_name = 'Tony'
The above query uses three commands to locate customers with the first name Tony.
SELECT the columns named customer_id, first_name, and last_name
FROM a table named customer_name (in a dataset named customer_data) (The dataset name is always followed by a dot, and then the table name.)
But only return the data WHERE the first_name is Tony
The only difference between this query and the previous one is that more data columns are selected. The previous query selected first_name only while this query selects customer_id and last_name in addition to first_name.
In general, it is a more efficient use of resources to select only the columns that you need. For example, it makes sense to select more columns if you will actually use the additional fields in your WHERE clause. If you have multiple conditions in your WHERE clause, they may be written like this:
SELECT
  ColumnA,
  ColumnB,
  ColumnC
FROM
  Table where the data lives
WHERE
  Condition 1
  AND Condition 2
  AND Condition 3
Notice that unlike the SELECT command that uses a comma to separate fields/variables/parameters, the WHERE command uses the AND statement to connect conditions. As you become a more advanced writer of queries, you will make use of other connectors/operators such as OR and NOT.
Here is a BigQuery example with multiple fields used in a WHERE clause:
SELECT
  customer_id,
  first_name,
  last_name
FROM
  customer_data.customer_name
WHERE
  customer_id > 0
  AND first_name = 'Tony'
  AND last_name = 'Magnolia'
The above query uses three commands to locate customers with a valid (greater than 0) customer ID whose first name is Tony and last name is Magnolia.
SELECT the columns named customer_id, first_name, and last_name
FROM a table named customer_name (in a dataset named customer_data) (The dataset name is always followed by a dot, and then the table name.)
But only return the data WHERE customer_id is greater than 0, first_name is Tony, and last_name is Magnolia.
Note that one of the conditions is a logical condition that checks to see if customer_id is greater than zero.
If only one customer is named Tony Magnolia, the results from the query could be:
customer_id | first_name | last_name
1967 | Tony | Magnolia
If more than one customer has the same name, the results from the query could be:
customer_id | first_name | last_name
1967 | Tony | Magnolia
7689 | Tony | Magnolia
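The multi-column, multi-condition query can be tried the same way. This sketch again assumes SQLite in place of BigQuery, an attached in-memory database named customer_data in place of the dataset, and invented sample rows (including one row with customer_id 0 to show the effect of the customer_id > 0 condition).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the attached database plays the role of the customer_data dataset.
conn.execute("ATTACH DATABASE ':memory:' AS customer_data")
conn.execute(
    "CREATE TABLE customer_data.customer_name "
    "(customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
# Hypothetical sample rows, invented for this sketch.
conn.executemany(
    "INSERT INTO customer_data.customer_name VALUES (?, ?, ?)",
    [
        (1967, "Tony", "Magnolia"),
        (7689, "Tony", "Magnolia"),
        (0, "Tony", "Magnolia"),     # fails customer_id > 0
        (2210, "Tony", "Rivera"),    # fails the last_name condition
        (3051, "Maria", "Magnolia"), # fails the first_name condition
    ],
)

query = """
SELECT
    customer_id,
    first_name,
    last_name
FROM
    customer_data.customer_name
WHERE
    customer_id > 0
    AND first_name = 'Tony'
    AND last_name = 'Magnolia'
ORDER BY
    customer_id
"""
matches = conn.execute(query).fetchall()
conn.close()

for customer_id, first_name, last_name in matches:
    print(customer_id, first_name, last_name)
```

A row is returned only when all three conditions joined by AND are true for it.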
Key takeaway
The most important thing to remember is how to use SELECT, FROM, and WHERE in a query. Queries with multiple fields will become simpler after you practice writing your own SQL queries later in the program.
Video 22 everyday struggles when learning new skills
Endless SQL possibilities
You have learned that a SQL query uses SELECT, FROM, and WHERE to specify the data to be returned from the query. This reading provides more detailed information about formatting queries, using WHERE conditions, selecting all columns in a table, adding comments, and using aliases. All of these make it easier for you to understand (and write) queries to put SQL in action. The last section of this reading provides an example of what a data analyst would do to pull employee data for a project.
Basic components of a query (and a few useful tips)
Capitalization, indentation, and semicolons
You can write your SQL queries in all lowercase and don’t have to worry about extra spaces between words. However, using capitalization and indentation can help you read the information more easily. Keep your queries neat, and they will be easier to review or troubleshoot if you need to check them later on.
WHERE conditions
In the query shown above, the SELECT clause identifies the column you want to pull data from by name, field1, and the FROM clause identifies the table where the column is located by name, table. Finally, the WHERE clause narrows your query so that the database returns only the data with an exact value match or the data that matches a certain condition that you want to satisfy.
For example, if you are looking for a specific customer with the last name Chavez, the WHERE statement would be: WHERE field1 = 'Chavez';
However, if you are looking for all customers with a last name that begins with the letters “Ch”, the WHERE statement would be: WHERE field1 LIKE 'Ch%';
You can see that the LIKE statement is very powerful because it allows you to tell the database to look for a certain pattern! The percent sign (%) is used as a wildcard to match zero or more characters. In our example, both Chavez and Chen would be returned. Note that in some databases the asterisk (*) is used as the wildcard instead of the percent sign (%).
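The difference between an exact match and a LIKE pattern can be checked with a small sketch. It assumes SQLite as the database, a hypothetical table named customers, and invented sample names; note that SQLite's LIKE ignores case for ASCII letters by default, which other databases may not do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; field1 holds last names, as in the reading's example.
conn.execute("CREATE TABLE customers (field1 TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Chavez",), ("Chen",), ("Magnolia",), ("Lee",)],
)

# Exact match: only the value 'Chavez' qualifies.
exact = conn.execute(
    "SELECT field1 FROM customers WHERE field1 = 'Chavez'"
).fetchall()

# Pattern match: any value that starts with 'Ch' qualifies.
pattern = conn.execute(
    "SELECT field1 FROM customers WHERE field1 LIKE 'Ch%' ORDER BY field1"
).fetchall()
conn.close()

print("exact:", exact)
print("pattern:", pattern)
```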
Can you use SELECT * ?
In our example, if you replace SELECT field1 with SELECT * you would be selecting all the columns in the table. From a syntax point of view, it is a correct SQL statement, but you should use it sparingly and with caution because depending on how many columns a table has, you could be selecting a tremendous amount of data.
Finally, you will notice that we have shown the SQL statement with a semicolon at the end. The semicolon is a statement terminator and is part of the American National Standards Institute (ANSI) SQL-92 standard which is a recommended common syntax for adoption by all SQL databases.
However, not all SQL databases have adopted or enforce the semicolon, so it’s possible you may come across some SQL statements that aren’t terminated with a semicolon. If a statement works without a semicolon, it’s fine.
Comments
Some tables aren’t designed with descriptive enough naming conventions. In our previous example, field1 was the column for a customer’s last name, but you wouldn’t know it by the name. A better name would have been something like last_name.
In these cases, you can place comments alongside your SQL statements to help you remember what the name represents. Comments are text placed between certain characters, /* and */, or after two dashes (--).
Comments can also be added outside of a statement as well as within a statement. You can use this flexibility to provide an overall description of what you are going to do, step-by-step notes about how you achieve it, and why you set different parameters/conditions.
The more comfortable you get with SQL, the easier it will be to read and understand queries at a glance. Still, it never hurts to have comments in a query to remind yourself of what you’re trying to do. This also makes it easier for others to understand your query if your query is shared. As your queries become more and more complex, this practice will save you a lot of time and energy to understand complex queries you wrote months or years ago.
In the above example, a comment has been added before the SQL statement to explain what the query does. Additionally, a comment has been added next to each of the column names to describe the column and its use. Two dashes (--) are generally supported, so it is best to use -- and be consistent with it. You can use # in place of -- in the above query, but # is not recognized in all SQL versions; for example, PostgreSQL doesn’t recognize #. You can also place comments between /* and */ if the database you are using supports it.
As you develop your skills professionally, depending on the SQL database you use, you can pick the appropriate comment delimiting symbols you prefer and stick with those as a consistent style.
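Here is a minimal sketch of a commented query, run through Python's built-in sqlite3 module. SQLite, the customers table, and the sample names are assumptions for illustration; SQLite accepts both the two-dash and the /* */ comment styles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a poorly named column, as in the reading's example.
conn.execute("CREATE TABLE customers (field1 TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Chavez",), ("Chen",), ("Magnolia",)],
)

query = """
-- Pull the customers whose last name starts with 'Ch'
SELECT
    field1      /* this is the last name */
FROM
    customers   -- hypothetical table name for this sketch
WHERE
    field1 LIKE 'Ch%'
ORDER BY
    field1;
"""
commented_result = conn.execute(query).fetchall()
conn.close()

print(commented_result)
```

The database ignores the comments entirely; they exist only to help the people reading the query.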
Aliases
You can also make it easier on yourself by assigning a new name or alias to the column or table names to make them easier to work with (and avoid the need for comments).
This is done with a SQL AS statement. In the example below, you are changing field1 to last_name and table to customers for the duration of the query only. It doesn’t change the names in the actual table used in the database.
Example of a query with aliases
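As a sketch of what such a query could look like, the following uses Python's built-in sqlite3 module. SQLite and the sample names are assumptions; because table is a reserved word, the table name is written in double quotes here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: the table is literally named "table", so it must be quoted.
conn.execute('CREATE TABLE "table" (field1 TEXT)')
conn.executemany('INSERT INTO "table" VALUES (?)', [("Chavez",), ("Chen",)])

query = """
SELECT
    customers.field1 AS last_name   -- column alias
FROM
    "table" AS customers            -- table alias
ORDER BY
    last_name
"""
cursor = conn.execute(query)
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()

# The alias lasts only for the query; the stored column name is unchanged.
stored_names = [
    column[0] for column in conn.execute('SELECT * FROM "table"').description
]
conn.close()

print(column_names, rows, stored_names)
```

The result set is labeled last_name, while the table itself still has a column called field1.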
Putting SQL to work (what you might do as a data analyst)
Imagine you are a data analyst for a small business and your manager asks you for some employee data. You decide to write a query with SQL to get what you need from the database.
Let’s say you want to pull all the columns: empID, firstName, lastName, jobCode, and salary. Because you know the database isn’t that big, instead of listing each column in the SELECT statement, you use SELECT *. This will select all the columns from the Employee table in the FROM clause.
SELECT *
FROM Employee;
Now, let’s get more specific about the data we want from the Employee table. If you want all the data about employees working in the SFI job code, you can use a WHERE statement to filter out the data based on this additional requirement.
Here, you use:
SELECT *
FROM Employee
WHERE jobCode = 'SFI';
The portion of the resulting data returned from the SQL query might look like this:
empID | firstName | lastName | jobCode | salary
0002 | Homer | Simpson | SFI | 
 | Marge | Simpson | SFI | 
 | Bart | Simpson | SFI | 
 | Lisa | Simpson | SFI | 
 | Ned | Flanders | SFI | 
 | Barney | Gumble | SFI | 32000
Suppose you notice a large salary range for the SFI job code, so you would like to flag all employees in all departments with lower salaries for your manager.
Because interns are also included in the table and they have salaries less than $30,000, you want to make sure your results give you only the full time employees with salaries that are $30,000 or less. In other words, you want to exclude interns with the INT job code who earn less than $30,000. A SQL AND statement will enable you to find this information.
You create a SQL query similar to below, where <> means "does not equal":
SELECT *
FROM Employee
WHERE jobCode <> 'INT'
AND salary <= 30000;
The resulting data from the SQL query might look like the following (interns with the job code INT aren't returned):
empID | firstName | lastName | jobCode | salary
0002 | Homer | Simpson | SFI | 
 | Marge | Simpson | SFI | 
 | Bart | Simpson | SFI | 
 | Edna | Krabappel | TUL | 
 | Moe | Szyslak | ANA | 28000
With quick access to this kind of data in SQL, you can provide your manager with tons of different insights about employee data, including whether employee salaries across the business are equitable. Fortunately, the query shows only an additional two employees might need a salary adjustment and you share the results with your manager.
Pulling the data, analyzing it, and implementing a solution might ultimately help improve employee satisfaction and loyalty–making SQL a pretty powerful tool.
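The employee query can be tried end to end with a small sketch. Everything about the data here is an assumption: SQLite stands in for the company database, and the four employee rows are invented for illustration rather than taken from the tables in this reading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee "
    "(empID TEXT, firstName TEXT, lastName TEXT, jobCode TEXT, salary INTEGER)"
)
# Hypothetical employees, invented for this sketch.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?, ?)",
    [
        ("0001", "Ana", "Lopez", "SFI", 29000),
        ("0002", "Ben", "Okafor", "SFI", 41000),  # salary above 30000
        ("0003", "Chen", "Wu", "INT", 18000),     # intern, excluded by <>
        ("0004", "Dee", "Park", "ANA", 28000),
    ],
)

query = """
SELECT *
FROM Employee
WHERE jobCode <> 'INT'
AND salary <= 30000
ORDER BY empID;
"""
flagged = conn.execute(query).fetchall()
conn.close()

for row in flagged:
    print(row)
```

Only non-intern employees earning 30000 or less are returned, which matches the two conditions joined by AND.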
Resources to learn more
SQL Cheat Sheet: This starter guide for standard SQL syntax used in PostgreSQL offers videos, activities, and readings on SQL. By the time you are finished, you will get to know a lot more about SQL and be prepared to use it for business analysis and other tasks.
W3Schools SQL Tutorial: If you would like to explore a detailed tutorial of SQL, this is the perfect place to start. This tutorial includes interactive examples you can edit, test, and recreate. Use it as a reference or complete the whole tutorial to practice using SQL. Click the green Start learning SQL now button or the Next button to begin the tutorial.
SQL Guide: Getting started
This reading covers some of the best practices for formatting queries so they are easy to read and understand. You will be introduced to conventions that help the purpose and function of a query to stand out. This involves how you work with commands in queries as well as how you work with multiple fields. The reading closes with an introduction to how you add comments to your queries to help explain your thinking and the expected outcome of a query.
Capitalization and indentation
Shown below is a recommended format for writing queries. Capitalize SELECT, FROM, and WHERE. Make sure to add a new line and indent when adding the fields.
Here’s an example of what this could look like in BigQuery.
The query uses three commands to locate customers with the first name Tony.
SELECT the column named ‘first_name’
FROM a table named ‘customer_name’ (in a database named ‘customer_data’)
but only the data WHERE ‘first_name’ is ‘Tony’
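The three commands described above can be sketched as a runnable example. This is only an illustration under a few assumptions: Python's built-in sqlite3 module stands in for BigQuery, the sample rows are made up, and the table is referenced as customer_name rather than with the customer_data prefix that BigQuery would use.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_name (customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?, ?)",
    [(1, "Tony", "Magnolia"), (2, "Ana", "Reyes"), (3, "Tony", "Park")],  # made-up rows
)

# Keywords are capitalized; each field and condition sits on its own indented line.
QUERY = """
SELECT
    first_name
FROM
    customer_name
WHERE
    first_name = 'Tony'
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Each returned row is a tuple with a single value, because only one column was selected.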
Tip: This is like filling in a template. Always start queries by writing the SELECT, FROM, and WHERE statements in this format. Enter your table name after the FROM command, enter the field(s) you want after the SELECT command, and finally enter the conditions you want to place on your query after the WHERE command. You’ll find this makes it easier to write SQL queries, and it may feel more intuitive to proceed from large to small by entering the large details (table) before the small details (fields and conditions).
Multiple fields
Using the indentation previously described also makes it easier for you to group multiple fields together that are affected by the same command.
If you are requesting data from a table that has multiple fields, you need to include these fields in your SELECT command:
Here is an example of what this could look like in BigQuery:
The query uses three commands to locate customers with the first name Tony.
SELECT the columns named ‘customer_id’, ‘first_name’, and ‘last_name’
FROM a table named ‘customer_name’ (in a database named ‘customer_data’)
but only the data WHERE ‘first_name’ is ‘Tony’
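A minimal sketch of the multiple-field version follows, under the same assumptions as before: sqlite3 stands in for BigQuery, the rows are invented for illustration, and the customer_data prefix is omitted.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_name (customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?, ?)",
    [(1, "Tony", "Magnolia"), (2, "Ana", "Reyes"), (3, "Tony", "Park")],  # made-up rows
)

# Fields under SELECT are separated by commas, one per indented line.
QUERY = """
SELECT
    customer_id,
    first_name,
    last_name
FROM
    customer_name
WHERE
    first_name = 'Tony'
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Listing one field per line makes it easy to add or remove a column later without touching the rest of the query.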
This query is the same as the previous query except that more data columns were selected. In general, it’s a better use of resources to select only the columns that you’ll actually use.
It makes sense to select more columns if you use the additional fields in your WHERE statement. If you have multiple conditions in your WHERE statement they may be written like this:
Notice that unlike the SELECT command that uses a comma to separate fields/variables/parameters, the WHERE command uses the AND statement to connect conditions. This is important to remember because as you become a more advanced writer of queries you will make use of other connectors/operators such as OR and NOT to connect your conditions.
Here is an example of what this could look like in BigQuery:
The query uses three commands to locate customers with a valid (greater than 0) customer ID whose first name is Tony and last name is Magnolia.
SELECT the columns named ‘customer_id’, ‘first_name’, and ‘last_name’
FROM a table named ‘customer_name’ (in a database named ‘customer_data’)
but only the data WHERE ‘customer_id’ is greater than zero, ‘first_name’ is ‘Tony’, and ‘last_name’ is ‘Magnolia’.
Note that one of the conditions is a logical condition where we are checking to see if customer_id is greater than zero.
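The query with three conditions joined by AND could look like the sketch below. As before, this assumes sqlite3 as a stand-in for BigQuery and uses made-up rows, including one with a customer_id of 0 so the effect of the greater-than-zero condition is visible.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_name (customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?, ?)",
    [
        (1, "Tony", "Magnolia"),
        (0, "Tony", "Magnolia"),  # invalid ID, filtered out by customer_id > 0
        (2, "Tony", "Park"),      # wrong last name
        (3, "Ana", "Magnolia"),   # wrong first name
    ],
)

# Conditions under WHERE are connected with AND, one per indented line.
QUERY = """
SELECT
    customer_id,
    first_name,
    last_name
FROM
    customer_name
WHERE
    customer_id > 0
    AND first_name = 'Tony'
    AND last_name = 'Magnolia'
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

Only the row that satisfies all three conditions is returned; swapping an AND for OR would widen the result.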
Comments as reminders
The more comfortable you get with SQL, the easier it will be to read and understand queries at a glance. Still, it never hurts to have comments in a query to remind yourself of what you’re trying to do.
This also makes it easier for others to understand your query if your query is shared. As your queries become more and more complex, this practice will save you a lot of time and energy to understand complex queries you wrote months or years ago.
Over time, your query style will change, and comments will help you remember what your thinking was at the time. Use the “--” symbols to make comments in your query. They tell SQL to ignore whatever comes after the “--” on the same line. For example:
Notice that comments can be added outside of a statement as well as within a statement. You can use this flexibility to provide an overall description of what you are going to do, step-by-step notes about how you achieve it, and why you set different parameters/conditions.
Here is an example of how comments could be written in BigQuery:
In the example, we provide a comment next to each of the column names and give a description of the column and its uses. Two dashes (--) are supported by virtually every SQL dialect, so it’s best to use -- and be consistent with it. You can use # in place of -- in the above query, but # is not recognized in all SQL dialects; for example, PostgreSQL and SQLite don’t recognize # as a comment marker, although BigQuery and MySQL do. You can also use /* before and */ after a comment if the database you’re using supports it.
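Here is a small sketch of a commented query, with one comment before the statement and one next to each column. It assumes sqlite3 as a stand-in for BigQuery (SQLite accepts both -- and /* */ comments), and the column descriptions and sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_name (customer_id INTEGER, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer_name VALUES (?, ?, ?)",
    [(1, "Tony", "Magnolia"), (2, "Ana", "Reyes")],  # made-up rows
)

QUERY = """
-- Pull basic information from the customer table
SELECT
    customer_id,  -- unique ID for each customer
    first_name,   -- first name of the customer
    last_name     /* last name of the customer */
FROM
    customer_name
"""

# The database ignores every comment and runs only the SELECT statement.
rows = conn.execute(QUERY).fetchall()
print(rows)
```

The comments change nothing about the result; they exist only for the person reading the query.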
V23 become a data viz whiz
Planning a data visualization
Earlier, you learned that data visualization is the graphical representation of information. As a data analyst, you will want to create visualizations that make your data easy to understand and interesting to look at.
Because of the importance of data visualization, most data analytics tools (such as spreadsheets and databases) have a built-in visualization component while others (such as Tableau) specialize in visualization as their primary value-add. In this reading, you will explore the steps involved in the data visualization process and a few of the most common data visualization tools available.
Steps to plan a data visualization
Let’s go through an example of a real-life situation where a data analyst might need to create a data visualization to share with stakeholders. Imagine you’re a data analyst for a clothing distributor.
The company helps small clothing stores manage their inventory, and sales are booming. One day, you learn that your company is getting ready to make a major update to its website. To guide decisions for the website update, you’re asked to analyze data from the existing website and sales records. Let’s go through the steps you might follow.
Step 1: Explore the data for patterns
First, you ask your manager or the data owner for access to the current sales records and website analytics reports. This includes information about how customers behave on the company’s existing website, basic information about who visited, who bought from the company, and how much they bought.
While reviewing the data you notice a pattern among those who visit the company’s website most frequently: geography and larger amounts spent on purchases. With further analysis, this information might explain why sales are so strong right now in the northeast—and help your company find ways to make them even stronger through the new website.
Step 2: Plan your visuals
Next it is time to refine the data and present the results of your analysis. Right now, you have a lot of data spread across several different tables, which isn’t an ideal way to share your results with management and the marketing team. You will want to create a data visualization that explains your findings quickly and effectively to your target audience. Since you know your audience is sales oriented, you already know that the data visualization you use should:
Show sales numbers over time
Connect sales to location
Show the relationship between sales and website use
Show which customers fuel growth
Step 3: Create your visuals
Now that you have decided what kind of information and insights you want to display, it is time to start creating the actual visualizations.
Keep in mind that creating the right visualization for a presentation or to share with stakeholders is a process.
It involves trying different visualization formats and making adjustments until you get what you are looking for. In this case, a mix of different visuals will best communicate your findings and turn your analysis into the most compelling story for stakeholders. So, you can use the built-in chart capabilities in your spreadsheets to organize the data and create your visuals.
1) Line charts can track sales over time
2) Maps can connect sales to locations
3) Donut charts can show customer segments
4) Bar charts can compare total visitors that make a purchase
Build your data visualization toolkit
There are many different tools you can use for data visualization.
You can use the visualization tools in your spreadsheet to create simple visualizations such as line and bar charts.
You can use more advanced tools such as Tableau that allow you to integrate data into dashboard-style visualizations.
If you’re working with the programming language R, you can use the visualization tools in RStudio.
Your choice of visualization tool will depend on a variety of factors, including the size of your data and the process you used to analyze it (spreadsheets, databases and queries, or programming languages). For now, just consider the basics.
Spreadsheets (Microsoft Excel or Google Sheets)
In our example, the built-in charts and graphs in spreadsheets made the process of creating visuals quick and easy. Spreadsheets are great for creating simple visualizations like bar graphs and pie charts, and even provide some advanced visualizations like maps, and waterfall and funnel diagrams (shown in the following figures).
But sometimes you need a more powerful tool to truly bring your data to life. Tableau and RStudio are two examples of widely used platforms that can help you plan, create, and present effective and compelling data visualizations.
Visualization software (Tableau)
Tableau is a popular data visualization tool that lets you pull data from nearly any system and turn it into compelling visuals or actionable insights. The platform offers built-in visual best practices, which makes analyzing and sharing data fast, easy, and (most importantly) useful. Tableau works well with a wide variety of data and includes an interactive dashboard that lets you and your stakeholders click to explore the data interactively.
You can start exploring Tableau from the How-to Video resources. Tableau Public is free, easy to use, and full of helpful information. The Resources page is a one-stop-shop for how-to videos, examples, and datasets for you to practice with. To explore what other data analysts are sharing on Tableau, visit the Viz of the Day page where you will find beautiful visuals ranging from the Hunt for (Habitable) Planets to Who’s Talking in Popular Films.
Programming language (R with RStudio)
A lot of data analysts work with a programming language called R. Most people who work with R end up also using RStudio, an integrated development environment (IDE), for their data visualization needs. As with Tableau, you can create dashboard-style data visualizations using RStudio.
Check out their website to learn more about RStudio.
You could easily spend days exploring all the resources provided at RStudio.com, but the RStudio Cheatsheets and the RStudio Visualize Data Primer are great places to start. When you have more time, check out the webinars and videos which offer advice and helpful perspectives for both beginners and advanced users.
Key takeaway
The best data analysts use lots of different tools and methods to visualize and share their data. As you continue learning more about data visualization throughout this course, be sure to stay curious, research different options, and continuously test new programs and platforms to help you make the most of your data.
V24 the power of visualization.
Video 25 to 27 about quick lab
Week 5 Discovering data career possibilities
Businesses of all kinds value the work done by data analysts. In this part of the course, you’ll find out about these businesses and the specific jobs and tasks that analysts perform for them. You’ll also learn how your data analyst certificate will help you meet many of the requirements for a position with these businesses.
Describe the role of a data analyst with specific reference to job roles
Discuss how the Google Data Analytics Certificate can help a candidate meet the requirements of a given job
Explain how a business task may be appropriate for a data analyst, with reference to fairness and the value of the data analyst
Identify companies that would potentially hire data analysts
Describe how one's prior experiences may be applied to a career as a data analyst
Determine whether the use of data constitutes fair or unfair practices
Understand the different ways organizations use data
Explain the concept of data-driven decision-making including specific examples
v28 let’s get down to business
We're going to talk more about:
the role of data analyst
business tasks for data analysts
fairness in analysis
opportunities for you
and your future success
V29 the job of data analyst
Now let's look at where data analysts actually do their work
You'll learn much more about the industries you could work in as a data analyst.
And how companies in these fields are already using data analytics to do some really cool things.
Across industries like
technology,
marketing,
finance,
health care
Let's look at a real-life example of a brand you'll probably recognize, Coca-Cola. Data is changing the way Coca-Cola approaches its marketing strategies. Coca-Cola uses data gathered from consumer feedback to create advertising that speaks directly to different audiences with different interests. How does this work? You know those high-tech Coca-Cola vending machines you see at movie theaters sometimes? It's always fun getting to make your own flavors.
Well, those machines have built-in artificial intelligence and data analysis tools. This helps Coca-Cola see all the different kinds of flavor combinations people are coming up with, which they can then use as inspiration for new products. How cool is that?
Ever wonder how Google gives you the right answer to any question in just seconds? That's powered by data too. We use all kinds of data to determine a website's reliability and accuracy to make sure you get the most useful results for any search you make.
But it isn't just big companies like Coca-Cola and Google that use data. Small businesses everywhere are also starting to take advantage of data driven insights to improve their operations and make better decisions. Small businesses can use data to do all kinds of things.
They might use data analytics to better understand their customers' buying habits, create more effective social media messaging, or, in the case of one city zoo and aquarium, predict the number of daily visitors based on local climate data.
The city zoo and aquarium realized that, on rainy days, they were seeing huge drop-offs in attendance, but they had no way to accurately predict when those rainy days would hit. This made staffing a real challenge. Some days they found themselves overstaffed; other days they were unprepared for the rush of visitors. To deal with this, data analysts took years of weather records from the zoo and used that data to accurately predict future weather patterns.
This made it easy for the zoo to know how much staff they needed when. Because the zoo could predict and manage their staffing needs more accurately, they were able to provide a better experience for visitors and dedicate more resources to creating a better experience for the animals too. We see a similar thing in the healthcare industry.
There, data analysts look at clinic attendance data to help hospitals and doctors' offices predict when rush hours will hit so they can be ready for it. Your local city hospital is a great example. Let's say they've been getting complaints about long wait times.
Sometimes an hour or more, which made it hard for some patients to get the care they needed. So data analysts use data about the hospital's daily foot traffic to help them make more informed decisions about how many doctors they need on staff at any given time.
This helped reduce wait times, improve their patients' experience, and make better use of the health care workers' time too. Like I said, there are many ways that companies in different industries put data to use, but they can only do that if they have data analysts they can rely on.
So you might be wondering how you fit into the equation. Well, you've got plenty of options, but you don't have to decide what industry you want to work in right away. There will be plenty of time to think about that as you make your way through this program. By the time you finish this program, you'll have the core skills that will make you valuable in any industry that makes data-driven decisions. Which, as it turns out, is most industries, even zoos. Coming up, we'll check out the business tasks where data can be helpful. And we'll explore even more how data analysts are empowering businesses through data. I'll see you then.
V30 path to become a data analyst
In analytics, I feel like the key to success is being able to blend the personal side with the technical side.
Self-Reflection: Business use of data
Overview
Now that you have been introduced to the role of a data analyst, you can pause for a moment and think about what you are learning. In this self-reflection, you will consider your thoughts about how industries use data and respond to brief questions.
This self-reflection will help you develop insights into your own learning and prepare you to connect your knowledge of a data analyst’s responsibilities to real-world business scenarios.
As you answer questions—and come up with questions of your own—you will consider concepts, practices, and principles to help refine your understanding and reinforce your learning. You’ve done the hard work, so make sure to get the most out of it: This reflection will help your knowledge stick!
How a business uses data
In this self-reflection, you’ll consider the businesses you interact with day-to-day and reflect on how they use data to improve their customer experience.
Pick a company, service, or product that you've had personal experience with that uses data to improve its customer service. Some examples are local restaurants, health care providers, internet providers, or your favorite smartphone app.
Then, think of a specific customer experience problem this company, service or product might have that you suspect could be addressed with data. This could be something like a restaurant tracking sales of a new product, or internet service providers trying to figure out where outages occur.
Try to avoid broad problems and think of specific issues. A good example of a problem would be that the meal you ordered from a delivery service arrived cold.
Reflection
Consider the company, service, or product you chose in this reflection:
How could it use data to improve customer experience?
What kinds of data would it need to collect?
How could insights from that data solve a problem?
Video 31 supporting careers in data analytics
For any analyst, for any person that's honestly at the early stages of their career, understanding data, respecting data and knowing how to work with data is incredibly important because, my vision is that every role in some form or fashion will involve data and its use in learning how to extract insights from it will be at the core of any critical role across any company organization.
Generally in those first two years, you are developing the core skill sets that make you a fantastic generalist, and then in the next 2-5 years, you're learning about something very specific as it relates to your job. Whether it's the area that you're supporting or maybe a very technical component.
Like, let's say you want to become a SQL expert so that you can manipulate large data sets for financial analysis purposes. Similarly, even if you come into finance as a data analyst, you can pop out of finance and go into what a lot of people like to call the business, which is typically your Operations Functions, and become a business analyst or a data analyst.
There's so many different paths that you can take from the starting point that you really can't predict your end. I'm just deeply passionate about working with and supporting young people and really giving them a jumpstart to their career.
This stems from honestly my own personal experience, where in the first two years of my career, I had essentially zero support from my manager and my direct management chain. Having gone through that experience in my first few years, I realized and experienced how that can slow you down, and especially when you are somebody that has a lot of potential and a lot of ability, you want to be in an environment that fosters that ability and really wants to see you grow.
I think it's incredibly important to have programs like these that take away all the barriers that remove any of the constructs that prevent people from being able to find out what they need to be in an industry like this, to be successful in a role like a data analyst, so that they themselves can dream about where they can go in their career. My name is Tony. I'm a Finance program manager at Google.
Learning Log: Reflect on the data analysis process
Overview
By now, you have started getting familiar with the data analysis process. Now, you’ll complete an entry in your learning log reflecting on your experience with the data analysis process and your progress in this course. By the time you complete this activity, you will have a stronger understanding of how to use the steps of this process to organize data analysis tasks and solve big problems with data. This framework will continue to help guide you through your own work in this course--and as a junior data analyst!
The data analysis process so far
Take a moment to appreciate all the work you have done in this course. You identified a question to answer, and systematically worked your way through the data analysis process to answer that question—just like professional data analysts do every day!
In reviewing the data analysis process so far, you have already performed a lot of these steps. Here are some examples to think about before you begin writing your learning log entry:
You asked an interesting question and defined a problem to solve through data analysis to answer that question.
You thought deeply about what data you would need and how you would collect it in order to prepare for analysis.
You processed your data by organizing and structuring it in a table and then moving it to a spreadsheet.
You analyzed your data by inspecting and scanning it for patterns.
You shared your first data visualization: a bar chart.
Finally, after completing all the other steps, you acted: You reflected on your results, made decisions, and gained insight into your problem--even if that insight was that you didn't have enough data, or that there were no obvious patterns in your data.
As you progress through the rest of the program, you will continue using and revisiting these steps to help guide you through your own analysis tasks. You will also learn more about different tools that can help you along the way!
Access your learning log
To use the template for this course item, click the link below and select Use Template.
Link to learning log template: Reflect on the data analysis process
Reflection
In your learning log, write 2-3 sentences (40-60 words) reflecting on the data analysis process and your experiences so far by answering each of the questions below:
Which part(s) of the data analysis process did you enjoy the most? What did you enjoy about it?
Processing and Analyzing Data are the most enjoyable parts for me. These parts require more practical processes and actions. I like these things that I have to do myself.
What were some of the key ideas you learned in this course?
The key idea I learned in this course is that it is super important to ask good questions. It's like the alpha and omega in data analysis.
Are there concepts or portions of the content that you would like to learn more about? If so, what are they? Which upcoming course do you think would teach you the most about this area?
I would like to learn more about how to effectively visualize data.
Now that you’ve gained experience doing data analysis, how do you feel about becoming a data analyst? Have your feelings changed since you began this course? If so, how?
I became more interested in becoming a data analyst and felt that this field suited me well.
When you’ve finished your entry in the learning log template, make sure to save the document so your response is somewhere accessible. This will help you continue applying data analysis to your everyday life. You will also be able to track your progress and growth as a data analyst.
Learn about data analyst job opportunities
Small businesses everywhere are also starting to take advantage of data driven insights to improve their operations and make better decisions.
"In analytics I feel like the key to success is being able to blend the personal side with the technical side." - Joey, Analytics program manager at REWS
Ways a business uses data
Pick a company, service, or product that you've had personal experience with that uses data to improve its customer service. Some examples are local restaurants, health care providers, internet providers, etc.
Then, think of a specific customer experience problem this company, service or product might have that you suspect could be addressed with data.
This could be something like a restaurant tracking sales of a new product, or internet service providers trying to figure out where outages are occurring. Try to avoid broad problems and think of specific issues. For example, the meal you ordered from a delivery service arrived cold.
In the text box below, write 1-3 sentences (20-60 words) on the company, service or product you chose and how they could use data to improve customer experience.
All sorts of industries can benefit from using data to solve business problems, including improving overall customer experience.
Question 2: Next, we’ll think about the type of data that needs to be collected and how the data will be used. In the example of the restaurant, let’s pretend the problem that will be solved is delivery food arriving cold to customers' homes. More data about the delivery process could help the restaurant streamline the process and deliver food on time.
In the text box below, write 1-3 sentences (20 to 60 words) reflecting on what types of data could be used to resolve the problem referenced in the question above.
Data analytics helps businesses make better decisions, but getting there is a process. It begins with an analysis of a business problem, identifies data that can provide insights about that problem, and then uses data analysis to arrive at an answer. Sometimes you get an answer that solves your business problem, but it’s often just as likely that you discover other questions to investigate further, with more data analysis.
"For any analyst, for any person that's honestly at the early stages of their career, understanding data, respecting data and knowing how to work with data is incredibly important because, my vision is that every role in some form or fashion will involve data and its use in learning how to extract insights from it will be at the core of any critical role across any company organization." - Tony, Finance Program Manager at Google
Roles of a data analyst
Multiple choice exercise
Which industry is it?
Select the industry that matches the example of how an analyst uses data.
Use geographic data to power GPS technology in cars.
Technology relies on software and hardware to function.
Use demographic data to target advertisements for a new consumer product for youths.
Marketing uses audience insights to make decisions.
Use stock market data to determine which portfolios to invest in.
Finance relies on daily market trends for insight.
Use bed occupancy data to determine the number of nurses and orderlies to schedule on a given shift.
Healthcare involves reviewing hospital traffic to inform staff decisions.
Use past booking data to accurately anticipate levels of demand for hotel rooms.
Hospitality looks at seasonal trends to predict demand.
Use population data to determine which communities need federal funding.
Government relies on demographic information in order to provide proper support.
The importance of fair business decisions
V32 the power of data in business
Issue: a topic or subject to investigate
Question: designed to discover information
Problem: an obstacle or complication that needs to be worked out
Business task: the question or problem data analysis answers for business
Data-driven decision-making: using facts to guide business strategy
The simplest way to think about decision making is that it's a choice between consequences: good, bad, or a combination of both.
Data helps us see the whole thing. With data, we have a complete picture of the problem and its causes, which lets us find new and surprising solutions we never would've been able to see before.
Data analytics helps businesses make better decisions. It all starts with a business task and the question it's trying to answer.
"I think one of the most important things to remember about data analytics is that data is data. (...) I found that data acts like a living and breathing thing." - Rachel, Business systems and analytics lead at Verily
V33 data detectives
I think one of the most important things to remember about data analytics is that data is data.
I found that data acts like a living and breathing thing.
The best advice I have for any data analyst starting out is to keep at it. If the angle you're taking doesn't work, try to find another one. Try to come at it in a different way, try to ask different questions, and eventually the data will yield and you'll get the insights you're looking for.
V34 Understanding data and fairness
Fairness: ensuring that your analysis doesn't create or reinforce bias.
Data analysts have another important responsibility: making sure that their analyses are fair.
Data is based on collected facts. How can it be unfair?
In other words as a data analyst, you want to help create systems that are fair and inclusive to everyone.
As a data analyst, it's your responsibility to make sure your analysis is fair, and factors in the complicated social context that could create bias in your conclusions. It's important to think about fairness from the moment you start collecting data for a business task to the time you present your conclusions to your stakeholders.
So far, we've covered the different roles data analysts play in business environments and the kinds of tasks that come with those roles.
But data analysts have another important responsibility: making sure that their analyses are fair. Now, I know what you're probably thinking, data is based on collected facts, how can it be unfair?
Well, that's a good question. Let's learn what fairness means when we talk about data analysis and why it's important for you as an analyst to keep in mind.
Fairness means ensuring that your analysis doesn't create or reinforce bias. In other words, as a data analyst, you want to help create systems that are fair and inclusive to everyone. Sounds simple enough? Well, here's the tough part about fairness in data analytics.
There isn't one standard definition of it, but hopefully the way we've just described it can give you one way to think about fairness for right now, but it's about to get a bit trickier. Sometimes conclusions based on data can be true and unfair.
What can you do then? Well, let's find out with an example. Let's say we have a company that's kind of notorious for being a boys club. There isn't much representation of other genders. This company wants to see which employees are doing well, so they start gathering data on employee performance and their own company culture. The data shows that men are the only people succeeding at this company. Their conclusion? That they should hire more men.
After all, they're doing really well here, right? But that's not a fair conclusion for a couple of reasons. First, it doesn't even consider all of the available data on company culture, so it paints an incomplete picture. Second, it doesn't think about the other surrounding factors that impact the data, or in other words, the conclusion doesn't consider the difficulties that people of different gender identities have trying to navigate a toxic work environment.
If the company only looks at this conclusion, they won't acknowledge and address how harmful their culture is and they won't understand why certain people are set up to fail within it. That's why it's important to keep fairness in mind when analyzing data. The conclusion that only men are succeeding at this company is true, but it ignores other systematic factors that are contributing to this problem. But don't worry, there's a way to make a fair conclusion here.
An ethical data analyst can look at the data gathered and conclude that the company culture is preventing some employees from succeeding, and the company needs to address those problems to boost performance. See how this conclusion paints a much more complete and fair picture. It recognizes the fact that some people aren't doing as well in this company and factors in why that could be instead of discriminating against a huge number of applicants in the future.
As a data analyst it's your responsibility to make sure your analysis is fair and factors in the complicated social context that could create bias in your conclusions. It's important to think about fairness from the moment you start collecting data for a business task to the time you present your conclusions to your stakeholders. We'll learn more about bias in the data analysis process later on in another course. For now, let's check out an example of a data analysis that does a good job of considering fairness in its conclusion.
A team of Harvard data scientists were developing a mobile platform to track patients at risk of cardiovascular disease in an area of the United States called the Stroke Belt. It's important to call out that there were a variety of reasons people living in this area might be more at risk.
With that in mind, these data scientists recognized that fairness needed to be a priority for this project, so they built fairness into their models. The team took several fairness measures to make sure they were being as fair as possible when examining sensitive and potentially biased data. First, they teamed analysts with social scientists who could provide insights on human bias and the social context that created them.
They also collected self-reported data in a separate system to avoid the potential for racial bias, which might skew the results of their study and unfairly represent patients.
To make sure this sample population was representative, they oversampled non-dominant groups to ensure the model was including them. It's clear that the team made fairness a top priority every step of the way.
This helped them collect data and create conclusions that didn't negatively impact the communities they were studying. Hopefully these examples have given you a better idea of what fairness means in data analysis. But we're going to keep building on our understanding of fairness throughout this program and you'll get to practice with some activities.
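The video says the team oversampled non-dominant groups but does not say how. As a rough illustration only, here is a minimal Python sketch of one simple approach, random oversampling with replacement, so that every group ends up as large as the largest one. The `oversample` function, the field names, and the toy patient records are all hypothetical and are not the team's actual method or data.

```python
import random
from collections import Counter


def oversample(records, group_key, seed=0):
    """Resample with replacement so every group is as large as the largest group."""
    rng = random.Random(seed)
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw extra records (with replacement) until the group reaches the target size.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical toy data: group "A" dominates the raw sample 8 to 2.
patients = [{"id": i, "group": "A"} for i in range(8)]
patients += [{"id": 8 + i, "group": "B"} for i in range(2)]

balanced = oversample(patients, "group")
counts = Counter(rec["group"] for rec in balanced)
print(counts)
```

Duplicating records this way does not add new information about the smaller group. It only keeps a model from ignoring that group, which is why the team paired it with other measures such as working with social scientists.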
Self-Reflection: Business cases
Overview
Now that you have explored how businesses use data in the real world, you can pause for a moment and think about what you are learning. In this self-reflection, you will consider fairness and data use in three example business cases and respond to brief questions with your thoughts.
This self-reflection will help you develop insights into your own learning and prepare you to apply your knowledge of fairness practices to scenarios that represent real-life business case studies. As you answer questions—and come up with questions of your own—you will consider concepts, practices, and principles to help refine your understanding and reinforce your learning. You’ve done the hard work, so make sure to get the most out of it: This reflection will help your knowledge stick!
Case Study #1
To improve the effectiveness of its teaching staff, the administration of a high school offered the opportunity for all teachers to participate in a workshop. They were not required to attend; instead, the administration encouraged teachers to sign up. Of the 43 teachers on staff, 19 chose to take the workshop.
At the end of the academic year, the administration collected data on teacher performance for all teachers on staff. The data was collected via student survey. In the survey, students were asked to rank each teacher's effectiveness on a scale of 1 (very poor) to 6 (very good).
The administration compared data on teachers who attended the workshop to data on teachers who did not. The comparison revealed that teachers who attended the workshop had an average score of 4.95, while teachers who did not attend had an average score of 4.22. The administration concluded that the workshop was a success.
Reflection
Are there examples of fair or unfair practices in the above case? If there are unfair practices, how could a data analyst correct them?
In the text box below, write 3-5 sentences (60-100 words) answering these questions.
This is an example of an unfair practice. It is tempting to conclude, as the administration did, that the workshop was a success. However, because attendance was voluntary rather than randomly assigned, the comparison cannot establish that attending the workshop caused the higher ratings.
It is possible that the workshop was effective, but other explanations for the differences in the ratings cannot be ruled out. For example, the teachers who volunteered for the workshop may already have been the better, more motivated teachers. This group of teachers would be rated higher whether or not the workshop was effective.
It’s also worth noting that there is no direct connection between student survey responses and the attendance of the workshop, so this data isn’t actually useful. The data analyst could correct this by asking for the teachers to be selected randomly to participate in the workshop, and by adjusting the data they collect to measure something more directly related to workshop attendance, like the success of a technique they learned in that workshop.
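The random selection suggested above can be sketched in a few lines of Python. This is only an illustration under assumptions: the teacher IDs 1 to 43 and the `assign_randomly` helper are made up for the example, and the group sizes simply reuse the 19 and 24 from the case study.

```python
import random


def assign_randomly(teacher_ids, workshop_size, seed=None):
    """Randomly pick who attends the workshop; everyone else is the comparison group."""
    rng = random.Random(seed)
    attendees = set(rng.sample(teacher_ids, workshop_size))
    comparison = [t for t in teacher_ids if t not in attendees]
    return sorted(attendees), comparison


teachers = list(range(1, 44))  # hypothetical IDs for the 43 teachers on staff
workshop, comparison = assign_randomly(teachers, 19, seed=42)
print(len(workshop), len(comparison))  # 19 24
```

Because chance decides who attends, motivation and prior skill should be spread roughly evenly across both groups, so a later difference in ratings is easier to attribute to the workshop itself.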
Case Study #2
A self-driving car prototype is going to be tested on its driving abilities. The test is carried out on various types of roadways — specifically a race track, trail track, and dirt road.
The prototype is only being tested during the daytime. The data collected includes sensor data from the car during the drives, as well as video of the drive from cameras on the car.
The results of the initial tests illustrate that the new self-driving car met the performance standards across each of the different tracks and will progress to the next phase of testing, which will include driving in different weather conditions.
Reflection
Are there examples of fair or unfair practices in the above case? If there are unfair practices, how could a data analyst correct them?
In the text box below, write 3-5 sentences (60-100 words) answering these questions.
This case study shows an unfair practice. While the prototype is being tested on three different tracks, it is only being tested during the day. Conditions on each track may be very different during the day and at night, and this could change the results significantly. The data analyst should correct this by asking the test team to add night-time testing to get a full view of how the prototype performs on the tracks at any time of day.
Case Study #3
An amusement park is trying to determine what kinds of new rides visitors would be most excited for the park to build. In order to understand their visitors’ interests, the park develops a survey. They decide to distribute the survey by the roller coasters because the lines are long enough that visitors will have time to fully answer all of the questions. After collecting this survey data, they find that most visitors apparently want more roller coasters at the park.
Reflection
Are there examples of fair or unfair practices in the above case? If there are unfair practices, how could a data analyst correct them?
In the text box below, write 3-5 sentences (60-100 words) answering these questions.
This case study contains an unfair practice. While the decision to distribute surveys in places where visitors would have time to respond makes sense, it accidentally introduces sampling bias. Because the only respondents to the survey are people waiting in line for the roller coasters, the results are unfairly biased towards roller coasters. A data analyst could reduce sampling bias by distributing the survey at the entrance and exit of the amusement park to avoid targeting roller coaster fans.
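To see how sampling bias like this can distort a result, here is a small Python simulation. Every number in it is an assumption chosen for illustration (1,000 visitors, 30% of whom want more roller coasters, and fans being far more likely to be standing in a coaster line). None of it comes from the case study.

```python
import random

rng = random.Random(7)

# Hypothetical park population: 300 of 1,000 visitors want more roller coasters.
visitors = []
for i in range(1000):
    likes_coasters = i < 300
    # Assumption: coaster fans are much more likely to be found in a coaster line.
    in_coaster_line = rng.random() < (0.8 if likes_coasters else 0.1)
    visitors.append({"likes_coasters": likes_coasters, "in_coaster_line": in_coaster_line})


def share_wanting_coasters(sample):
    """Fraction of surveyed visitors who want more roller coasters."""
    return sum(v["likes_coasters"] for v in sample) / len(sample)


# Survey handed out only in the roller coaster lines (biased sample).
line_sample = [v for v in visitors if v["in_coaster_line"]]
# Survey handed out to a random 200 visitors at the entrance (closer to representative).
gate_sample = rng.sample(visitors, 200)

print("Everyone:      ", share_wanting_coasters(visitors))
print("Coaster lines: ", round(share_wanting_coasters(line_sample), 2))
print("Park entrance: ", round(share_wanting_coasters(gate_sample), 2))
```

Under these assumptions the coaster-line survey reports a much higher share of roller coaster fans than the park as a whole, while the entrance sample stays close to the true share.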
"How do we actually improve the lives of people by using data? (...) I think aspiring data analysts need to keep in mind that a lot of the data that you're going to encounter is data that comes from people so at the end of the day, data are people." - Alex, Research scientist at Google
Case opinion
Recently, you were presented with cases about data analytics in the real world. One case involved an unfair conclusion about the performance of women who worked at a business. It demonstrated that data can sometimes be true, yet unfair. In addition, it highlighted the importance of asking, "Why?" when reviewing the results of data analysis.
Another example involved data analysts prioritizing fairness and going out of their way to ensure their data was as fair as possible. Because they were working with sensitive and potentially biased health data, they chose to collaborate with social scientists in order to better understand the social context behind that data.
If you need to, return to the video to refresh your understanding of the examples before you continue. Then, discuss the first case and how the analysts at that company could improve their process:
What could they have done differently to be fairer in their analysis?
What could have made their conclusion less biased?
Submit two or more paragraphs (100-200 words total). Then, visit the discussion forum to read what other learners have written, and respond to at least two of them with your own thoughts.
Video 35 fair and ethical data decision
Hi, I'm Alex. I'm a research scientist at Google. My team is called the Ethical AI team. We're a group of folks who are concerned not only about how AI the technology operates, but also about how it interacts with society and how it might help or harm marginalized communities.
When we talk about data ethics, we think about what is the good and right way of using data?
What are going to be ways that uses of data are going to be beneficial to people? When it comes to data ethics, it's not just about minimizing harm but it's actually this concept of beneficence.
How do we actually improve the lives of people by using data? When we think about data ethics we're thinking about, who's collecting the data? Why are they collecting it?
How are they collecting it and for what purpose? Because of the way that organizations have imperatives to make money or to report to somebody or provide some analysis, we also have to keep strongly in mind how this is actually going to benefit people at the end of the day.
Are the people represented in this data going to be benefited by this? I think that's the thing you never want to lose sight of as a data scientist or a data analyst. I think aspiring data analysts need to keep in mind that a lot of the data that you're going to encounter is data that comes from people so at the end of the day, data are people.
You want to have a responsibility to those people that are represented in those data. Second, is thinking about how to keep aspects of their data protected and private.
We don't want to go through our practice thinking about data instances as something we can just throw on the web. No, there needs to be considerations about how to keep that information, and likenesses like their images, or their voices, or their text.
How do we keep that private? We also need to think about how we can have mechanisms of giving users and giving consumers more control over their data. It's not going to be sufficient just to say, we collect all this data and trust us with all these data.
But we need to ensure that there's actionable ways in which people can consent to giving those data, and ways that they can ask for it to be revoked or removed.
Data's growing and at the same time, we need to empower people to have control over their own data.
The future is that data is always growing, we haven't seen any evidence that data is actually shrinking. With the knowledge that data's growing, these issues become more and more piqued, and more and more important to think about.
Video 36 Data analysts in different industries
There are a lot of important factors to think about when searching for your dream job.
Let’s talk about some of the most common factors first: industry, tools, location, travel, and culture.
Data is already being used by countless industries in all kinds of different ways, tech, marketing, finance, health care, the list goes on.
Every industry has specific data needs that have to be addressed differently by their data analysts.
The same revenue data can be used in three different ways by data analysts in three different industries: financial services, telecom, and tech. For example, a financial analyst at a bank uses public revenue data from Telecom company X to create a forecast that predicts where revenues will be in the future and to recommend a stock price.
The business analyst at Telecom company X uses that same data to advise the sales team.
Then a data analyst at the company that created a customer management tool for Telecom company X will use that revenue data to determine how efficiently the software is performing. Finance, telecom, and tech all use data differently, so they need analysts who have different skills.
The key is to think about your interests early in your job search. That'll lead you in the right direction, and it will help you in interviews too. Potential employers will want to know why you're interested in their company, and how you can address their needs, so if you can speak about your motivation to work in data analytics during interviews, you'll make yourself stand out in a great way.
Data analyst roles and job descriptions
As technology continues to advance, being able to collect and analyze the data from that new technology has become a huge competitive advantage for a lot of businesses. Everything from websites to social media feeds is filled with fascinating data that, when analyzed and used correctly, can help inform business decisions.
A company’s ability to thrive now often depends on how well it can leverage data, apply analytics, and implement new technologies.
This is why skilled data analysts are some of the most sought-after professionals in the world. A study conducted by IBM estimates that companies in the United States will fill 2,720,000 Data Science and Analytics jobs by 2020.
Because the demand is so strong, you’ll be able to find job opportunities in virtually any industry. Do a quick search on any major job site and you’ll notice that every type of business, from zoos to health clinics to banks, is seeking talented data professionals.
Even if the job title doesn’t use the exact term “data analyst,” the job description for most roles involving data analysis will likely include a lot of the skills and qualifications you’ll gain by the end of this program. In this reading, we’ll explore some of the data analyst-related roles you might find in different companies and industries.
“The Quant Crunch: How the Demand for Data Science Skills is Disrupting the Job Market,” by Will Markow, Soumya Braganza, and Bledi Taska, with Steven M. Miller and Debbie Hughes. https://www.ibm.com/downloads/cas/3RL3VXGA
Decoding the job description
The data analyst role is one of many job titles that contain the word “analyst.”
To name a few others that sound similar but may not be the same role:
Business analyst — analyzes data to help businesses improve processes, products, or services
Business intelligence analyst — analyzes data for finance or market insights
Data analytics consultant — analyzes the systems and models for using data
Data engineer — prepares and integrates data from different sources for analytical use
Data scientist — uses expert skills in technology and social science to find trends through data analysis
Data specialist — organizes or converts data for use in databases or software systems
Operations analyst — analyzes data to assess the performance of business operations and workflows
Data analysts, data scientists, and data specialists sound very similar but focus on different tasks. As you start to browse job listings online, you might notice that companies’ job descriptions seem to combine these roles or look for candidates who may have overlapping skills.
The fact that companies often blur the lines between them means that you should take special care when reading the job descriptions and the skills required.
The table below illustrates some of the overlap and distinctions between them:
Table: Decoding the job description
Data analysts:
Problem solving: Use existing tools and methods to solve problems with existing types of data
Analysis: Analyze collected data to help stakeholders make better decisions
Other relevant skills: Database queries, data visualization, dashboards, reports, and spreadsheets
Data scientists:
Problem solving: Invent new tools and models, ask open-ended questions, and collect new types of data
Analysis: Analyze and interpret complex data to make business predictions
Other relevant skills: Advanced statistics, machine learning, deep learning, data optimization, and programming
Data specialists:
Problem solving: Use in-depth knowledge of databases as a tool to solve problems and manage data
Analysis: Organize large volumes of data for use in data analytics or business operations
Other relevant skills: Data manipulation, information security, data models, scalability of data, and disaster recovery
We used the role of data specialist as one example of many specializations within data analytics, but you don’t have to become a data specialist! Specializations can take a number of different turns. For example, you could specialize in developing data visualizations and likewise go very deep into that area.
Job specializations by industry
We learned that the data specialist role concentrates on in-depth knowledge of databases. In similar fashion, other specialist roles for data analysts can focus on in-depth knowledge of specific industries. For example, in a job as a business analyst you might wear some different hats than in a more general position as a data analyst. As a business analyst, you would likely collaborate with managers, share your data findings, and maybe explain how a small change in the company’s project management system could save the company 3% each quarter. Although you would still be working with data all the time, you would focus on using the data to improve business operations, efficiencies, or the bottom line.
Other industry-specific specialist positions that you might come across in your data analyst job search include:
Marketing analyst — analyzes market conditions to assess the potential sales of products and services
HR/payroll analyst — analyzes payroll data for inefficiencies and errors
Financial analyst — analyzes financial status by collecting, monitoring, and reviewing data
Risk analyst — analyzes financial documents, economic conditions, and client data to help companies determine the level of risk involved in making a particular business decision
Healthcare analyst — analyzes medical data to improve the business aspect of hospitals and medical facilities
Key takeaway
Explore data analyst job descriptions and industry-specific analyst roles. You will start to get a better sense of the different data analyst jobs out there and which types of roles you’re most interested in pursuing.
Video 37 Samah: Interview best practices
"Think about a time when you've used data to solve a problem, whether it's in your professional or personal projects.
Increase your professional network. It's really important to have your LinkedIn updated along with websites like GitHub, where you can showcase a lot of the data analyst projects you've done.
Prepare questions for the interviewer. Make sure they're not broad questions.
They should be questions that will help you understand the team and the job better.
If you're given a case study in an interview, you should expect to be given a business problem along with the sample data set. Then you'd be asked to take that sample data set, analyze it, and come up with a solution.
One of the things you can do to help prepare yourself for this is to ensure you are analyzing the data and coming up with a solution that relates back to that data. Sometimes there is no right answer, and a lot of times interviewers are looking to see your thought process and the way you get to your solution.
I highly encourage that if you find a role that you're interested in, not only apply to it, but go the next step. Look for the recruiter. Look for the hiring manager online.
See if you can reach out to them and set up a coffee chat or send them your resume directly. When you reach out directly to a hiring manager or recruiter, it really shows your eagerness for the role and your interest in it. Even if you sometimes don't get a response from reaching out, you never know, so try multiple times.
That one time you get a response back from a recruiter or hiring manager, could be the time you get the job that you really wanted."
Video 38 weekly wrap
Video 39 course wrap up