INTRODUCTION TO DATA SCIENCE

by techgirl

Data science is the practice of collecting, organizing, and analyzing data in order to draw conclusions from it. The #Code4employment program covers the Microsoft Virtual Academy course Introduction to Data Science, which I completed.

Sub-topics covered:

  • What is data science?
  • Data organization
  • Data visualization
  • Data analysis
  • Becoming a data scientist
  • Using Excel to support data analysis

Data organization

The course breaks data down into three levels:

Level 1 - This data is human-readable. An individual can read and scroll through it.

Level 2 - This data may not be human-readable, but it is still understandable. As the data scientist you are bringing in multiple tables and might use tools such as Power BI or Tableau. You will also start using a programming language such as Python (with pandas or Spark), R, or Java. There is also a need to mash up data, which relies on your intuition about how to draw certain conclusions from the data.
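
As a rough sketch of what mashing up multiple tables can look like in Python with pandas: the two tables below (customers and orders), their column names, and their values are all made up for illustration and do not come from the course.

```python
import pandas as pd

# Hypothetical tables, invented for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Amina", "Brian", "Chen"],
})
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "customer_id": [1, 1, 2, 4],
    "amount": [250.0, 100.0, 75.5, 40.0],
})

# Join the two tables on their shared key. An inner join keeps only
# customers with at least one order and orders with a known customer.
merged = customers.merge(orders, on="customer_id", how="inner")

# Total spend per customer, as a plain dictionary.
spend = merged.groupby("name")["amount"].sum().to_dict()
print(spend)
```

Choosing an inner join here is itself one of those intuition calls: a left join would keep customers with no orders, which may or may not be what the question needs.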

Level 3 - This data is non-uniform and comes in huge amounts, which leads to using distributed systems since one machine might not be able to handle it. Programming languages used are Python, R, and Scala. Tools used to deal with big data sets include Hadoop, Pig, and Hive.

Data visualization

Data visualization mainly serves two purposes: understanding the data and explaining results to other people. Its three levels are:

Level 1 - Involves using tools such as Excel and Power BI to generate classic data presentations such as histograms and trend lines, applying basic statistics and visual graphics. Programming languages such as Python and R are also used to visualize the data.
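
To make the Level 1 ideas concrete, here is a small sketch in plain Python that computes basic statistics, a text histogram, and a least-squares trend line. The sales figures and the bin width of 5 are assumptions made up for this example; in practice you would likely use Excel or a plotting library instead.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical monthly sales figures, invented for illustration.
sales = [12, 15, 14, 18, 21, 20, 24, 27]

# Basic statistics.
avg = mean(sales)
mid = median(sales)

# Histogram: count how many values fall into each bin of width 5
# (10-14, 15-19, 20-24, ...), keyed by the lower edge of the bin.
bin_width = 5
histogram = Counter((value // bin_width) * bin_width for value in sales)

# Trend line: ordinary least-squares fit of y = slope * x + intercept,
# where x is the month index 0, 1, 2, ...
xs = range(len(sales))
x_mean = mean(xs)
slope = sum((x - x_mean) * (y - avg) for x, y in zip(xs, sales)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = avg - slope * x_mean

# Print the histogram as rows of '#' characters, then the summary.
for start in sorted(histogram):
    print(f"{start:>2}-{start + bin_width - 1}: {'#' * histogram[start]}")
print(f"mean={avg}, median={mid}, trend: y = {slope:.2f}x + {intercept:.2f}")
```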

Level 2 - Involves using different data types. For example, the people you message frequently are clustered closer to you than those you rarely message, and you can start building graphics that represent that. At this level the graphics become designs and decision-guiding graphics. Tools used include Power BI and Excel, and programming tools such as Python, D3.js, or R.

Level 3 - Involves the ability to present the architecture of experimental designs and how you are going to analyze the data. It also has a machine learning component, where the machine runs several of these experiments and tests the hypotheses. It also involves explaining to the user how you are setting up the parameters the machine learns with, so that it can analyze the data for you.

Data analysis

Levels of data analysis:

Level 1 - Uses classical graphics and tools such as Excel. This is mostly practiced on data that is not bulky.

Level 2 - This is mainly performed on big data and involves computer programming skills in languages such as Python and R. Machine learning is also applied at this level. Much of the programming applies statistical methods to the data.
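
One example of a statistical method applied through programming is the Pearson correlation coefficient, which measures how strongly two variables move together. The sketch below is my own illustration, not something from the course; the `pearson` helper and the study-hours data are hypothetical.

```python
from math import sqrt
from statistics import mean

# Hypothetical observations, invented for illustration.
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    x_mean, y_mean = mean(xs), mean(ys)
    # How the two variables vary together around their means.
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    x_spread = sqrt(sum((x - x_mean) ** 2 for x in xs))
    y_spread = sqrt(sum((y - y_mean) ** 2 for y in ys))
    return covariance / (x_spread * y_spread)


r = pearson(hours_studied, exam_score)
print(f"correlation = {r:.3f}")
```

A value close to 1 means the two variables rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is no clear linear relationship.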

Level 3 - Mainly involves attacking data with computer programming as well as using neural networks. The data scientist uses tools such as TensorFlow or develops their own tools to analyze the data.
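
To give a feel for what sits underneath a neural network, here is a minimal sketch of a single artificial neuron (a perceptron) learning the logical AND function in plain Python. This is a toy of my own, not how TensorFlow works internally; the learning rate, the number of passes, and the AND data set are assumptions chosen to keep the example tiny.

```python
# Training data for logical AND: two inputs, one expected output.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

# The neuron's parameters: one weight per input plus a bias.
weights = [0.0, 0.0]
bias = 0.0
learning_rate = 1.0  # assumed value, chosen for simplicity


def predict(x):
    """Fire (return 1) when the weighted sum of the inputs is positive."""
    total = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if total > 0 else 0


# Perceptron learning rule: nudge the parameters whenever a prediction
# is wrong. Twenty passes is an assumed budget, ample for this data.
for _ in range(20):
    for x, target in zip(inputs, targets):
        error = target - predict(x)
        weights[0] += learning_rate * error * x[0]
        weights[1] += learning_rate * error * x[1]
        bias += learning_rate * error

predictions = [predict(x) for x in inputs]
print(predictions)
```

Real neural networks stack many such units in layers and train them with gradient-based methods, which is the part that tools like TensorFlow take care of.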

Becoming a data scientist

Using Excel to support data analysis

As an example, I downloaded a data set from Kaggle, imported it into Excel, and did some analysis on it.

Data science is very interesting, and I intend to learn and practice more. Head over to #code4employment and learn something as well.
