The job title “Data Scientist” is becoming the most alluring career opportunity of the 21st century. Over the last few years, companies have noticeably increased both the amount of data they generate and the amount they retain.
So, who is a data scientist? A data scientist is the expert who extracts the desired information from collected data and figures out just what can be done with it. According to a recent KPMG survey of C-suite executives, enterprise data is likely to exceed 240 exabytes per day by 2020, which will create a need for data scientists with the skills to extract valuable insights from this data.
In the near future, the United States alone will need to hire between 140,000 and 190,000 data scientists to cope with the new data economy. Before we jump into the must-have characteristics of a data scientist, it is important to understand what data science is.
Data science is a multidisciplinary field that combines algorithm development, data inference and technology to solve analytically complex problems. At the core of data science is the data itself. Data science is all about using this data in creative ways to develop business value as shown below:
The different data science roles in industry today:
Data science roles are varied, and the skills required for each differ significantly. Here are the different data science roles, along with the skill sets, technical knowledge and approaches each one requires.
- 1. Data Scientist: A data scientist works with the mindset of a curious data wizard. He/she handles raw data, analyzes it using statistical techniques, and shares the resulting insights with peers in a convincing way.
- 2. Data Analyst: Just as for the data scientist, the skills required for this role are diverse and cover the entire spectrum of the data science process, along with a healthy “figure-it-out” outlook.
- 3. Data Architect: With the rise of big data, the data architect’s job is growing in significance. He/she creates the blueprints for data management systems that integrate, centralize, protect and maintain data sources. A data architect always stays on top of every new innovation in the company.
- 4. Data Engineer: The data engineer is the jack of all trades. He/she deals with databases and large-scale processing systems, and has to be comfortable with both statistical programming languages and languages oriented more towards web development.
- 5. Data Statistician: A data statistician draws useful insights from data. Using statistical theories, methodologies and logic, he/she harvests data and turns it into information or knowledge. He/she handles all sorts of data.
- 6. Database Administrator: A database administrator (DBA) makes sure that the database is available to all relevant users, performs suitably and is kept safe. A DBA makes sure that all backup and recovery systems are in place and secured, and keeps track of the technologies that support them.
- 7. Business Analyst: The business analyst’s role is a bit different and less technical. He/she links data insights to actionable business insights and spreads the message across the entire organization. He/she is an intermediary between the business guys and the techies.
- 8. Data and Analytics Manager: A data and analytics manager sets the right priorities in any industry, combining strong technical skills with the social skills needed to manage a team.
Before going further, let’s have a look at data science job postings to learn more about the differences between the various roles in the industry.
Let’s have a quick look at the average salaries displayed for each role.

Qualities that a data scientist must have:
A data scientist must:
- 1. Work on data with a mathematical mindset. Skills like machine learning, data analysis, data mining and statistics are essential. A data scientist is expected to interpret and represent data mathematically.
- 2. Know and use a common language to access, investigate and model data. He/she must know a statistical programming language such as R, Python or MATLAB, and a database query language such as SQL. Data extraction, hypothesis testing and exploration are key aspects of data science practice.
- 3. Have a strong computer science or software engineering background. He/she must be comfortable working with Java, C++, algorithms and Hadoop. These skills are used to architect systems that leverage data.
- 4. Be able to visualize and communicate data. This is an essential quality, especially at companies that make data-driven decisions. Communicating means describing your findings or work to both technical and non-technical audiences. Data visualization tools (like ggplot and d3.js) help with the visual side.
- 5. Think like a (data-driven) problem solver. It’s important to keep asking: Which things are important and which aren’t? How should you interact with engineers and product managers? Which methods should be used? When are approximations good enough?
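To make the "hypothesis testing" skill from the list above a little more concrete, here is a minimal Python sketch that compares two small samples using Welch's t statistic. The data, the variable names (`old_page`, `new_page`) and the helper `welch_t` are all made up for illustration; a real analysis would more likely use a statistics library and would also report a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance() is the sample variance (divides by n - 1).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical data: minutes spent on two versions of a web page.
old_page = [12.0, 15.0, 11.0, 14.0, 13.0]
new_page = [16.0, 18.0, 15.0, 17.0, 19.0]

t_stat = welch_t(new_page, old_page)
print(f"mean (old): {statistics.mean(old_page):.1f}")
print(f"mean (new): {statistics.mean(new_page):.1f}")
print(f"Welch t statistic: {t_stat:.2f}")
```

The larger the absolute value of the statistic, the stronger the evidence that the two groups really differ. Deciding whether it is "large enough" is exactly the kind of judgment call described in point 5.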