Taxonomy of Data Professionals: Find the Right One for Your Business
The authors would like to thank Oleg Kapustin, Data Management Solutions Architect, and Nikolay Sokolovskiy, Business Analyst at DataArt, for providing valuable suggestions and comments.
Managing data is good; drawing insights from it is even better. To turn a business into a data-driven one, it is not enough to simply collect, store, and analyze data. The business of the world’s leading companies, including Google, Facebook, and Amazon, relies, among other things, on processing data and planning a long-term analytics strategy. According to recent research by Forrester, data-driven organizations “harness digital insights to optimize products, services, and operations and will grow at least eight times faster than global GDP, or 27% annually.” According to McKinsey & Company, data-driven organizations are 23 times more likely to acquire customers, six times as likely to retain those customers, and 19 times as likely to be profitable compared to other companies.
The skills and expertise of data specialists – data architects, (big) data engineers, Data QA and DevOps, data analysts, BI engineers, and data scientists – can be critical in transforming traditional businesses into data-driven ones. No wonder the demand for such professionals at all levels and in all industries keeps rising. Data scientist topped the Glassdoor rating of the best jobs in America for four consecutive years (2016–2019); in 2020, it remained a thriving role, ranked #3. The global big-data-as-a-service market is also growing year after year, which keeps data specialists in great demand.
DataArt has many years of experience matching data specialists with specific business requirements and needs. In this post, we’ll explore the roles and key responsibilities of various data specialists, including a data architect, a data engineer, a data analyst, and a data scientist.
“Unicorn” Data Specialists
The misconception that data specialists are interchangeable is common. They are not, despite a certain overlap in their skills and responsibilities. There is no convention on titles and skill sets, so businesses often fail to describe the exact tasks they expect data specialists to perform. As a result, some companies end up hiring jacks-of-all-trades instead of niche specialists.
At DataArt, we know how to find the right data specialist for every client’s need. We created a unique, well-defined map with principal skills, competencies, and responsibilities for each data-related job profile. We use it to staff the teams depending on the client’s business objectives, long-term data and analytics strategy, and expected outcomes.
Hire the Right Data Specialist for Your Project
How to Find the Right Data Professional for Your Business Needs?
As an employer, you are more likely to find good-fit candidates if you accurately describe your business goals and hire for tomorrow’s needs, not just today’s. So, start by defining the long-term data governance strategy for your company, then narrow it down to specific roles and titles.
Below, you will find a brief description of the core tasks and competencies of some data specialists, along with brief case studies.
Data Architect’s Job Description
A data architect is a specialist who takes the lead in the organization's data governance strategy. He/she communicates with business stakeholders and defines their data-related needs and requirements. Based on these, a data architect designs a data model that satisfies existing business requirements and accommodates those that may appear in the future.
What Do Data Architects Do?
The work of a data architect begins before the actual data is ingested, processed, and analyzed. The key task you can expect a data architect to perform is assessing business requirements, internal and external data sources, and existing infrastructure in order to design a system where all pieces of data are stored, integrated, and protected. A data architect creates and maintains a company-wide repository of data architecture artifacts, procedures, and an inventory of data used to implement the architecture. In addition, he/she ensures the company does not store “dark data,” that is, data that is not used for deriving insights.
By working closely with business stakeholders, a data architect compiles a so-called “data dictionary” – a glossary with clear, unambiguous, and agreed-upon definitions of analytical metrics. He/she also guides the rest of the data specialists in the team by framing company-wide analytics strategy and data-related activities. This way, business stakeholders and other data specialists access clean and reliable data at scale for their analytical needs.
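To make the idea of a data dictionary more concrete, here is a minimal Python sketch. It is an illustration only, not a description of any tool DataArt uses: the class names, the fields, and the "Churn rate" entry are all assumptions made up for this example. The point it shows is that a metric gets one agreed-upon definition, and a conflicting one is rejected.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MetricDefinition:
    """One agreed-upon entry in a data dictionary (fields are illustrative)."""
    name: str
    definition: str
    formula: str
    owner: str


class DataDictionary:
    """A glossary that refuses conflicting definitions of the same metric."""

    def __init__(self) -> None:
        self._entries: Dict[str, MetricDefinition] = {}

    def register(self, entry: MetricDefinition) -> None:
        key = entry.name.lower()  # metric names are matched case-insensitively
        existing = self._entries.get(key)
        if existing is not None and existing != entry:
            raise ValueError(f"Conflicting definition for metric '{entry.name}'")
        self._entries[key] = entry

    def lookup(self, name: str) -> MetricDefinition:
        return self._entries[name.lower()]


# Hypothetical entry: the metric, its wording, and its owner are made up.
dictionary = DataDictionary()
dictionary.register(MetricDefinition(
    name="Churn rate",
    definition="Share of customers active at period start who left by period end.",
    formula="customers_lost / customers_at_start",
    owner="Customer Analytics",
))
```

In practice, such definitions usually live in a data catalog or governance tool rather than in code; the sketch only shows the "one metric, one meaning" rule.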
Case Study: Data Architect’s Role in a Project
DataArt recently completed the first phase of a large digital transformation project for a global educational organization in the financial and investment domains. Their business model used to rely solely on offline education and certification. When the pandemic hit, their entire operations were brought to a standstill. Nothing but a full digital transformation could save the institution from collapse. Modernizing a legacy platform for data management and analytics was the first stage in this transformation. As part of the discovery phase, DataArt’s team interviewed 50+ stakeholders in the company to classify their data-related pain points and suggest a road map that would bring the data platform to the desired state.
DataArt’s team, led by an experienced Data Management Architect, audited the client’s existing data architecture, identified gaps and inefficiencies in it, and recommended changes. These changes concerned key guiding principles of data management, such as data modeling, data integration, data storage, data quality, security and compliance. The architect also advised on critical improvements for the company’s data governance strategy. He analyzed and described key metadata assets from both technical and business standpoints, laying the foundation for the data dictionary and the data catalog.
So far, DataArt has helped the client take the first essential step towards digitizing the business, which will save it from downfall in the “new normal” reality.
Data Engineer's Job Description
The Dice 2020 Tech Job Report labeled data engineer the fastest-growing job category in technology in 2019, with a 50% year-over-year growth in the number of open positions. Data engineers form a critical part of any high-functioning data team. They build platforms where data is prepared for operational or analytical purposes, with a focus on data format, resilience, and security.
What Do Data Engineers Do?
Data engineers manage and optimize the company’s data infrastructure and design data pipelines and streams that pump data into dedicated storage. They oversee how the data is collected (either by choosing an off-the-shelf ingestion pipeline or by building and maintaining a custom one) and stored, and make the need-to-have data accessible to different business users within a company. They also help data scientists mine the data and implement machine learning models, which help discover hidden patterns and predict future tendencies. Data engineers are responsible for sorting and cataloging this data and tagging datasets with metadata.
Counterintuitive as it may appear, data engineers these days do not necessarily have programming skills, though some of them know Python, the programming language most commonly used in the analytics realm, or other programming languages. Data engineers are well versed in data wrangling (especially for big data), SQL and NoSQL databases, cloud technologies, and distributed storage and processing tools.
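The collect, clean, and catalog steps described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production pipeline: the function names (`ingest`, `clean`, `catalog`), the sample CSV, and the source file name are all hypothetical, and a real platform would rely on dedicated ingestion and workflow tools.

```python
import csv
import io
from datetime import datetime, timezone

# Made-up source data; the second order has a missing amount.
RAW_CSV = "order_id,amount\n1,19.99\n2,\n3,5.00\n"


def ingest(raw_text):
    """Collect: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Prepare: drop rows without an amount and convert fields to proper types."""
    return [
        {"order_id": int(row["order_id"]), "amount": float(row["amount"])}
        for row in rows
        if row["amount"]
    ]


def catalog(name, rows, source):
    """Catalog: build a metadata record that tags the dataset."""
    return {
        "dataset": name,
        "source": source,
        "row_count": len(rows),
        "columns": sorted(rows[0]) if rows else [],
        "loaded_at": datetime.now(timezone.utc).isoformat(),
    }


orders = clean(ingest(RAW_CSV))
metadata = catalog("orders_clean", orders, source="hypothetical_shop_export.csv")
```

The metadata record is what lets other business users find the dataset later and judge whether it fits their needs.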
Case Study: Data Engineers’ Role in a Project
DataArt was engaged by a financial startup that sought to reimagine banking services in the digital space. We helped the client to design and implement a powerful custom platform driven by data and data science.
The platform implemented by DataArt’s Data Engineers consisted of a data lake and multiple data pipelines managed automatically by the workflow engine. The data processing engine (including batch and near real-time processing) provided tools and APIs for accessing and analyzing the data. New data sources were introduced, while the processing of existing ones was made faster.
The platform gave quantitative analysts and traders (or, in financial jargon, quants) convenient mechanisms to access the data. Using the platform, quants improved existing trading strategies and algorithms and developed new, more accurate ones. Performance of some quants increased threefold, which generated more revenue for the client. Find out more about the platform in this case study.
The system took the company’s analytical capabilities to a new level, aiding in its strategic decision-making process.
Data Analyst's Job Description
Data analysts form a subspecies of analysts, whose focus is more on the technical parts of the company’s data (data itself), rather than analytics for strategic planning (data-driven business decisions). Data analysts help the business to answer tactical questions and, by applying statistical methods of data analysis, discover opportunities for future growth.
Data analysts, sometimes called big data analysts, do not have to know machine learning (of all the roles described in this post, this knowledge is critical only to data scientists), but they certainly require a profound knowledge of statistics. Basic programming skills (Python, R, SAS) are a plus. Other competencies of a data analyst are proficiency in Microsoft Excel, SQL and NoSQL databases, and common visualization tools and techniques.
What Do Data Analysts Do?
Common tasks of data analysts are usually in line with what data scientists do: cleaning data, performing exploratory data analysis (EDA), testing hypotheses, uncovering patterns, and (sometimes) visualizing the data. However, the numerical datasets that analysts work with are well-defined, so they are used to help the business answer more tangible, tactical questions: why sales decreased in a certain region, why a promo campaign failed, and so forth. As opposed to data scientists, whose main tasks are to predict trends and strategically guide the business, data analysts recognize existing data patterns in past and current reports and visualize them in support of business decisions that have already been made. Data analysts may also advise different teams across the organization as to which datasets will prove most valuable for them.
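As an illustration of hypothesis testing on a tactical question such as "did sales in a region really decrease?", here is a minimal Python sketch. The weekly sales figures are made up, and the helper name `welch_t` is an assumption for this example. It computes Welch's t-statistic, one common way to compare the means of two samples; an analyst would normally use a statistics package that also reports a p-value.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for the difference between two sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Made-up weekly sales (in thousands) for one region, before and after a change.
sales_before = [120, 132, 125, 128, 130, 127]
sales_after = [110, 115, 108, 117, 112, 111]

t_statistic = welch_t(sales_before, sales_after)
```

A large positive value suggests the drop is unlikely to be random week-to-week noise, which tells the analyst the decrease is worth investigating further.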
Case Study: Data Analyst’s Role in a Project
A large private equity firm engaged DataArt to improve its strategic reporting and data analytics capability. After resolving issues with data governance, data quality, and data integrity in the company’s data warehouse (DWH) and its BI and reporting tools, DataArt methodically transformed the design of these systems into a service-oriented, metadata-driven, layered one.
These improvements significantly reduced the time required for data discovery and integration. The data sourced from the DWH was now fully trusted and easily understood, so business analysts used it to test hypotheses and advise business stakeholders on the most efficient approaches to investment management, fund monitoring, and external communications.
Data Scientist's Job Description
Data science helps decision makers swim a masterful breaststroke in a lane of valuable data insights, instead of doggy-paddling helplessly in some random data pools. Jobs in data science are proverbially called “the sexiest of the 21st century.” No wonder – a data scientist is the person responsible for drawing insights from your company’s existing data, discovering trends and patterns, and turning them into actionable items for business stakeholders. He/she helps your company to better understand itself and its client base.
Unlike a data engineer or a data analyst, a data scientist directly participates in the data-driven decision-making process and suggests implications for new business decisions, directions, products, and processes. To be able to do this, a data scientist constructs models to automatically scrape the datasets and algorithms to analyze them. For these models and algorithms to be functional, a data scientist should have excellent knowledge of programming (most commonly, Python and R), big data frameworks, machine learning algorithms, and statistical tools.
What Do Data Scientists Do?
The direction of data science varies from one company to another and depends on the strategic business questions. Yet, all data scientists “estimate the unknown” and help businesses gain operational clarity and transparency. Some data insights may be unexpected, or even unwelcome, and it is also a data scientist’s task to communicate them to the team and suggest ways for improvement and optimization.
In some companies, it is data scientists who perform data pre-processing: they mine and clean unstructured data, then transform it and prepare it for practical use. In other companies, however, these are the tasks of a data engineer or a data analyst. Either way, the data is of no use to the business until a data scientist processes it. Using ML tools and algorithms, data scientists build statistical models and constantly fine-tune them to gain richer insights, optimize performance, and predict trends and patterns. Some of the advanced techniques data scientists use for deriving business insights are clustering, neural networks, decision trees, and the like.
Case Study: Data Scientist’s Role in a Project
DataArt was hired to design, develop, and deploy a predictive maintenance platform for a leading global provider of intelligent material handling systems. The platform maximizes operational effectiveness and reduces downtime and maintenance costs of the client’s conveyor systems.
Conveyor systems have many degradable machinery parts. If even a minor part broke down, the entire system halted. This resulted in financial losses, some of which reached $1+ million per hour for select customers. DataArt’s team installed IoT sensors on all degradable conveyor parts. Data Scientists developed ML models for each component part and trained each model to identify the range of “normal” conditions (temperature, vibration, power consumption, etc.) for a specific part in real time. When conditions deviate from the expected range, the system sends an alert. Find out more about the platform in this case study.
The predictive maintenance system for one conveyor comprises 220+ conveyor/facility-specific ML models. The platform enables the most efficient allocation of maintenance resources, reduces maintenance costs, and allows the client’s customers to protect their high-value assets from unplanned downtime.
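The “normal range” idea from this case study can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the ML models built in the project: instead of a trained model, it assumes a band of the mean plus or minus three standard deviations learned from past readings. The class name `NormalRangeMonitor` and the temperature values are hypothetical.

```python
import statistics


class NormalRangeMonitor:
    """Learns a 'normal' band from historical readings and flags deviations.

    Simplified stand-in for an ML model: the band is mean +/- k standard deviations.
    """

    def __init__(self, history, k=3.0):
        mean = statistics.mean(history)
        spread = statistics.stdev(history)
        self.lower = mean - k * spread
        self.upper = mean + k * spread

    def is_alert(self, reading):
        """Return True when a reading falls outside the learned band."""
        return not (self.lower <= reading <= self.upper)


# Made-up temperature readings (degrees Celsius) for one conveyor part.
history = [61.0, 62.5, 60.8, 61.7, 62.1, 61.3, 60.9, 62.0]
monitor = NormalRangeMonitor(history)

# New readings arrive; only those outside the band trigger an alert.
alerts = [reading for reading in [61.5, 75.2, 62.2] if monitor.is_alert(reading)]
```

Keeping one monitor per part mirrors the per-component approach described above, where each part has its own notion of “normal.”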
DataArt Will Help You Find the Right Data Professional
As businesses continue to automate and improve their data analytics platforms, data specialists are critical for boosting corporate analytical capabilities. Analytics and data science cannot happen in a silo. The entire team of data architects, engineers, analysts, and data scientists should be tightly integrated and synced – for even better data integrity.
At DataArt, we develop unique approaches to staffing teams with the right data specialists. They bring the most value to the business depending on its immediate and long-term analytics needs. Contact us today to unlock a data-driven strategy for your business.