
Appian: 3 ways data fabrics are changing the role of data engineers

This is a guest post written by Jacob Rank in his role as senior director of product management at Appian.

Rank manages a team of product managers and a product portfolio that encompasses the work of 140+ engineers across initiatives in AI/ML, process mining, RPA and intelligent document processing (IDP).

Describing data engineers as the “architects of scalable and resilient data infrastructure”, Rank says that as technology advances, the role of these engineers is becoming more demanding.

Appian’s Rank writes as follows…

Organisations are expanding their cloud footprints and incorporating real-time AI analytics into operations. But responsive and accurate AI requires real-time data. This leaves data engineers facing pressure to deliver actionable data faster and with fewer resources – no small task when dealing with disconnected data sources and silos.

The data engineering landscape is being transformed by the notion of the data fabric. Data fabrics connect data sources in a virtual layer so engineers can focus on the innovative work organisations need to keep their competitive edge.

Data fabric defined


A data fabric is an architecture layer and toolset that connects data across disparate systems to create a unified view. As a virtualised data layer, it enables engineers to access data without migrating it from where it lives – whether in a data lake or data warehouse, a relational database, an enterprise resource planning (ERP) system or a SaaS application. The data may be on-premises, in a cloud service or spread across multi-cloud environments. Data fabric also plays a key role in a modern process automation platform that optimises complex business processes end to end – crucial as enterprises seek to scale automation across the business and achieve holistic improvement, not just isolated wins.

Use of a data fabric won’t just improve efficiency. It will redefine the roles of data professionals.

Historically, data engineers have built the infrastructure and data management systems and data scientists have analysed data and built models on it. But data fabric technology is blurring the line between these roles.

Data engineers with analytical data science skills are in the best position to add value as data fabric architectures get more powerful. Here are three ways data engineers can adapt to keep up with the changing requirements of their role.

#1 Shift from coder to innovator

Data engineers use Python, R and other statistical languages and tools to pull data from siloed sources. Once centralised, they can manage and transform the data before delivering it to the next stage of the process. Building these data pipelines often involves coding or stitching together many high-code scripts. Like any traditional software engineering approach, this risks coding errors that cause delays and create security risks.
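
To make the contrast concrete, below is a minimal sketch of this kind of hand-coded pipeline – a hypothetical Python example, with invented endpoints, file names and schemas – where every extract, transform and load step is code the engineer must write, test and maintain.

```python
import sqlite3

import requests  # third-party: pip install requests

# -- Extract: pull records from two siloed sources --
# (the endpoint and database below are hypothetical stand-ins)
customers = requests.get("https://crm.example.com/api/customers", timeout=30).json()

erp = sqlite3.connect("erp_orders.db")  # stand-in for a JDBC/ODBC connection
orders = erp.execute("SELECT customer_id, total FROM orders").fetchall()

# -- Transform: join the two sources on customer id in application code --
totals = {}
for customer_id, total in orders:
    totals[customer_id] = totals.get(customer_id, 0.0) + total

merged = [
    {"id": c["id"], "name": c["name"], "lifetime_value": totals.get(c["id"], 0.0)}
    for c in customers
]

# -- Load: write the merged view into a central reporting store --
warehouse = sqlite3.connect("warehouse.db")
warehouse.execute(
    "CREATE TABLE IF NOT EXISTS customer_value (id TEXT, name TEXT, lifetime_value REAL)"
)
warehouse.executemany(
    "INSERT INTO customer_value VALUES (:id, :name, :lifetime_value)", merged
)
warehouse.commit()
```

Every one of those steps is a place where a schema change, a flaky endpoint or a typo can silently break the pipeline – which is exactly the maintenance burden a data fabric is meant to remove.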

A data fabric simplifies data connectivity by enabling easy integration across different systems. Whether it’s Java database connectivity (JDBC) for databases or HTTP for integrations, a data fabric approach unifies the data landscape and reduces the risk of mistakes. It also supports reporting and real-time read and write-back capabilities that enable self-service for end users, allowing them to access and update customer records in one interface.
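
The mechanics are easiest to see in miniature. The following toy sketch – a deliberately simplified model in Python, not any vendor’s actual API – shows the two behaviours just described: a unified read assembled across systems at query time and a write-back routed to the system of record.

```python
# Toy illustration only (not a real data fabric SDK): one logical record whose
# fields live in two different "systems of record". Reads assemble a unified
# view at query time and writes are routed back to the owning system, so no
# data is copied or migrated.

CRM = {"C-1001": {"name": "Acme Ltd"}}            # stand-in for a CRM
ERP = {"C-1001": {"credit_limit": 25_000.0}}      # stand-in for an ERP table

FIELD_OWNER = {"name": CRM, "credit_limit": ERP}  # fabric metadata: field -> source


def fabric_get(key: str) -> dict:
    """Assemble one unified record from each field's system of record."""
    return {field: store[key][field] for field, store in FIELD_OWNER.items()}


def fabric_update(key: str, **fields) -> None:
    """Write-back: push each changed field to the system that owns it."""
    for field, value in fields.items():
        FIELD_OWNER[field][key][field] = value


print(fabric_get("C-1001"))                     # one read spanning two systems
fabric_update("C-1001", credit_limit=50_000.0)  # lands only in the ERP store
print(fabric_get("C-1001"))
```

A production fabric does this with connection metadata, caching and governance rather than dictionaries, but the engineer’s experience is the same: one logical record, no pipeline.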

But not all data fabrics have read and write capabilities that scale for large, complex use cases. As Appian CTO and founder Michael Beckley explains, “Data fabrics are increasingly the preferred data plane in companies’ AI stacks, but most are optimised for read-only access and don’t scale well for writes beyond 2,000 rows per record.” Appian data fabric can natively read and write 10 million rows per record, making it scalable for enterprise processes. Data engineers should know the scalability of the data fabric they are working with, because it will impact the extent to which they can shift from coding to more innovative work.

Enterprise data fabrics represent a fundamental shift.

By nearly eliminating the need for pipeline maintenance and troubleshooting, data fabric lets engineers shift their attention to higher-value activities like data modelling, algorithm development and advanced analytics – areas that have historically been the domain of data scientists. This means that data engineers are increasingly driving business insights and strategic decision-making, rather than simply enabling data movement.

#2 Proactive insight from real-time data

Data fabrics unify data across sources, but operational data fabrics take this a step further by incorporating real-time data processing. With real-time data access, businesses can react to insights and make faster data-driven decisions. For example, an operational data fabric provides real-time inventory information so managers and suppliers can make proactive decisions that keep the supply chain running smoothly.

Operational data fabrics make reporting easier, too. Traditionally, data engineers would have to build and manage complex batch-processing workflows to report on insights. But operational data fabrics transform data with minimal intervention. This means business users can access real-time reports and receive alerts to take quick action. Meanwhile, engineers can shift their focus from reactive troubleshooting to proactive optimisation.
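
As a rough illustration of the difference, the toy event loop below (all names and thresholds invented) surfaces an insight the moment the data changes, with no batch job in between – the kind of work an operational data fabric performs inside the platform rather than in engineer-maintained code.

```python
import random
import time

# Toy contrast with batch reporting: instead of a nightly job that recomputes
# stock levels, each event is processed as it arrives and an alert fires the
# moment a threshold is crossed.

REORDER_POINT = 20
stock = {"SKU-42": 100}


def inventory_events():
    """Stand-in for a real event stream, e.g. a message queue consumer."""
    while True:
        yield {"sku": "SKU-42", "change": -random.randint(1, 5)}
        time.sleep(0.05)


for event in inventory_events():
    stock[event["sku"]] += event["change"]
    if stock[event["sku"]] <= REORDER_POINT:
        print(f"ALERT: {event['sku']} down to {stock[event['sku']]} units")
        break  # in practice: notify a manager or trigger a reorder process
```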

#3 Less time on security, more strategy time

Security is essential in data engineering. It is complex, easy to get wrong and has high consequences. This is especially true if a business uses a data lake or warehouse model. With this approach, all sensitive data is in one place. If the data gets compromised, the breach is catastrophic. And traditional tooling places the full burden of security on the data engineer.

Data fabric shifts most of this responsibility from individual engineers or teams to the underlying platform. The best data fabric technology operates with strong governance: security is embedded into the model and defined at various data layers based on roles. Contextual rules dynamically determine who can access what. As a result, access control policies are consistent across all applications and workflows.

For example, an engineer can define data access rules based on user attributes. These policies apply universally, whether data is accessed through an internal dashboard, an API or an external AI model. This eliminates the risk of accidental security gaps and ensures that governance is maintained automatically across all environments.
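
A minimal sketch of such an attribute-based rule, with invented attributes and policy (no real fabric’s API is implied), might look like this:

```python
from dataclasses import dataclass

# Minimal sketch of an attribute-based access rule. The attributes and policy
# are invented for illustration; the point is that one centrally defined rule
# yields the same answer for a dashboard, an API or an AI model.


@dataclass(frozen=True)
class User:
    role: str
    region: str


def can_read(user: User, record: dict) -> bool:
    """One policy, evaluated identically wherever the data is accessed."""
    if user.role == "admin":
        return True
    # Analysts may only see records from their own region.
    return user.role == "analyst" and record["region"] == user.region


record = {"id": "R-7", "region": "EMEA", "revenue": 1_250_000}
print(can_read(User("analyst", "EMEA"), record))  # True
print(can_read(User("analyst", "APAC"), record))  # False
```

The value is not the few lines of logic but where they live: defined once in the fabric layer, the same rule governs every consumer of the data.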

Data fabric turns data engineers from gatekeepers into enablers. Instead of managing security configurations, they can focus on designing scalable, compliant data environments. By taking the burden of security off engineers, data fabric frees them to take on the more strategic, analytical work that used to be left to data scientists.

Towards a unified future

In the future, data engineers will no longer focus purely on infrastructure. Instead, they will work in dynamic, intelligent environments. As data fabric streamlines workflows, engineers will shift toward creating self-service platforms. They will fine-tune governance policies and optimise data for real-time decision-making.

The boundary between data engineers and data scientists will blur. Engineers will need to develop a deeper understanding of data science methodologies. Data scientists will need greater fluency in data architecture. The result will be a merging of the two roles into a broader discipline, where technical expertise and analytical insight come together for better data outcomes.

In an interview on how to know if your organisation needs a data fabric, Appian head of global industry and value Mike Heffner used these two words to describe the current state of data at most businesses: Lost opportunity.

Use of a data fabric keeps you from losing out on the value of your data… and the best way to implement data fabric is in a process platform.
