25 SAP Data Intelligence Real-Time Scenario-Based Interview Questions

1. Your data pipeline pulling data from an SAP S/4HANA OData service is failing intermittently. How would you trace the failure inside Data Intelligence and ensure pipeline resilience?
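
A minimal resilience sketch, assuming the intermittent failures are transient network or gateway errors: retry the OData call with exponential backoff inside a Python operator, and let the graph's restart policy plus the monitoring logs handle anything that still fails. The endpoint URL and retry limits are illustrative.

```python
# Retry-with-backoff sketch for a flaky OData call. The endpoint and retry
# limits are illustrative assumptions; the final exception is re-raised so
# the graph's own error handling and restart policy can take over.
import time
import requests

def fetch_with_retry(url: str, attempts: int = 5, base_delay: float = 2.0):
    for attempt in range(1, attempts + 1):
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.RequestException as exc:
            if attempt == attempts:
                raise  # surface the failure to the pipeline after the last try
            wait = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc}); retrying in {wait:.0f}s")
            time.sleep(wait)

data = fetch_with_retry(
    "https://s4.example.com/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder"
)
```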

2. A large file (20 GB CSV) from Amazon S3 must be processed daily. How would you design the Data Intelligence pipeline to handle it efficiently without memory issues?
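
A minimal sketch of the chunking idea, assuming the logic sits in a custom Python operator with pandas and s3fs baked into the image; the bucket path, column name, and chunk size are illustrative. The built-in file operators can also stream the file in parts, so custom code is only one option.

```python
# Stream a large CSV in fixed-size chunks instead of loading it all into
# memory. Inside Data Intelligence this logic would live in a custom Python
# operator; it is shown standalone here.
import pandas as pd

CHUNK_ROWS = 100_000  # tune to the container's memory limit

def process_chunk(df: pd.DataFrame) -> pd.DataFrame:
    # placeholder transformation: drop rows with no order id
    return df.dropna(subset=["order_id"])

# pandas reads S3 URLs directly when s3fs and credentials are available
reader = pd.read_csv("s3://my-bucket/daily/sales.csv", chunksize=CHUNK_ROWS)
for i, chunk in enumerate(reader):
    cleaned = process_chunk(chunk)
    # hand each processed chunk downstream (e.g., a staging file or table)
    cleaned.to_csv(f"/tmp/sales_part_{i}.csv", index=False)
```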

3. You need to load data from SAP BW/4HANA into Azure Data Lake using Data Intelligence. What operators and connection settings would you use?

4. Your graph pipeline hits performance bottlenecks during complex transformations. How do you debug and optimize pipelines inside the Modeler?

5. You have to convert unstructured PDF invoices into structured data and push the results into SAP HANA Cloud. How would you design the DI solution?
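
One illustrative approach, assuming pdfplumber is packaged into the operator image. The regex patterns are stand-ins; real invoices usually need template- or ML-based extraction rather than a pair of regexes.

```python
# Sketch: extract text from a PDF invoice and pull out two fields.
# The file name and patterns are illustrative assumptions.
import re
import pdfplumber

def parse_invoice(path: str) -> dict:
    with pdfplumber.open(path) as pdf:
        text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    number = re.search(r"Invoice\s*#?\s*(\w+)", text)
    total = re.search(r"Total\s*:?\s*([\d.,]+)", text)
    return {
        "invoice_no": number.group(1) if number else None,
        "total": total.group(1) if total else None,
    }

print(parse_invoice("invoice_0001.pdf"))
```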

6. A machine learning model deployed in Data Intelligence must be retrained whenever new transactional data is available. How do you automate this using DI pipelines?

7. Your source system uses Kafka but the target is SAP Data Warehouse Cloud. How would you build this real-time streaming integration?
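
DI ships a Kafka Consumer operator, so custom code is rarely required; the hedged sketch below just shows the consumer loop as it would look in a Python operator. The broker address, topic, and group id are assumptions.

```python
# Minimal consumer loop with kafka-python; records would be forwarded to
# the target writer (e.g., a Data Warehouse Cloud loader) in micro-batches.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "sales-events",                       # assumed topic name
    bootstrap_servers="kafka:9092",       # assumed broker address
    group_id="di-dwc-loader",             # assumed consumer group
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    record = message.value
    print(record)  # stand-in for the downstream write
```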

8. A pipeline fails because the HANA type mapping for integers and decimals is inconsistent. How would you fix schema mismatch issues?
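
A small normalization sketch, assuming the HANA target columns are INTEGER and DECIMAL(15,2): casting explicitly before the write removes the dependency on implicit driver conversions that differ between operators.

```python
# Normalize types up front so the HANA write sees exactly the declared
# column types. Column names and target precision are assumptions.
from decimal import Decimal, ROUND_HALF_UP

def normalize_row(row: dict) -> dict:
    return {
        **row,
        "quantity": int(row["quantity"]),
        "amount": Decimal(str(row["amount"])).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        ),
    }

print(normalize_row({"quantity": "3", "amount": 19.999}))
# {'quantity': 3, 'amount': Decimal('20.00')}
```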

9. An SFTP connection to a vendor server fails during runtime. How do you troubleshoot Connection Management and agent health in DI?

10. You must orchestrate a complex workflow: extract → cleanse → enrich → load → ML scoring. How would you structure the graph in Modeler?

11. Your business wants audit logs for every pipeline execution. How do you implement monitoring using the Monitoring Dashboard and Metadata Explorer?

12. Your DI job loads duplicate sales records into HANA Cloud. How would you handle deduplication inside the pipeline?
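
A batch-level sketch with pandas: keep the latest version of each business key (column names are illustrative). Duplicates that span loads usually need a key-based UPSERT on the HANA side rather than in-pipeline logic.

```python
# Deduplicate within one batch: sort by the change timestamp and keep the
# last record per business key. Columns and data are illustrative.
import pandas as pd

df = pd.DataFrame({
    "order_id":   [1, 1, 2, 3],
    "updated_at": ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-01"],
    "amount":     [10.0, 12.0, 20.0, 30.0],
})

deduped = (
    df.sort_values("updated_at")
      .drop_duplicates(subset=["order_id"], keep="last")
)
print(deduped)
```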

13. You want to schedule pipelines hourly and maintain separate runtime states for dev, test, and prod. What is your approach to version management?

14. Your pipeline processes JSON messages, but the target table in HANA Cloud requires relational format. How would you design the transformation?
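
A flattening sketch with pandas.json_normalize; the message structure is made up, but the pattern (one row per nested item, header fields carried along as meta columns) is the usual shape of this transformation.

```python
# Flatten a nested JSON message into relational rows: one row per line
# item, with header fields repeated on each row.
import pandas as pd

message = {
    "order_id": 42,
    "customer": {"id": "C001", "name": "ACME"},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}

rows = pd.json_normalize(
    message,
    record_path="items",
    meta=["order_id", ["customer", "id"], ["customer", "name"]],
)
print(rows)
# columns: sku, qty, order_id, customer.id, customer.name
```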

15. Business wants a rule-based data quality check (e.g., missing customer name, negative quantities). How do you implement data quality validation inside the pipeline?
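
A rule-based validation sketch: each rule is a simple predicate, and failing rows are routed to a reject path with the reasons attached. Field names mirror the examples in the question.

```python
# Each row either passes to the load path or goes to a reject path with
# the list of violated rules, so rejects stay auditable.
def check_row(row: dict) -> list[str]:
    errors = []
    if not (row.get("customer_name") or "").strip():
        errors.append("missing customer name")
    if row.get("quantity", 0) < 0:
        errors.append("negative quantity")
    return errors

rows = [
    {"customer_name": "ACME", "quantity": 5},
    {"customer_name": "",     "quantity": 3},
    {"customer_name": "Beta", "quantity": -1},
]
for row in rows:
    errors = check_row(row)
    port = "reject" if errors else "load"
    print(port, row, errors)
```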

16. Your DI discovery job cannot detect lineage for a source dataset in SAP BW. What steps do you take to enable lineage in Metadata Explorer?

17. A custom Python operator fails due to missing external libraries. How do you package and deploy libraries in Data Intelligence?
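
The standard fix is a custom Dockerfile that installs the libraries, tagged so the operator's group runs on that image. The sketch below only adds a defensive check so a wrongly built image fails fast; the library names are illustrative.

```python
# Defensive check inside a custom Python operator: fail fast with a clear
# message when the container image lacks a required library. The real fix
# is a custom Dockerfile (e.g., `RUN pip install requests pandas`) whose
# tags are assigned to the operator's group.
import importlib.util

REQUIRED = ["requests", "pandas"]  # illustrative library list

missing = [name for name in REQUIRED if importlib.util.find_spec(name) is None]
if missing:
    raise RuntimeError(
        f"Libraries missing from the operator image: {missing}. "
        "Install them via a custom Dockerfile and retag the operator group."
    )
```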

18. You need to store intermediate pipeline results for reprocessing. Which DI storage options do you use and why?

19. A pipeline must write into both SAP HANA Cloud and Google BigQuery. How do you design multi-target output?

20. You want to enforce data masking for personal data (PII) before sending it to non-SAP systems. Explain your masking/obfuscation strategy.
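
A masking sketch combining deterministic hashing (so keys stay joinable across systems) with partial masking for display fields. Field names are illustrative, and in practice the salt would come from a secrets store rather than the code.

```python
# Pseudonymize join keys deterministically and partially mask display
# fields before the data leaves the SAP landscape.
import hashlib

SALT = b"load-from-vault-not-source-code"  # assumption: injected at runtime

def pseudonymize(value: str) -> str:
    # same input always yields the same token, so keys remain joinable
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:16]

def mask_email(email: str) -> str:
    local, _, domain = email.partition("@")
    return (local[0] + "***@" + domain) if domain else "***"

record = {"customer_id": "C001", "email": "jane.doe@example.com"}
safe = {
    "customer_id": pseudonymize(record["customer_id"]),
    "email": mask_email(record["email"]),
}
print(safe)
```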

21. A memory leak occurs in a custom operator written in Python. How would you troubleshoot operator logs and the container runtime?
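
A tracemalloc sketch for localizing allocation growth between messages; the leaky buffer is simulated. The printed stats land in the operator's container log.

```python
# Snapshot allocations periodically and log the top growth sites relative
# to a baseline, to pinpoint which line keeps accumulating memory.
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

def log_top_growth(limit: int = 5) -> None:
    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.compare_to(baseline, "lineno")[:limit]:
        print(stat)  # file:line, size delta, allocation count

leak = []
for _ in range(3):
    leak.append(bytearray(1_000_000))  # simulated leak: reference kept forever
    log_top_growth()
```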

22. A new business requirement needs a REST API that returns processed pipeline results. How do you expose pipeline output as an API in DI?
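
In DI the usual building blocks are the OpenAPI Servlow operator fronting a Python operator that builds the response. The sketch below uses Flask purely as a standalone stand-in to show the shape of the response logic; the route and the result store are assumptions.

```python
# Stand-in for the API layer: serve the latest persisted pipeline results
# as JSON. In DI itself, the OpenAPI Servlow operator would receive the
# request and a Python operator would assemble this response.
from flask import Flask, jsonify

app = Flask(__name__)

# stands in for wherever the pipeline persists its latest results
RESULT_STORE = {"run_id": "2024-06-01T10:00", "rows_loaded": 12345}

@app.get("/results/latest")
def latest_results():
    return jsonify(RESULT_STORE)

if __name__ == "__main__":
    app.run(port=8080)
```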

23. You have to integrate SAP Data Intelligence with SAP Datasphere. What operators/configurations would you use?

24. Pipeline performance degrades over time. How would you use the System Management tools to check CPU, memory, and container scaling?

25. You must migrate pipelines from SAP Data Hub to Data Intelligence Cloud. What is your migration strategy, and what compatibility issues do you expect?