While interviewing for data warehousing jobs, you may be asked questions about Informatica concepts as well as Informatica-based scenarios. Here are the most commonly-asked Informatica interview questions and answers that will help you ace your upcoming interview. These Informatica questions are suitable for both freshers and experienced professionals at any level. For your convenience, we have divided this list of 100+ Informatica questions into 3 sections:
- Informatica Basic Interview Questions
- Informatica Advanced Interview Questions
- Informatica Scenario Based Interview Questions
Informatica Basic Interview Questions
Q1. Differentiate between a database, a data warehouse, and a data mart?
Ans. A database is a set of logically related data, which is usually small in size compared to a data warehouse. In contrast, a data warehouse holds assortments of all sorts of data, from which data is extracted according to the customer’s needs. A data mart is a subset of a data warehouse designed to cater to the needs of a specific domain or department.
Q2. Explain Informatica PowerCenter.
Ans. Informatica PowerCenter is a GUI based ETL (Extract, Transform, Load) tool. This data integration tool extracts data from different OLTP source systems, transforms it into a homogeneous format and loads the data throughout the enterprise at any speed. It is known for its wide range of applications.
Q3. Explain the difference between Informatica 7.0 and 8.0?
Ans. The main difference between Informatica 8.0 and Informatica 7.0 is that in the 8.x series, Informatica Corp introduced the PowerExchange concept.
Q4. How will you filter rows in Informatica?
Ans. In Informatica, rows can be filtered in two ways:
- Source Qualifier Transformation: Rows are filtered while reading data from a relational data source.
- Filter Transformation: Rows are filtered within a mapping, for data coming from any source.
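The contrast between the two approaches can be sketched in plain Python (this is an illustrative analogy, not Informatica syntax; the column names are made up for the example):

```python
rows = [
    {"id": 1, "country": "US", "amount": 500},
    {"id": 2, "country": "IN", "amount": 50},
    {"id": 3, "country": "US", "amount": 20},
]

# Source Qualifier style: filter while "reading" from the source,
# analogous to pushing a WHERE clause into the source SQL.
source_filtered = [r for r in rows if r["country"] == "US"]

# Filter Transformation style: filter mid-pipeline, after earlier
# transformations have already processed every row.
pipeline_filtered = [r for r in source_filtered if r["amount"] > 100]

print([r["id"] for r in pipeline_filtered])  # [1]
```

Filtering at the source qualifier is generally preferred where possible, since fewer rows enter the pipeline in the first place.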
Q5. What is a Sorter Transformation?
Ans. Sorter transformation is used to sort the data in an ascending or descending order based on single or multiple keys. It sorts collections of data by port or ports.
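Multi-key sorting can be pictured with a Python analogy (illustrative only; a Sorter transformation with a department key ascending and a salary key descending would behave similarly):

```python
# Rows of (dept, salary); sort by dept ascending, then salary descending,
# like a Sorter transformation with two key ports.
employees = [
    ("sales", 300), ("hr", 200), ("sales", 500), ("hr", 400),
]
sorted_rows = sorted(employees, key=lambda r: (r[0], -r[1]))
print(sorted_rows)
# [('hr', 400), ('hr', 200), ('sales', 500), ('sales', 300)]
```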
Q6. What is Expression Transformation?
Ans. An expression transformation is a common PowerCenter mapping transformation. It is a connected, passive transformation that calculates values on a single row and can also be used to test conditional statements before passing the data to other transformations.
Q7. What is Joiner Transformation?
Ans. The joiner transformation is an active and connected transformation that helps to create joins in Informatica. It is used to join two heterogeneous sources.
Q8. What is a Decode in Informatica?
Ans. A DECODE in Informatica is a function used within an Expression Transformation. It works like a traditional CASE or IF statement, returning a value based on the first matching condition.
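The DECODE semantics (value, search/result pairs, then a default) can be sketched in Python (an analogy only, not the Informatica expression language):

```python
# decode(value, search1, result1, search2, result2, ..., default)
# mirrors the shape of Informatica's DECODE function.
def decode(value, *pairs_and_default):
    *pairs, default = pairs_and_default
    mapping = dict(zip(pairs[0::2], pairs[1::2]))
    return mapping.get(value, default)

print(decode("M", "M", "Male", "F", "Female", "Unknown"))  # Male
print(decode("X", "M", "Male", "F", "Female", "Unknown"))  # Unknown
```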
Q9. What is a Router Transformation?
Ans. The Router Transformation allows users to split a single pipeline of data into multiple. It is an active and connected transformation that is similar to filter transformation.
Q10. What is a Rank Transformation?
Ans. The Rank Transformation is an active and connected transformation that is used to sort and rank the top or bottom set of records based on a specific port. It filters data based on groups and ranks. The rank transformation has an output port by which it assigns a rank to the rows.
Q11. What is Filter Transformation?
Ans. Filter transformation is used to filter the records based on the filter condition. It is an active transformation as it changes the number of rows.
Q12. What is a Sequence Generator Transformation?
Ans. Sequence Generator Transformation is used to generate primary key values or a range of sequence numbers for calculations or processing. It is passive and connected.
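A Sequence Generator's NEXTVAL-style behavior can be pictured in Python (an analogy; the start and increment values here are illustrative):

```python
from itertools import count

# A sequence defined by a start value and an increment, like the
# Start Value and Increment By properties of a Sequence Generator.
def sequence(start=1, step=1):
    return count(start, step)

nextval = sequence(start=100, step=1)
keys = [next(nextval) for _ in range(3)]
print(keys)  # [100, 101, 102]
```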
Q13. What is a Master Outer Join?
Ans. A master outer join is a specific join type setting within a joiner transformation. In a master outer join, all records from the detail source are returned by the join, and only matching rows from the master source are returned.
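Since the detail side is preserved, the behavior resembles a left join with the detail source on the preserved side. A hedged Python sketch (table and column names are made up):

```python
# All DETAIL rows are kept; unmatched MASTER rows are dropped,
# like "detail LEFT JOIN master".
master = {10: "Electronics", 20: "Clothing"}   # dept_id -> dept_name
detail = [(1, 10), (2, 30), (3, 20)]           # (order_id, dept_id)

joined = [(oid, did, master.get(did)) for oid, did in detail]
print(joined)  # [(1, 10, 'Electronics'), (2, 30, None), (3, 20, 'Clothing')]
```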
Q14. What are some examples of Informatica ETL programs?
Ans. Some examples of Informatica ETL programs are:
- Mappings
- Workflows
- Tasks
Q15. What is a dimensional table? What are the different dimensions?
Ans. A Dimension table is a table in a star schema of a data warehouse. Dimension tables are used to describe dimensions. They contain attributes that describe fact records in the table.
For example, a product dimension could contain the name of products, their description, unit price, weight, and other attributes as applicable.
The different types of dimension tables are:
- SCD (Slowly Changing Dimension):
The dimension attributes tend to change slowly over time rather than at regular intervals.
- Conformed Dimension:
Conformed dimensions are those which are exactly the same with every possible fact table to which they are joined. It is used to maintain consistency.
This dimension is shared among multiple subject areas or data marts. The same can be used in different projects without any modifications.
- Junk Dimension:
A junk dimension is a collection of attributes of low cardinality. It contains different transactional code flags or text attributes that are unrelated to any other attribute. A junk dimension is a structure that provides a convenient place to store the junk attributes.
- Degenerated Dimension:
It is derived from the fact table and does not have its own dimension table. The attributes are stored in the fact table and not as a separate dimension table.
- Role-playing dimension:
Role-playing dimensions are dimensions used for multiple purposes within the same database. For example, a single date dimension can act as the order date, ship date, and delivery date.
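The SCD case above is the one most often probed in interviews. A minimal Type 2 sketch in Python (illustrative only; the column names and the use of a `current` flag instead of effective dates are assumptions for the example):

```python
# SCD Type 2: when an attribute changes, the current row is expired
# and a new current row is inserted, preserving history.
dim = [{"cust_id": 7, "city": "Pune", "current": True}]

def scd2_update(dim, cust_id, new_city):
    for row in dim:
        if row["cust_id"] == cust_id and row["current"] and row["city"] != new_city:
            row["current"] = False  # expire the old version
            dim.append({"cust_id": cust_id, "city": new_city, "current": True})
            break

scd2_update(dim, 7, "Mumbai")
print([(r["city"], r["current"]) for r in dim])
# [('Pune', False), ('Mumbai', True)]
```

In a real warehouse, effective-from and effective-to dates usually accompany or replace the current-row flag.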
Q16. What is star schema?
Ans. It is the simplest form of data warehouse schema that consists of one or more dimensions and fact tables. It is used to develop data warehouses and dimensional data marts.
Q17. Describe snowflake schema.
Ans. A snowflake schema is a fact table that is connected to several dimensional tables such that the entity-relationship diagram resembles a snowflake shape. It is an extension of a Star Schema and adds additional dimensions. The dimension tables are normalized which splits data into additional tables.
Q18. What is a Mapplet?
Ans. A Mapplet is a reusable object containing a set of transformations that can be used to create reusable mappings in Informatica.
Q19. What is a natural primary key?
Ans. A natural primary key uniquely identifies each record within a table and relates records to additional data stored in other tables.
Q20. What is a surrogate key?
Ans. A surrogate key is a sequentially generated unique number attached with each record in a Dimension table. It is used in substitution for the natural primary key.
Q21. What is the difference between a repository server and a powerhouse?
Ans. A repository server controls the complete repository, which includes tables, charts, and various procedures.
A powerhouse server governs the execution of various processes among the components of the server’s database repository.
Q22. How many repositories can be created in Informatica?
Ans. We can create as many repositories in Informatica as required.
Q23. Describe Data Concatenation.
Ans. Data concatenation is the process of bringing different pieces of the record together.
Q24. How can one identify whether the mapping is correct or not without connecting the session?
Ans. With the help of debugging options.
Q25. Name the designer tools for creating transformations?
Ans. Mapping designer, transformation developer, and mapplet designer are used for creating transformations.
Q26. Differentiate between sessions and batches?
Ans. A session is a set of commands for the server to move data to the target, while a batch is a group of sessions that can be run sequentially or concurrently.
Q27. What is Enterprise Data Warehousing?
Ans. Enterprise data warehousing is a process of creating a centralized repository of operational data so that it can be used as per the reporting and analytics requirements. It has a single point of access, and the data is provided to the server via a single source store.
Q28. What are the different names of the Data Warehouse System?
Ans. The Data Warehouse System has the following names –
- Analytic Application
- Business Intelligence Solution
- Data Warehouse
- Decision Support System (DSS)
- Executive Information System
- Management Information System
Q29. Name different available editions of INFORMATICA PowerCenter.
Ans. Different editions of INFORMATICA PowerCenter are –
- Standard Edition
- Advanced Edition
- Premium Edition
Informatica Advanced Interview Questions
Q30. What is the difference between static cache and dynamic cache?
Ans. In the static cache, the data will remain the same for the entire session, whereas in the dynamic cache, whenever a row is inserted, the cache will also be updated.
Q31. What is the command used to run a batch?
Ans. To run a batch in Informatica, we use the pmcmd command.
Q32. What are the differences between the ROUTER and FILTER?
Ans. Differences between the Router and Filter are:
|Router||Filter|
|Captures data rows that don’t meet the conditions in a default output group||Tests data for one condition and drops the data rows that don’t meet the condition|
|Single input and multi-output group transformation||Single input and single output group transformation|
|User can specify multiple filter conditions||User can specify only one filter condition|
|It does not block input rows and failed records||Chances that records can get blocked|
|Acts like the IIF() or CASE function in Informatica||Works like a SQL WHERE clause|
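The key behavioral difference in the table can be sketched in Python (an illustrative analogy, not Informatica syntax):

```python
# A filter silently drops non-matching rows; a router keeps them
# in a default output group.
rows = [5, 12, 8, 20]

# Filter: rows failing the condition are simply lost.
filtered = [r for r in rows if r > 10]

# Router: one output group per condition, plus a default group.
high, default = [], []
for r in rows:
    (high if r > 10 else default).append(r)

print(filtered, high, default)  # [12, 20] [12, 20] [5, 8]
```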
Q33. What is a Domain?
Ans. A Domain comprises nodes and services and serves as the fundamental administrative unit in the Informatica tool. It organizes all related services and nodes into folders and sub-folders depending upon the administration requirement.
Q34. Why should we partition a Session?
Ans. Partitioning not only helps in optimizing a session but also helps in loading a colossal volume of data, and it improves the server’s operation and efficiency.
Q35. What is Complex Mapping?
Ans. Complex Mapping is a mapping with huge requirements based on many dependencies. It doesn’t necessarily need to have hundreds of transformations; it can be a complex map even with five odd transformations. In case the requirement has many business restrictions and constraints, it is a complex mapping.
Q36. What are the features of Complex Mapping?
Ans. The features of Complex Mapping are –
- Complicated and huge requirements
- Complex business logic
- Multiple transformations
Q37. What is a Lookup Transformation?
Ans. Lookup Transformations are passive transformations with access to RDBMS-based data sets. A lookup is performed on a source, source qualifier, or target to get the relevant data.
Q38. What are different Lookup Caches(s)?
Ans. Lookups can be cached or uncached and can be divided as –
- Static cache
- Dynamic cache
- Persistent cache
- Shared cache
Q39. What are Mapplets?
Ans. A Mapplet is a reusable object that can be created in the Mapplet Designer. It contains a set of transformations and allows the reuse of transformation logic in multiple mappings.
Q40. What is the use of the Source Qualifier?
Ans. The Source Qualifier represents the rows that the PowerCenter Integration Service reads from a source during a session. The source qualifier transformation converts the source data types to the Informatica native data types, hence eliminating the need to alter the data types of the ports in the source qualifier transformation.
Q41. Define Workflow.
Ans. Workflow is a set of multiple tasks that enable a server to communicate and get the tasks implemented. These tasks are connected with the start task link and trigger the required sequence to start a process.
Q42. How many tools are there in the Workflow Manager?
Ans. There are three types of tools in the Workflow Manager –
- Task Developer – To create tasks that need to be run in the workflow.
- Workflow Designer – To create a workflow by connecting tasks with links.
- Worklet Designer – To create a worklet
Q43. What is a Target Load Order?
Ans. Also known as Target Load Plan, a Target Load Order specifies the order of target loading by integration service. It is dependent on the source qualifiers in a mapping.
Q44. What is the Command Task?
Ans. A Command Task is used to run the shell/UNIX commands in Windows during the workflow. It allows a user to specify UNIX commands in the command task to remove rejected files, create files, copy files, rename files, and archive files, among others.
Q45. What is a Standalone Command Task?
Ans. Standalone Command Task allows the shell commands to run anywhere during the workflow.
Q46. What is the PowerCenter Repository?
Ans. A PowerCenter Repository is maintained in a relational database such as Oracle or SQL Server. It consists of the following metadata –
- ODBC Connection
- Session and session logs
- Source Definition
- Target Definition
Q47. What is the Snowflake Schema? What is its advantage?
Ans. Snowflake Schema is a logical arrangement where dimension tables are normalized in a multidimensional database. It is designed in a manner that looks like a snowflake, thus the name. It contributes to improving the Select Query performance.
Q48. What are the Different Components of PowerCenter?
Ans. A PowerCenter has eight crucial components –
- PowerCenter Service
- PowerCenter Clients
- PowerCenter Repository
- PowerCenter Domain
- Repository Service
- Integration Service
- PowerCenter Administration Console
- Web Service Hub
Q49. What does the PowerCenter Client application consist of?
Ans. The PowerCenter Client application comprises the following tools:
- PowerCenter Designer
- Mapping Architect for Visio
- Repository Manager
- Workflow Manager
- Workflow Monitor
Q50. How will you define the Tracing Level?
Ans. Tracing Level refers to the amount of information that the server writes in the session log. Tracing Level is created and configured either at –
- The transformation level
- The session-level
- Both the transformation and session levels
Different types of Tracing Level are –
- Normal
- Terse
- Verbose Initialization
- Verbose Data
Q51. What is a Surrogate Key?
Ans. A Surrogate Key is any column or set of columns attached to every record in a Dimension table in a Data Warehouse. It is used as a substitute or replacement for the primary key when the update process becomes difficult for a future requirement.
Q52. What is a Session?
Ans. A session in Informatica is a set of instructions to be followed when data is being transferred from source to target using Session Command. A Session Command can be a pre-session command or a post-session command.
Q53. What is a User-Defined Event?
Ans. A User-Defined Event is a flow of tasks in a workflow. It allows users to create and name an event.
Q54. Explain the difference between partitioning of file targets and partitioning of the relational target?
Ans. Partitions can be accomplished on both relational targets and flat files. Informatica supports the following partitions:
- Database partitioning
- Hash-Key partitioning
- Key Range partitioning
Q55. Mention what are the unsupported repository objects for a mapplet?
Ans. The following are the unsupported repository objects for a mapplet:
- COBOL source definition
- Normalizer transformations
- Pre or post-session stored procedures
- Target definitions
- Non-reusable sequence generator transformations
- Joiner transformations
- IBM MQ source definitions.
- PowerMart 3.5-style LOOKUP functions
- XML source definitions
Q56. Explain what are direct and indirect loading options in sessions?
Ans. The following are the differences between direct and indirect loading options in sessions:
- Direct loading is used for a single file or transformation, whereas indirect loading is used for multiple files or transformations.
- In direct loading, we can perform the recovery process, but in indirect loading, we cannot.
Q57. What is the difference between static and dynamic cache? Explain with one example.
Ans. The difference between static and dynamic cache are:
Static – Once the data is cached, it will not change. For example, an unconnected lookup uses a static cache.
Dynamic – The cache is updated to reflect changes in the table (or source) it refers to. For example, a connected lookup can use a dynamic cache.
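The distinction can be sketched in Python (illustrative only; real lookup caches are managed by the Integration Service, not application code):

```python
# A static cache is a snapshot built once at session start; a dynamic
# cache is updated as new rows are inserted into the target.
target = {1: "Alice"}

static_cache = dict(target)      # frozen snapshot
dynamic_cache = dict(target)

incoming = [(2, "Bob"), (1, "Alice")]
for key, name in incoming:
    if key not in dynamic_cache:  # dynamic: the insert updates the cache too
        dynamic_cache[key] = name

print(sorted(static_cache))   # [1]
print(sorted(dynamic_cache))  # [1, 2]
```

This is why dynamic caches are useful for loading targets that may receive duplicate keys within a single run.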
Q58. Is it possible to start Batches within a batch?
Ans. It is not possible to start a batch within a batch. If you want to start a batch that resides in a batch, create a new independent batch and copy the necessary sessions into the new batch.
Q59. What is the procedure to import VSAM files from source to target? Do I need a special plugin?
Ans. Yes, a plugin is needed. By using the PowerExchange tool, we can convert VSAM files to Oracle tables and then do the mapping as usual to the target table.
Q60. Mention how many types of facts are there and what are they?
Ans. There are three types of facts:
- Additive fact: A fact that can be summarized by any one dimension or all dimensions, e.g., quantity, revenue
- Semi-additive fact: A fact that can be summarized for a few dimensions, but not for all dimensions, e.g., current balance
- Non-additive fact: A fact that cannot be summarized by any of the dimensions, e.g., percentage of profit
Q61. Mention the methods for creating reusable transformations.
Ans. There are two methods used for creating reusable transformations:
- By using the transformation developer tool.
- By converting a non-reusable transformation into a reusable transformation in mapping.
Q62. What is the procedure to use the pmcmd command in a workflow or run a session?
Ans. By using the Command task, we can write a suitable pmcmd command to run the workflow or session.
Q63. What is the default join that the source qualifier provides?
Ans. Inner equi join is the default join provided by the source qualifier.
Q64. Explain the aggregate fact table and where is it used?
Ans. There are two types of fact tables:
- Aggregated fact table – The aggregated fact table consists of aggregated columns. Example- Total-Sal, Dep-Sal.
- Factless fact table – The factless fact table doesn’t consist of aggregated columns and it only has FK to the Dimension tables.
Q65. To provide support for mainframes source data, which files are used as source definitions?
Ans. COBOL Copy-book files are used as a source definition.
Q66. What is the procedure to load the time dimension?
Ans. By using SCD Type 1/2/3, we can load any dimension based on the requirement. We can also use a stored procedure to populate the time dimension.
Q67. Explain the difference between the summary filter and the details filter?
Ans. Summary Filter – can be applied to a group of records that share common values.
Detail Filter – can be applied to every record in a database.
Q68. What are the differences between connected lookup and unconnected lookup?
Ans. The differences between connected lookup and unconnected lookup are:
|Connected Lookup||Unconnected Lookup|
|1. Gets the input directly from the other transformations and participates in the data flow.||1. It takes the input values from the result or the function of the LKP expression.|
|2. It can return to multiple output ports.||2. This Lookup returns to only one output port.|
|3. This can be both dynamic and static.||3. It cannot be dynamic.|
Q69. How many input parameters can be included in an unconnected lookup?
Ans. Any number of input parameters can be included in an unconnected lookup. However, the return value would only be one. For example, parameters like column 1, column 2, column 3, and column 4 can be provided in an unconnected lookup but there will be only one return value.
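The many-inputs, one-return-value behavior can be pictured as a function call in Python (an analogy only; the lookup table and names are made up):

```python
# An unconnected lookup is invoked like a function from an expression,
# e.g. :LKP.lkp_price(product, region), and returns exactly one value.
prices = {("book", "US"): 20, ("book", "IN"): 5}

def lkp_price(product, region):            # many input parameters...
    return prices.get((product, region))   # ...exactly one return value

print(lkp_price("book", "IN"))  # 5
```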
Q70. Mention the advantages of a partitioned session.
Ans. The advantages of a partitioned session in Informatica are:
- Increases the manageability, efficiency, and operation of the server
- Enables parallel processing of data within a single session
- Simplifies common administration tasks
Q71. Explain the different methods for the implementation of parallel processing?
Ans. The different partition algorithms for the implementation of parallel processing are:
- Pass-through Partitioning: In this partitioning, the Integration Service passes all rows from one partition point to the next partition point without redistributing them.
- Database partitioning: In this partitioning, the Integration Service queries the database system for table partition information and reads partitioned data from the corresponding nodes in the database.
- Round-Robin Partitioning: The Integration service distributes data evenly among all partitions.
- Key Range Partitioning: It enables you to specify one or more ports to form a compound partition key for a source or target. The Integration Service then passes data to each partition depending on the ranges you specify for each port.
- Hash Auto-Keys Partitioning: The hash function groups rows of data among partitions. With the hash auto-key partition, the Integration Service uses all grouped or sorted ports as a compound partition key.
- Hash User-Keys Partitioning: It groups rows of data among partitions based on a user-defined partition key.
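Three of these schemes can be sketched in Python (illustrative only; the row values, partition count, and key ranges are made up for the example):

```python
rows = list(range(10))
n_parts = 3

# Round-robin: rows dealt evenly across partitions in turn
# (here the row's position stands in for its arrival order).
round_robin = [[r for r in rows if r % n_parts == i] for i in range(n_parts)]

# Hash partitioning: the partition is chosen by hashing the key.
hashed = [[r for r in rows if hash(r) % n_parts == i] for i in range(n_parts)]

# Key range: the partition is chosen by which range the key falls in.
ranges = [(0, 3), (4, 6), (7, 9)]
key_range = [[r for r in rows if lo <= r <= hi] for lo, hi in ranges]

print(round_robin)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
print(key_range)    # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```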
Q72. Name some of the mapping development practices?
Ans: The following are some of the mapping development practices in Informatica:
- Source Qualifier
Q73. Explain the Event and what are its types?
Ans. The event can be any action or functionality that can be implemented in a workflow. There are two types of events:
- Event Wait Task: It waits until an event occurs. The specific event for which the Event Wait task should wait can be defined. Once the event is triggered, this task gets accomplished and assigns the next task in the workflow.
- Event Raise Task: It triggers a specific event in the workflow.
Q74. What is a Fact Table? What are its different types?
Ans. A Fact table is a centralized table in the star schema. It contains summarized numerical and historical data (facts). There are two types of columns in a Fact table:
- Columns that contain the measure called facts
- Columns that are foreign keys to the dimension tables
The different types of Fact Tables are:
- Additive: These facts can be summed up through all of the dimensions in the fact table.
- Semi-Additive: The facts can be summed up for only some of the dimensions in the fact table.
- Non-Additive: The facts that cannot be summed up for any of the dimensions present in the fact table.
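The additivity distinction can be illustrated in Python (made-up data; an account balance is the classic semi-additive fact):

```python
# Summing an additive fact (qty) is meaningful across any dimension,
# but summing a semi-additive fact (balance) across time double-counts.
facts = [
    {"day": 1, "account": "A", "qty": 2, "balance": 100},
    {"day": 2, "account": "A", "qty": 3, "balance": 100},
]

total_qty = sum(f["qty"] for f in facts)           # additive: 5 is meaningful
summed_balance = sum(f["balance"] for f in facts)  # misuse across time: 200
latest_balance = max(facts, key=lambda f: f["day"])["balance"]  # correct: 100

print(total_qty, summed_balance, latest_balance)  # 5 200 100
```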
Q75. Explain OLAP?
Ans: OLAP stands for Online Analytical Processing. It is used to analyze database information from multiple database systems at one time. It offers a multi-dimensional analysis of data for business decisions.
Q76. What are the different types of OLAP?
Ans: There are three types of OLAP techniques, namely:
- MOLAP (Multi-dimensional OLAP)
- ROLAP (Relational OLAP)
- HOLAP (Hybrid OLAP)
Q77. What are the advantages of using OLAP services?
Ans. The advantages of using OLAP services are as follows:
- It is a single platform for all types of analytical business needs.
- Offers consistency of information and calculations.
- It complies with regulations to safeguard sensitive data.
- Applies security restrictions on users and objects to protect data.
Q78. What are the different types of lookup transformation in Informatica?
Ans. The different types of lookup transformation in Informatica are:
- Relational or Flat File Lookup
- Pipeline Lookup
- Cached/Uncached Lookup
- Connected/Unconnected Lookup
Q79. Explain pre-session and post-session shell commands?
Ans. A Command task can be called as the pre- or post-session shell command for a Session task, from the Components tab of the session. It can run as a Pre-Session Command, a Post-Session Success Command, or a Post-Session Failure Command. The application of shell commands can be changed as per the use case.
Q80. Explain Junk Dimensions?
Ans. A Junk Dimension is a collection of some random codes or flags that do not belong in the fact table or any of the existing dimension tables. These attributes are unrelated to any particular dimension. The nature of these attributes is like random codes or flags, for example, non-generic comments or just yes/no.
Q81. What are the output files created by the Informatica server during the session running?
Ans. The following are the output files created by the Informatica server during the session running
- Informatica server log: This file is created for all status and error messages, with the default name pm.server.log. An error log for error messages is also created.
- Session log file: Session log files are created for each session. It writes information about sessions into log files such as initialization process, creation of SQL commands for reader and writer threads, etc.
- Session detail file: The Session detail file contains load statistics for each target in mapping. It includes information such as table name, number of rows written or rejected.
- Performance detail file: This file contains session performance details that help in identifying areas where performance can be improved.
- Reject file: It contains the rows of data that the writer does not write to targets.
- Control file: Control file and a target file are created when you run a session that uses the external loader. The control file has information about the target flat file such as data format and loading instructions, etc.
- Post session email: With the help of this file, you can automatically communicate information about a session run to designated recipients.
- Indicator file: Informatica server can be configured to create an indicator file while using the flat file as a target. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete or reject.
- Output file: If a session writes to a target file, a target file based on file properties entered in the session property sheet is created.
- Cache files: When the Informatica server creates a memory cache, it also creates cache files.
- Workflow log: It contains the high-level and detailed information of sessions, nodes, integration services, repository information, etc.
- Bad file: It contains the bad or rejected records.
Q82. Name the files created during the session runs.
Ans. The following files are created during the session runs:
- Errors log
- Session log
- Bad file
- Workflow log
Q83. What is the difference between Mapping and Mapplet?
Ans. The differences between Mapping and Mapplet are:
|Mapping||Mapplet|
|It is a collection of sources, targets, and transformations.||It is a collection of transformations only.|
|Mapping is developed with different transformations and is not reusable.||Mapplets are reusable components that can be used in other mappings and mapplets.|
|It focuses on what data moves to the target and what modifications are done to it.||Mapplet is developed for complex calculations used in multiple mappings.|
Q84. What is a Stored Procedure Transformation? What are its uses?
Ans. It is a passive transformation that populates and maintains databases. It helps you to use or call Stored procedures inside the Informatica Workflow. It can be used in connected as well as the unconnected mode.
The major uses of Stored Procedure Transformation are:
- Check the status of a target database before loading data into it.
- Identify if enough space exists in a database.
- Carry out a complex calculation.
- Drop and recreate indexes.
Q85. What is DTM Process?
Ans. The Data Transformation Manager Process (DTM) process is started by PowerCenter Integration Service to run a session. The main role of the DTM process is to create and manage threads that carry out the session tasks. The DTM process performs various tasks, including:
- Reading the session information
- Forming dynamic partitions
- Creating partition groups
- Validating code pages
- Running the processing threads
- Running post-session operations
- Sending post-session email
Q86. What is the difference between a fact table and a dimension table?
Ans. The differences between a fact table and a dimension table are:
|Fact Table||Dimension Table|
|A fact table contains summarized numerical and historical data (facts).||A dimension table is one of the companion tables to a fact table in the star schema. It contains dimensions of a fact.|
|A fact table contains more records and fewer attributes.||The dimension table contains more attributes and fewer records.|
|It is defined by data grain.||It is descriptive, complete, and wordy.|
|The primary key in the fact table is mapped as foreign keys to dimensions.||It has primary key columns that uniquely identify each dimension.|
|It can have data in numeric as well as textual format.||It contains attributes in textual format.|
|It does not contain hierarchy and grows vertically.||It contains hierarchy and grows horizontally. The dimension can also contain one or more hierarchical relationships.|
Informatica Scenario Based Interview Questions
What are the Scenario-Based questions?
In a Scenario-based interview, you will be first offered a scenario and then asked questions related to it. Your response to Informatica scenario-based interview questions will show your technical skills as well as your soft skills, such as problem-solving and critical thinking.
Now that you are just one step away from landing your dream job, you must prepare well for all the likely interview questions. Keep in mind that every interview round is different, especially when scenario-based Informatica questions are asked.
Q87. How do you load the last N rows from a flat-file into a target table in Informatica?
Ans. Considering that the source has data:
Now follow the below steps to load the last 3 rows into a target table
- Assign a row number to each record using an expression transformation. Name the row number port N_calculate.
- Create a dummy output port (N_dummy) in the same expression transformation and assign 1 to it.
- This will return 1 for each row.
Outputs in Expression Transformation:
|col, N_calculate, N_dummy|
|ABC, 1, 1|
|DEF, 2, 1|
|GHI, 3, 1|
|JKL, 4, 1|
|MNO, 5, 1|
- Pass the expression transformation output to an aggregator transformation
- Do not specify any group by port
- Create an N_total_records output port in the aggregator
- Assign the N_calculate port to it
- By default, the aggregator will return the last row
- Its output will contain the N_dummy port, holding the value 1, and the N_total_records port, holding the total number of records in the source
Outputs in Aggregator Transformation:
|N_dummy, N_total_records|
|1, 5|
- Now pass the expression and aggregator transformation outputs to a joiner transformation
- Join on the dummy port
- Check the sorted input property in the joiner transformation to connect both the expression and aggregator transformations
- The join condition will be N_dummy (port from aggregator transformation) = N_dummy (port from expression transformation)
Outputs in Joiner Transformation:
|col, N_calculate, N_total_records|
|ABC, 1, 5|
|DEF, 2, 5|
|GHI, 3, 5|
|JKL, 4, 5|
|MNO, 5, 5|
- Pass the joiner transformation to filter transformation
- Mention the filter condition as N_total_records (port from aggregator)-N_calculate(port from expression) <=2
- Thus, the filter condition in the filter transformation will be N_total_records – N_calculate <=2
Outputs in Filter Transformation:
|col, N_calculate, N_total_records|
|GHI, 3, 5|
|JKL, 4, 5|
|MNO, 5, 5|
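The whole walkthrough above can be sketched in Python (an illustrative trace of the logic, not Informatica syntax; the sample data matches the tables shown):

```python
# Keep the last N rows: number the rows, count them, join on a dummy
# key, then keep rows where total - rownum <= N - 1.
rows = ["ABC", "DEF", "GHI", "JKL", "MNO"]
N = 3

# Expression transformation: row number (N_calculate) + dummy port
numbered = [(col, i + 1, 1) for i, col in enumerate(rows)]

# Aggregator: returns the last row by default, giving the total count
n_total = numbered[-1][1]

# Joiner on the dummy port, then Filter: total - rownum <= N - 1
last_n = [col for col, rownum, _ in numbered if n_total - rownum <= N - 1]
print(last_n)  # ['GHI', 'JKL', 'MNO']
```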
Q88. Solve the below situations if data has duplicate rows.
Situation – Give steps to load all unique names in one table and duplicate names in another table.
Solution 1 – We want solution tables as:
Amazon and Flipkart in one table
Walmart, Walmart, Walmart, Snapdeal, and Snapdeal in another table
Follow the below steps
- Sort the name data by using a sorter transformation
- Pass the sorted output to an expression transformation
- Create a dummy output port N_dummy and assign 1 to it
- For each row, the N_dummy output port will return 1
Expression Transformation Output
- Pass the expression transformation output to an aggregator transformation
- Check ‘group by’ on the name port
- Create an output port N_calculate_of_each_name in the aggregator and set its expression to COUNT(name)
Aggregator Transformation Output
- Pass the expression and aggregator transformation output to joiner transformation
- Join the name ports
- Review the property sorted input to connect both transformations to joiner transformation
Joiner Transformation Output
|name, N_dummy, N_calculate_of_each_name|
|Amazon, 1, 1|
|Flipkart, 1, 1|
|Snapdeal, 1, 2|
|Snapdeal, 1, 2|
|Walmart, 1, 3|
|Walmart, 1, 3|
|Walmart, 1, 3|
- Move the joiner output to a router transformation
- Create one group and specify its condition as N_dummy = N_calculate_of_each_name
- Connect this group to the unique-names table
- Connect the default output group to the duplicates table
- You will get separate tables for both
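The router logic above can be simulated in Python (an illustrative sketch in which the counts play the role of N_calculate_of_each_name):

```python
from collections import Counter

# Names whose total count equals 1 (N_dummy == N_calculate_of_each_name)
# go to the unique table; everything else goes to the duplicates table.
def split_unique_duplicates(names):
    counts = Counter(names)  # aggregator with group by on name
    unique = [n for n in names if counts[n] == 1]
    duplicates = [n for n in names if counts[n] > 1]
    return unique, duplicates

names = ["Amazon", "Walmart", "Walmart", "Walmart",
         "Snapdeal", "Snapdeal", "Flipkart"]
unique, duplicates = split_unique_duplicates(names)
print(unique)      # ['Amazon', 'Flipkart']
print(duplicates)  # ['Walmart', 'Walmart', 'Walmart', 'Snapdeal', 'Snapdeal']
```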
Q89. Situation 2 – Solve the below situations if data has duplicate rows.
Situation – Load each name once into one table and the duplicate occurrences into another table.
Solution 2 – We want the output as:
The below steps will give the desired solution:
- Sort the name data by using a sorter transformation
- Pass the sorted name output to an expression transformation
- Create a variable port Z_curr_name and assign the name port to it
- Create a variable port Z_calculate
- Write IIF(Z_curr_name = Z_prev_name, Z_calculate + 1, 1) in the expression editor
- Create another variable port Z_prev_name and assign the name port to it
- Create an output port N_calculate and assign Z_calculate to it
|Expression Transformation Name port|
|Z_calculate=IIF(Z_curr_name=Z_prev_name, Z_calculate+1, 1)|
|Expression Transformation Output|
- Route the expression transformation output to a router transformation
- Create a group and specify its condition as N_calculate = 1
- Connect this group to one table
- Connect the default group output to another table
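The running-count approach can be sketched as follows, a Python simulation of the sorter plus variable-port expression using the names from the example above:

```python
# Keep the first occurrence of each name (running count == 1) and route
# repeats elsewhere — what the Z_curr_name/Z_prev_name ports compute.
def split_first_and_repeats(names):
    running = {}
    firsts, repeats = [], []
    for name in sorted(names):                    # sorter transformation
        running[name] = running.get(name, 0) + 1  # Z_calculate
        (firsts if running[name] == 1 else repeats).append(name)  # router
    return firsts, repeats

names = ["Amazon", "Walmart", "Walmart", "Walmart",
         "Snapdeal", "Snapdeal", "Flipkart"]
firsts, repeats = split_first_and_repeats(names)
print(firsts)   # ['Amazon', 'Flipkart', 'Snapdeal', 'Walmart']
print(repeats)  # ['Snapdeal', 'Walmart', 'Walmart']
```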
Q90. In Informatica, how do you use Normalizer Transformation for the below-mentioned condition?
|Quarter 1 Purchase||Quarter 2 Purchase||Quarter 3 Purchase||Quarter 4 Purchase|
Ans. If you want to transform a single row into multiple rows, Normalizer Transformation will help. Also, it is used for converting multiple rows into a single row to make data look organized. As per the above scenario-based Informatica interview question, we want the solution to look as:
Follow the steps to achieve the desired solution by using normalizer transformation:
Step 1 –
- Create a source table “purchase_source” and a target table “purchase_target” in the database
- Import both tables into Informatica
- Create a mapping with “purchase_source” as the source and “purchase_target” as the target
- Create a new transformation from the transformation menu
- Enter the name “xyz_purchase”
- Select the create option
- Select done (the transformation is now created)
Step 2 –
- Double click on the normalizer transformation
- Go to the normalizer tab and select it
- Click the icon on the tab; this will create two columns
- Enter the names of the columns
- Set the number of occurrences to 4 for purchase and 0 for state name
- Select OK
- 4 purchase columns will be generated and appear in the transformation
Step 3 –
- In the mapping, link the four quarter columns in the source qualifier to the four purchase columns of the normalizer
- Link the state name column in the source qualifier to the state name column of the normalizer
- Link the state_name and purchase columns of the normalizer to the target table
- Create session and workflow
- Save the mapping and execute it
- You will get the desired rearranged output
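What the normalizer does to each row can be pictured with a small Python sketch (the column names and the sample row are hypothetical):

```python
# Pivot one source row with four quarter columns into four target rows,
# as a normalizer with occurs = 4 on the purchase column would.
def normalize(row):
    return [(row["state_name"], quarter, row[f"q{quarter}_purchase"])
            for quarter in (1, 2, 3, 4)]

src = {"state_name": "Delhi", "q1_purchase": 10, "q2_purchase": 20,
       "q3_purchase": 30, "q4_purchase": 40}
print(normalize(src))
# [('Delhi', 1, 10), ('Delhi', 2, 20), ('Delhi', 3, 30), ('Delhi', 4, 40)]
```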
Q91. What to do when you get the below error?
AA_10000 Normalizer Transformation: Initialization Error: [Cannot match AASid with BBTid.]
Ans. Follow the below process –
- Remove all the unconnected input ports to the normalizer transformation
- If OCCURS is present, check that the number of input ports is equal to the number of OCCURS
Q92. What are the steps to create, design, and implement SCD Type 1 mapping in Informatica using the ETL tool?
Ans. The SCD Type 1 mapping helps in the situation when you don’t want to store historical data in the Dimension table as this method overwrites the previous data with the latest data.
The process to be followed:
- Identify new records
- Insert it into the dimension table
- Identify the changed record
- Update it in the dimension table
If the source table looks like:
CREATE TABLE Students (
Student_Id NUMBER,
Name VARCHAR2(60),
Location VARCHAR2(60)
);
Now we require using the SCD Type 1 method to load the data present in the source table into the student dimension table.
CREATE TABLE Students_Dim (
Stud_Key NUMBER,
Student_Id NUMBER,
Name VARCHAR2(60),
Location VARCHAR2(60)
);
Follow the steps to generate SCD Type 1 mapping in Informatica
- In the database, create source and dimension tables
- Create or import source definition in the mapping designer tool’s source analyzer
- Import the Target Definition from Warehouse designer or Target designer
- Create a new mapping from the mapping designer tab
- Drag and drop the source
- Select Create option from toolbar’s Transformation section
- Select Lookup Transformation
- Enter the name and click on create
- From the window, select Student dimension table and click OK
- Edit the lkp transformation and add a new port In_Student_Id from the properties tab
- From the lkp transformation’s condition tab, enter the lookup condition Student_Id = IN_Student_Id
- Click OK
- Connect the source qualifier transformation’s Student_Id port to the lkp transformation’s In_Student_Id port
- Create an expression transformation with the input ports Stud_Key, Name, Location, Src_Name, and Src_Location
- Create two output ports, New_Flag and Changed_Flag
- In the expression transformation’s output ports, enter the below-mentioned expressions
- New_Flag = IIF(ISNULL(Stud_Key),1,0)
Changed_Flag = IIF(NOT ISNULL(Stud_Key)
AND (Name != Src_Name
OR Location != Src_Location),
1, 0 )
- Connect the lkp transformation ports to the expression transformation ports
- Also, connect the source qualifier transformation ports to the expression transformation ports
- Create a filter transformation and move the source qualifier transformation ports into it
- Edit the filter transformation and set the filter condition to New_Flag = 1
- Press OK
- Create an update strategy transformation
- Connect all the filter transformation ports to it except the New_Flag port
- From the update strategy’s properties tab, enter DD_INSERT as the update strategy expression
- Drag target definition to mapping
- Connect relevant ports to target definition from update strategy
- Create a sequence generator transformation
- Connect NEXTVAL port to target surrogate key port (stud_key)
- Create a different filter transformation
- In the filter transformation, drag lkp transformation’s port (Stud_Key), source qualifier transformation (Name, Location), expression transformation (changed_flag) ports
- Go to the properties tab to edit the filter transformation
- Mention the filter condition as Changed_Flag=1
- Click OK
- Create another update strategy transformation
- Connect the ports of this filter transformation to the update strategy
- From the update strategy’s properties tab, enter DD_UPDATE as the update strategy expression
- In this mapping, drag target definition
- From the update strategy, connect all the appropriate ports to target definition
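The overall SCD Type 1 behaviour (insert new keys, overwrite changed rows) can be summarised in Python; this is a logic sketch with hypothetical student data, not the mapping itself:

```python
# SCD Type 1: no history — new rows are inserted, changed rows overwritten.
def scd_type1(dimension, source):
    # dimension and source map student_id -> (name, location)
    for sid, attrs in source.items():
        if sid not in dimension:       # lookup returns NULL -> New_Flag = 1
            dimension[sid] = attrs     # DD_INSERT branch
        elif dimension[sid] != attrs:  # Changed_Flag = 1
            dimension[sid] = attrs     # DD_UPDATE branch overwrites
    return dimension

dim = {1: ("Asha", "Delhi")}
src = {1: ("Asha", "Mumbai"), 2: ("Ravi", "Pune")}
print(scd_type1(dim, src))  # {1: ('Asha', 'Mumbai'), 2: ('Ravi', 'Pune')}
```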
Q93. Give steps to use PMCMD Utility Command.
Ans. There are 4 different built-in command-line programs: infacmd, infasetup, pmcmd, and pmrep.
The pmcmd command helps with the following functions:
- Start workflows
- Schedule workflows
- Start a workflow from a specific task
- Stop and abort workflows and sessions
Below are the pmcmd commands for these tasks:
Starting the workflow
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
Scheduling the workflow
pmcmd scheduleworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
Start a workflow from a specific task
pmcmd starttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name -startfrom task-name
Aborting the workflow
pmcmd abortworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name
Aborting a task
pmcmd aborttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
Q94. How to configure the target load order in Informatica?
Ans. Follow the below steps:
- Create mapping containing multiple target load order groups in the PowerCenter designer
- From the toolbar, click on the Mappings and then click on Target Load Plan
- You will see a pop up that will have a list of source qualifier transformations in the mapping. Also, it will have the target from which it receives data from each source qualifier
- From the list, pick a source qualifier
- Using the Up and Down button, move source qualifier within load order
- Click OK
- You will get the desired output
Q95. Using the incremental aggregation in the below table, what will be the output in the next table?
|Product ID||Bill Number||Cost||Data|
Ans. When the first load is finished the table will become:
|Product ID||Bill Number||Load_Key||Data|
Q96. What is the syntax of the INITCAP function?
Ans. This function is used to capitalize the first character of each word in the string and makes all other characters in lowercase.
Below is the syntax:
INITCAP(string_value)
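For intuition, here is a rough Python equivalent of the function (assuming words are separated by single spaces):

```python
# Approximate INITCAP: first letter of each word upper-cased, rest lower-cased.
def initcap(value):
    return " ".join(word[:1].upper() + word[1:].lower()
                    for word in value.split(" "))

print(initcap("hello WORLD"))  # Hello World
```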
Q97. How will you generate sequence numbers using expression transformation?
Ans. We can generate sequence numbers using expression transformation by following the below steps:
- Create a variable port and increment it by 1 for each row
- Assign the variable port to an output port. The two ports in the expression transformation will be: V_count = V_count + 1 and O_count = V_count
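In Python terms, the variable port behaves like a counter carried across rows (an illustrative simulation):

```python
# V_count persists between rows; O_count exposes its value for each row.
def number_rows(rows):
    v_count = 0
    out = []
    for row in rows:
        v_count += 1                # V_count = V_count + 1
        out.append((row, v_count))  # O_count = V_count
    return out

print(number_rows(["a", "b", "c"]))  # [('a', 1), ('b', 2), ('c', 3)]
```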
Q98. How will you load the first 4 rows from a flat-file into a target?
Ans. The first 4 rows can be loaded from a flat-file into a target using the following steps:
- Allocate row numbers to each record.
- Create the row numbers by using the expression transformation or by using the sequence generator transformation.
- Pass the output to filter transformation and specify the filter condition as O_count <=4
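The same idea in a short Python sketch: number the rows, then apply the O_count <= 4 filter (the function name is illustrative):

```python
# Keep only rows whose assigned number satisfies the filter O_count <= n.
def first_n_rows(rows, n=4):
    return [row for o_count, row in enumerate(rows, start=1) if o_count <= n]

print(first_n_rows(["r1", "r2", "r3", "r4", "r5", "r6"]))  # ['r1', 'r2', 'r3', 'r4']
```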
Q99. What is the difference between Source Qualifier and Filter Transformation?
Ans. The differences between Source Qualifier and Filter Transformation are:
|Source Qualifier Transformation||Filter Transformation|
|1. It filters rows while reading the data from a source.||1. Filters rows from within mapping.|
|2. It can filter rows only from relational sources.||2. This can filter rows from any type of source system at the mapping level.|
|3. Source Qualifier limits the row sets extracted from a source.||3. It limits the row set sent to a target.|
|4. It reduces the number of rows used in mapping thereby enhancing performance.||4. To maximize performance, Filter Transformation is added close to the source to filter out the unwanted data early.|
|5. Filter condition uses the standard SQL to run in the database.||5. Filter Transformation defines a condition using any statement or transformation function that returns either a TRUE or FALSE value.|
Q100. Create a mapping to load the cumulative sum of salaries of employees into the target table. Consider the following employee’s data as a source.
The target table data should look like the following:
employee_id, salary, cumulative_sum
1, 2000, 2000
2, 3000, 5000
3, 4000, 9000
4, 5000, 14000
Ans. The following steps need to be followed to get the desired output:
- Connect the source Qualifier to the expression transformation
- Create a variable port V_cum_sal in the expression transformation
- Write V_cum_sal = V_cum_sal + salary in the expression editor
- Create an output port O_cum_sal and assign V_cum_sal to it
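The running total these ports compute can be checked with a few lines of Python, using the data from the target table above:

```python
# V_cum_sal accumulates across rows; O_cum_sal outputs it for each row.
def cumulative_sum(rows):
    v_cum_sal = 0
    out = []
    for employee_id, salary in rows:
        v_cum_sal += salary  # V_cum_sal = V_cum_sal + salary
        out.append((employee_id, salary, v_cum_sal))  # O_cum_sal
    return out

rows = [(1, 2000), (2, 3000), (3, 4000), (4, 5000)]
print(cumulative_sum(rows))
# [(1, 2000, 2000), (2, 3000, 5000), (3, 4000, 9000), (4, 5000, 14000)]
```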
Q101. Create a mapping to find the sum of salaries of all employees. The sum should repeat for all the rows. Consider the employee’s data provided in Q100 as a source.
The output should look like:
employee_id, salary, salary_sum
1, 2000, 14000
2, 3000, 14000
3, 4000, 14000
4, 5000, 14000
Ans. The following steps should be followed to get the desired output:
- Connect the source qualifier to the expression transformation
- Create a dummy port in the expression transformation and assign the value 1 to it
- Provide the output of the expression transformation to the aggregator transformation
- Create a new port O_sum_salary in the aggregator and write SUM(salary) in the expression editor
- Provide the outputs of the expression transformation and aggregator transformation to the joiner transformation
- Join on the DUMMY port
- Check the property sorted input and connect the expression and aggregator to the joiner transformation
- Provide the output of the joiner to the target table
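The aggregate-then-join pattern reduces to this in Python (an illustrative simulation on the same data):

```python
# Compute SUM(salary) once (aggregator), then attach it to every row
# (joiner on the constant dummy port).
def append_total(rows):
    total = sum(salary for _, salary in rows)
    return [(employee_id, salary, total) for employee_id, salary in rows]

rows = [(1, 2000), (2, 3000), (3, 4000), (4, 5000)]
print(append_total(rows))
# [(1, 2000, 14000), (2, 3000, 14000), (3, 4000, 14000), (4, 5000, 14000)]
```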
Q102. Create a mapping to get the previous row salary for the current row. In case, there is no previous row for the current row, then the previous row salary should be displayed as null.
The output should look like:
employee_id, salary, pre_row_salary
1, 2000, Null
2, 3000, 2000
3, 4000, 3000
4, 5000, 4000
Ans. The following steps will be followed to get the desired output:
- Connect the source Qualifier to the expression transformation.
- Create a variable port V_count in the expression transformation.
- Increment it by 1 for each row.
- Create V_salary variable port and assign IIF(V_count=1,NULL,V_prev_salary) to it.
- Create variable port V_prev_salary and assign Salary.
- Create output port O_prev_salary and assign V_salary.
- Connect the expression transformation to the target ports.
The ports in the expression transformation will be:
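The port interplay amounts to shifting the salary column down by one row, as this Python simulation of the evaluation order shows:

```python
# Emit the previous row's salary (None standing in for NULL on the first row).
def previous_salary(rows):
    v_prev_salary = None
    out = []
    for employee_id, salary in rows:
        out.append((employee_id, salary, v_prev_salary))  # O_prev_salary
        v_prev_salary = salary  # updated after the output is produced
    return out

rows = [(1, 2000), (2, 3000), (3, 4000), (4, 5000)]
print(previous_salary(rows))
# [(1, 2000, None), (2, 3000, 2000), (3, 4000, 3000), (4, 5000, 4000)]
```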
Q103. What is the name of the scenario in which the Informatica server rejects files?
Ans. The Informatica server rejects files when rows are rejected by the update strategy transformation. In such a rare scenario, the database that holds the information and data also gets disrupted.
Q104. What will happen in the following scenario:
If the SELECT list COLUMNS in the Custom override SQL Query and the OUTPUT PORTS order in SQ transformation do not match?
Ans. Such a scenario where the SELECT list COLUMNS in the Custom override SQL Query and the OUTPUT PORTS order in SQ transformation do not match – may result in session failure.
Q105. What can be done to enhance the performance of the joiner condition?
Ans. The joiner condition performance can be enhanced by the following:
- Sort the data before applying to join.
- If the data is unsorted, then consider the source with fewer rows as the master source.
- Perform joins in a database.
- If joins cannot be performed for some tables, then the user can create a stored procedure and then join the tables in the database.
Q106. How do you load alternate records into different tables through mapping flow?
Ans. To load alternate records into different tables through mapping flow, just add a sequence number to the records and then divide the record number by 2. If it can be divided, then move it to one target. If not, then move it to the other target.
It involves the following steps:
- Drag the source and connect to an expression transformation.
- Add the next value of a sequence generator to the expression transformation.
- Make two ports, Odd and Even in the expression transformation.
- Write the expression below
v_count (variable port) = v_count+1
o_count (output port) = v_count
- Connect a router transformation and drag the ports (product, o_count) from the expression into the router transformation
- Make two groups in the router
- Give the conditions MOD(o_count, 2) = 1 for one group and MOD(o_count, 2) = 0 for the other
- Send the two groups to different targets
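The odd/even routing can be verified with a short Python simulation (the row values are illustrative):

```python
# Number each row, then route odd counts to one target and even counts to
# another — the two router groups described above.
def split_alternate(rows):
    odd, even = [], []
    for o_count, row in enumerate(rows, start=1):
        (odd if o_count % 2 == 1 else even).append(row)
    return odd, even

rows = ["p1", "p2", "p3", "p4", "p5"]
print(split_alternate(rows))  # (['p1', 'p3', 'p5'], ['p2', 'p4'])
```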
Q107. How do you implement Security Measures using a Repository manager?
Ans. There are 3 ways to implement security measures:
- Folder Permission within owners, groups, and users.
- Locking (Read, Write, Retrieve, Save, and Execute).
- Repository Privileges
Q108. How can you store previous session logs in Informatica?
Ans. The following steps will enable you to store previous session logs in Informatica:
- Go to Session Properties > Config Object > Log Options
- Select the properties:
Save session log by –> Session Runs
Save session log for these runs –> Set the number of log files you want to keep (the default is 0)
- If you want to save all of the log files created by every run, select the option Save session log for these runs –> Session TimeStamp
Q109. Mention the performance considerations while working with Aggregator Transformation?
Ans. The following are the performance considerations while working with Aggregator Transformation:
- To reduce unnecessary aggregation, filter the unnecessary data before aggregating.
- To minimize the size of the data cache, connect only the needed input/output ports to the succeeding transformations.
- Use Sorted input to minimize the amount of data cached to enhance the session performance.