Tip 1: Focus on Data Structures.
Tip 2: Focus on SQL and coding.
Tip 3: Focus on System Design.
Tip 4: Coding skills should be your top priority for the interview.
Tip 5: For System Design, practice designing a complete solution end to end.
Round 1: Preliminary Round (Screening Round): Telephonic Round
This round consisted of a detailed explanation of my previous projects: what I worked on with Mixpanel, Kafka, ETL concepts, Datahub Spark lineage, Spark, the data model I prepared during experimentation (A/B testing), and Presto architecture. The round went well and lasted 45 minutes (telephonic). They also asked why I want to work for Walmart.


You are given an array 'arr' of 'n' integers, where each element represents the height of a bar. Find how much rainwater can be trapped between the bars. The width of each bar is the same and is equal to 1.
Input: ‘n’ = 6, ‘arr’ = [3, 0, 0, 2, 0, 4].
Output: 10
Explanation: With heights [3, 0, 0, 2, 0, 4], the water trapped above each index is [0, 3, 3, 1, 3, 0], which sums to 10.

You don't need to print anything. It has already been taken care of. Just implement the given function.
Approach 1 (Brute Force): For every array element, find the highest bar on its left and the highest bar on its right, and take the smaller of the two heights. The difference between this smaller height and the height of the current element is the amount of water that can be stored above that element; summing this over all elements gives the answer. A sketch of this approach follows.
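A minimal Python sketch of this brute-force idea (O(n^2) time; the function name is my own):

def trapped_water(arr: list[int]) -> int:
    # For each bar: water above it = min(highest bar to its left, highest to its right) - its height.
    total = 0
    for i in range(len(arr)):
        left_max = max(arr[: i + 1])   # highest bar at or to the left of i
        right_max = max(arr[i:])       # highest bar at or to the right of i
        total += min(left_max, right_max) - arr[i]
    return total

print(trapped_water([3, 0, 0, 2, 0, 4]))  # prints 10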
I got a call from HR that my screening round was cleared and that I was shortlisted for a technical discussion. This round lasted about 1 hour 30 minutes and was taken by a Senior Data Engineer at Walmart.
This interview focused on medium-level data structures and algorithms questions, hard-level SQL questions, Python coding questions, Big Data concepts, Spark, Kubernetes, Airflow architecture questions, cloud computing concepts, SDLC, and Agile methodology (the Scrum framework, at a high level).
Some questions covered DevOps strategy (basic level), CI/CD pipelines, NoSQL databases, AWS services-based scenario questions, and medium-level data structure questions (arrays and stacks, linked lists and trees).
Two DSA questions were asked; I remember some of them.
SQL Interview Questions
1. The Employee table holds all employees, including their managers. Every employee has an Id, and there is also a column for the manager's Id. Write a SQL query that finds the managers with at least five direct reports.
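One way to write this, assuming the columns are named Id, Name, and ManagerId (a self-join counting direct reports per manager):

SELECT m.Name
FROM Employee e
JOIN Employee m ON e.ManagerId = m.Id
GROUP BY m.Id, m.Name
HAVING COUNT(*) >= 5;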
2. How do you find the nth highest salary for each department, with and without a window function?
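A sketch of both approaches, reusing the Employee columns from question 3 below and taking n = 2 as an example:

-- With a window function:
SELECT empDeptId, empSalary
FROM (
    SELECT empDeptId, empSalary,
           DENSE_RANK() OVER (PARTITION BY empDeptId ORDER BY empSalary DESC) AS rnk
    FROM Employee
) ranked
WHERE rnk = 2;

-- Without a window function: the nth highest salary has exactly n - 1 distinct salaries above it.
SELECT DISTINCT e1.empDeptId, e1.empSalary
FROM Employee e1
WHERE (SELECT COUNT(DISTINCT e2.empSalary)
       FROM Employee e2
       WHERE e2.empDeptId = e1.empDeptId
         AND e2.empSalary > e1.empSalary) = 1;  -- n - 1; here n = 2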
3. Given an Employee table with attributes empId, empSalary, and empDeptId, and a Department table with attributes deptId, deptName, and CourseOffered, I was asked to write a SQL query (on a notepad) to find the employee with the highest salary in each department using window functions. I used the DENSE_RANK window function and was then asked to explain why DENSE_RANK instead of the RANK function; the answer and query are given below.
Some questions on Spark optimisation and Hadoop concepts, such as:
i) How Airflow works on Kubernetes using the pod concept.
ii) How the Airflow scheduler works with the worker machines and the webserver.
iii) The difference between a Deployment and a StatefulSet in Kubernetes, and how Kubernetes manages fault tolerance.
iv) You have a Spark job that is taking longer than expected to complete. What steps would you take to identify and troubleshoot performance bottlenecks?
v) You have a Spark cluster with limited resources. How would you allocate resources and configure the cluster for optimal performance?
vi) He asked me to write code for uploading Parquet files to an S3 bucket using the boto3 library (as I had worked on AWS). I wrote it in Python with boto3 on a notepad; a sketch follows this list.
vii) How Airflow stores logs in an S3 bucket, and how Airflow's backend (metadata) database plays an essential role.
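A minimal boto3 sketch of the Parquet upload (the bucket, key, and file names here are placeholders, not the ones from the interview):

import boto3

def upload_parquet(local_path: str, bucket: str, key: str) -> None:
    # Upload a local Parquet file to S3; upload_file handles multipart uploads for large files.
    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket, key)

upload_parquet("events.parquet", "my-data-bucket", "raw/events/events.parquet")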
There were other questions on Spark optimisation, Kubernetes, Airflow, and Big Data concepts, along with a project-based explanation.
Coming back to SQL question 3 (the highest salary in each department): why use DENSE_RANK instead of the RANK function?
Using the DENSE_RANK window function instead of RANK in this scenario is a good choice when you want to handle cases where multiple employees within the same department have the same salary. The DENSE_RANK function assigns the same rank to identical salary values and then continues with the next consecutive rank, without leaving gaps.
WITH RankedEmployees AS (
    SELECT
        empId,
        empSalary,
        empDeptId,
        DENSE_RANK() OVER (PARTITION BY empDeptId ORDER BY empSalary DESC) AS salaryRank
    FROM Employee
)
SELECT
    empId,
    empSalary,
    empDeptId
FROM RankedEmployees
WHERE salaryRank = 1;
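To see the difference on a hypothetical tie: if the top two salaries in a department are both 90,000 and the next is 80,000, RANK assigns 1, 1, 3 while DENSE_RANK assigns 1, 1, 2. For picking the highest salary (salaryRank = 1) both return the same rows, but DENSE_RANK keeps the ranks consecutive, which matters for queries like the nth highest salary.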



This problem is a variation of the classic Coin Change problem: instead of finding the total number of possible solutions, we need to find the solution with the minimum number of coins.
The minimum number of coins for a value V can be computed using the recurrence below.
If V == 0: 0 coins are required.
If V > 0: minCoins(coins[0..m-1], V) = min{ 1 + minCoins(V - coins[i]) } over all 0 <= i <= m-1 with coins[i] <= V.
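A bottom-up Python sketch of this recurrence (a standard dynamic-programming formulation; returns -1 when V cannot be formed):

def min_coins(coins: list[int], V: int) -> int:
    INF = float("inf")
    dp = [0] + [INF] * V              # dp[v] = minimum coins needed to make value v
    for v in range(1, V + 1):
        for c in coins:
            if c <= v and dp[v - c] + 1 < dp[v]:
                dp[v] = dp[v - c] + 1
    return -1 if dp[V] == INF else dp[V]

print(min_coins([9, 6, 5, 1], 11))  # prints 2 (6 + 5)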
Given a linked list and a value x, partition the list around x such that all nodes with values less than x come before all nodes with values greater than or equal to x. If x is contained in the list, its nodes only need to come after the elements less than x; the partition element x can appear anywhere in the "right partition" and does not need to sit between the left and right partitions. (Learn)
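A stable-partition Python sketch (the node class and names are my own): build two sublists with dummy heads and splice them together.

class ListNode:
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def partition(head: ListNode, x: int) -> ListNode:
    less_head = less = ListNode(0)    # dummy head for nodes < x
    geq_head = geq = ListNode(0)      # dummy head for nodes >= x
    while head:
        if head.val < x:
            less.next = head
            less = less.next
        else:
            geq.next = head
            geq = geq.next
        head = head.next
    geq.next = None                   # terminate the right partition
    less.next = geq_head.next         # splice: all < x, then all >= x
    return less_head.next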
I got a call from HR that my first round was cleared and that I was shortlisted for the next technical discussion. This round lasted about 1 hour 45 minutes and was taken by a Staff Data Engineer at Walmart.
The interview started with system design. I was asked to design the Mixpanel system (an event-driven system) because I had used Mixpanel at Meesho. I opened draw.io and started sketching how Mixpanel works and how events are captured from different clients such as the Android app, the web app, and the iOS app.
During the system design discussion, some questions were asked:
i) How does the load balancer work in Mixpanel?
ii) How are requests handled? Suppose you open the Presto URL in Chrome: the request goes to DNS for IP address resolution, then to the load balancer, then to the target gateway, and finally to the Presto Coordinator. I was asked to explain each concept (the full answer is written out below).
iii) He asked me to write a custom API in Spring Boot, writing only the service and controller classes in Java.
iv) Some questions on Spark coding: he asked me to write code to read data from a Delta Lake (S3 bucket) and run an upsert command, updating rows that already exist based on the primary key and inserting rows that do not. I wrote the code using the DataFrame API; a sketch follows this list.
v) Questions on Spark optimisation such as skewed joins, broadcast joins, CBO, and repartition vs. coalesce.
vi) Questions on Spark Tungsten and the Catalyst optimiser.
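A minimal PySpark sketch of such an upsert using the Delta Lake MERGE API (the S3 paths and the "id" primary-key column are illustrative, not the ones from the interview):

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incoming batch to upsert; path and schema are placeholders.
updates_df = spark.read.parquet("s3://my-bucket/incoming/")

# Existing Delta table stored on S3.
target = DeltaTable.forPath(spark, "s3://my-bucket/delta/events/")

# MERGE: update rows that match on the primary key, insert the rest.
(target.alias("t")
    .merge(updates_df.alias("u"), "t.id = u.id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())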
vii) Then the questions moved on to Java and advanced Java.
Questions on the Java Collections framework, such as interfaces, Map, LinkedList design, and garbage collection.
Java Coding Questions & OOP Concepts
i) He asked me to write Java code to trigger garbage collection (the GC collector thread).
ii) He asked me to explain the concept of multithreading, and then to write code for synchronisation using the synchronized keyword.
iii) Some questions on serialisation vs. deserialisation. (Learn)
iv) Explain the use case of the transient keyword in Java.
Questions on System Design Concepts & Synchronisation
i) What is a semaphore variable? How do you prevent deadlock in a system? (Learn)
ii) He asked me to complete the semaphore code to achieve synchronisation, so I wrote a semaphore in Java:
import java.util.LinkedList;
import java.util.Queue;

// Binary semaphore sketch: P() acquires or blocks the caller, V() releases or wakes a waiter.
class SemaphoreInterviewRoundTechnical {
    public enum Value { Zero, One }

    // Processes blocked on this semaphore, woken in FIFO order.
    private final Queue<Process> q = new LinkedList<>();
    private Value value = Value.One;

    // P (wait): take the semaphore if it is free, otherwise block the calling process.
    public void P(Process p) {
        if (value == Value.One) {
            value = Value.Zero;
        } else {
            q.add(p);
            p.sleep();
        }
    }

    // V (signal): hand the semaphore to a waiting process if any, otherwise mark it free.
    public void V() {
        if (q.isEmpty()) {
            value = Value.One;
        } else {
            Process p = q.remove();
            p.wakeup();
        }
    }
}

// Minimal stand-in for a process control block with block/unblock hooks.
class Process {
    public void sleep() { /* block this process */ }
    public void wakeup() { /* move this process back to the ready queue */ }
}
This is the code I submitted.
The last questions were general ones on ETL concepts & data warehouse concepts:
i) What is the difference between a snowflake schema and a star schema?
ii) How would you design a data warehouse from scratch given new requirements? I explained the Snowflake & Databricks setup that I built at Morgan Stanley from the beginning.
iii) Normalisation concepts, and what is SCD Type 2, with an example (a SQL sketch follows this list).
iv) A question on Presto: how to onboard a Delta Lake catalog to Presto.
v) He asked me about Agile. I explained Agile with the Scrum framework, covering sprints, the Jira board, and the iterative approach in detail, and why Agile is preferred over the waterfall model.
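For SCD Type 2, an illustrative SQL sketch on a hypothetical customer dimension (table and column names are my own): expire the current row, then insert the new version, so the full change history is preserved.

-- Close out the current version of the changed row.
UPDATE dim_customer
SET end_date = CURRENT_DATE,
    is_current = FALSE
WHERE customer_id = 42
  AND is_current = TRUE;

-- Insert the new version as the current row.
INSERT INTO dim_customer (customer_id, address, start_date, end_date, is_current)
VALUES (42, 'New Address', CURRENT_DATE, NULL, TRUE);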
Coming back to question ii, how requests are handled when you open the Presto URL in Chrome: the request goes to DNS for IP address resolution, then to the load balancer, then to the target gateway, and finally to the Presto Coordinator. Each step in detail:
DNS Resolution:
When you type the Presto URL in Chrome, the browser needs to resolve the domain name to an IP address.
The Domain Name System (DNS) is responsible for this resolution process. It translates human-readable domain names like "presto.example.com" into IP addresses like "192.0.2.1".
Your browser sends a DNS query to a DNS resolver, which may be provided by your ISP or a public DNS service like Google DNS or Cloudflare DNS.
The DNS resolver looks up the IP address associated with the domain name and returns it to the browser.
Load Balancer:
Once the browser has the IP address of the Presto server, it sends an HTTP request to that IP address.
In many modern web applications, especially those with high traffic or multiple server instances, there is often a load balancer in front of the servers.
The load balancer distributes incoming requests across multiple servers to ensure efficient resource utilization and improve reliability and scalability.
The load balancer forwards the request to one of the available Presto Coordinator nodes.
Target Gateway:
After the load balancer, the request may pass through a gateway that routes it to the Presto Coordinator, which acts as the entry point for queries into the Presto cluster.
The Coordinator is responsible for parsing SQL queries, planning query execution, and coordinating with other nodes in the cluster to execute the query.
The Coordinator also maintains metadata about the cluster, including information about available worker nodes and data distribution.
Presto Coordinator:
The Presto Coordinator processes the incoming query request.
It parses the SQL query, optimizes it, and generates a query plan.
The query plan may involve accessing data stored in various data sources, such as HDFS, S3, or a relational database.
The Coordinator coordinates the execution of the query across multiple Presto worker nodes.
It distributes tasks to the worker nodes and aggregates the results before returning them to the client.
In summary, when you open the Presto URL in Chrome, the request undergoes DNS resolution to find the IP address of the Presto server. The request then passes through a load balancer, which forwards it to one of the Presto Coordinator nodes. The Coordinator processes the query, plans its execution, and coordinates with worker nodes to execute the query in a distributed manner. Finally, the results are aggregated and returned to the client.
Round 4: Techno-Managerial Interview (Managerial Round): 1 hour 10 minutes
The interview started with my introduction, my expertise, and the tech skillset I had worked with. Most of the questions were based on data modeling, Databricks, Datahub, PySpark, and architecture design (ETL design).
One or two questions were asked on batch processing and stream processing using Spark.
He asked me to explain my Mixpanel project and how I created the data model on Delta tables so that lots of raw tables do not get created. I explained the complete platform I worked on at Meesho, such as how the data sources come in, and the complete data pipeline I set up on Databricks to take the silver (Mixpanel) data and run a multi-task job that creates aggregated tables based on business requirements.
He asked me what open-source projects I had worked on, so I explained Datahub and the Spark lineage build (which helps find the source and destination tables for a Spark application). For that, I explained how I created a Spark JAR with a Spark listener and the Spline package.
Question on Cost Optimisation:
✅ Can you share an example of a project you worked on that had a significant impact on your organization?
✅ How did you contribute to cost optimization initiatives while working with cloud technologies?
✅ Could you describe a specific cost optimization strategy you implemented in the cloud and its results?
I was asked how to capture event logs of what is happening on Databricks, including user activities such as who is creating clusters and who is running jobs. For that, I explained the open-source project I used: Overwatch (a Databricks open-source job).
Questions were asked on Spark monitoring and Spark performance management. I answered all of them in depth with practical examples.
Some questions on Jira and the different Scrum ceremonies, and how I would manage multiple tasks using Agile methodology.
Round 5: Director Round (Behavioral & Technical Round)
This interview was taken by a Director at Walmart and lasted about 45-60 minutes. I was asked to introduce myself. Then there was a discussion of the Meesho and Morgan Stanley projects I had worked on (I explained the Datahub Spark Lineage project and the Tenant project), along with my roles and responsibilities as a Data Engineer at Meesho. I was also asked to explain my research papers on a web crawler for ranking websites based on web traffic and page views, which I published at international IEEE and Springer conferences. Some of the questions related to Walmart's core principles and values and my inspirations. Then he asked questions about team management and leadership qualities, mainly situation-based ones such as "Tell me about a time when you faced a challenging situation at work and how you handled it." Finally, he went through my resume and asked technical questions on how Presto and Spark work (as both use a distributed architecture), Databricks, AWS, and Delta Lake concepts with data governance. Some questions that I remember are:
i) What is the Avro file format, and what is its significance in Delta tables? (Learn)
ii) The difference between the underlying architectures of Presto and Spark.
iii) Can Presto work with near-real-time data (a streaming data source)?
iv) How did you develop the Datahub lineage using open-source projects such as Spline and Datahub?
v) What do you think about data uncertainty?
I told him that I am a gold medalist of Uttarakhand state in B.Tech. He was very impressed with the answers I gave during the director's round.
