Orbitz as a Big Data Analytics Example
Orbitz is one of the top travel businesses at the forefront of leveraging big data to create next-generation mobile experiences, as it looks to drive more bookings from smartphone and tablet users.
Orbitz users perform millions of searches every day, from which the company collects hundreds of gigabytes of raw data. The aim of this collection is to learn users' preferences, which the company can use to make the user interaction smoother. Orbitz wanted to see whether it could identify consumer preferences in order to determine the best-performing hotels to display to users and so increase bookings.
Data Challenges at Orbitz
Orbitz handles millions of searches every day, so managing and storing this huge amount of data is very challenging. The important question is: how does Orbitz store and process all of this data?
The data challenges at Orbitz are listed below:
- Adding data to a data warehouse requires lengthy planning and implementation.
- The data teams need to be very judicious about adding the data.
- They needed a solution to economically store and process the growing volumes of data.
Orbitz solved the above challenges by storing the data in Hadoop. Hadoop is not a replacement for a Data Warehouse but rather is a complement to it. It also offers benefits other than just cost.
Storing, Processing and Analysing Data
The steps Orbitz follows to efficiently store, process and analyse the data are given below:
- First, web analytics software provides session data about user interactions.
- The raw data is stored in HDFS (Hadoop Distributed File System).
- The data is extracted from the raw Webtrends logs as input to a trained classification process.
- The logs provide input to MapReduce processing, which extracts the required fields.
- The previous process used a series of Perl and Bash scripts to extract the data serially.
- After extraction, the data is loaded into Hive.
- Once the data is in Hive, it provides input to machine learning processes and is used to create data exports for further analysis with R scripts.
- The Hive + R platform is used for query processing and statistical analysis to identify users' preferences.
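The field-extraction step above can be sketched in plain Python to show the map and reduce idea. This is only an illustration and not Orbitz's actual job: the tab-separated log layout, the sample lines, and the function names `map_extract`, `reduce_counts` and `run` are all assumptions made for this example, and a real job would run on Hadoop over files in HDFS rather than over an in-memory list.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical log layout (an assumption, not the real Webtrends format):
# timestamp <TAB> session_id <TAB> event <TAB> hotel_id
SAMPLE_LOG = [
    "2011-05-01T10:00:00\ts1\tview\tH100",
    "2011-05-01T10:00:05\ts1\tbook\tH100",
    "2011-05-01T10:01:00\ts2\tview\tH200",
    "malformed line",
    "2011-05-01T10:02:00\ts3\tview\tH100",
]

Key = Tuple[str, str]


def map_extract(line: str) -> Iterator[Tuple[Key, int]]:
    """Map step: keep only the required fields and emit ((hotel_id, event), 1)."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 4:
        return  # skip lines that do not match the assumed layout
    _timestamp, _session_id, event, hotel_id = parts
    yield (hotel_id, event), 1


def reduce_counts(pairs: Iterable[Tuple[Key, int]]) -> Counter:
    """Reduce step: sum the emitted counts for each (hotel_id, event) key."""
    totals: Counter = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def run(lines: Iterable[str]) -> Counter:
    """Chain the map and reduce steps over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_extract(line))
    return reduce_counts(pairs)


if __name__ == "__main__":
    for (hotel_id, event), count in sorted(run(SAMPLE_LOG).items()):
        print(hotel_id, event, count)
```

Because each line is mapped independently of the others, this kind of extraction can be spread across many machines, which is the main difference from running Perl and Bash scripts serially.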
Frequently Asked Questions
What is Orbitz?
Orbitz is a travel fare aggregator website owned by Orbitz Worldwide, Inc., a subsidiary of Expedia Group. It was established in 1999, and its website went live in 2001.
What is Big Data?
As the name suggests, Big Data is a collection of data that is huge in volume and keeps growing rapidly over time.
What are the types of Big Data?
There are mainly three types of big data:
- Structured
- Unstructured
- Semi-Structured
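As a rough illustration of the three types, the short Python sketch below reads one made-up sample of each. The sample values and the function names are hypothetical and chosen only for this example: CSV with fixed columns stands in for structured data, JSON with nested fields for semi-structured data, and free text for unstructured data.

```python
import csv
import io
import json
from typing import Dict, List

# All sample values below are invented for illustration only.
STRUCTURED_CSV = "hotel_id,city,price\nH100,Chicago,120\n"
SEMI_STRUCTURED_JSON = '{"session": "s1", "searches": [{"city": "Chicago", "nights": 2}]}'
UNSTRUCTURED_TEXT = "Great location, but the room was noisy at night."


def read_structured(text: str) -> List[Dict[str, str]]:
    """Structured data has a fixed schema, so every row has the same columns."""
    return list(csv.DictReader(io.StringIO(text)))


def read_semi_structured(text: str) -> dict:
    """Semi-structured data carries its own tags and nesting, without a rigid schema."""
    return json.loads(text)


def read_unstructured(text: str) -> List[str]:
    """Unstructured data has no predefined fields; here it is simply split into words."""
    return text.lower().split()


if __name__ == "__main__":
    print(read_structured(STRUCTURED_CSV)[0]["city"])
    print(read_semi_structured(SEMI_STRUCTURED_JSON)["searches"][0]["nights"])
    print(len(read_unstructured(UNSTRUCTURED_TEXT)))
```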
What is Hadoop?
Hadoop is an open-source framework used to efficiently store and process large datasets, ranging in size from gigabytes to petabytes.
Conclusion
In this article, we discussed Orbitz as an example of big data analytics. We started with a basic introduction and then covered:
- What Orbitz is
- Orbitz as a big data example
- Data challenges at Orbitz
- Storing, processing and analysing data at Orbitz
We hope that this blog has helped you enhance your knowledge of Orbitz. If you would like to learn more, check out our articles on Mining Big Data with Hive, Big Data, Hadoop, and Data Warehouse.
Happy Reading!