EXPERT ADVICE

Harnessing Data Lakes for E-Commerce Business Growth

There were 5,524 store closures in the United States alone last year. There is no doubt that the retail industry is increasingly moving to e-commerce.

For mid-sized retail companies, this trend can be a major opportunity to gain market share in their local areas and continue to grow in an industry dominated by Amazon and other giant retailers.

However, this growth requires a strong digital transformation strategy, and mid-sized retail companies in general aren’t showing much movement in this area.

Despite talk about the importance of data science, few companies are really pushing new technologies and practices as a way to gain a competitive advantage. Instead, business analysts appear content to generate MicroStrategy reports and avoid any major digital transformation.

Luckily, mid-sized e-commerce-only companies appear to be the exception.

Adopting an innovative, software-company mindset, these retailers aren’t afraid to revamp their data infrastructure and take advantage of new technology. They are investing heavily in data lakes, machine learning and data analysis to stay ahead of the curve.

For years, data lakes have been a powerful tool for more technology-focused companies because they can store, sort and analyze structured and unstructured data from a variety of sources in a single repository.

Data lakes also allow data scientists to run custom queries on the data that would not be possible with more structured data storage and analytics tools. Now, e-commerce companies are taking advantage of the ability to create a richer, more scalable and customizable data infrastructure.

E-Commerce Adopts a Software Company Mindset

In the “retail apocalypse,” most mid-sized retailers find themselves in the same spot. They all feel pressure from the enterprise retailers, they are all relatively cash-strapped, and they all understand the value of data science.

Using data can help retailers with price optimization, inventory management and fraud detection, to start. However, not all companies are moving toward a position where they are able to take advantage of these benefits.

Unlike other retailers, many e-commerce companies are choosing to think like software companies in order to restructure their business and infrastructure. Software companies have a willingness to throw out what isn’t working and restructure their organization and data infrastructure to better take advantage of new technology.

Access to flexible, agile data platforms, such as data lakes, can bring serious value to a digital transformation strategy.

Unlike traditional retailers that hold on to their existing mindset when it comes to infrastructure investments, personnel and digital strategy, e-commerce companies are thriving by embracing new technology.

Following are several ways e-commerce companies that adopt this mindset can harness data lakes effectively for business growth:

  • Advanced Machine Learning and Data Analysis: E-commerce companies have large and diverse data sets to pull from for machine learning and data analysis. Data lakes allow them to draw from wider data sets and query them in new ways for deeper analysis and more artificial intelligence applications. With these storage and processing environments, e-commerce IT teams have access to machine learning, ETL (extract, transform, load), schema-on-read querying, and other analysis tools that can provide a competitive advantage.
  • Flexible Data Models: The e-commerce companies seeing the most benefit from data lakes are those with the resources necessary to build data models that suit their specific needs. Data lakes allow data scientists to innovate and define schemas with very few restrictions on what can be done. Companies draw significant benefit from combining data and running models that traditional data analysis tools do not offer. Even more important, these models can be built and customized for each department and use case. It is crucial to share models across departments and to give teams the tools they need to rework and modify those models for each specific case.
  • Scalable Data Science: Data lakes make it easy for growing e-commerce companies to scale up their data storage and processing capacity quickly. Choosing a cloud-based data lake solution makes it even more efficient to scale immediately to fit business needs.
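
To make the schema-on-read idea from the list above more concrete, here is a minimal sketch in Python. It assumes raw events land in the lake as one JSON document per line; the event fields, the `ORDER_SCHEMA` mapping and the `read_with_schema` helper are all hypothetical names chosen for illustration, not part of any particular data lake product. The point is only that the structure is applied by the reader at query time, not enforced when the data is written.

```python
import json

# Hypothetical raw events stored as-is, one JSON document per line.
# Records do not share a common shape, and types are inconsistent.
RAW_EVENTS = """\
{"type": "order", "order_id": "A1", "total": "19.99", "sku": "TSHIRT-M"}
{"type": "click", "page": "/cart", "session": "s-42"}
{"type": "order", "order_id": "A2", "total": 5, "coupon": "SPRING"}
"""

# A schema chosen by the reader at query time (an assumption for this sketch):
# field name -> function that converts the raw value.
ORDER_SCHEMA = {"order_id": str, "total": float}


def read_with_schema(raw_text, record_type, schema):
    """Parse raw lines and project matching records onto the given schema."""
    rows = []
    for line in raw_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("type") != record_type:
            continue
        # Fields missing from a record become None instead of failing the read.
        rows.append({
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        })
    return rows


orders = read_with_schema(RAW_EVENTS, "order", ORDER_SCHEMA)
revenue = sum(row["total"] for row in orders if row["total"] is not None)
```

A marketing team could read the same raw lines with a different schema (for example, one built around the `coupon` field) without anyone rewriting or reloading the stored data.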

Data Lakes Challenges for E-Commerce

Clearly, many e-commerce businesses understand the benefits available with data lakes. Unfortunately, not all companies are seeing these benefits. The biggest reason for this is that taking advantage of data analytics and data lakes can’t happen in an IT vacuum. It requires a complete data transformation across the entire organization where everyone buys into the goals and value of data science.

The other main reason is many companies take the wrong approach when building a data lake strategy.

There are three challenges e-commerce companies must overcome to reap the benefits of data lakes.

1. Need for Departmental Silos

One of the benefits of data lakes is the ability to store data in an unstructured way, which allows everyone in the organization to access data from any source across departments and job functions. However, this complete lack of structure can turn your data lake into a data swamp that no department is able to make valuable use of, because it is simply too difficult to run queries and find the data most relevant to them.

The fact is, most centralized data lake initiatives fail because they are centralized. Most of the departments that need access to this data don’t have the technical knowledge or ability to sort through it. The answer lies in Software as a Service models that let companies build a network of mini-data lakes (also called “data pools”) to support this multitude of data teams, without having to move data or force everybody into the same standards.

These multicloud, interconnected mini-data lakes provide the benefits of a traditional data lake while allowing each branch and department to create their own data models that fit their specific needs and draw from the data most important to them.

While wide data availability is a benefit of traditional data lakes, data pools provide the same availability with silos that make it more efficient for each department to access the data most relevant to it. These added silos provide a more efficient, stable and fast data environment.
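
One way to picture data pools is as department-scoped catalogs that point at shared data without copying it. The sketch below is only an illustration of that idea under stated assumptions: the `Dataset` and `DataPool` classes, the department names and the storage URIs are all hypothetical, and a real Data Lake as a Service platform would add access control, metadata and compute on top.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    name: str
    location: str  # where the data already lives (hypothetical URIs)


@dataclass
class DataPool:
    department: str
    datasets: dict = field(default_factory=dict)

    def attach(self, dataset):
        """Register a reference to a dataset; nothing is copied or moved."""
        self.datasets[dataset.name] = dataset

    def locate(self, name):
        """Return the storage location of a dataset attached to this pool."""
        if name not in self.datasets:
            raise KeyError(f"{name!r} is not attached to the {self.department} pool")
        return self.datasets[name].location


# Shared datasets that may sit in different clouds.
orders = Dataset("orders", "s3://example-lake/raw/orders/")
clicks = Dataset("clickstream", "gs://example-lake/raw/clicks/")
stock = Dataset("inventory", "s3://example-lake/raw/inventory/")

# Each department sees only the data most relevant to it.
marketing = DataPool("marketing")
marketing.attach(orders)
marketing.attach(clicks)

operations = DataPool("operations")
operations.attach(orders)
operations.attach(stock)
```

Both pools resolve `orders` to the same location, so the data stays in one place, while marketing never has to sift through inventory data to find what it needs.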

2. Lack of Clear Use Cases

Big data has endless potential to help improve retail. From using machine learning for product recommendations and assortment optimization to personalized marketing and inventory management, the use cases span across the entire organization.

However, building a data lake doesn’t unlock all of these great initiatives automatically. Treating data lakes as a silver bullet can lead to failure: you end up with a powerful engine and nothing around it to put that power to use.

E-commerce companies that succeed with data lakes go into the project with a clear understanding of what they want the data lake to achieve for their organization. This helps structure and build the data lakes and data pools to suit those specific needs. Your IT team can prioritize specific queries and schema based on these goals and ensure they work with existing technology in the organization.

Make sure your organization isn’t approaching digital transformation purely to seem like an innovative software company. You need to be working toward specific goals that will have a real impact on your company and the direction you plan to head. Then a data lake strategy can be used to support that overall data strategy.

3. Demand for Technical Expertise

Another major barrier to e-commerce companies seeing the benefits of data lakes is the cost and expertise required to build a data lake infrastructure. Creating a data lake and building the custom schema required to benefit fully does require specialized knowledge that many companies don’t have access to — no matter how much they adopt the software company mindset.

Data Lake as a Service solutions have emerged to answer this need and provide a simplified, cloud-based data lake option. These as-a-service options provide not only a less expensive and less time-intensive way to set up a data lake, but also much more scalability and flexibility in deployment.

These solutions also provide more tools for branches and departments to leverage mini-data lakes without relying on data engineers to build custom data models. Giving more power to each department allows each individual branch to operate based on its own intel and needs. Rather than relying on a centralized data science team that doesn’t have a full understanding of the specific demands of marketing, operations, etc., each team is able to steer its own data lake strategy.

No matter what industry you are in, data teams are software teams. By changing your thinking and allowing your e-commerce company to structure its personnel and its mindset to look more like a software company, you open up endless opportunities to get more value from your data.

It can be difficult to make major changes to a company that has been around for decades or more. Hiring data scientists and making the IT department a more significant part of your organization could be jarring. However, that’s what digital transformation requires.

The e-commerce companies that are willing to change and adopt a more agile, data-driven strategy will thrive in the new digital landscape.

Alex Bordei

Alex Bordei is the VP of product and engineering at Lentiq, offering a multicloud, production-scale Data Lake as a Service platform.
