Creating an image similarity function with TensorFlow and its application in e-commerce

In our previous post, we showed you how to use image recognition to solve the issue of misattribution in e-commerce catalogs. Once you trust your models and have trained them to detect a useful number of attributes, it is easy to expand from attribute verification to auto-tagging. However, our approach to misattribution is only effective when you already have a large training set of images. The question remains: how can we take advantage of this technology when we only have a couple of examples in our library? In this post, we’ll show you how to solve this issue by building an image similarity function, which can be used to build image search and fill attribution gaps.

Image similarity: Filling attribution gaps with image recognition without a training set

As we know, our previous approach works well when you already have attributed data. The data set serves as a comparison for the new input from product images. Now, we need to figure out how to approach cases where a new attribute is needed, such as a “Hawaiian” style, but there is no training set for it.

One approach would be to spend a couple of months creating an initial training set for your model. However, based on our experience, we posited that there has to be a way to get attributes from “similar images” using an untagged catalog. Taking this hypothesis as our starting point, we created an image similarity function. This turned out to be our first step to implementing image search within our e-commerce catalog.

With an image similarity function, you can take a couple of examples that illustrate a new attribute, search for images similar to those examples, and assign the attribute to the matching images as a baseline. As in the case of misattribution, we started with the Inception V3 convolutional neural network in TensorFlow. You may recall that Inception V3 is already trained and is able to recognize thousands of features. This time, instead of retraining the classification layer, we took the vectors from the pool_3 layer and started our research from there. The pool_3 layer is the last pooling layer of this model, mentioned in the previous post.

We started by building a vector representation of every image in our catalog, which generated thousands of vectors. The result was a rich multi-dimensional vector space.

Visualization of building a vector representation for the images in our catalog using pool_3 layer vectors, which are needed for building an image similarity function for our e-commerce catalog.

Upon analysis, we discovered that squared Euclidean distance works as a similarity function for images within this multi-dimensional space. This means that the more similar two images are, the closer their corresponding vectors lie within the space. As a result, once you have a distance function that applies to all items in a set, you can build many different features, including:

  • Clustering algorithms: Identify clusters of products/images that represent new attributes. A merchandiser could, for example, select the centroid of the cluster (the most representative point within the group) and assign a new attribute to the image it represents. The attribute would then be automatically assigned to all images in the given cluster.
  • Image navigation: Identify the starting point for your image gallery through which a customer can gradually navigate to the desired image/product.
  • Image search: Retrieve the top N images most similar to a given image/product.
  • Visual filtering: Build a custom image filter, based on a user’s preferences.
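
To make the distance function concrete, here is a minimal sketch of squared Euclidean distance and a brute-force "top N" image search built on it. The catalog, the image identifiers, and the 3-dimensional vectors are hypothetical stand-ins for real pool_3 vectors; the function names are ours and are not taken from the original implementation.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def squared_euclidean(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance: smaller values mean more similar images."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def most_similar(
    query: Vector, catalog: Dict[str, Vector], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k catalog entries closest to the query, nearest first."""
    scored = (
        (image_id, squared_euclidean(query, vector))
        for image_id, vector in catalog.items()
    )
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Hypothetical toy catalog: 3-dimensional vectors instead of pool_3 vectors.
catalog = {
    "heels_red": [0.9, 0.1, 0.0],
    "heels_black": [0.8, 0.2, 0.1],
    "dress_floral": [0.1, 0.9, 0.3],
    "shirt_plaid": [0.0, 0.4, 0.9],
}

results = most_similar([0.9, 0.1, 0.05], catalog, k=2)
for image_id, distance in results:
    print(image_id, round(distance, 4))
```

This scan compares the query against every vector, so its cost grows linearly with the size of the catalog. It is useful as a correctness baseline, but not as a production search path.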

Image search: Retrieve images similar to a given image

After performing a market analysis, we decided to focus on image search as our preferred functionality. Image search can serve as the basis for a better search and navigation experience for users. Before getting into the technical details, let’s start with a couple of examples. Below, you can see the images that were used as queries, with the image search results presented in a table underneath each one.

Example of image search using women’s strappy heels. The query is at the top and the results are below.
An example of image search with women’s patterned tops
An example of image search, powered by machine learning, to find women’s patterned button-up shirts.
An example of using TensorFlow to power an image search for women’s knee-high dresses

Our first version of image search returned good quality results; however, it had a number of performance issues. As such, we shifted our focus to performance improvement.

Performance improvements: Find the nearest vectors in multidimensional space

Given what we learned about distance functions in multi-dimensional space, the image search problem can now be rephrased as “find the nearest vectors in multidimensional space”. Again, this is because the more similar two images are, the closer their vectors appear. We built a k-dimensional tree (k-d tree) to perform this operation. A k-d tree is a data structure for organizing points in a Euclidean space with k dimensions. It is a binary search tree with an additional constraint: each level of the tree splits the points along one of the k dimensions. K-d trees are very useful for range and nearest-neighbour searches.
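
As an illustration of the idea, the following is a minimal pure-Python sketch of a k-d tree with a single nearest-neighbour query. It is not the implementation used for the catalog; the names `Node`, `build`, and `nearest` are hypothetical, and the 2-dimensional points stand in for image vectors.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


@dataclass
class Node:
    point: Point
    axis: int                 # dimension this node splits on
    left: Optional["Node"]    # points with a smaller or equal coordinate on that axis
    right: Optional["Node"]   # points with a larger or equal coordinate on that axis


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Recursively split on the median, cycling through the axes by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        ordered[mid],
        axis,
        build(ordered[:mid], depth + 1),
        build(ordered[mid + 1:], depth + 1),
    )


def nearest(
    node: Optional[Node], query: Point, best: Optional[Tuple[Point, float]] = None
) -> Optional[Tuple[Point, float]]:
    """Return (closest point, squared distance), pruning subtrees that cannot win."""
    if node is None:
        return best
    distance = sq_dist(query, node.point)
    if best is None or distance < best[1]:
        best = (node.point, distance)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[1]:
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
print(nearest(tree, (9, 2)))  # ((8, 1), 2)
```

The pruning test in `nearest` is where the speed-up comes from: whole subtrees are skipped whenever the splitting plane is already farther away than the best match found so far.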

By implementing the k-d tree, we saw a significant performance boost: a 15x improvement on a 500k-image index. The k-d tree data structure is very efficient, but its search cost grows rapidly with the vectors’ dimensionality. To combat this issue, we decided to use principal component analysis (PCA) to reduce the vectors to their core components. PCA is a method for emphasizing variation and highlighting patterns in datasets. It is often used to simplify data for mining and visualization purposes. After testing the results post-PCA, we found that decreasing the vectors’ dimensionality from 2048 to 1024 did not affect image search quality significantly.
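
The following is a small sketch of what PCA does to a set of vectors, written in plain Python so that each step is visible. It finds components by power iteration with deflation, which is one simple way to compute PCA and an assumption on our part rather than the method used in the project. The function names and the toy 3-dimensional data are hypothetical; for real 2048-dimensional vectors you would normally reach for a numerical library instead.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def center(rows: Matrix) -> Tuple[Matrix, List[float]]:
    """Subtract the per-dimension mean so the data is centred on the origin."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[r[j] - means[j] for j in range(dim)] for r in rows], means


def covariance(centered: Matrix) -> Matrix:
    n, dim = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def mat_vec(m: Matrix, v: List[float]) -> List[float]:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_component(cov: Matrix, iterations: int = 200) -> List[float]:
    """Power iteration: converges to the direction of largest variance."""
    dim = len(cov)
    v = [1.0 / (i + 1) for i in range(dim)]  # arbitrary non-symmetric start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    return v


def pca(rows: Matrix, n_components: int) -> Tuple[Matrix, List[float]]:
    """Return the first n_components principal directions and the data means."""
    centered, means = center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(n_components):
        v = top_component(cov)
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
        components.append(v)
        # Deflation: remove the variance already explained by this component.
        cov = [
            [cov[i][j] - eigenvalue * v[i] * v[j] for j in range(len(v))]
            for i in range(len(v))
        ]
    return components, means


def project(row: List[float], components: Matrix, means: List[float]) -> List[float]:
    """Map a vector into the reduced space spanned by the components."""
    centered = [x - m for x, m in zip(row, means)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


# Toy data: almost all variation lies along the direction (1, 2, 0).
rows = [[1.0, 2.0, 0.1], [2.0, 4.0, -0.1], [3.0, 6.0, 0.1], [4.0, 8.0, -0.1]]
components, means = pca(rows, n_components=1)
reduced = [project(r, components, means) for r in rows]
print(reduced)
```

In this toy case, three dimensions collapse to one while the ordering and spacing of the points are preserved. That is the same property we relied on when reducing the vectors from 2048 to 1024 dimensions.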

Using a k-d tree data structure to find nearest vectors in multidimensional space and perform image search

Other performance considerations

Performance is closely tied to the success of implementing this kind of approach. The speed of search within e-commerce is critical to the consumer’s user experience and subsequent conversion. For this reason, image search needs to be built in a way that optimizes performance. Luckily, many sophisticated performance optimizations are possible within this approach, such as a ball tree. A ball tree is a data structure that can be used for fast, high-dimensional nearest-neighbour searches. The scorer can be programmed to match every image, or alternatively it can be programmed to skip some images with low similarity based on a threshold or more advanced criteria.

The critical point is that the scoring mechanism is able to work on the same query in tandem with all other Lucene scoring mechanisms. This also makes it possible to combine visual search with filtering. Lucene has a number of tools to help with performance as well, such as its two-phase iterator API. It consists of an approximation phase that quickly iterates over a superset of matching documents, and a verification phase that can be used to check whether a document in this superset actually matches the query.

Imagine that you are searching for “knee-length dress” and applying a filter as well. Because the approximation is separate from the verification phase, we can intersect the approximation with the filter first, so that the more expensive verification runs on a smaller set of candidates. This is very useful for our search use case in general. The same two-phase iteration pattern applies to geo-distance queries, where the distance computation serves as the verification. For image search, the approximation can return all images in the index, so that only the filters narrow it down, and image similarity runs as the verification.
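
The ordering of the two phases can be sketched in a few lines of plain Python. This illustrates the pattern only; it is not Lucene's API, and the document ids, categories, vectors, and threshold are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Set


def squared_euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_phase_search(
    approximation: Iterable[int],
    filter_ids: Set[int],
    verify: Callable[[int], bool],
) -> Iterator[int]:
    """Intersect the cheap approximation with the filter, then verify survivors."""
    for doc_id in approximation:
        if doc_id not in filter_ids:
            continue          # rejected by the filter, no similarity computed
        if verify(doc_id):    # expensive check runs only on filtered candidates
            yield doc_id


# Hypothetical index: document id -> image vector and category attribute.
vectors = {
    1: [0.0, 0.0],
    2: [0.1, 0.0],
    3: [0.9, 0.9],
    4: [0.05, 0.05],
    5: [0.2, 0.1],
}
category = {1: "dress", 2: "shirt", 3: "dress", 4: "dress", 5: "shirt"}

query_vector = [0.0, 0.1]
threshold = 0.05              # assumed maximum squared distance for a match
verified: List[int] = []      # records which documents reached verification


def is_similar(doc_id: int) -> bool:
    verified.append(doc_id)
    return squared_euclidean(query_vector, vectors[doc_id]) < threshold


dress_ids = {doc_id for doc_id, name in category.items() if name == "dress"}
matches = list(two_phase_search(sorted(vectors), dress_ids, is_similar))
print(matches, verified)
```

Here the approximation returns every document in the index, the filter removes the two shirts, and image similarity is computed for only three of the five documents.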

For more information on building search with Solr Lucene, check out our Search blog series. 

Conclusion

Using machine learning for image recognition is a nascent technology with huge potential for business applications. In this series of blog posts, we’ve covered how to use ML to build image search functionality within an e-commerce catalog and how it can be used to resolve misattribution and attribution gaps, leveraging Google’s TensorFlow framework. In subsequent posts, we will cover how to integrate ML technology with other best-of-breed tools like Solr-based search.
