This disclosure relates generally to an artificial intelligence powered styling agent.
A user can view a selected garment on an avatar using virtual try-on technology; however, a recommendation for a complementary garment to complete the outfit can be useful and advantageous to the user.
To facilitate further description of the embodiments, the following drawings are provided in which:
For simplicity and clarity of illustration, the drawing figures illustrate the general manner of construction, and descriptions and details of well-known features and techniques may be omitted to avoid unnecessarily obscuring the present disclosure. Additionally, elements in the drawing figures are not necessarily drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help improve understanding of embodiments of the present disclosure. The same reference numerals in different figures denote the same elements.
The terms “first,” “second,” “third,” “fourth,” and the like in the description and in the claims, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms “include,” and “have,” and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, device, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, system, article, device, or apparatus.
The terms “left,” “right,” “front,” “back,” “top,” “bottom,” “over,” “under,” and the like in the description and in the claims, if any, are used for descriptive purposes and not necessarily for describing permanent relative positions. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the apparatus, methods, and/or articles of manufacture described herein are, for example, capable of operation in other orientations than those illustrated or otherwise described herein.
The terms “couple,” “coupled,” “couples,” “coupling,” and the like should be broadly understood and refer to connecting two or more elements mechanically and/or otherwise. Two or more electrical elements may be electrically coupled together, but not be mechanically or otherwise coupled together. Coupling may be for any length of time, e.g., permanent or semi-permanent or only for an instant. “Electrical coupling” and the like should be broadly understood and include electrical coupling of all types. The absence of the word “removably,” “removable,” and the like near the word “coupled,” and the like does not mean that the coupling, etc. in question is or is not removable.
As defined herein, two or more elements are “integral” if they are comprised of the same piece of material. As defined herein, two or more elements are “non-integral” if each is comprised of a different piece of material.
As defined herein, “approximately” can, in some embodiments, mean within plus or minus ten percent of the stated value. In other embodiments, “approximately” can mean within plus or minus five percent of the stated value. In further embodiments, “approximately” can mean within plus or minus three percent of the stated value. In yet other embodiments, “approximately” can mean within plus or minus one percent of the stated value.
As defined herein, “real-time” can, in some embodiments, be defined with respect to operations carried out as soon as practically possible upon occurrence of a triggering event. A triggering event can include receipt of data necessary to execute a task or to otherwise process information. Because of delays inherent in transmission and/or in computing speeds, the term “real-time” encompasses operations that occur in “near” real-time or somewhat delayed from a triggering event. In a number of embodiments, “real-time” can mean real-time less a time delay for processing (e.g., determining) and/or transmitting data. The particular time delay can vary depending on the type and/or amount of the data, the processing speeds of the hardware, the transmission capability of the communication hardware, the transmission distance, etc. However, in many embodiments, the time delay can be less than 1 millisecond, 1 second, 2 seconds, or 1 minute, or another suitable time delay period.
Turning to the drawings,
Continuing with
As used herein, “processor” and/or “processing module” means any type of computational circuit, such as but not limited to a microprocessor, a microcontroller, a controller, a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a graphics processor, a digital signal processor, or any other type of processor or processing circuit capable of performing the desired functions. In some examples, the one or more processors of the various embodiments disclosed herein can comprise CPU 210.
In the depicted embodiment of
In some embodiments, network adapter 220 can comprise and/or be implemented as a WNIC (wireless network interface controller) card (not shown) plugged or coupled to an expansion port (not shown) in computer system 100 (
Although many other components of computer system 100 (
When computer system 100 in
Turning ahead in the drawings,
In many embodiments, system 300 can include a styling model system 310 and/or a web server 320. Styling model system 310 and/or web server 320 can each be a computer system, such as computer system 100 (
In a number of embodiments, each of styling model system 310 and/or web server 320 can be a special-purpose computer programmed specifically to perform specific functions not associated with a general-purpose computer, as described in greater detail below.
In some embodiments, web server 320 can be in data communication through network 330 with one or more user computers, such as user computers 340 and/or 341. Network 330 can be a public network, a private network or a hybrid network. In some embodiments, user computers 340-341 can be used by users, such as users 350 and 351, which also can be referred to as customers, in which case, user computers 340 and 341 can be referred to as customer computers. In many embodiments, web server 320 can host one or more sites (e.g., websites) that allow users to browse and/or search for items (e.g., products), to add items to an electronic shopping cart, and/or to order (e.g., purchase) items, in addition to other suitable activities.
In some embodiments, an internal network that is not open to the public can be used for communications between styling model system 310 and/or web server 320 within system 300. Accordingly, in some embodiments, styling model system 310 (and/or the software used by such systems) can refer to a back end of system 300, which can be operated by an operator and/or administrator of system 300, and web server 320 (and/or the software used by such system) can refer to a front end of system 300, and can be accessed and/or used by one or more users, such as users 350-351, using user computers 340-341, respectively. In these or other embodiments, the operator and/or administrator of system 300 can manage system 300, the processor(s) of system 300, and/or the memory storage unit(s) of system 300 using the input device(s) and/or display device(s) of system 300.
In certain embodiments, user computers 340-341 can be desktop computers, laptop computers, mobile devices, and/or other endpoint devices used by one or more users 350 and 351, respectively. A mobile device can refer to a portable electronic device (e.g., an electronic device easily conveyable by hand by a person of average size) with the capability to present audio and/or visual data (e.g., text, images, videos, music, etc.). For example, a mobile device can include at least one of a digital media player, a cellular telephone (e.g., a smartphone), a personal digital assistant, a handheld digital computer device (e.g., a tablet personal computer device), a laptop computer device (e.g., a notebook computer device, a netbook computer device), a wearable user computer device, or another portable computer device with the capability to present audio and/or visual data (e.g., images, videos, music, etc.). Thus, in many examples, a mobile device can include a volume and/or weight sufficiently small as to permit the mobile device to be easily conveyable by hand. For example, in some embodiments, a mobile device can occupy a volume of less than or equal to approximately 1790 cubic centimeters, 2434 cubic centimeters, 2876 cubic centimeters, 4056 cubic centimeters, and/or 5752 cubic centimeters. Further, in these embodiments, a mobile device can weigh less than or equal to 15.6 Newtons, 17.8 Newtons, 22.3 Newtons, 31.2 Newtons, and/or 44.5 Newtons.
Exemplary mobile devices can include (i) an iPod®, iPhone®, iTouch®, iPad®, MacBook® or similar product by Apple Inc. of Cupertino, California, United States of America, (ii) a Blackberry® or similar product by Research in Motion (RIM) of Waterloo, Ontario, Canada, (iii) a Lumia® or similar product by the Nokia Corporation of Keilaniemi, Espoo, Finland, and/or (iv) a Galaxy™ or similar product by the Samsung Group of Samsung Town, Seoul, South Korea. Further, in the same or different embodiments, a mobile device can include an electronic device configured to implement one or more of (i) the iPhone® operating system by Apple Inc. of Cupertino, California, United States of America, (ii) the Blackberry® operating system by Research In Motion (RIM) of Waterloo, Ontario, Canada, (iii) the Palm® operating system by Palm, Inc. of Sunnyvale, California, United States, (iv) the Android™ operating system developed by the Open Handset Alliance, (v) the Windows Mobile™ operating system by Microsoft Corp. of Redmond, Washington, United States of America, or (vi) the Symbian™ operating system by Nokia Corp. of Keilaniemi, Espoo, Finland.
Further still, the term “wearable user computer device” as used herein can refer to an electronic device with the capability to present audio and/or visual data (e.g., text, images, videos, music, etc.) that is configured to be worn by a user and/or mountable (e.g., fixed) on the user of the wearable user computer device (e.g., sometimes under or over clothing; and/or sometimes integrated with and/or as clothing and/or another accessory, such as, for example, a hat, eyeglasses, a wrist watch, shoes, etc.). In many examples, a wearable user computer device can include a mobile device, and vice versa. However, a wearable user computer device does not necessarily include a mobile device, and vice versa.
In specific examples, a wearable user computer device can include a head mountable wearable user computer device (e.g., one or more head mountable displays, one or more eyeglasses, one or more contact lenses, one or more retinal displays, etc.) or a limb mountable wearable user computer device (e.g., a smart watch). In these examples, a head mountable wearable user computer device can be mountable in close proximity to one or both eyes of a user of the head mountable wearable user computer device and/or vectored in alignment with a field of view of the user.
In more specific examples, a head mountable wearable user computer device can include (i) Google Glass™ product or a similar product by Google Inc. of Menlo Park, California, United States of America; (ii) the Eye Tap™ product, the Laser Eye Tap™ product, or a similar product by ePI Lab of Toronto, Ontario, Canada, and/or (iii) the Raptyr™ product, the STAR 1200™ product, the Vuzix Smart Glasses M100™ product, or a similar product by Vuzix Corporation of Rochester, New York, United States of America. In other specific examples, a head mountable wearable user computer device can include the Virtual Retinal Display™ product, or similar product by the University of Washington of Seattle, Washington, United States of America. Meanwhile, in further specific examples, a limb mountable wearable user computer device can include the iWatch™ product, or similar product by Apple Inc. of Cupertino, California, United States of America, the Galaxy Gear or similar product of Samsung Group of Samsung Town, Seoul, South Korea, the Moto 360 product or similar product of Motorola of Schaumburg, Illinois, United States of America, and/or the Zip™ product, One™ product, Flex™ product, Charge™ product, Surge™ product, or similar product by Fitbit Inc. of San Francisco, California, United States of America.
In several embodiments, system 300 can include one or more input devices (e.g., one or more keyboards, one or more keypads, one or more pointing devices such as a computer mouse or computer mice, one or more touchscreen displays, a microphone, etc.), and/or can each include one or more display devices (e.g., one or more monitors, one or more touch screen displays, projectors, etc.). In these or other embodiments, one or more of the input device(s) can be similar or identical to keyboard 104 (
Meanwhile, in many embodiments, system 300 also can be configured to communicate with and/or include one or more databases. The one or more databases can include a product database that contains information about products, items, or SKUs (stock keeping units), for example, among other data as described herein, such as described herein in further detail. The one or more databases can be stored on one or more memory storage units (e.g., non-transitory computer readable media), which can be similar or identical to the one or more memory storage units (e.g., non-transitory computer readable media) described above with respect to computer system 100 (
The one or more databases can each include a structured (e.g., indexed) collection of data and can be managed by any suitable database management systems configured to define, create, query, organize, update, and manage database(s). Exemplary database management systems can include MySQL (Structured Query Language) Database, PostgreSQL Database, Microsoft SQL Server Database, Oracle Database, SAP (Systems, Applications, & Products) Database, and IBM DB2 Database.
In many embodiments, styling model system 310 can include a communication system 311, an identification system 312, a selecting system 313, a searching system 314, a segmenting system 315, a visualization system 316, a filtering system 317, a training system 318, and/or an augmenting system 319. In many embodiments, the systems of styling model system 310 can be modules of computing instructions (e.g., software modules) stored at non-transitory computer readable media that operate on one or more processors. In other embodiments, the systems of styling model system 310 can be implemented in hardware. Styling model system 310 can be a computer system, such as computer system 100 (
In many embodiments, styling model system 310 can include a framework that mimics the recommendations of a styling agent (SA): when given a garment (e.g., chosen by a user), such as a shirt, the styling agent can recommend other garments that complement the shirt. In several embodiments, the styling agent can utilize two components: a segmentation model and a visual search model (e.g., styling model), as described in greater detail below. In many embodiments, the framework that mimics the recommendations of a styling agent (SA) can be an improvement over conventional recommendation systems that often rely on attempts to mimic an acquired preference or taste of a user. In some embodiments, conventional recommendation systems often mimic human behavior by encouraging models to mimic subjective tastes or preferences, usually learned from examples or data of a user.
Turning ahead in the drawings,
In some embodiments, an advantage of using the styling agent framework includes building or generating a visual recommendation system such that, given a garment G, the SA can automatically propose garments that will build a complete outfit without mimicking subjective preferences of the user. In many embodiments, the styling agent framework can begin with garment data collected from stock images of a vendor, where each garment has a set of images $\{g_i\}_{i=1}^{N}$, where the chosen garment can appear dressed on a model or avatar with complementing garments, where $g_i$ refers to an image of the garment and $N$ refers to the number of images in the set, and where the number of images in the set can be as few as one. In various embodiments, using the SA can be subdivided into two sub-tasks: (1) finding the complementing garments in $g_i$, and (2) finding the most similar garments to the anchor (e.g., primary) garment among other garments in a catalog. In several embodiments, a sub-task can be a semantic segmentation task, where the classes are garment classes, e.g., shirts, bottoms, jackets, skirts, etc. In some embodiments, another sub-task searches among complementary garments from the vendor in order to find the best image of a complementary garment upon which to base a visual search.
In many embodiments, the sub-task can define a score that indicates where the complementing garments are most visible and apply the score to choose the best image out of the remaining images. In several embodiments, the SA can generate a respective score for each complementing garment by scanning the image, using a scanner, to identify a presence of the anchor item (e.g., main item) in the image, where a higher score correlates to a higher percentage of the anchor item that appears in the scanned image, and a lower score correlates to a lower percentage of the anchor item that appears in the image, such as a partial view of the anchor garment. In some embodiments, once the image is scanned, the SA analyzes a ratio of an area of the complementary item compared to an area of the anchor item, where a higher ratio correlates to a higher score, such as a ratio between 0.0 and 1.0. For example, a higher ratio of 1.0 indicates that the complementing garment in the image is most visible, and the score can be applied to choose the best image out of the remaining images. In many embodiments, a scanned image can show a location of a complementary garment that is cropped, in which case the ratio of the area will be lower, thus reducing the score for that scanned image. In some embodiments, when the location of the complementary item in the scanned image touches the image boundaries, the SA determines that the image can be cropped to some percentage when finding the ratio of that image.
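As a non-limiting illustrative sketch of this scoring logic (assuming binary segmentation masks are available as NumPy arrays; the function names, the complementary-to-anchor area ratio, and the specific boundary-crop penalty are illustrative assumptions, not a definitive implementation):

    import numpy as np

    def touches_boundary(mask: np.ndarray) -> bool:
        # A garment mask that reaches any image edge suggests the garment
        # may be cropped out of frame.
        return bool(mask[0, :].any() or mask[-1, :].any()
                    or mask[:, 0].any() or mask[:, -1].any())

    def image_score(anchor_mask: np.ndarray, comp_mask: np.ndarray,
                    crop_penalty: float = 0.5) -> float:
        # Score one stock image: higher when the anchor garment is present
        # and the complementary garment is large relative to it and uncropped.
        anchor_area = float(anchor_mask.sum())
        if anchor_area == 0.0:           # anchor garment not visible
            return 0.0
        ratio = float(comp_mask.sum()) / anchor_area
        if touches_boundary(comp_mask):  # assumed penalty for cropped garments
            ratio *= crop_penalty
        return ratio

    # The image with the highest score can then be selected, e.g.:
    # best = max(images, key=lambda im: image_score(im["anchor"], im["comp"]))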
In various embodiments, the next sub-task can apply a visual search algorithm that completes the SA system. In a number of embodiments, a visual search model can utilize self-supervised techniques that can capture similarity in a representation space, where the recommendations are based on K-nearest neighbors in the representation space. In various embodiments, the SA can output recommendations based on similarity in the representation space and implement a three-step approach that gradually integrates auxiliary information into a training procedure, such as categorical garment features (collar type, sleeve length, etc.), while also integrating feedback from individuals or experts, such as fashion expert recommendations.
Turning back in the drawings,
In these or other embodiments, one or more of the activities of method 400 can be implemented as one or more computing instructions configured to run at one or more processors and configured to be stored at one or more non-transitory computer-readable media. Such non-transitory computer-readable media can be part of a computer system such as styling model system 310 and/or web server 320. The processor(s) can be similar or identical to the processor(s) described above with respect to computer system 100 (
In a number of embodiments, method 400 can include an activity 405 of receiving stock images comprising an anchor garment. In many embodiments, the anchor garment (e.g., a primary garment) can be part of a set of images used for dressing. In some embodiments, the stock images of the anchor garment can be from a vendor catalog. In various embodiments, the anchor garment can be utilized as a basis used by a styling agent to recommend other garments similar to a complementary garment from a catalog, inspired by similarity of visual features and/or initial styling by the styling agent. In some embodiments, the recommendations can be based on a premise that each garment (e.g., item or product) has potential matching garments from one of multiple images in a catalog.
In various embodiments, method 400 also can include an activity 410 of automatically identifying the anchor garment and complementary garments within the stock images. In several embodiments, activity 410 can include using a segmentation model to identify the anchor garment and the complementary garments within the stock images. In some embodiments, the segmentation model also can be a semantic segmentation model. In various embodiments, semantic segmentation can include a per-pixel multi-class classification task. In several embodiments, a training example can include a tuple $\{x, y\}$, where $x \in \mathbb{R}^{3 \times N \times N}$ is an RGB (Red Green Blue) image, and $y \in \mathbb{R}^{C \times N \times N}$, where $C$ is the number of classes, $\mathbb{R}$ denotes the real numbers, and each channel of $y$ is a binary map representing where the class exists in the image. In some embodiments, an encoder model can be used for semantic segmentation; such an encoder model can include an Encoder-Decoder Convolutional Neural Network (CNN) architecture or a Feature Pyramid Network (FPN) with an EfficientNet backbone.
In many embodiments, the system can begin by generating training data to train a segmentation model to output (e.g., extract) potential complementing garments. In several embodiments, the segmentation model can be a deep neural network model.
In a number of embodiments, the segmentation model can be trained on two types of image data (e.g., training data) over a period of time: (1) garment images, and (2) synthetic dressings. In many embodiments, garment images can include an image of a single garment on a white background, while synthetic dressing images can include a model or avatar dressed with a single garment or more.
In several embodiments, the synthetic dressings can be augmented to make the images more similar to the group of stock images $\{g_i\}_{i=1}^{N}$ that can be used to conduct a search for similar garments (as described below in activity 420), where $g_i$ refers to a stock image in the group, $N$ refers to the number of stock images, and $i$ indexes the images; the search for similar garments can include searching stock images from a vendor. In various embodiments, the segmentation model can utilize a loss function, such as a cross entropy loss function, to further train the segmentation model, as follows:

$$L(y, \hat{y}) = -\sum_{c=1}^{C} \sum_{n,m} y_{c,n,m} \log \hat{y}_{c,n,m}$$
where $\hat{y}$ refers to the model's prediction, $y$ refers to the ground-truth class maps, and $L$ is the cross entropy loss function.
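As a non-limiting sketch of one training step with this loss (a PyTorch-style example; the names train_step and criterion are illustrative assumptions, and the model is assumed to be any encoder-decoder segmentation network, such as an FPN with an EfficientNet backbone, mapping (B, 3, N, N) images to (B, C, N, N) logits):

    import torch
    import torch.nn as nn

    criterion = nn.CrossEntropyLoss()  # per-pixel cross entropy

    def train_step(model, optimizer, x, y):
        # x: (B, 3, N, N) RGB images; y: (B, C, N, N) binary class maps.
        # nn.CrossEntropyLoss expects integer class indices, hence argmax.
        targets = y.argmax(dim=1)          # (B, N, N) per-pixel class indices
        logits = model(x)                  # (B, C, N, N) predictions y-hat
        loss = criterion(logits, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()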
Jumping ahead in the drawings,
In some embodiments, activity 410 also can include identifying the anchor garment based on which garment is most commonly found in the stock images.
In a number of embodiments, method 400 further can include an activity 415 of selecting an image of the stock images in which a mask area of a first complementary garment of the complementary garments, as a ratio of an area of the anchor garment, is largest over other complementary garments of the complementary garments. For example, the ratio $R$ is between the area $A$ of the main garment ($g_m$), $A(g_m)$, and the sum of all other completing garments, $\sum_{i=1}^{N} A(g_i)$, or

$$R = \frac{A(g_m)}{\sum_{i=1}^{N} A(g_i)}$$
In some embodiments, the ratio can be sufficient for in-distribution images of garments (e.g., examples); however, test data also can include out-of-distribution (OOD) images of garments (e.g., examples). Further constraints can be added to account for the OOD images. As an example, OOD images can include image 1355 (
In many embodiments, activity 415 of selecting the image also can include filtering out the stock images in which the complementary garments are partially cropped out.
For example,
Returning to
In some embodiments, activity 420 can include pre-training a visual search model. In various embodiments, pre-training the visual search model can include a multiple-level process: self-supervised pre-training, deep clustering, and imitation learning.
Jumping ahead in the drawings,
For example, for each garment, $g$, selected in previous analyses described as a candidate garment for outfit completion, the visual search model can output a latent representation $v_g$ of each candidate garment, such that $v_g = \phi(g)$, where $\phi$ symbolizes the model, and the candidate garment as a complementary garment is expressed as:

$$\hat{g} = \operatorname*{arg\,min}_{g_i} \, d(v_g, v_{g_i})$$
where $d$ is the cosine distance, $g_i$ refers to a garment from a catalog (e.g., styling catalog), and $\hat{g}$ refers to the garment chosen from the catalog.
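As a non-limiting sketch of this retrieval step (NumPy-based; the pre-computed matrix of catalog embeddings and the value of k are illustrative assumptions):

    import numpy as np

    def cosine_distance(u: np.ndarray, V: np.ndarray) -> np.ndarray:
        # Cosine distance d between one embedding u and each row of V.
        u = u / np.linalg.norm(u)
        V = V / np.linalg.norm(V, axis=1, keepdims=True)
        return 1.0 - V @ u

    def most_similar(v_g: np.ndarray, catalog: np.ndarray, k: int = 5):
        # K-nearest-neighbor retrieval in the representation space:
        # returns indices of the k catalog garments closest to v_g.
        return np.argsort(cosine_distance(v_g, catalog))[:k]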
In various embodiments, training the visual search model can include implementing a training procedure for pre-training on a self-supervised task, deep clustering, and integrating fashion experts' recommendations, as described in greater detail below. In many embodiments, the pre-training step for the self-supervised task can include, for each example $X$, creating (e.g., defining) a transformation $T$, where an objective is to minimize a distance between the example and the transformed example, such that:

$$L_{ssl} = \min_{\phi} \, d\big(\phi(X), \phi(T[X])\big)$$
where $X$ is the original image or a garment from a database, $T[X]$ is a transformation of image $X$ that preserves its identity (for example: translation, rotation, cropping, noise, small changes in color, etc.), $d$ is the distance, $\min$ is the minimization function, and $L_{ssl}$ is the loss function.
In some embodiments, activity 420 further can include pre-training the visual search model by augmenting batch images for training the visual search model with positive examples or negative examples. In several embodiments, augmenting batch images for use as data (training data) to train the visual search model allows the SA to learn representations of garments that are invariant to a wide range of augmentations.
Jumping ahead in the drawings,
In various embodiments, activity 420 of augmenting batch images can include generating new images to be the positive examples, based on the stock images that comprise the first complementary garment, by at least one of the following: changing hues of the first complementary garment, changing an angle of or skewing the first complementary garment, changing a size of the first complementary garment, adding holes in the stock images of the first complementary garment, or changing an avatar model wearing the first complementary garment using a virtual try on (VTO) mode.
In several embodiments, augmenting the batch images additionally can include automatically selecting the negative examples from images of other garments in the item catalog.
In various embodiments, augmenting the batch images also can include generating new images to be the negative examples, based on the stock images that comprise the first complementary garment, by changing a color of the first complementary garment.
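As a non-limiting sketch of such augmentation pipelines (torchvision-based; the parameter values, the use of random erasing to add "holes," and the strong color shift for negatives are illustrative assumptions; the VTO avatar swap and the catalog-based negative selection are not shown):

    import torchvision.transforms as T

    # Positive examples: identity-preserving changes to the stock image.
    positive_augment = T.Compose([
        T.ColorJitter(hue=0.05),                     # small hue changes
        T.RandomAffine(degrees=15, shear=10),        # change angle / skew
        T.RandomResizedCrop(224, scale=(0.7, 1.0)),  # change apparent size
        T.ToTensor(),
        T.RandomErasing(p=0.5, scale=(0.02, 0.1)),   # add holes in the image
    ])

    # Negative examples: a large color change alters the garment's identity.
    negative_augment = T.Compose([
        T.ColorJitter(hue=0.5, saturation=0.8),
        T.ToTensor(),
    ])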
In various embodiments, activity 420 further can include performing deep clustering on the visual search model, as pre-trained, to mine k-nearest neighbors, with hard negative mining based on garment metadata. In many embodiments, deep clustering and imitation learning can utilize garment features and enlist individuals (e.g., styling experts) to construct a latent space where garments that have similar features, e.g., neck collar, fit, sleeve length, are similar to each other. In some embodiments, the output of deep clustering and imitation learning can be used as a basis for recommendations on similarity of the complementary garments within the latent space. In many embodiments, the SA uses imitation learning to determine a quality metric of the recommendations.
Turning ahead in the drawings,
In various embodiments, training a deep clustering algorithm can use the representations (e.g., training data) from the pre-training of the visual search model, where, given a dataset $D = \{x_1, x_2, \ldots, x_n\}$, for each example $x_i$, the SA mines its K-nearest neighbors (K-NN), $N_{x_i}$, where $N_{x_i}$ refers to the set of neighbors of $x_i$. In some embodiments, training a deep clustering algorithm can include using a SCAN algorithm. In several embodiments, during training, the SA can maximize the inner product between $x_i$ and all the examples in $N_{x_i}$ so that similar products receive similar embeddings. In some embodiments, the input data used by the deep clustering algorithm can be transformed into embedding vectors using trainable weights. In several embodiments, input data can include examples of images (e.g., pairs or K-group images) converted to embedding vectors by calculating a derivative with respect to the trainable weights for each image and modifying the weights to account for a lower loss value. In various embodiments, converting the embedding vectors for each respective image cannot be performed manually or calculated in the human mind.
In several embodiments, to prevent the deep clustering algorithm from collapsing into one cluster, the SA can maximize the entropy of the clusters' centroids, where a final loss is expressed as:

$$L_{scan} = -\frac{1}{|D|} \sum_{x_i \in D} \sum_{k \in N_{x_i}} \log \big\langle \phi(x_i), \phi(k) \big\rangle + \lambda \sum_{c} \phi'_c \log \phi'_c$$
where $\phi$ is the model, $\phi'_c$ is the centroid of the c'th cluster, $N_{x_i}$ refers to the set of nearest neighbors of $x_i$, $\lambda$ is a weighting coefficient, $k$ is an index, $i$ is an index, and $L_{scan}$ is the loss function. In several embodiments, the final loss can be minimized during training (e.g., optimization).
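As a non-limiting PyTorch-style sketch of this SCAN-like objective (following the published SCAN formulation, where the inner products are taken between softmax cluster-assignment probabilities; the entropy weight lam is an assumed hyperparameter):

    import torch
    import torch.nn.functional as F

    def scan_loss(anchor_logits, neighbor_logits, lam=5.0):
        # anchor_logits, neighbor_logits: (B, C) cluster logits for each
        # example x_i and one mined neighbor k in N_{x_i}.
        p_a = F.softmax(anchor_logits, dim=1)    # phi(x_i)
        p_n = F.softmax(neighbor_logits, dim=1)  # phi(k)
        # Consistency term: maximize the inner product <phi(x_i), phi(k)>.
        dot = (p_a * p_n).sum(dim=1).clamp(min=1e-8)
        consistency = -torch.log(dot).mean()
        # Entropy term: maximize the entropy of the cluster centroids
        # to prevent collapse into a single cluster.
        p_mean = p_a.mean(dim=0)                 # phi'_c over the batch
        entropy = (p_mean * torch.log(p_mean.clamp(min=1e-8))).sum()
        return consistency + lam * entropy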
In several embodiments, deep clustering can capture additional garment properties, such as sleeve length, type of collar, fit, etc., by utilizing meta-data collected and stored in a database; the meta-data contains information on multiple properties regarding the garments. In some embodiments, for each garment, a properties vector $P = [p_1, p_2, \ldots, p_n]$ can be constructed and then utilized, for example, in the following manner:

$$L_{meta} = -\sum_{i} \sum_{k \in N_{x_i}} \operatorname{sign}\big(\langle P_{x_i}, P_k \rangle\big) \log \big\langle \phi(x_i), \phi(k) \big\rangle$$
where $P_{x_i}$ refers to the properties of $x_i$, $P_k$ refers to the properties (e.g., vector) of a neighbor garment $k$, $i$ is an index, $k$ is an index, and $L_{meta}$ is a loss function.
In a number of embodiments, the SA computes the inner product of the meta-data vectors to determine whether the example k is negative or positive, where for a positive example the sign is retained as is and for a negative example the sign is inverted. An advantage of computing the inner product of the meta-data vectors is that this operation can pull the positive examples closer and push the negative examples further away. In several embodiments, hard negative mining locates garments that have different properties than the anchor garment, and the distance between the anchor garment and other garments that have different properties is maximized. In various embodiments, different properties identified in a pair of images of garments can include a shirt and a pair of pants. In some embodiments, the SA generates the distance using the meta-data vectors or embeddings for each garment in the pair of garments by calculating a derivative with respect to the weights, then modifies the weights so the embeddings of the different garments are farther away from each other. In several embodiments, a loss function can output a higher loss when different garments receive similar embeddings; thus, differentiating the loss and changing the weights makes the embeddings of different garments move further away from each other. In various embodiments, the SA mines $N_{x_i}$ at the end of each epoch (e.g., iteration) of the deep learning algorithm. As an example, deep learning can be trained on tens or hundreds of epochs, viewing each example thousands of times, where the deep learning algorithm can sometimes get stuck running epochs using the same examples. In some embodiments, in some local minima, the SA can utilize augmentations to create epochs that are different to avoid getting stuck in a deep learning algorithm. Such augmentations can include randomizing a transformation T each time an example is fed into the deep learning algorithm.
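As a non-limiting sketch of this sign-based pull/push behavior (the zero threshold on the meta-data inner product, the assumption that the embeddings are softmax assignment probabilities as in the SCAN term, and the bounded log(1 - sim) form for negatives are illustrative assumptions):

    import torch

    def meta_sign(p_anchor, p_neighbor):
        # +1 when the meta-data property vectors agree (positive example),
        # -1 when they disagree (negative example, sign inverted).
        agreement = (p_anchor * p_neighbor).sum(dim=-1)
        return torch.where(agreement > 0,
                           torch.ones_like(agreement),
                           -torch.ones_like(agreement))

    def meta_loss(phi_anchor, phi_neighbor, p_anchor, p_neighbor):
        # phi_*: softmax cluster-assignment probabilities; positives are
        # pulled together, negatives (different properties) pushed apart.
        sim = (phi_anchor * phi_neighbor).sum(dim=-1).clamp(1e-8, 1 - 1e-8)
        s = meta_sign(p_anchor, p_neighbor)
        pull = -torch.log(sim)        # retained sign: positive examples
        push = -torch.log(1.0 - sim)  # inverted effect: negative examples
        return torch.where(s > 0, pull, push).mean()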
In several embodiments, activity 420 also can include performing active learning. In many embodiments, performing active learning can include submitting style proposals to individuals for feedback, wherein the style proposals each comprise the anchor garment and at least one of the similar garments as a group. In many embodiments, performing active learning can include receiving feedback from the individuals. In many embodiments, performing active learning can include using the style proposals that are rejected as negative examples in a feedback loop.
For example, in various embodiments, active learning can include imitation learning where the SA learns to mimic the selections or style preferences of an individual with styling expertise. For example, turning ahead in the drawings,
In some embodiments, performing active learning can include capturing a fashion preference or taste by incorporating the tagged opinions of individuals other than the user, such as fashion experts. In several embodiments, after deep learning is completed and complementary garments are output for selection, for each garment $g$, the top-K similar garments, $K = \{k_i\}$, can be located. In various embodiments, active learning can include sending these output results of deep learning to a number of individuals, such as fashion experts, where the individuals submit a tag for each garment in $K$ indicating whether or not it is similar to $g$, where a similar garment receives the label 1 and a dissimilar garment receives −1, and then deep learning can be repeated for the remaining images based on the opinions of the individuals. In some embodiments, the label (e.g., sign) can be determined by a majority vote of the individuals (e.g., experts).
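As a non-limiting sketch of the majority-vote label aggregation (the tie-breaking rule is an illustrative assumption):

    def majority_label(votes: list) -> int:
        # Aggregate expert tags (+1 similar, -1 dissimilar) by majority
        # vote; ties default to -1 (dissimilar) as a conservative choice.
        return 1 if sum(votes) > 0 else -1

    # labels = {k_i: majority_label(v) for k_i, v in expert_votes.items()}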
Turning ahead in the drawings,
Returning to
Jumping ahead in the drawings,
Returning to
In some embodiments, identification system 312 can at least partially perform activity 410 (
In several embodiments, selecting system 313 can at least partially perform activity 415 (
In various embodiments, searching system 314 can at least partially perform activity 1250 (
In some embodiments, segmenting system 315 can at least partially perform activity 410 (
In many embodiments, visualization system 316 can at least partially perform activity 420 (
In various embodiments, filtering system 317 can at least partially perform activity 820 (
In some embodiments, training system 318 can at least partially perform activity 420 of pre-training the visual search model by augmenting batch images for training the visual search model with positive examples or negative examples.
In many embodiments, augmenting system 319 can at least partially perform activity 420 (
In several embodiments, web server 320 can include a webpage system 321. Webpage system 321 can at least partially perform sending instructions to user computers (e.g., 340-341 (
In many embodiments, the techniques described herein can be used continuously at a scale that cannot be handled using manual techniques. For example, the number of daily and/or monthly visits to the content source can exceed approximately ten million and/or other suitable numbers, the number of registered users to the content source can exceed approximately one million and/or other suitable numbers, and/or the number of products and/or items sold on the website can exceed approximately ten million (10,000,000) each day.
In a number of embodiments, the techniques described herein can solve a technical problem that arises only within the realm of computer networks, as displaying an avatar modeling an anchor garment with a complementary garment using virtual try-on technology does not exist outside the realm of computer networks. Moreover, the techniques described herein can solve a technical problem that cannot be solved outside the context of computer networks. Specifically, the techniques described herein cannot be used outside the context of computer networks, in view of a lack of data, and because a content catalog, such as an online catalog, that can power and/or feed an online website that is part of the techniques described herein would not exist.
Various embodiments can include a system. A system can include one or more processors and one or more non-transitory computer-readable media storing computing instructions, that when executed on the one or more processors, cause the one or more processors to perform certain acts. The acts can include receiving stock images comprising an anchor garment. The acts also can include automatically identifying the anchor garment and complementary garments within the stock images. The acts further can include selecting an image of the stock images in which a mask area of a first complementary garment of the complementary garments as a ratio of an area of the anchor garment is largest over other complementary garments of the complementary garments. The acts additionally can include performing an image search, using the image, in an item catalog for similar garments to the first complementary garment. The acts also can include displaying, on a user interface, an avatar wearing the anchor garment and at least one of the similar garments.
A number of embodiments can include a method. The method can be implemented via execution of computing instructions configured to run on one or more processors and stored at one or more non-transitory computer-readable media. The method can include receiving stock images comprising an anchor garment. The method also can include automatically identifying the anchor garment and complementary garments within the stock images. The method further can include selecting an image of the stock images in which a mask area of a first complementary garment of the complementary garments as a ratio of an area of the anchor garment is largest over other complementary garments of the complementary garments. The method additionally can include performing an image search, using the image, in an item catalog for similar garments to the first complementary garment. The method also can include displaying, on a user interface, an avatar wearing the anchor garment and at least one of the similar garments.
Although generating an avatar modeling an anchor garment with a complementary garment has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes may be made without departing from the spirit or scope of the disclosure. Accordingly, the disclosure of embodiments is intended to be illustrative of the scope of the disclosure and is not intended to be limiting. It is intended that the scope of the disclosure shall be limited only to the extent required by the appended claims. For example, to one of ordinary skill in the art, it will be readily apparent that any element of
Replacement of one or more claimed elements constitutes reconstruction and not repair. Additionally, benefits, other advantages, and solutions to problems have been described with regard to specific embodiments. The benefits, advantages, solutions to problems, and any element or elements that may cause any benefit, advantage, or solution to occur or become more pronounced, however, are not to be construed as critical, required, or essential features or elements of any or all of the claims, unless such benefits, advantages, solutions, or elements are stated in such claim.
Moreover, embodiments and limitations disclosed herein are not dedicated to the public under the doctrine of dedication if the embodiments and/or limitations: (1) are not expressly claimed in the claims; and (2) are or are potentially equivalents of express elements and/or limitations in the claims under the doctrine of equivalents.